DataScava

“DataScava perfectly complements existing approaches to unlocking the value of unstructured text data – by helping companies to model higher-level intents and purposes behind the labeling and classification of data – by capturing the abstract topics and themes that represent their own business and subject matter expertise – and by applying both to big data sets in real-time.”

– Scott Spangler, Chief Data Scientist, IBM Distinguished Engineer, and author of “Mining the Talk: Unlocking the Business Value in Unstructured Information”

Visit the DataScava website to learn more.

How TalentBrowser Uses DataScava

TalentBrowser’s Domain-Specific Search, Automated Skills Analytics, Weighted Topic Scoring, and Talent Matching are powered by DataScava’s unstructured data miner, which is built upon two U.S. patents.

Our tools generate value-added structured metadata from raw, unstructured text — including resumes, professional profiles, job descriptions, and related content. This metadata can be used in Talent Matching, People Analytics, Workforce Planning, Resource Management, and other HCM initiatives.

Why DataScava Is Different

DataScava’s deterministic core technologies —

DSIndex | Domain-Specific Language Processing (DSLP)
DSTopics | Tailored Topics Taxonomies (TTT)
DSMatch | Weighted Topic Scoring (WTS)

— work as an adjunct or alternative to NLP/NLU, delivering precise, explainable, and measurable results. Unlike black-box systems, DataScava gives you full control over and visibility into how unstructured text content is processed.

With DataScava, you can model and extract topics across diverse and messy datasets using custom taxonomies that reflect your domain language and business priorities — allowing you to build your own specialized vocabulary and logic for complex document mining.
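As a rough illustration of the general idea of scoring text against a custom, weighted topic taxonomy, here is a minimal Python sketch. It is not DataScava's implementation or API: the taxonomy, the term weights, and the function names (`tokenize`, `score_topics`) are all hypothetical and chosen only for this example. The point it shows is that a deterministic scorer can report exactly which terms produced each score.

```python
import re
from collections import Counter

# Hypothetical taxonomy: topic -> {term: weight}.
# Terms and weights are illustrative assumptions, not DataScava data.
TAXONOMY = {
    "data_engineering": {"spark": 3.0, "etl": 2.0, "sql": 1.0},
    "machine_learning": {"pytorch": 3.0, "classification": 2.0, "python": 1.0},
}


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def score_topics(text, taxonomy=TAXONOMY):
    """Return, per topic, a weighted score and the term counts behind it."""
    counts = Counter(tokenize(text))
    scores = {}
    for topic, terms in taxonomy.items():
        # Keep only taxonomy terms that actually occur in the text.
        hits = {term: counts[term] for term in terms if counts[term] > 0}
        scores[topic] = {
            "score": sum(terms[term] * n for term, n in hits.items()),
            "matched_terms": hits,
        }
    return scores


resume = (
    "Built ETL pipelines in Spark and SQL; some Python scripting. "
    "Tuned Spark jobs."
)
result = score_topics(resume)
for topic, detail in result.items():
    print(topic, detail["score"], detail["matched_terms"])
```

Because every score is a sum of term counts multiplied by weights you set, the same input always yields the same output, and each result can be traced back to the matched terms.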

From the Expert’s View

To illustrate how DataScava stands apart, we commissioned a series of articles from Scott Spangler.

In these pieces, Scott explores how DataScava’s approach to mining unstructured text aligns with real-world needs across AI, ML, RPA, BI, Research, Operations, Talent, and more. He also compares our methods with standard NLP and NLU techniques — showing where deterministic modeling outperforms probabilistic inference in enterprise settings.

The Articles

“Machines in the Conversation: The Case for a More Data-Centric AI”, as published in CDO (Chief Data Officer) Magazine

“Machines in the Conversation: The Case for a More Data-Centric AI”, the original article, which includes a section about DataScava’s approach

“Executive Q&A: DataScava, AI and ML”

“The Key Ingredients for Game-Changing Business Intelligence (BI) from Unstructured Textual Data”

“Consistent High-Quality Robot Process Automation (RPA) Requires Deep Customer Understanding”

“Who’s in Charge of Your Business: The Humans or the Machines?”

In Scott’s first article, he discusses:

The pitfalls of using a fully automated approach to critical decision-making.
The desirability of having a parallel human-machine partnership that regulates and monitors the inputs and outputs of automated approaches.
The three basic ingredients that are needed to make that hybrid process successful and how DataScava implements each of these components.

Here’s an excerpt:

“Algorithms will be more effective in the long run if they are part of a more holistic framework that includes user-controlled domain-specific ontologies, statistical analysis, and rule-based reasoning strategies. These are the basic ingredients that a tool like DataScava provides.”

DataScava . . .

“Is a robot ally in humanity’s struggle for control of how we utilize big data to make decisions. By providing tools for capturing the key underlying topics and rules that govern important concepts of the business needs, it evens the playing field so that machine learning no longer has to have the final say on critical business decisions.

Can supervise the process based on human-provided expertise and determine which data to use for training and which to avoid, as well as in which situations to trust deep learning decisions and when to fall back on more rule-based approaches. Such processes put the humans back in charge and allow the machines to serve their intended role as adjuncts and trusted advisors.

In partnership with a trained human mind – can act effectively as a tool for giving the left brain an equal say in big data decision-making tasks.

Can play a leading role in helping businesses manage and maintain their big data more efficiently using information ontologies, statistics with visualization and rule-based approaches.

Perfectly complements existing approaches to unlocking the value of unstructured text data – by helping companies to model higher-level intents and purposes behind the labeling and classification of data – by capturing the abstract topics and themes that represent their own business and subject matter expertise – and by applying both to big data sets in real-time.

Provides a practical, easy-to-use tool-set for capturing the critical business ontologies that provide the critical bridge between unstructured text data analysis using standard data science techniques and the human expertise that gives your business its competitive edge.

When a deep learning system and DataScava agree on a classification, that’s ideal because then we now have a plausible explanation for why the deep learning algorithm decided the way it did.

Can help data professionals and business people use machine and human intelligence together to make their messy unstructured text data more accessible, understandable and actionable.”

Your data. Your expertise. Your rules.

TalentBrowser keeps you in control.