Understanding data discovery tools in the enterprise

Data discovery tools visualize and contextualize data for business users as part of the data discovery process. They are essential for informed business decision-making.

Data is useless unless it can deliver insights that help organizations make better business decisions. A good data discovery tool is key to delivering those insights.

Data discovery tools are software applications within the business intelligence category that enable organizations to search for patterns or specific items within the data sets they gather from a variety of sources. Some tools provide functionality in multiple areas, but many are dedicated to a single aspect, so the data discovery process may require separate tools for data preparation, visual data analysis and advanced analytics.

Data discovery tools typically use visual presentation mechanisms, such as geographical maps, to speed up the process of discovering patterns or specific items within the data. The visualized data often comes in the form of dashboards, reports, charts and tables.

With traditional BI applications, data visualization arrives through standard charting, limited graphical representations, key performance indicators or other methods. With the growth of big data and analytics, BI has evolved, with an emphasis on data analysis and discovery by users, access to larger data volumes and the ability to create more advanced information presentations.

Search-based data discovery tools

Search-based discovery tools enable users to develop and refine views and analyses of structured and unstructured data using search terms. According to research firm Gartner, search-based tools have three main attributes:

  1. a proprietary data structure to store and model data gathered from disparate sources, which minimizes reliance on predefined BI metadata;
  2. a built-in performance layer using RAM or indexing that decreases the need for aggregates, summaries and pre-calculations; and
  3. an intuitive interface that allows users to explore data without much training.

In addition to having a broader scope than a visualization-based data discovery tool, which focuses exclusively on quantitative data, a search-based tool differs at the UI layer. It uses text search input and results to guide users to the information they need.
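
The search-driven approach described above can be illustrated with a minimal sketch. It assumes a simple in-memory inverted index, which is only one possible way to build the indexing layer and is not any vendor's actual design. The record contents and the function names (tokenize, build_index, search) are hypothetical.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each token to the set of record ids whose fields contain it."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for value in record.values():
            for token in tokenize(str(value)):
                index[token].add(record_id)
    return index


def search(index, query):
    """Return the sorted ids of records that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    # .get() avoids adding empty entries to the defaultdict for unknown terms.
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


# Hypothetical records mixing a structured field with free text.
RECORDS = {
    1: {"region": "West", "note": "Customer asked about delayed shipment"},
    2: {"region": "East", "note": "Shipment arrived early, customer satisfied"},
    3: {"region": "West", "note": "Invoice dispute over pricing"},
}

INDEX = build_index(RECORDS)
```

In this sketch, a query such as search(INDEX, "west shipment") narrows the results to record 1, because the same text input matches both the structured region field and the unstructured note. Commercial tools add ranking, synonyms and far larger indexes on top of this basic idea.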

Who needs data discovery tools?

Companies need data discovery tools to unlock the true value of data, enabling many types of business users to make informed business decisions. For example, business analysts can use these tools to identify significant trends and relationships in data that could help a company fine-tune a sales or marketing campaign. Rather than simply relying on historical sales data based on monthly reports, analysts can gain new insights from the data by looking at it from different angles, discovering patterns that might not have been evident before.
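
To make the idea of looking at data "from different angles" concrete, here is a small sketch that totals the same sales rows along different dimensions. The data, field names and the total_by helper are invented for illustration. A real discovery tool would do this interactively and at much larger scale.

```python
from collections import defaultdict


def total_by(rows, dimension, measure="amount"):
    """Sum a numeric measure for each distinct value of one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


# Hypothetical sales records; every value here is made up.
SALES = [
    {"month": "Jan", "region": "West", "channel": "online", "amount": 120.0},
    {"month": "Jan", "region": "East", "channel": "retail", "amount": 80.0},
    {"month": "Feb", "region": "West", "channel": "retail", "amount": 50.0},
    {"month": "Feb", "region": "East", "channel": "online", "amount": 150.0},
]

BY_MONTH = total_by(SALES, "month")
BY_CHANNEL = total_by(SALES, "channel")
```

In this made-up data set, the monthly view looks flat (200.0 in both January and February), while the channel view shows online sales at more than double retail sales (270.0 versus 130.0). That is the kind of pattern a monthly report alone would not reveal.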

Another example is in manufacturing. A company could use data discovery tools to identify flaws in factories that hinder the production process, or supply chain issues that are slowing down the delivery of goods to customers and retailers.

A data discovery tool can also help a company reduce risk. For instance, risk analysts at a bank or insurance company can use the tool to better determine which customers are higher risks from a financial perspective, based on financial stability, repayment records and insurance filings.

Another benefit of these tools is that they offer customized insights for business users. Every company has specific goals and priorities in mind for its data analytics efforts, and individual users within a company have their own particular points of interest with regard to data. Users can often customize data discovery tools to meet those needs and focal points. Customization includes being able to select parameters and set the ways data is displayed for users.

Data discovery tools can also play a role in regulatory compliance efforts. The General Data Protection Regulation -- a set of rules to provide data protection for citizens of the European Union -- is creating interest in data discovery tools. Companies are using them to find customer data tucked away in email messages, presentations and other random corners of the organization.
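
As a rough illustration of how such a scan might work, the sketch below searches a few text documents for email addresses and customer identifiers using regular expressions. The file names, the CUST-###### identifier format and the find_personal_data function are all assumptions made for this example. Pattern matching of this kind is only a starting point, and on its own it would not be sufficient for GDPR compliance.

```python
import re

# Simplified email pattern; it will not match every valid address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Hypothetical internal customer ID format, e.g. CUST-004211.
CUSTOMER_ID_PATTERN = re.compile(r"\bCUST-\d{6}\b")


def find_personal_data(documents):
    """Return, per document, the emails and customer ids found in its text."""
    findings = {}
    for name, text in documents.items():
        hits = {
            "emails": sorted(set(EMAIL_PATTERN.findall(text))),
            "customer_ids": sorted(set(CUSTOMER_ID_PATTERN.findall(text))),
        }
        # Only report documents that contain at least one match.
        if hits["emails"] or hits["customer_ids"]:
            findings[name] = hits
    return findings


# Hypothetical documents standing in for emails, notes and presentations.
DOCUMENTS = {
    "q3_deck_notes.txt": "Follow up with jane.doe@example.com about renewal.",
    "support_thread.eml": "Ticket for CUST-004211, contact: j.smith@example.org.",
    "roadmap.txt": "No customer details here, only feature planning.",
}

FINDINGS = find_personal_data(DOCUMENTS)
```

Here the scan flags the first two documents and skips the roadmap file, giving a compliance team a short list of places to review rather than a definitive inventory.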

Limited IT need

One of the biggest benefits of a good data discovery tool is that it empowers business users and decision-makers with greater insights -- all with limited IT support. Many of these tools and their features are self-service. Business analysts and other users can retrieve and analyze data and share findings and reports with colleagues without needing to rely on IT staff to pull the numbers or provide guidance.

The trend of self-service BI has opened up new opportunities for companies to gain value from data more quickly, because it takes IT out of the picture, which reduces delays. These tools streamline the data discovery process and shorten the path to business insight, leading to greater agility for companies. Decision-makers can act on information fast enough to keep from missing out on market opportunities, which might have passed them by if there were a need for IT intervention.

Data discovery challenges

Along with the benefits of data discovery tools come several challenges that organizations need to address. The self-service capabilities of many of these tools, while providing greater efficiencies, can also create risk. Without IT involvement and intervention, questions related to data governance arise.

These include data quality issues. Poor data quality cost companies an average of $15 million in 2017, according to Gartner's Data Quality Market Survey. Furthermore, poor data quality can cause project failures. IT might need to intervene to help ensure data accuracy and consistency.
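
The kind of accuracy and consistency check mentioned above might look like the following sketch, which flags missing required values and repeated keys in a small set of rows. The field names, the sample rows and the quality_report function are hypothetical, and real data quality tooling covers many more rule types.

```python
def quality_report(rows, required_fields, key_field):
    """Flag rows with missing required values or a repeated key value."""
    missing = []      # (row position, field name) pairs with no usable value
    duplicates = []   # positions of rows whose key was already seen
    seen = set()
    for position, row in enumerate(rows):
        for field in required_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                missing.append((position, field))
        key = row.get(key_field)
        if key in seen:
            duplicates.append(position)
        else:
            seen.add(key)
    return {"missing": missing, "duplicates": duplicates}


# Hypothetical customer rows with one blank field, one null and one repeat.
ROWS = [
    {"customer_id": "A1", "country": "DE", "revenue": 1200},
    {"customer_id": "A2", "country": "", "revenue": 800},
    {"customer_id": "A1", "country": "DE", "revenue": 1200},
    {"customer_id": "A3", "country": "FR", "revenue": None},
]

REPORT = quality_report(ROWS, ["customer_id", "country", "revenue"], "customer_id")
```

Running checks like these before data reaches a self-service tool is one way IT could reduce the chance that business users draw conclusions from incomplete or duplicated records.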

Also related to data governance is the challenge of data security and privacy. How can organizations be sure that data is protected and being used in a responsible manner, especially when business users have broad latitude with data discovery tools? It's a valid concern, considering the rise of data breaches and the amount of personally identifiable information companies are gathering.

And companies are gathering lots of data, as sources continue to expand. For example, internet-of-things sensors and devices gather product usage, environmental and location data, which was simply not available 10 years ago.

A good data discovery tool can easily and quickly process large data volumes from a variety of sources, both internal and external. But, again, it's up to the company and IT to ensure proper security and governance.

Challenges notwithstanding, data discovery tools are clearly becoming vital components of the data strategies at many organizations. And future developments will add even more capabilities.

For instance, AI and machine learning enable discovery tools to be "smarter," broadening their appeal even more for enterprises. Gartner noted in a 2017 report that the rise of augmented analytics -- an approach that automates insights using machine learning and natural-language generation -- marks the next wave of the data and analytics market.

This is not a far-off concept. Organizations should be planning now to adopt augmented analytics, as these capabilities continue to mature.
