10 top data discovery tools for insights and visualizations
Data discovery can use sampling, profiling, visualizations or data mining to extract insights from data. Choose the platform that best fits your user base from 10 of the top tools on the market.
Anyone trying to discern patterns and extract insights from their data must employ data discovery. Success depends on finding and using the right tool for the job.
The term data discovery tool can refer to tools that enable the discovery of valuable data through features such as sampling and profiling. It might also refer to tools that make useful discoveries within data sets, perhaps with visualizations or data mining. The most common use is to classify the more advanced self-service BI tools that enable users to explore data sets through query tools and visualizations to create dashboards and reports. In the current market, many data discovery tools include augmented analytics, which can automatically apply machine learning techniques that make additional discoveries.
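To make the first sense of the term concrete, here is a minimal sketch of sampling and profiling in plain Python. It is not how any of the tools below implement these features. The function names `profile_column` and `sample_rows` and the `orders` data are hypothetical, invented purely for illustration.

```python
import random
import statistics


def profile_column(values):
    """Summarize one column: row count, missing values, distinct values, numeric range."""
    present = [v for v in values if v is not None]
    summary = {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    # Add numeric statistics only when every non-missing value is a number.
    numeric = [v for v in present if isinstance(v, (int, float))]
    if numeric and len(numeric) == len(present):
        summary["min"] = min(numeric)
        summary["max"] = max(numeric)
        summary["mean"] = statistics.mean(numeric)
    return summary


def sample_rows(rows, k, seed=0):
    """Draw a reproducible random sample of up to k rows."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


# Hypothetical data, invented for illustration only.
orders = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": None},
    {"region": None, "amount": 200.0},
]

# Sampling: inspect a few rows instead of the whole data set.
sample = sample_rows(orders, k=3)

# Profiling: summarize each column to judge whether the data is worth exploring.
profile = {
    column: profile_column([row[column] for row in orders])
    for column in ("region", "amount")
}
```

A profile like this shows at a glance how many values are missing and how varied a column is, which is the kind of signal an analyst uses to decide whether a data set deserves deeper exploration.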
Market research, customer reviews from reputable sources -- including Capterra, Gartner and G2 -- and author experience identified 10 of the top data discovery tools. Each tool selected has strong market presence and feature support for discovery and exploration, rather than merely presentation. An analyst or analytics department could adopt and use any of the highlighted tools as a standalone product. Some excellent enterprise-scale machine learning or embedded analytics platforms are not included.
In my career, I have worked on designing and developing several of these tools as a product team member or as a consultant and advisor. However, rather than having some hidden preferences, this experience has shown me that the use cases and best practices for data analysis tools vary widely with the working practices, thought processes and organizational habits of users and teams. Each vendor can find its niche within the ecosystem of users and organizations.
The unranked list is in alphabetical order.
Amazon QuickSight
Amazon QuickSight is a cloud BI service for simple data visualization and dashboards, which can integrate with machine learning to generate insights. QuickSight features limited customization and data connectivity. It works best with straightforward, structured data models, but it's a cost-effective option with serverless architecture and pay-per-use pricing.
QuickSight appeals to existing AWS users. It seamlessly integrates with other AWS services, such as the Redshift data warehouse, without requiring extensive setup or configuration. Being part of the AWS data and analytics stack is an advantage for QuickSight customers because many enterprise architects prioritize cloud integration over competitive differentiation of analytics features.
QuickSight's integration with Amazon SageMaker for machine learning enables users to access SageMaker models within their QuickSight analyses directly. The combination of tools enables advanced scenarios, such as anomaly detection and forecasting, without requiring extensive technical expertise.
QuickSight does have some constraints compared to its competitors. The peer user community is emerging slowly with limited user training. Regional third-party integrators and service providers are few and far between.
Despite the limitations, Amazon QuickSight remains a compelling choice for those looking for a cost-effective, scalable and cloud-native BI tool, especially for teams already invested in the AWS ecosystem.
Domo
Domo is a cloud-based BI platform that provides comprehensive data integration, visualization and collaboration tools. It offers low-code/no-code tools for creating business apps, making it particularly popular among executive users due to its mobile-friendly design.
The platform features a wide range of intelligent data connectors, enabling integration with numerous business data sources, including spreadsheets, databases, social media, and cloud-based and on-premises software applications.
Domo's Magic ETL tool is full-featured yet simple and scalable for data exploration and extraction, even for nonspecialists. Business users can use the app framework to build apps for analysis and simple workflows.
The platform has a reputation for premium prices, but Domo customers generally feel they get value for the money, especially from the extract, transform and load (ETL); machine learning; and app building capabilities.
Google Cloud Looker Studio
Despite the names, Google Cloud Looker and Looker Studio are two different data analytics and visualization tools in the Google Cloud ecosystem. Although both tools assist in data discovery, they serve different user needs and skill levels.
Looker is an analytics platform much favored by developers for embedded analytics. It has a strong modeling language and excellent APIs. Google acquired Looker in 2020. Looker Studio was developed internally and was called Google Data Studio until the branding was consolidated in 2022.
Looker Studio is a free, web-based data visualization and reporting tool that can create interactive dashboards and reports from various data sources. It features a user-friendly, drag-and-drop interface. The tool seamlessly integrates with various Google products and, unusually for simple tools, supports real-time data updates.
Another critical advantage of Looker Studio is its cost-effectiveness, offering a free tier for easier startup. The tool provides a surprisingly wide range of data source integrations, including Google Sheets, BigQuery and widely used marketing platforms. It primarily focuses on data visualization and lacks advanced features for complex analysis.
Its ease of use, collaborative features and real-time capabilities make it a competitive option for users already invested in Google Cloud who need simple data visualization and reporting.
Microsoft Power BI
Microsoft Power BI is currently the leading data discovery and BI application in the market, offering cloud and desktop versions. Its close integration with the Microsoft ecosystem, such as Microsoft 365, Teams and Fabric, underlies its success. Power BI supports ad hoc analysis for self-service users and a visualization marketplace for third-party add-ins.
The integration of Copilot in Power BI enables useful generative features, such as natural language queries, automated report generation and formula suggestions.
It is a mistake to think of Power BI on Azure as a simple, default tool for existing customers, in the way that Amazon QuickSight and Google Cloud Looker Studio are for their respective clouds. Power BI is a far more capable application for standalone BI use cases. Embedding and automation capabilities come from Power Apps for no-code business apps and Power Automate.
Despite its dominant role in the market, Power BI has potential downsides. Large data sets -- notably in the desktop version -- can have performance issues, and users commonly report crashes. Also, the platform can be complex for beginners, especially when working with the Data Analysis Expressions formula language.
Power BI's complete functionality and a supportive global network of enthusiasts, developers and consultants make it worth consideration for any data discovery operation.
MicroStrategy
MicroStrategy started as a reporting and dashboarding platform more than 30 years ago, and it is now a leader in security and governance features, scalability and mobile apps. It remains a popular choice among the largest enterprises, often combined with Teradata on the back end. It is available in the cloud and on premises.
Its benefits come with a steeper learning curve, older UI and higher costs. It's working to develop its machine learning and generative AI features.
MicroStrategy is still a front-runner for many larger enterprises that value its key differentiators: scale, security and rock-solid performance.
Pyramid Analytics
Pyramid Analytics offers a sizable vertical stack of analytics capabilities for a relatively small vendor. The data engine excels at performance and scale. UX is helpful and productive for nontechnical users, with friendly terms such as "present" or "illustrate" to give a capable environment a familiar feel. It has two deployment options: self-hosted in the cloud or on premises.
The platform takes self-service from data sources to collaboration seriously. Pyramid offers management tools across the data stack, including data quality, security and governance. It's an attractive option for healthcare or financial services teams where regulatory demands might be challenging.
In March 2023, Pyramid introduced machine learning features and some AI integration. It remains most attractive to organizations that need secure, strong, high-performance and cost-effective analytics for smaller teams and departments.
Qlik Sense
Qlik Sense is the second generation of Qlik's original QlikView, an innovative self-service application for desktop analytics. It inherits from QlikView its associative engine, which is a highly flexible and insightful tool for data exploration and discovery. Not every user finds a need for the full power of the engine, but the ones who do often say they could not get their results with any other tool.
Following several acquisitions of data management, connectivity and ETL vendors, Qlik Sense offers excellent data integration and data quality capabilities. Qlik Sense is simple to deploy either on premises, in Qlik Cloud or in multi-cloud environments. It integrates well with other platforms and has good IT governance features.
Partly because the platform grew by acquisition, UX can be uneven, making the learning curve steep beyond the basics. It's no longer the simple desktop self-service tool it once was, but it offers unique capabilities.
Salesforce Tableau
Tableau is a longstanding thought leader in the data discovery market. Salesforce acquired Tableau in 2019. The platform is more of a visual analytics tool within the Salesforce ecosystem and less of a standalone application than before.
Tableau excels at enabling users to create interactive dashboards, reports and visualizations without technical skills. It's still unequaled in the range and quality of its compelling visualizations.
As Tableau consolidated with Salesforce, some more intriguing capabilities, such as Tableau Pulse, its AI-powered insight engine, are less standalone and more integrated. It's an improvement for the everyday business user who needs insight, but it's less critical to data explorers digging in for their data discoveries.
For Salesforce users, Tableau is a natural first choice for analytics at all levels. It remains an essential tool for people who see data discovery as a primarily visual process. Long-term concerns focus on uncertainty around how long it will remain an authoritative standalone tool and, given Salesforce's own cloud-only focus, how long the desktop version remains available.
Tellius
Tellius is the smallest and newest vendor on the list. It has excellent natural language query capabilities for business users who wish to make data discoveries using everyday language.
It has taken some exciting approaches for business users, such as Vizpads, a way to explore data using multiple visualizations on one page using various data sources and global filters. It's a valuable feature for business users analyzing data across their business but lacking the skills to define complex joins or apply advanced filters.
Although Tellius uses natural language and AI in its platform, it offers little integration with other platforms that a well-established business may already use.
As an emerging vendor, Tellius is worth watching. Its innovative and imaginative approach to analysis for business users can be productive. Tellius is available in the cloud with a microservices-based deployment for scaling up or down as needed.
ThoughtSpot
ThoughtSpot entered the market a few years ago by offering a new paradigm: search-based analytics developed by former Google engineers. ThoughtSpot grew quickly because few competitors had similar natural language or search features. Its momentum visibly slowed as generative AI, natural language queries and interfaces became more commonplace.
The tool still has much to offer users. It excels at integrating natural language, search, analytics, visualization and collaboration into a coherent analytics practice. Even if competitors offer similar core features, the tooling and administrative features ThoughtSpot developed over the years remain important for productive work.
It is challenging to deploy, requiring a well-resourced IT team. Initially, the most common deployment was an on-premises hardware appliance. Today, it is more common to deploy to ThoughtSpot Cloud, a SaaS offering available on AWS and Google Cloud.
The costs are somewhat higher than many users prefer. Although its paradigm is productive, it is still new, and business users have a learning curve to take full advantage of its capabilities.
Overall, ThoughtSpot is in a strong position to use generative AI features. Its end-to-end UX shows that it understands the workflow of natural language query, discovery and the associated needs for security, governance, compliance and oversight.
Making a choice
The data discovery market is diverse and dynamic, with each platform offering unique strengths, capabilities and weaknesses.
Organizations must consider several factors to select the best tool for their situation. Begin by assessing the tool's compatibility with existing infrastructure -- especially connectivity -- and whether its scalability and performance can handle the required data volume and complexity.
If users lack technical expertise, evaluate the ease of use and learning curve. Consider the visualization and exploration capabilities and support for advanced analytics and AI. Match the tool experience to employees' working styles.
Prioritize tools that support collaboration and sharing of insights across teams and departments, while meeting security and governance needs, especially in a highly regulated business.
Price is important, too. It's essential to consider total cost of ownership, including training and support. Long-term costs can be surprising.
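One way to weigh these factors against each other is a simple weighted decision matrix. The sketch below is an illustration under stated assumptions, not a recommended methodology: the criteria names, the weights, the 1-to-5 scores and the candidates "Tool A" and "Tool B" are all hypothetical and do not describe any product on this list.

```python
# Hypothetical weights: how much each criterion matters to one organization.
WEIGHTS = {
    "infrastructure_fit": 0.25,
    "scalability": 0.15,
    "ease_of_use": 0.20,
    "collaboration": 0.10,
    "governance": 0.15,
    "total_cost": 0.15,  # higher score means lower total cost of ownership
}


def weighted_score(scores, weights=WEIGHTS):
    """Combine 1-5 criterion scores into a single weighted average."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[name] * weight for name, weight in weights.items()) / total_weight


# Hypothetical candidates and scores, invented for illustration only.
candidates = {
    "Tool A": {
        "infrastructure_fit": 5, "scalability": 3, "ease_of_use": 4,
        "collaboration": 3, "governance": 3, "total_cost": 4,
    },
    "Tool B": {
        "infrastructure_fit": 3, "scalability": 5, "ease_of_use": 3,
        "collaboration": 4, "governance": 5, "total_cost": 2,
    },
}

# Rank candidates from highest to lowest weighted score.
ranked = sorted(
    candidates,
    key=lambda name: weighted_score(candidates[name]),
    reverse=True,
)
```

The weights matter more than the arithmetic. A highly regulated business might raise the governance weight, while a small team with no IT support might put most of the weight on ease of use and cost, and the ranking can flip accordingly.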
The integration of AI and machine learning plays a critical role in the future development of the data discovery market. It affects many vendors, providing users with automated insights and recommendations. The evolution of AI might usher in a new era of data discovery in which users of all skill levels can easily access, explore and derive value from data.
Selecting which data discovery tool to use depends on an organization's employees, their preferred ways of working, the budget and any existing technology stack already in use.
Donald Farmer is principal of TreeHive Strategy, where he advises software vendors, enterprises and investors on data and advanced analytics strategy. He has worked on some of the leading data technologies in the market and in award-winning startups. He previously led design and innovation teams at Microsoft and Qlik.