Treating data as a product a method to grow analytics use
Treating BI assets such as models and dashboards as commodities is an emerging trend as organizations continue to seek new ways of making analytics use more widespread.
As enterprises seek out new ways to extend analytics tools to more employees, the approach of treating data as a product is gaining momentum.
Studies show that the percentage of employees within organizations using data has been stagnant for about two decades, hovering at around a quarter of potential users.
Analytics tools are complicated, usually requiring code to query and analyze data, and at the least requiring significant data literacy training. In response, vendors in recent years have developed natural language processing (NLP) capabilities and low-code/no-code features that reduce coding requirements.
Nevertheless, analytics adoption has remained unchanged, given the limitations of those NLP and low-code/no-code tools.
Most organizations, meanwhile, want to increase analytics adoption. They want to be data-driven, said Kevin Bohan, director of product marketing at data integration vendor Denodo, on Feb. 1 during TDWI's Virtual Summit, a conference hosted by the data and AI education and research company.
Bohan noted, however, that a NewVantage Partners survey of executives from more than 100 Fortune 500 companies showed that only 24% characterize themselves as data-driven. Even fewer -- 21% -- self-reported having developed a data culture within their organization.
"There's a lot of energy and effort going into becoming data-driven, yet most organizations are identifying as not having been successful to date," Bohan said.
With the rise of generative AI large language models (LLMs) over the past year, technology could finally be reaching a point where it can make analytics use simpler and thus help organizations expand their use of data to inform decisions.
LLMs have far more extensive vocabularies than the NLP tools analytics vendors have developed. In addition, LLMs are trained to understand intent. The two combine to make true conversational interactions possible, greatly reducing the need to write code to work with data.
In addition, embedded analytics, which delivers data in a consumable form to workers within their normal workflows rather than forcing them to learn how to use a BI platform, continues to gain popularity as a means of making analytics more accessible.
But more than just technology is needed to maximize analytics use.
A philosophical shift is needed as well. Data -- as well as data assets such as dashboards, reports and models -- needs to be treated differently. It should be easy to find, and potential consumers of the data need to be shown why it's useful.
Data needs to be treated as a product rather than merely information. Technology, in turn, needs to foster that philosophical ideal.
Data as a product
Data assets such as models, reports, dashboards and other data applications are often referred to as data products. Terming them data products is not, however, the same as treating data as a product.
Like data itself, data assets are often isolated within organizations. Different departments often use different tools for their data operations, from data ingestion through management and analysis. And those data systems are disconnected from one another.
The data collected by the finance department, for example, is isolated from the data collected by the marketing department. Similarly, the models, reports, dashboards and other data assets developed for the finance department to inform its decisions are isolated from the data assets developed for the marketing department to inform its decisions.
Even when not isolated, data assets are frequently tightly controlled by a centralized data team, or dumped in some repository with limited access or inefficient search parameters that make them difficult to discover.
As a result, data duplication is inevitable, with multiple people commissioning or creating similar data assets. So is a lack of data lineage, the record that enables potential users to understand where data originated and how it has been used since it was ingested.
Both can lead to data quality problems.
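As a rough illustration of the two problems, the sketch below records lineage as a simple map from each asset to the inputs it was built from, then uses that map to trace an asset back to its original sources and to flag assets built from identical inputs as possible duplicates. The asset names and the `LINEAGE` structure are hypothetical, and the code assumes the lineage map contains no cycles; it is not drawn from any particular vendor's tooling.

```python
from collections import defaultdict

# Hypothetical lineage map: each asset lists the inputs it was built from.
# Anything that never appears as a key is treated as an original source.
LINEAGE = {
    "finance_revenue_dashboard": ["revenue_model"],
    "marketing_revenue_report": ["revenue_model_copy"],
    "revenue_model": ["erp_orders"],
    "revenue_model_copy": ["erp_orders"],
}


def trace_origins(asset, lineage):
    """Return the original sources behind an asset (assumes no cycles)."""
    parents = lineage.get(asset, [])
    if not parents:
        return {asset}
    origins = set()
    for parent in parents:
        origins |= trace_origins(parent, lineage)
    return origins


def possible_duplicates(lineage):
    """Group assets that are built directly from exactly the same inputs."""
    groups = defaultdict(list)
    for asset, parents in lineage.items():
        groups[frozenset(parents)].append(asset)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)
```

In this made-up example, the finance dashboard traces back to a single order system, and the two revenue models are flagged because both were built from that same source, which is the kind of duplication that isolated departments tend to produce.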
"There is a disconnect between what is being provided to data consumers and what those data consumers need," Bohan said. "Users are being expected to deal with a level of complexity in the data that isn't sufficient for what they're doing. By doing that, it is causing problems overall for their organizations."
Data as a product is a way of thinking that addresses that disconnect by seeking to make data assets easy for data consumers to discover and operationalize. It treats the assets the way a retailer treats the products it sells, making them appealing to potential customers and easy to find.
"We can make it easier for users," Bohan said. "If we can lower the level of skill that's required to be able to leverage data effectively, that's the quickest path to a win. We're not delivering data in a way that consumers can easily make use of it."
Displaying the contents of a data asset alone doesn't equate to treating it as a product, he continued.
That would be the approach a grocery store takes with its produce, piling oranges on top of one another with no information about each orange -- where it was grown, when it was picked, how long it was in transit and what chemicals were used to keep it fresh -- other than cost.
Instead, treating data as a product is more akin to the way medication is packaged and displayed on a shelf in a pharmacy to enable consumers to make an educated decision about which medicine to buy. Beyond the brand name and its price, the packaging contains information such as the medication's intended use, ingredients, potential side effects and expiration date.
Enterprises should apply that same product management mindset to data, according to Bohan.
Enterprises need to consider how data assets are going to be consumed and how assets should be packaged and stored so that they are useful and discoverable to anyone who wants to use them.
"The more that [an organization] is thinking of reuse, the better value they are going to be able to get from each of the data products they create," Bohan said. "You don't want to create a data product that is going to be used for one use case and that's it. That is just creating a particular report. The value comes from reuse and making it applicable to more and more users within the organization."
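One way to picture the pharmacy-label analogy is as a small metadata record attached to each data asset. The sketch below is an assumption for illustration only: the `DataProductLabel` class and its field names are hypothetical rather than a standard, and `missing_fields` simply reports which parts of the label were left blank so that an asset is not published with an incomplete description.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DataProductLabel:
    """Hypothetical 'package label' for a data asset."""
    name: str                   # business-friendly name, like a brand name
    owner: str                  # team accountable for the asset
    intended_use: str           # what decisions the asset is meant to inform
    sources: List[str] = field(default_factory=list)            # the "ingredients"
    known_limitations: List[str] = field(default_factory=list)  # the "side effects"
    last_refreshed: Optional[date] = None                       # the freshness date


def missing_fields(label):
    """List the label fields that were left empty."""
    return [name for name, value in vars(label).items() if not value]


def render_label(label):
    """Format the label as plain text for display next to the asset."""
    lines = [
        f"{label.name} (owner: {label.owner})",
        f"Intended use: {label.intended_use}",
        f"Sources: {', '.join(label.sources) or 'not stated'}",
        f"Known limitations: {', '.join(label.known_limitations) or 'not stated'}",
        f"Last refreshed: {label.last_refreshed or 'not stated'}",
    ]
    return "\n".join(lines)
```

A label that names its owner, purpose, sources and limitations also makes it easier for a second team to judge whether an existing asset fits its needs, which supports the reuse Bohan describes.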
Technology
While treating data as a product involves a philosophical shift, technology needs to enable the shift, according to Bohan.
However, organizations need to strategically choose and deploy the technology they use. Technology should help users understand the availability of a data asset, how the data used to inform the asset is structured and how users can use the asset.
"What we need to do is deliver an experience that creates trust," Bohan said.
Part of delivering that experience involves borrowing strategies developed by e-commerce vendors over the past two decades, such as using tools that help deliver personalized recommendations. In addition, the tools should be able to provide users with shortcuts to finding their organization's most popular data assets, foster collaboration during the analysis and decision-making processes, include a feedback loop to ensure data quality and improve data trust, and deliver data assets in real time when they're relevant.
Data catalogs are one tool that can make data assets easy to find, encourage collaboration, fuel confidence and be programmed to deliver recommendations. Data catalogs are applications that create an inventory of an organization's data and data products to make raw data, reports, dashboards, models and other assets easy to find and operationalize.
In addition, catalogs enable administrators to govern data and data assets with access layers. Such layers restrict certain employees' access to sensitive information to ensure the proper use of data. They also provide overseers with tools to set up recommendation systems so that data consumers don't have to proactively search for information that is potentially relevant to their work.
"It's not enough to provide data in the format that's within a database and use the naming that's within a database," Bohan said. "It should be provided in business-friendly ways so users get access to data they're able to understand in the language they're accustomed to using."
In addition to making it easy to search for and discover data, data catalogs reduce the risks associated with data duplication because they make data accessible to anyone with the proper access credentials, he added.
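The catalog behaviors described above can be sketched in a few lines. The example below is a simplified assumption, not how any commercial catalog is implemented: the `CatalogEntry` and `Catalog` classes, the role names and the view counter are all hypothetical. It shows three ideas from the text: entries carry a business-friendly name alongside the database name, an access layer filters what each role can see, and a popularity ranking gives users a shortcut to widely used assets.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    technical_name: str       # name as stored in the database
    business_name: str        # name a business user would recognize
    domain: str               # department that owns the asset
    allowed_roles: frozenset  # roles permitted to see the entry
    views: int = 0            # simple popularity counter


class Catalog:
    """Hypothetical in-memory catalog with an access layer."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def _visible(self, role):
        # Access layer: only entries the role is allowed to see.
        return [e for e in self._entries if role in e.allowed_roles]

    def search(self, term, role):
        # Search on business-friendly names, not database names.
        term = term.lower()
        return [e.business_name for e in self._visible(role)
                if term in e.business_name.lower()]

    def most_popular(self, role, limit=3):
        # Shortcut to the most-viewed assets the role may access.
        ranked = sorted(self._visible(role), key=lambda e: e.views, reverse=True)
        return [e.business_name for e in ranked[:limit]]
```

Because every search runs through the same access filter, a user who lacks the right role never sees a restricted entry, while a user who has it can find an existing asset instead of commissioning a duplicate.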
Among data management vendors, Alation and Collibra are data catalog specialists, while larger data platform vendors such as AWS and Google provide data catalogs among many other offerings.
Beyond data catalogs, a federated approach to data stewardship is a tenet of treating data as a product.
Historically, data was overseen by centralized data teams that limited access and parceled out data upon request. As both the volume and complexity of data grew, however, a centralized approach to data management proved ineffective. Bottlenecks developed, and the lag time between when a request for a report or dashboard was submitted and the time it could be developed and delivered made real-time decision-making impossible.
In response, decentralized approaches to data management including data mesh and data fabric have been developed. Such approaches distribute data ownership, removing it from centralized teams and federating it to domains and departments where domain expertise and having an actual stake in decisions help change how users view data.
Meanwhile, to avoid data isolation, organizations that adhere to this approach employ tools such as data catalogs to connect the organization's domains and bring its data together.
"Creating data products is often going to be an effort [related to] federation of the information," Bohan said. "Having different owners own the data who understand the data better [than a centralized team] is important."
Vendors specializing in data mesh include Starburst and Informatica, while Denodo and Cloudera focus on data fabric.
Outcomes
While treating data as a product might represent a philosophical shift, the approach has tangible results.
According to McKinsey & Company, a data-as-a-product approach can reduce data operations costs -- including technology, development and maintenance -- by 30%, while increasing the speed of new business use cases by as much as 90%.
In addition, given the tools such an approach requires, data governance burdens and risk of misuse both decline.
The intangible results, however, might be even more important, according to Bohan.
Simplified access to data and broader use of data aren't easily quantified. But they are what lead to widespread self-service BI within an organization, which can result in the agile decision-making needed to act and react amid a challenging economic climate.
Worldwide events beginning with the onset of the COVID-19 pandemic in 2020 and including the war in Ukraine, repeated supply chain disruptions, rising inflation and fears of a recession have combined to create a constant state of uncertainty in the business world.
Real-time decision-making is required to manage that turbulence. And informed real-time decision-making can only result when an organization makes data a core asset.
"The true value, in my eyes, is allowing users to more easily ... understand what the data is used for and use it in their job," Bohan said. "When they're able to do that, that's when you start building a data culture."
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.