GoodData launches new platform for AI, ML and BI workflows
The vendor's FlexQuery Analytics Lake provides a single environment for normally disparate data assets to simplify AI, analytics and machine learning development and analysis.
GoodData unveiled FlexQuery Analytics Lake, a new data platform that incorporates generative AI and combines a host of usually separate capabilities in a single location to simplify and speed AI, BI and machine learning development and analysis.
Generally, data products, datasets, metadata, result caches and other assets that fuel decisions are all stored in different locations. For example, data products such as reports, dashboards and machine learning models are frequently organized and accessed through data catalogs, while metadata is stored in a semantic layer.
FlexQuery Analytics Lake instead combines the myriad data assets that organizations develop in a single location so they can be found and operationalized to fuel AI, machine learning and traditional business intelligence workloads.
In addition, the platform features generative AI capabilities, including a virtual assistant that provides summaries and enables users to ask questions of their data. Used together with GoodData's composable BI platform, those capabilities are intended to speed discovery, development and decision-making.
Essentially, FlexQuery Analytics Lake provides a single location for different analytics capabilities in the same way a data lake provides a single location for different data types, according to Roman Stanek, GoodData's founder and CEO.
That approach to data management and analytics, meanwhile, stands out and represents significant innovation, according to Donald Farmer, founder and principal at TreeHive Strategy.
"I think it is a real advancement," he said. "The converged architecture bringing together all core components of modern analytics -- storage, compute, metadata, reporting -- under one optimized platform is an innovative approach that could greatly simplify and streamline data workflows."
Some large vendors with full-featured data management and analytics capabilities offer something similar, Farmer continued. GoodData, however, is following the same path as an independent BI vendor.
"The others who are doing this are major vendors with a much more complete stack. So it's great to see an independent vendor taking this on," he said.
The platform
GoodData plans to roll out FlexQuery Analytics Lake in four stages. The first, which the vendor is calling FlexCache, was made generally available on Oct. 17.
FlexCache was built on the Apache Arrow in-memory columnar format and Intel's AVX-512 instruction set extensions, and it enables caching with no limitations on scale, which could result in substantial savings, according to Farmer.
By enabling organizations to cache data and BI assets in one location without limitations, GoodData aims to reduce query times and lower processing costs by reducing the need to query cloud data warehouses.
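The cost argument can be illustrated with a minimal sketch. It is not GoodData's implementation: `CachedQueryRunner` and `fake_warehouse` are hypothetical names, a plain dictionary stands in for the cache, and the warehouse call is simulated rather than billed.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, int]


class CachedQueryRunner:
    """Serve repeated queries from a local result cache (illustrative only)."""

    def __init__(self, run_on_warehouse: Callable[[str], List[Row]]) -> None:
        self._run_on_warehouse = run_on_warehouse
        self._cache: Dict[str, List[Row]] = {}
        self.warehouse_calls = 0  # how many times the warehouse was actually hit

    def query(self, sql: str) -> List[Row]:
        # Collapse whitespace so trivially different query strings share a key.
        key = " ".join(sql.split())
        if key not in self._cache:
            self.warehouse_calls += 1
            self._cache[key] = self._run_on_warehouse(sql)
        return self._cache[key]


def fake_warehouse(sql: str) -> List[Row]:
    # Hypothetical stand-in for an expensive cloud data warehouse round trip.
    return [("emea", 120), ("amer", 340)]


runner = CachedQueryRunner(fake_warehouse)
first = runner.query("SELECT region, SUM(sales) FROM orders GROUP BY region")
second = runner.query("SELECT region,  SUM(sales)  FROM orders GROUP BY region")
```

In this sketch the second request is answered from the cache, so only one warehouse call is made for two dashboard refreshes. A production cache would also need invalidation when the underlying data changes, which is omitted here.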
"By reducing dependency on cloud data warehouses for query processing, FlexQuery can possibly lead to substantial cost savings given cloud data warehouses are rather expensive in practice," he said.
Beyond FlexCache, FlexQuery Analytics Lake will feature three additional releases over the next 12 months, according to the vendor.
The second will feature governance capabilities including extract, transform and load (ETL) as well as data engineering tools. The third will focus on data source federation. The last will feature AI-driven intelligence capabilities, such as pre-caching.
The key to the whole platform, however, is that it enables organizations to easily organize vast numbers of analytics assets in one location, which can then be used to meet the demands of modern data-driven organizations, according to Stanek.
That includes AI and machine learning, which are gaining momentum as generative AI becomes a reality and organizations train large language models using their own data to inform decisions.
"The problem is, there are tons of analytics artifacts, like semantic models, caches and pre-calculated metrics, and there's no one place to put them," Stanek said. "That was okay when everyone had everything in their [BI platform]. But now we're at a confluence of machine learning, AI and BI. Having one place where all these [assets] are located is needed."
Without an environment such as FlexQuery Analytics Lake, developing a machine learning model takes much work, Stanek said.
In a typical workflow, a user would have to take data from their BI platform, export it to a CSV file, load that CSV file into RStudio or another development environment, do clustering within that development environment, return the transformed data to a CSV file, and then upload it back into a BI platform.
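The round trip described above can be sketched in a few lines of Python. The sample rows, the column names and the tiny one-dimensional two-means loop are all assumptions made for illustration; a real workflow would use files on disk and a proper clustering library.

```python
import csv
import io

# Step 1: the BI platform exports data as CSV (hypothetical sample rows).
exported = "customer,monthly_spend\na,10\nb,12\nc,95\nd,101\n"

# Step 2: a separate development environment parses the exported file.
rows = list(csv.DictReader(io.StringIO(exported)))
values = [float(r["monthly_spend"]) for r in rows]

# Step 3: cluster there. A small 1-D two-means loop stands in for real clustering.
low, high = min(values), max(values)
labels = [0] * len(values)
for _ in range(10):
    labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
    group0 = [v for v, label in zip(values, labels) if label == 0]
    group1 = [v for v, label in zip(values, labels) if label == 1]
    low = sum(group0) / len(group0)
    high = sum(group1) / len(group1) if group1 else high

# Step 4: write the labeled data back to CSV for re-import into the BI platform.
out = io.StringIO()
writer = csv.DictWriter(
    out,
    fieldnames=["customer", "monthly_spend", "segment"],
    lineterminator="\n",
)
writer.writeheader()
for row, label in zip(rows, labels):
    writer.writerow({**row, "segment": label})
reimport_csv = out.getvalue()
```

Every step converts the data between a file format and an in-memory structure, and the segment labels exist only in the final CSV until someone uploads it again.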
"It's a lot of copying and converting, which we thought was wrong," Stanek said. "There needs to be a place where all of these data files and models live regardless of who uses them."
Beyond providing a means to organize and locate troves of data and data products -- which will only grow as the volume of worldwide data grows exponentially and organizations' own data follows suit -- another goal of FlexQuery Analytics Lake is to provide an environment that caters to different personas within organizations, according to Stanek.
Because the platform features both low-code/no-code interfaces with generative AI capabilities and analytics as code, it can be accessed by analytics engineers, business analysts, machine learning engineers and self-service users.
Targeting those varied personas is significant because it has the potential to make data accessible to a broad audience, according to Farmer.
"Support for both expert and casual users helps drive broader data access and democratization," he said.
Analytics as code, meanwhile, is perhaps the platform's most significant differentiator, Farmer continued.
Analytics as code is the use of code to create and manage analytics products and workflows. While there's an emphasis on no-code and low-code interfaces to enable more people within organizations to work with data, code allows for more complex development and analysis.
Using an analytics-as-code approach, data teams are able to collaborate on data products using engineering tools such as GitHub and develop continuous integration/continuous delivery pipelines to automatically test and deploy changes.
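As a rough illustration of the idea, the sketch below keeps a metric definition in code and validates it with a check that a continuous integration job could run on every change. The `MetricDefinition` schema, the field names and the validation rules are hypothetical and do not reflect GoodData's actual definition format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical metric definition kept in version control."""

    metric_id: str
    title: str
    expression: str


# Assumed dataset fields that an expression is allowed to reference.
KNOWN_FIELDS = {"orders.amount", "orders.quantity"}


def validate(metric: MetricDefinition) -> List[str]:
    """Return a list of problems; an empty list means the definition passes."""
    problems = []
    if not metric.metric_id.isidentifier():
        problems.append(f"{metric.metric_id!r} is not a valid identifier")
    if not any(field in metric.expression for field in KNOWN_FIELDS):
        problems.append(f"{metric.metric_id!r} references no known field")
    return problems


revenue = MetricDefinition("total_revenue", "Total revenue", "SUM(orders.amount)")
broken = MetricDefinition("avg price", "Average price", "AVG(orders.price)")

revenue_problems = validate(revenue)
broken_problems = validate(broken)
```

Because the definitions are plain text, a change to `revenue` shows up as a reviewable diff, and a pipeline could refuse to deploy `broken` until its problems are resolved.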
"Analytics as code is perhaps the greatest differentiator and the thing that gets me personally most excited," Farmer said. "[It enables] faster and lower risk iterations as changes can be tested and rolled back easily. Also, manual, repetitive tasks can be automated to boost efficiency. And there are a lot of these tasks in the world of ETL, data quality and designing aggregations."
However, analytics as code could pose problems for some organizations, Farmer added. It's a different paradigm from traditional analytics development and, therefore, will take time for data teams to learn.
"Analytics-as-code may be a very promising paradigm for the future. But it's a steep learning curve for teams which have already deployed traditional analytics," Farmer said. "And that, in one form or another, is the majority of enterprises."
Another potential hurdle for FlexQuery Analytics Lake could be the complexity of migrating data products and other data assets to the platform, according to Farmer.
For customers just getting started with analytics who have no existing models or metadata files to migrate, there's no issue. But for those with extensive libraries and catalogs, data migration might not be simple.
"FlexQuery Analytics Lake may be a great offering for new implementations but it could be complex to migrate existing systems to this architecture," Farmer said. "Overall, I think the offering is excellent, but there are downsides."
Next steps
GoodData has a clear vision and schedule for updating and improving FlexQuery Analytics Lake.
The vendor outlined the four-release process in a blog post publicly revealing the platform for the first time -- there was no public statement during the testing or preview stages -- and it promised delivery over a 12-month period.
Ultimately, GoodData's vision for FlexQuery Analytics Lake is to provide an environment in which those who work with data within an organization can work together using BI products and other data-related features that previously would not have been found together, according to Stanek.
"The end goal is to support a collaborative approach where, for example, data engineers build new data models, datasets and data perspectives; someone else easily pick them up and starts using BI; and then embeds machine learning into it," he said. "There's a collision between tools, and that collision wouldn't be possible if they were stored in different places."
Farmer, meanwhile, said the roadmap for FlexQuery Analytics Lake is promising.
"The focus on performance, cost efficiency, openness and governance addresses key analytics challenges companies face today around scale, costs and data quality," he said. "The vision of a unified analytics platform optimized for both builders and consumers aligns well with modern data stack requirements."
In addition, because GoodData's new platform is an industry first among independent BI vendors, it will likely influence other independent vendors to develop similar tools that reduce the complexity of an organization's data operations, Farmer continued.
"GoodData has a first-mover advantage and will be attractive as an independent vendor rather than a stack vendor," he said.
However, GoodData would be prudent to make governance a priority as it continues to roll out more capabilities over the next 12 months, Farmer added. Beyond FlexQuery Analytics Lake, GoodData's traditional BI platform could benefit by supporting more cloud options.
"I'd like to see hybrid- and multi-cloud well supported," Farmer said. "Also given the comprehensive nature of [FlexQuery Analytics Lake], integration with data catalog solutions could make governance and policy management more effective."
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.