Dremio adds text-to-SQL tool, new partnerships to fuel GenAI
The latest update from the data lakehouse specialist includes text-to-SQL translation capabilities and new partnerships that help create an ecosystem for GenAI development.
Capabilities that address speed, cost control, ease of use and an open environment for developing generative AI models are all part of the latest platform update from Dremio.
The vendor's new features, revealed on May 2 during a live user event in New York City, include expansion of its Apache Iceberg-backed data lakehouse to suit any deployment environment; generative AI tools, including text-to-SQL translation; and integrations that enable AI model development.
Taken together, the new features have the potential to substantially benefit Dremio customers, according to Stephen Catanzano, an analyst at TechTarget's Enterprise Strategy Group. In particular, the integrations will be beneficial given that Dremio's data lakehouse platform stores data, but needs to connect to other systems for analysis.
"It's an important announcement to say they have interoperability with leading analytics companies and others in the GenAI space," Catanzano said. "The lakehouse is really a data repository, and they have some capabilities internally, but need to partner to enable GenAI and analytics to deliver solutions."
Based in Santa Clara, Calif., Dremio is a data lakehouse vendor whose tools combine the structured data management capabilities of data warehouses with the unstructured data management capabilities of data lakes. Because lakehouses enable users to combine disparate types of data to create large data sets that provide a complete view of an organization's operation, they are one of the preferred repositories for data that can be used to train AI models and applications, including generative AI.
In addition to Dremio, Databricks is a lakehouse specialist, while tech giants including Microsoft and Google offer data lakehouses as part of their widespread data management and analytics offerings.
New capabilities
Generative AI, which enables true natural language interactions with data, has been a significant focus for many tech vendors, including data management and analytics specialists, over the past 18 months.
True natural language processing (NLP) has the potential to enable more people within organizations to work with data than was possible when using data management and analytics platforms required knowing how to write code. In addition, it has the potential to help data experts become more efficient by reducing time-consuming tasks.
True NLP, however, wasn't possible before large language models such as ChatGPT from OpenAI and Google Gemini were developed with vocabularies as large as or larger than a dictionary and the ability to understand intent. With enterprises now able to choose LLMs from not only OpenAI and Google, but also Anthropic, Mistral and Cohere -- among many others -- data management and analytics vendors have made integrating with such platforms a priority.
Such integrations make possible capabilities including AI assistants and text-to-code translators that simplify use of the vendors' platforms, as well as environments where customers can develop their own generative AI models and applications.
Dremio first unveiled generative AI features in June 2023, including the preview of text-to-SQL translation capabilities, a vector lakehouse and an autonomous semantic layer.
Now, the vendor has made generally available both text-to-SQL translation capabilities that enable intuitive natural language queries and responses, and generative AI-driven data descriptions and labeling that help users quickly discover and curate data for analysis. In addition, Dremio unveiled partnerships with AI and machine learning specialists Dataiku and DataRobot aimed at helping customers build AI models and applications.
While text-to-SQL translation is important, and generative AI capabilities are becoming must-haves for any vendor, the partnerships with Dataiku and DataRobot could be even more significant for Dremio users, according to Catanzano.
Generative AI capabilities let users more easily interact with data. Partnerships with AI specialists enable those users to build and manage the models and applications that lead to deep analysis.
"I think the most significant update is the expansion of partners relating to GenAI," Catanzano said. "As a lakehouse, they need interoperability with leading ... solutions like DataRobot."
The impetus for developing the new generative AI capabilities, meanwhile, came from Dremio's goal to provide self-service capabilities to its customers, according to James Rowland-Jones, the vendor's vice president of product management.
Past integrations with self-service analytics platforms such as Tableau and Microsoft Power BI, as well as developing low-code/no-code capabilities, were aimed at enabling Dremio users to engage with data without needing the expertise of a data engineer or data scientist, he noted.
Text-to-SQL translation and generative AI-driven descriptions and labeling add new ways for customers to easily work with data.
"These capabilities now enable a whole new user base that is not SQL-savvy to discover data and ask questions they could never easily ask before, and derive value and insights from their data," Rowland-Jones said.
Beyond new generative AI features and partnerships with AI vendors, new Dremio capabilities include the following:
- An expansion of the vendor's Apache Iceberg-based lakehouse so that it can be deployed in any environment, whether cloud, on-premises or hybrid.
- New security measures for the Apache Iceberg lakehouse, including air-gapped network security, so that it can be deployed in highly regulated environments.
- Partnerships with Vast Data to introduce the Zero Trust Lakehouse Platform with Dremio, and with StackIt to provide European organizations with a fully managed open lakehouse platform.
- Improvements to Dremio's SQL engine for Iceberg aimed at accelerating query speeds, including recommendations for optimal usage and scheduled use of Reflections to ensure the inclusion of up-to-date data.
- The incorporation of the open source Project Nessie's transactional catalog with Dremio's software to simplify data engineering.
- Support for AWS Graviton processors aimed at helping customers improve performance, which can lead to cost savings.
All but the support for AWS Graviton are generally available. Support for AWS Graviton is in preview.
With the addition of generative AI capabilities along with partnerships that better enable customers to develop AI models and applications, Dremio is providing capabilities in line with those of competitors such as Databricks, according to Catanzano.
However, Databricks is developing more of its own tools than Dremio is, he noted. For example, Databricks has built two LLMs -- Dolly and DBRX -- and its Model Serving environment enables users to deploy and manage AI models. Dremio, on the other hand, is using partnerships to provide users with integrations that enable similar development, deployment and management capabilities.
"Some competitors like Databricks partner with everyone and then also have a high level of internal capabilities for AI and are continuing to expand on them," Catanzano said. "Dremio is accomplishing the same capabilities, but with their partners."
Next steps
As Dremio plots its roadmap, enabling self-service data management and analysis remains a theme for the vendor, according to Rowland-Jones. In addition, performance optimization and ease of use are ongoing areas of focus.
Catanzano, meanwhile, suggested that Dremio would be wise to develop more of its own capabilities to support development and deployment of AI models, especially generative AI models.
Just as Databricks has not only established partnerships with vendors to expand model development and management, but also built tools of its own, Dremio could complement external partnerships with more internal product development.
"Similar to the Databricks model, [Dremio should] continue to build or acquire more internal capabilities to support the consolidation of data for ... foundations used in AI and other related AI technologies," Catanzano said.
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.