Databricks launches generative AI tools, including assistant
The vendor's assistant and automatically generated metadata descriptions together enable users to work with data using natural language and receive accurate outputs in response.
After first unveiling plans to develop a generative AI assistant in July 2023, Databricks has now made the new feature generally available.
In addition to Databricks Assistant, the vendor revealed that AI-Generated Comments is also now available to all customers. Both are free to existing customers.
Databricks Assistant, similar to the generative AI chatbots being developed by many analytics and data management vendors, enables customers to use natural language rather than code to prepare and analyze data as well as develop data and AI products. AI-Generated Comments, which is part of the Databricks Unity Catalog, works with Databricks Assistant by using generative AI to describe data tables and columns to improve the accuracy of AI-generated responses.
While general availability is the expected next step after a tool such as Databricks Assistant is unveiled and moved to preview, making generative AI tools generally available is still significant, according to Kevin Petrie, an analyst at BARC U.S.
"Both AI Assistant and AI Comments are GA, which shows that we're moving past the stage of vendor hype into production offerings and mainstream adoption," he said.
The timing of Databricks' launch of its AI assistant is largely in step with the similar tools that have been unveiled by its fellow data platform vendors.
For example, Microsoft has made one available for Power BI in Fabric, and AWS has made Q available across multiple tools. But Snowflake's AI assistant is still in preview, as is an AWS assistant for Redshift and Google's Gemini in BigQuery.
However, given that most AI assistants are similar and the era of generative AI has only just begun, whether one vendor gets their AI assistant to market faster than another is largely immaterial, according to Doug Henschen, an analyst at Constellation Research.
"Databricks is generally very good about getting to general availability within three to six months of announcing a new feature," he said. "It's still early days for GenAI. I doubt people will remember or that it will make a huge competitive difference a year from now which vendor went GA first."
Based in San Francisco, Databricks is a data platform vendor that helped pioneer the data lakehouse with its development of the Delta Lake storage format. Lakehouses combine the structured data storage capabilities of warehouses with the unstructured data storage capabilities of lakes, enabling users to combine data to gain a more competitive view of their organization.
Over the 19 months since OpenAI's launch of ChatGPT showed a significant improvement in the capabilities of large language models, Databricks has expanded its platform to include an environment for developing generative AI models and applications.
Toward that end, in June 2023, Databricks acquired MosaicML, which now forms the foundation for Databricks' AI and machine learning capabilities. Most recently, at its user conference on June 12, the vendor introduced new features to improve model accuracy, improve data and AI security, and lower the cost of developing generative models and applications.
Simplification and efficiency
Data management and analytics vendors have long sought ways to make their platforms easy to use and accessible to any employee within an organization who can benefit from data.
The complexity of their tools, which usually require coding to carry out tasks along with data literacy training to interpret outputs, hindered that widespread use. Despite the advent of natural language processing (NLP) and low-code/no-code capabilities, only about a quarter of employees within organizations actively use analytics to inform their work.
Generative AI has the potential to change that by enabling the true natural language interactions that past NLP tools, which had limited vocabularies and still required training and expertise to use, could not. Just as significant, generative AI tools can make those who already have expertise working with data more efficient by reducing the amount of code they need to write to carry out tasks.
As a result, Databricks and many other data management and analytics vendors now make generative AI a focal point of product development.
Beyond Databricks' rival data platform vendors, such as Snowflake, more specialized vendors such as Alteryx, Alation, Informatica, MicroStrategy, Qlik and Tableau are some of the many that are developing generative AI tools to help customers manage and analyze data.
One of the most common capabilities vendors have identified as a means of making data management and analytics easier is an AI-powered assistant that enables true NLP. Databricks Assistant is one such tool. It was in development and preview for just under a year before being made generally available on June 27.
The tool accesses an individual enterprise's metadata to develop a semantic understanding of that enterprise, which subsequently enables users to ask questions related specifically to the enterprise and Databricks Assistant to deliver accurate responses.
Databricks Assistant is now on every page within the Databricks environment, rather than just a few specific locations, and can be used to help with tasks such as data discovery and modeling. Eventually, it will also generate AI- and BI-fueled dashboards and charts, though that feature is now in preview.
In addition, the tool adheres to the security and governance standards organizations establish in Unity Catalog and meets the compliance standards set forth by highly regulated industries, Databricks said.
"The Databricks Assistant is your typical GenAI assistant," Henschen said.
Like most other AI assistants developed by data management vendors, Databricks Assistant is aimed at administrators, data scientists, data engineers and power users, he continued.
"The headline benefits are time savings and improvements in productivity," Henschen said.
According to Databricks, which unveiled the general availability of the AI assistant and AI-Generated Comments in a blog post, Databricks Assistant had 150,000 active monthly users while it was in public preview, with those users reporting productivity increases of up to 50%.
Given that potential for widespread use and improved efficiency, making Databricks Assistant generally available is notable, according to Petrie.
However, given that Databricks is providing the tools for free rather than charging extra for their use, it might signify that at this early stage of generative AI development, providing generative AI capabilities is as much about marketing as it is about generating profits, he said.
In addition, Petrie noted that while NLP tools reduce the technical skills required to work with data, it's vital that all AI-generated work is checked for accuracy.
"Databricks says these GenAI features will democratize data management by reducing the technical skills -- coding in particular -- required to build pipelines and document artifacts. While this is true, I believe all these features still need vigilant oversight, inspection and fact-checking by expert humans," he said. "Otherwise GenAI runs the risk of putting errors into production."
While Databricks Assistant is the generative AI-powered interface through which customers can engage with data, AI-Generated Comments is a behind-the-scenes feature that provides the relevant information to inform responses.
By adding descriptive comments for tables and columns, customers can improve the accuracy of generative AI-powered outputs. However, adding such comments is a laborious process when done manually. AI-Generated Comments uses AI to automatically fill those descriptions.
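To illustrate why table and column descriptions matter, the following is a minimal Python sketch, not Databricks' implementation. It assumes an assistant receives schema metadata as plain text, and it shows how comments add context that bare column names lack. The `Table` and `Column` classes and both helper functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """Hypothetical column record: name, type and an optional description."""
    name: str
    data_type: str
    comment: str = ""


@dataclass
class Table:
    """Hypothetical table record with an optional description."""
    name: str
    comment: str = ""
    columns: list[Column] = field(default_factory=list)


def describe_table(table: Table) -> str:
    """Render a table and any comments it has as plain text for a prompt."""
    header = f"Table {table.name}"
    if table.comment:
        header += f": {table.comment}"
    lines = [header]
    for col in table.columns:
        line = f"- {col.name} ({col.data_type})"
        if col.comment:
            line += f": {col.comment}"
        lines.append(line)
    return "\n".join(lines)


def missing_comments(table: Table) -> list[str]:
    """List the table and columns that still lack a description."""
    missing = [] if table.comment else [table.name]
    missing += [f"{table.name}.{c.name}" for c in table.columns if not c.comment]
    return missing
```

In this sketch, `missing_comments` flags the items someone would otherwise have to document by hand, which is the gap an automated description feature is meant to close.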
Next steps
With Databricks Assistant and AI-Generated Comments now generally available, Henschen said Databricks would be wise to continue adding generative AI capabilities that make its tools easier to use.
Databricks' platform was aimed at technical experts when the vendor first launched it about a decade ago. Its users were often data scientists developing complex models and applications rather than self-service business analysts.
Rival Snowflake, however, made ease of use a priority when first developing its platform. Even though Snowflake didn't prioritize generative AI as quickly as Databricks, it has an advantage when appealing to potential new customers because its platform is accessible to users without technical expertise.
Therefore, the more Databricks can do to simplify use of its tools -- building on recently revealed support for serverless operation to simplify administration -- the better.
"The Assistant could help to democratize use of the platform, so more GenAI capabilities aimed at business users -- particularly in the AI/BI vein -- would be helpful," Henschen said.
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.