Snowflake demonstrates shift to AI with newest features
After a slow start to building an environment for developing AI applications, the vendor unveils new features that show it is catching up to its peers.
Snowflake on Tuesday introduced a set of capabilities designed to help customers easily develop and operationalize generative AI applications, including tools that enable users to quickly create chatbots trained on their organization's proprietary data.
In addition, among other features, Snowflake unveiled new governance capabilities, a tool aimed at improving workload performance and cost optimization, and a data catalog for Apache Iceberg data tables. The new capabilities were revealed during Snowflake Summit, the vendor's user conference in Las Vegas.
While some features are generally available, most -- including the tools that enable customers to develop chatbots so that they can ask questions of their data and easily build generative AI applications -- are still in the development and preview stages.
However, despite not yet being ready for general availability, the new capabilities demonstrate Snowflake's move toward providing customers with a development environment for generative AI that matches similar environments being developed by competitors, according to Matt Aslett, an analyst at ISG's Ventana Research.
Snowflake was somewhat slower to embrace generative AI than chief rival Databricks and tech giants such as Google and Microsoft. However, since the naming of Sridhar Ramaswamy as CEO to replace Frank Slootman in February, the vendor has been more aggressive, revealing a series of AI-related features such as an integration with Mistral AI and the launch of its own large language model (LLM).
"I have previously described the company's plans for generative AI as comparatively nascent," Aslett said. "Although many of the new capabilities are in preview at this stage, Snowflake is now in a much better position following these announcements to talk about the depth and breadth of its capabilities to address enterprise AI initiatives."
Based in Bozeman, Mont., but with no central headquarters, Snowflake is a data cloud vendor that enables customers to store, query and analyze data on a single platform. In addition, the vendor is now building up a development environment for AI where users can create, train and operationalize models and applications using their organization's own data.
On May 30, a blog post revealed that a threat actor using stolen credentials breached Snowflake customers, according to reports. Snowflake, however, denied that its tools were to blame for the breach.
Focus on GenAI
Generative AI has the potential to transform data management and analytics.
Data-informed decision-making has long been hampered by complexity. The need to write code to manage data has limited the number of employees within any organization who can work with data. Likewise, the need for data literacy training to interpret and analyze data has also limited that number.
Generative AI reduces that complexity by enabling true natural language interactions with data, which both lets nontechnical workers engage with data and helps trained experts be more efficient. As a result, many data management and analytics vendors have made generative AI a focus of their product development in the 18 months since OpenAI released ChatGPT.
Snowflake's first significant foray into generative AI was its May 2023 acquisition of Neeva, a search engine vendor co-founded by Ramaswamy. The next month, Snowflake unveiled initial generative AI development plans and followed that with the introduction of more generative AI features in November.
However, it wasn't until Ramaswamy became CEO that Snowflake's pace of generative AI development began to match that of its main competitors. Now, the vendor has a clear strategy for generative AI and continues to unveil aggressive product development plans.
"We've embraced AI in a pretty big way," Ramaswamy said on May 29 during a virtual press conference. "But we've done it in a very practical way. We want AI to become easy to use, and we want it to become trusted."
The chatbot development capabilities address making AI easy to use.
Snowflake Cortex Analyst and Snowflake Cortex Search -- both of which will soon be in public preview -- are APIs that aim to enable customers to develop chatbots in minutes that understand an enterprise's proprietary structured and unstructured data. Subsequently, the chatbots can be used to query data and develop applications in Snowflake.
Cortex Analyst, developed using the Mistral Large and Meta Llama 3 LLMs, lets users run queries and develop applications using structured data. Cortex Search uses vector search and ranking technology developed by Neeva to do the same using documents and other text-based data sets.
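For readers unfamiliar with the term, vector search generally means representing documents and queries as numeric vectors, known as embeddings, and ranking the documents by how similar they are to the query. The Python sketch below illustrates only that general idea. It is not Snowflake's or Neeva's implementation, and the tiny three-number embeddings, the document names and the function names are all hypothetical; a real system would get its embeddings from a trained model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=2):
    """Score every document against the query and return the top_k matches."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical, hand-made embeddings standing in for real model output.
DOCS = {
    "refund_policy": [0.9, 0.1, 0.0],
    "q2_sales_report": [0.1, 0.9, 0.2],
    "holiday_schedule": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a question such as "How do refunds work?"
query = [0.8, 0.2, 0.1]

results = rank_documents(query, DOCS)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

With these made-up numbers, the refund document ranks first because its vector points in nearly the same direction as the query's. Production systems typically combine this kind of similarity score with additional ranking signals.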
Once generally available, Cortex Analyst and Cortex Search will be significant additions for Snowflake users, changing how they explore data by simplifying data discovery and analysis, according to Kevin Petrie, an analyst at BARC U.S.
"Their Cortex chat capabilities ... will enrich how users explore and analyze multistructured data sets," he said. "This convergence of keyword search with GenAI will give users more confidence in the outputs because they can review the trusted source content alongside the bot's interpretation of it."
In addition to the chatbot development APIs soon to be in public preview, Snowflake will soon make Document AI and Snowflake Copilot generally available.
Snowflake Copilot is a text-to-SQL assistant that targets user productivity by letting users employ natural language rather than SQL code to carry out tasks. Document AI is a tool that enables users to extract text from documents so that the text can be queried and analyzed in natural language.
Beyond conversational capabilities, Snowflake's newly unveiled generative AI features include the following:
- Snowflake Cortex Guard, an LLM-based security tool that filters data sets and data products for harmful content, such as violence and hate, and notifies users of such content to help ensure data used to train models is safe and usable.
- Snowflake AI & ML Studio, a no-code interface now in private preview that aims to help accelerate the development of AI applications.
- Cortex Fine-Tuning, a feature now in public preview that enables developers to create personalized generative AI experiences for end users by customizing a subset of language models from Mistral AI and Meta.
- Machine learning operations (MLOps) capabilities, including the Snowflake Model Registry to govern access to AI models, the Snowflake Feature Store to manage and store machine learning features for model training, and ML Lineage to enable users to trace models and the data used to inform them throughout their lifecycle.
While Cortex Analyst and Cortex Search will perhaps have the biggest impact of all the new AI-related capabilities, those that enable Snowflake users to develop and manage AI applications are also important additions that help create a cohesive environment for development and analysis, according to Aslett.
"Cortex Analyst and Cortex Search will better enable Snowflake users to develop applications that support enterprise decision-making ... while Snowflake AI & ML Studio is designed to accelerate the development of AI applications, complemented by new MLOps capabilities," he said.
Other new capabilities
While Snowflake unveiled a spate of new AI-related capabilities during the conference, the vendor also introduced new features for its traditional data management platform, including new governance tools.
Snowflake Horizon, first introduced in November, is a governance layer that unifies compliance, security, privacy, interoperability and access controls in a single environment.
Now, the vendor is adding new capabilities to Horizon.
Internal Marketplace, currently in private preview, is a location where users can publish data products such as models, dashboards and reports so that they can easily be discovered and operationalized by others within the organization to inform their own work. In addition, Marketplace includes security measures that prevent unintended sharing with external systems and access controls that ensure only employees with proper credentials can work with their organization's data products.
Beyond Marketplace, new collaboration capabilities in Horizon include the sharing of AI models, soon to be in private preview, and sharing of Apache Iceberg tables and dynamic tables.
To further enable data discovery, Universal Search, which lets customers search not only Snowflake, but also Iceberg storage and data storage facilities from third-party providers, is now generally available. The feature was built using search engine technology from Neeva and enables customers to use natural language to find data products.
Meanwhile, in addition to the governance capabilities in Horizon, Snowflake unveiled the Polaris Catalog, a data catalog designed specifically for use with Apache Iceberg that enables users to index and organize data. Iceberg, an open source table format for managing large data sets in data lakes and lakehouses, is widely used to develop a foundation for organizations' data operations.
Snowflake plans to make Polaris Catalog an open source data catalog -- it is not yet open source -- according to Christian Kleinerman, the vendor's executive vice president of product. Meanwhile, tools such as Polaris Catalog, Universal Search and Internal Marketplace are designed to make it simpler for Snowflake users to find and operationalize their data, he continued.
"One of the key aspects of Snowflake's architecture ... is making it easier for customers to leverage the technology and get maximum value out of their data," Kleinerman said. "Our goal is to do away with the need for customers to know where to find what and [provide] a single, central experience to surface a set of data products that will help them with the task at hand."
Other new data cloud capabilities introduced during Snowflake Summit include the following:
- Snowflake Trail, a set of data observability capabilities that provide visibility into data quality as data progresses through pipelines and informs applications.
- The Snowflake Performance Index, a tool that measures the efficiency of the Snowflake platform as it attempts to reduce the time it takes to run queries and other workloads so that users can lower cloud computing expenditures.
- Snowflake Notebooks, a model and application development environment in public preview that integrates with Snowpark ML and Snowflake Cortex AI to provide an interface for the Python, SQL and Markdown coding languages.
- An integration with Git in public preview that enables users to improve collaboration during the development stage.
- Integrations with Alation, dbt Labs, Neo4j, Nvidia and Immuta, among others.
Just as Snowflake's recent tilt toward AI will make the vendor more competitive with its peers, governance capabilities such as Snowflake Trail and the Polaris Catalog are important advancements for the vendor, according to Aslett.
"Snowflake's investment in data governance is strengthening its value proposition as a strategic data platform, as illustrated by Snowflake Horizon, the Polaris data catalog and the Snowflake Trail observability capabilities," he said.
Petrie, meanwhile, noted that it's critical that Snowflake isn't randomly adding new features. It's adding features for developers that work together and complement one another to create a base for building advanced applications, including traditional AI and generative AI.
"Snowflake is right to focus on helping developers build AI applications," Petrie said. "Ultimately, GenAI, predictive ML and other models are features rather than solutions in themselves. To create business value, they need to be part of an integrated user experience."
And combined with the new AI-related capabilities, the governance tools, integrations and other features demonstrate that Snowflake is following through on its heightened focus on AI, he continued.
"This is a wide-ranging set of announcements that accelerate Snowflake's move into the AI space," Petrie said. "Snowflake and Databricks continue to collide with one another. ... While many of Snowflake's announced offerings are not generally available yet, they signal a high level of commitment to the AI and GenAI segment."
Looking ahead
Now that Snowflake has introduced capabilities that demonstrate a distinct focus on AI, the next important step is to make the features generally available, according to Aslett.
Not only is most of what Snowflake unveiled during its user conference not generally available, but much of it has also not even reached the public preview stage. To truly compete with Databricks and the tech giants, Snowflake needs to move its tools beyond the development stage.
"While these new announcements collectively represent a big step forward, other vendors are also aggressively investing in enhanced functionality," Aslett said. "It will be important for the company to bring these new features to general availability as soon as possible and invest in further differentiating features and functionality."
Petrie, meanwhile, noted that Snowflake needs to be careful as it simplifies AI development.
While reducing the need to code and providing AI assistants are important ways of enabling more employees within organizations to work with data and AI, data security and model accuracy also have to be primary focal points as Snowflake builds its development environment.
In particular, Petrie noted that ensuring experts are the primary developers of AI applications is critical.
"Snowflake says it democratizes AI by reducing the coding skills required to customize and deploy it," he said. "This is a risky proposition. AI models, especially GenAI, can generate inaccurate outputs that harm the business. To reduce risk, you need AI experts to lead these deployments. There is some evidence that companies are overcoming these risks, but not enough for mainstream adopters."
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.