Cohesity exec: Gaia will 'unlock' data for AI applications

In this Q&A, Cohesity's Greg Statton explains how the Gaia offering connects data and AI services, with more capabilities coming following the acquisition of Veritas last year.

Backup vendor Cohesity had a very busy 2024.

The vendor debuted Cohesity Gaia last February and added new features and capabilities throughout the year. Cohesity also acquired its former competitor Veritas to become the largest data backup company in the market, according to industry analysts.

Cohesity Gaia is a conversational generative AI assistant that was added to the Cohesity Data Cloud (CDC) platform in February 2024. Gaia -- which stands for "general AI application" -- indexes customer data stored in the CDC through the Microsoft Azure OpenAI large language model service to create reports, audits, knowledge bases and other written content.

Gregory Statton, vice president of AI solutions at Cohesity, is leading the company's AI efforts. Statton sees Gaia as the start of a larger AI future for Cohesity, tying the backup platform into AI ecosystems beyond Gaia's chatbot interface.

Informa TechTarget spoke with Statton about Gaia's evolution in the coming year, how the merger with Veritas has affected development and how AI efforts can remain relevant even as popular opinion turns against the market buzzword.

Editor's note: This interview has been edited for clarity and length.

What does the 2025 roadmap for Cohesity Gaia look like?

Gregory Statton: When we tell our clients that they can ask questions of their data, they say, 'That's cool. But what should I ask of my data?'

We recently released this ability [called] Explore Topics to visually explore the themes [of their data]. [The capability] takes all the themes, clusters them and labels them, so the customer can explore a word cloud and start to generate sample questions.

It's sparked some interesting innovations that our clients have built on top of this. We have one client who's using the [Explore Topics] output to help them gain a better understanding of the important data and where it's stored in their environment. They found that they could start saving significantly on their cloud storage bill by rearchitecting where the data lives based on what's in the data.

We're going to continue to innovate here, especially after we consolidated with Veritas. There's still a lot of work to integrate the data from the two companies, which starts to unlock hundreds of exabytes of data into our platform.

There's been a handful of AI and ML [machine learning] tools that Veritas has created and released that we're integrating into the Cohesity [AI] ecosystem.

How has your team and the development of AI services at Cohesity changed since the acquisition of Veritas last year?

Statton: The teams integrated very quickly. We've been working hand in hand since Day 1 of the [deal] closure, and we've already begun putting together plans on what it's going to take to fully integrate the data between the two platforms. Selfishly, I want to be able to unlock all of that data so we can use it in some of our AI applications, but that work is well underway and making great progress.

There are suites of tools Veritas has worked on in the AI space, especially around things like operational insights and conversational reporting. We're taking a look at what they've done, and we're integrating it backward into Cohesity.

What's your future development vision for Gaia and other Cohesity AI offerings?

Statton: We collect a lot of metadata, like file size, access time and file permissions over time. Now, with the generative AI [capabilities], we've captured the semantic embeddings and topic analysis of this data.

What we're looking to do is leverage all this and provide more robust information [about the data]. This could plug in upstream to other data warehousing tools. This could help security teams identify files from a forensic analysis perspective. It's no longer about working with these pieces of metadata in an isolated workflow.

[The future development] is around the [partner] ecosystem. [We see] our ecosystem as providing tools, services and platforms to help our customers deliver AI to the business. Cohesity has always been API first, and being able to plug API services into agentic flows is one of the reasons why these systems are becoming so popular.

We're working with some of the major [AI] services to build native integrations. What I hope to achieve is that data being managed and served from Cohesity helps drive business decisions and outcomes for the future.

People live in Slack, in Teams, in Copilot and other agents. We should bring the data to where they're operating. Eventually, we want to get in the space where we can provide tools and services to [AI] model builders. [Preparing] data is a very manual process today, so one of our goals is to provide native integration into platforms and tools that model builders use day to day.

How have you explained Cohesity's forays into AI to customers who see the platform exclusively for backups?

Statton: We're traditionally a backup and data security company. People always say, 'What are you doing in this space? Why are you saying you're doing anything with AI? You're just jumping on the bandwagon.'

One of the most important assets for making AI better is data. One of the biggest problems historically in data science and machine learning has been access to high-quality data. Enterprise information retrieval is a core problem across AI, but that's where we position ourselves.

[Cohesity is] backing up the enterprise data across all our applications and across time. People want to bring enterprise data to AI applications, but they're squeamish on the idea of bringing the data to another SaaS environment.

With a platform like Cohesity, we're already protecting the data either on-premises or in the cloud. There's a whole lot of security [protocols] that we've already implemented.

What we do with Cohesity Gaia when we create these embeddings [for these services] is we leave that data in our secure data plane. In the not-too-distant future, we'll release the ability for the customer to bring their own language model.

Tim McCarthy is a news writer for Informa TechTarget covering cloud and data storage.
