Vast Data looks beyond storage to the data lakehouse
Vast Data, a storage software vendor, introduces a new data management platform that unites its existing software capabilities.
Vast Data doesn't just want to pool enterprise data but also plans to dredge for new data lakehouse capabilities.
Vast, a software vendor specializing in all-flash storage capabilities, is unveiling its new Vast Data Platform, which introduces data management features to ingest and analyze unstructured data for business intelligence, AI and more.
The Vast Data Platform uses general data management standards and connections, such as SQL databases with NAS and object storage, for an enterprise data lakehouse capability, according to Vast.
Enterprises continue to struggle to extract value out of the ever-growing reams of unstructured data, said Steve McDowell, analyst and founding partner of NAND Research. Bringing lakehouses and other organizing capabilities directly into the Vast storage environment could eliminate some technical and cost overhead.
"They're compressing and bringing all of the elements you need in the storage stack," McDowell said. "Nobody understands your data better than your storage."
From data storage to data management
The Vast Data Platform has always been a planned product, said Jeff Denworth, co-founder and chief marketing officer at Vast.
"We want to give [customers] the platform where you have the infrastructure that can create new information," Denworth said. "We think this, at its core, represents a massive consolidation [of technologies]."
The Vast Data Platform includes two existing, newly rebranded software components: DataStore, which provides its storage capabilities, and DataSpace, the global storage namespace and associated system management capabilities. It also includes DataBase, the new data management capability.
The company also plans to add to the platform in 2024 with Vast DataEngine, which company executives said would act as "a global function engine that consolidates data centers and cloud regions into one global computational framework." This capability will add support for programming languages such as SQL and Python, and an event notification system designed to help customers build their own AI frameworks or large language models.
The consolidation of tech should lead to savings, according to Vast, which estimated Vast DataBase would cost $0.02 per gigabyte on NVMe drives compared to $0.17 per gigabyte for Google Bigtable on SSDs.
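To put the quoted rates in perspective, the arithmetic can be sketched at petabyte scale. This is a rough illustration only: it assumes a decimal petabyte (1,000,000 GB) and that both per-gigabyte figures cover the same billing period, which Vast's estimate as reported here does not specify. The function and constant names are hypothetical.

```python
# Per-gigabyte rates as quoted by Vast; the billing period is not stated.
VAST_DATABASE_USD_PER_GB = 0.02    # Vast DataBase on NVMe drives
GOOGLE_BIGTABLE_USD_PER_GB = 0.17  # Google Bigtable on SSDs

# Assumption for illustration: decimal petabyte, 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000


def cost_usd(capacity_gb: int, usd_per_gb: float) -> float:
    """Total cost in U.S. dollars for a capacity at a flat per-gigabyte rate."""
    return round(capacity_gb * usd_per_gb, 2)


def price_ratio(higher_rate: float, lower_rate: float) -> float:
    """How many times more expensive the higher rate is than the lower one."""
    return round(higher_rate / lower_rate, 2)


if __name__ == "__main__":
    print(cost_usd(GB_PER_PB, VAST_DATABASE_USD_PER_GB))
    print(cost_usd(GB_PER_PB, GOOGLE_BIGTABLE_USD_PER_GB))
    print(price_ratio(GOOGLE_BIGTABLE_USD_PER_GB, VAST_DATABASE_USD_PER_GB))
```

Under those assumptions, one petabyte works out to $20,000 at Vast's quoted rate versus $170,000 at the Bigtable rate, a difference of 8.5 times. Real-world pricing would also depend on factors the estimate does not break out, such as hardware, staffing and data reduction.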
"It's just one scalable, transactional and analytical database management system ... that is integrated with your file and object storage system," Denworth said.
Vast DataBase will support the data importers Apache Parquet, Apache Spark and Trino alongside query interfaces such as Databricks, Presto, HPE Ezmeral and Vast's own REST API. Specific features will include remote edge caching, consistent global snapshots, multi-tenancy and other enterprise-mandated features.
The service will be sold as part of the on-premises hardware packages offered by Vast and through cloud hyperscalers AWS, Microsoft Azure and Google Cloud Platform.
Costs all float down here
McDowell expects storage vendors will continue to add lakehouse, warehouse or other advanced data management capabilities to their hardware and software in the future, especially as the enterprise looks to draw more value from unstructured data.
Vast Data Platform has no managed service component, requiring an enterprise to build and sustain its own data lake, unlike offerings from Snowflake or Databricks, according to McDowell.
Other storage vendors, including Dell Technologies, Weka and NetApp, will add data management components, McDowell said, but they will need to build out their software development capabilities as well.
"There's not a lot of companies that have a sophisticated storage stack," he said. "I wouldn't be surprised if other [storage vendors] follow."
Vast's customers, which tend to have petabytes or exabytes of data, could take advantage of the new data lakehouse capabilities, said Marc Staimer, president and founder of Dragon Slayer Consulting.
The challenge, though, will come from the personnel and hardware costs of maintaining the growing lake. Today, many enterprises instead choose to offload these burdens to the cloud through a SaaS offering, which can lead to slower performance but less operational overhead.
"The problem is you own the data lake [with Vast]," Staimer said, adding that the overhead only becomes more complex as the customer adds more features and data sources.
Transitioning into data management might give Vast some differentiation in the storage market, according to Scott Sinclair, an analyst at TechTarget's Enterprise Strategy Group. But data management platforms are a mature market, leaving growth opportunities limited to just the standouts.
"They may be able to deliver the best mouse trap in this environment, but they're opening themselves up to new types of buyers and competitors," Sinclair said. "They can't be 20% better, they need to be two times better."
Tim McCarthy is a journalist from the Merrimack Valley of Massachusetts. He covers cloud and data storage news.