New Hammerspace capability sets up enterprise AI
The latest update to Hammerspace's software-defined storage system removes metadata from the data path, enabling HPC workloads to use data from any storage source without specialized hardware.
Hammerspace's latest update shifts metadata processing out of the storage stack's data path, letting enterprises use almost any storage they already have for demanding HPC tasks such as AI and ML model development.
Hyperscale NAS, the new capability from the software-defined global file system vendor, aims to add flexibility and speed for customers who want to feed data into HPC over NAS. Customers can use existing off-the-shelf storage hardware or the cloud, with performance that scales linearly to thousands of storage nodes, according to Hammerspace.
This capability is available today for all Hammerspace customers at no additional cost. Hammerspace is priced by the total amount of data under management.
Enterprises are looking to develop AI or ML models, but most of their usable data is stored in unstructured silos that can be slow to access, said Dave Raffo, an independent storage analyst. Bridging those silos with the GPUs powering HPC is an expensive proposition and a technical challenge.
Hammerspace uses metadata tags and a unified namespace to unearth that data, but enabling data from a variety of storage media for HPC shifts the product's focus away from data management and into storage infrastructure, Raffo said.
"They're setting the table for AI," Raffo said. "Up until now, these [NAS] products have talked more about collaboration and management. Now it looks like [Hammerspace] is trying to be more of a [HPC] player."
Open standard bearer
Hyperscale NAS combines a handful of open storage standards to deliver what Hammerspace considers parallel file system performance over standard NAS protocols.
The standards include the Hammerspace-developed Parallel NFS (pNFS) Flexible File Layout, NFS 4.2 and NFS 3.0.
pNFS, which runs a Linux-based metadata control server, breaks metadata processing out of the data path between the storage nodes and the HPC nodes. NFS 3.0, a standard widely supported on enterprise hardware, moves the storage data into HPC, while metadata traffic is synced over NFS 4.2.
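To make that division of labor concrete, here is a minimal Python sketch of the pattern the flex-files layout enables: a metadata server answers a small layout query, and bulk reads then go straight to the storage nodes. The class and function names below are illustrative stand-ins, not Hammerspace's or the NFS client's actual API.

```python
# Illustrative sketch (not Hammerspace code) of the pNFS flex-files idea:
# a metadata server hands out a "layout" describing which data servers
# hold which byte ranges, and the client reads those ranges directly
# from the storage nodes, keeping bulk I/O off the metadata path.
from dataclasses import dataclass


@dataclass
class LayoutSegment:
    data_server: str  # storage node that serves this byte range
    offset: int       # starting offset within the file
    length: int       # number of bytes held by that node


class MetadataServer:
    """Stands in for the NFS 4.2 metadata control path: it only
    answers layout queries and never moves file data itself."""

    def __init__(self, layouts: dict[str, list[LayoutSegment]]):
        self._layouts = layouts

    def get_layout(self, path: str) -> list[LayoutSegment]:
        return self._layouts[path]


class DataServer:
    """Stands in for an ordinary NFS 3.0 storage node."""

    def __init__(self, blocks: dict[tuple[str, int], bytes]):
        self._blocks = blocks  # keyed by (path, offset)

    def read(self, path: str, offset: int, length: int) -> bytes:
        return self._blocks[(path, offset)][:length]


def read_file(mds: MetadataServer,
              data_servers: dict[str, DataServer],
              path: str) -> bytes:
    """One small metadata round trip, then direct reads from each
    storage node -- the pattern that lets throughput scale with the
    number of data servers rather than with the metadata server."""
    parts = []
    for seg in mds.get_layout(path):
        node = data_servers[seg.data_server]
        parts.append(node.read(path, seg.offset, seg.length))
    return b"".join(parts)


if __name__ == "__main__":
    mds = MetadataServer({
        "/data/model.bin": [
            LayoutSegment("node-a", 0, 5),
            LayoutSegment("node-b", 5, 5),
        ],
    })
    nodes = {
        "node-a": DataServer({("/data/model.bin", 0): b"hello"}),
        "node-b": DataServer({("/data/model.bin", 5): b"world"}),
    }
    print(read_file(mds, nodes, "/data/model.bin"))  # b'helloworld'
```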
There are options from other hardware or software-defined storage vendors to move metadata away from the data path. But many of these have legacy protocols or hardware sales to consider, said Marc Staimer, president and founder of Dragon Slayer Consulting.
"Anybody can come up with their own hyperscale NAS," Staimer said. "[Hammerspace] is using open standards, but it's how they're using them that's innovative."
Not having a hardware component in its storage product enables the Hammerspace software to remain flexible on a variety of hardware, he added.
"Hammerspace doesn't care if they sell you the storage," Staimer said. "For someone else to do this in the NAS space, they'd have rewrite their whole stack."
Parallel progress
Parallel file systems have long been used for HPC needs, said Ray Lucchesi, president and founder of Silverton Consulting. Compared with traditional NAS, these systems break data into smaller chunks distributed across multiple drives. Chunking the data opens multiple concurrent I/O paths, increasing performance to maximize HPC capabilities.
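As a rough illustration of that chunking idea, and not any particular file system's implementation, the Python sketch below stripes a byte string across several stand-in "drives" and reads it back over concurrent paths. The chunk size and the in-memory lists are invented for the example.

```python
# Toy sketch of striping in a parallel file system: a file is split
# into fixed-size chunks, spread round-robin across several "drives,"
# and read back over multiple concurrent I/O paths.
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # bytes per stripe unit; real systems use far larger chunks


def stripe(data: bytes, num_drives: int) -> list[list[bytes]]:
    """Distribute fixed-size chunks round-robin across drives."""
    drives: list[list[bytes]] = [[] for _ in range(num_drives)]
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        drives[(i // CHUNK_SIZE) % num_drives].append(chunk)
    return drives


def read_striped(drives: list[list[bytes]]) -> bytes:
    """Read every drive concurrently, then reassemble in stripe order."""
    with ThreadPoolExecutor(max_workers=len(drives)) as pool:
        # list(d) stands in for reading a whole device; each drive is
        # read on its own thread, i.e. its own I/O path.
        per_drive = list(pool.map(lambda d: list(d), drives))
    out, i = [], 0
    while any(i < len(d) for d in per_drive):
        for d in per_drive:
            if i < len(d):
                out.append(d[i])
        i += 1
    return b"".join(out)


if __name__ == "__main__":
    original = b"parallel file systems chunk data across drives"
    drives = stripe(original, num_drives=3)
    assert read_striped(drives) == original
```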
Parallel file systems such as the open source Lustre and Panasas PanFS are available, but both require specific storage system configurations or hardware, he said. Hammerspace's approach meets the performance demands of HPC that enterprises now need for AI development while avoiding a full teardown of existing storage infrastructure.
"The [storage] requirements for the enterprise are approaching HPC performance levels," Lucchesi said. "It's like a perfect storm. OpenAI releases ChatGPT and all of the sudden the need for [high-performing storage] in the enterprise becomes apparent."
Lucchesi and Staimer agree that the capability enables storage services on Hammerspace to perform on par with, or better than, vendors such as Vast Data or WEKA, with greater scalability.
"Hammerspace had a blank page when they started this discussion years back," Lucchesi said. "They've gone down the path to implement this from the start."
Tim McCarthy is a journalist from the Merrimack Valley of Massachusetts. He covers cloud and data storage news.