SwiftStack 7 storage upgrade targets AI, machine learning use cases
SwiftStack's latest upgrade extends its ProxyFS file system to the edge and its cloud namespace to file data, catering to artificial intelligence and machine learning workloads.
SwiftStack turned its focus to artificial intelligence, machine learning and big data analytics with a major update to its object- and file-based storage and data management software.
The San Francisco software vendor's roots lie in the storage, backup and archive of massive amounts of unstructured data on commodity servers running a commercially supported version of OpenStack Swift. But SwiftStack has steadily expanded its reach over the last eight years, and its 7.0 update takes aim at the new scale-out storage and data management architecture the company claims is necessary for AI, machine learning and analytics workloads.
SwiftStack said it worked with customers to design clusters that scale linearly to handle multiple petabytes of data and support throughput of more than 100 GB per second. That allows it to handle workloads such as autonomous vehicle applications that feed data into GPU-based servers.
Marc Staimer, president of Dragon Slayer Consulting, said throughput of 100 GB per second is "really fast" for any type of storage and "incredible" for an object-based system. He said the fastest NVMe system tests at 120 GB per second, but it can scale only to about a petabyte.
"It's not big enough, and NVMe flash is extremely costly. That doesn't fit the AI [or machine learning] market," Staimer said.
This is the second object storage product launched this week with speed not normally associated with object storage. NetApp unveiled an all-flash StorageGrid array Tuesday at its Insight user conference.
Staimer said SwiftStack's high-throughput "parallel object system" would put the company into competition with parallel file system vendors such as DataDirect Networks, IBM Spectrum Scale and Panasas, but at a much lower cost.
New ProxyFS Edge
SwiftStack 7 plans to introduce a new ProxyFS Edge containerized software component next year to give remote applications a local file system mount for data, rather than having to connect through a network file serving protocol such as NFS or SMB. SwiftStack spent about 18 months creating a new API and software stack to extend its ProxyFS to the edge.
Founder and chief product officer Joe Arnold said SwiftStack wanted to utilize the scale-out nature of its storage back end and enable a high number of concurrent connections to go in and out of the system to send data. ProxyFS Edge will allow each cluster node to be relatively stateless and cache data at the edge to minimize latency and improve performance.
SwiftStack 7 will also add 1space File Connector software in November to enable customers that build applications using the S3 or OpenStack Swift object API to access data in their existing file systems. The new File Connector is an extension to the 1space technology that SwiftStack introduced in 2018 to ease data access, migration and searches across public and private clouds. Customers will be able to apply 1space policies to file data to move and protect it.
Arnold said the 1space File Connector could be especially helpful for media companies and customers building software-as-a-service applications that are transitioning from NAS systems to object-based storage.
"Most sources of data produce files today and the ability to store files in object storage, with its greater scalability and cost value, makes the [product] more valuable," said Randy Kerns, a senior strategist and analyst at Evaluator Group.
Kerns added that SwiftStack's focus on the developing AI area is a good move. "They have been associated with OpenStack, and that is not perceived to be a positive and colors its use in larger enterprise markets," he said.
AI architecture
A new SwiftStack AI architecture white paper offers guidance to customers building out systems that use popular AI, machine learning and deep learning frameworks, GPU servers, 100 Gigabit Ethernet networking, and SwiftStack storage software.
"They've had a fair amount of success partnering with Nvidia on a lot of the machine learning projects, and their software has always been pretty good at performance -- almost like a best-kept secret -- especially at scale, with parallel I/O," said George Crump, president and founder of Storage Switzerland. "The ability to ratchet performance up another level and get the 100 GBs of bandwidth at scale fits perfectly into the machine learning model where you've got a lot of nodes and you're trying to drive a lot of data to the GPUs."
SwiftStack noted distinct differences between the architectural approaches that customers take with archive use cases versus newer AI or machine learning workloads. An archive customer might use 4U or 5U servers, each equipped with 60 to 90 drives, and 10 Gigabit Ethernet networking. By contrast, one machine learning client clustered a larger number of lower-horsepower 1U servers, each with fewer drives and a 100 Gigabit Ethernet network interface card, for high bandwidth, the company said.
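To put the network choice in perspective, the following back-of-the-envelope sketch estimates the fewest nodes a cluster would need to reach the 100 GB per second throughput SwiftStack cites, given one network interface per node. It is illustrative arithmetic only, not a SwiftStack sizing method: the function name is hypothetical, and it assumes every NIC runs at full line rate with no protocol overhead, which real clusters will not achieve.

```python
import math


def min_nodes_for_throughput(target_gbytes_per_s: float, nic_gbits_per_s: float) -> int:
    """Lower bound on node count, assuming each node's single NIC runs at line rate.

    Hypothetical helper for illustration; ignores protocol overhead, drive
    speed, and replication traffic, so real clusters need more nodes.
    """
    nic_gbytes_per_s = nic_gbits_per_s / 8  # 8 bits per byte
    return math.ceil(target_gbytes_per_s / nic_gbytes_per_s)


# Compare the two NIC speeds mentioned in the article against 100 GB per second.
for nic_speed in (10, 100):
    nodes = min_nodes_for_throughput(100, nic_speed)
    print(f"{nic_speed} Gigabit Ethernet: at least {nodes} nodes")
```

Under these assumptions, a 10 Gigabit Ethernet link carries at most 1.25 GB per second, so the network alone would require at least 80 nodes, while 100 Gigabit Ethernet brings the floor down to eight.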
An optional new SwiftStack Professional Remote Operations (PRO) paid service is now available to help customers monitor and manage SwiftStack production clusters. SwiftStack PRO combines software and professional services.