HPC, supercomputing share stage with enterprise AI at SC24

AI's effects on supercomputing helped shift the focus toward enterprise demands at SC24, as vendors such as Dell and HPE showcased new offerings.

The annual Supercomputing Conference is typically aimed at high-performance computing and academia, but AI is starting to change that.

This year at SC24, IT vendors such as Dell, HPE, Weka, Pure Storage and DDN not only unveiled new supercomputing products but also focused on enterprise offerings, underscoring HPC's shift from research and academia to the enterprise. Some vendors, such as HPE and Dell, also highlighted liquid cooling at the show, as the technique gains interest due to the heat generation and energy demands of GPUs running AI workloads.

While the Supercomputing Conference has been around since 1988, there has been a noticeable change over the last 10 years, from showcasing distant-future enterprise technology to displaying the tech of today, according to Matt Kimball, an analyst at Moor Insights & Strategy.

"The rate and pace of innovation has drastically accelerated from the innovators to the deployers," he said.

AI and the enterprise

Enterprise and supercomputing might have different goals and processes, but they share several of the same vendors, according to Camberley Bates, an analyst at The Futurum Group.

HPE owns HPC maker Cray. It also built the AMD-powered Frontier supercomputer at Oak Ridge National Laboratory, a federally funded science and research lab in Oak Ridge, Tenn. And Dell powers the Texas Advanced Computing Center supercomputer in Austin. Enterprise infrastructure isn't becoming the dominant draw at SC, but the distinction between supercomputing and HPC workloads and enterprise workloads is beginning to blur due to AI.

This is causing a shift in who comes to SC conferences and how the show is presented, Bates said.

Another shift is the rapid adoption of AI workloads. Next year, AI will fully come into the enterprise as companies move beyond proofs of concept, according to Steve McDowell, founder and analyst at NAND Research.

"Large language models are going to bring the need for additional compute, even if I'm doing RAG [retrieval-augmented generation] and fine-tuning," McDowell said.

Supercomputing and HPC require high levels of compute, he said, but it is still unclear how much of these computing methods or advancements will filter into the enterprise.

Some of the SC24 highlights from infrastructure and storage vendors included the following:

Dell

Dell launched its new high-performance PowerEdge XE9685L and XE7740 servers for enterprise AI and HPC workloads. The new servers are also part of the vendor's Integrated Rack Scalable System (IRSS), a preconfigured rack-scale system that includes Dell Smart Cooling, which monitors and manages various cooling technologies, and aims to ease AI deployment with its plug-and-play design.

Dell also said it would support the upcoming Nvidia GB200 Grace Blackwell NVL4 superchip, which combines GPUs and CPUs, in one of its IRSS racks, the Dell IR7000. In addition, Dell expanded its Data Lakehouse to include Apache Spark for data processing at scale.

HPE

HPE is the vendor behind El Capitan, a direct liquid-cooled supercomputer. HPE also highlighted its new liquid-cooled Cray Supercomputing blades, the EX4252 Gen 2 and EX154n Accelerator; its new Cray storage system, the E2000; and two new ProLiant servers for enterprise AI, the XD680 and the liquid-cooled XD685.

DataDirect Networks

DDN launched its fourth-generation A3I, an AI storage system with higher performance and scalability. DDN also collaborated with Nvidia on xAI's Project Colossus, a supercomputer built by Elon Musk's AI company primarily aimed at training xAI's chatbot Grok.

Pure Storage

Pure Storage introduced its GenAI Pod, a set of validated designs for turnkey generative AI storage offerings. The vendor's FlashBlade//S500, its performance- and capacity-optimized storage for unstructured data, achieved Nvidia DGX SuperPOD certification. And last week, Pure invested in GPU cloud provider CoreWeave.

Weka

Weka previewed a high-performance storage offering combining its parallel file system, Nvidia Grace CPUs, Supermicro servers, Nvidia ConnectX-7 network interface card and Nvidia BlueField data processing units. This offering is focused on enterprise AI use cases.

Weka also introduced its reference architecture for retrieval-augmented generation, the Weka AI RAG Reference Platform, or WARRP. It provides users with a blueprint for executing RAG as well as inferencing using its data platform, according to the vendor.
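For readers unfamiliar with the pattern WARRP packages, the sketch below shows the generic RAG flow it refers to -- retrieve relevant documents, augment the prompt with them, then generate an answer. This is a minimal illustration, not Weka's implementation; the retrieve() and generate() functions are hypothetical stubs standing in for a real vector store and LLM client.

```python
# Minimal, generic sketch of a retrieval-augmented generation (RAG) flow.
# Not Weka's WARRP implementation; retrieve() and generate() are hypothetical
# stubs standing in for a real vector store and LLM client.

def retrieve(query: str, top_k: int = 3) -> list[str]:
    """Stub: return the top_k documents most relevant to the query."""
    corpus = [
        "SC24 highlighted liquid cooling for dense GPU racks.",
        "RAG grounds model answers in retrieved enterprise documents.",
        "Parallel file systems serve both HPC and AI training workloads.",
    ]
    # A real system would rank by vector similarity; here we return the corpus as-is.
    return corpus[:top_k]

def generate(prompt: str) -> str:
    """Stub: a real system would call an LLM; here we echo prompt metadata."""
    return f"[model answer grounded in a prompt of {len(prompt)} characters]"

def answer(query: str) -> str:
    """Retrieve context, build an augmented prompt, and generate a response."""
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)

if __name__ == "__main__":
    print(answer("Why does enterprise AI need fast storage?"))
```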

Storage's role in SC and AI

McDowell believes the theme of AI optimization seen at SC24 will find its way into storage offerings for enterprises. A year ago, only a few vendors, such as Weka, Vast and Hammerspace, talked about storage's potential for data management and data manipulation. Those discussions will only grow as compute is optimized for AI, and storage follows, he said.

"This change is happening in AI training first, but it is absolutely finding its way into enterprise," McDowell said.

HPC and supercomputing have been the domain of parallel file systems, Futurum Group's Bates said. The same is true of AI, but the data requirements differ: HPC does massive amounts of reads and then delivers a conclusion, whereas AI involves more activity with smaller files, she said.

"We are changing in terms of what those requirements look like," Bates said.

Whether AI will cause drastic changes for storage is yet to be seen, according to Bates. It will depend on what the data is being used for and what type of data is being analyzed. But, she added, it's unlikely to depend on the size of the data.

"If data blows up, we will figure out an invention to shrink it," she said.

Liquid cooling is cool

Liquid cooling was a significant topic at SC24, represented by 22 independent vendors as well as offerings from larger players such as Dell, HPE and Lenovo. Moor Insights & Strategy's Kimball found the number of vendors impressive. However, he said he still sees different methods trying to find their place in the market.

"I think we are still very early in this cooling game, and what we are seeing in today's market is kind of like the days of discovering fire and inventing the wheel," Kimball said.

The liquid cooling problems seen on the front lines of AI are problems supercomputing has been solving for the last three decades, according to Chirag Dekate, an analyst at Gartner. HPC and supercomputing have been using the technique to control heat and energy for some time; now, he said, vendors such as Dell and HPE are making the concepts more mainstream in enterprise IT.

"Suddenly the nerds are cool kids in town," he said.

Adam Armstrong is a TechTarget Editorial news writer covering file and block storage hardware and private clouds. He previously worked at StorageReview.
