Nvidia beefs up its line of AI supercomputers, chips

Nvidia rolls out a new GPU architecture, supercomputers and chips that in concert will aid developers in creating the hardware foundation for the next generation of AI applications and services.

Complementing the basket of AI software introduced at GTC, its annual conference, Nvidia debuted a new lineup of hardware designed to help users and developers power its upcoming generation of software and services.

Nvidia showed off its new Hopper GPU architecture along with a new H100 Tensor Core GPU built using the new architecture.

The company also offered a look at a new version of the NVLink high-speed interconnect, which Nvidia introduced in 2014. The new version will connect all of the company's chips going forward, including CPUs, GPUs, data processing units and system-on-a-chip (SoC) products.

In his keynote address, given from a virtual environment in Nvidia's Omniverse real-time 3D platform, Nvidia CEO Jensen Huang said that "with AI racing in every direction," he plans to attack the challenge with Earth-2, which he trumpeted as the industry's first digital twin supercomputer.

Last November, Nvidia revealed plans to build Earth-2, which the company said would be the most powerful AI supercomputer dedicated to predicting climate change. The system would create a digital twin of the Earth that would exist in the Omniverse.

Nvidia also unwrapped the Grace CPU Superchip, the company's first CPU for the high-performance computing market. The offering consists of two CPUs connected over a 900 GBps NVLink, creating a 144-core processor with 1 terabyte per second of memory bandwidth, Huang said.

"Nvidia has always been about accelerating data and workloads, and with this level of horsepower, they are tuning their systems for next-generation applications," said Dan Newman, principal analyst at Futurum Research and CEO of Broadsuite Media Group. "They also realize it's hard to be on top and stay on top, so there are competitive pressures to continually deliver [hardware] improvements."

With nine times the performance of its predecessors, the Nvidia H100 is the biggest gain in performance the company has achieved with a GPU, Huang said. Combined with the new NVLink switch, up to 32 DGX servers can be connected into a 1 exaflop system, capable of 1 quintillion floating-point operations per second, according to the company.

"Enterprises seeking to build deep learning training infrastructure stacks will likely see the potential for capturing greater value with this latest generation," said Chirag Dekate, vice president and analyst at Gartner.

The H100 has 80 billion transistors and uses the Taiwan Semiconductor Manufacturing Company's 4-nanometer manufacturing process. The chip was designed for both scale-up and scale-out architectures. One H100 chip can sustain 40 terabits per second of I/O bandwidth, the company said. Dekate added a cautionary note that developers and users should consider how to support more powerful hardware.

"Data center designers and users trying to leverage these latest technologies should also be planning and organizing their facilities to accommodate the greater power budget these processes will likely demand," he said.

During his keynote, Huang said the H100 is the first GPU to support confidential computing. Until now, only CPU-based systems could support that technology.

"It [H100] protects the confidentiality and integrity of AI models and algorithms of the owners," Huang said. "Software developers can now distribute and deploy their proprietary AI models on remote infrastructure, protect their intellectual property and also scale their business models."

Now in production, the H100 is expected to be available sometime in the third quarter.

Nvidia also introduced its DGX H100 AI computer. Its eight H100 GPUs, connected by NVLink, function as a single GPU with 640 billion transistors, 32 petaflops of AI performance and 640 GB of high-bandwidth memory.

Huang debuted yet another system, the Nvidia Eos, which he said will be the world's fastest AI supercomputer when it becomes available in "a few months." Currently being built in the Hopper AI factory, the system will feature 576 DGX H100 systems with 4,608 H100 GPUs and be capable of providing 18.4 exaflops of AI performance -- faster than the Fugaku supercomputer in Japan, now considered the fastest system. The Eos system will serve as a blueprint for advanced AI infrastructure from Nvidia and its partners, the company said.
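The system figures Nvidia quoted can be cross-checked with simple arithmetic. The short Python sketch below is an illustration only: the constant and function names are hypothetical, the inputs are the vendor-stated numbers reported in this article, and the aggregate figures assume performance scales linearly with the number of DGX systems, which real clusters may not achieve.

```python
# Vendor-stated figures as reported in the article (not independently measured).
TRANSISTORS_PER_H100 = 80_000_000_000   # 80 billion transistors per H100
DGX_TRANSISTORS = 640_000_000_000       # 640 billion transistors per DGX H100
DGX_MEMORY_GB = 640                     # high-bandwidth memory per DGX H100
DGX_PETAFLOPS = 32                      # "AI performance" per DGX H100
EOS_DGX_SYSTEMS = 576                   # DGX H100 systems planned for Eos
EOS_GPUS = 4_608                        # GPUs planned for Eos
NVLINK_SWITCH_MAX_DGX = 32              # DGX servers per NVLink switch domain


def gpus_per_dgx() -> int:
    """GPUs in one DGX H100, derived from the Eos totals."""
    return EOS_GPUS // EOS_DGX_SYSTEMS


def aggregate_exaflops(num_dgx: int) -> float:
    """Combined AI performance in exaflops.

    Assumption: ideal linear scaling, with no interconnect or software overhead.
    """
    return num_dgx * DGX_PETAFLOPS / 1000  # 1 exaflop = 1,000 petaflops


if __name__ == "__main__":
    n = gpus_per_dgx()
    print("GPUs per DGX H100:", n)
    print("Transistor count consistent:", n * TRANSISTORS_PER_H100 == DGX_TRANSISTORS)
    print("Memory per GPU (GB):", DGX_MEMORY_GB // n)
    print("32-DGX NVLink domain (exaflops):", aggregate_exaflops(NVLINK_SWITCH_MAX_DGX))
    print("Eos (exaflops):", aggregate_exaflops(EOS_DGX_SYSTEMS))
```

Under these assumptions, each DGX H100 works out to eight GPUs with 80 GB of memory apiece, 32 systems come to 1.024 exaflops, and 576 systems come to 18.432 exaflops, which matches the rounded 1 exaflop and 18.4 exaflop figures the company cited.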

Because the system uses Nvidia's Quantum-2 InfiniBand networking, it can provide bare-metal-class performance and multi-tenant isolation, ensuring that one application doesn't affect any others.

"Multi-tenant capability is critical for us because even our own Eos computer will be used by our AI research teams as well as by numerous other teams, including engineers working on our autonomous vehicle platform and conversational AI software," Huang said.

Huang wrapped up the tidal wave of new system hardware announcements by announcing the availability of Orin, a centralized AV and AI computer that started shipping earlier this month and serves as the engine for electric vehicles, robo-taxis and trucks.

He also pulled back the curtain on Hyperion 9, which will feature Nvidia's Drive Atlan SoC for autonomous driving. Production won't begin until 2026.

As Editor at Large with TechTarget's News Group, Ed Scannell is responsible for writing and reporting breaking news, news analysis and features focused on technology issues and trends affecting corporate IT professionals.
