DPUs vs. SmartNICs: What storage admins need to know
To determine whether a SmartNIC or DPU is right for their organization, admins must understand the capabilities of different kinds of NICs and DPUs.
Cookie-cutter environments are no longer adequate for many organizations, especially those that process large amounts of data. SmartNICs are right for some companies, but others might need hyperscale DPUs.
Network interface cards have evolved over time to meet increased network demands, and now, there are several types. The basic or foundational NIC is what most network and server administrators are familiar with. It commonly comes with the server, and it supports 1 Gbps, 10 Gbps, 25 Gbps and, sometimes, even 50 Gbps. For most standard applications -- virtualized or not -- these NICs are fine. They're the least expensive and should be more than adequate. They are colloquially called "dumb" NICs because they passively hand off network packet processing to the server CPU. However, as networking speeds have increased, so has the burden of network packet processing on the server CPU.
The increasing popularity of even higher speeds -- such as 100 Gbps, 200 Gbps and even 400 Gbps networks -- has changed the ecosystem, especially for storage networking. These much higher speeds have placed processing loads that often exceed 30% on the server CPUs. Every server CPU cycle spent on packet processing is a cycle unavailable to applications, which led to offload NICs.
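To put that in perspective, a long-standing rule of thumb holds that plain TCP/IP processing consumes roughly 1 Hz of CPU for every 1 bps of sustained throughput. The Python sketch below uses that rule of thumb and a hypothetical 64-core, 2.5 GHz server -- both illustrative assumptions, not vendor figures -- to estimate the share of server CPU that networking alone can consume:

```python
# Illustrative estimate of server CPU consumed by TCP/IP packet
# processing, using the rough "1 Hz per 1 bps" rule of thumb.

CYCLES_PER_BIT = 1.0  # assumed cycles of TCP/IP processing per bit moved

def cpu_share_for_networking(gbps: float, cores: int, ghz: float) -> float:
    """Fraction of total server CPU cycles consumed at a sustained rate."""
    cycles_needed = gbps * 1e9 * CYCLES_PER_BIT  # cycles per second
    cycles_available = cores * ghz * 1e9
    return cycles_needed / cycles_available

# Hypothetical dual-socket server: 64 cores at 2.5 GHz.
for rate in (25, 100, 200, 400):
    share = cpu_share_for_networking(rate, cores=64, ghz=2.5)
    print(f"{rate:>3} Gbps -> ~{share:.0%} of server CPU")
```

At 100 Gbps, this back-of-the-envelope math already puts packet processing above 60% of the server's cycles, and 200 Gbps exceeds what the CPU can supply at all.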
Offload NICs to the rescue
Offload NICs have come to market several times in the past, but this time is different. Historically, as CPUs gained transistors in step with Moore's law, offload NICs were an expensive and unnecessary option. However, Moore's law has slowed to a crawl over the past few years, and that slowdown has made offload NICs viable. They offload network traffic functions, such as packet processing in the TCP/IP stack, freeing server CPU cycles for the applications.
Offload NICs are useful when standard NICs cause servers to slow down. A bogged-down server runs fewer VMs or containers effectively, which means organizations need an offloading solution -- or more physical servers. The higher price of offload NICs is easy to justify when they reduce the number of physical servers required.
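On Linux, admins can see which offloads a NIC and its driver support -- and turn them on -- with ethtool. A minimal sketch, assuming a Linux host with ethtool installed and a placeholder interface name of eth0 (substitute your own device):

```python
# Query and toggle common NIC offloads via ethtool on Linux.
# "eth0" is a placeholder interface name; enabling features requires root.
import subprocess

IFACE = "eth0"

# List every offload feature the NIC driver exposes (read-only).
result = subprocess.run(["ethtool", "-k", IFACE],
                        capture_output=True, text=True)
print(result.stdout)

# Enable TCP segmentation offload and generic receive offload.
# check=False so an unsupported feature doesn't raise an exception.
for feature in ("tso", "gro"):
    subprocess.run(["ethtool", "-K", IFACE, feature, "on"], check=False)
```

Features the hardware can't handle simply stay off, so it's worth rerunning the query afterward to verify what actually took effect.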
SmartNICs take offload a step further
Many organizations determined that offload NICs were not enough. This led to SmartNICs, which do more than just offload the TCP/IP stack. They're more flexible than offload NICs, with a more programmable pipeline, and they offload more network processing from the server CPU. In fact, they have their own CPU, memory and OS. What they offload varies by vendor, but SmartNICs can take on tasks such as network compression and decompression, encryption and decryption, and even security functions.
SmartNICs usually cost more than offload NICs. But when servers bog down again -- this time under compression, decompression, encryption and decryption workloads -- a SmartNIC becomes the obvious choice over adding more servers.
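That justification is easy to quantify. The standard-library-only Python sketch below times software compression on one core; the throughput it reports is machine-dependent and purely illustrative, but it shows how far a host CPU falls behind a fast link when it must compress inline -- exactly the work a SmartNIC reclaims:

```python
# Rough measurement of the host-CPU cost of inline compression,
# one of the functions a SmartNIC can take over. Standard library only;
# results are machine-dependent and illustrative.
import os
import time
import zlib

payload = os.urandom(16 * 1024 * 1024)  # 16 MiB; random data is worst case

start = time.perf_counter()
zlib.compress(payload, level=6)
elapsed = time.perf_counter() - start

gbps = (len(payload) * 8) / elapsed / 1e9
print(f"zlib level 6: ~{gbps:.2f} Gbps on one core")
print(f"Cores needed to keep pace with a 100 Gbps link: ~{100 / gbps:.0f}")
```

On typical hardware, one core manages well under 1 Gbps of zlib compression, so keeping pace with a 100 Gbps link in software would consume a large share of the server's cores.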
Still, as higher networking speeds have proliferated across IT ecosystems -- especially in storage networking -- IT admins have been looking for more. The next turn of the screw has been the data processing unit (DPU).
What are DPUs?
DPUs are a major evolution of the SmartNIC. The DPU includes the offload capabilities, flexible programmable pipeline, processing and CPU of a SmartNIC. But with a DPU, the endpoint of the network infrastructure is the DPU itself, not the server it resides in. DPUs are built around custom chips and, in some cases, field-programmable gate arrays or application-specific integrated circuits. A DPU can support much more than a SmartNIC, including networking based on P4 programmable pipelines, stateful Layer 4 firewalls, Layer 2/Layer 3 networking, Layer 4 load balancing, storage routing, storage analytics and VPNs. DPU functionality varies by vendor. Some of the major players in the market in 2022 are Fungible, AMD Pensando and Marvell.
DPUs help support high-performance storage. They reduce shared storage networking issues and deliver storage latencies comparable to NVMe media embedded within the servers themselves. That's a significant accomplishment, but it still might not be enough for high-performance storage networking. The issue this time is the switched network.
Switches cannot support massive scale or hyperscale -- levels that didn't exist until the advent of hyperscalers, such as Meta, and public cloud service providers, such as AWS, Azure, Google Cloud Platform, Oracle, IBM and Alibaba. This shortcoming has started to rear its ugly head as organizations have discovered the intrinsic value of their data, which they analyze and mine with analytics databases, machine learning and AI.
The amount of data being analyzed ranges from petabytes to exabytes -- volumes that would have been unheard of just a few years ago. In these data analytics processes, latency matters a lot. The leaf-spine switch architecture ultimately adds too much latency at hyperscale levels, especially at the tail, and worse, those tail latencies are unpredictable. This brought about the development of hyperscale DPUs.
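The fan-out math behind that problem is simple and well known: a request that touches many servers in parallel is only as fast as the slowest reply, so a rare per-server slowdown becomes the common case at scale. A small illustrative Python calculation:

```python
# Why tail latency dominates at hyperscale: with parallel fan-out,
# a request finishes only when the slowest server replies.
# Assumes each server independently responds "fast" 99% of the time.

P_FAST = 0.99  # probability one server avoids its 99th-percentile tail

for fanout in (1, 10, 100, 1000):
    p_any_slow = 1 - P_FAST ** fanout
    print(f"fan-out {fanout:>4}: {p_any_slow:6.1%} of requests "
          f"hit at least one slow server")
```

At a fan-out of 100, roughly 63% of requests land in some server's latency tail, which is why shaving tail latency -- not just average latency -- drives these architectures.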
DPUs are ideal for storage networking. They cost more than the other NICs, but they do a lot more and can make storage systems more efficient with lower latencies. That can reduce the number of storage controllers required for a given high-performance application.
Hyperscale DPUs take it one step further
Hyperscale DPUs are extremely programmable, and they eliminate east-west switching, though not north-south switching. They're generally deployed in a torus mesh topology and, potentially, a dragonfly or even a slim fly topology. Rockport Networks is one hyperscale DPU vendor.
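For a sense of why a direct-connect topology scales, the sketch below computes the average hop count in a k-ary 3-D torus, the kind of mesh these cards form. It's a simplified model of the topology only -- not any vendor's actual fabric -- but it shows node count growing with the cube of k while average hops grow only linearly, with each hop traversing a DPU rather than a switch tier:

```python
# Average hop count in a k-ary 3-D torus (k x k x k nodes).
# Each dimension wraps around, so the farthest node in a ring of k
# is only k // 2 hops away. Simplified illustration, not a vendor model.

def avg_hops_per_dim(k: int) -> float:
    """Mean wrap-around distance from a node to all nodes in a ring of k."""
    return sum(min(d, k - d) for d in range(k)) / k

for k in (4, 8, 16):
    print(f"{k}x{k}x{k} torus ({k**3:>5} nodes): "
          f"~{3 * avg_hops_per_dim(k):.1f} average hops")
```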
Hyperscale DPUs do everything that DPUs do and more. They reduce tail latencies and make latencies more predictable, yet they do not necessarily cost more than DPUs. And they eliminate a lot of cost by significantly reducing the switch infrastructure.