7 NVMe storage gotchas to avoid
Avoiding common deployment errors when implementing NVMe technology will save you time, money and professional embarrassment. Here's what to look out for.
Failing to use the right type of NVMe for a particular application -- or deploying the correct technology in the wrong way -- can lead to performance problems and needless additional costs, not to mention a great deal of aggravation.
Fortunately, most common NVMe storage errors can be avoided by doing some advance research. To get started toward a smooth NVMe deployment, here are seven big mistakes to watch out for and avoid.
1. Stranding direct-attached NVMe SSD storage in servers
Be sure to deploy pooled SSD storage across the entire data center, advised Eran Kirzner, founder and CEO of Lightbits Labs, which offers technology that separates compute and storage, enabling each to scale independently. An SSD pool places a cache of SSD storage in front of higher-capacity drives to improve performance cost-efficiently. Without pooling, direct-attached drives sit stranded inside individual servers, and both utilization and performance can suffer.
"Technologies, such as NVMe/TCP, provide near-direct-attached SSD performance, while providing flexibility and allowing for dynamic provisioning of storage," Kirzner said.
2. Believing all NVMe SSDs are equal and selecting vendors on the basis of cost
There are large differences among NVMe SSDs in endurance, I/O consistency and quality of service. "The general trend is toward lower cost, but that often comes with a price of lower endurance and performance," Kirzner said. "Having the right intelligent software layer that manages all the SSDs in the box can dramatically improve endurance and performance of lower-cost SSDs."
Buyers need to ask more questions about use cases, said Caitlin Gordon, vice president of storage product marketing for Dell EMC. "There are essentially a few levels of NVMe support in a storage array, so customers need to understand what a solution offers and how that matches their workload requirements." For instance, there are NVMe-based storage drives -- either NVMe-based flash or NVMe-based storage-class memory (SCM). "These can be deployed as cache or used as persistent media," Gordon noted. "SCM drives, leveraged as persistent media alongside flash, are what change the performance game."
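One practical way to compare drives on endurance rather than price alone is to read each drive's SMART/health log. The sketch below assumes a Linux host with nvme-cli available; the device names are placeholders, and the JSON field names can vary between nvme-cli versions.

```python
# Minimal sketch: comparing wear indicators across NVMe drives using
# nvme-cli's SMART/health log (field names can vary by nvme-cli version).
import json
import subprocess

def smart_log(dev):
    """Return the parsed SMART/health log for an NVMe device."""
    out = subprocess.run(
        ["nvme", "smart-log", dev, "--output-format=json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)

for dev in ("/dev/nvme0", "/dev/nvme1"):   # hypothetical device names
    log = smart_log(dev)
    # percent_used is the drive's own estimate of endurance consumed;
    # data_units_written is reported in 512,000-byte units per the NVMe spec.
    print(dev,
          "endurance used:", log.get("percent_used"), "%",
          "data written:", log.get("data_units_written"), "units")
```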
3. Deploying an NVMe storage product that isn't fully compliant
Some NVMe products claim full NVMe compliance, yet are actually little more than proprietary offerings wrapped in fancy marketing. These products may not deliver the expected performance and endurance improvements and cost savings.
Kirzner suggested it's important to confirm an NVMe product is fully compliant. "Ensure that the solution interoperates with other vendors' [products], complies to NVMe specifications and has passed UNH-IOL [University of New Hampshire InterOperability Laboratory] NVMe compliance testing," he said.
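Compliance testing such as UNH-IOL's is the real proof point, but as a quick sanity check you can at least confirm which specification version a controller advertises in its Identify Controller data. The sketch below assumes nvme-cli on Linux and a placeholder device name; it checks only the reported spec version, not interoperability.

```python
# Minimal sketch: reading the NVMe specification version a controller
# reports in its Identify Controller data (assumes nvme-cli is installed).
# This is only a sanity check, not a substitute for compliance testing.
import json
import subprocess

def spec_version(dev):
    out = subprocess.run(
        ["nvme", "id-ctrl", dev, "--output-format=json"],
        check=True, capture_output=True, text=True,
    ).stdout
    ver = json.loads(out)["ver"]    # 32-bit version field from Identify Controller
    major = (ver >> 16) & 0xFFFF    # major version, e.g. 1
    minor = (ver >> 8) & 0xFF       # minor version, e.g. 4
    return f"{major}.{minor}"

print("Controller reports NVMe spec", spec_version("/dev/nvme0"))  # hypothetical device
```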
4. Attempting to create your own NVMe SSDs
Many large data centers hope to cut storage costs by building their own NVMe SSDs, only to find they can't buy enough NAND when supplies are constrained. "They also quickly learn that they can't keep up with NAND transitions," Kirzner explained. He suggested focusing on generic NVMe features, qualifying several suitable SSD suppliers and then reducing TCO by disaggregating SSD storage from compute to maximize infrastructure utilization.
5. Failing to analyze the performance requirements of application workloads
A cost-benefit analysis is strongly recommended for all new technology purchases and deployments. With NVMe storage, in particular, you want to make sure that the expense of transitioning is justified based on the performance requirements of the application workloads involved.
"To determine if NVMe makes sense, application workload profiles must be measured and analyzed," said Henry He, director of product management at infrastructure performance management software developer Virtual Instruments. The workload profiles can then be used with a workload simulation platform to evaluate the performance gains and determine if the cost is justified on a per-workload basis. If the cost appears to be excessive, adopters should instead focus on optimizing their current fiber channel and consider moving to all-flash arrays, he said.
6. Deploying NVMe on top of the same architectures used for traditional flash
The key issues here are latency and controller bottlenecks. "Traditional controller-based architectures can only do low levels of I/O processing before they slow down, increasing latency and eventually topping out on performance," warned Josh Goldenhar, vice president of customer success for Excelero, a distributed block storage technology supplier.
Current controllers can accommodate the performance of only a single NVMe drive, and most enterprise drives can deliver 750,000 IOPS, Goldenhar said. Any organization with high-performance, low-latency NVMe demands should consider a controllerless, distributed architecture. With this approach, I/O processing doesn't hit traditional controller limits as devices and network bandwidth scale.
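A quick back-of-the-envelope calculation shows why that matters. Using the roughly 750,000 IOPS per-drive figure cited above and an assumed, purely illustrative controller ceiling, the sketch below estimates how much raw drive performance a controller-bound array leaves on the table as drives are added.

```python
# Back-of-the-envelope sketch of the controller bottleneck described above,
# using the article's figure of roughly 750,000 IOPS per enterprise NVMe drive.
# The controller ceiling is a hypothetical illustration, not a benchmark.
DRIVE_IOPS = 750_000            # per-drive capability cited above
CONTROLLER_CEILING = 1_000_000  # assumed array controller ceiling (illustrative)

for drives in (1, 4, 8, 24):
    raw = drives * DRIVE_IOPS
    delivered = min(raw, CONTROLLER_CEILING)
    print(f"{drives:2d} drives: raw {raw:>10,} IOPS, "
          f"delivered {delivered:>10,} IOPS "
          f"({delivered / raw:.0%} of drive capability)")
```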
7. Believing that NVMe is a type of media
Because NVMe and flash are so often mentioned in the same breath, much as the terms flash storage and flash drive are used interchangeably, it's easy to assume that all NVMe storage devices or arrays use NAND flash. Yet NVMe is an interface and storage protocol, not a media type. "There are NVMe devices that use other types of persistent memory, including Intel's Optane technology and battery- or supercapacitor-backed DRAM [dynamic RAM]," said John Kim, director of storage marketing for storage and network technology supplier Mellanox Technologies. "It's likely that, in the future, we will see even more types of media in NVMe storage devices."