
How CXL 3.0 technology will affect enterprise storage

Understand CXL 3.0 technology before its impact on storage takes serious effect. The specification improves on previous generations, but it could also demand more from storage.

Admins might mistake the 3.0 specification of Compute Express Link for a faster, more expensive form of networked storage, but it's aimed at a different application.

CXL is the last leg of system disaggregation: it enables IT to reassign memory as a resource, just as it already does with storage, I/O and servers. In this way, large data centers hope to use memory more efficiently, reduce "stranded" memory and provide applications with larger memory sizes than a single server can economically support. There will be an impact on storage.

CXL 3.0 takes the standard several steps further

The CXL 3.0 specification expands on previous generations of the standard to increase scalability and optimize system-level flows with switching and fabric capabilities, peer-to-peer communications and resource sharing across multiple compute domains, according to the CXL Consortium.

CXL 3.0 moves from the PCIe 5.0 physical layer used by CXL 1.0, 1.1 and 2.0 to PCIe 6.0, which doubles the raw data rate to 64 gigatransfers per second per lane. Version 3.0 also adds several complex features that extend the coherency all CXL versions maintain across the system, and it does so without increasing latency. The standard remains backward-compatible with earlier generations.
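As a rough back-of-the-envelope figure (ignoring FLIT framing and CRC overhead, and assuming a full x16 link), that data rate works out to roughly:

\[ 64~\text{GT/s per lane} \times 16~\text{lanes} \approx 1{,}024~\text{Gb/s} \approx 128~\text{GB/s in each direction} \]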

Chart comparing CXL 1.0, 2.0 and 3.0

The most significant changes in the specification relate to memory, since memory is what CXL sets out to disaggregate. In its first incarnation, CXL connected memory -- or persistent memory -- to the processor point to point. CXL 2.0 introduced switches that let up to 16 hosts access multiple memory entities -- or portions of them -- to further support disaggregation, which extended the spec's connectivity from the server level to the rack level.

CXL 3.0 technology enables multiple hosts to share memory without coherency concerns, a feature that admins can use for simple semaphores. However, there will be other uses as system architects embrace this option.
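A minimal sketch of that semaphore use case, assuming the shared, hardware-coherent CXL region is exposed to each host as a mappable DAX character device (the /dev/dax0.0 path below is hypothetical), might look like this in C:

/* Sketch: two hosts map the same CXL 3.0 shared-memory region and
 * coordinate through a single atomic flag, the simple semaphore use
 * case described above. Device path is hypothetical. */
#include <fcntl.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/dax0.0", O_RDWR);            /* hypothetical device */
    if (fd < 0) { perror("open"); return 1; }

    /* Map one page of the shared region; CXL hardware keeps it coherent. */
    _Atomic int *lock = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                             MAP_SHARED, fd, 0);
    if (lock == MAP_FAILED) { perror("mmap"); return 1; }

    int expected = 0;
    /* Spin until this host acquires the flag; no storage round trip needed. */
    while (!atomic_compare_exchange_weak(lock, &expected, 1))
        expected = 0;

    puts("critical section: this host owns the shared resource");

    atomic_store(lock, 0);                            /* release the flag */
    munmap((void *)lock, 4096);
    close(fd);
    return 0;
}

How operating systems will ultimately expose shared CXL memory remains to be seen; the point is simply that two hosts can coordinate through ordinary loads and stores rather than through storage.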

Screenshot of multiple switch configuration with CXL 3.0
CXL 3.0 can cascade multiple switches to achieve network fabrics.

While earlier versions of CXL allowed only one accelerator to attach to a switch, CXL 3.0 can now manage up to 16, maintaining coherency between the host processor(s), the accelerators and memory. With cascaded switches, the system can coherently manage as many as 4,095 memory entities. This advancement appears in the center bottom of the diagram below as GFAM, or global fabric-attached memory.
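As a side note, the 4,095 figure appears to fall out of a 12-bit ID space in the fabric routing, with one value presumably reserved (an assumption about the addressing, not something stated in this overview):

\[ 2^{12} - 1 = 4{,}095 \]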

Screenshot of spine-leaf configuration with CXL 3.0
There are ways to connect more elaborate fabrics with CXL 3.0, like this spine-leaf configuration.

CXL 3.0's direct impact on storage

CXL 3.0 technology supports peer-to-peer reads and writes. Storage will have the option of moving data directly to or from memory without host intervention. With the remote direct memory access approach available under CXL 2.0, the host sits in the middle of the transaction and slows it down, so removing that hop makes CXL 3.0 significantly faster than RDMA. It also frees the processor from managing the transfer.
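A simplified model shows why. If a transfer of size S moves at link bandwidth B, a host-mediated transfer is roughly two copies (device to host memory, then host memory to the target device), while a peer-to-peer transfer is one:

\[ T_{\text{host-mediated}} \approx \frac{2S}{B} \qquad\text{versus}\qquad T_{\text{peer-to-peer}} \approx \frac{S}{B} \]

This ignores protocol overhead and the CPU cycles the host spends orchestrating the copy, both of which tilt the comparison further toward peer-to-peer.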


From the viewpoint of I/O traffic, peer-to-peer transactions will support higher traffic volumes since the host won't have to start and stop the I/O stream as it switches between tasks. The host can then focus on the tasks at which it excels. Additionally, I/O will speed up because it no longer must wait for the host, and a faster host will require greater I/O bandwidth to keep it busy.

In a completely different direction, CXL 3.0's support of memory sharing implies that software that currently trades messages through storage will eventually move that communication to shared memory. Once again, this will accelerate processes, so the net result should be higher I/O bandwidth even though the messaging task has moved away from storage. This change is likely to take a long time to arrive, though, simply because it requires structural changes to the software, and such changes tend to take years to fall into place.
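A rough sketch of what that messaging could look like, again assuming a hardware-coherent CXL region mapped on both hosts (the device path and message layout are hypothetical): the writer host fills a payload buffer and publishes it by bumping a sequence counter, while the reader polls the counter instead of polling a file or object in storage.

/* Sketch: a one-slot shared-memory mailbox replacing message files on
 * storage. Both hosts map the same CXL-backed region; the layout and
 * device path are hypothetical. Run with an argument to send, without
 * one to receive. */
#include <fcntl.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

struct mailbox {
    _Atomic unsigned long seq;   /* incremented once per published message */
    char payload[256];           /* message body; format is illustrative   */
};

int main(int argc, char **argv)
{
    int fd = open("/dev/dax0.0", O_RDWR);             /* hypothetical device */
    if (fd < 0) { perror("open"); return 1; }

    struct mailbox *mb = mmap(NULL, sizeof(*mb), PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (mb == MAP_FAILED) { perror("mmap"); return 1; }

    if (argc > 1) {                                    /* writer host */
        snprintf(mb->payload, sizeof(mb->payload), "%s", argv[1]);
        atomic_fetch_add(&mb->seq, 1);                 /* publish */
    } else {                                           /* reader host */
        unsigned long last = atomic_load(&mb->seq);
        while (atomic_load(&mb->seq) == last)          /* wait for a message */
            ;
        printf("received: %s\n", mb->payload);
    }

    munmap(mb, sizeof(*mb));
    close(fd);
    return 0;
}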

Third, memory pooling will give memory-hungry applications access to far more memory than is currently economical. The entire code and data set for even the largest application could reside in memory with no page faults. While this translates into especially low I/O traffic while the task runs, it puts enormous pressure on storage to respond quickly when a task ends and its memory is reassigned to another task.
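To put illustrative numbers on that pressure (the figures are assumptions for the sake of arithmetic, not from the specification): refilling a 1 TB pooled region over a roughly 128 GB/s CXL 3.0 x16 link takes only about eight seconds, so storage would need to sustain throughput on that order to keep the handoff from stalling.

\[ \frac{1~\text{TB}}{128~\text{GB/s}} \approx 8~\text{seconds} \]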

CXL is ultimately about increasing the system's performance without increasing its cost. When any one part of the system improves -- in this case, memory management -- the other parts must keep pace or become a bottleneck. CXL will therefore increase bandwidth demands on storage: servers will perform more efficiently and, as a result, demand more data.

CXL 3.0's indirect impact on storage

CXL 3.0 has more to do with memory than storage, but it is likely to speed up computation, which will put more demands on storage performance. But isn't that nearly always the case?

The CXL 3.0 specification was released in August 2022; it is likely to take about a year before it's widely available in hardware. Admins will then need to upgrade applications to take advantage of all that CXL 3.0 technology has to offer, and that will also take time. Mainstream changes could take years.

On the bright side, this gives administrators plenty of time to get familiar with the technology before they need to worry about its impact on their systems.
