What is high-bandwidth memory (HBM)?
High-bandwidth memory is a type of computer memory that is optimized for fast data transfer and reduced power consumption.
It is increasingly deployed alongside high-performance computing (HPC) and artificial intelligence (AI) workloads that need fast memory. HBM is used in various computing platforms, including graphics processing units (GPUs), field-programmable gate arrays (FPGAs) and AI accelerators.
HBM isn't an entirely unique type of memory. It is an implementation that deploys dynamic random access memory (DRAM) silicon in a different way than it is conventionally used. With HBM, DRAM silicon dies are stacked vertically with a connection technology known as through-silicon vias (TSVs) to achieve high density and performance. With TSV technology, thin vertical conductors run through holes etched in the silicon dies to connect multiple layers to a base logic die. HBM can be integrated directly alongside a processor in the same package, providing a highly optimized data path.
The goal with HBM is to place more memory closer to a processor, which reduces signal travel distance and decreases latency. The ability to provide more memory with less latency helps accelerate data transfer rates.
HBM technology got its start at AMD in 2008 when the chipmaker began to explore different ways to improve memory performance. The first HBM chip was developed by AMD's partner, South Korean silicon vendor SK Hynix, in 2013. That same year, the Joint Electron Device Engineering Council (JEDEC) adopted the technology as an industry standard memory type. HBM is now manufactured by multiple vendors, including SK Hynix, Samsung and Micron.
Much like other forms of DRAM, HBM has gone through multiple versions since the initial release in 2013. HBM2 debuted in 2016, doubling the data rate per pin from 1 Gbps to 2 Gbps, and later revisions raised it to 2.4 Gbps. The HBM3 standard was published in 2022, raising the rate to 6.4 Gbps per pin. The HBM4 standard, published by JEDEC in 2025, doubles the interface width to 2,048 bits and specifies per-pin rates of up to 8 Gbps.
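The per-pin figures above translate into per-stack bandwidth once they are multiplied by the interface width. The following is a minimal sketch of that arithmetic, assuming a 1,024-bit interface for each generation listed in the code. The function name `stack_bandwidth_gbytes` is hypothetical and chosen only for illustration.

```python
def stack_bandwidth_gbytes(pin_rate_gbps: float, bus_width_bits: int = 1024) -> float:
    """Peak bandwidth of one HBM stack in gigabytes per second (GBps)."""
    # Each pin moves pin_rate_gbps gigabits per second; divide by 8 for bytes.
    return pin_rate_gbps * bus_width_bits / 8


# Per-pin data rates (Gbps) cited in the text above.
PIN_RATES_GBPS = {"HBM": 1.0, "HBM2": 2.4, "HBM3": 6.4}

for generation, rate in PIN_RATES_GBPS.items():
    print(f"{generation}: {stack_bandwidth_gbytes(rate):.1f} GBps per stack")
```

Under these assumptions, the HBM3 figure works out to roughly 819 GBps per stack, which is the peak value usually quoted for that generation.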
How HBM chips differ from conventional microchips
HBM is different from other types of memory microchips in several ways. Key differences include the following:
- 3D stacked architecture. HBM has a different physical architecture than conventional memory microchips as the DRAM silicon dies are vertically stacked, instead of being placed side by side.
- Connection. With HBM, the memory stack sits directly next to the processor in the same package, typically linked through a silicon interposer, while TSVs connect the dies inside the stack. Conventional memory is typically placed farther away on a circuit board.
- Memory bus width. One of the biggest areas of differentiation is the memory bus width, which is a measure of how much data can be transferred between the processor and memory simultaneously. HBM3 provides a 1,024-bit memory bus per stack, which is significantly wider than the 32-bit or 64-bit interfaces of conventional memory.
- Manufacturing and cost. HBM tends to have higher manufacturing complexity due to the 3D architecture and connection sophistication. That complexity results in higher costs than conventional memory.
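To make the bus-width difference concrete, the sketch below counts how many bus transfers are needed to move a fixed block of data at each width. The 4 KiB block size and the function name `transfers_needed` are illustrative assumptions. Real memory systems also split wide buses into independent channels and use burst transfers, which this simplified count ignores.

```python
def transfers_needed(block_bytes: int, bus_width_bits: int) -> int:
    """Number of bus transfers needed to move block_bytes over the bus."""
    block_bits = block_bytes * 8
    # Ceiling division: a partly filled final transfer still counts as one.
    return -(-block_bits // bus_width_bits)


BLOCK_BYTES = 4096  # illustrative 4 KiB block (assumption)

for label, width in [("1,024-bit HBM3 bus", 1024), ("64-bit bus", 64), ("32-bit bus", 32)]:
    print(f"{label}: {transfers_needed(BLOCK_BYTES, width)} transfers")
```

At equal per-pin speed, the wider bus finishes the same block in proportionally fewer transfers, which is why HBM can reach high bandwidth without pushing each pin as fast as GDDR does.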
HBM vs. DDR vs. GDDR
There are several types of DRAM that are in common use.
In addition to HBM, modern systems often use graphics double data rate (GDDR) and double data rate (DDR) types of DRAM. Each of these memory technologies has characteristics that make it suitable for different applications. The following table compares HBM, GDDR and DDR, highlighting their key differences in architecture, performance and use cases.
| | HBM | GDDR | DDR |
| --- | --- | --- | --- |
| Generation compared | HBM3 | GDDR6X | DDR5 |
| Architecture | 3D stacked | Planar | Planar |
| Bus width | 1,024-bit per stack | 32-bit per chip | 64-bit per module |
| Peak bandwidth per stack, chip or module | Up to 819 GBps | Up to 84 GBps | Up to 51.2 GBps |
| Power efficiency | Highest | Moderate | Lowest of the three |
| Form factor | Most compact | Moderate | Largest |
| Integration | On-package with GPU/CPU | Soldered on PCB | DIMM modules |
| Cost | Highest | Moderate | Lowest |
| Primary applications | High-end GPUs, AI accelerators | Graphics cards, some AI inference | General computing, servers |
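One way to read the bandwidth row is to ask how many GDDR chips or DDR modules it would take to match a single HBM3 stack. The sketch below does that division using the peak figures from the table. It is a rough comparison only, because it assumes peak rates add up perfectly across devices. The helper name `devices_to_match` is hypothetical.

```python
import math

# Peak bandwidth figures (GBps) taken from the table above.
PEAK_GBPS = {"HBM3 stack": 819.0, "GDDR6X chip": 84.0, "DDR5 module": 51.2}


def devices_to_match(target_gbps: float, device_gbps: float) -> int:
    """Smallest whole number of devices whose combined peak meets the target."""
    return math.ceil(target_gbps / device_gbps)


target = PEAK_GBPS["HBM3 stack"]
for name in ("GDDR6X chip", "DDR5 module"):
    count = devices_to_match(target, PEAK_GBPS[name])
    print(f"{count} x {name} to match one HBM3 stack")
```

With these figures, the answer is 10 GDDR6X chips or 16 DDR5 modules, each needing its own board area and signal routing.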
What are the key advantages of HBM?
HBM offers several key advantages over other types of memory technology, including the following:
- Higher bandwidth. A primary advantage of HBM is that its wide interface, 1,024 bits per stack in HBM3, provides higher bandwidth than other forms of memory.
- Lower latency. The shorter data paths in HBM's 3D stacked design lead to reduced latency.
- Higher capacity. HBM's stacked architecture allows for greater memory capacity in a smaller footprint.
- Lower power consumption. The 3D stacked architecture and shorter data paths from being directly integrated with the chip let HBM operate at lower voltages than conventional memory.
- Reduced heat generation. The lower power consumption of HBM also means less heat is generated, which improves thermal efficiency.
- Smaller form factor. The 3D stacking technology of HBM leads to a smaller form factor and more compact design compared to traditional memory solutions.
Why is HBM important?
HBM was originally created to advance memory performance for HPC. It has become increasingly important in recent years for generative AI and machine learning applications.
AI and machine learning workloads need to process large amounts of data as fast as possible, for both training and inference. With its high bandwidth, reduced power consumption and low latency, HBM has become a key enabler for large language model development and deployment.
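A back-of-the-envelope estimate shows why bandwidth matters so much for language model inference. The sketch below assumes a simplified, memory-bound case in which generating each token requires reading every model weight from memory once, so memory bandwidth caps the token rate. The 7-billion-parameter model size and 2 bytes per parameter are illustrative assumptions, and the function name is hypothetical. The bandwidth values are the HBM3 and DDR5 peak figures from the comparison table.

```python
def max_tokens_per_second(params_billion: float, bytes_per_param: float,
                          bandwidth_gbytes_per_s: float) -> float:
    """Upper bound on tokens/s when each token needs one full read of the weights."""
    # Billions of parameters times bytes per parameter gives gigabytes of weights.
    weight_gbytes = params_billion * bytes_per_param
    return bandwidth_gbytes_per_s / weight_gbytes


MODEL_PARAMS_BILLION = 7.0   # assumption: illustrative model size
BYTES_PER_PARAM = 2.0        # assumption: 16-bit weights

for label, bandwidth in [("One HBM3 stack", 819.0), ("One DDR5 module", 51.2)]:
    rate = max_tokens_per_second(MODEL_PARAMS_BILLION, BYTES_PER_PARAM, bandwidth)
    print(f"{label}: at most {rate:.1f} tokens/s")
```

Real systems batch requests, cache intermediate results and rarely sustain peak bandwidth, so these numbers are ceilings rather than predictions. The ratio between the two memory types is the point of the exercise.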
The importance of HBM to AI is further underscored by how hardware vendors use the technology in AI accelerators. HBM is a critical element of many modern AI accelerator designs. Nvidia uses HBM in its AI accelerators, including the company's H100, H200 and Blackwell chips. Similarly, Intel relies on HBM in its Gaudi line of AI accelerators.
HBM is also important for its role in driving innovation. Its development has led to advances in memory manufacturing, packaging and design. HBM's energy efficiency also makes it increasingly relevant for sustainable computing initiatives. Despite higher manufacturing complexity, the technology's lower power consumption during operation can help to reduce the carbon footprint of data centers and AI training facilities. Moreover, HBM's smaller form factor and improved thermal efficiency can lead to reduced cooling requirements in data centers, further contributing to environmental benefits.
The market for HBM technology is lucrative and growing. Mordor Intelligence forecasts the HBM market at $3.17 billion for 2025, growing at a compound annual growth rate of 26% from 2025 to 2030.
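To see what that growth rate implies, the sketch below compounds the 2025 figure forward year by year. It is plain arithmetic on the two numbers quoted above, not an additional forecast, and the function name `project_market` is hypothetical.

```python
def project_market(base_billion: float, cagr: float, years: int) -> float:
    """Compound a starting market size forward by a fixed annual growth rate."""
    return base_billion * (1 + cagr) ** years


BASE_2025_BILLION = 3.17  # 2025 market size quoted in the text
CAGR = 0.26               # 26% compound annual growth rate quoted in the text

for offset in range(6):
    size = project_market(BASE_2025_BILLION, CAGR, offset)
    print(f"{2025 + offset}: ${size:.2f} billion")
```

Compounding 26% for five years roughly triples the starting value, which puts the implied 2030 market at about $10 billion.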
How is HBM made?
The production of HBM requires specialized equipment and processes, which are refined with each new generation of the technology.
The semiconductor manufacturing process for HBM is complex and involves a series of steps, including the following:
- DRAM fabrication. The process begins with the production of individual DRAM dies.
- TSV formation. TSVs enable the connection of multiple dies and are created by etching narrow holes through a silicon wafer. Those holes are filled with conductive material that allows communication between the 3D stacked dies.
- Wafer thinning. The DRAM silicon wafers are thinned out to reduce the height of the stack and improve thermal performance.
- Die stacking. Multiple DRAM dies are stacked on top of each other.
- Interposer integration. The stacked DRAM is typically mounted on a silicon interposer, which acts as a bridge between the HBM stack and the host chip.
- Micro-bump formation. The connections between the dies, the stack and the interposer are created with micro-bumps, which are tiny solder balls that form the electrical contacts.
- Advanced packaging. The entire assembly is packaged using advanced techniques to create the final HBM module.
- Testing and quality control. Throughout the manufacturing process, there are multiple types of tests and ongoing quality control checks to ensure reliability and performance.
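One reason testing runs throughout the steps above is that defects compound in a stack: a single bad die or bad bond can spoil the whole assembly. The sketch below models that effect under simple assumptions, namely that failures are independent and that there is one bonding step per DRAM die. The 99% yield values are placeholders chosen for illustration, not measured industry figures, and the function name is hypothetical.

```python
def stack_yield(die_yield: float, layers: int, bond_yield: float = 1.0) -> float:
    """Probability that every die and every bonding step in a stack is good."""
    # Independent failures multiply: one factor per die and one per bond.
    return (die_yield ** layers) * (bond_yield ** layers)


DIE_YIELD = 0.99   # placeholder value (assumption)
BOND_YIELD = 0.99  # placeholder value (assumption)

for layers in (4, 8, 12):
    good = stack_yield(DIE_YIELD, layers, BOND_YIELD)
    print(f"{layers}-high stack: {good:.1%} expected good stacks")
```

Even with high per-step yields, taller stacks lose a growing share of assemblies, which helps explain why HBM costs more than planar memory and why dies are tested before they are stacked.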
