A primer on AI chip design
Four common AI chip types -- CPU, GPU, FPGA and ASIC -- are evolving alongside the fast-growing market for AI chip design. Read on to see what the future holds for AI chips.
The AI chip is key to the development, implementation and proliferation of AI today. These computer chips are foundational to powering and processing large-scale AI computations, and they're evolving at a rapid pace. Gartner forecasts worldwide AI chip revenue to more than double by 2027 -- a rise from an expected $53 billion in 2023 to $119 billion in 2027.
This primer explains what AI chips are, which features and types are available at the mass-market level and what applications they can support.
What are AI chips?
A "chip" refers to a microchip -- a unit of integrated circuitry that's manufactured at a microscopic scale using a semiconductor material. Electronic components, such as transistors, and intricate connections are etched into this material to enable the flow of electric signals and power computing functions.
An AI chip focuses on powering AI functions. AI workloads require massive amounts of processing power that general-purpose chips, like CPUs, typically can't deliver at the requisite scale. To achieve that processing power, AI chips need to be built with a large number of faster, smaller and more efficient transistors.
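To get a sense of that scale, consider a rough, back-of-the-envelope sketch in Python. The layer and batch sizes below are hypothetical, chosen only to illustrate how quickly the arithmetic adds up:

```python
# Back-of-the-envelope estimate of the compute in one matrix multiplication,
# the core operation behind most neural network layers. The dimensions are
# hypothetical and chosen purely for illustration.

def matmul_flops(m: int, k: int, n: int) -> int:
    """FLOPs to multiply an (m x k) matrix by a (k x n) matrix:
    each of the m * n outputs needs k multiplies and k adds."""
    return 2 * m * k * n

# One pass through a single 4,096 x 4,096 layer for a batch of 64 inputs.
flops = matmul_flops(64, 4096, 4096)
print(f"{flops:,} FLOPs for one layer")  # roughly 2.1 billion operations

# Large models chain hundreds of such layers, and training repeats the math
# over billions of inputs -- the scale at which general-purpose CPUs stall.
```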
Building such dense, efficient chips has historically been challenging. But, as Moore's Law predicted, manufacturing has advanced to the point where chips can hold more transistors than ever before. This is largely what has lowered the barriers to AI in recent years.
Adding transistors to a microchip isn't the only way to power AI calculations. Manufacturers are designing chip features to optimize for specialized AI functions.
Features of AI chips
Although Moore's Law has advanced AI chip design, it will eventually become impossible to fit more transistors on a chip, even on a microscopic scale. It's also expensive to add more transistors to chips -- prohibitively so on a mass market level. As such, manufacturers now focus on more effective chip architecture to achieve similar results.
This is where accelerators come in. AI accelerators boost the processing speeds of AI workloads on a chip as well as enable greater scalability and lower system latency. These accelerators are key to rapidly turning data into information that AI algorithms can consume, learn from and use to generate more accurate outputs.
Accelerators focus on speed because AI workloads are complex: to produce results quickly, a substantial number of calculations must run in parallel rather than consecutively. Specially designed accelerator features deliver the parallelism and rapid calculation AI workloads require, but with fewer transistors. A regular microchip would need considerably more transistors than a chip with AI accelerators to accomplish the same AI workload.
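The contrast is easy to see in ordinary software. The following Python sketch compares an element-by-element loop with a single vectorized call; NumPy stands in here for the parallel math units an accelerator provides, and the array sizes are arbitrary:

```python
import time
import numpy as np

# Illustrative contrast between element-by-element (sequential) work and a
# vectorized call that hardware can execute in parallel.

a = np.random.rand(2_000_000)
b = np.random.rand(2_000_000)

# Sequential: one multiply-add at a time in a Python loop.
start = time.perf_counter()
total = 0.0
for i in range(len(a)):
    total += a[i] * b[i]
seq_time = time.perf_counter() - start

# Parallel-friendly: a single vectorized dot product that the hardware can
# spread across SIMD lanes and multiple cores.
start = time.perf_counter()
total_vec = a @ b
vec_time = time.perf_counter() - start

print(f"loop: {seq_time:.3f}s  vectorized: {vec_time:.4f}s")
```

On typical hardware the vectorized version finishes orders of magnitude faster, for the same reason chips with accelerators outpace purely sequential designs.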
This focus on speedier data processing in AI chip design is something data centers should be familiar with. It's all about boosting the movement of data in and out of memory, enhancing the efficiency of data-intensive workloads and supporting better resource utilization. This approach impacts every feature of AI chips, from the processing unit and controllers to the I/O blocks and interconnect fabric.
AI-optimized features are key to the design of AI chips and the foundation of accelerating AI functions, which avoids the need and cost of adding more transistors.
Types of AI chips and their traits
AI features can be built into many kinds of microchips, but there are certain types of chips that are well-suited for different AI uses. These are the most common types of AI chips, moving from general-purpose to more narrowly focused chips:
- CPUs. These are general-purpose chips typically built to perform sequential tasks. CPUs can process simpler AI workloads, but their processing performance tends to fall off quickly compared to more specialized chips.
- GPUs. These are also general-purpose chips but are typically built to perform parallel processing tasks. GPUs were initially designed to perform many complex graphics calculations at once to display visuals for video games. That focus on parallel processing has translated almost perfectly from graphics to AI computations and made GPUs one of the go-to chips for training AI algorithms (the short sketch after this list shows how software targets them).
- Field-programmable gate arrays (FPGAs). These are chips based on programmable logic blocks. These blocks can interconnect in a variety of ways to perform complex functions. Like GPUs, they can support parallel processing. But unlike CPUs and GPUs, they are not general purpose. FPGAs are usually programmed for specific functions, but they can be reprogrammed as needed.
- Application-specific integrated circuits (ASICs). These are not general-purpose chips. They are customized and built to support specific applications. ASIC designs are often prototyped on FPGAs, so the two offer similar computing ability, but an ASIC's circuitry is fixed at fabrication and users cannot reprogram it. Google's Tensor Processing Unit is an example of an ASIC custom developed to accelerate ML workloads.
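As a concrete illustration of how software picks among these chips, here is a minimal sketch using PyTorch -- one common ML framework, assumed here to be installed. The tiny model is a placeholder rather than a real workload:

```python
import torch
import torch.nn as nn

# Pick the most capable chip available: a GPU (CUDA) if present,
# otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(512, 10).to(device)         # move the model's weights to the chosen chip
batch = torch.randn(32, 512, device=device)   # allocate the input on the same chip

output = model(batch)  # the matrix math runs on whichever chip was selected
print(f"ran on: {output.device}")
```

The same code runs unchanged on a CPU or a GPU; frameworks generally expose more specialized chips, such as TPUs, through similar device abstractions.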
Other chips are being developed for even more specific uses. For example, cloud and edge AI chips handle inference on cloud servers or on edge devices, such as phones, laptops and IoT devices. These are built specifically to balance cost and performance for AI computing in cloud and edge applications.
Applications for AI chips
AI chips can greatly enhance the speed at which AI, ML and deep learning algorithms are trained and refined, which is particularly useful for accelerating the development of large language models, such as those behind ChatGPT, and other generative AI functionality. From AI assistants such as chatbots to automation in hardware, the applications are found across industries.
AI chips can power more efficient data processing on a massive scale, helping data centers run greatly expanded, more complex workloads efficiently. In a data-intensive environment such as a data center, AI chips will be key to speeding up data movement, making data more available and fueling data-driven solutions. As a result, data centers can use less energy while still achieving higher levels of performance.
With the rapid evolution of AI chips, data center managers and administrators should stay informed of new chips being announced and released. Doing so will help them ensure their organizations can meet their data-intensive processing needs at scale.
Jacob Roundy is a freelance writer and editor specializing in a variety of technology topics, including data centers and sustainability.