The shortage of AI chips continues, but there may be an end in sight.

The adoption of generative artificial intelligence (AI) continues to rise, but the infrastructure needed to support this growth faces a significant gap between supply and demand. According to an IDC analysis, 66% of companies worldwide will invest in the technology over the next 18 months, and by 2024 infrastructure will account for 46% of total spending. However, a key piece of hardware needed to build that infrastructure is in short supply.

The rapid pace of AI adoption over the past two years has tested the industry’s ability to supply the specialized high-performance chips needed to run intensive processing operations. Much attention has focused on the soaring demand for Nvidia’s GPUs and on alternatives from chip designers such as AMD and Intel, as well as on hyperscale data center operators, according to Benjamin Lee, a professor in the University of Pennsylvania’s Department of Computer and Information Science. “Much less attention has been paid to the surge in demand for high-bandwidth memory (HBM) chips, manufactured in Korean foundries run by SK Hynix.”

Last week, SK Hynix reported that its HBM products, which are needed alongside high-performance GPUs to handle AI processing requirements, are almost fully booked through 2025 due to high demand. HBM prices have recently risen by 5% to 10%, driven by significant premiums and higher capacity needs for AI chips, according to TrendForce.

HBM chips are expected to account for more than 20% of total DRAM market value starting in 2024, and potentially exceed 30% by 2025, according to Avril Wu, senior research vice president at TrendForce. “Not all major providers have cleared customer qualifications for [high-performance HBM], leading buyers to accept higher prices to secure stable, high-quality supplies.”

According to Lee, without HBM chips, a data center server’s memory system would not be able to keep up with a high-performance processor like a GPU. HBM feeds GPUs the data they process. “Anyone buying a GPU for AI computing will also need high-bandwidth memory.”

“In other words, high-performance GPUs would be underutilized, often sitting idle waiting for data transfers. The high demand for SK Hynix’s memory chips follows from the high demand for Nvidia’s GPU chips and, to a lesser extent, from demand for alternative AI chips, such as those from AMD, Intel, and others,” Lee explains.
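Lee’s bottleneck argument can be made concrete with a simple roofline-style calculation. The sketch below is purely illustrative: the peak-throughput, bandwidth, and arithmetic-intensity figures are assumptions chosen for the example, not specifications of any actual GPU or memory product.

```python
# Roofline-style check: how much of a GPU's peak compute can a
# workload use, given the memory bandwidth feeding it?
# All figures are illustrative assumptions, not vendor specs.

PEAK_FLOPS = 1_000e12  # assumed GPU peak throughput: 1,000 TFLOP/s

def attainable_flops(intensity_flops_per_byte: float,
                     bandwidth_bytes_per_s: float) -> float:
    """Achievable FLOP/s is capped either by peak compute or by how
    fast memory can deliver operands (intensity x bandwidth)."""
    return min(PEAK_FLOPS, intensity_flops_per_byte * bandwidth_bytes_per_s)

# Assumed arithmetic intensity (FLOPs per byte moved) for the large
# matrix multiplications typical of AI workloads.
INTENSITY = 300.0

for name, bandwidth in [("HBM stack", 3.35e12),          # ~3.35 TB/s (assumed)
                        ("conventional DRAM", 0.4e12)]:  # ~0.40 TB/s (assumed)
    utilization = attainable_flops(INTENSITY, bandwidth) / PEAK_FLOPS
    print(f"{name}: {utilization:.0%} of peak compute utilized")
```

Under these assumed numbers, the HBM-fed GPU can reach full utilization, while the same GPU on conventional DRAM tops out near 12% of peak, spending the rest of its cycles idle waiting for data: exactly the underutilization Lee describes.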

Gaurav Gupta, a Gartner analyst, adds that HBM is relatively new and gaining strong momentum because of what it offers: more bandwidth and capacity. “It’s different from what Nvidia and Intel sell. Apart from SK Hynix, the situation for HBM is similar across the other memory manufacturers. For Nvidia, the constraints have more to do with the ability to package its chips at the foundries.”

Despite SK Hynix reaching its supply limits, Samsung and Micron are ramping up HBM production and should be able to meet demand as the market becomes more distributed, according to Lee. The current HBM shortage stems primarily from the packaging step, chip-on-wafer-on-substrate (CoWoS), for which TSMC is the exclusive provider. Lee notes that TSMC is more than doubling its SoIC capacity and increasing CoWoS capacity by over 60%. “I expect the shortage to ease by the end of this year.”

At the same time, more packaging and foundry providers are coming online and qualifying their technology to support Nvidia, AMD, Broadcom, Amazon, and others that currently rely on TSMC’s chip packaging technology, according to Lee.

Nvidia, whose production accounts for around 70% of the global supply of AI server chips, is expected to generate $40 billion in GPU sales this year, according to Bloomberg analysts. In comparison, Intel and AMD are expected to generate $500 million and $3.5 billion, respectively. All three are ramping up production as fast as they can.

According to TrendForce, Nvidia is addressing the GPU supply shortage by increasing its CoWoS and HBM production capabilities. “This proactive approach is expected to halve the current average delivery time of 40 weeks by the second quarter [of 2024], as new capacities come online,” the TrendForce report states.

Shane Rau, IDC’s vice president of computing semiconductor research, comments that while demand for AI chip capacity is very high, markets are adapting. “In the case of server-class GPUs, there is an increase in wafer, packaging, and memory supply. Increasing supply is key because, due to their performance and programmability, server-class GPUs will remain the preferred platform for training and running large AI models.”

Global spending on AI-focused chips is expected to reach $53 billion this year and more than double over the next four years, according to Gartner. Chip manufacturers are rolling out new processors as quickly as they can.

Intel has announced plans for chips aimed at powering AI functions with its Gaudi 3 and Xeon 6 processors. Meanwhile, AMD has highlighted its MI300 GPU for AI data center workloads, which is also gaining traction in the market. In addition, more than 80 semiconductor providers are developing specialized chips for AI.

On the software side, LLM creators are developing smaller models designed for specific tasks, which require fewer processing resources. Intel’s strategy likewise aims to enable generative AI across all types of computing devices, from laptops to smartphones.

However, without HBM, these processors would likely struggle to keep up with the high-performance demands of generative AI.
