NVIDIA Boosts Its AI Chip Sales as Google, Amazon, and Others Bet on Custom Processors

NVIDIA is experiencing one of the sweetest moments in its history: its AI GPU sales are “off the charts,” well above forecasts, and its profits have soared thanks to the craze for training and running generative AI models in the cloud. But while Jensen Huang’s company dominates the present, more and more tech giants are building their own future with custom chips.

Google, Amazon, Microsoft, OpenAI, Apple, Meta, and Tesla are pushing a new generation of AI processors: custom ASICs, FPGAs, and NPUs for edge computing that aim to reduce reliance on NVIDIA and lower the cost per computation. The result is a much more fragmented and competitive AI hardware landscape than just a few years ago.


From Gaming GPUs to the Engine Powering the AI Revolution

Although it may seem obvious today, NVIDIA’s dominance was not set in stone. Its GPUs originally emerged as graphics accelerators for video games, but in 2012, they made a historic leap when used to train AlexNet, the neural network widely considered the “big bang” of modern AI.

While other competitors in that famous computer vision contest relied on CPUs, AlexNet exploited the massive parallelism of GPUs to train the model much faster and more accurately. That moment changed the narrative: the cards that once powered 3D graphics also proved excellent for matrix multiplication and deep neural network training.

Today, NVIDIA’s GPUs are sold in full rack systems, like the GB200 NVL72, which combines 72 Blackwell GPUs working together as if they were a single giant GPU. The company says these racks sell for around $3 million each and that it is shipping approximately 1,000 of them per week worldwide; over the past year, it has likely delivered around 6 million of its latest-generation Blackwell GPUs.

These machines are not confined to labs; they power data centers for Amazon, Microsoft, Google, Oracle, and CoreWeave, as well as for governments such as those of South Korea, Saudi Arabia, and the UK. Large language models like those from OpenAI or Anthropic are trained on hundreds of thousands of these GPUs.

The key isn’t just the hardware: CUDA, NVIDIA’s proprietary software platform, has become the de facto standard for GPU programming in AI—something even AMD, with its more open ecosystem, has not matched in terms of community and tools.
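To make that concrete, here is a minimal, illustrative sketch (assuming PyTorch is installed with CUDA support) of the kind of matrix multiplication described above, dispatched to an NVIDIA GPU through the CUDA stack:

```python
# Minimal sketch: the matrix multiplication at the heart of neural networks,
# dispatched to an NVIDIA GPU through CUDA via PyTorch.
# Assumes PyTorch is installed with CUDA support; falls back to CPU otherwise.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Two large matrices, roughly the shape of a single neural-network layer.
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

# One call: PyTorch hands the work to CUDA kernels (e.g. via cuBLAS),
# which run thousands of multiply-accumulate operations in parallel.
c = a @ b

print(f"Computed a {c.shape[0]}x{c.shape[1]} product on {device}")
```

The value is not in these few lines but in the layers beneath them: frameworks such as PyTorch translate that single call into CUDA kernels and libraries like cuBLAS, and it is this accumulated software stack, as much as the silicon itself, that rivals struggle to replicate.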


ASICs: Custom Chips for Greater Control by Google, Amazon, and OpenAI

As the market matures, many hyperscalers are seeking alternatives to a “Swiss Army knife” like the GPU. For inference—when a trained model responds to user queries—simpler, specialized, and more efficient chips are gaining importance: ASICs (Application-Specific Integrated Circuits).

An ASIC is the opposite of versatile. It’s “wired” to perform a specific calculation extremely efficiently. Once manufactured, it cannot be reprogrammed, meaning it sacrifices flexibility for speed and lower cost per operation.

Google: A Decade of TPU Experience

Google pioneered this approach. In 2015, it began deploying its first Tensor Processing Unit (TPU), an ASIC designed from scratch to accelerate AI tasks, in its own data centers. Since then, it has iterated through generations, culminating in its seventh generation, TPU Ironwood, announced in November 2025.

TPUs are used internally for products like Search, YouTube, Translate, and Gmail, as well as powering Google Cloud infrastructure. Anthropic, for example, has announced it will train its Claude model on up to 1 million TPUs.

Some experts believe that, in certain scenarios, TPUs match—or even outperform—NVIDIA’s GPUs in raw performance for specific tasks. Google has been selective in offering them to third parties, but market pressure might encourage broader access in the future.

Amazon: Inferentia and Trainium for Inference and Training on AWS

Amazon Web Services got a head start in ASICs after acquiring the startup Annapurna Labs. In 2018, it introduced Inferentia for inference, and in 2022, Trainium for training. The second generation, Trainium2, now powers one of the company’s largest AI data centers, in Indiana, where Anthropic trains its models on hundreds of thousands of dedicated chips.

The chief architect of AWS’s Trainium has stated that these ASICs deliver 30% to 40% better price-performance than other available accelerators. However, AWS still orders large volumes of NVIDIA GPUs for clients who prefer that ecosystem.

OpenAI and Broadcom: The Next Step

OpenAI also aims to reduce dependence on third-party chips. The company has signed an agreement with Broadcom to develop its own AI ASICs starting in 2026. Broadcom has previously collaborated on Google’s TPUs and Meta’s accelerators, establishing itself as one of the “silent winners” in the AI boom.

Meanwhile, Microsoft develops chips like Maia 100 for its data centers, and other players such as Tesla, Qualcomm, Huawei, ByteDance, and Alibaba are also working on their own designs.


AI at the Edge: NPUs and Chips Embedded in PCs, Mobile Devices, and Vehicles

Not all AI processing happens in the cloud. An increasing portion runs directly on devices—smartphones, laptops, cars, cameras, or robots—thanks to NPUs (Neural Processing Units) and other accelerators integrated into SoCs.

The goals are twofold:

  • Reduce latency by avoiding data transmission to data centers.
  • Enhance privacy by keeping sensitive data on the device itself.

Manufacturers like Qualcomm, Intel, and AMD are integrating NPUs into their PC processors, while Apple includes a dedicated “Neural Engine” in its M-series chips for Macs and its A-series chips for iPhones. The latest high-end Android devices likewise rely on the NPUs built into Qualcomm’s Snapdragon and Samsung’s Exynos chips.

These units enable local assistants, real-time translation, advanced photo and video editing, and security functions without constantly relying on cloud connectivity. While most AI expenditure today is concentrated in data centers, many analysts predict that edge AI spending will grow rapidly as these functions become more common in daily life.
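As a rough illustration of what this looks like from application code, the sketch below uses ONNX Runtime, one common cross-platform way to reach these accelerators; the model file name and the list of NPU-backed execution providers are assumptions that depend on the specific device and runtime build:

```python
# Minimal sketch of on-device inference with ONNX Runtime, one common way to
# target NPUs from application code. "model.onnx" is a placeholder; which
# execution providers are actually available depends on the device and build.
import numpy as np
import onnxruntime as ort

# Prefer an NPU-backed provider when present (e.g. Qualcomm's QNN on Snapdragon
# or Apple's Core ML), and fall back to the CPU otherwise.
preferred = ["QNNExecutionProvider", "CoreMLExecutionProvider", "CPUExecutionProvider"]
available = ort.get_available_providers()
providers = [p for p in preferred if p in available] or ["CPUExecutionProvider"]

session = ort.InferenceSession("model.onnx", providers=providers)

# Dummy input matching the model's expected shape (assumed here to be 1x3x224x224).
input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)

outputs = session.run(None, {input_name: dummy})
print(f"Ran locally with providers: {session.get_providers()}")
```

On a phone or laptop with a supported NPU, a script like this picks up the hardware provider automatically; on anything else it simply falls back to the CPU, which is the kind of graceful degradation these on-device features depend on.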


FPGAs: The Flexible Piece of the Puzzle

The fourth major chip type in this ecosystem is the FPGA (Field-Programmable Gate Array), a device whose logic can be reconfigured via software. FPGAs are neither as efficient as ASICs nor as powerful as GPUs for training massive models, but they strike an interesting balance: they can adapt to new algorithms after they have been manufactured.

Companies like AMD (after acquiring Xilinx) and Intel (after acquiring Altera) dominate this segment. FPGAs are used in networking, telecommunications, industrial automation, automotive applications, and sometimes as inference accelerators for very specific tasks where flexibility is key.


TSMC: The Main “Common Denominator” of the AI Revolution

Behind almost all these chips—NVIDIA and AMD GPUs, Google TPUs, Amazon Trainium, OpenAI ASICs, Apple and Qualcomm NPUs—stands a single manufacturer: TSMC (Taiwan Semiconductor Manufacturing Company).

The Taiwanese foundry produces the world’s most advanced nodes and has become a strategic player in the AI supply chain. From its new plant in Arizona to its factories in Taiwan, the company is responsible for the physical existence of many of these chips.

This introduces a layer of geopolitics and risk of concentration: although chip design competition is heating up, manufacturing capacity remains highly concentrated among a few players, with TSMC leading the way.


NVIDIA Still at the Center of the Board

With such a diverse array of chips, the big question is whether NVIDIA risks losing its throne. For now, most analysts say no, at least not in the short term.

The company has established itself as the de facto hardware standard and has built an ecosystem of developers, libraries, and tools over many years that is extremely difficult to replicate. Corporate clients and startups value this maturity, especially for production projects that cannot afford to rely on immature platforms.

ASICs will grow, NPUs will proliferate at the edge, and FPGAs will continue to serve their niches. But today, NVIDIA’s GPUs remain the backbone of the AI revolution. And as some experts remind us, that position is not accidental—it’s the result of decades of investment in hardware and software.


FAQs About AI Chips

Why are GPUs so important for AI?
Because they are designed to run thousands of simple mathematical operations in parallel, perfectly suited to the types of calculations neural networks require. This makes them ideal both for training large models and for running inference—generating responses based on those models.

What advantage do ASICs have over GPUs in AI?
ASICs sacrifice flexibility for efficiency: they are designed for a specific task and perform it with better performance per watt and lower cost per operation. They are especially attractive to large cloud providers that run the same workloads at scale.

What exactly is an NPU in a phone or computer?
A Neural Processing Unit (NPU) is an integrated accelerator in the device’s chip, designed to execute AI models locally. It supports features like advanced computational photography, smart assistants, or real-time translation without always relying on the cloud, reducing latency and improving privacy.

Could a competitor overthrow NVIDIA in the AI chip market?
In theory, yes, but it’s not just about raw performance. Competing requires matching NVIDIA’s software ecosystem (CUDA), its partner network, and its large-scale deployment capacity. That’s why many giants aim to combine NVIDIA’s GPUs with custom chips that give them more control over costs and workload management.

via: CNBC
