The profitability of AI factories: NVIDIA dominates with 77.6% margins, while AMD sinks into losses

The AI craze has opened a new chapter in the world of digital infrastructure: so-called AI factories, massive data centers designed to run AI inference on an industrial scale. But beyond the technological advances, what really interests analysts and investors is the economics behind them.

A recent report by Morgan Stanley has put figures on the table, and its conclusions are clear: NVIDIA leads with an operating profit margin of 77.6% thanks to its GB200 NVL72 racks, while competitors like Google, Amazon, and Huawei also post solid profits. AMD, by contrast, reports significant losses, with margins as low as -64% on its latest platforms.


What is an “AI Factory”?

The term “AI Factory” is not metaphorical: it describes a standardized data center model with 100 MW of capacity, designed as a yardstick for measuring the profitability of large-scale inference.

Morgan Stanley developed an analysis framework called the “100MW AI Factory Model”, which rests on three pillars:

  1. Standardized Compute Unit: 100 MW power consumption, equivalent to around 750 high-density server racks.
  2. Detailed Cost Breakdown: including construction costs ($660 million depreciated over 10 years), hardware purchases ($367 million to $2.273 billion depreciated over 4 years), and operational costs for electricity and cooling.
  3. Market Revenue Formula: tied to token production in language models, with an average price of $0.20 per million tokens, adjusted for an effective equipment utilization rate of 70%.

Using this methodology, Morgan Stanley estimates a typical AI factory’s annual total cost of ownership (TCO) to be between $330 million and $807 million, depending on hardware choices.
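
For readers who want to check the arithmetic, a minimal sketch of the model follows. The $172 million per year for electricity and cooling is not a figure quoted in the report; it is back-solved here from the published TCO endpoints and should be read as an assumption.

```python
# Minimal sketch of the 100MW AI Factory cost model described above.
# NOTE: IMPLIED_OPEX is back-solved from the report's TCO range, not a
# figure quoted by Morgan Stanley -- treat it as an assumption.

CONSTRUCTION_COST = 660e6   # $660M, depreciated over 10 years
CONSTRUCTION_YEARS = 10
HARDWARE_YEARS = 4          # hardware depreciated over 4 years
IMPLIED_OPEX = 172e6        # $/year for electricity + cooling (assumed)

def annual_tco(hardware_cost: float) -> float:
    """Annual total cost of ownership for a 100 MW AI factory."""
    construction = CONSTRUCTION_COST / CONSTRUCTION_YEARS  # $66M/year
    hardware = hardware_cost / HARDWARE_YEARS
    return construction + hardware + IMPLIED_OPEX

# The report's hardware range: $367M (cheapest) to $2.273B (GB200-class)
for hw_cost in (367e6, 2.273e9):
    print(f"hardware ${hw_cost / 1e6:,.0f}M -> TCO ${annual_tco(hw_cost) / 1e6:,.0f}M/year")
# Output lands at roughly $330M and $806M, matching the $330M-$807M range.
```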


NVIDIA: The Most Expensive but Most Profitable Investment

Each NVIDIA GB200 NVL72 rack contains 72 Blackwell B200 GPUs and 36 Grace CPUs, interconnected via high-speed NVLink 5. It costs around $3.1 million per rack, compared to roughly $190,000 for a previous-generation H100 server.

Despite this significant investment difference, the financial calculations are compelling:

  • Operating Margin: 77.6%
  • Annual TCO at 100 MW: $806.58 million
  • Revenue: high per-rack inference throughput translates into more tokens produced and sold, enabling record profitability.

NVIDIA’s leadership is not only due to the raw power of its chips but also the integration of a comprehensive software ecosystem (CUDA, TensorRT, optimized frameworks) that ensures every dollar spent on hardware translates into more tokens processed and thus higher revenues.
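
Working backwards from the published margin gives a sense of the revenue side. The figures below are hypothetical back-calculations from the report's numbers, using the $0.20-per-million-tokens price; neither the revenue nor the token volume is quoted directly.

```python
# Back-solving NVIDIA's implied revenue from the published figures.
# Operating margin = (revenue - cost) / revenue, so revenue = cost / (1 - margin).
# Both derived figures below are illustrations, not reported numbers.

TCO = 806.58e6             # $/year, GB200 NVL72 at 100 MW (from the report)
MARGIN = 0.776             # 77.6% operating margin (from the report)
PRICE_PER_M_TOKENS = 0.20  # $ per million tokens (from the report)

revenue = TCO / (1 - MARGIN)                       # ~ $3.6B/year
tokens_per_year = revenue / PRICE_PER_M_TOKENS * 1e6

print(f"implied revenue: ${revenue / 1e9:.1f}B/year")
print(f"implied output:  {tokens_per_year:.2e} tokens/year")
```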


Google, Amazon, and Huawei: Solid Profits

Morgan Stanley’s study shows that other tech giants also achieve positive profitability, albeit with nuances:

  • Google TPU v6e pods: 74.9% margin. The exact cost is unknown, but renting them is estimated to cost 40-50% less than an NVL72 rack. Their strength lies in vertical integration with Google Cloud and tight software-hardware co-optimization.
  • Amazon AWS Trn2 UltraServer: 62.5% margin. Amazon relies on its own hardware to cut rental costs and deliver optimized inference in its cloud.
  • Huawei Ascend CloudMatrix 384: 47.9% margin. Although not reaching NVIDIA or Google levels, it’s a cost-effective alternative with growth potential in Asia, especially after US chip export restrictions.

AMD: Unexpected Losses

The “cold shower” in the report involves AMD. Its platforms, MI300X and MI355X, aimed at high-performance AI, show margins of -28.2% and -64%, respectively.

The reasons are straightforward:

  • High initial costs, comparable to NVIDIA’s
  • Low inference efficiency, reducing token output and revenue

The annual TCO for an MI300X-based factory reaches $774 million, just below NVIDIA’s $806 million for a similar capacity. But unlike NVIDIA, which generates enough revenue to turn a profit, AMD cannot even cover its costs.
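
A back-of-the-envelope comparison makes the divergence concrete. The implied revenues below are derived from the published margins and TCOs, not quoted by Morgan Stanley.

```python
# Why similar TCOs diverge: back-solving implied revenue from margin.
# revenue = cost / (1 - margin); the results are derived, not quoted.

platforms = {
    "NVIDIA GB200 NVL72": (806.58e6, 0.776),   # (annual TCO, operating margin)
    "AMD MI300X":         (774e6,   -0.282),
}

for name, (tco, margin) in platforms.items():
    revenue = tco / (1 - margin)
    profit = revenue - tco
    print(f"{name}: implied revenue ~${revenue / 1e6:,.0f}M, "
          f"operating result ~${profit / 1e6:,.0f}M/year")
# NVIDIA: revenue ~$3,601M -> profit ~$2,794M/year
# AMD:    revenue ~$604M   -> loss  ~-$170M/year
```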

This is a tough blow for the many investors who saw AMD as a viable alternative in AI.


A New Business Model: Inference as a Factory

Morgan Stanley concludes that AI inference is no longer just a technological challenge, but a measurable and repeatable business model, with clear investment and return formulas.

AI factories could become the next trillion-dollar infrastructure class, comparable to power plants or telecom networks.

This raises urgent questions:

  • How will data center architects provision enough electrical capacity for GPUs drawing 1,200 W or more? (A rough sizing sketch follows this list.)
  • Can utilities scale generation and transmission in step with AI demand?
  • How can operators balance deployment speed, liquid cooling, and regulatory approvals to maximize profit?
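
As a rough illustration of the first question, the sketch below sizes a 100 MW facility against 1,200 W GPUs. The PUE and per-rack overhead figures are illustrative assumptions, not values from the report.

```python
# Rough power-sizing sketch: how many 1,200W-class GPUs fit in 100 MW?
# PUE and per-rack host/network overhead are illustrative assumptions.

SITE_POWER_W = 100e6      # 100 MW facility
GPU_POWER_W = 1_200       # per-GPU draw for a GB200-class part
GPUS_PER_RACK = 72        # NVL72-style rack
HOST_OVERHEAD_W = 25_000  # CPUs, NICs, switches per rack (assumed)
PUE = 1.25                # cooling/distribution overhead (assumed)

rack_it_power = GPUS_PER_RACK * GPU_POWER_W + HOST_OVERHEAD_W  # ~111 kW
rack_total_power = rack_it_power * PUE                         # ~139 kW
racks = SITE_POWER_W / rack_total_power

print(f"per-rack draw: {rack_total_power / 1e3:.0f} kW")
print(f"racks in 100 MW: {racks:.0f}")  # ~700-750, in line with the model
```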

Ecosystems and the Next Battle

The report also warns that the next war won’t be fought over chips alone, but over connectivity ecosystems.

Beyond NVIDIA, AMD is promoting UALink, an open low-latency standard for interconnecting GPUs. Meanwhile, companies like Broadcom advocate Ethernet as a flexible alternative. The outcome of this battle could determine whether an open ecosystem emerges that can compete with NVLink, NVIDIA’s proprietary interconnect.

While this unfolds, NVIDIA advances with its roadmap. Its next platform, “Rubin”, scheduled for 2026, promises to raise the stakes even further.


Conclusion: Unequal Profitability in the Race for AI

Morgan Stanley’s figures highlight a stark divergence:

  • NVIDIA and Google set the pace with margins nearing 80%
  • Amazon and Huawei show that profitability is possible even in a hyper-competitive market
  • AMD, on the other hand, faces losses that cast doubt on its AI strategy

Inference, which the report says will account for 85% of the future AI market, is no longer just about powerful chips but about precise economics and industrial scalability. The message is clear: AI factories will not only produce tokens and models, they will also generate unprecedented profit margins for those who master both the technological and the financial equation.
