China Strikes Back at the US: Baidu Launches Offensive with New Kunlun M100 and M300 AI Chips

Beijing has just sent a clear message to Washington: China doesn’t intend to fall behind in the artificial intelligence race, even if the US tries to tighten the flow of advanced chips. At its Baidu World 2025 conference, the Chinese tech giant unveiled two new AI processors—Kunlun M100 and Kunlun M300—and a massive supernode architecture called Tianchi, which, on paper, targets the dominance that Nvidia and other Western manufacturers currently hold in AI computing.

This isn’t just another launch. It’s the centerpiece of a long-term strategy: providing China with its own “controllable” computing power, free from Washington’s restrictions.


From Washington’s sanctions to Beijing’s “Plan B”

Since 2022, the US has been tightening restrictions on the export of high-performance chips and manufacturing equipment to China, with rules directly limiting the sale of Nvidia and AMD GPUs and access to critical lithography and design tools.

The declared goal of the White House is to slow Beijing’s ability to train cutting-edge AI models, especially those with potential military or mass surveillance uses. As a side effect, Chinese tech giants have been forced to seek domestic alternatives to train their language and multimodal systems.

Huawei led the way by demonstrating that it’s possible to manufacture 7 nm SoCs with SMIC—such as the Kirin 9000S in the Mate 60 Pro—even under sanctions and without access to EUV lithography. But in the realm of high-performance AI accelerators, the missing piece was chips capable of competing, at least partially, with Nvidia’s GPUs in data centers.

This is where Baidu comes in, with its Kunlun line.


Kunlun M100 and M300: two weapons in the AI war

At Baidu World 2025, the company announced its next-generation AI chips: the M100, optimized for large-scale inference, and the M300, designed for training and inference of ultra-large, multimodal foundational models.

The key features include:

  • Kunlun M100
    • Focused on massive inference environments, where millions of AI requests are served daily.
    • Especially optimized for Mixture of Experts (MoE) architectures, activating only parts of the model based on the query to improve efficiency.
    • Commercial release expected in early 2026. Baidu markets it as the engine for running large-scale models like Ernie within China.
  • Kunlun M300
    • Intended as an all-in-one chip for training and inference of next-generation multimodal models with trillions of parameters.
    • Designed for models combining text, images, audio, and video within a unified framework.
    • Market arrival anticipated in early 2027.

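The Mixture of Experts idea the M100 reportedly targets can be illustrated with a minimal sketch. This is a hypothetical toy example, not Baidu’s implementation: a gating function scores a set of “experts,” and only the top-k scoring experts actually run for a given input, which is where the efficiency gain comes from.

```python
# Minimal Mixture-of-Experts routing sketch (illustrative only;
# not Baidu's implementation). Each "expert" stands in for a
# feed-forward sub-network; a router scores them and only the
# top-k are evaluated per input.

def make_expert(weight):
    # A stand-in for a feed-forward sub-network.
    return lambda x: x * weight

EXPERTS = [make_expert(w) for w in (0.5, 1.0, 2.0, 4.0)]

def router_scores(x, n_experts):
    # Toy deterministic gating function derived from the input.
    return [(x * (i + 1)) % 7 for i in range(n_experts)]

def moe_forward(x, top_k=2):
    scores = router_scores(x, len(EXPERTS))
    # Indices of the top-k scoring experts.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Only the selected experts run -- the rest of the model stays idle.
    return sum(EXPERTS[i](x) for i in top), top

out, active = moe_forward(3.0)  # only 2 of 4 experts execute
```

The key property is that compute per query grows with `top_k`, not with the total number of experts, so a model can scale its parameter count without a proportional increase in inference cost.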
Baidu has not disclosed manufacturing node details or all technical parameters for these chips—a common silence in the Chinese industry post-sanctions. However, the official message emphasizes three concepts: power, low cost, and above all, “controllable” and domestic computing—targeted at Chinese clients who can no longer or do not want to depend on US hardware.


Tianchi 256 and 512: supernodes aiming to mimic (and challenge) Nvidia

The other significant part of the announcement is the architecture of the supercomputing system accompanying the chips. Baidu revealed two “Tianchi” supernodes — the Tianchi 256 and the Tianchi 512 — which interconnect hundreds of accelerators within a single logical compute unit, similar to systems like Nvidia’s GB200 NVL72 or Huawei’s CloudMatrix 384.

Based on information shared by the company and specialized media:

  • The Tianchi 256 links 256 P800 chips (current Kunlun generation) and is expected to be available in the first half of 2026.
  • The Tianchi 512 will scale up to 512 chips in the latter half of that year.

Baidu claims that, compared to the previous Tianchi supernode unveiled in April, the new Tianchi 256 quadruples total bandwidth, improves overall performance by over 50%, and achieves a 3.5x increase in token throughput per GPU in large-model inference workloads.

The concept is clear: although each individual chip may lag one or two steps behind Nvidia’s most advanced GPUs, the interconnection network and cluster design aim to offset this gap by adding more cards and optimizing intercommunication.
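That trade-off can be illustrated with back-of-the-envelope arithmetic. All figures below are hypothetical, not published specs from Baidu or Nvidia: the point is simply that a larger cluster of weaker chips can match the aggregate throughput of a smaller cluster of stronger ones, as long as interconnect overhead doesn’t eat the gains.

```python
# Back-of-the-envelope aggregate-throughput sketch (hypothetical
# numbers, not published specs): more, slower accelerators can match
# fewer, faster ones if interconnect efficiency holds up.

def cluster_throughput(chips, per_chip_tokens_s, scaling_efficiency):
    # scaling_efficiency models communication overhead:
    # 1.0 = perfect linear scaling; lower values = losses to
    # inter-chip synchronization and data movement.
    return chips * per_chip_tokens_s * scaling_efficiency

# Hypothetical "fast GPU" cluster: fewer chips, high per-chip throughput.
fast = cluster_throughput(chips=72, per_chip_tokens_s=1000,
                          scaling_efficiency=0.90)

# Hypothetical "domestic" cluster: many weaker chips, good interconnect.
domestic = cluster_throughput(chips=256, per_chip_tokens_s=300,
                              scaling_efficiency=0.85)
```

With these made-up numbers the two clusters land roughly on par, which is why interconnect bandwidth and scaling efficiency, rather than raw per-chip performance, are the figures Baidu leads with.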


A roadmap to 2030: from 256-card supernodes to a million-card cluster

Beyond specific products, Baidu has outlined a five-year roadmap for the Kunlun family, targeting an ambitious goal: building, by 2030, a cluster of one million Kunlun cards within a single “mega AI infrastructure” powering its Baige platform.

The announced milestones include:

  • 2026: Deployment of Tianchi 256 and Tianchi 512.
  • 2028: Arrival of “multi-thousand-card” supernodes, positioning China among the world’s top training systems.
  • 2030: A cluster of one million Kunlun cards operating as a unified logical unit, at least on paper.

Additionally, Baidu introduced Ernie 5.0, a multimodal language model with 2.4 trillion parameters capable of processing text, images, audio, and video, trained on its own infrastructure based on P800 chips.

This combination of chips, supernodes, and models shapes the narrative Baidu aims to project: China is building its own end-to-end “stack” of AI, from silicon to applications.


A direct blow to Nvidia’s dependency… within China

Baidu’s political message is clear. While the US tightens or relaxes export rules for GPUs to China, Beijing wants its large tech companies to minimize reliance on foreign chips for strategic data centers.

With the Kunlun M100 and M300, Baidu positions itself as a domestic alternative to Nvidia and AMD GPUs—at least for much of China’s corporate and cloud AI workloads. This isn’t just about raw performance; it’s also about regulatory security:

  • No dependence on US export licenses that can change overnight.
  • No risk of getting caught in new sanctions or geopolitical bans.
  • The ability for the Chinese government to require that sensitive data and critical models run on hardware deemed “safe and controllable.”

Simultaneously, Baidu’s strategy responds to another trend: the growing regulatory hostility in China toward foreign AI chips, with increased inspections of Nvidia imports and tighter control over what enters Chinese ports.


A real challenge to US AI leadership?

The big question for the global community is whether Baidu’s offensive can significantly cut into the advantage still held by the US and its allies in AI computing.

In the short term, the answer seems cautious: Nvidia chips still set the pace in manufacturing processes, software ecosystems (CUDA), libraries, and developer communities. Chinese foundries also remain several nodes behind TSMC or Intel in advanced lithography.

However, the landscape is shifting:

  • China has demonstrated it can manufacture 7 nm chips without EUV and iterate on that foundation.
  • Companies like Baidu, Huawei, and others are building supernode systems that scale more through volume than extreme chip sophistication.
  • Chinese AI models are narrowing the gap with US counterparts, with some estimates placing the difference at just months in certain areas.

In this context, the Kunlun M100 and M300 do not instantly turn China into an AI hardware leader, but they reinforce a clear trend: the country is no longer just a chip importer but a competitor with its own road map.


Frequently Asked Questions

How do Baidu’s Kunlun M100 and M300 chips differ for AI?
The Kunlun M100 is optimized for large-scale inference, especially for Mixture of Experts (MoE) models, and is designed to support assistants, chatbots, and cloud services with millions of requests daily. The Kunlun M300, on the other hand, targets both training and inference of next-generation multimodal foundational models capable of handling text, images, audio, and video. It is primarily reserved for Baidu’s most powerful clusters and major enterprise clients.

What role do the Tianchi 256 and Tianchi 512 supernodes play in Baidu’s AI strategy?
The Tianchi supernodes enable the grouping of hundreds of Kunlun P800 chips into a single logical computation unit, interconnected at very high speeds. Their goal is to offset performance gaps with foreign GPUs by increasing the number of cards and improving intercommunication. Baidu states that Tianchi 256 quadruples bandwidth over its previous iteration, boosts overall AI performance by over 50%, and increases token throughput per GPU by 3.5x in large-model inference workloads, forming the backbone of its future large-scale AI infrastructure.

How do Kunlun M100/M300 and Tianchi impact China’s dependency on Nvidia chips?
These products allow Baidu and other Chinese firms to reduce reliance on US export controls and mitigate geopolitical and regulatory risks associated with Nvidia GPUs. While Nvidia’s ecosystem remains dominant globally, Kunlun chips and Tianchi supernodes provide a domestically produced pathway for training and deploying advanced AI models within China, aligning with Beijing’s “security and control” priorities in key data centers.

What does Baidu’s plan to reach a cluster of one million Kunlun cards by 2030 entail?
This goal is part of Baidu’s five-year roadmap, aiming to establish one of the world’s largest AI computing infrastructures based on its own chips. Milestones include deploying the 256- and 512-card Tianchi supernodes in 2026 and multi-thousand-card supernodes in 2028, with the ultimate ambition of operating a nearly fully self-reliant, one-million-card cluster by 2030, enabling it to train and run enormous AI models largely independently of foreign hardware.
