CoreWeave Deploys Thousands of Blackwell Servers and Accelerates the AI Cloud War

CoreWeave has taken a significant step forward in the race to dominate infrastructure for advanced AI models. The company became one of the first cloud providers to bring NVIDIA GB200 NVL72 systems into production at scale for clients, with Cohere, IBM, and Mistral AI among the first confirmed users. NVIDIA and CoreWeave framed the deployment as the start of a new phase: moving from hardware announcements to real-world use for training, inference, and AI agents in production environments.

The importance of this move isn’t just in the number of GPUs, but in the type of system involved. GB200 NVL72 is not just another entry in the usual catalog of accelerators; it is a rack-scale platform connecting 72 Blackwell GPUs and 36 Grace CPUs within a single NVLink domain, with NVIDIA Quantum-2 InfiniBand networking to scale out the cluster. CoreWeave had already been the first cloud provider to announce general availability of GB200 NVL72-based instances, and the message now is that these machines are no longer just on paper: they are running actual workloads for some of the most prominent names in the AI market.

It’s not just brute force power: it’s the battle for the entire rack

For years, competition in AI cloud computing has been measured primarily in GPU count. With Grace Blackwell, NVIDIA aims to shift the focus toward the complete system: interconnects, unified memory, network topology, data throughput, and the ability to operate as a single logical machine at rack scale. For its part, CoreWeave is trying to position itself as the provider that reaches this transition first and can deploy it quickly for labs and companies that cannot wait months for the ecosystem to mature.

This positioning has a clear strategic implication. In AI cloud services, offering loose GPUs or generic clusters is no longer enough. Advanced customers are seeking architectures designed for reasoning, agents, and ever-larger models, where the bottleneck isn’t just the chip, but how the entire system performs when thousands of accelerators work together. CoreWeave has been building its brand around this idea of an “AI hyperscaler,” and the deployment of GB200 NVL72 is arguably one of the most visible pieces of that story.

What Cohere, IBM, and Mistral are doing with Blackwell

NVIDIA accompanied the announcement with concrete use cases, and three names illustrate the type of customer CoreWeave is targeting. Cohere is using these systems to develop secure enterprise applications and custom agents within its North platform. According to NVIDIA, the company is already seeing up to 3x the training performance on models with 100 billion parameters compared to Hopper, even without Blackwell-specific optimizations.

IBM, on the other hand, is using one of the first GB200 NVL72 deployments at the scale of thousands of GPUs to train the next generation of its Granite models, a family of open, enterprise-oriented models. Their partnership extends beyond computing: IBM also supplies its Storage Scale System as a high-performance storage layer for AI, which was announced back in January 2025 when both companies detailed the supercomputer CoreWeave would deliver to IBM for this purpose.

Mistral AI has already received its first batch of Blackwell GPUs to accelerate the development of new open models. In NVIDIA’s announcement, Mistral’s co-founder and CTO, Timothée Lacroix, states he has seen a 2x improvement in training dense models “straight out of the box,” meaning without additional tuning. The French company has been working with CoreWeave, but this new infrastructure generation allows them to push training and inference workloads at a different scale.

Table: Initial deployment breakdown of GB200 NVL72 at CoreWeave

| Client | Main use | Highlight |
|---|---|---|
| Cohere | Training and inference for enterprise AI and agents with North | Up to 3x more training performance for models with 100 billion parameters compared to Hopper, according to NVIDIA |
| IBM | Training the next generation of Granite | Deployment at the scale of thousands of Blackwell GPUs, supported by IBM Storage Scale System |
| Mistral AI | Training and deploying new open models | 2x improvement in dense-model training without additional optimizations, per NVIDIA |
| CoreWeave | Rack-scale cloud offering for AI | Instances with 72 Blackwell GPUs and 36 Grace CPUs; scale up to 110,000 GPUs with NVIDIA Quantum-2 InfiniBand |

The data in this table are not neutral benchmark results but figures and descriptions published by NVIDIA, CoreWeave, and IBM in their official announcements. Nonetheless, they help clarify where the focus is: less marketing about future promises, more messaging around real workloads, specific clients, and demonstrable performance gains.

Main message: the AI cloud is entering its industrial phase

There’s another nuance worth noting. CoreWeave isn’t just offering premium instances to a select few clients. According to its documentation, its infrastructure can scale up to 110,000 Blackwell GPUs over NVIDIA Quantum-2 InfiniBand. In parallel, the company has already posted record inference results with Grace Blackwell in MLPerf, reinforcing the idea that it aims to compete not only on installed capacity but also on measurable performance.
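For a sense of scale, a quick back-of-the-envelope calculation (an illustration based only on the figures quoted above, not an official CoreWeave breakdown) shows how many NVL72 racks a 110,000-GPU ceiling implies:

```python
import math

GPUS_PER_RACK = 72     # Blackwell GPUs in one GB200 NVL72 rack
GRACE_PER_RACK = 36    # Grace CPUs in one GB200 NVL72 rack
MAX_GPUS = 110_000     # scale ceiling stated by CoreWeave

# Racks needed to reach the stated GPU ceiling (rounded up)
racks = math.ceil(MAX_GPUS / GPUS_PER_RACK)
# Grace CPUs that come along with those racks
grace_cpus = racks * GRACE_PER_RACK

print(f"NVL72 racks: {racks}")        # 1528
print(f"Grace CPUs: {grace_cpus}")    # 55008
```

In other words, the advertised ceiling corresponds to on the order of 1,500 interconnected NVL72 racks, which is why the conversation shifts from individual GPUs to rack-level and cluster-level engineering.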

This aligns with a broader market shift. The conversation around AI is moving from “who has access to GPUs” to “who can operate complete AI factories,” with preintegrated racks, memory, networking, storage, and management software ready to deploy frontier models in production. NVIDIA talks about “AI factories,” CoreWeave emphasizes deployment speed, while clients like IBM, Cohere, and Mistral focus on throughput, costs, and time-to-first-response. These are different ways of describing the same trend: AI is no longer just built in labs but in industrial-grade infrastructure.

The big question now isn’t whether Blackwell will enter the cloud—because it already has—but which providers will be able to convert that early access into a sustained advantage. CoreWeave has positioned itself at the forefront with GB200 NVL72. Moving forward, the challenge will be maintaining that lead as the rest of the market responds with similar deployments, more capacity, and, likely, a new wave of price and performance competition in the AI cloud space.

Frequently Asked Questions

What exactly is NVIDIA GB200 NVL72?
It’s an NVIDIA rack-scale platform integrating 72 Blackwell GPUs and 36 Grace CPUs within a single interconnected system, designed for large-scale training, inference, reasoning, and AI agents.

Why is it important that CoreWeave has brought it into production?
Because moving from announced availability to real-world client deployment, by companies like Cohere, IBM, and Mistral, demonstrates that the platform is already running production workloads, not just tests or demos.

What improvements have early clients reported?
Cohere reports up to 3x performance in training for models with 100 billion parameters versus Hopper, while Mistral mentions a 2x boost in training dense models without additional tuning. IBM highlights expected acceleration for their Granite family.

How far can this infrastructure scale within CoreWeave?
CoreWeave states that their Blackwell instances accelerated by GB200 NVL72 can scale up to 110,000 GPUs with NVIDIA Quantum-2 InfiniBand networks.

via: blogs.nvidia
