In a new demonstration of leadership in AI infrastructure, CoreWeave (Nasdaq: CRWV) has become the world’s first cloud provider to deploy the NVIDIA GB300 NVL72 platform, a rack-scale solution integrating the new Blackwell Ultra GPUs. This move positions CoreWeave at the forefront of accelerated computing, surpassing competitors like AWS, Google Cloud, and Microsoft Azure in access to cutting-edge technology for generative AI workloads, agent inference, and multimodal reasoning.
An unprecedented performance leap
The GB300 NVL72 represents a significant advancement in NVIDIA’s Blackwell architecture. Its impressive figures include:
– Up to 10× faster response times for conversational agents.
– 5× higher performance per watt compared to Hopper (H100).
– 50× greater inference throughput for reasoning models.
Each NVL72 system contains:
– 72 NVIDIA Blackwell Ultra GPUs supporting FP4 inference and FP8 training.
– 36 Grace CPUs (based on ARM Neoverse) for control and general computing tasks.
– 36 BlueField-3 DPUs with advanced networking and security capabilities.
– A fifth-generation NVLink interconnect delivering 1.8 TB/s of bandwidth per GPU, unified across the rack by NVSwitch.
The inference capacity reaches 1.1 exaFLOPS of FP4 compute per rack, making it one of the most powerful publicly available systems. By comparison, NVIDIA’s DGX GH200, announced in 2023, offers 144 TB of shared memory but at a higher energy cost and with less tightly integrated racks.
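The headline number is easy to sanity-check. Below is a minimal back-of-the-envelope sketch in Python, assuming roughly 15 PFLOPS of dense FP4 throughput per Blackwell Ultra GPU (NVIDIA’s published per-GPU figure, not a number stated in this article):

```python
# Back-of-the-envelope check of the 1.1 exaFLOPS FP4 per-rack figure.
# Assumption: ~15 PFLOPS dense FP4 per Blackwell Ultra GPU (NVIDIA's
# published per-GPU number; sparse throughput would be higher).
GPUS_PER_RACK = 72
FP4_PFLOPS_PER_GPU = 15  # assumed dense FP4 throughput per GPU

rack_exaflops = GPUS_PER_RACK * FP4_PFLOPS_PER_GPU / 1_000  # 1,000 PFLOPS = 1 exaFLOPS
print(f"~{rack_exaflops:.2f} exaFLOPS of FP4 per rack")  # ~1.08, consistent with ~1.1
```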
Comparison with other hyperscalers
| Provider | Reference accelerator | Year | Peak compute (precision as noted) | Llama 3.1 405B training time | Cooling |
|---|---|---|---|---|---|
| CoreWeave | GB300 NVL72 (72× Blackwell Ultra) | 2025 | 1.1 exaFLOPS/rack (FP4) | 27.3 min | Custom liquid cooling |
| AWS | Trainium2 (Trn2) | 2024 | ~0.3 exaFLOPS (estimated) | ~55 min | Air (standard) |
| Microsoft Azure | ND MI300X v5 (AMD) | 2024 | 0.8 exaFLOPS/rack (FP8) | ~33 min | Immersion liquid cooling |
| Google Cloud | TPU v5p | 2024 | 0.5 exaFLOPS (bfloat16) | ~45 min | Liquid cooling |
CoreWeave not only outperforms its competitors in raw power but also stands out in energy efficiency and deployment speed. While AWS bets on its custom Trainium silicon and Microsoft partners with AMD, CoreWeave remains all-in on NVIDIA, a commitment that has made it the first to offer the GB300 NVL72 publicly to clients.
A comprehensive ecosystem: software, visibility, and DevOps
Beyond hardware, CoreWeave has integrated this new platform into its cloud-native software stack:
– CoreWeave Kubernetes Service (CKS): managing containers in AI environments.
– Slurm on Kubernetes (SUNK): ideal for HPC and massive training workloads.
– Rack Lifecycle Controller (RLCC): managing maintenance, power, and rack alerts.
– Integration with Weights & Biases: providing detailed hardware and cluster health monitoring.
This gives AI engineers unprecedented visibility into every GPU, rack, and region as they scale and fine-tune their models; a sketch of what submitting a job through CKS might look like follows.
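As an illustration of that workflow, here is a minimal sketch that submits a GPU pod to a CKS cluster using the official Kubernetes Python client. The node-selector label and the container image tag are illustrative assumptions, not documented CoreWeave identifiers:

```python
# Minimal sketch: requesting GB300 GPUs on a CKS cluster via the official
# Kubernetes Python client (pip install kubernetes). The node-selector label
# and the NGC image tag are assumptions for illustration only.
from kubernetes import client, config

config.load_kube_config()  # uses the kubeconfig issued for the CKS cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gb300-smoke-test"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        node_selector={"gpu.nvidia.com/class": "GB300_NVL72"},  # hypothetical label
        containers=[
            client.V1Container(
                name="cuda-check",
                image="nvcr.io/nvidia/pytorch:25.04-py3",  # assumed NGC image tag
                command=["nvidia-smi"],  # print the GPUs the scheduler granted
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "8"}  # ask for 8 of the rack's 72 GPUs
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

From there, SUNK layers Slurm semantics on top of the same cluster, so batch-style HPC training jobs and Kubernetes-native services can share the same GB300 fleet.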
Recognition and expansion
In June, CoreWeave achieved another milestone by training the 405-billion-parameter Llama 3.1 model in just 27.3 minutes on 2,496 GB200 GPUs. The result, submitted to MLPerf Training v5.0 in collaboration with NVIDIA and IBM, cements its position as a leader in AI computing.
Additionally, CoreWeave is the only provider to earn a Platinum rating under SemiAnalysis’s ClusterMAX™ system, which evaluates the efficiency, availability, and scalability of AI clouds.
Expert commentary
“The deployment of the GB300 NVL72 systems demonstrates that we’re entering a new era of cloud infrastructure, where performance per watt and the ability to scale autonomous agents are key. The integration of cutting-edge hardware, native software, and observability is what sets CoreWeave apart in this race,” said David Carrero, co-founder of Stackscale and expert in cloud and bare-metal infrastructure in Europe.
A commitment to the future of computing
With this deployment, CoreWeave not only brings the most powerful hardware available to market but also reaffirms its strategy: giving labs, startups, and AI companies an optimized, scalable environment ready for the next generation of foundation models.
The company plans to expand these systems to its data centers in North America and Europe during the second half of 2025. In a landscape where every second and watt counts, CoreWeave seems to be leading the way.