AWS and NVIDIA expand their collaboration to drive innovation in generative AI.

Amazon Web Services (AWS), an Amazon.com company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced that the new NVIDIA Blackwell GPU platform is coming to AWS. AWS will offer the NVIDIA GB200 Grace Blackwell Superchip and B100 Tensor Core GPUs, extending the companies' long-standing strategic collaboration to deliver the most advanced and secure infrastructure, software, and services, and to help customers unlock new capabilities in generative artificial intelligence (AI).

NVIDIA and AWS are combining the best of their technologies: the latest multi-node NVIDIA systems based on the NVIDIA Blackwell platform and NVIDIA AI software, the AWS Nitro System, AWS Key Management Service (AWS KMS), the petabit-scale Elastic Fabric Adapter (EFA) network, and Amazon Elastic Compute Cloud (Amazon EC2) UltraCluster hyper-scaling. Together, they provide the infrastructure and tools that enable customers to build and run real-time inference on multi-trillion-parameter large language models (LLMs) faster, at larger scale, and at lower cost than with previous-generation NVIDIA GPUs on Amazon EC2.

“The deep collaboration between our two organizations dates back more than 13 years, when we jointly launched the first GPU cloud instance on AWS, and today we offer the widest range of NVIDIA GPU solutions for customers,” said Adam Selipsky, CEO of AWS. “NVIDIA’s next-generation Grace Blackwell Superchip marks a significant step forward in generative AI and GPU computing. When combined with AWS’s powerful Elastic Fabric Adapter network, Amazon EC2 UltraCluster hyper-scaling, and the advanced virtualization and security capabilities of the AWS Nitro System, we make it possible for customers to build and run multi-trillion-parameter large language models faster, at larger scale, and more securely than anywhere else. Together, we continue to innovate to make AWS the best place to run NVIDIA GPUs in the cloud.”

“AI is driving advances at an unprecedented pace, leading to new applications, business models, and innovations across all industries,” said Jensen Huang, founder and CEO of NVIDIA. “Our collaboration with AWS is accelerating new generative AI capabilities and providing customers with unprecedented computing power to push the boundaries of what is possible.”

AWS will offer the NVIDIA Blackwell platform, which includes GB200 NVL72 with 72 Blackwell GPUs and 36 Grace CPUs interconnected by NVIDIA’s fifth-generation NVLink. Connected with Amazon’s powerful EFA network and supported by the advanced virtualization of the AWS Nitro System and Amazon EC2 UltraCluster hyper-scaling, customers can scale to thousands of GB200 Superchips. NVIDIA Blackwell on AWS delivers a major breakthrough in accelerating inference workloads for resource-intensive, multi-trillion-parameter large language models.

Building on the success of NVIDIA H100-powered EC2 P5 instances, available to customers through Amazon EC2 Capacity Blocks for ML, AWS plans to offer EC2 instances with the new B100 GPUs deployed in EC2 UltraClusters to accelerate generative AI training and large-scale inference. The GB200 will also be available on NVIDIA DGX™ Cloud, an AI platform co-developed with AWS, giving enterprise developers dedicated access to the infrastructure and software needed to build and deploy advanced generative AI models. Blackwell-powered DGX Cloud instances on AWS will accelerate development of state-of-the-art generative AI and LLMs that can exceed 1 trillion parameters.

The companies are also enhancing AI security with the AWS Nitro System, AWS KMS, encrypted EFA, and Blackwell encryption. Through Project Ceiba, NVIDIA and AWS aim to advance generative AI innovation by building one of the world’s fastest AI supercomputers, hosted exclusively on AWS. This unparalleled supercomputer, available for NVIDIA’s own research and development, will use the GB200 NVL72 system, featuring 20,736 B200 GPUs connected to 10,368 NVIDIA Grace CPUs and scaled out with fourth-generation EFA networking that provides up to 800 Gbps of low-latency, high-bandwidth throughput per Superchip. Capable of processing a massive 414 exaflops of AI, it represents a 6x performance increase over the earlier plan to build Ceiba on the Hopper architecture. NVIDIA’s research and development teams will use Ceiba to advance AI for LLMs, graphics (image/video/3D generation), simulation, digital biology, robotics, autonomous vehicles, climate prediction (NVIDIA Earth-2), and more, helping NVIDIA drive future generative AI innovations.

Additionally, AWS and NVIDIA are collaborating to accelerate the development of generative AI applications and advance use cases in healthcare and life sciences. The two companies are joining forces to deliver high-performance, cost-efficient infrastructure for generative AI inference through the integration of Amazon SageMaker with NVIDIA NIM™ inference microservices, available with NVIDIA AI Enterprise. Customers can use this combination to quickly deploy pre-compiled, optimized foundation models (FMs) on NVIDIA GPUs in SageMaker, reducing the time to market for generative AI applications.

AWS and NVIDIA have also teamed up to expand computer-aided drug discovery with new NVIDIA BioNeMo™ FMs for generative chemistry, protein structure prediction, and understanding how drug molecules interact with targets. These new models will soon be available on AWS HealthOmics, a service designed specifically to help healthcare and life sciences organizations store, query, and analyze genomics, transcriptomics, and other omics data.

The teams at AWS HealthOmics and NVIDIA Healthcare are also working together to launch generative AI microservices to advance drug discovery, medtech, and digital health, providing a new catalog of GPU-accelerated cloud endpoints for biology, chemistry, imaging, and health data, so healthcare companies can leverage the latest advances in generative AI on AWS.
