Amazon Web Services (AWS) and OpenAI have announced a multi-year strategic partnership that, effective immediately, makes Amazon’s cloud one of the cornerstones of OpenAI’s most advanced AI work. The agreement, valued at $38 billion and set to grow over the next seven years, grants OpenAI access to hundreds of thousands of NVIDIA GPUs on AWS infrastructure, with the ability to scale to tens of millions of CPUs, for training, inference, and next-generation agentic workloads.
The scale of the deployment confirms two industry trends: demand for frontier compute has surged, and frontier AI providers are raising the bar for performance, scale, and security on public clouds. AWS emphasizes that it operates clusters exceeding 500,000 chips and states that the infrastructure allocated to OpenAI will be fully deployed before the end of 2026, with room to expand into 2027 and beyond.
“Scaling frontier AI requires massive and reliable computation,” said Sam Altman, cofounder and CEO of OpenAI. “Our collaboration with AWS strengthens the large compute ecosystem fueling this new era and brings advanced AI to everyone.”
“As OpenAI continues pushing the boundaries of what’s possible, AWS’s world-class infrastructure will be the backbone of their AI ambitions,” added Matt Garman, CEO of AWS. “The breadth and immediate availability of our optimized compute demonstrate why AWS is uniquely positioned to support AI workloads of this magnitude.”
What the partnership includes: compute, networking, and software ready for large-scale AI
AWS and OpenAI have designed a dedicated AI architecture based on Amazon EC2 UltraServers, where GPU clusters — including GB200 and GB300 — are co-located within the same network to minimize latency between nodes and accelerate training and inference of large models. The cluster topology allows running everything from ChatGPT inference to training the next generation of models, with flexibility to adapt resources as needs change.
Practically, this means:
- Low-latency GPU clusters, suitable for token-intensive workloads and growing context windows.
- Elastic scaling to tens of millions of CPUs, useful for preprocessing, agent orchestration, RAG, and peripheral services surrounding the core model.
- Secure and reliable operation within AWS data centers, with proven experience in large-scale deployments and enterprise-grade security.
This partnership also deepens an existing relationship. Earlier this year, OpenAI’s open weights models became available on Amazon Bedrock, AWS’s managed service for foundation models. OpenAI has quickly become one of the most popular publicly available model providers on Bedrock, with thousands of clients, including Bystreet, Comscore, Peloton, Thomson Reuters, Triomics, and Verana Health, using these models for agentic workflows, assistive programming, scientific analysis, and mathematical problem-solving.
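For teams evaluating the Bedrock path, the sketch below shows what a minimal invocation looks like using boto3’s Converse API. The model identifier and region are illustrative assumptions; check the Bedrock model catalog in your account for the exact IDs available.

```python
# Minimal sketch: calling an OpenAI open weights model through Amazon Bedrock.
# Assumptions: boto3 is installed, AWS credentials are configured, and the
# illustrative model ID below is enabled in your account and region.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

response = bedrock.converse(
    modelId="openai.gpt-oss-120b-1:0",  # illustrative; verify in the Bedrock model catalog
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize what an agentic workflow is in two sentences."}],
    }],
    inferenceConfig={"maxTokens": 256, "temperature": 0.2},
)

print(response["output"]["message"]["content"][0]["text"])
```

The same call shape works for other Bedrock-hosted models, which is what makes single-console, multi-model operation practical.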
Why now: the “computing economy” in the frontier AI era
The AI industry is in a race for capacity. Each iteration of frontier models requires more data, more parameters, and more training steps. Meanwhile, the market demands shorter response times and more predictable costs in production. In this context:
- Performance: physically and logically close, low-latency GPU interconnects are critical for maintaining high utilization and a competitive TTFT (time to first token) during inference; a simple way to measure TTFT is sketched after this list.
- Scale: moving from pilots to production entails not only more GPUs but also orchestrating millions of worker threads, moving large volumes of embeddings and documents for RAG, and serving global traffic 24/7.
- Security and reliability: workload isolation, data governance, and comprehensive telemetry now matter as much as TFLOPS.
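As a concrete reference for the performance point, the sketch below measures TTFT on a streaming call with the OpenAI Python SDK. The model name is an illustrative assumption, and the same timing pattern applies to any streaming endpoint.

```python
# Minimal sketch: measuring TTFT (time to first token) on a streaming chat call.
# Assumptions: the openai Python SDK (v1+) is installed, OPENAI_API_KEY is set,
# and the model name is illustrative.
import time
from openai import OpenAI

client = OpenAI()

start = time.perf_counter()
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Name three uses of low-latency GPU clusters."}],
    stream=True,
)

ttft = None
for chunk in stream:
    if not chunk.choices:
        continue  # some chunks (e.g., usage) carry no choices
    if chunk.choices[0].delta.content and ttft is None:
        ttft = time.perf_counter() - start  # first generated token arrived
        break

print(f"TTFT: {ttft:.3f} s" if ttft is not None else "No tokens received")
```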
AWS emphasizes its expertise in designing, building, and operating clusters at this scale, and OpenAI gains an additional backbone for training and inference without disrupting its model roadmap.
Business value: from lab to production, with less friction
For companies building on ChatGPT or OpenAI models via API, and those consuming open weights models through Amazon Bedrock, this alliance has practical implications:
- Sustained capacity: more compute helps reduce queues and latency and absorb spikes during campaigns or launches.
- Faster innovation cadence: if OpenAI accelerates training of new versions, clients may see more frequent improvements in reasoning, context handling, or model safety.
- Multiple adoption paths: with Bedrock, many companies can integrate OpenAI models (and models from other providers) through a single console, with unified billing, AWS security controls, and consolidated telemetry.
Equally important: because the platform is designed for production and growth through 2027 and beyond, it helps avoid the “eternal testing” syndrome. The shared infrastructure (compute, network, and storage) reduces integration work and operational risk, two major barriers to moving from PoCs to real deployments.
What to expect in the tech stack?
While detailed network and software specifics haven’t been shared, the announcement outlines several architectural principles:
- EC2 UltraServers as the building block: clusters of NVIDIA GB200 and GB300 GPUs within the same network domain to maximize effective bandwidth and minimize latency during training and inference.
- Low-latency fabric: AWS’s interconnect aims to avoid bottlenecks that degrade p99 latency and drag down GPU utilization.
- Large-scale CPU/GPU hybridization: the ability to add tens of millions of CPUs enables disaggregation of auxiliary capabilities (preprocessing, sharding, dispatchers, agents) without robbing tensor cycles from GPUs.
In sum: GPUs for tensors, CPUs for control, all orchestrated in a high-speed, secure mesh.
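The split is easier to picture with a small, purely conceptual sketch: CPU workers do the parallel preprocessing, and only prepared batches reach the GPU-backed endpoint. The `call_gpu_endpoint` function is a hypothetical stand-in, not an API from the announcement.

```python
# Conceptual sketch of "GPUs for tensors, CPUs for control":
# CPU workers handle preprocessing in parallel; the GPU endpoint only sees
# prepared batches. call_gpu_endpoint is a hypothetical stand-in for a real
# model-serving call (e.g., an HTTP request to an inference server).
from concurrent.futures import ProcessPoolExecutor

def preprocess(doc: str) -> str:
    """CPU-bound work: normalize whitespace and truncate to a context budget."""
    return " ".join(doc.split())[:2000]

def call_gpu_endpoint(batch: list[str]) -> list[str]:
    """Hypothetical GPU inference call; replace with your serving API."""
    return [f"<completion for {len(text)} chars>" for text in batch]

if __name__ == "__main__":
    docs = ["  raw   document one  ", "raw\tdocument two", "raw document three"]
    with ProcessPoolExecutor() as pool:       # fan preprocessing out across CPU cores
        cleaned = list(pool.map(preprocess, docs))
    print(call_gpu_endpoint(cleaned))         # GPUs spend cycles only on tensor work
```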
Statements from the companies
Beyond Altman and Garman’s comments, AWS frames the deal as a continuation of its “AI as infrastructure” strategy: very large clusters, enterprise-grade security, and a compute economy based on performance per dollar and immediate availability. OpenAI, for its part, aims to diversify and secure access to trusted compute while raising the bar for its models.
Timelines and availability
- Immediate deployment: OpenAI is already using AWS compute under this partnership.
- Target 2026: all committed capacity will be deployed before the end of 2026, with expansion planned into 2027 and beyond.
- Bedrock: OpenAI’s open weights models are already available on Amazon Bedrock, with thousands of active clients.
Market perspective: competition, costs, and neutrality
The announcement reinforces that the AI compute race is evolving into a multi-front competition. Major cloud providers are competing to offer the best mix of chips, network fabrics, storage, and model platforms. In this race, multi-year deals with AI heavyweights like OpenAI secure usage and economies of scale for cloud providers, while giving model creators predictable capacity for their roadmaps.
On costs: more capacity doesn’t always mean lower prices in the short term, since demand for frontier training and token-intensive inference still outpaces accelerator supply. However, optimized clusters tend to deliver better performance per dollar, which can help contain cost per session or per token over the medium term.
Regarding neutrality: AWS highlights that Bedrock includes multiple model providers, not only OpenAI, and that the partnership operates within a framework of customer choice; no exclusivity was mentioned in the announcement.
Implications for developers and data teams
- More consumption options: the OpenAI API, or Bedrock for open weights models from OpenAI (and others), with AWS integrations such as IAM, CloudWatch, and PrivateLink.
- Open doors for agents: the emphasis on agentic workloads suggests the infrastructure supports tool use and state persistence at scale (KV-cache memory, vector search, function calling); a minimal tool-calling sketch follows after this list.
- Simplified operation: less infrastructure glue code and more focus on data and the application (prompts, evaluations, security, product metrics).
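To make the agent point concrete, here is a minimal tool-calling loop with the OpenAI Python SDK: the model requests a tool, the application runs it, and the result goes back for a final answer. The model name and the `lookup_order` tool are illustrative assumptions, not part of the announcement.

```python
# Minimal sketch of an agentic loop with tool calling.
# Assumptions: openai SDK v1+, OPENAI_API_KEY set; the model name and the
# lookup_order tool are illustrative.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "lookup_order",
        "description": "Look up the status of an order by its ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

def lookup_order(order_id: str) -> dict:
    """Hypothetical backend call; returns canned data for the sketch."""
    return {"order_id": order_id, "status": "shipped"}

messages = [{"role": "user", "content": "Where is order 42?"}]
reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
msg = reply.choices[0].message

if msg.tool_calls:  # the model decided to call the tool
    call = msg.tool_calls[0]
    result = lookup_order(**json.loads(call.function.arguments))
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}]
    final = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
    print(final.choices[0].message.content)
else:
    print(msg.content)
```

In production, a loop like this typically runs alongside vector search and cached context, which is where the KV-cache and retrieval pieces mentioned above come in.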
Risks and open questions
- Topology details: the announcement mentions GB200 and GB300, but does not specify versions, memory configurations, or network topologies; these will be key to understanding practical limits.
- Model improvement cadence: more capacity doesn’t automatically mean better models; that depends on data, training recipes, and safety work.
- Cost implications for end users: there are no details on the potential impact on API pricing or Bedrock services.
Frequently Asked Questions
What exactly does the AWS and OpenAI partnership include and how long does it last?
It’s a multi-year agreement valued at $38 billion, expected to grow over the next seven years. OpenAI gains immediate access to hundreds of thousands of NVIDIA GPUs on AWS EC2 UltraServers, with the option to scale to tens of millions of CPUs. All committed capacity will be deployed before the end of 2026, with potential expansion into 2027 and beyond.
How does this benefit ChatGPT and OpenAI’s customers?
More capacity and optimized clusters should translate into lower latency, higher availability, and a faster cadence of model improvements. For businesses, the partnership facilitates moving from pilots to production and provides multiple consumption paths: the OpenAI API or Amazon Bedrock for open weights models.
What’s the difference between using OpenAI directly or via Amazon Bedrock?
Using OpenAI directly means consuming the service that OpenAI itself operates; with Bedrock, you access open weights models from OpenAI (and other providers) within AWS, with security, monitoring, and billing integrated into Amazon’s cloud. Bedrock is useful for standardized, multi-model operation on AWS.
What specific workloads will the infrastructure support (training, inference, agents)?
The clusters are designed to serve inference (e.g., ChatGPT) and train next-generation models, as well as agentic workloads combining models + tools + orchestration. The low-latency topology and CPU/GPU elasticity aim at RAG flows, semantic search, scientific analysis, and automation.
via: Amazon

