Amazon Web Services and NVIDIA have taken a strategic step in the race for AI infrastructure. During AWS re:Invent 2025, the two companies announced that the upcoming Trainium4 chips will integrate with NVIDIA NVLink Fusion, a rack-scale interconnect platform designed to bring dozens of accelerators together into a single high-performance “brain.”
The collaboration means that Trainium4, AWS’s next-generation AI accelerator, will be designed from the ground up to work with NVLink 6 and the NVIDIA MGX rack architecture, marking a multigenerational partnership between the North American hyperscaler and the chip giant.
What is NVLink Fusion and why does it matter for Trainium4
NVLink Fusion is presented as a rack-scale AI infrastructure platform that lets hyperscalers and custom ASIC designers integrate their own chips with the NVIDIA ecosystem: from NVLink interconnects to the MGX architecture, Vera Rubin switch trays, and a broad catalog of networking, cooling, and power components.
At the core of the proposal is the NVLink Fusion chiplet, a module designers can “embed” into their ASICs to connect directly to the NVLink domain. Combined with the Vera Rubin switch tray, based on the sixth-generation NVLink Switch and 400G SerDes, the system can connect up to 72 accelerators in a full mesh, with 3.6 TB/s of bandwidth per ASIC and 260 TB/s of aggregate scale-up bandwidth within a single domain.
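The published figures are internally consistent, as a quick back-of-the-envelope check shows (a sketch assuming the quoted 3.6 TB/s is each ASIC's bandwidth into the domain, which is how NVIDIA presents it):

```python
# Sanity check of the published NVLink Fusion figures.
# Assumes 3.6 TB/s is the per-ASIC bandwidth into the NVLink domain.
accelerators = 72        # accelerators per NVLink domain
per_asic_tb_s = 3.6      # TB/s of NVLink bandwidth per ASIC

aggregate = accelerators * per_asic_tb_s
print(f"Aggregate scale-up bandwidth: {aggregate:.1f} TB/s")
# -> 259.2 TB/s, which rounds to the quoted 260 TB/s
```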
In practical terms, this means that future Trainium4 racks could operate as a single cohesive supercomputer, with accelerator memory accessible across the domain via direct reads and writes, atomic operations, and advanced capabilities such as NVIDIA SHARP for in-network reductions and multicast acceleration.
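To make the SHARP point concrete: applications typically reach these capabilities through collective operations rather than raw NVLink calls. Below is a minimal PyTorch sketch of the all-reduce pattern that in-network reduction accelerates; it assumes a multi-process job launched with torchrun and NCCL as the backend (NCCL can offload such collectives to SHARP-capable switches), and the tensor is purely illustrative:

```python
# Minimal sketch of the all-reduce collective that in-network
# reductions (e.g., NVIDIA SHARP) accelerate. Assumes launch via
# torchrun with NCCL; ranks and tensor contents are illustrative.
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")  # NCCL handles fabric offload
rank = dist.get_rank()
torch.cuda.set_device(rank % torch.cuda.device_count())

grad = torch.ones(1024, device="cuda") * rank  # stand-in gradient shard
dist.all_reduce(grad, op=dist.ReduceOp.SUM)    # summed across all ranks;
                                               # SHARP-capable switches can
                                               # perform this in the network
dist.destroy_process_group()
```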
The pressure of giant models and AWS’s response
The move arrives at a time when AI infrastructure faces unprecedented demands. New planning, reasoning, and agentic AI models, with hundreds of billions or trillions of parameters and mixture-of-experts (MoE) architectures, require clusters with tens or hundreds of accelerators working in parallel, interconnected via low-latency, high-bandwidth networks.
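A simple calculation shows why a single accelerator cannot hold such models. This is a sketch under stated assumptions: 2 bytes per parameter (bf16/fp16 weights only, no optimizer state or activations) and an illustrative 192 GB of memory per accelerator; neither figure is an announced Trainium4 spec:

```python
# Why trillion-parameter models force multi-accelerator domains.
# Assumptions (illustrative, not announced Trainium4 specs):
#   - 2 bytes per parameter (bf16/fp16 weights only)
#   - 192 GB of high-bandwidth memory per accelerator
params = 1_000_000_000_000        # 1 trillion parameters
bytes_per_param = 2
hbm_per_accel_gb = 192

weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: {weights_gb:,.0f} GB")                    # 2,000 GB
print(f"Accelerators needed: {weights_gb / hbm_per_accel_gb:.1f}+")  # ~10.4
```

And that is weights alone; training adds optimizer state, gradients, and activations, multiplying the requirement severalfold, which is why the interconnect between accelerators becomes the critical resource.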
For hyperscalers, deploying such solutions is no trivial task: it’s not enough to design a competitive AI chip. They must define:
- Custom rack architectures (trays, density, power distribution).
- Vertical-scale networks (NVLink or others) and horizontal-scale (Ethernet, InfiniBand, or proprietary options).
- High-efficiency liquid or air cooling systems.
- Management of thousands of components and dozens of suppliers, ensuring no single-part delays compromise the project.
NVLink Fusion aims to address this at the foundation: it offers a “modular AI factory” where Trainium4 and other custom ASICs can be integrated onto a proven platform, reducing development cycles, integration risk, and time to market.
Less risk, more speed: the value of the NVIDIA ecosystem
Beyond interconnection technology, the partnership allows AWS to rely on a comprehensive ecosystem:
- NVIDIA MGX rack architecture, designed for high-density configurations.
- Next-generation GPUs and upcoming NVIDIA Vera CPUs, which can coexist with Trainium4.
- Switches with co-packaged optics, ConnectX SuperNIC, and BlueField DPUs.
- Management and orchestration tools such as NVIDIA Mission Control.
The core idea is that AWS can build heterogeneous silicon offerings (Trainium, NVIDIA GPUs, its own Graviton CPUs) within the same footprint, sharing cooling systems, power distribution, and rack designs. Where previously there were parallel projects, there can now be a single flexible platform optimized for diverse training and inference workloads.
According to NVIDIA, connecting 72 accelerators within a single NVLink domain can deliver up to 3 times more inference performance and throughput compared to previous architectures, like the NVL8 configurations with fifth-generation NVLink, all within NVIDIA’s AI software ecosystem.
A step further in the “coopetition” between hyperscalers and NVIDIA
The integration of Trainium4 with NVLink Fusion also carries strategic implications. In recent years, major cloud providers have invested billions designing their own AI chips (Google TPU, AWS Trainium and Inferentia, Microsoft’s proprietary accelerators, and others) to reduce dependency on third parties and optimize costs.
However, the announcement reveals a clear trend: competition and collaboration coexist. AWS continues pushing Trainium as a proprietary alternative but recognizes the value of anchoring it to an interconnection infrastructure, software stack, and ecosystem NVIDIA already extensively deploys in the market.
For NVIDIA, this move ensures that even when clients deploy custom ASICs, the company remains central to the architecture: in the network, within the rack, in the software stack, and in many cases with its GPUs deployed alongside those custom chips.
What’s coming: Factory-scale AI with faster cycles
The combination of Trainium4 and NVLink Fusion points toward an “AI factory” model in which hardware becomes more interchangeable and innovation cycles shorten. Instead of redesigning racks from scratch for each chip generation, hyperscalers could iterate on a stable base, swapping accelerators and fine-tuning designs without overhauling the entire infrastructure.
In an industry where larger models, more complex agents, and new usage patterns (from enterprise copilots to massive simulations and digital twins) emerge every year, the promise of deploying faster, with less risk and superior performance, becomes hard to ignore.
Within this context, the AWS-NVIDIA alliance is not just a technical agreement: it’s a statement of how the next-generation AI data centers will be built.
Frequently Asked Questions about AWS, Trainium4, and NVIDIA NVLink Fusion
What exactly is NVIDIA NVLink Fusion?
NVLink Fusion is a rack-scale AI infrastructure platform that combines sixth-generation NVLink interconnects, Vera Rubin switch trays, and the MGX rack architecture. It enables the connection of up to 72 accelerators within a single high-speed domain, offering hundreds of terabytes per second of aggregate bandwidth between chips.
How will Trainium4 benefit from NVLink Fusion?
Trainium4 is being designed to integrate from the outset with NVLink 6 and the MGX architecture. This will allow AWS to build racks where Trainium4 chips can communicate with each other and potentially with NVIDIA GPUs via a low-latency, high-bandwidth fabric, ideal for training and deploying large-scale AI models.
What advantages does NVLink Fusion have over other high-speed networks?
Compared to other interconnect solutions, NVLink Fusion builds on well-established technology used in supercomputers and AI clusters. It allows direct memory access between accelerators, atomic operations, and network reductions—all integrated with NVIDIA’s software stack—making large-scale model programming and optimization simpler.
Does this alliance threaten AWS’s independence in AI chips?
Not necessarily. AWS continues developing its own chip families like Trainium and Inferentia, and maintains its Graviton CPUs. What’s changing is that instead of building the entire infrastructure from scratch, AWS leverages the NVLink Fusion platform to reduce time, risks, and costs. It exemplifies “coopetition”: competing in silicon design but cooperating in interconnect and rack architecture with NVIDIA.
via: developer.nvidia