The race for AI computing supremacy is entering a more aggressive phase. According to information circulated by Wccftech, based on comments from SemiAnalysis and public statements by AMD's Forrest Norrod, both AMD and NVIDIA have been revising their upcoming architectures on the fly, raising TGP and memory bandwidth on the Instinct MI450/MI450X (MI400 family) and Vera Rubin VR200 to avoid losing their edge.
Norrod went as far as to describe MI450 as AMD’s “Milan moment” (referring to the leap initiated by the EPYC 7003 “Milan”), asserting that the offering will be more competitive against the Rubin series than in previous cycles.
What is rumored to have changed
- Power (TGP):
  - AMD MI450X: reportedly increased by +200 W over initial estimates.
  - NVIDIA Rubin (VR200): reportedly scaled up by +500 W to 2,300 W per GPU.
- Memory bandwidth per GPU:
  - AMD MI450: approximately 19.6 TB/s.
  - NVIDIA Rubin: approximately 20.0 TB/s (up from an estimated 13 TB/s previously; see the quick arithmetic below).
- Expected common technologies:
  - HBM4 in both cases.
  - TSMC N3P nodes and chiplet approaches.
  - In NVIDIA's ecosystem, the public roadmap mentions milestones such as NVLink 5 Switch (1,800 GB/s) and new generations of HBM between 2025 and 2028.
Important: these are leaks and non-final figures. Neither AMD nor NVIDIA has published complete or final specifications.
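As a rough sanity check, the rumored deltas can be converted into implied baselines. Here is a minimal Python sketch using only the leaked figures quoted above (none of them official):

```python
# Back-of-envelope check of the rumored roadmap changes. Every input is a
# leaked/rumored figure quoted above, not an official specification.

rubin_tgp_new_w = 2300                                 # rumored VR200 TGP after the bump
rubin_tgp_delta_w = 500                                # rumored increase
rubin_tgp_old_w = rubin_tgp_new_w - rubin_tgp_delta_w  # implied prior plan: 1,800 W

rubin_bw_old_tbs = 13.0                                # earlier estimate
rubin_bw_new_tbs = 20.0                                # current rumor
bw_gain_pct = (rubin_bw_new_tbs / rubin_bw_old_tbs - 1) * 100

print(f"Implied prior Rubin TGP plan: {rubin_tgp_old_w} W")
print(f"Rumored bandwidth uplift: +{bw_gain_pct:.0f}%")  # ~ +54%
```

The implied prior plan of 1,800 W and a roughly 54% bandwidth uplift give a sense of how large these mid-cycle revisions would be.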
Projected launch windows
- AMD Instinct MI400 family (MI450/MI450X): 2026.
- NVIDIA Vera Rubin VR200 / NVL144 platform: Second half of 2026.
Rumored specification comparisons
| Parameter (rumored) | AMD Instinct MI450 | NVIDIA Vera Rubin VR200 |
|---|---|---|
| Memory (type/capacity) | HBM4, up to 432 GB per GPU | HBM4, approximately 288 GB per GPU |
| Memory bandwidth | ~19.6 TB/s | ~20.0 TB/s |
| FP4 performance | ~40 PFLOPS | ~50 PFLOPS |
| TGP (trend) | +200 W over the initial plan | up to 2,300 W (+500 W) |
(Indicative figures; may vary depending on final silicon, platform configurations, SXM/HGX versions, etc.)
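One way to read the table: dividing rumored FP4 throughput by rumored HBM bandwidth gives the arithmetic intensity (FLOPs per byte) at which each GPU would shift from memory-bound to compute-bound. A hedged sketch using only the figures above:

```python
# Roofline-style ratio from the rumored table: FP4 operations a kernel must
# perform per byte streamed from HBM before compute, not bandwidth, limits it.
# All figures are the rumored values above; final silicon may differ.

chips = {
    "AMD Instinct MI450 (rumored)":      {"fp4_flops": 40e15, "hbm_bytes_s": 19.6e12},
    "NVIDIA Vera Rubin VR200 (rumored)": {"fp4_flops": 50e15, "hbm_bytes_s": 20.0e12},
}

for name, c in chips.items():
    ridge = c["fp4_flops"] / c["hbm_bytes_s"]  # FLOPs per byte at the roofline ridge
    print(f"{name}: ~{ridge:,.0f} FP4 FLOPs per byte to be compute-bound")
```

Both ridge points sit above 2,000 FP4 operations per byte; low-intensity work such as LLM decode falls far below that, which is why memory bandwidth and capacity dominate the comparison.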
What it means for AI data centers
- Power and cooling
Increases in TGP of this magnitude point to denser racks in watts, tighter thermal limits, and probably new requirements for direct-liquid or immersion cooling. For integrators and operators, total cost of ownership (TCO) no longer depends solely on GPU price: power delivery, cooling capacity, and infrastructure become critical variables.
- Memory as the bottleneck
With HBM4 capacities of 288–432 GB per GPU and bandwidths around 20 TB/s, memory remains the decisive factor for multi-billion-parameter models, long contexts, and mixture-of-experts (MoE) pipelines. The capacity difference (432 GB vs. ~288 GB) could tip the scales in extended inference workloads or models that scale with more cache in HBM (see the sketch after this list).
- Technological parity
The historic gap may tighten: if both vendors deliver HBM4, N3P, and chiplet designs, differentiation will shift to systems, interconnect (NVLink, XGMI/Infinity Fabric), the software stack (compilers, kernels, LLM libraries), and the surrounding ecosystem (server vendors, HBM module availability, delivery times).
- More aggressive cycles
The reports suggest reactive adjustments in both roadmaps (TGP and bandwidth increases) to avoid losing ground. For end customers, this implies narrower purchase windows, accelerated validation, and possible platform revisions between tape-outs.
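To make the power and memory points concrete, here is a back-of-envelope sketch; the model size and rack configuration are illustrative assumptions, not figures from the reports:

```python
# Two back-of-envelope estimates behind the "power and cooling" and "memory as
# the bottleneck" points. Inputs are either rumored figures from above or
# clearly labeled assumptions.

# 1) Memory-bound decode ceiling: at batch size 1, each generated token must
#    stream the active weights through HBM once, so tokens/s <= bandwidth / bytes.
hbm_bw_bytes_s = 19.6e12   # rumored MI450 bandwidth
model_bytes = 400e9 * 0.5  # ASSUMPTION: 400B-parameter dense model in FP4 (0.5 B/param)
print(f"Decode ceiling: ~{hbm_bw_bytes_s / model_bytes:.0f} tokens/s per GPU")

# 2) Rack power from GPU TGP alone, before CPUs, NICs, switches, and cooling overhead.
gpu_tgp_w = 2300           # rumored VR200 TGP
gpus_per_rack = 72         # ASSUMPTION: hypothetical 72-GPU rack
print(f"GPU power alone: {gpu_tgp_w * gpus_per_rack / 1000:.0f} kW per rack")
```

Even counting GPUs alone, such a rack lands well above 100 kW, which is where the direct-liquid cooling requirement comes from.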
Competitive perspective
- AMD is aiming for an inflection point with MI450/MI450X after several cycles of lagging time-to-market. If HBM4 at 432 GB, ~19.6 TB/s, and ~40 PFLOPS FP4 are confirmed, alongside a mature software stack (ROCm, LLM-specific kernels), significant adoption could follow in 2026.
- NVIDIA maintains momentum with Rubin and its ecosystem (frameworks, libraries, system vendors, design wins). Rumors point to ~20 TB/s, ~50 PFLOPS FP4, and a TGP reaching 2,300 W, with early adoption references among major players. Interconnect and the software stack remain key advantages.
What to watch in the next 6–12 months
- HBM4 scaling: who secures volume and performance, with realistic 12-Hi stack yields and timelines.
- System platforms: chassis, cooling, and PSUs certified for >100 kW/rack with room for growth.
- Software: optimized FP4/FP8 kernels, attention kernels, MoE, and pipeline parallelism; reproducible benchmarks with open models (a toy FP4 illustration follows this list).
- Availability: delivery times, factory capacity, and regional/client allocation.
- Actual power consumption: in-rack measurements under representative workloads (training and inference), not just nameplate figures.
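On the FP4 software point, a toy NumPy simulation of blockwise e2m1-style 4-bit quantization illustrates the footprint-versus-precision trade-off; the block size and rounding scheme are simplifying assumptions, and real MXFP4/NVFP4 formats differ in detail:

```python
import numpy as np

# Toy simulation of blockwise FP4 (e2m1-style) quantization: each block of
# weights shares one scale, and each weight snaps to the nearest representable
# FP4 magnitude. Illustrative only; production FP4 formats differ in detail.

FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # e2m1 magnitudes

def quantize_fp4(w, block=32):  # block=32 is an assumption for illustration
    shape = w.shape
    wb = w.reshape(-1, block)
    scale = np.abs(wb).max(axis=1, keepdims=True) / FP4_GRID[-1]  # block max -> 6.0
    scale[scale == 0] = 1.0                                       # avoid divide-by-zero
    idx = np.abs(np.abs(wb / scale)[..., None] - FP4_GRID).argmin(axis=-1)
    return (np.sign(wb) * FP4_GRID[idx] * scale).reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)
wq = quantize_fp4(w)
rel_err = np.abs(w - wq).mean() / np.abs(w).mean()
print(f"Mean relative error: {rel_err:.3f}  (~4 bits/weight vs 16 for FP16)")
```

At roughly a quarter of FP16's footprint per weight, FP4 is what lets capacities like 288–432 GB per GPU stretch to very large models, provided the kernels keep accuracy acceptable.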
Prudence is advised
The figures cited come from unofficial reports, analyst posts, and public statements by executives. Until AMD and NVIDIA confirm final specifications, it is best to treat them as indicative. Nevertheless, the trends (HBM4, ~20 TB/s, >2 kW per GPU, high FP4 throughput) align with the wave of AI super-nodes expected in 2026.
In summary: 2026 is shaping up to be a much tighter head-to-head. If AMD shortens lead times and delivers higher HBM capacity, while NVIDIA preserves its advantages in ecosystem and interconnectivity, the choice for hyperscalers and large enterprises will depend on a complete platform—power, cooling, supply chain, and software—not just on the GPU specs.
via: wccftech