Oracle and AMD Drive AI Supercomputing: OCI to Deploy Over 130,000 MI355X GPUs to Accelerate Large-Scale Model Training and Inference

Oracle’s cloud will be one of the first to offer AI superclusters based on AMD’s new generation of GPUs, competing with NVIDIA’s deployments and attracting major clients for advanced artificial intelligence development.

Oracle has announced a strategic partnership with AMD to bring the latest generation of AMD Instinct MI355X accelerators to its cloud (Oracle Cloud Infrastructure, OCI), reinforcing the company’s commitment to lead the AI supercomputing market. The agreement includes the deployment of a zettascale supercluster capable of housing up to 131,072 MI355X GPUs, a figure representing one of the largest AI hardware deployments in the cloud worldwide.

Performance and Efficiency: Key Features of the New Supercluster

The new offering, coming soon to OCI, promises customers more than double the performance per dollar compared to the previous AMD generation. The MI355X, officially unveiled at AMD’s Advancing AI event in San Jose, California, is built on 3-nanometer technology and based on the CDNA 4 architecture.

Key features include:

  • 288 GB of HBM3E memory and 8 TB/s of memory bandwidth per GPU.
  • Up to 1,400 watts per GPU and 64 GPUs per rack, with liquid cooling.
  • Support for 4-bit floating point (FP4), key for inference on large generative models and LLMs.
  • Up to a 2.8x increase in throughput over the previous generation, nearly tripling effective compute.
  • Against NVIDIA Blackwell, AMD claims the MI355X offers 1.6 times the memory capacity, double the FP64 performance, and the same memory bandwidth as the new GB200/B200.
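To illustrate why FP4 support matters for inference, here is a minimal NumPy sketch of quantizing weights onto the FP4 (E2M1) value grid. This is a hypothetical illustration only: real hardware kernels use per-block scaling factors and fused dequantization, and the function names here are invented for the example.

```python
import numpy as np

# The positive values representable in FP4 (E2M1): {0, 0.5, 1, 1.5, 2, 3, 4, 6},
# mirrored for negative numbers -- 16 code points in total.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.concatenate([-FP4_GRID[::-1], FP4_GRID])

def quantize_fp4(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Scale a tensor into FP4's dynamic range and snap each value to the nearest grid point."""
    scale = np.abs(x).max() / 6.0  # map the largest magnitude onto FP4's max (6)
    idx = np.abs((x / scale)[:, None] - FP4_GRID[None, :]).argmin(axis=1)
    return FP4_GRID[idx], scale

def dequantize_fp4(q: np.ndarray, scale: float) -> np.ndarray:
    return q * scale

weights = np.array([0.02, -0.31, 0.77, -1.5, 0.9])
q, s = quantize_fp4(weights)
approx = dequantize_fp4(q, s)
# Each weight now occupies 4 bits instead of 16 or 32, at the cost of rounding error.
```

The trade-off this sketch makes visible is the one driving the hardware feature: storing weights in 4 bits quarters the memory traffic versus FP16, which is why FP4 throughput matters for serving large models whose inference is bound by memory bandwidth.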

Additionally, the new generation uses AMD's Pollara NICs, which support RoCE and Ultra Ethernet Consortium standards for the low latency and high availability that distributed AI workloads demand.

An Open Infrastructure Without Vendor Lock-In

One of the strengths of the agreement lies in the commitment to open-source software and the absence of vendor lock-in: customers will be able to use AMD’s ROCm stack, compatible with the most popular models and frameworks, facilitating project migration and integration with hybrid architectures.

The infrastructure is designed to cover the entire AI lifecycle: from training large-scale language and vision models to ultra-efficient inference for generative applications and agentic AI. Oracle emphasizes that its clusters will provide “minimal inference and training times, greater energy efficiency, and optimized orchestration thanks to the latest generation AMD Turin CPUs.”

Technological Rivalry: NVIDIA Blackwell Also in OCI

Alongside the announcement with AMD, Oracle has revealed that its supercluster with NVIDIA GB200 NVL72 (Blackwell) GPUs is now available, thus joining the trend of offering both leading architectures in the cloud. This deployment, which also includes integration with NVIDIA’s DGX Cloud Lepton platform, elevates OCI’s total AI capacity to unprecedented levels and responds to the increasing demand for compute-intensive capabilities from businesses, governments, and startups.

Clients and Use Cases: Seekr Partners with Oracle and AMD

Among the first clients is the AI firm Seekr, which has signed a multi-year agreement with OCI to train its next generation of language and vision models, including satellite-based use cases and the analysis of large volumes of sensor data. Seekr cites the scalability, multi-node performance, and international flexibility of the Oracle-AMD infrastructure, as well as joint support for model optimization and global deployments.

Perspectives and Competition in AI Cloud

Oracle positions itself as one of the pioneering hyperscalers offering zettascale AI clusters with both AMD and NVIDIA, expanding options for companies that need maximum computing power, efficiency, and flexibility. This move, which follows previous announcements of thousands of GB200 GPUs in centers like Stargate (Texas), intensifies the competition among cloud providers to attract next-generation AI projects.

According to Oracle, “we will offer the most diverse and efficient infrastructure for customers looking to train and infer models at massive scale, supported by open technology, security, flexibility, and competitive commercial agreements.” AMD, for its part, underscores the importance of this milestone in democratizing access to AI supercomputing and responding to the explosion in global demand.

The future of artificial intelligence, and the cloud that supports it, is being shaped today in large data centers — and the Oracle-AMD agreement marks a new chapter in the race for technological supremacy.
