AMD (NASDAQ: AMD) today announced new accelerator and networking solutions that will power the next generation of large-scale artificial intelligence infrastructure: the AMD Instinct™ MI325X accelerators, the AMD Pensando™ Pollara 400 NIC, and the AMD Pensando Salina DPU. Together, these technologies set a new standard for performance in AI models and advanced data centers.
The AMD Instinct MI325X accelerators, based on the AMD CDNA™ 3 architecture, are designed to deliver exceptional performance and efficiency in AI tasks such as training, fine-tuning, and inference of foundation models. Together, these products enable AMD customers and partners to develop optimized, high-performance AI solutions at the system, rack, and data center level.
Forrest Norrod, Executive Vice President and General Manager of the Data Center Solutions Group at AMD, commented: “AMD continues to deliver on our roadmap, providing customers with the performance they need and the flexibility they seek to bring AI infrastructure to market at scale faster. With the new AMD Instinct accelerators, EPYC processors, and AMD Pensando network solutions, along with our open software ecosystem, AMD reinforces its critical expertise to develop and deploy world-class AI solutions.”
AMD Instinct MI325X boosts leading AI performance
The AMD Instinct MI325X accelerators deliver industry-leading memory capacity and bandwidth, with 256 GB of HBM3E supporting 6.0 TB/s, or 1.8 times the capacity and 1.3 times the bandwidth of the H200. The AMD Instinct MI325X also provides 1.3 times the theoretical peak FP16 and FP8 compute performance of the H200.
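The quoted ratios can be sanity-checked with simple arithmetic. A minimal sketch, assuming NVIDIA's commonly published H200 figures of 141 GB of HBM3E and 4.8 TB/s of memory bandwidth (values not stated in this announcement):

```python
# Sanity check of the MI325X vs. H200 memory claims.
# The H200 figures below are assumed from NVIDIA's public specs,
# not taken from this announcement.
mi325x_capacity_gb = 256
mi325x_bandwidth_tbs = 6.0
h200_capacity_gb = 141      # assumed H200 HBM3E capacity
h200_bandwidth_tbs = 4.8    # assumed H200 memory bandwidth

capacity_ratio = mi325x_capacity_gb / h200_capacity_gb
bandwidth_ratio = mi325x_bandwidth_tbs / h200_bandwidth_tbs

print(f"capacity:  {capacity_ratio:.2f}x")   # ~1.82x, quoted as 1.8x
print(f"bandwidth: {bandwidth_ratio:.2f}x")  # 1.25x, quoted as 1.3x
```

Under these assumed H200 specs, the capacity ratio matches the quoted 1.8x, while the bandwidth ratio computes to 1.25x, so the quoted 1.3x appears to be rounded up.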
This memory and compute leadership can deliver up to 1.3 times the inference performance on Mistral 7B in FP16, 1.2 times on Llama 3.1 70B in FP8, and 1.4 times on Mixtral 8x7B in FP16, compared to the H200.
The AMD Instinct MI325X accelerators are on track for production shipments in the fourth quarter of 2024, with broad system availability from a wide array of platform providers, including Dell Technologies, Eviden, Gigabyte, Hewlett Packard Enterprise, Lenovo, Supermicro, and others, expected starting in the first quarter of 2025.
Continuing its commitment to an annual product cadence, AMD also unveiled the next generation of AMD Instinct MI350 Series accelerators. Based on the AMD CDNA 4 architecture, the AMD Instinct MI350 Series accelerators are designed to deliver a 35x improvement in inference performance compared to accelerators based on AMD CDNA 3.
The AMD Instinct MI350 Series will continue to lead in memory capacity with up to 288 GB of HBM3E memory per accelerator. The AMD Instinct MI350 Series accelerators are on track to be available in the second half of 2025.
Next-Generation AMD AI Networks
AMD leverages the industry's most programmable DPUs for hyperscalers to power the next generation of AI networking. The AI network is split into two parts: the front-end, which delivers data and information to an AI cluster, and the back-end, which manages data transfer between accelerators and clusters. This network is crucial to ensuring that CPUs and accelerators are efficiently utilized in AI infrastructure.
To effectively manage these two networks and drive high performance, scalability, and efficiency system-wide, AMD has introduced the AMD Pensando™ Salina DPU for the front-end and the AMD Pensando™ Pollara 400, the industry’s first UEC-ready AI NIC, for the back-end.
The AMD Pensando Salina DPU is the world's most powerful and programmable third-generation DPU, doubling the performance, bandwidth, and scale of the previous generation. Supporting 400G throughput for fast data transfer rates, the AMD Pensando Salina DPU is a key component in AI front-end network clusters, optimizing the performance, efficiency, security, and scalability of data-driven AI applications.
The AMD Pensando Pollara 400, powered by the AMD P4 programmable engine, is the industry's first UEC-ready AI NIC. It supports next-generation RDMA software and is backed by an open networking ecosystem. The AMD Pensando Pollara 400 is crucial for providing leading performance, scalability, and efficiency in accelerator-to-accelerator communication in back-end networks.
Both the AMD Pensando Salina DPU and the AMD Pensando Pollara 400 will be sampling with customers in the fourth quarter of 2024, with availability expected in the first half of 2025.
AMD AI Software Offers New Capabilities for Generative AI
AMD continues to invest in software and the open ecosystem to deliver new, powerful features and capabilities in the open AMD ROCm™ software stack. Within the open-source community, AMD drives support for AMD compute engines in the most widely used AI frameworks, libraries, and models, including PyTorch, Triton, Hugging Face, and many others. This work translates to out-of-the-box performance and support for AMD Instinct accelerators in popular generative AI models such as Stable Diffusion 3, Meta Llama 3, 3.1, and 3.2, and over a million models on Hugging Face.
Beyond the community, AMD continues to advance its open ROCm software stack, bringing the latest features to support leading training and inference performance in generative AI workloads. ROCm 6.2 now includes support for critical AI features such as the FP8 data type, Flash Attention 3, kernel fusion, and more. With these additions, ROCm 6.2 provides up to 2.4 times the inference performance and 1.8 times the training performance of ROCm 6.0 across various LLMs.
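One practical effect of FP8 support is halving the memory footprint of model weights relative to FP16. A back-of-envelope sketch (weights only, ignoring KV cache, activations, and runtime overhead; parameter counts are nominal):

```python
# Approximate weight footprint of an LLM at different precisions.
# Weights only -- ignores KV cache, activations, and runtime overhead.
BYTES_FP16 = 2  # 2 bytes per parameter
BYTES_FP8 = 1   # 1 byte per parameter

def weight_footprint_gb(params_billion: float, bytes_per_param: int) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

# Nominal 70B-parameter model (e.g., a Llama 3.1 70B-class model):
print(weight_footprint_gb(70, BYTES_FP16))  # 140.0 GB at FP16
print(weight_footprint_gb(70, BYTES_FP8))   # 70.0 GB at FP8
```

By this rough arithmetic, a 70B-parameter model's weights drop from about 140 GB at FP16 to about 70 GB at FP8, and either figure fits within a single MI325X's 256 GB of HBM3E, leaving headroom for the KV cache and activations.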