Red Hat, a leading provider of open-source solutions, and AMD (NASDAQ: AMD) have announced a strategic collaboration to enhance artificial intelligence (AI) capabilities and optimize virtualized infrastructure. This strengthened alliance will enable both companies to offer customers more options in hybrid cloud environments, ranging from deploying more efficient AI models to cost-effective modernization of traditional virtual machines (VMs).
As AI-driven workloads grow in both volume and diversity, organizations need the right resources and capabilities to meet these new demands. Most data centers, however, remain built around traditional IT systems, which limits their ability to run the compute-intensive workloads that AI requires.
To address this challenge, Red Hat and AMD are combining the power of Red Hat’s open-source solutions with AMD’s extensive portfolio of high-performance computing architectures, providing a joint offering tailored to the new needs of today’s technological environment.
AMD and Red Hat: Towards More Efficient Generative AI
Red Hat and AMD are combining Red Hat AI with AMD's portfolio of x86-based processors and GPU architectures to support optimized, cost-effective, production-ready AI workloads. AMD Instinct GPUs are now fully enabled on Red Hat OpenShift AI, giving customers the high-performance processing power needed for AI deployments across the hybrid cloud without excessive resource requirements. Additionally, using AMD Instinct MI300X GPUs with Red Hat Enterprise Linux AI, Red Hat and AMD ran tests on Microsoft Azure ND MI300X v5 demonstrating AI inference for small language models (SLMs) as well as large language models (LLMs) scaled across multiple GPUs within a single VM, reducing the need to spread deployments across many VMs and lowering cost.
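The appeal of the multi-GPU, single-VM arrangement can be illustrated with some back-of-the-envelope sizing arithmetic. This is a minimal sketch: the model sizes, precision, and memory-overhead factor below are illustrative assumptions, not benchmark figures from the announcement (the 192 GB value is the publicly stated HBM capacity of an MI300X-class accelerator):

```python
# Rough sketch: why sharding an LLM across several GPUs in one VM works.
# All model figures are illustrative assumptions, not vendor benchmarks.

def min_gpus_for_model(param_count_b: float, bytes_per_param: int,
                       gpu_mem_gb: float, overhead: float = 1.2) -> int:
    """Smallest number of GPUs whose combined memory holds the weights.

    `overhead` leaves headroom for activations and the KV cache.
    """
    weights_gb = param_count_b * bytes_per_param  # billions of params * bytes/param = GB
    needed_gb = weights_gb * overhead
    gpus = 1
    while gpus * gpu_mem_gb < needed_gb:
        gpus += 1
    return gpus

GPU_MEM_GB = 192  # MI300X-class HBM capacity (public spec)

# A 70B-parameter model in FP16 (2 bytes/param) fits on one GPU:
print(min_gpus_for_model(70, 2, GPU_MEM_GB))   # → 1
# A 405B-parameter model in FP16 needs tensor parallelism across
# several GPUs -- all of which can live inside a single VM:
print(min_gpus_for_model(405, 2, GPU_MEM_GB))  # → 6
```

Keeping those GPUs inside one VM avoids paying the networking and orchestration overhead of spanning the same model across several smaller VMs.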
To further drive performance and adaptability, Red Hat and AMD are also collaborating in the vLLM community to advance more efficient AI inference. Through this work, Red Hat and AMD aim to deliver:
- Enhanced performance with AMD GPUs: By upgrading the AMD kernel library and optimizing components such as the Triton kernels and FP8 support, Red Hat and AMD improve inference performance for both dense and quantized models, allowing for faster and more efficient execution of vLLM on AMD Instinct MI300X accelerators.
- Improved multi-GPU support: Enhancements in collective communication and optimization of multi-GPU workloads pave the way for more scalable and energy-efficient AI deployments. This is especially beneficial for tasks requiring distributed computing across multiple GPUs, which reduces bottlenecks and improves overall performance.
- Greater commitment to the vLLM ecosystem: Cross-collaboration among Red Hat, AMD, and other industry leaders like IBM helps accelerate upstream development to drive continuous improvements for both the vLLM project and AMD’s GPU optimization, further benefiting vLLM users who rely on AMD hardware for AI inference and training.
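The FP8 work mentioned above pays off largely through memory: halving the bytes per weight roughly halves both the storage footprint and the bytes moved per generated token, and memory bandwidth is often the inference bottleneck. A minimal sketch of that arithmetic, with an assumed 8B-parameter model (real savings depend on which layers are actually quantized):

```python
# Sketch: memory saved by FP8 quantization relative to FP16 weights.
# The 8B parameter count is an illustrative assumption.

def weight_bytes(param_count: int, bits_per_param: int) -> int:
    """Raw storage required for model weights at a given precision."""
    return param_count * bits_per_param // 8

params = 8_000_000_000
fp16_gb = weight_bytes(params, 16) // 10**9  # 16 GB of weights
fp8_gb = weight_bytes(params, 8) // 10**9    # 8 GB of weights

print(fp16_gb, fp8_gb)  # → 16 8
# Fewer bytes read per token also means less memory bandwidth consumed,
# which is frequently the limiting factor during LLM inference.
```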
Building on this collaboration with the vLLM community, AMD Instinct GPUs will support Red Hat AI Inference Server, Red Hat's enterprise distribution of vLLM, delivered ready to use as a robust, reliable, and scalable AI inference server. As a leading commercial contributor to vLLM, Red Hat is committed to supporting vLLM deployments on the hardware an organization chooses, including AMD Instinct GPUs. Running vLLM on AMD Instinct GPUs enables organizations to deploy any open-source AI model on validated, tested GPU hardware for strong optimization and performance.
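One practical consequence of standardizing on vLLM is that it serves an OpenAI-compatible HTTP API, so client code does not change when the backend hardware does. A minimal client sketch follows; the server URL, port, and model name are placeholders for illustration, not values from the announcement:

```python
# Minimal sketch of a client request against vLLM's OpenAI-compatible
# /v1/chat/completions endpoint. URL and model name are placeholders.
import json
from urllib.request import Request

payload = {
    "model": "my-model",  # whichever model the server has loaded
    "messages": [{"role": "user", "content": "Summarize vLLM in one line."}],
    "max_tokens": 64,
}

req = Request(
    "http://localhost:8000/v1/chat/completions",  # vLLM's default port
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would return an OpenAI-style JSON response;
# the same client code works unchanged whether the server runs on
# AMD Instinct GPUs or any other supported accelerator.
print(req.get_method(), req.full_url)
```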
AMD EPYC™ CPUs also deliver end-to-end AI performance and are well suited to hosting GPU-enabled systems, helping improve performance and return on investment (ROI) for each GPU server, even for the most demanding AI workloads.
Transforming the Modern Data Center
By optimizing the foundations of their existing data centers, organizations can more easily redirect resources toward AI innovation. Red Hat OpenShift Virtualization, a feature of Red Hat OpenShift, provides a streamlined path for migrating and managing virtual machine workloads with the simplicity and speed of a cloud-native application platform. It is validated for AMD EPYC processors, drawing on their performance and energy efficiency in hybrid cloud environments while keeping a bridge open to a cloud-native future. On AMD EPYC CPUs, it also helps companies optimize application deployments on leading servers such as Dell PowerEdge, HPE ProLiant, and Lenovo ThinkSystem products.

By modernizing a legacy data center, Red Hat OpenShift Virtualization unifies virtual machines and containerized applications, whether on-premises, in public clouds, or in hybrid cloud environments. This enables high infrastructure consolidation ratios that can significantly lower total cost of ownership (TCO) across hardware, software licensing, and energy, while freeing IT teams to manage today's critical workloads more effectively and to redirect resources and energy toward AI workloads now and in the future.
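The consolidation argument above comes down to simple arithmetic: denser hosts mean fewer hosts, and per-host costs such as licensing and power scale down with the host count. A minimal sketch with assumed figures (the VM counts and consolidation ratios below are illustrative, not measured data):

```python
# Illustrative consolidation arithmetic; all figures are assumptions,
# not measurements from Red Hat or AMD.

def hosts_needed(vm_count: int, vms_per_host: int) -> int:
    """Ceiling division: hosts required at a given consolidation ratio."""
    return -(-vm_count // vms_per_host)

vms = 600
before = hosts_needed(vms, 15)  # legacy estate: ~15 VMs per host (assumed)
after = hosts_needed(vms, 60)   # denser modern hosts (assumed ratio)

print(before, after)  # → 40 10
# Hardware, per-host licensing, and energy line items then shrink
# in rough proportion to the reduced host count.
```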