NVIDIA DSX Turns the AI Factory into a Complete System

NVIDIA has announced DSX, a new platform designed to help operators, manufacturers, and integrators build and operate “AI factories” with a logic more aligned with industrial engineering than traditional server deployment. The announcement, made at GTC Taipei, shows how much the company aims to expand its role in AI infrastructure: it’s no longer just about selling chips, but about defining how data centers are designed, simulated, cooled, powered, and managed to support next-generation models.

The core idea of DSX is to provide a comprehensive manual for constructing large-scale AI infrastructure. NVIDIA combines reference designs, open-source software, APIs, simulation, operational management, cooling technologies, energy integration, and an ecosystem of industry partners. In Jensen Huang’s words, the company does not want to limit itself to “shipping chips”; it wants to offer infrastructure builders a framework to simulate entire factories before investing, validate performance before installing racks, and operate with the reliability that production AI requires.

Cost per token becomes the new metric

The metric NVIDIA wants to place at the center of the debate is the cost per token. Until now, much of the AI discussion has been measured in GPUs, parameters, FLOPS, racks, or megawatts. DSX seeks to shift the conversation to a more operational question: how many tokens can a facility generate or process per megawatt and at what cost.
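To make the metric concrete, here is a minimal sketch of how cost per token and tokens per megawatt-hour could be computed for one hour of operation. The `FacilityHour` structure, its field names, and every number are illustrative assumptions, not figures or definitions published by NVIDIA.

```python
from dataclasses import dataclass


@dataclass
class FacilityHour:
    """One hour of operation for a hypothetical AI facility (illustrative values only)."""
    tokens: float                # tokens generated or processed during the hour
    avg_power_mw: float          # average facility power draw, in megawatts
    energy_price_per_mwh: float  # electricity price, in currency units per MWh
    fixed_cost_per_hour: float   # amortized hardware, site, and staff cost per hour


def tokens_per_mwh(h: FacilityHour) -> float:
    """Tokens produced per megawatt-hour of energy consumed."""
    energy_mwh = h.avg_power_mw * 1.0  # power (MW) times one hour
    return h.tokens / energy_mwh


def cost_per_million_tokens(h: FacilityHour) -> float:
    """Total hourly cost (energy plus fixed) divided by millions of tokens."""
    energy_cost = h.avg_power_mw * h.energy_price_per_mwh
    total_cost = energy_cost + h.fixed_cost_per_hour
    return total_cost / (h.tokens / 1e6)


if __name__ == "__main__":
    # Assumed example: 10 MW site, 3.6 billion tokens in the hour.
    hour = FacilityHour(
        tokens=3.6e9,
        avg_power_mw=10.0,
        energy_price_per_mwh=80.0,
        fixed_cost_per_hour=4000.0,
    )
    print(f"tokens per MWh: {tokens_per_mwh(hour):,.0f}")
    print(f"cost per million tokens: {cost_per_million_tokens(hour):.4f}")
```

With a definition like this, two facilities with very different GPU counts can be compared on the same footing: whichever turns a megawatt-hour and a unit of spend into more tokens comes out ahead.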

This is where DSX MaxLPS comes in, a new set of technologies aimed at maximizing tokens per megawatt within a fixed energy budget. According to NVIDIA, this layer combines liquid cooling at 45°C with rack-level technologies that enable up to 40% more GPUs to operate at their peak energy efficiency, with minimal performance impact on workloads.

This data point is significant because AI is already reaching physical limits. Operators not only need to buy more accelerators but also secure electrical power, grid connection capacity, cooling, space, specialized staff, and availability. In many markets, the bottleneck is not just the chip but the megawatt capacity. If a platform can extract more useful capacity within the same electrical limit, the economic impact can be substantial.
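The fixed-limit argument can be illustrated with some simple arithmetic. The sketch below assumes a site with a hard power budget, a share reserved for cooling and other overhead, and GPUs that can be run at a lower power cap with a smaller loss in throughput. All numbers and function names are hypothetical; this is not how NVIDIA computes its figures, and the result is not meant to reproduce the 40% claim.

```python
def gpus_within_budget(budget_w: int, overhead_pct: int, per_gpu_w: int) -> int:
    """GPUs that fit in a power budget after reserving an overhead share.

    Integer watts are used so the division is exact.
    """
    usable_w = budget_w * (100 - overhead_pct) // 100
    return usable_w // per_gpu_w


def relative_output(gpu_count: int, per_gpu_throughput: float) -> float:
    """Aggregate throughput, in units of one uncapped GPU."""
    return gpu_count * per_gpu_throughput


BUDGET_W = 1_000_000   # assumed 1 MW site limit
OVERHEAD_PCT = 20      # assumed share for cooling and power conversion

# Scenario A: every GPU runs at an assumed 1000 W with full throughput.
full_count = gpus_within_budget(BUDGET_W, OVERHEAD_PCT, per_gpu_w=1000)
full_output = relative_output(full_count, per_gpu_throughput=1.0)

# Scenario B: GPUs capped at an assumed 700 W, keeping 90% of throughput.
capped_count = gpus_within_budget(BUDGET_W, OVERHEAD_PCT, per_gpu_w=700)
capped_output = relative_output(capped_count, per_gpu_throughput=0.9)

if __name__ == "__main__":
    print(f"uncapped: {full_count} GPUs, output {full_output:.1f}")
    print(f"capped:   {capped_count} GPUs, output {capped_output:.1f}")
    print(f"gain within the same budget: {capped_output / full_output - 1:.1%}")
```

Under these assumptions the capped configuration fits more GPUs into the same megawatt and produces more total output, because each GPU gives up less throughput than power. Whether that holds in practice depends on the real efficiency curve of the hardware and on how much extra capital the additional GPUs cost.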

DSX also includes DSX OS, modular open-source software designed for AI factory operations. The platform covers lifecycle management, intelligent scheduling, runtime consistency, system health automation, resilience, multi-tenant operation, and platform services. The idea is for AI infrastructure to be managed as a living environment, where hardware, software, energy, network, and cooling interact in a coordinated way.

Simulate before building

One of DSX’s most important components is DSX Sim, a high-fidelity simulation layer for modeling infrastructure decisions from planning to operation. NVIDIA wants customers and partners to validate designs before deploying physical racks, an increasingly necessary step when any error in power, cooling, or distribution can cost millions.

The platform is complemented by DSX Reference Design, which offers generation-validated architectures for computing, networking, storage, cluster design, power, cooling, controls, and civil, structural, and architectural elements. This standardizes the deployment of AI factories without requiring each operator to reinvent the entire design from scratch.

DSX Flex adds another dimension: connection to electrical grid services. This layer enables dynamic adaptation of AI loads to external signals such as load shedding, demand response, price events, or availability of renewables and storage. In a context where data center demand begins to strain local electrical grids, an AI factory’s ability to behave more flexibly could be crucial for securing new power capacity.
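As a rough illustration of what adapting AI loads to grid signals can mean, the sketch below maps a signal to a site power target and then pauses deferrable jobs until the site fits under it. The signal names, power fractions, and job model are all assumptions made for this example; they do not describe the DSX Flex API or its actual policy.

```python
from dataclasses import dataclass
from enum import Enum


class GridSignal(Enum):
    NORMAL = "normal"
    PRICE_EVENT = "price_event"
    DEMAND_RESPONSE = "demand_response"
    LOAD_SHED = "load_shed"


# Hypothetical fraction of contracted power the site may draw under each signal.
POWER_FRACTION = {
    GridSignal.NORMAL: 1.0,
    GridSignal.PRICE_EVENT: 0.9,
    GridSignal.DEMAND_RESPONSE: 0.75,
    GridSignal.LOAD_SHED: 0.5,
}


@dataclass(frozen=True)
class Job:
    name: str
    power_mw: float
    deferrable: bool  # True if the job can be paused and resumed later


def target_power_mw(contracted_mw: float, signal: GridSignal) -> float:
    """Site power target for the given grid signal."""
    return contracted_mw * POWER_FRACTION[signal]


def jobs_to_pause(jobs: list[Job], target_mw: float) -> list[str]:
    """Pause deferrable jobs, largest first, until total draw fits the target.

    Non-deferrable jobs are never paused, so the target may be missed
    if they alone exceed it.
    """
    total = sum(j.power_mw for j in jobs)
    paused: list[str] = []
    deferrable = sorted(
        (j for j in jobs if j.deferrable),
        key=lambda j: j.power_mw,
        reverse=True,
    )
    for job in deferrable:
        if total <= target_mw:
            break
        paused.append(job.name)
        total -= job.power_mw
    return paused


if __name__ == "__main__":
    jobs = [
        Job("inference", 4.0, deferrable=False),
        Job("training", 5.0, deferrable=True),
        Job("batch-eval", 1.0, deferrable=True),
    ]
    target = target_power_mw(10.0, GridSignal.DEMAND_RESPONSE)
    print(target, jobs_to_pause(jobs, target))
```

A production system would more likely throttle power smoothly or shift work across sites than pause whole jobs, but the shape of the decision is the same: an external signal sets a budget, and the scheduler decides which load gives way.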

NVIDIA also introduces DSX Exchange, aimed at integrating signals across compute, network, energy, power, and cooling between IT, operational technology, and operational agents. This is critical because AI data centers can no longer be managed in silos. Rack, cluster, power plant, cooling system, and network information must be shared to optimize performance and availability.

An ecosystem reinforcing NVIDIA’s position

DSX is accompanied by a broad network of partners. Manufacturers such as Dell Technologies, HPE, Lenovo, Supermicro, ASUS, Foxconn, GIGABYTE, Pegatron, Quanta Cloud Technology, Wistron, and Wiwynn are developing systems ready for DSX. Meanwhile, cloud providers like CoreWeave, Crusoe, Firmus, IREN, Lambda, Nebius, Nscale, and Yotta Data Services are deploying platform components to reduce risks, improve GPU utilization, and accelerate AI capacity deployment.

This list highlights NVIDIA’s ambition. DSX is not just an internal tool or a collection of best practices. It’s a way to organize the ecosystem around its architecture. If manufacturers produce “DSX-ready” systems and operators adopt DSX Sim, DSX MaxLPS, and DSX OS, NVIDIA gains influence over the entire infrastructure design, from chip to installation.

Industrial software partners are also involved. QCT and Pegatron are working with Dassault Systèmes on a digital twin configurator for AI factories, while the Omniverse DSX Blueprint ecosystem integrates with companies like Cadence, PTC, and Siemens. This connection between simulation, systems engineering, and physical deployment reinforces the idea that AI is no longer just software: it’s an industry of heavy infrastructure.

DSX Flex is also being tested in a multi-megawatt commercial pilot with Emerald AI and Silicon Valley Power to demonstrate AI factories capable of adjusting consumption in response to utility signals without compromising workload performance. If this approach scales, it could help accommodate new data centers within increasingly strained electrical grids.

The AI factory as a product

The strategic message is clear. NVIDIA aims to turn AI factories into a complete product, not just a collection of components from different suppliers. This benefits clients with reduced integration risk, more proven designs, better simulation, greater operational visibility, and a clearer path to large-scale deployment.

However, it also increases dependency on the NVIDIA ecosystem. The more operators adopt its reference designs, open-source software, simulation tools, and energy management platforms, the more difficult it becomes to separate their AI infrastructure from NVIDIA’s architecture. For many operators, this integration will be attractive because it accelerates deployment. Others will have concerns about lock-in, interoperability, and long-term control.

This move comes at a time when the industry is shifting from frantic GPU purchasing to a more mature operational phase. Major customers no longer want just quick clusters; they want stable, efficient, repeatable, and measurable capacity. Cost per token, availability, tokens per megawatt, and time-to-market will become metrics as important as the number of accelerators installed.

NVIDIA DSX directly addresses this shift. The company is telling the market that competitive advantage in AI will no longer depend solely on having the most powerful chip but on designing the entire factory to convert energy into intelligence at the lowest possible cost. In a world where megawatts are scarce and AI demand continues to grow, this could become the new frontier of digital infrastructure.

FAQs

What is NVIDIA DSX?
NVIDIA DSX is a platform for designing, simulating, deploying, and operating AI factories. It includes software, APIs, reference designs, simulation tools, operational management, and industry partner technologies.

What is DSX MaxLPS?
It is a set of technologies aimed at maximizing tokens per megawatt. NVIDIA claims it can enable up to 40% more GPUs to operate at peak energy efficiency.

What does DSX OS contribute?
DSX OS is modular, open-source software for managing AI factory operations, covering lifecycle management, resilience, system health, multi-tenancy, and platform services.

Why does the cost per token matter?
Because it measures the actual efficiency of an AI infrastructure. Having many GPUs is not enough; operators need to know how many tokens can be produced per megawatt and at what cost.
