Michael Dell and Jensen Huang shared the stage again in Las Vegas with a message that goes far beyond another round of AI servers. The thesis that Dell Technologies wants to establish in the market is simple: private AI is no longer a complex, expensive, and almost artisanal project but has become a measurable, deployable, and operable infrastructure category, seamlessly integrated into the enterprise data center.
The most talked-about figure is savings. Dell claims that, for certain agent-based AI workloads, its Deskside Agentic AI solutions can reduce costs by up to 87% compared to public cloud APIs over two years, with a break-even point that can be reached in as little as three months. These are vendor estimates that depend on actual usage, models, energy consumption, support, and hardware amortization, but they reflect an increasingly common discussion in companies: not all AI workloads make sense as variable cloud consumption.
From experimental project to “packaged AI factory”
For years, running AI locally sounded like a lab initiative or a project reserved for large infrastructure teams. It required selecting servers, GPUs, storage, network, software, models, security, operations, and support. Many companies ended up opting for external APIs because they were quick to test, easy to contract, and didn’t demand a significant initial investment.
That balance is starting to shift. The explosion of AI agents changes the economics. A chatbot that answers a few questions can be economical under a pay-as-you-go model. By contrast, a set of agents querying internal documents, executing code, calling APIs, working with databases, and generating thousands of interactions daily can drive up costs and raise security, privacy, and data governance concerns.
Dell aims to solve this problem with an end-to-end stack, from workstations to data center racks. Dell Deskside Agentic AI combines high-performance workstations, NVIDIA accelerators, open models, NVIDIA NemoClaw, and OpenShell to create and run agents close to the user or team that needs them. In the data center space, Dell AI Factory with NVIDIA groups PowerEdge servers, storage, networking, software, services, and reference architectures.
The key difference from previous stages is in the purchasing approach. A CIO no longer has to see private AI as a collection of disjointed parts. They can specify a validated “AI factory” from Dell and NVIDIA, with configurations designed for development, inference, agents, enterprise data, and scaling. This approach doesn’t eliminate complexity, but it does package that complexity more effectively.
Token economy transforms the discussion
Cost per token has become a business metric. Each model query, summary, agent call, code execution, and contextual search consumes tokens. When usage is small or unpredictable, public cloud APIs remain convenient. But for intensive, repetitive workloads tied to internal data, on-premises infrastructure can start to make sense.
That’s where Dell’s argument comes in. If a company knows its agents will be working daily over documentation, repositories, tickets, operational data, or internal workflows, paying for each interaction via external consumption becomes less attractive. Additionally, bringing inference into a private environment allows better control over latency, privacy, data residency, and access to sensitive systems.
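The break-even reasoning above can be sketched as simple arithmetic. The following is a minimal illustration, not Dell's methodology: the class names, function names, and every figure (token volume, API price, hardware cost, monthly operating cost) are hypothetical inputs chosen for round numbers, and they do not reproduce Dell's 87% or three-month estimates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workload:
    tokens_per_month: float       # total tokens processed per month (hypothetical)
    api_price_per_million: float  # blended API price in USD per 1M tokens (hypothetical)


@dataclass
class OnPrem:
    hardware_cost: float  # upfront purchase in USD (hypothetical)
    monthly_opex: float   # energy, support, and operations in USD per month (hypothetical)


def api_monthly_cost(w: Workload) -> float:
    """Monthly spend if every token is paid for through an external API."""
    return w.tokens_per_month / 1_000_000 * w.api_price_per_million


def break_even_months(w: Workload, p: OnPrem) -> Optional[float]:
    """Months until the upfront hardware cost is recovered by monthly savings."""
    monthly_saving = api_monthly_cost(w) - p.monthly_opex
    if monthly_saving <= 0:
        return None  # at this volume the on-premises option never pays back
    return p.hardware_cost / monthly_saving


def savings_fraction(w: Workload, p: OnPrem, months: int) -> float:
    """Fraction saved versus the API over the given horizon (negative means a loss)."""
    api_total = api_monthly_cost(w) * months
    onprem_total = p.hardware_cost + p.monthly_opex * months
    return 1 - onprem_total / api_total


if __name__ == "__main__":
    heavy = Workload(tokens_per_month=2_000_000_000, api_price_per_million=5.0)
    light = Workload(tokens_per_month=100_000_000, api_price_per_million=5.0)
    box = OnPrem(hardware_cost=24_000.0, monthly_opex=2_000.0)

    print(f"Heavy workload break-even: {break_even_months(heavy, box):.1f} months")
    print(f"Heavy workload 24-month saving: {savings_fraction(heavy, box, 24):.0%}")
    print(f"Light workload break-even: {break_even_months(light, box)}")
```

With these invented inputs, the heavy workload breaks even after 3.0 months and saves 70% over 24 months, while the light workload never breaks even because its API bill is lower than the on-premises operating cost. That asymmetry is the point of the argument: the result is driven almost entirely by sustained token volume.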
| Element Announced by Dell/NVIDIA | What It Brings |
|---|---|
| Dell Deskside Agentic AI | AI agents running on local workstations for development and business teams |
| Up to 87% cost reduction compared to cloud APIs | Dell’s estimate for certain workloads over two years |
| Break-even in three months | Dell’s estimated payback point for the on-premises investment versus public cloud APIs |
| PowerEdge with NVIDIA Blackwell Ultra | High-performance infrastructure for training and inference |
| Up to 256 Blackwell Ultra GPUs per rack | Dell IR7000 configurations with liquid cooling |
| Up to 4x faster LLM training | Dell’s comparison against previous generations |
| NVIDIA OpenShell | Open runtime for agents with security and privacy controls |
| NVIDIA NemoClaw | Reference stack for autonomous and persistent agents |
Hardware remains crucial. Dell has already expanded its AI Factory using PowerEdge servers based on NVIDIA Blackwell Ultra, with configurations supporting up to 192 GPUs per system and customizable options of up to 256 GPUs per Dell IR7000 rack. The company also announced up to four times faster training of large language models compared to previous generations. Achieving such density requires direct liquid cooling, high-speed networking, and a more integrated rack architecture.
But the real novelty isn’t just in fitting more GPUs into less space. It’s in transforming that capacity into an enterprise platform for private AI. Companies don’t just want to train models; they want agents that operate with their data, under their policies, and within their security boundaries.
Open source, agents, and controlled data
The competitive narrative is also evolving. Two years ago, many companies believed the best results could only come from closed, frontier models accessed via API. Today, open models have improved enough to cover many enterprise use cases, especially when fine-tuned with proprietary data, combined with RAG, run in controlled workflows, or integrated with internal tools.
That doesn’t mean open models have replaced frontier models across all tasks. But they do reduce dependency. A company might choose external APIs for highly advanced tasks while running open-weight models locally for internal support, document search, classification, automation, code analysis, or specialized agents.
NVIDIA and Dell aim to fill that intermediate zone. They’re not just selling servers; they’re promoting an infrastructure concept where models, data, and agents operate under corporate control. OpenShell provides an execution environment with security and privacy controls. NemoClaw and Nemotron offer a foundation to develop custom agents and models. Dell’s AI Data Platform connects this layer to enterprise data.
For regulated sectors, this argument is compelling. Banking, healthcare, industry, public administration, defense, and professional services can’t always send sensitive data to external APIs without first weighing regulatory, jurisdictional, intellectual property, and traceability implications. Private AI offers an alternative, provided the company has the capacity to operate it securely.
The concept of “on-premise” also shifts in perception. It’s no longer seen as a step backward or a rejection of cloud. The most likely scenario is hybrid: public cloud for elasticity, experimentation, and some frontier models; private infrastructure for sensitive data, persistent agents, and predictable costs; edge and workstations for low-latency requirements or local data processing.
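The hybrid split described above can be expressed as a small placement policy. This is only a sketch under stated assumptions: the `Request` fields, the `Target` names, and especially the priority order (latency first, then sensitivity and persistence, then public cloud as the default) are hypothetical and are not part of any Dell or NVIDIA product.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    PUBLIC_CLOUD = "public cloud"
    PRIVATE_INFRA = "private infrastructure"
    EDGE = "edge or workstation"


@dataclass
class Request:
    # All fields are hypothetical labels a company would assign to a workload.
    sensitive_data: bool = False
    persistent_agent: bool = False
    needs_low_latency: bool = False


def route(r: Request) -> Target:
    """Pick a placement for a workload; the rule order is an assumption."""
    # Low-latency or local-data work stays close to the user.
    if r.needs_low_latency:
        return Target.EDGE
    # Sensitive data and long-running agents stay under corporate control.
    if r.sensitive_data or r.persistent_agent:
        return Target.PRIVATE_INFRA
    # Everything else (elastic, experimental, frontier-model calls) goes to cloud.
    return Target.PUBLIC_CLOUD


if __name__ == "__main__":
    print(route(Request(sensitive_data=True)).value)
    print(route(Request(needs_low_latency=True, sensitive_data=True)).value)
    print(route(Request()).value)
```

A real policy would likely weigh cost predictability and model capability as well; the value of writing it down, even this crudely, is that the placement decision becomes explicit and auditable rather than ad hoc.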
The battle won’t just be about which model is most powerful. It’ll be about who manages to run AI within the company with sustainable costs, protected data, and agents capable of acting without breaching internal controls. Dell and NVIDIA want this stack to be purchased almost as an infrastructure upgrade. If they succeed, private AI will cease to be an exception and will genuinely start competing with API-only consumption.
FAQs
What has Dell announced alongside NVIDIA?
Dell has enhanced its AI Factory with NVIDIA with solutions for agent-based AI, local workstations, PowerEdge servers, models, agent software, and architectures designed to run private AI from desktop to data center.
Is the 87% savings compared to cloud APIs real?
It’s Dell’s estimate for certain agent-based AI workloads over two years, not an independently verified figure. The savings could be significant for intensive, constant use cases, but they depend on volume, models, energy, support, and hardware amortization.
What does it mean that private AI fits in a rack?
It means some companies can run models and agents on their own infrastructure, with validated servers, GPUs, networking, storage, and software, instead of relying solely on external APIs.
Will private AI replace public cloud?
Not broadly. The most likely scenario is hybrid: public cloud for certain workloads and frontier models; private infrastructure for sensitive data, predictable costs, and integrated agents within internal systems.

