NVIDIA Vera: The CPU That’s Eager to Attack the Data Center

NVIDIA no longer wants to be just the company that dominates AI GPUs. With Vera, its first custom CPU for data centers, the company is stepping directly into territory historically controlled by Intel and AMD. The move is significant: according to statements made after its latest earnings report, NVIDIA sees a $200 billion market opportunity for Vera and anticipates nearly $20 billion in CPU revenue this year.

That figure is striking even for a company accustomed to huge numbers. NVIDIA closed its first fiscal quarter of 2027 with record revenue of $81.6 billion, an 85% increase over the same quarter a year earlier, and its data center business reached $75.2 billion. Its forecast for the second quarter points to $91 billion, a sign that the AI infrastructure investment cycle has not slowed.
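As a quick check on those numbers, the short Python sketch below derives three figures that follow from the quoted ones: the implied revenue in the year-earlier quarter, the data center share of total revenue, and the sequential growth implied by the guidance. The inputs are the article's figures; the variable names and the derived values are illustrative arithmetic, not numbers reported by NVIDIA.

```python
# Figures quoted in the article, in billions of US dollars.
q1_revenue = 81.6
yoy_growth = 0.85            # 85% increase over the year-earlier quarter
data_center_revenue = 75.2
q2_guidance = 91.0

# Derived values (illustrative arithmetic, not reported figures).
implied_prior_year_q1 = q1_revenue / (1 + yoy_growth)
data_center_share = data_center_revenue / q1_revenue
implied_sequential_growth = q2_guidance / q1_revenue - 1

print(f"Implied year-earlier quarter: ${implied_prior_year_q1:.1f}B")
print(f"Data center share of revenue: {data_center_share:.1%}")
print(f"Implied sequential growth:    {implied_sequential_growth:.1%}")
```

On these inputs, the year-earlier quarter works out to roughly $44 billion, the data center business accounts for about 92% of revenue, and the guidance implies sequential growth of about 11.5%.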

Why Agentic AI Needs Another CPU

NVIDIA’s thesis is clear: AI agents don’t live solely within the GPU. An agentic system must execute code, call tools, manage memory, query databases, switch context, open isolated environments, coordinate multiple steps, and feed the GPUs with data constantly. Many of these tasks fall to the CPU.

Until now, much of the AI conversation focused on model training and GPU-accelerated inference. But the next phase—driven by agents that act, reason over longer durations, and integrate with enterprise systems—requires a different architecture. That’s where Vera comes in.

NVIDIA defines Vera as a CPU designed for the era of agentic AI. It integrates 88 of NVIDIA's custom Olympus cores, implements Armv9.2, and runs 176 threads via Spatial Multithreading. It offers up to 1.2 TB/s of LPDDR5X memory bandwidth and supports up to 1.5 TB of memory. It also incorporates second-generation NVLink-C2C with 1.8 TB/s of coherent bandwidth for working alongside future Rubin GPUs.

| NVIDIA Vera Features | Announced Data |
| --- | --- |
| Architecture | Armv9.2 |
| Cores | 88 Olympus |
| Threads | 176 with Spatial Multithreading |
| Memory Bandwidth | Up to 1.2 TB/s |
| Supported Memory | Up to 1.5 TB |
| Interconnect | NVLink-C2C at 1.8 TB/s |
| Main Use | Agentic AI, RL, analytics, sandboxes, orchestration |
| Improvement | Up to 50% faster per core under load |
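The announced figures are easier to compare once they are expressed per core and per thread. The sketch below does that arithmetic in Python. It rests on several assumptions: the "up to" values are treated as peaks, resources are divided evenly across cores and threads (real workloads will not share them this neatly), and decimal units are used (1 TB = 1000 GB). The constant and variable names are illustrative.

```python
# Announced figures from the article; "up to" values are treated as peaks.
CORES = 88
THREADS = 176
MEMORY_BANDWIDTH_TB_S = 1.2
MEMORY_CAPACITY_TB = 1.5
NVLINK_C2C_TB_S = 1.8

GB_PER_TB = 1000  # assumption: decimal units

# Even division across cores and threads is a simplification.
threads_per_core = THREADS / CORES
bandwidth_per_core_gb_s = MEMORY_BANDWIDTH_TB_S * GB_PER_TB / CORES
memory_per_thread_gb = MEMORY_CAPACITY_TB * GB_PER_TB / THREADS
c2c_to_memory_ratio = NVLINK_C2C_TB_S / MEMORY_BANDWIDTH_TB_S

print(f"Threads per core:            {threads_per_core:.0f}")
print(f"Peak bandwidth per core:     {bandwidth_per_core_gb_s:.1f} GB/s")
print(f"Memory per thread:           {memory_per_thread_gb:.1f} GB")
print(f"NVLink-C2C / memory bandwidth: {c2c_to_memory_ratio:.2f}x")
```

Under these assumptions, each core runs two threads and has about 13.6 GB/s of peak memory bandwidth, each thread has about 8.5 GB of memory, and the quoted NVLink-C2C bandwidth is 1.5 times the memory bandwidth.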

This approach changes how CPUs are valued. In traditional servers, value was often measured by core count, consolidation, and the capacity that could be rented out. Jensen Huang summarized it more directly: agents don’t “rent cores”; they want tasks completed quickly. Following this logic, NVIDIA proposes a CPU with high per-core performance, ample memory, and tight integration with its GPU, networking, and software ecosystem.

From GPU to the Entire System

Vera doesn’t come alone. It’s part of the Vera Rubin architecture, NVIDIA’s next major platform for rack-scale AI. In Vera Rubin NVL72, the CPU acts as the host processor for Rubin GPUs and integrates with BlueField-4, Spectrum-X, MGX, and NVIDIA’s interconnection networks. The company aims for data centers to be bought and operated increasingly as a complete unit, not just as a collection of separate components.

That is the core strategic move. NVIDIA isn’t just selling chips. It is selling the AI factory: GPU, CPU, networking, DPU, software, libraries, rack systems, workload management, and agent tools. The more integrated the AI infrastructure, the harder it becomes for customers to replace a single component without touching the rest.

NVIDIA has already begun delivering Vera systems to major labs and cloud providers. The company announced that the first systems have reached Anthropic, OpenAI, SpaceXAI, and Oracle Cloud Infrastructure (OCI). OCI also announced plans to deploy hundreds of thousands of Vera CPUs starting in 2026 for large-scale agentic AI workloads.

| Uses of Vera | Role in Strategy |
| --- | --- |
| Independent CPU | Compete in high-performance AI and analytics servers |
| Host CPU in Vera Rubin | Power and coordinate Rubin GPUs in AI racks |
| Vera with ConnectX-9 | Accelerate storage, security, and isolation |
| Vera for Cloud | Run agentic workloads, sandboxes, and parallel environments |
| Vera for Enterprises | Bring private AI and agentic inference to on-prem infrastructure |

The implications for Intel and AMD are clear. NVIDIA is entering the CPU market not from the general-purpose server angle but from the segment where spending is growing fastest: AI. It isn’t trying to compete across all traditional data centers. It is targeting the workloads where the economic value is highest and where its ecosystem already holds a dominant position.

A Huge Market, but Bottleneck Risks

NVIDIA’s enthusiasm comes with an acknowledged risk: supply constraints. Huang has admitted that supply limitations are expected to persist throughout the lifetime of Vera Rubin. That makes sense: Vera depends on high-performance LPDDR5X memory, advanced packaging, manufacturing capacity, and a supply chain already strained by AI demand.

Memory will be one of the most delicate components. Vera needs large capacity and bandwidth to sustain thousands of parallel software environments, manage context switching, and keep the GPUs fed. Meanwhile, the industry faces intense pressure on DRAM, HBM, and advanced memory types due to the growth of AI data centers.

Competition also looms. Google, Amazon, Microsoft, and other hyperscalers are developing their own accelerators or specialized chips to reduce costs and dependency. AMD is pushing its EPYC processors and Instinct GPUs. Intel is trying to regain ground in data centers. Others, like Groq, target specific inference niches with different architectures.

NVIDIA maintains that solutions relying heavily on SRAM and focused on decode, such as those designed for low latency and high token throughput, will remain niche for a while. Its argument is that many agentic workloads need to handle large contexts, run bigger models, and operate more varied pipelines, not just generate tokens at high speed.

The CPU Returns to Center Stage

For years, the CPU seemed to lose prominence in the AI narrative. GPUs were the stars. Vera shows that the story is more complex. When AI moves from just answering questions to executing tasks, the CPU regains importance—not as a GPU replacement, but as the system coordinator.

The practical question for clients will be: how much does Vera improve the cost per completed task? It’s not enough to promise higher performance per core or better energy efficiency. Hyperscalers and companies will measure whether their agents finish faster, GPUs wait less, sandboxes run more smoothly, data queries are quicker, and overall infrastructure costs truly decrease.
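The cost-per-completed-task question can be made concrete with a small Python sketch. Everything in it is hypothetical: the function name, the hourly costs, and the task throughput figures are placeholders chosen for illustration, not measurements or prices for Vera or any other system. The only point is the shape of the comparison: a faster system lowers cost per task only if its throughput gain exceeds its increase in hourly cost.

```python
def cost_per_task(hourly_cost_usd: float, tasks_per_hour: float) -> float:
    """Return infrastructure cost per completed task (hypothetical metric)."""
    if tasks_per_hour <= 0:
        raise ValueError("tasks_per_hour must be positive")
    return hourly_cost_usd / tasks_per_hour


# Hypothetical inputs: a baseline system and a candidate that completes
# 50% more agent tasks per hour but costs 30% more to run.
baseline = cost_per_task(hourly_cost_usd=10.0, tasks_per_hour=1000.0)
candidate = cost_per_task(hourly_cost_usd=13.0, tasks_per_hour=1500.0)

# Fraction by which the candidate lowers cost per task (negative = worse).
savings = 1 - candidate / baseline

print(f"Baseline:  ${baseline:.4f} per task")
print(f"Candidate: ${candidate:.4f} per task")
print(f"Savings:   {savings:.1%}")
```

With these placeholder numbers the candidate comes out cheaper per task. If its hourly cost rose by more than 50%, matching or exceeding the throughput gain, the advantage would disappear.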

If NVIDIA meets its forecasts, Vera could become one of the most significant server launches in years. Not because it immediately displaces Intel and AMD across all workloads, but because it redefines the most strategic segment: AI data centers.

This move also reinforces an increasingly common idea: artificial intelligence isn’t built solely with models. It’s built with CPUs, GPUs, memory, networking, cooling, power, software, and architecture capable of supporting millions of simultaneous tasks. NVIDIA already dominated much of this chain. With Vera, it aims to control another piece previously outside its direct reach.

Frequently Asked Questions

What is NVIDIA Vera?
NVIDIA Vera is NVIDIA’s first custom CPU for data centers, designed for agentic AI workloads, reinforcement learning, analytics, orchestration, tool-calling, and context management.

Why is NVIDIA entering the CPU market now?
Because AI agents require significant CPU capacity to coordinate tools, execute code, move data, manage memory, and feed GPUs. NVIDIA wants to control this part of the infrastructure.

How much does NVIDIA expect to earn with Vera?
According to statements made after its latest earnings report, NVIDIA sees a potential $200 billion market for Vera and expects close to $20 billion in CPU revenue this year.

Does Vera replace GPUs?
No. Vera is designed to work alongside GPUs. Its role is to coordinate, feed, and manage control, data, and agent workloads within future AI factories.
