NVIDIA has announced the general availability of a new variant of their professional GPU based on Blackwell: the RTX PRO 5000 with 72 GB of GDDR7 memory. The concept is straightforward, but it arrives at a very specific moment: current generative AI and especially agentic AI workflows are pushing desktop GPU memory to the limit, just as companies want to prototype and run more tasks “locally” for privacy, latency, and cost reasons.
Until now, the RTX PRO 5000 was configured with 48 GB. With this new 72 GB model, NVIDIA expands the options for those needing more VRAM capacity without necessarily moving to a data center environment. In practice, this move targets development, engineering, and creation teams that are no longer just rendering or simulating — they are also running language models, multimodal pipelines, systems with RAG, and toolchains that keep multiple components active simultaneously.
Memory is king: why 72 GB matters more than it seems
In local AI, raw power isn’t everything. For many modern workloads, the bottleneck appears on two fronts:
- Capacity: what models, contexts, and assets can fit simultaneously on the GPU.
- Throughput: the actual speed achieved when generating text, images, or iterating through a pipeline.
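On the capacity side, a rough sketch of a local LLM's VRAM footprint (weights plus KV cache) shows why the ceiling matters. The model geometry below (70B parameters, 80 layers, 8 KV heads, head dimension 128, 4-bit weights) is a hypothetical example for illustration, not a figure from NVIDIA:

```python
# Rough VRAM budget for serving an LLM locally: weights + KV cache.
# All figures are illustrative; real usage adds activations,
# framework overhead, and memory fragmentation.

def weights_gb(params_b: float, bytes_per_param: float) -> float:
    """Memory for model weights, in GB (params given in billions)."""
    return params_b * 1e9 * bytes_per_param / 1e9

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache size in GB: 2 tensors (K and V) per layer, per token."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
    return per_token * context_len / 1e9

# Hypothetical 70B-parameter model quantized to 4 bits (0.5 bytes/param)
# with an 80-layer, 8-KV-head, dim-128 geometry and a 32k context.
w = weights_gb(70, 0.5)               # 35.0 GB of weights
kv = kv_cache_gb(80, 8, 128, 32_768)  # ~10.7 GB of KV cache
print(f"weights ≈ {w:.1f} GB, KV cache ≈ {kv:.1f} GB, total ≈ {w + kv:.1f} GB")
```

Even under aggressive 4-bit quantization, a 70B-class model with a long context already brushes against a 48 GB ceiling before activations or framework overhead are counted.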
NVIDIA presents this RTX PRO 5000 72 GB as a direct response to this pressure. The headline figure is 72 GB of GDDR7, a 50% increase over the 48 GB variant, aimed at more memory-hungry workflows: large models, extended contexts, and pipelines that combine text, images, video, tools, and information retrieval.
In this context, the "agentic AI" argument makes technical sense: an agent usually isn't a single model emitting text. It typically involves orchestration, tool calls, data retrieval, reasoning over context, and, at the enterprise level, integration with documentation, tickets, repositories, and internal data. Keeping all of this resident in GPU memory (rather than paying constant round trips to system memory) is what separates a usable prototype from a frustrating one.
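A back-of-the-envelope budget for such an agent stack, with every component size a made-up illustrative figure, shows how quickly resident pieces add up:

```python
# Illustrative memory budget for a local agent stack that keeps every
# component resident on the GPU. All sizes are hypothetical examples,
# not measurements of any specific product.

components_gb = {
    "chat LLM (quantized weights)": 35.0,
    "KV cache (long context)": 10.7,
    "embedding model (for RAG)": 1.5,
    "reranker": 2.5,
    "vision encoder (multimodal input)": 3.0,
    "framework overhead": 2.0,
}

total = sum(components_gb.values())
for name, gb in components_gb.items():
    print(f"{name:34s} {gb:5.1f} GB")
print(f"{'total':34s} {total:5.1f} GB")
```

Under these assumed numbers the stack (about 55 GB) overflows a 48 GB card but fits in 72 GB with headroom, which is exactly the pressure the larger variant targets.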
Blackwell on workstations: 2,142 TOPS and a promise of “more local, less cloud”
The RTX PRO 5000 72 GB leverages the NVIDIA Blackwell architecture, which the company associates with improvements in AI performance, neural rendering, and simulation, as well as functions aimed at handling multiple workloads more efficiently.
In numbers, NVIDIA cites 2,142 TOPS of AI performance. The message is clear: it aims to position this GPU as capable of bringing tasks that previously required remote infrastructure—light training, fine-tuning, quick testing, agent prototyping, multimodal inference, and low-latency application development—straight to the desktop.
This “local” aspect also ties into a common enterprise concern: privacy and data control. Running models on a workstation reduces external dependencies for internal tasks, avoids moving sensitive datasets off-premises, and cuts the costs of repetitive cloud-based testing.
Performance: generative AI benchmarks and rendering leaps
In generative AI benchmarks, NVIDIA claims that this RTX PRO 5000 72 GB offers:
- Up to 3.5× performance increase over previous-generation professional hardware in image generation.
- Up to 2× higher performance in text generation compared to the previous generation.
Beyond pure AI, NVIDIA emphasizes professional creation and visualization. In path tracing engines like Arnold, Chaos V-Ray, and Blender, as well as real-time GPU renderers like D5 Render and Redshift, the company reports rendering time reductions of “up to 4.7×.” For engineering and product design, NVIDIA states more than 2× performance gains in graphics.
More than marketing, the practical takeaway is that this GPU isn’t just aimed at “doing AI,” but at a hybrid profile increasingly common: teams combining CAD/CAE, simulation, heavy 3D scenes, and generative tools (denoisers, assistants, asset generation, automation)—where extra memory means fewer limits and fewer pauses.
Early adoption cases: generative design and virtual production iteration
NVIDIA cites two examples illustrating its intended use cases:
- InfinitForm, which uses the GPU to accelerate its CUDA-based generative design software for engineering and manufacturing, speeding up simulations and CAD workflows.
- Versatile Media, focused on virtual production, seeking to improve real-time rendering performance for large-scale scenes and complex assets, where the extra memory allows higher resolutions and more complex iterations without interrupting the creative flow.
Availability: professional channels and more systems “starting next year”
The RTX PRO 5000 72 GB is already available through partners such as Ingram Micro, Leadtek, Unisplendour, and xFusion. NVIDIA anticipates broader availability via global integrators early next year.
Overall, this launch sends a clear message: the race to bring agentic and multimodal AI to the workplace isn’t just about models. It’s about memory, rapid iteration, and the ability to do more near the user—without each test requiring data center resources.
Frequently Asked Questions
What’s the benefit of having 72 GB of VRAM in an AI workstation?
To run larger models and pipelines locally (LLMs, RAG, multimodal, tools-enabled agents) and handle bigger scenes or datasets without running out of memory.
How does it compare to the RTX PRO 5000 with 48 GB?
The key difference is capacity: 72 GB, which is a 50% increase, allowing more models and contexts to be active simultaneously, and easier management of multiple components.
Which professionals benefit most?
AI developers, data scientists, engineering teams (CAD/CAE), and 3D/virtual production creators working with rendering, simulation, and generative tools.
When will it be available through global integrators?
NVIDIA indicates broader availability through system builders will start early next year.
via: NVIDIA blogs

