AMD has published new AI benchmark results that put its professional 48 GB RDNA 3 graphics cards ahead of Nvidia’s RTX 4090. In tests run with DeepSeek R1 distilled models, the Radeon Pro W7900 and the Radeon Pro W7800, both with 48 GB of VRAM, achieved up to 7.3 times the performance of the RTX 4090 in certain language model inference scenarios.
DeepSeek R1 Test Results
David McAfee, Vice President and General Manager of Ryzen CPUs and Radeon Graphics at AMD, shared on X (formerly Twitter) a series of tests conducted with LM Studio 0.3.12 and llama.cpp runtime 1.18, comparing the GPUs in four different configurations:
| Test | RTX 4090 | Pro W7800 48GB | Pro W7900 48GB |
|---|---|---|---|
| Distill Qwen 32B 8-bit | 2.7 tokens/s | 19.1 tokens/s | 19.8 tokens/s |
| Distill Llama 70B 4-bit | 2.3 tokens/s | 12.8 tokens/s | 12.7 tokens/s |
| Distill Qwen 32B 8-bit (variant) | 2.5 tokens/s | 15.7 tokens/s | 16.2 tokens/s |
| Distill Llama 70B 4-bit (variant) | 2.0 tokens/s | 10.1 tokens/s | 10.4 tokens/s |
Relative to the RTX 4090, AMD claims the following speedups for its 48 GB RDNA 3 GPUs; the multipliers match the Radeon Pro W7900 figures in the table:
- 7.3 times faster in Distill Qwen 32B 8-bit.
- 6.5 times faster in another variant of Distill Qwen 32B 8-bit.
- 5.5 times faster in Distill Llama 70B 4-bit.
- 5.2 times faster in another variant of Distill Llama 70B 4-bit.
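The multipliers above can be checked against the table by dividing each card's throughput by the RTX 4090's. The following is a minimal sketch of that arithmetic; the dictionary layout and the `speedups` helper are illustrative names, not part of AMD's material, and the figures are copied from the table.

```python
# Throughput in tokens/s, copied from the table in this article.
results = {
    "Distill Qwen 32B 8-bit": {
        "RTX 4090": 2.7, "Pro W7800 48GB": 19.1, "Pro W7900 48GB": 19.8,
    },
    "Distill Llama 70B 4-bit": {
        "RTX 4090": 2.3, "Pro W7800 48GB": 12.8, "Pro W7900 48GB": 12.7,
    },
    "Distill Qwen 32B 8-bit (variant)": {
        "RTX 4090": 2.5, "Pro W7800 48GB": 15.7, "Pro W7900 48GB": 16.2,
    },
    "Distill Llama 70B 4-bit (variant)": {
        "RTX 4090": 2.0, "Pro W7800 48GB": 10.1, "Pro W7900 48GB": 10.4,
    },
}


def speedups(row, baseline="RTX 4090"):
    """Return each card's throughput as a multiple of the baseline card's."""
    base = row[baseline]
    return {
        gpu: round(tps / base, 1)
        for gpu, tps in row.items()
        if gpu != baseline
    }


for test, row in results.items():
    print(test, speedups(row))
```

The W7900 column reproduces the quoted 7.3, 5.5, 6.5, and 5.2 multipliers. The W7800 ratios differ slightly because its throughput differs slightly in each test.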
The Impact of VRAM on AI Models
One of the key factors in the performance of these models is the amount of available VRAM. In inference with large language models (LLMs), the model’s parameters need to reside in GPU memory to run at full speed; when they do not fit, part of the model typically has to be served from slower system memory, and throughput drops sharply. The RTX 4090 has 24 GB of VRAM, whereas AMD argues that its 48 GB cards can hold the largest DeepSeek R1 distilled models without needing to split the load across multiple GPUs.
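A rough back-of-the-envelope estimate illustrates why capacity matters here. The sketch below assumes nominal parameter counts (32 and 70 billion) and nominal bit widths, and it ignores the KV cache, quantization metadata, and runtime buffers, so real footprints will be somewhat larger. The function names are hypothetical, and the 24 GB figure is the RTX 4090's VRAM capacity.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float, vram_gb: float) -> bool:
    """True if the weights alone are smaller than the card's VRAM.

    Simplifying assumption: no allowance for KV cache or runtime overhead.
    """
    return weight_footprint_gb(params_billion, bits_per_weight) < vram_gb


# (parameters in billions, bits per weight), taken from the test names.
models = {
    "Distill Qwen 32B 8-bit": (32, 8),
    "Distill Llama 70B 4-bit": (70, 4),
}

# VRAM capacity in GB for the cards in the comparison.
cards_vram_gb = {"RTX 4090": 24, "Pro W7800 48GB": 48, "Pro W7900 48GB": 48}

for name, (params, bits) in models.items():
    size = weight_footprint_gb(params, bits)
    for card, vram in cards_vram_gb.items():
        verdict = "fits" if fits_in_vram(params, bits, vram) else "does not fit"
        print(f"{name}: ~{size:.0f} GB of weights {verdict} in {card} ({vram} GB)")
```

Under these assumptions the two models need roughly 32 GB and 35 GB for weights alone, which exceeds the RTX 4090's 24 GB but fits within 48 GB. That is consistent with the large gaps in the table, although AMD's post is the only source for the measured numbers.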

However, this benefit comes at a high cost. The Radeon Pro W7900 48 GB is priced at $3,500, which puts it about $1,500 above the base price of the RTX 5090 ($1,999) and about $1,900 above the RTX 4090 ($1,599 at launch). Still, it is more affordable than the RTX 6000 Ada Generation 48 GB, Nvidia’s closest option in terms of VRAM capacity.
Nvidia’s Counterattack
Although these results position AMD as a competitive option for artificial intelligence workloads, the company has avoided comparing its GPUs with the new RTX 5090, Nvidia’s latest flagship model. Previously, when AMD published similar benchmarks on the RX 7900 XTX, Nvidia responded with its own data, showing that its GPU outperformed AMD’s in DeepSeek R1 under similar configurations.
Nvidia is likely to counterattack with new benchmarks to demonstrate the performance of its latest models against the 48 GB RDNA 3 cards, especially considering that the RTX 5090 has only 32 GB of GDDR7 compared to the 48 GB on AMD’s cards.
The landscape of GPUs for artificial intelligence continues to evolve, and while AMD shows advantages in VRAM and performance in certain tests, the battle for supremacy in AI between Nvidia and AMD is far from decided.