Qualcomm Brings Dragonfly to Data Centers to Compete in Generative AI

Qualcomm does not want its next major phase to depend on smartphones. The company has presented a comprehensive data center roadmap under the Dragonfly brand, featuring a custom CPU, inference accelerators, near-memory computing technology, high-speed connectivity, and tailored silicon for large customers. The goal is to enter AI infrastructure just as inference and agents start to consume a growing share of hyperscaler budgets.

This proposal comes with significant commercial signals. Qualcomm announced a multi-generational agreement with Meta to supply data center CPUs, and Reuters reports that Microsoft is among the clients that will use its new High Bandwidth Compute technology. There are also agreements with two unidentified hyperscalers for custom silicon. In other words, Qualcomm isn’t just launching a family of chips; it’s trying to prove it already has customers that justify its return to data centers.

The company starts from a simple idea: AI agents will increase the number of tokens, queries, tools, intermediate calls, and chained tasks. In this scenario, the cost per token and performance per watt are as important as brute power. If inference becomes the dominant workload, infrastructure must be more efficient, more modular, and less reliant on architectures primarily designed for massive training.
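The link between performance per watt and cost per token can be made concrete with a little arithmetic. The following is a minimal sketch, not a Qualcomm model: the function name and every number in it (power draw, throughput, electricity price) are hypothetical placeholders chosen only for illustration.

```python
JOULES_PER_KWH = 3.6e6


def energy_cost_per_million_tokens(power_watts: float,
                                   tokens_per_second: float,
                                   usd_per_kwh: float) -> float:
    """Electricity cost in USD of generating one million tokens."""
    joules_per_token = power_watts / tokens_per_second
    kwh_per_million_tokens = joules_per_token * 1_000_000 / JOULES_PER_KWH
    return kwh_per_million_tokens * usd_per_kwh


# Hypothetical system: 1 kW, 1,000 tokens/s, $0.10 per kWh.
baseline = energy_cost_per_million_tokens(1_000.0, 1_000.0, 0.10)
# Same power budget, twice the throughput (2x tokens per watt).
improved = energy_cost_per_million_tokens(1_000.0, 2_000.0, 0.10)

print(f"baseline: ${baseline:.4f} per million tokens")
print(f"improved: ${improved:.4f} per million tokens")
```

Under these assumptions, doubling tokens per watt halves the energy cost per token. Electricity is only one component of total cost; hardware, cooling, and utilization are deliberately left out of this sketch.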

Dragonfly C1000: an Arm CPU for AI servers

The most prominent element of the roadmap is the Qualcomm Dragonfly C1000, a data center CPU designed for scale-out workloads, agent orchestration, general-purpose use, and AI infrastructure. It will be based on custom Oryon cores, with frequencies above 5 GHz, more than 250 cores in a chiplet-based design, PCIe Gen 7 connectivity exceeding 2 TB/s, and support for CXL.

Qualcomm claims the CPU is designed to deliver more than twice the performance per watt of competing server reference platforms, a figure based on projections from the announced specifications. This should be read as a design target, not as an independently measured result in production. Commercial availability is expected in 2028.

The C1000’s significance lies not in replacing GPUs but in what surrounds them. Large AI clusters need CPUs to coordinate accelerators, move data, run control services, manage memory, handle web workloads, serve agents, and reduce infrastructure bottlenecks. If a CPU consumes less power while maintaining good per-core performance, it can lower the overall system cost.

| Product | Intended Role | Timeline |
| --- | --- | --- |
| Dragonfly C1000 | Data center CPU for servers, agents, head nodes, and AI infrastructure | 2028 |
| Dragonfly AI200 | First-generation inference accelerator | 2026 |
| Dragonfly AI250 | Accelerator with HBC Gen 1 | Commercial sampling planned for mid-2027 |
| Dragonfly AI300 | Inference accelerator with HBC Gen 2 | Commercial sampling planned for 2028 |
| HBC | Near-memory compute to reduce the “memory wall” | Multi-generational roadmap |
| Custom silicon | ASIC for large-scale clients | First revenues before end of 2026, according to Reuters |

The CPU also lends credibility to an entire platform. Qualcomm doesn’t want to sell an isolated card but an architecture composed of rack-level compute, memory, connectivity, software, and customized options. This approach dominates AI data centers: the final performance depends on the entire system, not just the chip.

HBC: An attempt to break the memory wall

The most technical part of the announcement is Qualcomm High Bandwidth Compute, or HBC. The company describes it as a near-memory computing architecture that combines high-bandwidth compute and memory in a 3D-stacked solution. Its goal is to address one of AI’s core challenges: moving data consumes energy, time, and money.

The “memory wall” isn’t new, but AI has pushed it to the limit. Large models need continuous access to enormous volumes of parameters, context, intermediate memory, and datasets. In inference, especially with agents that reason for longer or call tools, bandwidth and memory efficiency can determine the cost per token.
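A rough way to see why bandwidth can set the cost per token: in single-stream decoding, each generated token typically requires reading the model weights once, so throughput is bounded by memory bandwidth divided by model size in bytes. The sketch below applies that rule of thumb. It is a simplification that ignores batching, KV-cache traffic, and compute limits, and the model size and bandwidth values are hypothetical, not figures from Qualcomm.

```python
def decode_tokens_per_second_bound(bandwidth_tb_s: float,
                                   params_billion: float,
                                   bytes_per_param: float) -> float:
    """Upper bound on single-stream decode rate for a bandwidth-bound model.

    Assumes every generated token reads all weights exactly once.
    """
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return bandwidth_tb_s * 1e12 / bytes_per_token


# Hypothetical model: 70 billion parameters stored at 1 byte each (70 GB).
for bandwidth in (1.0, 4.0, 16.0):  # hypothetical TB/s values
    bound = decode_tokens_per_second_bound(bandwidth, 70.0, 1.0)
    print(f"{bandwidth:5.1f} TB/s -> at most {bound:7.1f} tokens/s per stream")
```

The bound scales linearly with bandwidth, which is why memory bandwidth per watt, rather than peak compute, is often the figure vendors emphasize for inference.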

Qualcomm states that HBC Gen 1, integrated into AI250, will enable 133 TB/s per card and an 18x increase in effective bandwidth compared to AI200 with LPDDR5X. With HBC Gen 2, the AI300 would make another leap, with a 54x improvement over AI200. The company also mentions 6 times more bandwidth per watt compared to HBM and 200 times more capacity per watt compared to SRAM, based on published spec sheets normalized by card or rack.

| Technology | Qualcomm’s Technical Promise |
| --- | --- |
| HBC Gen 1 in AI250 | 133 TB/s per card and 18x compared to AI200 |
| HBC Gen 2 in AI300 | 54x compared to AI200 |
| HBC vs HBM | 6x more bandwidth per watt, based on published specs |
| HBC vs SRAM | 200x more capacity per watt at rack level |
| Objective | Lower energy per token and improved inference cost |
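The stated multipliers can be cross-checked with simple arithmetic. The sketch below assumes, and this is only an assumption, that the 18x and 54x figures refer to the same per-card effective-bandwidth metric as the 133 TB/s number. Qualcomm has not published the derived values computed here, so they should be treated as back-of-the-envelope implications rather than specifications.

```python
# Figures quoted in the announcement.
AI250_BANDWIDTH_TB_S = 133.0   # HBC Gen 1, per card
AI250_GAIN_OVER_AI200 = 18.0   # effective bandwidth vs AI200 with LPDDR5X
AI300_GAIN_OVER_AI200 = 54.0   # HBC Gen 2 vs AI200

# Derived values (assumption: all multipliers use the same metric).
implied_ai200_tb_s = AI250_BANDWIDTH_TB_S / AI250_GAIN_OVER_AI200
implied_ai300_tb_s = implied_ai200_tb_s * AI300_GAIN_OVER_AI200
gen2_over_gen1 = AI300_GAIN_OVER_AI200 / AI250_GAIN_OVER_AI200

print(f"Implied AI200 effective bandwidth: {implied_ai200_tb_s:.1f} TB/s")
print(f"Implied AI300 effective bandwidth: {implied_ai300_tb_s:.0f} TB/s")
print(f"Implied Gen 2 over Gen 1 gain:     {gen2_over_gen1:.0f}x")
```

If the assumption holds, the roadmap implies roughly a threefold step from HBC Gen 1 to Gen 2.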

This is one of Qualcomm’s most intriguing bets. While NVIDIA dominates with HBM in AI GPUs and others explore SRAM-based designs or specialized architectures, Qualcomm leverages its historical expertise in low power, LPDDR, integrated systems, and advanced packaging to offer an alternative route. The question is whether this advantage will hold up in real workloads with actual models and mature software.

Dragonfly AI300 and an annual cadence for inference

Dragonfly AI300 will be the third generation in the inference accelerator roadmap, following AI200 and AI250. Qualcomm aims for an annual release cycle, common in AI GPUs but less typical for a company re-entering data centers.

AI300 is envisioned as a rack-scale inference platform with air and direct liquid cooling, HBC Gen 2, disaggregated deployments, and support for large language models and multimodal models. The company expects four to eight times better performance per watt compared to existing GPU architectures, measured in memory bandwidth per watt and per card. It also mentions scaling with UALink and ESUN, plus copper and optical technology for scale-out.

Focusing Dragonfly on inference isn’t accidental. Model training will remain dominated by NVIDIA, Google’s TPUs, Amazon’s Trainium, proprietary chips from large labs, and other costly solutions. Inference, however, will grow with each user, agent, and enterprise application. If Qualcomm can lower cost and power consumption in this layer, it can find a niche even if it doesn’t lead in training.

Meta, Microsoft, and hyperscalers contextualize the announcement

The partnership with Meta is the most visible business signal. Qualcomm’s Dragonfly C1000 is planned to power part of Meta’s future server fleet, starting production in the second half of 2028. Mark Zuckerberg linked this collaboration to the infrastructure needed to deliver “personal superintelligence” at a global scale—an ambitious phrase but helpful to grasp the magnitude of Meta’s AI investments.

Reuters also reported that Microsoft will use Qualcomm’s HBC chips for AI workloads, and that the company has agreements with two unidentified hyperscalers for custom silicon. Qualcomm expects to generate $5 billion in data center revenue in FY2027 and $15 billion in 2029, according to the same source. For a company still closely associated with smartphones, these figures explain why data centers have become a strategic priority.
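To put those revenue targets in perspective, the implied growth rate can be worked out directly. This is a minimal sketch that assumes the two figures are exactly two fiscal years apart; the source says "FY2027" and "2029", so the spacing is an assumption, and the function name is illustrative.

```python
def implied_annual_growth(start_revenue: float,
                          end_revenue: float,
                          years: int) -> float:
    """Compound annual growth rate needed to go from start to end."""
    return (end_revenue / start_revenue) ** (1 / years) - 1


# $5 billion (FY2027) to $15 billion (2029), assumed two years apart.
growth = implied_annual_growth(5.0, 15.0, 2)
print(f"Implied compound annual growth: {growth:.1%}")
```

Tripling revenue in two years requires growth of roughly 73% per year, which indicates how aggressive the target is.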

| Partner or client | Strategic insight |
| --- | --- |
| Meta | Validates Dragonfly C1000 as a data center CPU |
| Microsoft | Introduces HBC into AI workloads, per Reuters |
| Unidentified hyperscalers | Reinforces custom silicon business |
| Hugging Face | Connects open models and developers with Qualcomm platforms |
| Modular | Enhances AI software, portability, and deployment |
| Over 35 partners | Indicates industry support from manufacturers, memory, networking, servers, and cloud |

The backing from over 35 companies—including names linked to servers, memory, networking, storage, and manufacturing—helps build a platform narrative. In data centers, no one wins alone. It takes boards, memory modules, racks, cooling systems, validation, networking, storage, integrators, and customers ready to deploy at scale.

The challenge: competing against well-established ecosystems

Qualcomm isn’t entering an empty market. NVIDIA has GPUs, Grace processors, networking, CUDA, NVLink, complete systems, and a massive community. AMD competes with EPYC and Instinct. Intel maintains Xeon processors and proprietary accelerators. Amazon, Google, and Microsoft have their own CPUs or accelerators. Broadcom and Marvell are gaining ground in custom ASICs. Cerebras, SambaNova, Groq, and others seek niche markets.

Qualcomm’s advantage may lie in efficiency, integration, and cost. Its experience with mobile devices has forced it for decades to prioritize performance per watt, efficient memory, and system-level design. However, data centers demand different standards: reliability, enterprise support, long validation cycles, software compatibility, security, observability, and years of seamless operation.

Acquiring Modular and expanding collaboration with Hugging Face help address some of these weaknesses. Qualcomm recognizes that hardware alone isn’t enough. If developers struggle to deploy models easily, frameworks underperform, or moving workloads from established GPUs requires too much effort, adoption will be limited.

Why this move matters

The Dragonfly announcement shows that AI infrastructure is shifting away from an exclusive focus on GPU-based training. Inference, agents, and hybrid deployments demand efficient CPUs, abundant memory, fast networks, specialized accelerators, and software capable of distributing load across them. AI data centers will become more heterogeneous.

For hyperscalers, heterogeneity reduces dependency, allows tailored hardware for specific workloads, and improves operational cost. For Qualcomm, it’s a chance to diversify revenues and break free from mobile market pressures. For NVIDIA and other incumbents, it’s another sign that competition is moving toward integrated systems, not just chips.

However, the timeline warrants caution. C1000 won’t arrive until 2028. AI300 also targets 2028. HBC Gen 1 will begin sampling in 2027. Before concluding that the market has shifted, independent benchmarks, actual availability, software stability, customer deployments, and total cost in production must be evaluated.

Qualcomm has laid out an ambitious roadmap with key clients and a clear thesis: agentic AI will be decided by tokens per watt. If it proves successful in real racks, Dragonfly could become more than just a rebrand—potentially Qualcomm’s serious entry into infrastructure sectors that have so far been out of reach.

Frequently Asked Questions

What is Qualcomm Dragonfly?
It’s Qualcomm’s new family of solutions for AI data centers, including CPUs, inference accelerators, HBC technology, connectivity, and custom silicon.

What does Dragonfly C1000 offer?
A data center CPU based on Oryon cores, with over 250 cores, chiplet design, PCIe Gen 7, CXL, and a focus on efficiency for AI workloads and agents.

What is High Bandwidth Compute or HBC?
Qualcomm’s technology combining compute and high-bandwidth memory in a 3D-stacked architecture to reduce data movement bottlenecks and improve token efficiency.

When will these products be available?
HBC Gen 1 with AI250 is scheduled for commercial sampling by mid-2027. Dragonfly C1000 and AI300 aim for availability or sampling in 2028, per the announced roadmap.

via: qualcomm
