Akamai, a company historically associated with CDNs and content delivery, wants to be read differently in 2026: GPUs, inference, and distributed artificial intelligence. The company has announced the acquisition of thousands of NVIDIA Blackwell GPUs with the aim of building one of the most widely distributed AI platforms in the world, integrated into its cloud infrastructure and global network.
This move comes at a time when the market is beginning to accept an uncomfortable reality: the first wave of AI focused on training models in large, centralized “factories,” but the bottleneck is shifting elsewhere. Inference (using the model in production) matters just as much as training, and in production, latency, traffic costs, and data locality weigh more than the latest “benchmark.”
From “training in one place” to “responding anywhere”
Akamai’s approach relies on an idea that until recently sounded more like network architecture than AI: treating the planet as a low-latency backplane. Instead of concentrating execution in a handful of massive regions, the company proposes a unified platform that routes inference loads toward optimized compute resources within its own distributed footprint.
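To make the "route to the nearest compute" idea concrete, here is a minimal latency-aware routing sketch. It is purely illustrative: the endpoint names, ports, and selection logic are assumptions made for the example, not Akamai's actual platform or API.

```python
import socket
import time

# Hypothetical inference endpoints spread across regions (placeholders only).
ENDPOINTS = [
    ("inference-eu-west.example.net", 443),
    ("inference-us-east.example.net", 443),
    ("inference-ap-south.example.net", 443),
]

def measure_rtt(host: str, port: int, timeout: float = 1.0) -> float:
    """Rough round-trip estimate: time a single TCP handshake to the endpoint."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        return float("inf")  # unreachable endpoints are never chosen

def pick_lowest_latency_endpoint() -> tuple[str, int]:
    """Send the inference request to whichever endpoint answers fastest."""
    return min(ENDPOINTS, key=lambda ep: measure_rtt(*ep))

if __name__ == "__main__":
    host, port = pick_lowest_latency_endpoint()
    print(f"Routing inference traffic to {host}:{port}")
```

A production routing layer would rely on DNS or anycast and continuous measurements rather than per-request probes, but the principle being sold is the same: get the request to the nearest healthy GPU instead of hauling it to a distant region.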
This approach aims to address two classic challenges of the centralized cloud when scaling AI in production (a rough back-of-envelope sketch follows the list):
- Latency: When responses need to happen in real time (or as close to it as possible), physical distance once again becomes critical.
- Data egress costs: Moving data to and from centralized data centers can be expensive, especially with large volumes or sovereignty requirements.
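A rough back-of-envelope calculation makes both pressures tangible. All constants below are illustrative assumptions (roughly two-thirds of the speed of light for signals in optical fiber, and a per-gigabyte egress price in the range hyperscalers typically publish), not figures from Akamai's announcement.

```python
# Illustrative assumptions only; none of these numbers come from the announcement.
FIBER_SPEED_KM_S = 200_000      # ~2/3 of the speed of light, typical for optical fiber
EGRESS_PRICE_USD_PER_GB = 0.09  # assumed hyperscaler-style list price for data egress

def rtt_floor_ms(distance_km: float) -> float:
    """Lower bound on round-trip time imposed by distance alone (no processing)."""
    return 2 * distance_km / FIBER_SPEED_KM_S * 1000

def monthly_egress_usd(gb_per_day: float) -> float:
    """Monthly egress bill for a steady daily transfer volume."""
    return gb_per_day * 30 * EGRESS_PRICE_USD_PER_GB

# A user 6,000 km from a centralized region vs. 100 km from a nearby location:
print(f"RTT floor at 6,000 km: {rtt_floor_ms(6000):.0f} ms")   # ~60 ms
print(f"RTT floor at   100 km: {rtt_floor_ms(100):.1f} ms")    # ~1 ms

# Shipping 500 GB of features and results per day to a distant region:
print(f"Monthly egress at 500 GB/day: ${monthly_egress_usd(500):,.0f}")  # ~$1,350
```

Neither figure is dramatic in isolation, but multiplied across interactive users and continuous data flows they are exactly the two line items a distributed footprint tries to shrink.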
Akamai backs this argument with a figure cited in its announcement: MIT Technology Review reportedly found that 56% of organizations identify latency as the main barrier to deploying AI at scale. On that premise, the company positions itself as a “decentralized nervous system” that takes models from the lab to the real world, where the data “lives” and where the return on investment is actually calculated.
What exactly will be deployed: Blackwell for inference, local fine-tuning, and post-training
Akamai has not revealed the exact number of accelerators, but it has stressed that there will be “thousands” and that the chips have already been acquired. The platform is designed to cover several phases of the model lifecycle, not just serving responses (a minimal sketch of the differences follows the list):
- High-performance and predictable inference, executed on dedicated GPU clusters for rapid responses.
- Localized fine-tuning, to optimize models close to the data, with clear implications for privacy and regional compliance.
- Post-training, to adapt foundational models with proprietary data and improve accuracy in specific tasks.
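As a minimal illustration of how these phases differ in practice, the PyTorch-style sketch below contrasts serving (no weight updates) with fine-tuning or post-training (weights adjusted on data that stays local). The model and data are placeholders; none of this reflects Akamai's actual software stack.

```python
import torch
from torch import nn

model = nn.Linear(128, 10)            # placeholder for a foundation model
local_batch = torch.randn(32, 128)    # placeholder for proprietary, local data
local_labels = torch.randint(0, 10, (32,))

# Inference: serve responses. No gradients, no weight updates.
model.eval()
with torch.no_grad():
    predictions = model(local_batch).argmax(dim=1)

# Fine-tuning / post-training: adjust weights on data that never leaves the site.
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
for _ in range(3):                    # a few illustrative optimization steps
    optimizer.zero_grad()
    loss = loss_fn(model(local_batch), local_labels)
    loss.backward()
    optimizer.step()
```

The privacy and compliance point above follows directly from the second block: the training data stays where the optimizer runs, and only the adjusted weights (or adapters) need to move.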
Under the hood, Akamai describes a combination that aligns with the infrastructure NVIDIA is pushing for the Blackwell era: NVIDIA RTX PRO servers with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, along with NVIDIA BlueField-3 DPUs, integrated into Akamai’s distributed cloud.
4,400 locations: the “non-obvious” advantage over hyperscalers
Akamai emphasizes a number that, in its case, is not just marketing fluff: its global network spans more than 4,400 locations. In practice, that reach delivers something hyperscalers have been chasing for years with “edge regions,” with one key difference: Akamai’s infrastructure has historically been designed to sit close to the end user.
The company frames this strategy as one more step in its transition from “CDN + security” to “distributed cloud.” It is not an improvised shift: in 2022 it acquired Linode for approximately $900 million, a move seen as the key step toward building a general-purpose computing base on which to layer higher-value services.
And this is where AI comes in: Akamai argues that the market has reached a point where the big leap isn’t just “having the best model,” but making the model work with minimal latency and reasonable costs in real-world environments. In this context, the company positions its platform as a global AI computing grid optimized for inference.
A step beyond Inference Cloud: more GPUs, more ROI pressure
The announcement ties into previous initiatives. Akamai had already introduced its Inference Cloud, and later expanded with NVIDIA infrastructure aimed at bringing inference closer to users. In its messaging, the company claimed latency improvements of up to 2.5 times and cost savings of up to 86% for inference compared to traditional hyperscale infrastructure—a comparison that, as always, depends on workload type, location, and traffic patterns.
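To put those headline figures in perspective, the quick arithmetic below applies them to an invented baseline (100 ms median latency and $10,000 per month of inference spend). The baseline is an assumption for illustration, and the "up to" qualifiers matter.

```python
# Invented baseline for illustration; only the 2.5x and 86% come from Akamai's messaging.
baseline_latency_ms = 100.0
baseline_monthly_cost_usd = 10_000.0

best_case_latency_ms = baseline_latency_ms / 2.5                      # up to 2.5x faster -> 40 ms
best_case_monthly_cost_usd = baseline_monthly_cost_usd * (1 - 0.86)   # up to 86% cheaper -> $1,400

print(f"Latency: {baseline_latency_ms:.0f} ms -> {best_case_latency_ms:.0f} ms (best case)")
print(f"Cost:    ${baseline_monthly_cost_usd:,.0f}/mo -> ${best_case_monthly_cost_usd:,.0f}/mo (best case)")
```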
The industry takeaway is clear: companies want AI in production, but they also want governance, predictable costs, and fast responses. Akamai reports seeing “strong demand” in the initial rollout of the RTX PRO 6000 Blackwell Server Edition and plans to keep expanding GPU capacity as part of its cloud strategy.
The emerging battle: edge as an AI platform
Ultimately, Akamai isn’t just competing for GPUs. It’s competing on architecture: the idea that operational AI, agentic AI, and “physical AI” (robots, logistics, industry, healthcare) will need ultra-low-latency decisions and data that, because of cost or regulation, cannot always travel to centralized regions.
In this scenario, Akamai aims to become a viable alternative for a specific market segment: distributed inference at scale, with localized fine-tuning and deployments that prioritize proximity and compliance. If it succeeds, it will be another sign of where 2026 is heading: AI stops being only a modeling problem and becomes, squarely, an infrastructure problem.
Frequently Asked Questions
What does “inference” mean in Artificial Intelligence, and why is it so important in 2026?
Inference means using a trained model to generate responses in production. In 2026 it is especially important because that is where real user experiences (and ROI) are created, and where latency and traffic costs hurt the most.
What advantage does a distributed AI platform have over a centralized cloud?
It reduces the distance between compute and the user or data, which can cut latency and data egress costs. It also makes it easier to meet data residency requirements when processing can happen closer to the data source.
What GPUs will Akamai use, and for what kinds of workloads?
Akamai has announced the deployment of thousands of NVIDIA Blackwell GPUs, in NVIDIA RTX PRO servers with RTX PRO 6000 Blackwell Server Edition GPUs, aimed at inference, localized fine-tuning, and post-training.
Is Akamai shifting from a CDN to an AI cloud?
Not exactly: rather than abandoning its roots, it is leveraging them. Its global network, originally built for content distribution, is the foundation for bringing compute closer to thousands of locations and running low-latency inference.
via: akamai

