CES 2026 kicks into gear: the show that aims to turn Artificial Intelligence into everyday infrastructure

This week, Las Vegas once again becomes the showcase where the tech industry tries to define the year. CES 2026 arrives not only with the usual deluge of laptops, TVs, and gadgets but with a more concrete ambition: to make Artificial Intelligence understood no longer as "a layer of software" but as a new infrastructure, stretching from data centers to developers' desktops.

In its preview of what to expect, the CES organization itself highlights that this edition will focus on trends that have left the lab: robotics ranging from show-floor demos to industrial deployments, digital health with real deployment intent, and an expanded space for creators. Additionally, CES Foundry debuts, a new format designed for business and tech conversations around AI and quantum computing. Simply put, the idea is for CES 2026 to look less like a catalog and more like a roadmap.

CES Foundry: When AI and Quantum Sit Down at the Table

One of the recurring messages ahead of the event is that AI is no longer measured solely by models, but by the infrastructure muscle needed to train, serve, and maintain those models without skyrocketing costs. Enter CES Foundry, envisioned as a meeting point to discuss investment, adoption, and scaling, not just product launches, with AI and quantum as central themes that cross all verticals. The messaging about what to expect from the event aligns with this: less abstract futurism, more industrial grounding.

NVIDIA Rubin: Six Chips and a Promise of an “AI Supercomputer”

In the data center realm, NVIDIA leverages CES to reinforce a familiar narrative: its next-generation product is sold not as just a GPU but as a complete platform.

The company presents Rubin as an “extreme co-design” architecture integrating six silicon components: Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch. Its declared goal is twofold: reduce training time and, more importantly, lower inference token costs, which are critical points for models with reasoning, agents, and long contexts.

In terms of numbers, NVIDIA talks about up to 10 times lower inference token costs compared to Blackwell and the ability to train Mixture-of-Experts models with four times fewer GPUs than before. For the industry, the implication is straightforward: if inference costs fall, more use cases become economically viable.
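The economics behind that claim can be sketched with simple arithmetic: cost per token falls out of amortized hardware cost, energy, and serving throughput. The following is a back-of-envelope sketch with entirely hypothetical numbers (system prices, power draw, and throughput are illustrative assumptions, not NVIDIA figures).

```python
# Illustrative sketch: how cost per million inference tokens falls out of
# amortized hardware cost, energy, and throughput. ALL numbers are
# hypothetical assumptions, not vendor figures.

def cost_per_million_tokens(tokens_per_sec: float,
                            system_cost_usd: float,
                            power_kw: float,
                            kwh_price_usd: float = 0.10,
                            amortization_years: float = 4.0) -> float:
    """Amortized USD per 1M generated tokens for one serving system."""
    seconds = amortization_years * 365 * 24 * 3600
    hw_per_sec = system_cost_usd / seconds            # capex per second
    energy_per_sec = power_kw * kwh_price_usd / 3600  # opex per second
    usd_per_token = (hw_per_sec + energy_per_sec) / tokens_per_sec
    return usd_per_token * 1_000_000

# Hypothetical comparison: a system 10x faster at the same cost and power
baseline = cost_per_million_tokens(tokens_per_sec=50_000,
                                   system_cost_usd=3_000_000,
                                   power_kw=120)
faster = cost_per_million_tokens(tokens_per_sec=500_000,
                                 system_cost_usd=3_000_000,
                                 power_kw=120)
print(f"baseline: ${baseline:.2f}/M tokens, faster: ${faster:.2f}/M tokens")
```

Under these assumptions, a tenfold throughput gain at fixed capex and power translates directly into a tenfold drop in cost per token, which is why vendors frame generational leaps in exactly these terms.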

NVIDIA also cites the technical underpinnings of the leap: sixth-generation NVLink for GPU-to-GPU communication at scale, a next-generation Transformer engine, Confidential Computing capabilities at rack level, and a focus on resilience and operability aimed at one of the major enemies of large clusters: performance loss due to failures and maintenance.

NVIDIA expects Rubin-based systems to be available in the second half of 2026, in partnership with manufacturers and major cloud providers.

DGX SuperPOD and the obsession with “rack as a compute unit”

Another key point in this narrative is scale deployment. The company emphasizes the concept of “AI factory”: data centers designed as factories for tokens and reasoning.

Within this framework, DGX SuperPOD is introduced as the “blueprint” to industrialize Rubin. NVIDIA outlines configurations that turn racks into almost indivisible units of computation and memory, with interconnection as the key component. The clear message: bottlenecks are no longer just computational—they now include networking, data movement, and daily cluster operation.

From Data Center to Desktop: DGX Spark and DGX Station

CES 2026 also highlights an interesting shift: local AI is regaining appeal, not as an absolute alternative to the cloud, but as a way to accelerate iteration, protect intellectual property, and reduce development friction.

NVIDIA introduces DGX Spark and DGX Station as "desktop supercomputers" for running advanced models locally. Spark is said to handle models with around 100 billion parameters, while Station targets larger scales: built on Grace Blackwell Ultra with coherent memory, it enables even more demanding models from a desktop environment, according to the company.

This aligns with growing demand: teams want to prototype and fine-tune models without relying solely on cloud GPU queues, with the option to scale later.
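Why roughly 100 billion parameters marks the practical ceiling for a desktop box comes down to memory: weight footprint scales linearly with parameter count and the bytes used per parameter, with KV cache and activations adding overhead on top. A rough sketch, using common precision and quantization figures as assumptions (the function and its numbers are illustrative, not NVIDIA specifications):

```python
# Rough sketch: weight-memory footprint of large models at common
# precisions. Bytes-per-parameter values are standard assumptions;
# KV cache and activation memory would come on top of these figures.

BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,  # half precision
    "int8": 1.0,       # 8-bit quantization
    "int4": 0.5,       # 4-bit quantization
}

def weight_memory_gb(n_params_billion: float, precision: str) -> float:
    """Approximate GB needed just for the model weights."""
    return n_params_billion * 1e9 * BYTES_PER_PARAM[precision] / 1e9

for prec in BYTES_PER_PARAM:
    gb = weight_memory_gb(100, prec)
    print(f"100B params @ {prec}: ~{gb:.0f} GB of weights")
```

At fp16 a 100B-parameter model needs around 200 GB just for weights, which is why coherent, large-capacity memory is the defining feature of these desktop systems, and why quantization is usually part of any local deployment story.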

AMD Responds with “Helios”: The Yottascale AI Platform

At CES 2026, AMD underscores its vision: “AI everywhere, for everyone”, spanning data centers, edge, and PCs.

The flagship announcement is a sneak peek of "Helios", a rack-scale platform described as a "blueprint" for yottascale AI infrastructure. AMD presents it as a system built around Instinct MI455X GPUs, EPYC "Venice" CPUs, and Pensando networking, all supported by the ROCm software ecosystem. The bold figure is up to 2.9 exaflops of AI performance per rack, emphasizing density and efficiency.

Simultaneously, AMD extends its Instinct family with the MI440X for on-premise enterprise deployments, and previews the upcoming MI500 series, scheduled for launch in 2027.

The message also extends to PCs: new Ryzen AI platforms with NPUs delivering 60 TOPS, and a public commitment of $150 million to promote AI adoption in education and communities.

MSI Translates Trends into Products: Laptops, Displays, Wi-Fi 7, and PCIe 5.0 SSDs

CES remains CES: consumer hardware is still present, but increasingly infused with AI as a cross-cutting theme.

MSI introduces a broad line under “Innovate Beyond,” including super-slim laptops aimed at productivity, gaming rigs boasting robust thermal margins, and accessories/components riding the connectivity and performance wave.

Highlights include a 16” gaming laptop capable of delivering up to 300 W combined CPU+GPU via an aggressive cooling system; ultrawide QD-OLED monitors with high refresh rates; and Wi-Fi 7 mesh solutions for home networks. Storage-wise, PCIe Gen 5 drives with high read/write speeds target creators and users seeking faster data transfer.

The overarching theme isn’t just raw performance but the impression that the PC is being redesigned to thrive alongside workflows that incorporate local or hybrid AI.

Memory Returns to the Center: HBM4 and LPDDR6 on the Horizon

This edition makes clear that AI isn’t powered solely by GPUs anymore. Memory and bandwidth are becoming strategic again.

SK hynix emerges as a key player, with reports indicating completion of HBM4 development featuring 2,048-bit interfaces and 10 GT/s speeds, exceeding JEDEC specifications. Its roadmap also emphasizes LPDDR6 and related technologies targeting servers and energy efficiency. For the industry, the message is clear: without memory leaps and better packaging, scaling advanced models becomes more expensive and less sustainable.
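The figures cited above imply a concrete per-stack bandwidth: peak bandwidth is simply interface width times transfer rate, divided by eight bits per byte. A minimal sketch of that arithmetic (the function name is our own; the inputs are the 2,048-bit and 10 GT/s figures from the report):

```python
# Back-of-envelope: per-stack bandwidth implied by the HBM4 figures above.
# Peak bandwidth = bus width (bits) x transfer rate (GT/s) / 8 bits-per-byte.

def hbm_bandwidth_gbps(bus_width_bits: int, gt_per_sec: float) -> float:
    """Peak bandwidth in GB/s for one memory stack."""
    return bus_width_bits * gt_per_sec / 8

per_stack = hbm_bandwidth_gbps(2048, 10.0)  # HBM4 figures cited above
print(f"~{per_stack:.0f} GB/s per HBM4 stack")  # ~2560 GB/s
```

At those numbers, a single stack moves roughly 2.5 TB/s, which is why memory, not just compute, increasingly dictates how fast accelerators can actually run.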

Robotics, Digital Health, and Creators: CES Focuses on Practicality, Not Just Power

Apart from raw compute, CES emphasizes that 2026 will be an edition where technology is more "applied": robotics linked to safety, sustainability, and productivity; digital health with tangible products; and an expanded Creator Space reflecting that content, and the tools to produce it, now sit at the economic core of the industry.

Overall, CES 2026 paints a coherent picture: AI is driving a comprehensive reorganization of the stack, from chips and networks to trade show formats, spaces, demos, and corporate messaging. Above all, a key idea threads through nearly all announcements: efficiency (cost per token, performance per watt, operational ease) is beginning to matter as much as raw power.


Frequently Asked Questions (FAQ)

What is CES Foundry, and why is it mentioned so much at CES 2026?
It’s a new format within CES focused on business and tech discussions around AI and quantum computing, designed to connect innovation with real-world adoption in industry and investment.

What does it mean that NVIDIA Rubin is “a platform of six chips” and not just a GPU?
It means NVIDIA integrates CPU, GPU, networking, DPU, and switching as a co-designed system aimed at reducing costs and accelerating large-scale training and inference.

Why is “cost per token” inference so frequently discussed with this generation of hardware?
Because models with reasoning, agents, and extensive contexts consume many more tokens. If token costs are high, some use cases may no longer be economically feasible, even if the models technically function.

What does HBM4 bring compared to previous generations in AI infrastructure?
HBM4 aims to increase bandwidth and high-performance memory efficiency, critical for feeding accelerators and avoiding bottlenecks as models grow in size and data demands.

via: ces.tech
