IBM accelerates quantum error correction: real-time decoding on AMD FPGAs “10 times faster than needed” fuels the Starling project

IBM is preparing a new leap in quantum computing that, this time, comes not from the cryostat but from the classical hardware that must work hand-in-hand with the quantum processor. According to IBM’s own team, the company has executed quantum error correction in real time on standard AMD FPGAs, achieving performance ten times higher than what its immediate goal requires: maintaining stable logical qubits in quantum memory. The result clears one of the most significant bottlenecks on the road to fault tolerance, turning decoding from a slow, costly burden into an agile coprocess running alongside the quantum chip.

The result doesn’t come out of nowhere. In late August, IBM and AMD announced a strategic collaboration to explore quantum-centric supercomputing architectures, combining AMD CPUs, GPUs, and FPGAs with IBM’s modular quantum systems. A key element of this effort is the Relay-BP decoder, an improved belief propagation (BP) algorithm designed for quantum LDPC (qLDPC) codes that meets four hard-to-reconcile requirements: flexibility, compactness, speed, and accuracy. Implementing it on FPGAs was the logical step; turning it into real-time hardware was the challenge.

Why Quantum Error Correction Needs an Ultra-Fast “Classical Half”

Physical qubits are fragile: decoherence, gate noise, imperfect readout, unwanted couplings… To protect information, quantum error correction groups many physical qubits into a code that defines more robust logical qubits. But the code itself doesn’t correct anything: it requires measuring syndromes (collective properties that reveal the footprint of an error) and passing those syndromes through a decoder that infers the most likely error pattern and prescribes a correction before the next round of quantum gates makes things worse.
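To make that loop concrete, here is a deliberately tiny sketch, not IBM’s pipeline and far simpler than any qLDPC code: a 3-qubit bit-flip repetition code whose two parity checks produce a syndrome, and a lookup-table “decoder” that turns the syndrome into a correction. All names and numbers below are illustrative.

```python
# Minimal illustration of the classical half of error correction: measure
# syndromes, infer the most likely error, prescribe a correction.
import numpy as np

# Parity-check matrix H: each row is one syndrome bit (Z0Z1 and Z1Z2 parities).
H = np.array([[1, 1, 0],
              [0, 1, 1]])

# Lookup decoder: syndrome pattern -> most likely single-qubit bit flip.
SYNDROME_TO_ERROR = {
    (0, 0): np.array([0, 0, 0]),  # no error detected
    (1, 0): np.array([1, 0, 0]),  # flip on qubit 0
    (1, 1): np.array([0, 1, 0]),  # flip on qubit 1
    (0, 1): np.array([0, 0, 1]),  # flip on qubit 2
}

def decode(error: np.ndarray) -> np.ndarray:
    """Compute the syndrome of a physical error and return the correction."""
    syndrome = tuple(H @ error % 2)
    return SYNDROME_TO_ERROR[syndrome]

error = np.array([0, 1, 0])                 # a bit flip strikes qubit 1
correction = decode(error)
assert np.array_equal((error + correction) % 2, np.zeros(3))  # error undone
```

Real codes replace the lookup table with an algorithm such as BP, because the number of possible syndromes grows exponentially with the code size.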

This is where classical hardware comes into play. If the decoder isn’t accurate, the logical qubit degrades; if it isn’t fast, syndrome backlogs build up; if it isn’t compact, scaling to hundreds or thousands of qubits becomes impossible. The tension between accuracy and latency has been a longstanding headache in the field. IBM tackles it with Relay-BP, which starts from the traditional BP approach (a message-passing algorithm in which the nodes of a graph “talk” to one another) and adds heterogeneous, dynamic memory parameters to avoid the cycles and symmetries that trap BP in wrong solutions.

Relay-BP: The “Disordered Memory” Idea That Unlocks BP for qLDPC

In standard BP, all nodes update their beliefs with a uniform rule: every message is weighted equally and every node remembers the same amount. This works for many problems, but it oscillates or converges poorly on qLDPC (quantum low-density parity-check) codes. Relay-BP introduces control “knobs”: each node gets a different memory strength (it can remember more or less, even negatively, letting it “forget” wrong decisions), and these strengths are varied to break local symmetries. In addition, it chains runs with different memory configurations (the “relay”) to refine a solution without restarting the algorithm from scratch.
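As a rough illustration of that idea, the sketch below runs a toy min-sum BP syndrome decoder in which a per-variable memory strength is blended into every belief update. The parity-check matrix, the gamma values, and the schedule are invented for illustration; the actual Relay-BP algorithm and its parameters come from IBM’s preprint, not from this snippet.

```python
# Toy sketch of memory-augmented min-sum BP for syndrome decoding, loosely
# inspired by the Relay-BP idea described above. NOT IBM's implementation:
# the code, gammas, and schedule are illustrative only.
import numpy as np

def memory_bp_sketch(H, syndrome, prior_llr, gammas, iters=50):
    """Try to find an error pattern e with H @ e % 2 == syndrome."""
    m, n = H.shape
    # Variable-to-check messages, initialised with the channel prior.
    V = np.where(H == 1, prior_llr[None, :], 0.0)
    posterior = prior_llr.copy()
    for _ in range(iters):
        # Check-node update (min-sum); a nonzero syndrome bit flips the sign.
        C = np.zeros((m, n))
        for i in range(m):
            cols = np.flatnonzero(H[i])
            for j in cols:
                others = cols[cols != j]
                sign = (-1) ** syndrome[i] * np.prod(np.sign(V[i, others]))
                C[i, j] = sign * np.min(np.abs(V[i, others]))
        # Variable-node update with per-node memory: the previous posterior is
        # blended back in with strength gamma_j (a negative gamma "forgets").
        posterior = gammas * posterior + (1.0 - gammas) * prior_llr + C.sum(axis=0)
        V = np.where(H == 1, posterior[None, :] - C, 0.0)
        e = (posterior < 0).astype(int)
        if np.array_equal(H @ e % 2, syndrome):
            return e  # syndrome explained
    return e  # best effort after the iteration budget

# Tiny classical example: repetition-code checks, single flip on bit 1.
H = np.array([[1, 1, 0], [0, 1, 1]])
e = memory_bp_sketch(H, syndrome=np.array([1, 1]),
                     prior_llr=np.full(3, 2.0),
                     gammas=np.array([0.1, 0.3, -0.2]))
print(e)  # expected: [0 1 0]
```

The “relay” step would chain several such runs with different gamma sets, reusing the beliefs of one leg as the starting point of the next; that chaining is omitted here for brevity.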

The outcome IBM reports is a decoder that beats the traditional BP+OSD (the classic but costly method) on accuracy while matching, or even improving on, the speed of pure BP. Crucially, it fits into FPGAs and ASICs with a small footprint, enabling the real-time decoding needed to sustain logical-qubit operation. That compactness was essential for moving from theory to hardware.

What’s New Now? AMD FPGAs and Decoding 10× Faster

The new step out of the lab is the implementation on AMD FPGAs and the demonstration that the decoder, embedded in reconfigurable logic, not only keeps pace with the quantum experiment but does so comfortably: 10 times faster than the threshold required for the intermediate proof-of-concept goal IBM has set (quantum memory with real-time decoding). The number isn’t marketing hype; it buys timing margin for pipelines, readout latencies, cryogenic communication and, above all, for scaling to more qubits and wider codes.

This achievement aligns with IBM’s roadmap: in the short term, testing real-time decoding in quantum memory; mid-term, applying it to logical processing; and, by the end of the decade, developing fault-tolerant architectures within its Starling/Kookaburra ecosystem and the Quantum System Two platform, modular and designed to integrate classical and quantum computing as a quantum-centric supercomputer.

Why AMD (and Why FPGAs)

FPGAs are the ideal bridge between prototype and product: they enable massive parallelism, low latency, deterministic timing, and are reconfigurable for iterative designs. For qLDPC decoding, where message passing between nodes and parallel updates are essential, FPGAs fit perfectly. AMD also provides a comprehensive stack — EPYC CPUs, Instinct GPUs, and Xilinx FPGAs — already powering exascale systems like Frontier and El Capitan. This HPC/AI DNA streamlines integration of hybrid pipelines where the decoder shares memory and network with quantum simulation, control AI, and orchestration.

The IBM-AMD collaboration announced in August emphasized this convergence: quantum + HPC + AI in hybrid flows where each part does what it does best (qubits for simulating matter and reactions, classical computing for optimization, AI for estimation and calibration). Error correction is the silent glue that keeps the whole thing running smoothly hour after hour, preventing noise from derailing the process.

What This Milestone Makes Possible — and What It Still Doesn’t

  • Yes: enables quantum memory experiments with real-time decoding on standard hardware, a key stepping stone toward fault-tolerant logic.
  • Yes: suggests scalability; if there’s 10× margin now, this can translate in the future to more qubits, more complex codes, or stricter latencies without a complete redesign.
  • No: it still doesn’t mean a full fault-tolerant processor. IBM clarifies: this study focuses on memory; adding logic increases width and complexity, and the decoding hardware must be even more compact to match gate frequencies.

Nevertheless, closing the loop of syndrome reading → decoding → correction in real time with ample margin is precisely the incremental progress the industry needs to move from “possible” to “operational”.


Roadmap Table: Error Correction & Quantum-Centric Supercomputing by IBM

Date | Milestone | Technical Details | Relevance
Aug 2025 | IBM & AMD announce collaboration | Integrate AMD CPUs/GPUs/FPGAs with IBM Quantum for hybrid flows; focus on real-time decoding | Positions AMD alongside IBM on the road to fault tolerance
Jul 2025 | Relay-BP preprint | qLDPC decoder based on BP with disordered memories and a relay mode; 10× more accurate than BP+OSD in tests, fast and compact | First viable decoder for FPGAs/ASICs in real time
Oct 2025 | Decoding on AMD FPGAs | QEC executed 10× faster than needed on standard AMD FPGAs; goal: quantum memory with real-time decoding | Eliminates a practical barrier for Starling
2026 (expected) | Kookaburra: device testing | Testing decoders under real noise; moving toward fault-tolerant logic | Transition from “algorithm” to “operating system”
2030 (vision) | Quantum-centric supercomputing | Seamless integration of quantum + classical + AI; fault tolerance | Ambition for practical utility at large scale

What Does “10× Faster Than Needed” Mean?

In real-time QEC, the minimum requirement is to decode each syndrome before the next batch of measurements arrives; otherwise, queues form, the effective correction latency grows, and the logical qubit degrades. “10×” means decoding in a fraction of the available time budget, leaving margin for I/O, buffers, telemetry, and experiment variability. That cushion is what turns a demo into a building block.
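A back-of-the-envelope sketch of that budget logic, with invented numbers (the article does not give the real round times or decoder latencies): if the per-round decode time exceeds the syndrome period, the backlog grows without bound, while a 10× margin leaves most of the budget free.

```python
# Illustrative only: hypothetical round and decode times, not IBM's figures.
def backlog_after(rounds: int, round_time_us: float, decode_time_us: float) -> float:
    """Pending decoding work (in microseconds) after `rounds` syndrome rounds."""
    return max(0.0, rounds * (decode_time_us - round_time_us))

round_time_us = 1.0  # assumed syndrome-extraction period
for decode_time_us in (0.1, 0.9, 1.5):
    lag = backlog_after(1_000_000, round_time_us, decode_time_us)
    margin = round_time_us / decode_time_us
    print(f"decode={decode_time_us:0.1f} us  margin={margin:0.1f}x  "
          f"backlog after 1e6 rounds={lag:,.0f} us")
# A 10x margin (decode=0.1 us) keeps the queue empty with room to spare;
# anything slower than the round time diverges.
```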

Implications for IBM’s Roadmap (Starling) and the Industry

  • IBM: it fits the modular Quantum System Two and its hybrid integration. Relay-BP may not be the final decoder, but it marks an iteration path that is already moving off the PowerPoint slides and into hardware.
  • AMD: strengthens its role as a classical hardware partner in the quantum ecosystem; FPGAs and potentially GPUs for simulation/AI near the qubit.
  • Ecosystem: validates the hybrid paradigm (QPU + HPC/AI) and adds pressure to standardize interfaces (e.g., Qiskit) and co-design toolchains for algorithms and hardware.

FAQs

What is Relay-BP and why is it important for quantum error correction?
Relay-BP is a belief-propagation-based decoder that introduces disordered memories (including negative) and a relay mode chaining executions to escape traps and oscillations typical of BP in qLDPC. It is accurate, fast, and compact, fitting into FPGAs/ASICs for real-time operation.

Why use AMD FPGAs for decoding? Wouldn’t a CPU or GPU suffice?
qLDPC message-passing decoding demands low latency and fine-grained parallelism with deterministic timing; FPGAs are ideal. Additionally, AMD provides a comprehensive stack — EPYC CPUs, Instinct GPUs, and Xilinx FPGAs — already powering exascale systems like Frontier. This HPC/AI heritage facilitates integrating hybrid pipelines where the decoder shares memory and network with quantum simulation, control AI, and orchestration.

Does “10× faster than needed” mean we already have fault tolerance?
Not yet. The milestone covers quantum memory with real-time decoding. Achieving fault-tolerant logic requires hardware decoders that handle wider codes while becoming even more compact. IBM anticipates device tests in 2026 and continues iterating toward the Starling architecture.

What does the IBM-AMD partnership contribute beyond this decoder?
They are exploring a quantum-centric fabric combining QPU + CPUs/GPUs/FPGAs for hybrid algorithms (quantum simulation + AI/HPC). Real-time QEC is a key piece, but the goal is a full system capable of solving problems beyond the reach of classical computing alone.

References: Tom’s Hardware, IBM Newsroom, arXiv, and IBM
