← back to jason-chadwick.com

SWIPER: Minimizing Fault-Tolerant Quantum Program Latency via Speculative Window Decoding

(under review)

SWIPER: Minimizing Fault-Tolerant Quantum Program Latency via Speculative Window Decoding SWIPER: Minimizing Fault-Tolerant Quantum Program Latency via Speculative Window Decoding SWIPER: Minimizing Fault-Tolerant Quantum Program Latency via Speculative Window Decoding
Joshua Viszlai
University of Chicago
Jason D. Chadwick
University of Chicago
Sarang Joshi
University of Chicago
Gokul Subramanian Ravi
University of Michigan
Yanjing Li
University of Chicago
Frederic T. Chong
University of Chicago
Joshua Viszlai
University of Chicago
Jason D. Chadwick
University of Chicago
Sarang Joshi
University of Chicago
Gokul Subramanian Ravi
University of Michigan
Yanjing Li
University of Chicago
Frederic T. Chong
University of Chicago
Joshua Viszlai
University of Chicago
Jason D. Chadwick
University of Chicago
Sarang Joshi
University of Chicago
Gokul Subramanian Ravi
University of Michigan
Yanjing Li
University of Chicago
Frederic T. Chong
University of Chicago

Abstract


Real-time decoding is a key ingredient in future fault-tolerant quantum systems, yet many decoders are too slow to run in real time. Prior work has shown that parallel window decoding schemes can scalably meet throughput requirements in the presence of increasing decoding times, given enough classical resources. However, windowed decoding schemes require that some decoding tasks be delayed until others have completed, which can be problematic during time-sensitive operations such as T gate teleportation, leading to suboptimal program runtimes. To alleviate this, we introduce SWIPER, a speculative window decoding scheme. Taking inspiration from branch prediction in classical computer architecture, SWIPER utilizes a light-weight speculation step to predict data dependencies between adjacent decoding windows, allowing multiple layers of decoding tasks to be resolved simultaneously. Through a state-of-the-art compilation pipeline and a detailed open-source simulator, we find that SWIPER reduces application runtimes by 40% on average compared to prior parallel window decoders.

[.pdf]   [arXiv]  

Selected Figures


/assets/img/projects/swiper/01_hero.png

Figure 1. Decoding a lattice surgery T gate teleportation (top left) using prior parallel window decoders (middle) and using SWIPER (bottom). SWIPER improves decoding throughput with a lightweight speculation step, allowing dependent windows to begin decoding earlier

/assets/img/projects/swiper/04_predictor.png

Figure 4. Accuracy of the three different predictors for increasing values of code distance. Bars indicate a breakdown of different failure cases.

/assets/img/projects/swiper/06_predictor_logic.png

Figure 6. An overview of the final, 3-step predictor logic.

/assets/img/projects/swiper/09_pipeline.png

Figure 9. The pipeline used to evaluate high-level benchmark programs with different window decoders. Right: details of SWIPER-SIM and its performance scaling versus program size.

/assets/img/projects/swiper/10_msd.png

Figure 10. SWIPER-SIM program traces for a 15-to-1 magic state distillation in a distance-7 code using the construction from [24] (Fig. 17). Time advances vertically, and each horizontal slice represents a batch of syndrome data (colored by instruction type). Decoding time is fixed to be the same as the time to generate 2d rounds, about twice the window generation rate. Device traces are shown for (a) baseline parallel window method (15.1d QEC rounds) and (b) SWIPER aligned window (11.3d QEC rounds, a 25% improvement).

/assets/img/projects/swiper/11_sensitivity.png

Figure 11. Sensitivity of reaction time to decoding latency and speculation accuracy. Speculation accuracy of 0% corresponds to baseline (no speculation) decoding. Decoder latency is $t_{\text{dec}}(v) = rv/d^2$ rounds, where $v$ is the volume of the decoding problem (in units of $d^3$) and the relative latency factor $r$ is varied along the x axis. Equivalent latency factors extracted from linear fits to PyMatching latency data (Figure 3) are shown along the x axis. Right: simulator trace of 10 T gates on an idling logical qubit. A similar experiment with 1000 T gates was used to collect the data in this figure.

/assets/img/projects/swiper/12_alignment.png

Figure 12. Comparing a well-aligned T gate (top) and a poorly- aligned T gate (bottom).

/assets/img/projects/swiper/13_evaluation.png

Figure 13. Top: Program runtime for selected benchmarks under different window scheduling methods, with and without SWIPER. Classical processor resources are unlimited. Vertical black lines indicate standard deviation over 10 randomized trials for smaller benchmarks. Top right: Relative differences in runtime when using auto-limited processor count (Section 5.3.4) for benchmarks with runtime longer than $10^4 \mu s$. Bottom: Relative runtimes normalized to default parallel window runtime.

/assets/img/projects/swiper/15_classical.png

Figure 15. Comparing classical compute cost between baseline method and SWIPER.


My Contributions


  • Developed first-of-its-kind open-source lattice surgery decoding simulator SWIPER-SIM, including 3D visualization of program traces. Specified window decoding schemes for spatially-windowed lattice surgery operations, made many performance optimizations to maximize simulator speed, and wrote clean, documented code with an intention to open-source upon publication.
  • Ran simulations and generated figures.
  • Revised manuscript.

  • Things I Learned


  • Intentional software development: SWIPER-SIM consists of four main pieces that handle different parts of the simulation (such as the device or the decoder). We were careful about how we layed out the initial design of the simulator before starting to fill it in with code, and this was very helpful in ending up with a clean design at the end.
  • Presentation: A lot of work went into figuring out how best to present this paper to the target audience (computer architecture reviewers). I learned about how to effectively guide the (potentially short on time) reader through a complex topic by highlighting the key concepts and making certain parts of the paper as self-contained as possible.

  • ← back to jason-chadwick.com