Karpathy-style autoresearch ratchet for encoded magic-state preparation on the [[4,2,2]] quantum error-detecting code
saymrwulf e13a3268c2 Add teaching notebooks, widget-based quizzes, bug fixes, and expanded tests
- 8 Jupyter notebooks across 3 learning plans (A: bottom-up, B: spiral, C: parallel tracks)
- Teaching toolkit (src/autoresearch_quantum/teaching/) with ipywidgets-based
  quiz, predict_choice, reflect, and order widgets — visually distinct from code cells
- Fix spectator_z operator: was {1:'Z',2:'Z'} (IZZI, expectation=0), now {1:'Z',3:'Z'}
  (ZIZI, expectation=+1 for ideal T-state, commutes with logical operators)
- Fix u_magic seed: swap phase arguments to match h_p and ry_rz preparations
- Fix double-display bug: widgets rendered twice when function returned the box
- Fix CLI override parser for negative integers and missing '=' validation
- Fix stabilizer detection quiz: ZZZZ detects X errors, not Z errors
- Add ties parameter to order() for questions with interchangeable items
- Expand test suite from 21 to 107 tests
- Update README with notebook instructions and project tree
2026-04-07 17:14:37 +02:00
configs/rungs Initial commit: autoresearch-quantum — automated magic-state preparation ratchet 2026-04-05 12:37:39 +02:00
notebooks Add teaching notebooks, widget-based quizzes, bug fixes, and expanded tests 2026-04-07 17:14:37 +02:00
paper Add technical paper: 19-page LaTeX document with compiled PDF 2026-04-05 12:37:39 +02:00
scripts Add teaching notebooks, widget-based quizzes, bug fixes, and expanded tests 2026-04-07 17:14:37 +02:00
src/autoresearch_quantum Add teaching notebooks, widget-based quizzes, bug fixes, and expanded tests 2026-04-07 17:14:37 +02:00
tests Add teaching notebooks, widget-based quizzes, bug fixes, and expanded tests 2026-04-07 17:14:37 +02:00
.gitignore Initial commit: autoresearch-quantum — automated magic-state preparation ratchet 2026-04-05 12:37:39 +02:00
pyproject.toml Add teaching notebooks, widget-based quizzes, bug fixes, and expanded tests 2026-04-07 17:14:37 +02:00
README.md Add teaching notebooks, widget-based quizzes, bug fixes, and expanded tests 2026-04-07 17:14:37 +02:00
THE_STORY.md Initial commit: autoresearch-quantum — automated magic-state preparation ratchet 2026-04-05 12:37:39 +02:00

Autoresearch Quantum

autoresearch-quantum is a Python research harness for a Karpathy-style autoresearch ratchet in quantum experiments:

  • keep an incumbent experiment
  • generate challenger experiments
  • screen challengers on a cheap tier
  • promote only justified challengers to an expensive tier
  • replace the incumbent only when the challenger wins on the final criterion
  • log every ratchet step
  • extract a transferable lesson at the end of each rung
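
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the harness's actual implementation: the `Candidate` class, the `ratchet_step` function, and the additive interpretation of the promotion margin are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for an experiment spec with precomputed scores."""
    name: str
    cheap_score: float   # score on the cheap tier
    final_score: float   # score on the final criterion


def ratchet_step(incumbent: Candidate, challengers: Iterable[Candidate],
                 cheap_margin: float, log: list) -> Candidate:
    """Run one ratchet step and return the (possibly new) incumbent."""
    best = incumbent
    for challenger in challengers:
        # Cheap-tier screen: only challengers that clear the margin move on.
        # (Assumption: the margin is additive.)
        promoted = challenger.cheap_score > incumbent.cheap_score + cheap_margin
        # Final criterion: replace the incumbent only on a real win.
        wins = promoted and challenger.final_score > best.final_score
        # Every decision is logged, whether or not it changes the incumbent.
        log.append({"challenger": challenger.name, "promoted": promoted, "wins": wins})
        if wins:
            best = challenger
    return best


log = []
incumbent = Candidate("baseline", cheap_score=0.50, final_score=0.40)
challengers = [
    Candidate("a", cheap_score=0.51, final_score=0.90),  # fails the cheap screen
    Candidate("b", cheap_score=0.60, final_score=0.35),  # promoted, loses the final
    Candidate("c", cheap_score=0.58, final_score=0.45),  # promoted, wins
]
new_incumbent = ratchet_step(incumbent, challengers, cheap_margin=0.05, log=log)
```

Note that challenger "a" never replaces the incumbent even though its final score is the highest: it was not justified by the cheap tier, so it is never evaluated on the expensive one.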

The first built-in experiment family targets encoded magic-state preparation in the [[4,2,2]] code with Qiskit. The framework is designed so the [[4,2,2]] rung is not the destination. It is the first rung in a ladder that shifts from best-circuit hunting toward reusable design rules for larger encoded workflows.

Project Tree

autoresearch-quantum/
├── configs/rungs/
│   ├── rung1.yaml          Baseline: what recipe works?
│   ├── rung2.yaml          Stability under noise variation
│   ├── rung3.yaml          Transfer across backends
│   ├── rung4.yaml          Factory throughput / cost
│   └── rung5.yaml          Rosenfeld direction
├── src/autoresearch_quantum/
│   ├── cli.py              CLI entry point
│   ├── config.py           YAML config loader
│   ├── models.py           All data structures
│   ├── codes/
│   │   └── four_two_two.py [[4,2,2]] stabilisers, encoder, seed gates
│   ├── experiments/
│   │   └── encoded_magic_state.py  Circuit bundle builder
│   ├── execution/
│   │   ├── analysis.py     Postselection, witness, stability
│   │   ├── backends.py     Backend resolution
│   │   ├── hardware.py     IBM hardware executor
│   │   ├── local.py        Aer noise simulation executor
│   │   ├── transfer.py     Cross-backend transfer evaluator
│   │   └── transpile.py    Transpilation utilities
│   ├── lessons/
│   │   ├── extractor.py    Human-readable lesson extraction
│   │   └── feedback.py     Machine-readable rules + search narrowing
│   ├── persistence/
│   │   └── store.py        JSON file store with resumability
│   ├── ratchet/
│   │   └── runner.py       AutoresearchHarness orchestrator
│   ├── scoring/
│   │   └── score.py        WAC + factory throughput scorers
│   └── search/
│       ├── challengers.py  Neighbour generation with dedup
│       └── strategies.py   NeighborWalk, RandomCombo, LessonGuided
├── paper/
│   ├── autoresearch_quantum.tex   Full technical paper (LaTeX)
│   └── autoresearch_quantum.pdf   Compiled PDF (19 pages)
├── notebooks/
│   ├── plan_a/              Bottom-up: 3 sequential notebooks
│   │   ├── 01_encoded_magic_state.ipynb
│   │   ├── 02_measuring_progress.ipynb
│   │   └── 03_the_ratchet.ipynb
│   ├── plan_b/              Spiral: 1 notebook, three passes
│   │   └── spiral_notebook.ipynb
│   └── plan_c/              Parallel tracks + dashboard
│       ├── 00_dashboard.ipynb
│       ├── track_a_physics.ipynb
│       ├── track_b_engineering.ipynb
│       └── track_c_search.ipynb
├── tests/                   107 tests
│   ├── test_analysis.py
│   ├── test_cli.py
│   ├── test_codes.py
│   ├── test_config.py
│   ├── test_experiments.py
│   ├── test_feedback.py
│   ├── test_harness.py
│   ├── test_persistence.py
│   └── test_scoring.py
├── THE_STORY.md             Narrative documentation
├── pyproject.toml
└── README.md

Scientific Framing

What is optimized

The harness optimizes an experiment, not just a circuit. A spec includes:

  • logical magic-seed construction
  • encoder realization
  • verification strategy
  • postselection rule
  • ancilla strategy
  • transpilation choices
  • backend target and noise proxy
  • shot and repeat allocation

What is measured

The default score is:

score = (usable_magic_quality * acceptance_rate) / total_cost

with a configurable usable_magic_quality assembled from:

  • noisy encoded fidelity proxy
  • logical magic witness
  • codespace survival / postselection success
  • stability under repeated noisy evaluation
  • spectator logical alignment

and a configurable total_cost assembled from:

  • two-qubit gate count
  • transpiled depth
  • total shots consumed
  • runtime proxy
  • hardware queue proxy
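
As a rough illustration of the formula above, the sketch below assembles both the quality and the cost as weighted sums. The weighted-sum form, the term names, and the weights are assumptions chosen for the example; the actual scorers in score.py may combine the terms differently.

```python
def weighted_sum(values: dict, weights: dict) -> float:
    """Combine named terms using the given weights (missing weights count as 0)."""
    return sum(weights.get(name, 0.0) * value for name, value in values.items())


def score(quality_terms: dict, quality_weights: dict,
          cost_terms: dict, cost_weights: dict,
          acceptance_rate: float) -> float:
    """score = (usable_magic_quality * acceptance_rate) / total_cost."""
    quality = weighted_sum(quality_terms, quality_weights)
    cost = weighted_sum(cost_terms, cost_weights)
    if cost <= 0:
        raise ValueError("total_cost must be positive")
    return quality * acceptance_rate / cost


# Hypothetical numbers, for illustration only.
example = score(
    quality_terms={"fidelity": 0.9, "witness": 0.8},
    quality_weights={"fidelity": 0.5, "witness": 0.5},
    cost_terms={"two_qubit_gates": 10, "depth": 20},
    cost_weights={"two_qubit_gates": 0.1, "depth": 0.05},
    acceptance_rate=0.8,
)
# quality = 0.85, cost = 2.0, so the score is approximately 0.34
```

Because acceptance rate multiplies the quality, a recipe that reaches high fidelity by discarding most shots is penalized in proportion to what it throws away.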

Cheap tier vs expensive tier

Cheap tier:

  • backend-aware transpilation
  • noisy Aer evaluation
  • density-matrix fidelity when a backend-derived noise model is available
  • repeated local runs for stability scoring
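
One way to turn repeated local runs into a stability number is to penalize the relative spread of the per-run scores. The definition below (one minus the coefficient of variation, clipped at zero) is an assumption made for illustration; it is not taken from analysis.py.

```python
import statistics


def stability(scores: list[float]) -> float:
    """Return 1.0 for perfectly repeatable scores, lower values for noisier ones."""
    if not scores:
        raise ValueError("need at least one score")
    mean = statistics.fmean(scores)
    if mean <= 0:
        return 0.0
    spread = statistics.pstdev(scores)  # population standard deviation
    return max(0.0, 1.0 - spread / mean)


steady = stability([0.30, 0.30, 0.30])
noisy = stability([0.10, 0.30, 0.50])
```

With this definition, `steady` is 1.0 and `noisy` is strictly lower, so a recipe whose score swings between runs loses quality even when its average looks good.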

Expensive tier:

  • IBM Runtime execution through SamplerV2
  • only used when enabled and when cheap-tier promotion thresholds are met
  • isolated behind hardware.py

Built-In [[4,2,2]] Experiment

The built-in experiment prepares an encoded logical T-state on one logical qubit of the [[4,2,2]] code while keeping the spectator logical qubit in |0⟩. The code utilities live in four_two_two.py.
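
To make the target concrete, the single-qubit sketch below checks the X and Y expectations of an unencoded T-state. It assumes the convention |T⟩ = (|0⟩ + e^{iπ/4}|1⟩)/√2 and a witness of the form (⟨X⟩ + ⟨Y⟩)/√2; both are assumptions for illustration, and the harness may define its witness differently.

```python
import cmath
import math


def xy_expectations(alpha: complex, beta: complex) -> tuple[float, float]:
    """Return (<X>, <Y>) for the normalized state alpha|0> + beta|1>."""
    overlap = complex(alpha).conjugate() * complex(beta)
    return 2 * overlap.real, 2 * overlap.imag


def magic_witness(alpha: complex, beta: complex) -> float:
    """Assumed witness: the Bloch vector projected onto the (X + Y)/sqrt(2) axis."""
    x, y = xy_expectations(alpha, beta)
    return (x + y) / math.sqrt(2)


amp = 1 / math.sqrt(2)
t_state = (amp, amp * cmath.exp(1j * math.pi / 4))  # ideal T-state
plus_state = (amp, amp)                             # stabilizer state |+>

t_witness = magic_witness(*t_state)        # approximately 1.0
plus_witness = magic_witness(*plus_state)  # approximately 0.707
```

Under this convention the ideal T-state saturates the witness at 1, while a stabilizer state such as |+⟩ reaches only 1/√2, which is why X and Y measurements together can distinguish the magic state.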

The harness evaluates:

  • acceptance under optional ZZZZ and XXXX stabilizer checks
  • logical X and Y witnesses for the encoded magic state
  • spectator logical Z
  • compiled cost after transpilation to a chosen backend target
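
The acceptance and spectator quantities above can be estimated from measurement counts. The sketch below assumes the four data qubits are measured in the Z basis and that bitstrings follow Qiskit's convention (qubit 0 is the rightmost character). A shot passes the ZZZZ check when its parity is even. The function names and the example counts are hypothetical.

```python
def postselect_even_parity(counts: dict[str, int]) -> tuple[dict[str, int], float]:
    """Keep shots with ZZZZ = +1 (even number of 1s) and return the acceptance rate."""
    kept = {bits: n for bits, n in counts.items() if bits.count("1") % 2 == 0}
    total = sum(counts.values())
    acceptance = sum(kept.values()) / total if total else 0.0
    return kept, acceptance


def z_expectation(counts: dict[str, int], qubits: tuple[int, ...]) -> float:
    """Estimate the expectation of a product of Z operators on the given qubits."""
    total = sum(counts.values())
    signed = 0
    for bits, n in counts.items():
        # Qubit q is the character at index -1 - q (little-endian bitstrings).
        ones = sum(bits[-1 - q] == "1" for q in qubits)
        signed += n if ones % 2 == 0 else -n
    return signed / total


# Hypothetical counts: "0001" has odd parity and is rejected by the ZZZZ check.
counts = {"0000": 450, "1111": 450, "0011": 40, "0001": 60}
kept, acceptance = postselect_even_parity(counts)
spectator_z = z_expectation(kept, qubits=(1, 3))  # Z on qubits 1 and 3
```

Here `acceptance` is 0.94. The "0011" outcome passes the parity check but contributes with a negative sign to the spectator estimate, which illustrates that postselection detects some errors and not others.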

This keeps the core scientific distinction explicit:

  • a circuit can be locally good for [[4,2,2]]
  • a rule is only valuable if it keeps helping across new backends or new rungs

Installation

Create an isolated environment in the project root and install the package:

python3 -m venv .venv
. .venv/bin/activate
pip install -e '.[dev,notebooks]'

For the optional IBM hardware path:

pip install -e '.[hardware,dev,notebooks]'

To use the CLI without an editable install, prefix the commands with PYTHONPATH=src.

Jupyter Notebooks: Learning Plans

The notebooks/ folder contains three independent learning plans. Each plan teaches the same material (encoded magic-state preparation, measurement, and the ratchet optimiser) through a different didactic lens. No IBM account or API key is needed: everything runs locally with the Aer simulator.

Quick start

# 1. Activate the virtual environment (if not already active)
. .venv/bin/activate

# 2. Install the project with notebook dependencies
pip install -e '.[notebooks]'

# 3. Start the Jupyter server
jupyter lab --notebook-dir=notebooks

This opens JupyterLab in your browser (usually at http://localhost:8888). Navigate into any plan folder and open the first notebook.

Alternative: If you prefer the classic notebook interface, run jupyter notebook --notebook-dir=notebooks instead.

Plan A: Bottom-Up (3 sequential notebooks)

| # | File | What you learn |
|---|------|----------------|
| 1 | plan_a/01_encoded_magic_state.ipynb | T-state, [[4,2,2]] encoder, stabilisers, error detection, postselection |
| 2 | plan_a/02_measuring_progress.ipynb | Noise, logical operators, magic witness, scoring formula, parameter sweeps |
| 3 | plan_a/03_the_ratchet.ipynb | Incumbent/challenger model, ratchet steps, lessons, cross-rung propagation |

Start with notebook 01 and work through in order. Run each cell top-to-bottom (Shift+Enter).

Plan B: Spiral (1 notebook, three passes)

| File | What you learn |
|------|----------------|
| plan_b/spiral_notebook.ipynb | Pass 1: 5-min demo (black-box). Pass 2: Open the box (circuits, stabilisers, scoring). Pass 3: Make it your own (modify parameters, run experiments). |

One notebook, 78 cells. Each pass revisits the same system at a deeper level.

Plan C: Parallel Tracks (4 notebooks)

| File | Focus |
|------|-------|
| plan_c/00_dashboard.ipynb | Interactive dashboard (ipywidgets): run experiments from dropdowns |
| plan_c/track_a_physics.ipynb | Pure quantum mechanics: Eastin-Knill, Bloch sphere, stabiliser algebra |
| plan_c/track_b_engineering.ipynb | Noise models, transpilation, cost model, failure modes |
| plan_c/track_c_search.ipynb | Parameter space, search strategies, lesson extraction, cross-rung transfer |

Start with the dashboard for an overview, then dive into whichever track interests you. The three tracks are independent and can be read in any order.

Troubleshooting

| Problem | Fix |
|---------|-----|
| ModuleNotFoundError: autoresearch_quantum | Run pip install -e '.[notebooks]' inside the activated .venv |
| ModuleNotFoundError: ipywidgets | Run pip install ipywidgets (needed for the Plan C dashboard) |
| Plots don't render | Make sure %matplotlib inline is in the first code cell (it already is) |
| Kernel not found | In JupyterLab, select Kernel > Change Kernel and pick the .venv Python |

How To Run

1. Run a single local experiment

Use the rung config bootstrap incumbent as-is:

PYTHONPATH=src .venv/bin/python -m autoresearch_quantum run-experiment \
  --config configs/rungs/rung1.yaml \
  --store-dir data/demo

Override individual experiment fields:

PYTHONPATH=src .venv/bin/python -m autoresearch_quantum run-experiment \
  --config configs/rungs/rung1.yaml \
  --store-dir data/demo \
  --set verification=z_only \
  --set postselection=z_only \
  --set ancilla_strategy=reused_single
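
Each --set argument is a key=value pair. The sketch below shows one way such overrides could be parsed, including negative integers and rejection of arguments without an '='. It is a hypothetical illustration rather than the parser in cli.py, and the `shots` field name is an assumption.

```python
def coerce(raw: str):
    """Convert a raw string to bool, int, or float where possible, else keep the string."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)  # int("-5") handles negative integers directly
        except ValueError:
            continue
    return raw


def parse_override(item: str) -> tuple[str, object]:
    """Split 'key=value' into a (key, typed value) pair, validating the format."""
    key, sep, raw = item.partition("=")
    if not sep or not key.strip():
        raise ValueError(f"override must look like key=value, got {item!r}")
    return key.strip(), coerce(raw.strip())


overrides = dict(parse_override(arg) for arg in [
    "verification=z_only",
    "ancilla_strategy=reused_single",
    "shots=-5",  # hypothetical field; shows that a leading minus sign survives parsing
])
```

Parsing each argument separately keeps the error message tied to the offending override instead of failing later when the experiment spec is built.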

2. Run one ratchet step

PYTHONPATH=src .venv/bin/python -m autoresearch_quantum run-step \
  --config configs/rungs/rung1.yaml \
  --store-dir data/demo

This will:

  • load or bootstrap the incumbent
  • generate neighbor challengers from the rung search space
  • evaluate every challenger on the cheap tier
  • if hardware is enabled, promote only the challengers that beat the incumbent's cheap-tier score by the configured margin
  • log the step and update the incumbent pointer if a challenger wins
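
Neighbor generation can be pictured as changing one field of the incumbent at a time and skipping anything already tried. The sketch below is a simplified illustration under that assumption; the function name, the dictionary representation of a spec, and the option values other than z_only are hypothetical.

```python
def neighbors(incumbent: dict, search_space: dict, seen: set) -> list[dict]:
    """Return specs that differ from the incumbent in exactly one field, without repeats."""
    result = []
    for field, options in search_space.items():
        for option in options:
            if option == incumbent.get(field):
                continue  # identical to the incumbent, not a challenger
            candidate = {**incumbent, field: option}
            key = tuple(sorted(candidate.items()))  # hashable fingerprint for dedup
            if key in seen:
                continue  # already generated in an earlier step
            seen.add(key)
            result.append(candidate)
    return result


incumbent = {"verification": "both", "postselection": "both"}
search_space = {
    "verification": ["both", "z_only"],
    "postselection": ["both", "z_only", "none"],
}
seen: set = set()
first_batch = neighbors(incumbent, search_space, seen)
second_batch = neighbors(incumbent, search_space, seen)
```

The first call yields three challengers; the second yields none, because the shared `seen` set remembers every spec generated so far.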

3. Run one full rung

PYTHONPATH=src .venv/bin/python -m autoresearch_quantum run-rung \
  --config configs/rungs/rung1.yaml \
  --store-dir data/demo

Artifacts are persisted under data/demo/rung_<n>/:

  • experiments/*.json
  • ratchet_steps/*.json
  • incumbent.json
  • lesson.json
  • lesson.md
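
A quick way to check what a run has produced is to count the files in that layout. The helper below relies only on the file and directory names listed above and does not assume anything about the JSON contents; the function name is hypothetical.

```python
from pathlib import Path


def summarize_rung(rung_dir: Path) -> dict:
    """Report which artifacts exist under a rung directory such as data/demo/rung_<n>/."""
    return {
        "experiments": len(list((rung_dir / "experiments").glob("*.json"))),
        "ratchet_steps": len(list((rung_dir / "ratchet_steps").glob("*.json"))),
        "has_incumbent": (rung_dir / "incumbent.json").is_file(),
        "has_lesson": (rung_dir / "lesson.json").is_file()
        and (rung_dir / "lesson.md").is_file(),
    }
```

For example, `summarize_rung(Path("data/demo") / "rung_1")` would return the counts for that directory, assuming the rung directory is named that way on disk. A rung that has experiments and steps but no lesson files has not yet finished.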

4. Run a multi-rung ratchet campaign

PYTHONPATH=src .venv/bin/python -m autoresearch_quantum run-ratchet \
  --config configs/rungs/rung1.yaml \
  --config configs/rungs/rung2.yaml \
  --config configs/rungs/rung3.yaml \
  --config configs/rungs/rung4.yaml \
  --store-dir data/campaign

5. Run an optional hardware-backed confirmation

First install the hardware extra and make IBM credentials available in the usual qiskit-ibm-runtime way. The simplest path is to export:

export QISKIT_IBM_TOKEN=...

Then enable the hardware tier in the rung config by setting tier_policy.enable_hardware: true and optionally hardware.backend_name: ibm_brisbane.

Run:

PYTHONPATH=src .venv/bin/python -m autoresearch_quantum run-step \
  --config configs/rungs/rung1.yaml \
  --store-dir data/hardware \
  --hardware

Only challengers that beat the incumbent cheap-tier score by tier_policy.cheap_margin are promoted.

Extending The Ladder

The intended progression is:

  1. rung1.yaml: baseline [[4,2,2]] encoded magic-state preparation
  2. rung2.yaml: same code with stronger stability and backend-awareness
  3. rung3.yaml: transfer across backend families
  4. rung4.yaml: factory-style cost pressure

To add a new rung:

  • create a new YAML in configs/rungs/
  • narrow the challenger space to the specific next question
  • tune cheap and expensive score weights for that rung
  • keep the lesson document as the real product

To add a new experiment family:

  • implement a new builder under src/autoresearch_quantum/experiments/
  • define the target state, witness operators, verification flow, and logging metadata
  • route the ratchet to that experiment family through config or a new CLI selector

Notes On Interpretation

This harness is explicit about proxy vs confirmation:

  • cheap-tier fidelity and witness numbers are local proxies
  • hardware runs are scarce and should be treated as confirmation
  • the most important artifact of each rung is the lesson, not just the incumbent ID

That is the intended ratchet: a better experiment plus a better search rule.