mirror of
https://github.com/saymrwulf/autoresearch-quantum.git
synced 2026-05-14 20:37:51 +00:00
857 lines
No EOL
30 KiB
Text
{
"cells": [
{
"cell_type": "markdown",
"id": "751fe8cc",
"metadata": {},
"source": [
"# Notebook 3: The Ratchet Learns For You\n",
"\n",
"**Plan A \u2014 Automated Search**\n",
"\n",
"You now know what an encoded magic state is (Notebook 1) and how to measure its quality (Notebook 2). This notebook shows how the **autoresearch ratchet** automatically explores the parameter space to find the best circuit configuration.\n",
"\n",
"**What you will learn:**\n",
"1. The incumbent-challenger optimization model\n",
"2. How challengers are generated (neighbor walk, random combo, lesson-guided)\n",
"3. How the ratchet selects winners and extracts lessons\n",
"4. Cross-rung propagation and search space narrowing"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "3f9b56a6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"All imports successful.\n"
]
}
],
"source": [
"%matplotlib inline\n",
"import sys, warnings, tempfile\n",
"warnings.filterwarnings(\"ignore\")\n",
"\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from math import sqrt\n",
"\n",
"from autoresearch_quantum.models import (\n",
"    ExperimentSpec, RungConfig, EvaluationMetrics,\n",
"    QualityWeights, CostWeights, ScoreConfig, SearchSpaceConfig,\n",
"    TierPolicyConfig, HardwareConfig, LessonFeedback, SearchRule,\n",
")\n",
"from autoresearch_quantum.execution.local import LocalCheapExecutor\n",
"from autoresearch_quantum.search.challengers import (\n",
"    generate_neighbor_challengers, mutation_summary, GeneratedChallenger,\n",
")\n",
"from autoresearch_quantum.search.strategies import (\n",
"    NeighborWalk, RandomCombo, LessonGuided, CompositeGenerator,\n",
"    default_composite, StrategyWeight,\n",
")\n",
"from autoresearch_quantum.ratchet.runner import AutoresearchHarness\n",
"from autoresearch_quantum.persistence.store import ResearchStore\n",
"from autoresearch_quantum.config import load_rung_config\n",
"from autoresearch_quantum.lessons.extractor import extract_rung_lesson\n",
"from autoresearch_quantum.lessons.feedback import (\n",
"    extract_search_rules, narrow_search_space, build_lesson_feedback,\n",
")\n",
"from autoresearch_quantum.execution.transfer import TransferEvaluator\n",
"\n",
"print(\"All imports successful.\")"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "7cb035ce",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Learning tracker active.\n"
]
}
],
"source": [
"from autoresearch_quantum.teaching import LearningTracker\n",
"from autoresearch_quantum.teaching.assess import quiz, predict_choice, reflect, order, checkpoint_summary\n",
"tracker = LearningTracker(\"plan_a_03\")\n",
"print(\"Learning tracker active.\")"
]
},
{
"cell_type": "markdown",
"id": "5fefc4e4",
"metadata": {},
"source": [
"---\n",
"## 1. The Incumbent-Challenger Model\n",
"\n",
"The ratchet keeps a **best-so-far** configuration called the **incumbent**. Each step:\n",
"\n",
"1. Generate **challengers** \u2014 new configurations that differ from the incumbent in one or more parameters\n",
"2. Evaluate each challenger on the cheap tier (noisy simulator)\n",
"3. If any challenger beats the incumbent by a margin, it becomes the new incumbent\n",
"4. Repeat until patience runs out\n",
"\n",
"This is a form of **local search** \u2014 like hill climbing in parameter space."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "e563c118",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Bootstrap incumbent:\n",
"  seed_style: h_p\n",
"  encoder_style: cx_chain\n",
"  verification: both\n",
"  postselection: all_measured\n",
"  optimization_level: 2\n",
"  target_backend: fake_brisbane\n",
"\n",
"Search space dimensions:\n",
"  seed_style: ['h_p', 'ry_rz', 'u_magic']\n",
"  encoder_style: ['cx_chain', 'cz_compiled']\n",
"  verification: ['both', 'z_only', 'x_only']\n",
"  postselection: ['all_measured', 'z_only', 'none']\n",
"  ancilla_strategy: ['dedicated_pair', 'reused_single']\n",
"  optimization_level: [1, 2, 3]\n",
"\n",
"Max challengers per step: 8\n"
]
}
],
"source": [
"# Load the rung1 configuration\n",
"rung_config = load_rung_config(\"../../configs/rungs/rung1.yaml\")\n",
"\n",
"# The bootstrap incumbent\n",
"incumbent_spec = rung_config.bootstrap_incumbent\n",
"print(\"Bootstrap incumbent:\")\n",
"print(f\"  seed_style: {incumbent_spec.seed_style}\")\n",
"print(f\"  encoder_style: {incumbent_spec.encoder_style}\")\n",
"print(f\"  verification: {incumbent_spec.verification}\")\n",
"print(f\"  postselection: {incumbent_spec.postselection}\")\n",
"print(f\"  optimization_level: {incumbent_spec.optimization_level}\")\n",
"print(f\"  target_backend: {incumbent_spec.target_backend}\")\n",
"print(f\"\\nSearch space dimensions:\")\n",
"for dim, values in rung_config.search_space.dimensions.items():\n",
"    print(f\"  {dim}: {values}\")\n",
"print(f\"\\nMax challengers per step: {rung_config.search_space.max_challengers_per_step}\")"
]
},
{
"cell_type": "markdown",
"id": "d4044fc8",
"metadata": {},
"source": [
"### The ratchet guarantee\n",
"\n",
"The key property: the incumbent **never gets worse**. A challenger must demonstrably beat the incumbent to replace it."
]
},
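{
"cell_type": "markdown",
"id": "aaaa0001",
"metadata": {},
"source": [
"A minimal toy sketch of this guarantee in plain Python (illustrative only \u2014 not the library implementation): a candidate replaces the best-so-far score only when it beats it by the margin, so the tracked score can never decrease."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aaaa0002",
"metadata": {},
"outputs": [],
"source": [
"# Toy ratchet loop: illustrative only, not the AutoresearchHarness internals\n",
"import random\n",
"\n",
"random.seed(0)\n",
"incumbent_score = 0.50\n",
"margin = 0.002\n",
"history = [incumbent_score]\n",
"for _ in range(10):\n",
"    challenger_score = incumbent_score + random.uniform(-0.05, 0.05)\n",
"    if challenger_score > incumbent_score + margin:\n",
"        incumbent_score = challenger_score  # the ratchet clicks forward\n",
"    history.append(incumbent_score)  # otherwise the incumbent stays\n",
"\n",
"assert all(b >= a for a, b in zip(history, history[1:]))  # never decreases\n",
"print(f\"Monotone history, final score {incumbent_score:.4f}\")"
]
},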
{
"cell_type": "code",
"execution_count": 4,
"id": "1f48aa77",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "47c3d5bbac3d4fd4ab7c8c57b4432b17",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"VBox(children=(HTML(value='<div style=\"font-size:14px; font-weight:600; color:#4a148c; margin-bottom:8px;\">\t\u2026"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "47c3d5bbac3d4fd4ab7c8c57b4432b17",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"VBox(children=(HTML(value='<div style=\"font-size:14px; font-weight:600; color:#4a148c; margin-bottom:8px;\">\t\u2026"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"quiz(tracker, \"q1_ratchet_guarantee\",\n",
"    question=\"What is the ratchet guarantee?\",\n",
"    options=[\n",
"        \"Every step improves the score\",\n",
"        \"The incumbent never gets worse \\u2014 challengers must beat it to replace it\",\n",
"        \"The search space shrinks every step\",\n",
"        \"The ratchet always converges to the global optimum\",\n",
"    ],\n",
"    correct=1, section=\"1. Incumbent-challenger\", bloom=\"remember\",\n",
"    explanation=\"The ratchet is monotonic: if no challenger beats the incumbent, the incumbent stays. This does NOT guarantee finding the global optimum.\")"
]
},
{
"cell_type": "markdown",
"id": "07aea2c1",
"metadata": {},
"source": [
"---\n",
"## 2. Generating Challengers: Neighbor Walk\n",
"\n",
"The simplest strategy: change **one parameter at a time**. For each dimension in the search space, try each alternative value."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9bb9a7f8",
"metadata": {},
"outputs": [],
"source": [
"challengers = generate_neighbor_challengers(\n",
"    incumbent_spec,\n",
"    rung_config.search_space,\n",
")\n",
"\n",
"print(f\"Generated {len(challengers)} challengers:\\n\")\n",
"for i, c in enumerate(challengers):\n",
"    print(f\"  {i+1}. {c.mutation_note}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1d3c7add",
"metadata": {},
"outputs": [],
"source": [
"quiz(tracker, \"q2_neighborwalk\",\n",
"    question=\"How does NeighborWalk generate challengers?\",\n",
"    options=[\n",
"        \"Changes all parameters simultaneously to random values\",\n",
"        \"Changes exactly one parameter at a time to each of its other possible values\",\n",
"        \"Applies gradient descent to continuous parameters\",\n",
"    ],\n",
"    correct=1, section=\"2. Challengers\", bloom=\"understand\",\n",
"    explanation=\"NeighborWalk is single-axis: for each dimension, try every alternative value while keeping all other dimensions fixed.\")\n",
"checkpoint_summary(tracker, \"2. Challengers\")"
]
},
{
"cell_type": "markdown",
"id": "42943480",
"metadata": {},
"source": [
"> **Key Insight:** Neighbor walk is exhaustive within one axis but never explores *combinations* of changes. It is good for identifying which single parameter matters most."
]
},
{
"cell_type": "markdown",
"id": "ed8e7c15",
"metadata": {},
"source": [
"---\n",
"## 3. Evaluating Challengers\n",
"\n",
"Let us run the incumbent and all challengers on the local noisy simulator."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "369bf954",
"metadata": {},
"outputs": [],
"source": [
"# Use smaller settings for speed\n",
"fast_rung = RungConfig(\n",
"    rung=1, name=rung_config.name, description=rung_config.description,\n",
"    objective=rung_config.objective, bootstrap_incumbent=incumbent_spec,\n",
"    search_space=rung_config.search_space,\n",
"    tier_policy=TierPolicyConfig(\n",
"        cheap_margin=0.002, cheap_shots=256, cheap_repeats=1,\n",
"        expensive_shots=512, expensive_repeats=1,\n",
"        promote_top_k=2, enable_hardware=False,\n",
"    ),\n",
"    score=rung_config.score,\n",
"    step_budget=1, patience=1,\n",
"    hardware=HardwareConfig(),\n",
")\n",
"\n",
"executor = LocalCheapExecutor()\n",
"incumbent_result = executor.evaluate(incumbent_spec, fast_rung)\n",
"print(f\"Incumbent score: {incumbent_result.score:.4f}\")\n",
"print(f\"  failure_mode: {incumbent_result.metrics.dominant_failure_mode}\")\n",
"\n",
"challenger_scores = {}\n",
"for c in challengers[:8]:  # Limit for speed\n",
"    result = executor.evaluate(c.spec, fast_rung)\n",
"    challenger_scores[c.mutation_note] = result.score\n",
"    beat = \"BEATS\" if result.score > incumbent_result.score else \"     \"\n",
"    print(f\"  {beat} {c.mutation_note}: {result.score:.4f}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3d80e869",
"metadata": {},
"outputs": [],
"source": [
"# Visualize incumbent vs challengers\n",
"fig, ax = plt.subplots(figsize=(12, 5))\n",
"\n",
"labels = [\"INCUMBENT\"] + list(challenger_scores.keys())\n",
"scores = [incumbent_result.score] + list(challenger_scores.values())\n",
"colors = [\"#e74c3c\"] + [\"#2ecc71\" if s > incumbent_result.score else \"#95a5a6\" for s in challenger_scores.values()]\n",
"\n",
"bars = ax.barh(range(len(labels)), scores, color=colors)\n",
"ax.set_yticks(range(len(labels)))\n",
"ax.set_yticklabels([l[:40] for l in labels], fontsize=8)\n",
"ax.set_xlabel(\"Score\")\n",
"ax.set_title(\"Incumbent vs Challengers\")\n",
"ax.axvline(x=incumbent_result.score, color=\"#e74c3c\", linestyle=\"--\", alpha=0.5, label=\"Incumbent\")\n",
"ax.legend()\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8d43f418",
"metadata": {},
"outputs": [],
"source": [
"predict_choice(tracker, \"q3_challenger_wins\",\n",
"    question=\"Looking at the bar chart: did any challenger beat the incumbent?\",\n",
"    options=[\n",
"        \"Yes \\u2014 at least one bar is longer than INCUMBENT\",\n",
"        \"No \\u2014 the incumbent bar is the longest\",\n",
"        \"Can't tell from a bar chart\",\n",
"    ],\n",
"    correct=0, section=\"3. Evaluation\", bloom=\"apply\",\n",
"    explanation=\"In most runs, at least one challenger finds a better configuration.\")"
]
},
{
"cell_type": "markdown",
"id": "9da23164",
"metadata": {},
"source": [
"---\n",
"## 4. One Ratchet Step in Slow Motion\n",
"\n",
"Now let the harness run a complete ratchet step \u2014 including challenger generation, evaluation, promotion, and winner selection."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "788ec6fd",
"metadata": {},
"outputs": [],
"source": [
"store = ResearchStore(tempfile.mkdtemp())\n",
"harness = AutoresearchHarness(store)\n",
"\n",
"step = harness.run_ratchet_step(fast_rung, allow_hardware=False)\n",
"\n",
"print(f\"Step index: {step.step_index}\")\n",
"print(f\"Incumbent before: {step.incumbent_before_id}\")\n",
"print(f\"Challengers tested: {len(step.challengers_tested)}\")\n",
"print(f\"Promoted to expensive: {len(step.promoted_challengers)}\")\n",
"print(f\"Winner: {step.winner_id}\")\n",
"print(f\"Winning margin: {step.winning_margin:+.4f}\")\n",
"print(f\"\\nCheap-tier justification:\")\n",
"print(f\"  {step.cheap_tier_justification}\")\n",
"print(f\"\\nDistilled lesson:\")\n",
"print(f\"  {step.distilled_lesson}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "979057fb",
"metadata": {},
"outputs": [],
"source": [
"quiz(tracker, \"q4_no_improvement\",\n",
"    question=\"What happens if ALL challengers score lower than the incumbent?\",\n",
"    options=[\n",
"        \"The harness picks the best challenger anyway\",\n",
"        \"The incumbent stays and the step is logged with zero improvement\",\n",
"        \"The harness generates more challengers until one wins\",\n",
"    ],\n",
"    correct=1, section=\"4. Ratchet step\", bloom=\"understand\",\n",
"    explanation=\"Monotonic guarantee: if no challenger wins, the incumbent stays. Consecutive no-improvement steps trigger patience.\")\n",
"checkpoint_summary(tracker, \"4. Ratchet step\")"
]
},
{
"cell_type": "markdown",
"id": "bfb10c0e",
"metadata": {},
"source": [
"> **Key Insight:** The `winning_margin` tells you how much the winner improved over the incumbent. If positive, the ratchet \"clicked\" forward. The `distilled_lesson` is a human-readable summary of what changed and why.\n",
"\n",
"---\n",
"## 5. Running a Full Rung\n",
"\n",
"A **rung** runs multiple ratchet steps in sequence. It stops when the step budget is exhausted or when patience runs out (no improvement for N consecutive steps)."
]
},
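{
"cell_type": "markdown",
"id": "aaaa0003",
"metadata": {},
"source": [
"The stopping rule can be sketched in a few lines of plain Python (an illustration of the rule as stated above; the harness internals may differ): count consecutive no-improvement steps and stop when either limit is hit."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aaaa0004",
"metadata": {},
"outputs": [],
"source": [
"# Illustrative stopping rule, not the harness implementation\n",
"def steps_until_stop(improved_flags, step_budget, patience):\n",
"    no_improve = 0\n",
"    steps = 0\n",
"    for improved in improved_flags:\n",
"        if steps >= step_budget:\n",
"            break  # step budget exhausted\n",
"        steps += 1\n",
"        no_improve = 0 if improved else no_improve + 1\n",
"        if no_improve >= patience:\n",
"            break  # patience exhausted\n",
"    return steps\n",
"\n",
"# Two consecutive no-improvement steps hit patience=2 after 3 steps\n",
"print(steps_until_stop([True, False, False, True], step_budget=10, patience=2))"
]
},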
{
"cell_type": "code",
"execution_count": null,
"id": "a7123aa0",
"metadata": {},
"outputs": [],
"source": [
"# Fresh store for a clean rung\n",
"store2 = ResearchStore(tempfile.mkdtemp())\n",
"harness2 = AutoresearchHarness(store2)\n",
"\n",
"# Run with step_budget=3\n",
"full_rung = RungConfig(\n",
"    rung=1, name=rung_config.name, description=rung_config.description,\n",
"    objective=rung_config.objective, bootstrap_incumbent=incumbent_spec,\n",
"    search_space=SearchSpaceConfig(\n",
"        dimensions={\"verification\": [\"both\", \"z_only\", \"x_only\"],\n",
"                    \"seed_style\": [\"h_p\", \"ry_rz\", \"u_magic\"],\n",
"                    \"postselection\": [\"all_measured\", \"z_only\", \"none\"]},\n",
"        max_challengers_per_step=6,\n",
"    ),\n",
"    tier_policy=TierPolicyConfig(\n",
"        cheap_margin=0.001, cheap_shots=256, cheap_repeats=1,\n",
"        expensive_shots=512, expensive_repeats=1,\n",
"        promote_top_k=2, enable_hardware=False,\n",
"    ),\n",
"    score=rung_config.score,\n",
"    step_budget=3, patience=2,\n",
"    hardware=HardwareConfig(),\n",
")\n",
"\n",
"steps, lesson, feedback = harness2.run_rung(full_rung, allow_hardware=False)\n",
"print(f\"Steps completed: {len(steps)}\")\n",
"for s in steps:\n",
"    print(f\"  Step {s.step_index}: winner={s.winner_id[:30]}... margin={s.winning_margin:+.4f}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bdeac1f5",
"metadata": {},
"outputs": [],
"source": [
"# Show the lesson\n",
"print(\"=\" * 60)\n",
"print(lesson.narrative)\n",
"print(\"=\" * 60)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5ed4a4f2",
"metadata": {},
"outputs": [],
"source": [
"reflect(tracker, \"q5_lesson_quality\",\n",
"    question=\"Read the lesson narrative above. What actionable insight does it give? What would make it better?\",\n",
"    section=\"5. Lesson\", bloom=\"evaluate\",\n",
"    model_answer=\"A good lesson names specific parameter values that helped/hurt and explains WHY. The machine-readable rules are often more actionable than the narrative.\")"
]
},
{
"cell_type": "markdown",
"id": "c016ebc3",
"metadata": {},
"source": [
"---\n",
"## 6. Visualizing the Exploration\n",
"\n",
"Let us plot how the score evolved and which experiments the ratchet tried."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "853caf26",
"metadata": {},
"outputs": [],
"source": [
"experiments = store2.list_experiments(1)\n",
"exp_scores = [(e[\"experiment_id\"][:25], e[\"final_score\"], e[\"role\"]) for e in experiments]\n",
"\n",
"fig, ax = plt.subplots(figsize=(12, 5))\n",
"colors = [\"#e74c3c\" if role == \"incumbent\" else \"#3498db\" for _, _, role in exp_scores]\n",
"ax.bar(range(len(exp_scores)), [s for _, s, _ in exp_scores], color=colors)\n",
"ax.set_xticks(range(len(exp_scores)))\n",
"ax.set_xticklabels([eid for eid, _, _ in exp_scores], rotation=45, ha=\"right\", fontsize=7)\n",
"ax.set_ylabel(\"Score\")\n",
"ax.set_title(\"All Experiments in Rung 1\")\n",
"\n",
"# Add legend\n",
"from matplotlib.patches import Patch\n",
"ax.legend(handles=[Patch(color=\"#e74c3c\", label=\"Incumbent\"), Patch(color=\"#3498db\", label=\"Challenger\")])\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "77264cbd",
"metadata": {},
"source": [
"---\n",
"## 7. Search Strategies Compared\n",
"\n",
"The harness supports three strategies. Let us see how they generate different challengers from the same incumbent."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d3871da7",
"metadata": {},
"outputs": [],
"source": [
"search_space = SearchSpaceConfig(\n",
"    dimensions={\n",
"        \"verification\": [\"both\", \"z_only\", \"x_only\"],\n",
"        \"seed_style\": [\"h_p\", \"ry_rz\", \"u_magic\"],\n",
"        \"optimization_level\": [1, 2, 3],\n",
"    },\n",
"    max_challengers_per_step=6,\n",
")\n",
"\n",
"for name, strategy in [(\"NeighborWalk\", NeighborWalk()),\n",
"                       (\"RandomCombo\", RandomCombo(num_candidates=6)),\n",
"                       (\"CompositeGenerator\", default_composite(has_lessons=False))]:\n",
"    challengers = strategy.generate(incumbent_spec, search_space, set())\n",
"    print(f\"\\n{name} ({len(challengers)} challengers):\")\n",
"    for c in challengers[:4]:\n",
"        print(f\"  {c.mutation_note}\")\n",
"    if len(challengers) > 4:\n",
"        print(f\"  ... and {len(challengers) - 4} more\")"
]
},
{
"cell_type": "markdown",
"id": "b2b6b18f",
"metadata": {},
"source": [
"### Strategy comparison\n",
"\n",
"- **NeighborWalk**: 1 axis at a time, systematic\n",
"- **RandomCombo**: multiple axes, random\n",
"- **LessonGuided**: rule-biased from previous rungs"
]
},
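{
"cell_type": "markdown",
"id": "aaaa0005",
"metadata": {},
"source": [
"One way a composite generator can mix these strategies is weighted sampling. The sketch below is hypothetical (the real CompositeGenerator and StrategyWeight may allocate slots differently): each challenger slot is assigned a strategy in proportion to its weight."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aaaa0006",
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical weighted-mixing sketch \u2014 not the library's CompositeGenerator\n",
"import random\n",
"\n",
"random.seed(1)\n",
"weights = {\"NeighborWalk\": 0.5, \"RandomCombo\": 0.3, \"LessonGuided\": 0.2}\n",
"names = list(weights)\n",
"picks = random.choices(names, weights=[weights[n] for n in names], k=8)\n",
"for name in names:\n",
"    print(f\"{name}: {picks.count(name)} of 8 challenger slots\")"
]
},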
{
"cell_type": "code",
"execution_count": null,
"id": "fc0d6390",
"metadata": {},
"outputs": [],
"source": [
"order(tracker, \"q6_strategy_breadth\",\n",
"    instruction=\"Rank strategies from narrowest to broadest exploration:\",\n",
"    items=[\"NeighborWalk\", \"RandomCombo\", \"LessonGuided\"],\n",
"    correct_order=[\"NeighborWalk\", \"LessonGuided\", \"RandomCombo\"],\n",
"    section=\"6. Search strategies\", bloom=\"analyze\",\n",
"    explanation=\"NeighborWalk: 1 param (narrowest). LessonGuided: focused by rules (medium). RandomCombo: multiple params randomly (broadest).\")"
]
},
{
"cell_type": "markdown",
"id": "1fb3757c",
"metadata": {},
"source": [
"---\n",
"## 8. Lesson-Guided Search\n",
"\n",
"After a rung completes, the harness extracts **SearchRules** \u2014 machine-readable directives like \"prefer z_only\" or \"avoid x_only\". These guide future search."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d37dcb46",
"metadata": {},
"outputs": [],
"source": [
"# Show the feedback rules from our rung\n",
"print(f\"Feedback from rung 1: {len(feedback.rules)} rules\\n\")\n",
"for rule in feedback.rules:\n",
"    print(f\"  {rule.action.upper():7s} {rule.dimension}={rule.value} (confidence={rule.confidence:.2f})\")\n",
"    print(f\"    {rule.reason}\\n\")\n",
"\n",
"print(f\"Narrowed dimensions:\")\n",
"for dim, vals in feedback.narrowed_dimensions.items():\n",
"    print(f\"  {dim}: {vals}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2d485ab3",
"metadata": {},
"outputs": [],
"source": [
"quiz(tracker, \"q7_fix_vs_avoid\",\n",
"    question=\"What is the difference between a 'fix' rule and an 'avoid' rule?\",\n",
"    options=[\n",
"        \"'fix' locks a value permanently; 'avoid' removes a value from the search space\",\n",
"        \"'fix' repairs a bug; 'avoid' prevents a crash\",\n",
"        \"They are synonyms\",\n",
"    ],\n",
"    correct=0, section=\"7. Lesson-guided\", bloom=\"remember\",\n",
"    explanation=\"'fix': this value is clearly best, always use it. 'avoid': this value consistently hurts, remove it.\")\n",
"checkpoint_summary(tracker, \"7. Lesson-guided\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "09951547",
"metadata": {},
"outputs": [],
"source": [
"# Run LessonGuided strategy using the feedback\n",
"if feedback.rules:\n",
"    guided = LessonGuided(num_candidates=6)\n",
"    guided_challengers = guided.generate(incumbent_spec, search_space, set(), [feedback])\n",
"    print(f\"Lesson-guided generated {len(guided_challengers)} challengers:\")\n",
"    for c in guided_challengers:\n",
"        print(f\"  {c.mutation_note}\")\n",
"else:\n",
"    print(\"No rules extracted (too few experiments). Try increasing step_budget.\")"
]
},
{
"cell_type": "markdown",
"id": "24e1dcfb",
"metadata": {},
"source": [
"---\n",
"## 9. Cross-Rung Propagation\n",
"\n",
"A **ratchet** runs multiple rungs in sequence. The winner from rung N becomes the bootstrap incumbent for rung N+1. Lessons narrow the search space."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6730be92",
"metadata": {},
"outputs": [],
"source": [
"store3 = ResearchStore(tempfile.mkdtemp())\n",
"harness3 = AutoresearchHarness(store3)\n",
"\n",
"rung1_config = RungConfig(\n",
"    rung=1, name=\"Rung 1\", description=\"Explore seed and verification\",\n",
"    objective=\"Find best basic config\",\n",
"    bootstrap_incumbent=ExperimentSpec(\n",
"        rung=1, target_backend=\"fake_brisbane\", noise_backend=\"fake_brisbane\",\n",
"        shots=256, repeats=1,\n",
"    ),\n",
"    search_space=SearchSpaceConfig(\n",
"        dimensions={\"verification\": [\"both\", \"z_only\"], \"seed_style\": [\"h_p\", \"ry_rz\"]},\n",
"        max_challengers_per_step=4,\n",
"    ),\n",
"    tier_policy=TierPolicyConfig(cheap_margin=0.0, cheap_shots=256, cheap_repeats=1,\n",
"                                 promote_top_k=1, enable_hardware=False),\n",
"    score=rung_config.score, step_budget=2, patience=1, hardware=HardwareConfig(),\n",
")\n",
"\n",
"rung2_config = RungConfig(\n",
"    rung=2, name=\"Rung 2\", description=\"Refine with more dimensions\",\n",
"    objective=\"Optimize further\",\n",
"    bootstrap_incumbent=ExperimentSpec(\n",
"        rung=2, target_backend=\"fake_brisbane\", noise_backend=\"fake_brisbane\",\n",
"        shots=256, repeats=1,\n",
"    ),\n",
"    search_space=SearchSpaceConfig(\n",
"        dimensions={\"verification\": [\"both\", \"z_only\"], \"optimization_level\": [1, 2, 3]},\n",
"        max_challengers_per_step=4,\n",
"    ),\n",
"    tier_policy=rung1_config.tier_policy,\n",
"    score=rung_config.score, step_budget=2, patience=1, hardware=HardwareConfig(),\n",
")\n",
"\n",
"results = harness3.run_ratchet([rung1_config, rung2_config], allow_hardware=False)\n",
"\n",
"for lesson_obj, fb in results:\n",
"    print(f\"\\nRung {lesson_obj.rung}: {lesson_obj.name}\")\n",
"    print(f\"  Rules extracted: {len(fb.rules)}\")\n",
"    print(f\"  Best spec fields: {dict(list(fb.best_spec_fields.items())[:5])}...\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3da85dc8",
"metadata": {},
"outputs": [],
"source": [
"quiz(tracker, \"q8_propagation\",\n",
"    question=\"Why does the ratchet propagate the winning spec to the next rung?\",\n",
"    options=[\n",
"        \"To save typing the spec again\",\n",
"        \"The winner from rung N is a good starting point for rung N+1, avoiding cold-start\",\n",
"        \"Each rung must use the same spec\",\n",
"    ],\n",
"    correct=1, section=\"8. Cross-rung\", bloom=\"understand\",\n",
"    explanation=\"Cross-rung propagation transfers knowledge: best settings from one rung become the starting point for the next.\")"
]
},
{
"cell_type": "markdown",
"id": "59c7bef1",
"metadata": {},
"source": [
"---\n",
"## 10. Transfer Evaluation\n",
"\n",
"A **transfer test** runs the best spec across multiple backend noise models to check if the settings generalize or are overfit to one specific noise profile."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5628cccb",
"metadata": {},
"outputs": [],
"source": [
"evaluator = TransferEvaluator()\n",
"report = evaluator.evaluate_across_backends(\n",
"    incumbent_spec,\n",
"    [\"fake_brisbane\"],  # Single backend for speed (add more for real transfer tests)\n",
"    fast_rung,\n",
")\n",
"\n",
"print(f\"Transfer score (pessimistic = min): {report.transfer_score:.4f}\")\n",
"print(f\"Mean score: {report.mean_score:.4f}\")\n",
"print(f\"Std score: {report.std_score:.4f}\")\n",
"for backend_name, score in report.per_backend_scores.items():\n",
"    print(f\"  {backend_name}: {score:.4f}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5dea8979",
"metadata": {},
"outputs": [],
"source": [
"quiz(tracker, \"q9_transfer_quality\",\n",
"    question=\"When is a transfer score 'good'?\",\n",
"    options=[\n",
"        \"When it is higher than 0\",\n",
"        \"When it is close to the original score on the source backend\",\n",
"        \"When it is exactly 1.0\",\n",
"    ],\n",
"    correct=1, section=\"9. Transfer\", bloom=\"evaluate\",\n",
"    explanation=\"Good transfer means settings work almost as well on the target backend. A large drop means overfitting to the source noise profile.\")\n",
"checkpoint_summary(tracker, \"9. Transfer\")"
]
},
{
"cell_type": "markdown",
"id": "71095cd5",
"metadata": {},
"source": [
"---\n",
"## Summary\n",
"\n",
"You have now seen the complete autoresearch pipeline:\n",
"\n",
"| Layer | What happens |\n",
"|---|---|\n",
"| **Circuit** | Build encoded magic state (Notebook 1) |\n",
"| **Metrics** | Measure quality, cost, and score (Notebook 2) |\n",
"| **Search** | Generate and evaluate challengers (this notebook) |\n",
"| **Ratchet** | Iterate: incumbent vs challengers, promote winners |\n",
"| **Lessons** | Extract rules, narrow search space, propagate to next rung |\n",
"| **Transfer** | Verify settings generalize across backends |\n",
"\n",
"The entire process compresses hours of manual parameter exploration into minutes of automated search. Each rung produces a human-readable lesson and machine-readable rules that make future exploration more efficient."
]
},
{
"cell_type": "markdown",
"id": "ba79fac4",
"metadata": {},
"source": [
"---\n",
"## Final Assessment"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bdbf806a",
"metadata": {},
"outputs": [],
"source": [
"tracker.dashboard()\n",
"path = tracker.save()\n",
"print(f\"\\nProgress saved to: {path}\")"
]
},
{
"cell_type": "markdown",
"id": "d0913b05",
"source": "---\n## You've completed Plan A!\n\nWant to explore the same material from a different angle? Try another plan:\n- [Plan B \u2014 Spiral Notebook](../plan_b/spiral_notebook.ipynb) (three passes, increasing depth)\n- [Plan C \u2014 Parallel Tracks](../plan_c/00_dashboard.ipynb) (self-directed deep dives)\n- [Plan D \u2014 Hypothesis-Driven](../plan_d/experiment_1_protection.ipynb) (experimental method)\n\n*\u2190 Previous: [Notebook 2 \u2014 Measuring Progress](02_measuring_progress.ipynb) \u00b7 [Start Here](../00_START_HERE.ipynb)*",
"metadata": {}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}