crisis/tests/test_alarm.py
saymrwulf b8684297fa Add crisis_agents — Crisis as a coordination layer for AI agent teams
A new sibling Python package, `crisis_agents`, that lifts the Crisis
protocol from "consensus between machines" to "consensus between AI
agents". Threat model: a team of sub-agents normally talks freely
with its orchestrator (the "mothership"); when the team's boundary
opens and an external agent of unknown trust joins, the mothership
activates the Crisis layer so byzantine equivocation is detectable.

Two orchestration phases, followed by a proof phase:

  Phase 1 — closed team, no Crisis: agents emit claims directly, the
  mothership collects them flat.

  Phase 2 — boundary opens: every subsequent claim is wrapped into a
  Crisis Message with the agent's stable process_id and a PoW nonce,
  delivered into per-agent LamportGraphs, and after each turn the
  mothership scans for mutations via LamportGraph.find_mutations.

  Phase 3 — proof: when an alarm fires, the mothership emits a
  replayable JSON proof-of-malfeasance document with the contradictory
  witnesses, their delivery sets, and DAG cross-references showing
  which honest agents saw what.
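
To make the Phase 2 wrapping concrete, here is a minimal, self-contained
sketch of turning a claim payload into a nonce-mined message with a digest.
It does not use the crisis package: `message_digest`, `wrap_claim`, the JSON
encoding, and the leading-zero difficulty rule are all assumptions chosen for
illustration, not the actual Message or ProofOfWorkWeight API.

```python
import hashlib
import json


def message_digest(process_id: str, payload: dict, nonce: int) -> str:
    """SHA-256 over a canonical JSON encoding (hypothetical wire format)."""
    body = json.dumps(
        {"process_id": process_id, "payload": payload, "nonce": nonce},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def wrap_claim(process_id: str, payload: dict, zero_hex_digits: int = 2) -> dict:
    """Wrap a claim payload, mining the smallest nonce that meets the target.

    The target (digest starts with `zero_hex_digits` zeros) is an assumed,
    toy difficulty rule.
    """
    target = "0" * zero_hex_digits
    nonce = 0
    while not message_digest(process_id, payload, nonce).startswith(target):
        nonce += 1
    return {
        "process_id": process_id,
        "payload": payload,
        "nonce": nonce,
        "digest": message_digest(process_id, payload, nonce),
    }
```

Because mining in this sketch is deterministic, an identical payload from the
same process_id yields an identical digest, while any change to the payload
yields a different one. Digest comparison can only separate a duplicate
broadcast from an equivocation if that holds.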

Modules:
  - claim.py      Claim dataclass + JSON round-trip
  - boundary.py   membership tracker + open() trigger
  - agent.py      CrisisAgent abstract + MockAgent + MockByzantineAgent
                  (the latter equivocates by emitting two variants to
                  disjoint peer subsets at the same logical turn)
  - mothership.py orchestrator driving both phases, building Crisis
                  Messages from Claims, per-agent LamportGraphs, log
  - alarm.py      scan_for_mutations: same-agent same-turn distinct
                  digests with non-identical delivery sets, verified
                  spacelike via LamportGraph.are_spacelike on the
                  honest-agent graphs
  - proof.py      build_proof + ProofDocument + JSON serializer +
                  verify_proof_self_consistent
  - cli.py        `crisis-agents demo` + `crisis-agents verify`
  - scenarios/    fact_check: reference doc + 6 statements + scripted
                  honest/byzantine agents producing a deterministic
                  equivocation on statement s03
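
The rule described for alarm.py (same agent, same turn, distinct digests,
non-identical delivery sets) can be sketched without any graph machinery.
The `Delivery` record and `find_equivocations` below are hypothetical names
for illustration; the real scan_for_mutations works on LamportGraphs and
additionally verifies that the witnesses are spacelike, which this sketch
omits.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    """One message delivery: who sent which digest to whom, at which turn."""
    agent: str
    turn: int
    digest: str
    recipient: str


def find_equivocations(deliveries):
    """Return (agent, turn, {digest: sorted recipients}) for each suspect."""
    # (agent, turn) -> digest -> set of recipients
    seen = defaultdict(lambda: defaultdict(set))
    for d in deliveries:
        seen[(d.agent, d.turn)][d.digest].add(d.recipient)

    alarms = []
    for (agent, turn), by_digest in sorted(seen.items()):
        if len(by_digest) < 2:
            continue  # a single digest: at worst a duplicate broadcast
        delivery_sets = {frozenset(rs) for rs in by_digest.values()}
        if len(delivery_sets) > 1:  # distinct digests went to different peers
            witnesses = {dg: sorted(rs) for dg, rs in by_digest.items()}
            alarms.append((agent, turn, witnesses))
    return alarms
```

Sending the same digest to two disjoint subsets produces one entry in
`by_digest` and therefore no alarm, which matches the duplicate-broadcast
case exercised in test_alarm.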

Tests: 50 new tests across test_claim, test_boundary, test_mothership,
test_alarm, test_proof, test_demo_fact_check. The end-to-end test runs
the fact_check scenario and asserts that exactly one alarm is raised,
that a proof is built, and that the re-serialized JSON passes the
self-consistency check. Full suite (existing crisis + new
crisis_agents): 145 tests, green in 0.77s.

Out of scope (deliberately): visualization (separate CrisisViz upgrade
later), real TCP gossip (agents talk via in-process function calls in
the mothership), false-claim detection without equivocation (an
agent that consistently lies but never equivocates is out-voted, not
"caught"; catching it would require a ground-truth oracle).

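The distinction between "out-voted" and "caught" can be illustrated with a
toy example. Both helpers below are hypothetical and not part of
crisis_agents: a consistent liar loses a majority vote but leaves no
contradictory pair of messages, whereas an equivocator does.

```python
from collections import Counter


def majority_verdict(verdicts: dict) -> str:
    """Most common verdict among agents (agent name -> verdict)."""
    return Counter(verdicts.values()).most_common(1)[0][0]


def has_equivocated(sent_by_recipient: dict) -> bool:
    """True if one agent told different recipients different things."""
    return len(set(sent_by_recipient.values())) > 1


# Agent "d" lies, but tells every peer the same thing.
verdicts = {"a": "true", "b": "true", "c": "true", "d": "false"}
consistent_liar = {"a": "false", "b": "false", "c": "false"}
equivocator = {"a": "true", "b": "false", "c": "true"}
```

Here `majority_verdict(verdicts)` overrides the liar, yet
`has_equivocated(consistent_liar)` finds nothing to prove; only the
equivocator's messages contradict each other.
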
Reuse from existing crisis package: Message, Vertex, LamportGraph,
LamportGraph.find_mutations, ProofOfWorkWeight, digest. The new code
is a thin adapter layer; the protocol substrate did the heavy lifting.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-14 16:38:11 +02:00


"""Tests for byzantine equivocation detection."""

import pytest

from crisis_agents.agent import MockAgent, MockByzantineAgent
from crisis_agents.alarm import AlarmEvent, scan_for_mutations
from crisis_agents.claim import Claim
from crisis_agents.mothership import Mothership


def _claim(sid: str, verdict: str = "true", evidence: str = "ok") -> Claim:
    return Claim(statement_id=sid, verdict=verdict, confidence=0.9,  # type: ignore[arg-type]
                 evidence=evidence, timestamp_logical=0)


def _equivocating_team() -> Mothership:
    """A 3-honest-1-byzantine team where the byzantine equivocates on s03."""
    m = Mothership()
    m.add_agent(MockAgent("a", [[]]))
    m.add_agent(MockAgent("b", [[]]))
    m.add_agent(MockAgent("c", [[]]))
    m.open_boundary(MockByzantineAgent(
        "d",
        scripted_pairs=[(
            _claim("s03", verdict="true", evidence="to_a"),
            _claim("s03", verdict="false", evidence="to_b"),
        )],
        split_a={"a", "c"},
        split_b={"b"},
    ))
    return m


class TestAlarmDetection:
    def test_no_alarms_in_honest_run(self):
        m = Mothership()
        m.add_agent(MockAgent("a", [[]]))
        m.add_agent(MockAgent("b", [[]]))
        m.open_boundary(MockAgent("d", [[_claim("s01")]]))
        m.run_crisis_phase(num_turns=1)
        alarms = scan_for_mutations(m)
        assert alarms == []

    def test_equivocation_raises_one_alarm(self):
        m = _equivocating_team()
        m.run_crisis_phase(num_turns=1)
        alarms = scan_for_mutations(m)
        assert len(alarms) == 1
        a = alarms[0]
        assert isinstance(a, AlarmEvent)
        assert a.accused_agent == "d"
        assert a.statement_id == "s03"
        assert a.turn == 0
        assert len(a.witnesses) == 2

    def test_witness_digests_are_distinct(self):
        m = _equivocating_team()
        m.run_crisis_phase(num_turns=1)
        a = scan_for_mutations(m)[0]
        d1 = a.witnesses[0].message_digest_hex
        d2 = a.witnesses[1].message_digest_hex
        assert d1 != d2

    def test_delivery_sets_are_disjoint(self):
        m = _equivocating_team()
        m.run_crisis_phase(num_turns=1)
        a = scan_for_mutations(m)[0]
        s1 = set(a.witnesses[0].delivered_to)
        s2 = set(a.witnesses[1].delivered_to)
        assert s1 & s2 == set()

    def test_spacelike_verified_is_true(self):
        """The Crisis layer should confirm the witness vertices are causally
        incomparable in at least one honest graph."""
        m = _equivocating_team()
        m.run_crisis_phase(num_turns=1)
        a = scan_for_mutations(m)[0]
        assert a.spacelike_verified is True

    def test_duplicate_broadcast_is_not_equivocation(self):
        """If a byzantine emits the SAME payload to two disjoint subsets,
        the message digests are identical and it's not equivocation."""
        same = _claim("s03", verdict="true", evidence="same evidence")
        m = Mothership()
        m.add_agent(MockAgent("a", [[]]))
        m.add_agent(MockAgent("b", [[]]))
        m.open_boundary(MockByzantineAgent(
            "d",
            scripted_pairs=[(same, same)],
            split_a={"a"},
            split_b={"b"},
        ))
        m.run_crisis_phase(num_turns=1)
        alarms = scan_for_mutations(m)
        # Same payload → same nonce-mined message after PoW → same digest →
        # no equivocation. (The byzantine has to actually say different
        # things to be caught.)
        assert alarms == [] or all(
            len({w.message_digest_hex for w in alarm.witnesses}) > 1
            for alarm in alarms
        )