Module Architecture

Design | Chapter 8 | Prerequisites: 1-Design-Thesis | 24 min read | Home

8. Module Architecture

All modules are implemented. The codebase consists of 28 source files and 8 scripts.

8.0 Three-Lane Verification Architecture

Proof verification operates across three lanes at different granularities:

Lane	Purpose	Backend	Latency	Metric Category
A	Step-wise tactic search (inner loop)	Pantograph (local)	~1ms/tactic	`raw_success`
B	Proof audit: verify, repair, decompose	Axle API (cloud)	~100-200ms/call	`axle_assisted_success`, `axle_repair_only`
C	High-assurance recheck (leaderboard only)	lean4checker / Comparator / SafeVerify	seconds	independent validation

Lane A is the core search loop (proof_search.py → lean_interface.py). It applies tactics one at a time within a 600-step budget.

Lane B is the ProofAuditor service (proof_auditor.py → Axle API). It operates on complete or near-complete proofs after Lane A finishes. Operations: verify_proof (authoritative check), repair_proofs (close remaining goals with terminal tactics), sorry2lemma (decompose failures into subgoals for retry). Lane B never inflates Lane A metrics — results are tagged with strict SuccessCategory.

Lane C is optional and reserved for results submitted to benchmarks/leaderboards. Axle itself notes that verify_proof trusts the Lean environment and is not adversarial-safe (see Axle docs). For high-assurance, use lean4checker, Comparator, or SafeVerify.

Cost controls (Lane B):

Content-hash cache (SHA-256) avoids redundant Axle API calls
Caller filters to top-N candidates or high critic score before invoking
Bounded async concurrency (max_concurrency in config)
Graceful degradation on timeout/429/503 (returns failed AuditResult, search continues)

Metric separation prevents inflated claims. Benchmark reports always show:

raw_success: proved by Lane A alone, no Axle involvement
axle_assisted_success: Lane A partial + Axle decompose/repair
axle_repair_only: Axle repair_proofs closed remaining goals

Scope caveat: Axle is designed for simple imports/theorems/definitions. Tool count and API surface are evolving (v1.0.0, March 2026). Pin environment aggressively in config; update only with Mathlib version bumps.


src/
├── encoder.py               GoalEncoder (math-native model, see Research §3.2)
├── goal_analyzer.py          GoalAnalyzer (features + bank positions + anchor logits)
├── bridge.py                 InformationBridge (continuous → discrete bottleneck)
├── proof_navigator.py        ProofNavigator (ternary navigational decoder)
├── proof_network.py          Proof network database interface
├── resolution.py             Structured resolution (navigate + spread)
├── proof_search.py           Lane A: outer search loop with Lean interaction
├── lean_interface.py         Lane A: Pantograph-based Lean kernel interaction
├── proof_auditor.py          Lane B: Axle-based verification/repair/decomposition
├── nav_contracts.py          Navigational data contracts (NavigationalExample, NavOutput, etc.)
├── ternary_decoder.py        TernaryLinear, ternary_quantize (shared)
├── domain_gate.py            OOD detection gate
├── losses.py                 NavigationalLoss (UW-SO adaptive)
├── lowering.py               Deterministic tactic lowering
├── contracts.py              Core data contracts (ProofExample, etc.)
├── data.py                   Dataset loading (NavigationalDataset, etc.)
├── verification.py           Proof verification
├── pab_profile.py            PAB profile artifact
├── pab_metrics.py            PAB metric functions
├── pab_tracker.py            Checkpoint accumulation
├── behavioral_fingerprint.py Behavioral signatures
├── trainer.py                Training loop
├── trainer_steps.py          Train step mixin
├── trainer_checks.py         Gradient health, logging
├── trainer_constants.py      Domain inference
├── trainer_setup.py          Pipeline construction
├── trainer_validation.py     Proof validation
└── evaluate.py               Evaluation harness

scripts/
├── extract_proof_network.py    Populate proof network from Mathlib
├── build_nav_training_data.py  Convert proof traces to nav labels
├── anchor_gap_analysis.py      Iterative anchor gap analysis
├── tactic_maps.py              Tactic-to-direction mapping tables
├── train_navigator.py          Training script with curriculum
├── eval_retrieval.py           Nav vs dense retrieval comparison
├── eval_spreading.py           Spreading activation evaluation
└── run_benchmark.py            MiniF2F + Mathlib benchmark runner

configs/
└── wayfinder.yaml              Full Wayfinder experiment configuration

data/  (generated/downloaded at runtime)
├── proof_network.db            SQLite semantic network
├── nav_training.jsonl             Navigational training data
├── nav_eval.jsonl              Navigational eval data (frozen)
└── leandojo_mathlib.jsonl      LeanDojo extracted dataset

8.1 Navigational Modules

proof_navigator.py: Replaces ternary_decoder.py's vocabulary classification heads with 6 navigational direction heads (one per navigable bank). Inherits TernaryLinear for hidden layers. Outputs 6 ternary directions (3^6 = 729 coarse bins) + continuous anchor logits + progress + critic. Configurable via navigable_banks for graceful degradation to 3-bank.

proof_network.py: SQLite interface for the semantic network. Adapted from ModelAtlas's db.py. Schema: entities, entity_positions, anchors, entity_anchors, entity_links, anchor_idf. Core functions: navigate(), spread(), get_anchor_set(), compute_idf(), batch_get_positions().

resolution.py: Converts ProofNavigator output to concrete tactics and premises. Builds a StructuredQuery from ternary directions + top-k anchors. Calls navigate() on the tactic and premise entity tables. Combines with spread() scores from proof context. Returns ranked (tactic, premise_list) pairs.

proof_search.py: Outer search loop. Manages open goals, proof context, search budget. Calls the neural pipeline once per goal, resolution symbolically, and the Lean kernel for verification. Goal selection uses critic (solvability estimate) and progress (remaining steps). Uses Pipeline dataclass bundling encoder/analyzer/bridge/navigator.

lean_interface.py (Lane A): Wrapper around LeanDojo's Pantograph interaction protocol. Provides try_tactic(goal_state, tactic_text) -> TacticResult with success/failure and new goal states. Supports stub backend for offline testing. Configurable via lean.backend in config.

proof_auditor.py (Lane B): Axle API wrapper for proof verification, repair, and subgoal decomposition. ProofAuditor service with verify(), repair(), decompose(), check(), and extract_theorems(). Content-hash cache, bounded concurrency, graceful degradation on API errors. All results tagged with SuccessCategory (raw/assisted/repair_only). Optional dependency: pip install wayfinder[axle].

nav_contracts.py: Navigational data contracts including NavigationalExample, NavOutput, StructuredQuery, Candidate, SearchContext, BANK_NAMES.

8.2 Retained Modules

All modules from the BalancedSashimi port are retained. The GoalAnalyzer is extended (not replaced) with bank and anchor heads. The InformationBridge is unchanged. The TernaryLinear and STE infrastructure is shared between the existing TernaryDecoder and the new ProofNavigator.

Sources: docs/WAYFINDER_DESIGN.md

8. Module Architecture

8.0 Three-Lane Verification Architecture

8.1 Navigational Modules

8.2 Retained Modules

See Also