Design | Chapter 3 | Prerequisites: 2-The-Proof-Network | 24 min read | Home

3. Scoring System

3.1 Bank Alignment (adapted from ModelAtlas)

For each bank where the query specifies a direction:


def bank_score(model_signed_pos, query_direction):
    if query_direction == 0:
        return 1.0 / (1.0 + abs(model_signed_pos))  # want zero: penalize distance
    alignment = model_signed_pos * query_direction
    if alignment > 0:
        return 1.0      # on the right side
    elif alignment == 0:
        return 0.5      # at zero (neutral)
    else:
        return 1.0 / (1.0 + abs(alignment))  # wrong side: decay

Scoring composition — configurable mechanism. The default is multiplicative (bank_alignment = Π bank_score(pos_i, dir_i)), but the system supports alternative composition strategies selected via scoring.mechanism in config:


def compose_bank_scores(scores: dict[str, float], confidences: dict[str, float],
                        mechanism: str) -> float:
    if mechanism == "multiplicative":
        return math.prod(scores.values())
    elif mechanism == "confidence_weighted":
        # Banks with uncertain positions contribute less
        return math.prod(s ** c for s, c in zip(scores.values(), confidences.values()))
    elif mechanism == "soft_floor":
        epsilon = config.scoring.floor_epsilon  # default 0.1
        return math.prod(max(s, epsilon) for s in scores.values())
    elif mechanism == "geometric_mean":
        return math.prod(scores.values()) ** (1.0 / len(scores))
    elif mechanism == "log_additive":
        weights = config.scoring.bank_weights  # learned or tuned per bank
        return math.exp(sum(w * math.log(max(s, 1e-6))
                            for w, s in zip(weights, scores.values())))

Recommended starting point: confidence_weighted. The GoalAnalyzer's bank heads output softmax distributions — the max probability serves as a natural confidence estimate. Banks with uncertain positions (max prob ~0.4) automatically contribute less to the final score than banks with confident positions (max prob ~0.95). This degrades gracefully: high confidence → pure multiplicative; low confidence → near-neutral.

Missing bank penalty: If an entity lacks a position for a queried bank, score = 0.3 (not zero, but penalized). This is intentional — some entities genuinely don't have a meaningful position on some banks.

3.2 Anchor Relevance (IDF-weighted)

Three anchor lists:

  • require (hard filter): SELECT entity_id ... HAVING COUNT(DISTINCT anchor_id) = len(require). Missing any required anchor → entity excluded.
  • prefer (IDF-weighted boost): score = Σ idf(matched) / Σ idf(all_preferred). Rare anchors count more.
  • avoid (decay): Each avoided anchor present halves the score: 0.5^count.

3.3 Seed Similarity (IDF-weighted Jaccard)

When the proof context includes already-used lemmas:


seed_similarity = Σ idf(shared_anchors) / Σ idf(union_anchors)

Standard Jaccard treats all anchors equally. IDF weighting means sharing Lyapunov-monotonicity (rare) matters more than sharing equality (ubiquitous).

3.4 Final Score


final_score = bank_alignment × anchor_relevance × seed_similarity

All three components are in [0, 1]. The product ensures no averaging away bad matches. A lemma that perfectly matches domain but completely fails structure scores 0, not 0.5.



Sources: docs/WAYFINDER_DESIGN.md

See Also