
feat(format): glm-v1 + gnn-v1 + learned-position-embedding-v1 13-gate PARTIAL discharge#1403

Open
noahgift wants to merge 1 commit into main from feat/glm-gnn-pos-001-013-partial-discharge

Conversation

@noahgift (Contributor) commented May 2, 2026

Summary

Bundle of three sister contracts:

  • glm-v1 (FALSIFY-GLM-001..004): link round-trip, mean range, IRLS monotone, finite preds
  • gnn-v1 (FALSIFY-GNN-001..006): GCN node count, message-pass, mean/max pool, dim, finite
  • learned-position-embedding-v1 (FALSIFY-POS-001..003): OOB rejection, deterministic, output dim

35 unit tests, including a 6-bucket OOB sweep and a 3-direction IRLS monotonicity sweep.
Algorithm-level coverage advances by 13 gates; runtime ship % unchanged.

Gates bound

| Gate ID | Rule |
| --- | --- |
| GLM-001 | g(g⁻¹(η)) ≈ η within 1e-4 |
| GLM-002 | predicted mean in valid family range (Poisson/Gamma > 0; Binomial ∈ (0,1)) |
| GLM-003 | IRLS deviance monotone non-increasing |
| GLM-004 | every prediction finite |
| GNN-001/002 | input/output node count preserved |
| GNN-003/006 | every output finite |
| GNN-004 | max-pool output ≤ per-feature node max |
| GNN-005 | pooling preserves feature dim |
| POS-001 | pos ≥ max_positions returns Err (no silent truncation) |
| POS-002 | PE(pos) bit-deterministic |
| POS-003 | PE(pos).len() == d_model |
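The GLM-002 rule above can be sketched as a standalone predicate. This is an illustrative sketch only, not the crate's actual API; the `Family` enum and `mean_in_range` helper are hypothetical names:

```rust
// Hypothetical sketch of the GLM-002 family-specific mean-range check.
#[derive(Clone, Copy)]
enum Family {
    Gaussian,
    Poisson,
    Gamma,
    Binomial,
}

fn mean_in_range(family: Family, mu: f64) -> bool {
    // A non-finite mean fails every family (overlaps with GLM-004).
    if !mu.is_finite() {
        return false;
    }
    match family {
        Family::Gaussian => true,                     // unbounded mean
        Family::Poisson | Family::Gamma => mu > 0.0,  // strictly positive
        Family::Binomial => mu > 0.0 && mu < 1.0,     // open unit interval
    }
}

fn main() {
    assert!(mean_in_range(Family::Poisson, 2.5));
    // The regression class called out below: Binomial mu = 1.5 must fail.
    assert!(!mean_in_range(Family::Binomial, 1.5));
    assert!(mean_in_range(Family::Gaussian, -3.0));
    assert!(!mean_in_range(Family::Gamma, f64::NAN));
}
```

A single "all positive" check would pass `Binomial, 1.5`; the per-family match is what makes the gate falsifiable.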

Five Whys

See commit message — captures family-specific GLM ranges, bit-exact for POS-002, and inclusive boundary for POS-001.

Test plan

  • cargo test -p aprender-core --lib glm_gnn_pos — 35 passed
  • PMAT pre-commit gates green
  • CI green

🤖 Generated with Claude Code

… PARTIAL discharge

Bundle of three sister contracts in one verdict module:

glm-v1 (FALSIFY-GLM-001..004):
- GLM-001: link round-trip g(g^{-1}(eta)) ≈ eta within 1e-4
- GLM-002: predicted mean in valid family-specific range
- GLM-003: IRLS deviance monotone non-increasing
- GLM-004: predictions finite for bounded input
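The GLM-003 monotonicity gate reduces to a check over the recorded deviance trace. A minimal sketch, assuming the fitted model exposes its per-iteration deviances as a slice (the helper name is illustrative):

```rust
// Sketch: GLM-003 asserts the IRLS deviance trace never increases
// (up to a small tolerance for floating-point noise).
fn deviance_monotone(trace: &[f64], tol: f64) -> bool {
    trace.windows(2).all(|w| w[1] <= w[0] + tol)
}

fn main() {
    let healthy = [10.0, 4.2, 1.1, 0.9, 0.9]; // flat steps are allowed
    let broken = [10.0, 4.2, 5.0, 0.9];       // deviance jumped back up
    assert!(deviance_monotone(&healthy, 1e-12));
    assert!(!deviance_monotone(&broken, 1e-12));
}
```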

gnn-v1 (FALSIFY-GNN-001..006):
- GNN-001: GCN preserves node count
- GNN-002: message-passing preserves node count
- GNN-003: global mean-pool produces finite output
- GNN-004: global max-pool ≤ per-feature node max
- GNN-005: pooling preserves feature dimension
- GNN-006: GCN output finite for finite input
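The pooling gates GNN-004 and GNN-005 can be illustrated with a toy global max-pool over dense node features. A sketch under the assumption that node features are `Vec<f64>` rows (not the crate's real types):

```rust
// Illustrative global max-pool over node features (not the aprender-core API).
fn global_max_pool(nodes: &[Vec<f64>]) -> Vec<f64> {
    let dim = nodes[0].len();
    (0..dim)
        .map(|f| nodes.iter().map(|n| n[f]).fold(f64::NEG_INFINITY, f64::max))
        .collect()
}

fn main() {
    let nodes = vec![vec![1.0, -2.0], vec![3.0, 0.5], vec![2.0, 0.0]];
    let pooled = global_max_pool(&nodes);
    // GNN-005: pooling preserves feature dimension.
    assert_eq!(pooled.len(), nodes[0].len());
    // GNN-004: each pooled value is bounded by the per-feature node max.
    for (f, &p) in pooled.iter().enumerate() {
        let node_max = nodes.iter().map(|n| n[f]).fold(f64::NEG_INFINITY, f64::max);
        assert!(p <= node_max);
    }
}
```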

learned-position-embedding-v1 (FALSIFY-POS-001..003):
- POS-001: pos ≥ max_positions returns Err (no silent truncation)
- POS-002: PE(pos) bit-deterministic across calls
- POS-003: PE(pos).len() == d_model
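The three POS gates can be sketched together against a toy lookup table. The `LearnedPositionEmbedding` struct and `embed` method here are hypothetical stand-ins, not the crate's real API:

```rust
// Hypothetical sketch: learned position embedding as a plain table,
// with the inclusive >= OOB check of POS-001.
struct LearnedPositionEmbedding {
    table: Vec<Vec<f32>>, // max_positions rows, each d_model long
}

impl LearnedPositionEmbedding {
    fn embed(&self, pos: usize) -> Result<&[f32], String> {
        if pos >= self.table.len() {
            // POS-001: return Err, never silently truncate or clamp.
            return Err(format!("position {pos} >= max_positions {}", self.table.len()));
        }
        // POS-002: a pure table index — no float arithmetic, so the
        // returned row is bit-identical across calls.
        Ok(&self.table[pos])
    }
}

fn main() {
    let d_model = 4;
    let pe = LearnedPositionEmbedding { table: vec![vec![0.5; d_model]; 8] };
    assert_eq!(pe.embed(0).unwrap().len(), d_model); // POS-003
    assert!(pe.embed(8).is_err()); // boundary: pos == max_positions is OOB
    assert!(pe.embed(9).is_err());
}
```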

## Five Whys

1. Why bundle these three contracts? They span the GLM/GNN/positional-
   embedding coverage band; one verdict module captures all 13 gates
   without duplicate provenance pin overhead.
2. Why does this block ship? Coverage % cannot move while these
   peripheral ML invariants are unbound at PARTIAL_ALGORITHM_LEVEL.
3. Why family-specific ranges for GLM-002? Different exponential-
   family distributions have different valid mean domains:
   Poisson/Gamma require mu > 0, Binomial requires 0 < p < 1, Gaussian
   is unbounded. A single "all positive" check would miss the
   regression class "Binomial mu = 1.5" — the tighter bounds catch
   each family's specific failure mode.
4. Why bit-exact (`to_bits()`) for POS-002? PE(pos) is a pure index
   into a frozen embedding table — no float arithmetic, just a
   memory load. Any drift between calls indicates a non-determinism
   leak (e.g., random padding selected for tie-breaking). Float-
   tolerant compare would mask exactly that regression class.
5. Why pass at boundary (pos == max_positions, returned_err=true)
   for POS-001? The contract specifies "pos >= max_positions" as
   OOB. The boundary case (`pos == max_positions`) is exactly the
   regression class — without the gate it's the most common
   off-by-one bug. Asserting `Pass` when err is correctly raised
   at the exact boundary locks in the inclusive ≥ semantics.
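The bit-exact comparison from Why #4 can be sketched with `f32::to_bits()` (the `bit_identical` helper is illustrative, not the crate's name for it):

```rust
// Sketch of the POS-002 bit-exact compare: match bit patterns via
// to_bits(), not an epsilon, so any drift at all fails the gate.
fn bit_identical(a: &[f32], b: &[f32]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}

fn main() {
    let row = [0.1f32, -0.0, 3.5];
    assert!(bit_identical(&row, &row));
    // 0.0 and -0.0 compare equal as floats, but their bit patterns differ —
    // exactly the kind of drift an epsilon compare would mask.
    assert!(0.0f32 == -0.0f32);
    assert!(!bit_identical(&[0.0f32], &[-0.0f32]));
}
```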

Adds 35 unit tests including 6-bucket OOB position sweep and
3-direction IRLS monotonicity sweep. Realistic-healthy walks the
canonical 100-node GNN + Poisson GLM + 768-dim PE; pre-fix walks
13 simultaneous regressions across all three contracts.

No runtime % shift; algorithm-level coverage advances by 13 gates.