feat(format): glm-v1 + gnn-v1 + learned-position-embedding-v1 13-gate PARTIAL discharge#1403
Triple bundle of three sister contracts in one verdict module:
glm-v1 (FALSIFY-GLM-001..004):
- GLM-001: link round-trip g(g^{-1}(eta)) ≈ eta within 1e-4
- GLM-002: predicted mean in valid family-specific range
- GLM-003: IRLS deviance monotone non-increasing
- GLM-004: predictions finite for bounded input
gnn-v1 (FALSIFY-GNN-001..006):
- GNN-001: GCN preserves node count
- GNN-002: message-passing preserves node count
- GNN-003: global mean-pool produces finite output
- GNN-004: global max-pool ≤ per-feature node max
- GNN-005: pooling preserves feature dimension
- GNN-006: GCN output finite for finite input
learned-position-embedding-v1 (FALSIFY-POS-001..003):
- POS-001: pos ≥ max_positions returns Err (no silent truncation)
- POS-002: PE(pos) bit-deterministic across calls
- POS-003: PE(pos).len() == d_model
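As an illustrative sketch of what a gate like GLM-001 checks (names here are hypothetical, not aprender's API), the link round-trip for the logit link can be verified directly:

```rust
// Hypothetical sketch of GLM-001: g(g^{-1}(eta)) ≈ eta within 1e-4.
// `logit` plays g and `sigmoid` plays g^{-1} for the Binomial family.

fn sigmoid(eta: f64) -> f64 {
    1.0 / (1.0 + (-eta).exp())
}

fn logit(mu: f64) -> f64 {
    (mu / (1.0 - mu)).ln()
}

/// The round-trip must reproduce eta within the contract tolerance.
fn round_trip_ok(eta: f64) -> bool {
    (logit(sigmoid(eta)) - eta).abs() < 1e-4
}

fn main() {
    // Sweep a few linear-predictor values on both sides of zero.
    for &eta in &[-3.0, -0.5, 0.0, 0.5, 3.0] {
        assert!(round_trip_ok(eta));
    }
    println!("GLM-001 round-trip holds");
}
```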
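Similarly, the pooling gates GNN-004 and GNN-005 can be sketched with a stand-in `global_max_pool` (an illustrative name, not the crate's real function):

```rust
// Sketch of GNN-004/GNN-005 over node features stored row-per-node.
// `global_max_pool` is a stand-in, not aprender's actual API.

fn global_max_pool(nodes: &[Vec<f64>]) -> Vec<f64> {
    let d = nodes[0].len();
    (0..d)
        .map(|j| {
            nodes
                .iter()
                .map(|n| n[j])
                .fold(f64::NEG_INFINITY, f64::max)
        })
        .collect()
}

fn main() {
    let nodes = vec![vec![1.0, -2.0], vec![0.5, 4.0], vec![3.0, 0.0]];
    let pooled = global_max_pool(&nodes);
    // GNN-005: pooling preserves the feature dimension.
    assert_eq!(pooled.len(), 2);
    // GNN-004: each pooled value equals (hence is bounded by) the
    // per-feature node-wise maximum.
    assert_eq!(pooled, vec![3.0, 4.0]);
}
```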
## Five Whys
1. Why bundle these three contracts? They span the GLM/GNN/positional-
embedding coverage band; one verdict module captures all 13 gates
without duplicate provenance pin overhead.
2. Why does this block ship? Coverage % cannot move while these
peripheral ML invariants are unbound at PARTIAL_ALGORITHM_LEVEL.
3. Why family-specific ranges for GLM-002? Different exponential-
family distributions have different valid mean domains:
Poisson/Gamma require mu > 0, Binomial requires 0 < p < 1, Gaussian
is unbounded. A single "all positive" check would miss the
regression class "Binomial mu = 1.5" — the tighter bounds catch
each family's specific failure mode.
4. Why bit-exact (`to_bits()`) for POS-002? PE(pos) is a pure index
into a frozen embedding table — no float arithmetic, just a
memory load. Any drift between calls indicates a non-determinism
leak (e.g., random padding selected for tie-breaking). Float-
tolerant compare would mask exactly that regression class.
5. Why pass at boundary (pos == max_positions, returned_err=true)
for POS-001? The contract specifies "pos >= max_positions" as
OOB. The boundary case (`pos == max_positions`) is exactly the
regression class — without the gate it's the most common
off-by-one bug. Asserting `Pass` when err is correctly raised
at the exact boundary locks in the inclusive ≥ semantics.
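The family-specific ranges from point 3 can be sketched as a simple match; the `Family` enum and `mean_in_range` names are illustrative only:

```rust
// Hedged sketch of GLM-002: each exponential family constrains the
// fitted mean differently. Names are illustrative, not aprender's API.

enum Family {
    Gaussian,
    Poisson,
    Gamma,
    Binomial,
}

fn mean_in_range(family: &Family, mu: f64) -> bool {
    match family {
        Family::Gaussian => mu.is_finite(),          // unbounded
        Family::Poisson | Family::Gamma => mu > 0.0, // strictly positive
        Family::Binomial => mu > 0.0 && mu < 1.0,    // open unit interval
    }
}

fn main() {
    // The regression class named above: Binomial mu = 1.5 must fail,
    // even though a naive "all positive" check would pass it.
    assert!(!mean_in_range(&Family::Binomial, 1.5));
    assert!(mean_in_range(&Family::Poisson, 1.5));
}
```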
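The bit-exact comparison from point 4 reduces to comparing `to_bits()` on each element, which a float-tolerant compare would not catch:

```rust
// Sketch of the POS-002 comparison: two lookups of the same position
// must agree bit-for-bit via `to_bits()`, not within a tolerance.

fn bit_identical(a: &[f32], b: &[f32]) -> bool {
    a.len() == b.len()
        && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}

fn main() {
    let first = vec![0.25_f32, -1.5, 3.0];
    let second = first.clone(); // a pure table load returns identical bits
    assert!(bit_identical(&first, &second));
    // A tolerant compare would accept this tiny drift — exactly the
    // non-determinism leak the bit-exact gate refuses to mask.
    assert!(!bit_identical(&[0.25_f32], &[0.25_f32 + 1e-7]));
}
```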
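And the inclusive-boundary semantics from point 5 can be pinned down with a minimal `lookup` (an assumed shape, not the real embedding type):

```rust
// Sketch of POS-001 at the boundary: pos == max_positions sits on the
// inclusive >= edge and must return Err. `lookup` is illustrative.

fn lookup(table: &[Vec<f32>], pos: usize) -> Result<&[f32], String> {
    if pos >= table.len() {
        return Err(format!("pos {pos} >= max_positions {}", table.len()));
    }
    Ok(&table[pos])
}

fn main() {
    let max_positions = 8;
    let table = vec![vec![0.0_f32; 16]; max_positions];
    // Exact boundary: the most common off-by-one regression.
    assert!(lookup(&table, max_positions).is_err());
    // Last valid index still succeeds.
    assert!(lookup(&table, max_positions - 1).is_ok());
}
```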
Adds 35 unit tests including 6-bucket OOB position sweep and
3-direction IRLS monotonicity sweep. Realistic-healthy walks the
canonical 100-node GNN + Poisson GLM + 768-dim PE; pre-fix walks
13 simultaneous regressions across all three contracts.
No runtime % shift; algorithm-level coverage advances by 13 gates.
Summary
Triple bundle of three sister contracts:
- glm-v1 (FALSIFY-GLM-001..004): link round-trip, mean range, IRLS monotone, finite preds
- gnn-v1 (FALSIFY-GNN-001..006): GCN node count, message-pass, mean/max pool, dim, finite
- learned-position-embedding-v1 (FALSIFY-POS-001..003): OOB rejection, deterministic, output dim

35 unit tests including 6-bucket OOB sweep + 3-direction IRLS monotonicity sweep.
Algorithm-level coverage advances by 13 gates; runtime ship % unchanged.
Gates bound
- `g(g⁻¹(η)) ≈ η` within 1e-4
- `pos ≥ max_positions` returns Err (no silent truncation)

Five Whys
See commit message — captures family-specific GLM ranges, bit-exact compare for POS-002, and inclusive ≥ boundary for POS-001.

Test plan
`cargo test -p aprender-core --lib glm_gnn_pos` — 35 passed

🤖 Generated with Claude Code