feat(format): cpu-work-stealing-v1 + encoder-forward-v1 8-gate PARTIAL discharge#1397

Open

noahgift wants to merge 2 commits into main from feat/ws-enc-001-008-partial-discharge

Conversation


@noahgift noahgift commented May 2, 2026

Summary

Bundles two sister contracts in one verdict module:

  • cpu-work-stealing-v1 (FALSIFY-WS-001..004): dispatch overhead, L1 miss rate, Rayon parity, scaling efficiency
  • encoder-forward-v1 (FALSIFY-ENC-001..004): shape preservation, finite output, HF reference, CLS pooling

Adds 28 unit tests, including a 6-bucket scaling sweep and a 5-bucket layer-count sweep.
Algorithm-level coverage advances by 8 gates; the runtime ship percentage is unchanged.

Gates bound

| Gate ID | Rule |
| --- | --- |
| WS-001 | dispatch overhead < 1 ms per forward pass |
| WS-002 | L1 cache miss rate < 5% |
| WS-003 | matvec parity vs Rayon within 1e-6 |
| WS-004 | 4-thread throughput ≥ 3.5× single-thread |
| ENC-001 | 12 layers × 768 hidden × seq-len preserved |
| ENC-002 | every output element finite |
| ENC-003 | aprender vs HF reference within 1e-4 |
| ENC-004 | CLS pooling bit-exact == encoder_output[0] |
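The ENC-004 gate can be sketched as a bit-pattern comparison. This is a minimal, hypothetical illustration: `cls_pooling_bit_exact`, `encoder_output`, and `cls_embedding` are stand-in names, not aprender-core's actual API.

```rust
/// Hypothetical ENC-004-style check: the CLS embedding must be a
/// bit-exact copy of encoder_output[0]. Bit comparison via to_bits()
/// is stricter than `==` (it distinguishes -0.0 from 0.0).
fn cls_pooling_bit_exact(encoder_output: &[Vec<f32>], cls_embedding: &[f32]) -> bool {
    encoder_output[0].len() == cls_embedding.len()
        && encoder_output[0]
            .iter()
            .zip(cls_embedding)
            .all(|(a, b)| a.to_bits() == b.to_bits())
}

fn main() {
    let output = vec![vec![0.1_f32, -0.0, 2.5], vec![1.0, 2.0, 3.0]];

    // An exact copy of row 0 passes the gate.
    let cls = output[0].clone();
    assert!(cls_pooling_bit_exact(&output, &cls));

    // -0.0 == 0.0 under float equality, but their bit patterns differ,
    // so a pooler that "normalized" the sign would be caught here.
    assert!(!cls_pooling_bit_exact(&output, &[0.1, 0.0, 2.5]));
    println!("ENC-004 sketch passed");
}
```

A plain `==` comparison would let sign-normalization or other value-preserving rewrites slip through; `to_bits()` rejects anything but a literal row copy.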

Five Whys

See commit message — captures strict < for WS-001 dispatch budget, bit-exact for ENC-004 CLS pooling, and why ENC-001 models three invariants independently.

Test plan

  • cargo test -p aprender-core --lib ws_enc — 28 passed
  • PMAT pre-commit gates green
  • CI green

🤖 Generated with Claude Code

…L discharge

Bundles two sister contracts in one verdict module:

cpu-work-stealing-v1 (FALSIFY-WS-001..004):
- WS-001: dispatch overhead < 1ms per forward pass
- WS-002: L1 cache miss rate < 5% during inner loop
- WS-003: matvec parity vs Rayon within 1e-6
- WS-004: 4-thread throughput ≥ 3.5× single-thread
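The WS-003 parity gate above can be illustrated with a self-contained sketch. This uses scoped std threads as a stand-in for the work-stealing scheduler (and avoids a Rayon dependency); `matvec_seq` and `matvec_parallel` are hypothetical names, not the aprender kernel.

```rust
use std::thread;

/// Sequential reference matvec: out[i] = sum_j mat[i][j] * v[j].
fn matvec_seq(mat: &[Vec<f32>], v: &[f32]) -> Vec<f32> {
    mat.iter()
        .map(|row| row.iter().zip(v).map(|(a, b)| a * b).sum())
        .collect()
}

/// Chunked parallel matvec with scoped threads -- a stand-in for the
/// work-stealing kernel, not the real aprender scheduler.
fn matvec_parallel(mat: &[Vec<f32>], v: &[f32], n_threads: usize) -> Vec<f32> {
    let chunk = (mat.len() + n_threads - 1) / n_threads;
    let mut out = Vec::with_capacity(mat.len());
    thread::scope(|s| {
        let handles: Vec<_> = mat
            .chunks(chunk)
            .map(|rows| s.spawn(move || matvec_seq(rows, v)))
            .collect();
        for h in handles {
            out.extend(h.join().unwrap());
        }
    });
    out
}

fn main() {
    let mat: Vec<Vec<f32>> = (0..64)
        .map(|i| (0..64).map(|j| ((i * 64 + j) % 7) as f32 * 0.125).collect())
        .collect();
    let v: Vec<f32> = (0..64).map(|j| (j % 5) as f32 * 0.25).collect();

    let reference = matvec_seq(&mat, &v);
    let parallel = matvec_parallel(&mat, &v, 4);

    // WS-003-style gate: elementwise parity within 1e-6.
    let max_diff = reference
        .iter()
        .zip(&parallel)
        .map(|(a, b)| (a - b).abs())
        .fold(0.0_f32, f32::max);
    assert!(max_diff < 1e-6);
    println!("parity sketch passed: max_diff = {max_diff}");
}
```

Because each row's reduction order is identical in both paths, parity here is exact; the 1e-6 tolerance in the real gate allows for schedulers that reorder partial sums.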

encoder-forward-v1 (FALSIFY-ENC-001..004):
- ENC-001: 12 encoder layers preserve (n, 768) shape
- ENC-002: every output element finite for inputs in [-10, 10]
- ENC-003: aprender output matches HF reference within 1e-4
- ENC-004: CLS pooling extracts encoder_output[0] bit-exactly

## Five Whys

1. Why bundle these two contracts? Both peripheral, span the
   parallel-runtime + encoder-forward coverage band; one verdict
   module captures both without duplicate provenance pin.
2. Why does this block ship? Coverage % cannot move while these
   peripheral contracts are unbound at PARTIAL_ALGORITHM_LEVEL.
3. Why strict `<` for WS-001 (not `<=`)? The contract says
   "< 1ms per forward pass." Equality at exactly 1ms would mean
   the dispatcher is consuming the entire budget — there's no
   room for the actual matmul. Strict `<` catches the regression
   class "atomic contention saturated the overhead window."
4. Why bit-exact (`to_bits()`) for ENC-004 (CLS pooling)? The
   spec calls it "extract row 0" — pooling is a pure index
   operation, no float arithmetic. Any drift between
   `encoder_output[0]` and `cls_embedding` indicates the pooler
   is averaging or selecting a different row, not just precision
   loss. Strict bit-equal catches the regression class.
5. Why a separate dimension check for ENC-001 (`AC_ENC_HIDDEN_DIM`
   AND `AC_ENC_LAYER_COUNT` AND seq-len preservation)? The
   contract bundles three invariants — count of layers, hidden
   dim 768, sequence length preserved through layers. A single
   shape-equal check would let "layer dropped a dim AND added
   another to compensate" pass. Modeling the three invariants
   separately catches every mutation class independently.
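The three-independent-invariants argument in Why #5 can be sketched as follows. The constant names mirror the `AC_ENC_*` identifiers mentioned above, but the function and its signature are assumptions for illustration, not aprender-core's real code.

```rust
// Assumed constants mirroring the AC_ENC_* identifiers in the text.
const AC_ENC_LAYER_COUNT: usize = 12;
const AC_ENC_HIDDEN_DIM: usize = 768;

/// One check per invariant, so each mutation class fails independently:
/// [layer count, hidden dim, seq-len preservation].
fn enc_001_invariants(layer_shapes: &[(usize, usize)], input_seq_len: usize) -> [bool; 3] {
    [
        layer_shapes.len() == AC_ENC_LAYER_COUNT,
        layer_shapes.iter().all(|&(_, h)| h == AC_ENC_HIDDEN_DIM),
        layer_shapes.iter().all(|&(n, _)| n == input_seq_len),
    ]
}

fn main() {
    let healthy: Vec<(usize, usize)> = vec![(128, 768); 12];
    assert_eq!(enc_001_invariants(&healthy, 128), [true, true, true]);

    // "Dropped a dim AND added another to compensate": layer 5 swaps
    // (128, 768) for (768, 128). Total element count is unchanged, so a
    // single flattened-size check would pass -- but the independent
    // hidden-dim and seq-len invariants both fail.
    let mut mutated = healthy.clone();
    mutated[5] = (768, 128);
    let [layers_ok, hidden_ok, seq_ok] = enc_001_invariants(&mutated, 128);
    assert!(layers_ok && !hidden_ok && !seq_ok);
    println!("ENC-001 sketch passed");
}
```

The payoff is diagnosability: when a gate trips, the failing invariant names the mutation class directly instead of reporting a generic shape mismatch.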

Adds 28 unit tests, including a 6-bucket scaling sweep and a 5-bucket
layer-count sweep. The realistic-healthy scenario walks the canonical
4-thread RTX-4090 + BERT-base configuration; the pre-fix scenario walks
8 simultaneous regressions across both contracts.

No runtime % shift; algorithm-level coverage advances by 8 gates.
@noahgift noahgift force-pushed the feat/ws-enc-001-008-partial-discharge branch from 3965ff9 to 21044ae Compare May 11, 2026 15:22
@noahgift noahgift enabled auto-merge (squash) May 11, 2026 15:22