Skip to content

feat(format): f16-conversion-v1 + dpo-loss-v1 9-gate PARTIAL discharge#1395

Open
noahgift wants to merge 1 commit intomainfrom
feat/f16-dpo-001-009-partial-discharge
Open

feat(format): f16-conversion-v1 + dpo-loss-v1 9-gate PARTIAL discharge#1395
noahgift wants to merge 1 commit intomainfrom
feat/f16-dpo-001-009-partial-discharge

Conversation

@noahgift
Copy link
Copy Markdown
Contributor

@noahgift noahgift commented May 2, 2026

Summary

Bundles two sister contracts in one verdict module:

  • f16-conversion-v1 (FALSIFY-F16-001..004): bit-trick parity, roundtrip, sign, SIMD≡scalar
  • dpo-loss-v1 (FALSIFY-DPO-001..005): non-negativity, log(2) reference, monotonicity, stability, symmetry

26 unit tests including 11-bucket monotonicity sweep on DPO-003. Algorithm-level coverage advances by 9 gates.

Gates bound

Gate ID Rule
F16-001 Bit-trick output bit-exact == f32::from(f16)
F16-002 f16→f32→f16 roundtrip identity
F16-003 Sign preservation under conversion
F16-004 SIMD f16 conversion bit-exact == scalar
DPO-001 L_DPO ≥ 0 for all valid inputs
DPO-002 L_DPO ≈ log(2) at r_w == r_l == 0
DPO-003 Monotonicity in preferred log-ratio (raising r_w lowers loss)
DPO-004 Loss finite for log-ratios in [-100, 100]
DPO-005 L(rw,rl) + L(rl,rw) == |z| + 2*log(1+exp(-|z|))

Reference helper

dpo_loss(r_w, r_l, beta) uses softplus form for numerical stability.

Five Whys

See commit message — captures bit-exact rationale for F16-001/004, NaN-on-invalid-input policy, and why a reference loss helper ships in-module.

Test plan

  • cargo test -p aprender-core --lib f16_dpo — 26 passed
  • PMAT pre-commit gates green
  • CI green

🤖 Generated with Claude Code

Bundles two sister contracts in one verdict module:

f16-conversion-v1 (FALSIFY-F16-001..004):
- F16-001: bit-trick output bit-exact equal to f32::from(f16)
- F16-002: f16→f32→f16 roundtrip is identity for normal values
- F16-003: sign preservation under conversion
- F16-004: SIMD f16 conversion bit-exact equal to scalar

dpo-loss-v1 (FALSIFY-DPO-001..005):
- DPO-001: L_DPO ≥ 0 for all valid log-ratio pairs and beta > 0
- DPO-002: L_DPO ≈ log(2) at the reference policy (r_w == r_l == 0)
- DPO-003: monotonicity in preferred log-ratio
- DPO-004: finite output for log-ratios in [-100, 100]
- DPO-005: symmetry — L(rw,rl) + L(rl,rw) == |z| + 2*log(1+exp(-|z|))

## Five Whys

1. Why bundle these two contracts? Both peripheral, span the
   precision-conversion + preference-optimization coverage band;
   one verdict module captures both without duplicate provenance pin.
2. Why does this block ship? Coverage % cannot move while these
   peripheral contracts are unbound at PARTIAL_ALGORITHM_LEVEL.
3. Why bind both verdicts AND a reference `dpo_loss()` helper? The
   monotonicity (DPO-003) and symmetry (DPO-005) gates need a
   deterministic loss function to compare values; the `log(2)` gate
   (DPO-002) needs a pinned reference value. Pinning the
   numerically-stable softplus formula in-module prevents future
   drift between the verdict-test loss and the actual training loss.
4. Why bit-exact (`to_bits()`) for F16-001/004 vs ULP-tolerant?
   The contract specifies "Bit manipulation matches f32::from(f16)"
   — even one ULP of drift on a single conversion accumulates over
   millions of weights into measurable inference divergence. Strict
   bit-exact is the only way to enforce "no precision loss in the
   conversion path."
5. Why `f32::NAN` for invalid DPO inputs (vs returning Some/Result)?
   The downstream verdict functions all check `is_finite()` and
   Fail on NaN — making the loss helper return NaN for invalid
   inputs propagates the "broken harness" signal cleanly through
   the gate pipeline without changing every call-site signature.

Adds 26 unit tests including a 11-bucket monotonicity sweep on DPO-003.
Realistic-healthy walks the canonical loss values; pre-fix walks
9 simultaneous regressions across both contracts.

No runtime % shift; algorithm-level coverage advances by 9 gates.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant