feat(format): f16-conversion-v1 + dpo-loss-v1 9-gate PARTIAL discharge by noahgift · Pull Request #1395 · paiml/aprender

noahgift · 2026-05-02T09:29:33Z

Summary

Bundles two sister contracts in one verdict module:

f16-conversion-v1 (FALSIFY-F16-001..004): bit-trick parity, roundtrip, sign, SIMD≡scalar
dpo-loss-v1 (FALSIFY-DPO-001..005): non-negativity, log(2) reference, monotonicity, stability, symmetry

26 unit tests including 11-bucket monotonicity sweep on DPO-003. Algorithm-level coverage advances by 9 gates.

Gates bound

Gate ID	Rule
F16-001	Bit-trick output bit-exact == `f32::from(f16)`
F16-002	f16→f32→f16 roundtrip identity
F16-003	Sign preservation under conversion
F16-004	SIMD f16 conversion bit-exact == scalar
DPO-001	L_DPO ≥ 0 for all valid inputs
DPO-002	L_DPO ≈ log(2) at r_w == r_l == 0
DPO-003	Monotonicity in preferred log-ratio (raising r_w lowers loss)
DPO-004	Loss finite for log-ratios in [-100, 100]
DPO-005	L(rw,rl) + L(rl,rw) == \|z\| + 2*log(1+exp(-\|z\|))

Reference helper

dpo_loss(r_w, r_l, beta) uses softplus form for numerical stability.

Five Whys

See commit message — captures bit-exact rationale for F16-001/004, NaN-on-invalid-input policy, and why a reference loss helper ships in-module.

Test plan

cargo test -p aprender-core --lib f16_dpo — 26 passed
PMAT pre-commit gates green
CI green

🤖 Generated with Claude Code

Bundles two sister contracts in one verdict module: f16-conversion-v1 (FALSIFY-F16-001..004): - F16-001: bit-trick output bit-exact equal to f32::from(f16) - F16-002: f16→f32→f16 roundtrip is identity for normal values - F16-003: sign preservation under conversion - F16-004: SIMD f16 conversion bit-exact equal to scalar dpo-loss-v1 (FALSIFY-DPO-001..005): - DPO-001: L_DPO ≥ 0 for all valid log-ratio pairs and beta > 0 - DPO-002: L_DPO ≈ log(2) at the reference policy (r_w == r_l == 0) - DPO-003: monotonicity in preferred log-ratio - DPO-004: finite output for log-ratios in [-100, 100] - DPO-005: symmetry — L(rw,rl) + L(rl,rw) == |z| + 2*log(1+exp(-|z|)) ## Five Whys 1. Why bundle these two contracts? Both peripheral, span the precision-conversion + preference-optimization coverage band; one verdict module captures both without duplicate provenance pin. 2. Why does this block ship? Coverage % cannot move while these peripheral contracts are unbound at PARTIAL_ALGORITHM_LEVEL. 3. Why bind both verdicts AND a reference `dpo_loss()` helper? The monotonicity (DPO-003) and symmetry (DPO-005) gates need a deterministic loss function to compare values; the `log(2)` gate (DPO-002) needs a pinned reference value. Pinning the numerically-stable softplus formula in-module prevents future drift between the verdict-test loss and the actual training loss. 4. Why bit-exact (`to_bits()`) for F16-001/004 vs ULP-tolerant? The contract specifies "Bit manipulation matches f32::from(f16)" — even one ULP of drift on a single conversion accumulates over millions of weights into measurable inference divergence. Strict bit-exact is the only way to enforce "no precision loss in the conversion path." 5. Why `f32::NAN` for invalid DPO inputs (vs returning Some/Result)? The downstream verdict functions all check `is_finite()` and Fail on NaN — making the loss helper return NaN for invalid inputs propagates the "broken harness" signal cleanly through the gate pipeline without changing every call-site signature. Adds 26 unit tests including a 11-bucket monotonicity sweep on DPO-003. Realistic-healthy walks the canonical loss values; pre-fix walks 9 simultaneous regressions across both contracts. No runtime % shift; algorithm-level coverage advances by 9 gates.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(format): f16-conversion-v1 + dpo-loss-v1 9-gate PARTIAL discharge#1395

feat(format): f16-conversion-v1 + dpo-loss-v1 9-gate PARTIAL discharge#1395
noahgift wants to merge 1 commit intomainfrom
feat/f16-dpo-001-009-partial-discharge

noahgift commented May 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

noahgift commented May 2, 2026

Summary

Gates bound

Reference helper

Five Whys

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant