🎉 docs(spec): SHIP-TWO-001 §75 — MODEL-1 SHIP % = 100% (SHIP-007 LIVE-DISCHARGED) by noahgift · Pull Request #1652 · paiml/aprender

noahgift · 2026-05-13T07:37:45Z

🎉 MODEL-1 SHIP % = 100%

All 10 AC-SHIP1-* LIVE-DISCHARGED.

PR-E (#1651) shipped the F32 GEMV PTX layout fix that closes SHIP-007 (the last PARTIAL). §75 records the discharge.

10/10 LIVE-discharge table

| AC | Discharge section | Path |
|---|---|---|
| SHIP-001 | §72 | `apr run <safetensors>` exit 0 |
| SHIP-002 | §61 | `apr run "def fib(n):"` valid Python (#1609) |
| SHIP-003 | §72 | `apr diff` 20 tensors at cos_sim=1.000000 |
| SHIP-004 | §72 | `llama-cli` exit 0 |
| SHIP-005 | §71 | HumanEval pass@1 = 86.59% (gx10 164-run) |
| SHIP-006 | §61.8 | `apr qa` 12-gate aggregate PASS (#1615) |
| SHIP-007 | §75 | PARITY-GATE PASS + 124.6 tok/s @ 128-tok decode |
| SHIP-008 | §61 | `apr run` SHIP-008 USER → 256-token ChatML (#1614) |
| SHIP-009 | §72 | `apr inspect` license/provenance |
| SHIP-010 | §72 | sha256 match `0a854098…` |
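SHIP-003 is discharged by `apr diff` reporting cos_sim=1.000000 across 20 tensors. The PR does not show how `apr diff` computes this, so the following is only a minimal sketch of the cosine-similarity metric itself. The function name `cos_sim` and the sample vectors are hypothetical.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical tensor data, flattened to 1-D for the comparison.
reference = [0.25, -1.5, 3.0, 0.125]
identical = list(reference)
sign_flipped = [-x for x in reference]

same = cos_sim(reference, identical)        # tensors agree
flipped = cos_sim(reference, sign_flipped)  # tensors point in opposite directions

print(f"identical:    cos_sim={same:.6f}")
print(f"sign-flipped: cos_sim={flipped:.6f}")
```

A value that rounds to 1.000000 means the two tensors point in the same direction to within six decimal places. Cosine similarity ignores overall scale, so it is usually paired with a magnitude check.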

Cascade arc

| § | Date | Discovery |
|---|---|---|
| 63 | 2026-05-11 | SHIP-007 framed as 3-layer cascade |
| 73 | 2026-05-12 | Re-measurement: only parity blocks |
| 74 | 2026-05-13 | Bug LOCALIZED to F32 GEMV |
| 75 | 2026-05-13 | PR-E fix → MODEL-1 100% |

§73 estimated "3-5 PR / 3-5 days". Actual: 4 PRs (#1648/#1649/#1650/#1651) in 2 days.

Methodology lesson #22 NEW

Symptom analysis → bug class localization in O(1). Sign-flipped top-K divergences + CPU/GPU mean mismatch + sane intermediates → exactly one bug class (transposed matmul). Lessons compose; each makes the next cheaper.
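To illustrate why those symptoms point at a transposed matmul, here is a toy sketch of the bug class. It is not the actual PTX kernel from PR-E, which this PR does not show. Both function names and the 3×3 data are hypothetical. The second function reads the same flat weight buffer with the wrong stride, which amounts to multiplying by the transpose.

```python
def gemv_row_major(w, rows, cols, x):
    """Reference GEMV: w is a flat row-major buffer of shape rows x cols."""
    return [sum(w[r * cols + c] * x[c] for c in range(cols)) for r in range(rows)]


def gemv_wrong_layout(w, rows, cols, x):
    """Same buffer read as if it were column-major: a layout bug, i.e. W^T x for square W."""
    return [sum(w[c * rows + r] * x[c] for c in range(cols)) for r in range(rows)]


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical 3x3 weight matrix (row-major) and input vector.
w = [1.0, 2.0, 0.0,
     0.0, 1.0, 3.0,
     4.0, 0.0, 1.0]
x = [1.0, 0.0, -1.0]

reference = gemv_row_major(w, 3, 3, x)
buggy = gemv_wrong_layout(w, 3, 3, x)

print("reference logits:", reference, "top-1:", argmax(reference))
print("buggy logits:    ", buggy, "top-1:", argmax(buggy))
print("means:", sum(reference) / 3, sum(buggy) / 3)
```

Both outputs are finite and of similar magnitude, so nothing looks broken in isolation. The top-1 index and the mean still disagree, which matches the pattern of sane intermediates with diverging top-K and mean.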

Ship-% movement

MODEL-1: 99% → 100% 🎉
MODEL-2: unchanged at 57% (independent track)

Test plan

Empirical discharge: apr bench 5-iter 128-tok = 124.6 tok/s on default path
PARITY-GATE PASS (no error)
All AC-SHIP1-* paths captured in evidence dirs
Spec v3.19.0 → v3.21.0
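The throughput part of the discharge is simple arithmetic. The commit message for this section gives the AC-SHIP1-007 floor as 30 tok/s and the headroom as 4.15×. A small sketch of that check follows; the variable names are illustrative only.

```python
measured_tok_s = 124.6  # apr bench, 5 iterations, 128-token decode (from this PR)
floor_tok_s = 30.0      # AC-SHIP1-007 floor, as stated in the commit message

headroom = measured_tok_s / floor_tok_s
meets_floor = measured_tok_s >= floor_tok_s

print(f"headroom: {headroom:.2f}x, meets floor: {meets_floor}")
```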

Refs

§74 SHIP-007 localization (PR #1650, docs(spec): SHIP-TWO-001 §74 — SHIP-007 bug LOCALIZED to LM head F32 GEMV via PR-B stage bisection)
§73 SHIP-007 cascade reduction (PR #1647, docs(spec): SHIP-TWO-001 §73 — SHIP-007 cascade reduced from 3 layers to 1 on re-measurement)
PR #1648 (contract): feat(contracts): SHIP-007 GPU-vs-CPU stage-bisection scaffold (PR-A)
PR #1649 (PR-B): feat(aprender-serve): SHIP-007 PR-B — GPU stage dump scaffold + Embedding/LmHead capture
PR #1651 (PR-E fix): fix(aprender-gpu): SHIP-007 PR-E — F32 GEMV layout fix → MODEL-1 100% (10/10 AC-SHIP1-* LIVE-DISCHARGED)
AC-SHIP1-001..010 (spec §5)
evidence/section-75-ship-007-discharged-2026-05-13/

🤖 Generated with Claude Code

…P-TWO-SECTION-75)

PR-E (#1651) shipped the single-file F32 GEMV PTX layout fix. SHIP-007 LIVE-DISCHARGED. All 10 AC-SHIP1-* now LIVE on canonical 7B Qwen2.5-Coder-Instruct Q4_K_M teacher.

10/10 LIVE-discharge table:

SHIP-001  §72    apr run <safetensors> exit 0
SHIP-002  §61    apr run "def fib(n):" valid Python (#1609)
SHIP-003  §72    apr diff 20 tensors at cos_sim=1.000000
SHIP-004  §72    llama-cli exit 0, 133.1 gen tok/s
SHIP-005  §71    HumanEval pass@1 = 86.59% (gx10 164-run)
SHIP-006  §61.8  apr qa 12-gate aggregate PASS (#1615)
SHIP-007  §75    PARITY-GATE PASS + 124.6 tok/s @ 128-tok (this section)
SHIP-008  §61    apr run SHIP-008 USER → 256-token ChatML (#1614)
SHIP-009  §72    apr inspect license/provenance fields
SHIP-010  §72    sha256 match 0a854098…

Empirical discharge proof for SHIP-007:

apr bench <canonical 7B APR> --iterations 5 --max-tokens 128
→ tokens_per_second: 124.6
→ AC-SHIP1-007 floor: 30 → headroom 4.15×
→ PARITY-GATE: PASS (no error)
→ Default path (CUDA graphed), no SKIP_PARITY_GATE, no APR_SKIP_FP8_WARMUP

Cascade arc closeout:

§63 2026-05-11 → SHIP-007 framed as 3-layer cascade
§73 2026-05-12 → re-measurement: only parity layer blocks
§74 2026-05-13 → bug LOCALIZED to F32 GEMV via PR-B stage bisection
§75 2026-05-13 → PR-E layout fix → MODEL-1 100%

§73's '3-5 PR / 3-5 day' estimate. Actual: 4 PRs (#1648 contract,

Methodology lesson #22 NEW: symptom analysis (sign-flipped top-K divergences + CPU/GPU mean mismatch + sane intermediates) → bug class localization in O(1). Methodology lessons compose; each makes the next cheaper.

Ship-% movement:

MODEL-1 ship %: 99% → 100% 🎉
MODEL-2 ship %: unchanged at 57% (independent track, gated on step 5g.3 val_loss < 9.38).

Spec version: 3.19.0 → 3.21.0 (post-§72/73 stack at 3.18.0; §74 at 3.20.0; §75 here at 3.21.0).

Out of scope (future work):

- MODEL-2 ship % path (independent track, separate cascade)
- Publish-readiness gates (GATE-SHIP-001/002/003 still need green CI + post-publish QA per feedback_post_publish_qa_required.md)
- HumanEval/MBPP benchmark improvements beyond §71's 86.59%

Refs:

- §74 SHIP-007 localization (PR #1650)
- §73 SHIP-007 cascade reduction (PR #1647)
- PR #1648 (contract scaffold), #1649 (PR-B stage dump)
- PR #1651 (PR-E F32 GEMV layout fix)
- AC-SHIP1-007 (spec §5)
- evidence/section-75-ship-007-discharged-2026-05-13/

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…T-CODE-V0-33-0-RELEASE-PREP)

🎉 v0.33.0 marks **MODEL-1 SHIP % = 100%** for SHIP-TWO-001. All 10 AC-SHIP1-* falsifiers are LIVE-discharged on the canonical 7B Qwen2.5-Coder-Instruct Q4_K_M teacher (lambda-vector RTX 4090, --features cuda).

This release prep PR ships:

1. CHANGELOG.md [0.33.0] entry with §69-§75 highlights:
   - 🎉 MODEL-1 SHIP % = 100% (all 10 AC-SHIP1-* LIVE)
   - Fixed: SHIP-007 F32 GEMV PTX layout (PR #1651, §75) — 124.6 tok/s
   - Fixed: SHIP-005 HumanEval RC3 (PR #1635, §70/§71) — pass@1 86.59%
   - Added: APR_EVAL_DEBUG=1 diagnostic surface (PR #1634)
   - Added: APR_GPU_STAGE_DUMP=<dir> diagnostic surface (PR #1649)
   - Added: MBPP harness H4 fix (PR #1645)
   - Added: 2 new falsifiable contracts (apr-eval-humaneval-harness-invariant v1.1.0, apr-ship-007-gpu-stage-bisection v1.0.0)
   - Methodology lessons #16-22 captured in MEMORY.md
   - Spec: v3.13.0 → v3.21.0 across §67-§75
2. Workspace version bump:
   - [workspace.package].version: 0.32.0 → 0.33.0
   - Root [package].version (aprender facade crate): 0.32.0 → 0.33.0
   - 28 sub-crate version literals: 0.32.0 → 0.33.0
3. `cargo check -p aprender` → clean (workspace builds at 0.33.0).

Out of scope for this PR (separate steps after #1651/1652 land + this PR lands):

- Tag release `v0.33.0` on main
- Cascade publish to crates.io (per memory project_ship_two_001_v0_32_0_release.md — 15 user-facing crates + 7 internal-tier in topological dependency order; uses `make publish CRATE=<name>`)
- Post-publish QA per `feedback_post_publish_qa_required.md` — `cargo install aprender --force` + `/dogfood` GO verdict required before declaring release done (v0.31.1 was yanked for skipping this)
- GitHub Release with §75 narrative
- HF artifact verification (paiml/qwen2.5-coder-7b-apache-q4k-v1 sha256 already verified by §72 SHIP-010 LIVE evidence; double-check before release announcement)

This PR ships ONLY the version-bump + CHANGELOG. Publishing is the next step after merge.

Refs:

- §75 MODEL-1 100% (PR #1652)
- §74 SHIP-007 bug localized (PR #1650)
- §73 SHIP-007 cascade reduction (PR #1647)
- §72 5-AC LIVE cascade (PR #1646)
- §71 SHIP-005 LIVE-DISCHARGED (PR #1642)
- §70 RC3 fix (PR #1636)
- §69 Q4K hypothesis falsified (PR #1633)
- PR #1635 RC3 prepend
- PR #1634 diagnostic surface + contract
- PR #1648 SHIP-007 contract scaffold
- PR #1649 SHIP-007 PR-B stage dump
- PR #1651 SHIP-007 PR-E F32 GEMV layout fix

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
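The release-prep commit above says the crates.io cascade publishes crates in topological dependency order via `make publish CRATE=<name>`. As a rough sketch of how such an order could be derived, the snippet below uses Python's standard `graphlib`. The crate names and the dependency graph are hypothetical placeholders, not the real workspace layout, and the project's own tooling may work differently.

```python
from graphlib import TopologicalSorter

# Hypothetical crate graph: each key maps to the set of crates it depends on.
deps = {
    "example-core": set(),
    "example-gpu": {"example-core"},
    "example-serve": {"example-core", "example-gpu"},
    "example-facade": {"example-serve"},
}

# static_order() yields every crate after all of its dependencies,
# and raises graphlib.CycleError if the graph contains a cycle.
order = list(TopologicalSorter(deps).static_order())

# One publish command per crate, dependencies first.
commands = [f"make publish CRATE={name}" for name in order]
for cmd in commands:
    print(cmd)
```

Publishing dependencies first matters because a crate cannot resolve a dependency version that has not yet been uploaded to the registry.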


noahgift enabled auto-merge (squash) May 13, 2026 07:37

noahgift mentioned this pull request May 13, 2026

🎉 chore: v0.33.0 release prep — CHANGELOG + workspace version bump (MODEL-1 100%) #1653

Merged

4 tasks

noahgift force-pushed the docs/section-75-model-1-100-percent branch from b3b7835 to 598b323 May 13, 2026 09:11

noahgift added 7 commits May 13, 2026 11:45

Merge branch 'main' into docs/section-75-model-1-100-percent

3776fe5

Merge branch 'main' into docs/section-75-model-1-100-percent

5e4196e

Merge branch 'main' into docs/section-75-model-1-100-percent

c2e0557

Merge branch 'main' into docs/section-75-model-1-100-percent

26b5694

Merge branch 'main' into docs/section-75-model-1-100-percent

fe6c253

Merge branch 'main' into docs/section-75-model-1-100-percent

7910898

Merge branch 'main' into docs/section-75-model-1-100-percent

e0add8f

noahgift added 4 commits May 13, 2026 18:14

Merge branch 'main' into docs/section-75-model-1-100-percent

7482033

Merge branch 'main' into docs/section-75-model-1-100-percent

0aada1a

Merge branch 'main' into docs/section-75-model-1-100-percent

a767ff3

Merge branch 'main' into docs/section-75-model-1-100-percent

7d06a46

noahgift added 4 commits May 14, 2026 00:17

Merge branch 'main' into docs/section-75-model-1-100-percent

a10439e

Merge branch 'main' into docs/section-75-model-1-100-percent

d60a9f6

Merge branch 'main' into docs/section-75-model-1-100-percent

eeb5c3e

Merge branch 'main' into docs/section-75-model-1-100-percent

ca2ffcf

noahgift merged commit 4209a39 into main May 14, 2026
10 checks passed

noahgift deleted the docs/section-75-model-1-100-percent branch May 14, 2026 02:49

noahgift merged 16 commits into main from docs/section-75-model-1-100-percent
