From 355a1e74df0f9213f32ea83b5bb8ccc581f0fcca Mon Sep 17 00:00:00 2001
From: Noah Gift <noah.gift@gmail.com>
Date: Fri, 15 May 2026 08:14:49 +0200
Subject: [PATCH 1/5] =?UTF-8?q?contracts(ccpa):=20v1.27.0=20=E2=86=92=20v1?=
 =?UTF-8?q?.28.0=20=E2=80=94=20register=20FALSIFY-CCPA-017=20project=5Fsca?=
 =?UTF-8?q?le=5Fparity=5Fbound=20(PROPOSED)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds 1 new falsification gate to the registry: CCPA-017 (project-scale
parity bound). Gate count: 16 → 17.

Phase 4 closes the function-scale → project-scale extrapolation gap
the M159 ProgramBench prior-art (arXiv:2605.03546) flagged. 0%/200
fully-resolved across Claude Opus/Sonnet/Haiku + GPT + Gemini at
project-scale means a CCPA-016-style "both pass" assertion is
implausible. CCPA-017 inverts the question: partial-progress
agreement, not all-or-nothing.

DUAL-threshold design:
- partial_agreement >= 0.3
- files_jaccard_corpus >= 0.3

Both orthogonal channels must show agreement. Tentative 0.3/0.3
POC-tier floors; recalibration awaits first operator-dispatched
measurement.

CCPA-017 enters at status: PROPOSED because no operator-dispatched
measurement has produced evidence/phase-4/project-scale-scores.json
yet. The live-evidence test is #[ignore]'d until that file exists.
Once the operator runs `bash scripts/phase-4-bench.sh` and the gate
passes against real data, a v1.29.0 bump will flip PROPOSED →
ACTIVE_RUNTIME.

Companion-repo Phase 4 sequence:
- M180 (PR #167 squash c7107b9) — phase-4-project-scale-plan.md
- M182 (PR #169 squash b36ceb6) — P4.1 corpus (5 real GitHub issues)
- M184 (PR #171 squash 0f8c451) — P4.2 runner (phase-4-bench.sh)
- M186 (PR #173 squash c115966) — P4.3 scoring (project_scale_diff.rs)
- M188 (PR #175 squash a574655) — P4.4 gate test scaffold

Gate-level statuses post-v1.28.0: 4 ACTIVE_RUNTIME (CCPA-013/014/015/
016) + 1 PROPOSED (CCPA-017) + rest at PLANNED_M*/IN_REVIEW per
their lifecycle phase. No OPEN residue.

Pure additive bump: new gate + new status_history entry. No schema
bump in aprender-contracts/src/schema/. pv validate clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 contracts/claude-code-parity-apr-v1.yaml | 203 ++++++++++++++++++++++-
 1 file changed, 201 insertions(+), 2 deletions(-)

diff --git a/contracts/claude-code-parity-apr-v1.yaml b/contracts/claude-code-parity-apr-v1.yaml
index 80a7522ce..4ff4d0765 100644
--- a/contracts/claude-code-parity-apr-v1.yaml
+++ b/contracts/claude-code-parity-apr-v1.yaml
@@ -63,8 +63,8 @@ metadata:
     - crates/aprender-orchestrate/contracts/batuta/apr-code-v1.yaml
 
 name: claude-code-parity-apr
-version: "1.27.0"
-status: ACTIVE_RUNTIME   # 16/16 gates registered; 4 with status: ACTIVE_RUNTIME (CCPA-013/014/015/016 — the runtime-evidence + outcome-parity track), rest at PLANNED_M*/IN_REVIEW/HARD_BLOCKING_M16 per their lifecycle phase. No OPEN residue. v1.27.0 (companion-repo M167, 2026-05-14) — flips FALSIFY-CCPA-013 (first_recorded_parity_score) from `status: OPEN` → `status: ACTIVE_RUNTIME`. The gate's assertion has been satisfied since v1.1.0 (3 measured_parity blocks dating 2026-04-27 against `fixtures/canonical/` with aggregate_score = 1.0000), but the gate-level status field was never flipped — stale prose that this revision corrects. Also extends the assertion's `fixture_corpus_path` constraint to accept EITHER `fixtures/canonical/` (AUTHORED, since v1.2.0) OR `evidence/phase-3/captures/` (REAL-BINARY bilateral bench, companion-repo M150 — claude 2.1.139 + apr 0.32.0 + Qwen2.5-Coder-1.5B-Instruct-Q4_K_M, agreement = 1.0000 on MultiPL-E-Rust HumanEval/0..4). Adds a 4th measured_parity block under CCPA-013 recording M150's real-binary evidence as the strongest empirical discharge anchor. **CCPA-013 was the last gate stuck at `status: OPEN`** — its flip closes the OPEN residue. v1.26.0 (companion-repo M147+M152+M162 Phase 3 sequence, 2026-05-13) (companion-repo M147+M152+M162 Phase 3 sequence, 2026-05-13) — adds FALSIFY-CCPA-015 (ccpa_trace_subproc_output_purity) AND FALSIFY-CCPA-016 (outcome_parity_bound) to the gate registry. CCPA-015 was authored at M147 via provable-contract design (falsifying test FIRST, fix via Stdio::null()) for the ccpa-trace-subproc capture binary; PROPOSED in v1.25.0, promoted ACTIVE_RUNTIME here. CCPA-016 is the Phase 3 P3.4 outcome-parity gate authored at M152 — asserts aggregate agreement >= 0.5 on a MultiPL-E-Rust-class corpus with bidirectional sensitivity (synthetic regression fixture fails threshold; synthetic identity passes). CCPA-016 was empirically validated at M150 (real bilateral bench produced agreement = 1.0000 on 5/5 HumanEval/0..4 with real claude 2.1.139 + real apr code 0.32.0 via Qwen2.5-Coder-1.5B-Instruct-Q4_K_M). The companion-repo M162 row records that aprender#1638 MERGED upstream at squash b61b76b4 (2026-05-13), un-gating apr code from `--features code` so `cargo install apr-cli` ships it by default — the Axis 3 LlmDriver-adapter discharge is FULLY confirmed. v1.25.0 (companion-repo M136-M140 axis-2-closure-plan sequence, 2026-05-11) — adds FALSIFY-CCPA-014 (companion-repo M136-M140 axis-2-closure-plan sequence, 2026-05-11) — adds FALSIFY-CCPA-014 (os_event_parity_bound) to the gate registry, completing the axis-2 closure-plan idea (2) CLI subprocess instrumentation track. New gate consumes ccpa_subproc::OsEvent records (M136) via ccpa_differ::os_event_parity (M137) and asserts canonical-corpus score >= 0.95 + bidirectional sensitivity on regression corpus (M139). v1.24.0 (companion-repo M128-M131 sequence, 2026-05-10) — bumped from v1.23.0 to integrate the M109 cosine-vs-HF-FP16 LIVE-DISCHARGE (cos_sim 0.995384 ≥ 0.99 on lambda-vector RTX 4090, 2026-05-09; aprender PR #1597 squash 3fb04ef86 flipped `qwen3-moe-forward-v1` v1.4.0 ACTIVE_ALGORITHM_LEVEL → v1.5.0 ACTIVE_RUNTIME). Discharges the v1.23.0 status-prose claim "Cosine vs HF FP16 remains operator-confirm pending ~60 GB HF download" — the FP16 weights had been on lambda-vector at /mnt/nvme-raid0/models/Qwen3-Coder-30B-A3B-Instruct/ (57 GB / 16 safetensors shards) for ~7 days; the "60 GB download" blocker was stale by 62 days. v1.23.0 (M35 M32d discharge audit-trail bump) records the 4-bug stack landed on aprender main as commit 5235aaeb9 (#1228) plus diagnostic surface PRs #1222 (Step 2), #1226 (Step 2.5), #1401 (Step 2 JSON wire). M32d gibberish output ("%%%%%%%%") converted to coherent English answers across math/geography/translation/code domains. M34 FAST PATH 5-whys plan delivered at lucky-case bound (5 substantive PRs vs 4-6 estimated, ~6 hours wall vs 2-3 days). Component priors verified empirically: rank-3 Q/K RMSNorm (15%) + rank-4 rope_theta (10%) + chat template both correct. Cosine vs HF FP16 formal flip **DISCHARGED 2026-05-09 at companion-repo M109** (apr_argmax = hf_argmax = 3555 " What"; 555ms apr-forward; HF FP16 fixture generated in 52s).
+version: "1.28.0"
+status: ACTIVE_RUNTIME   # 17/17 gates registered; 4 with status: ACTIVE_RUNTIME (CCPA-013/014/015/016 — the runtime-evidence + outcome-parity track) + 1 with status: PROPOSED (CCPA-017 — project-scale parity, awaiting first operator-dispatched bench to flip ACTIVE_RUNTIME at v1.29.0), rest at PLANNED_M*/IN_REVIEW/HARD_BLOCKING_M16 per their lifecycle phase. No OPEN residue. v1.28.0 (companion-repo M180-M188 Phase 4 sequence, 2026-05-15) — adds FALSIFY-CCPA-017 (project_scale_parity_bound) to the gate registry. Phase 4 operationalizes the M159 ProgramBench prior-art (arXiv:2605.03546, 0%/200 SOTA baseline) into companion-tier project-scale parity testing: the M182 corpus draws 5 fixtures from real open GitHub issues across paiml/decy + paiml/bashrs + paiml/depyler with pinned pre-fix commit SHAs; the M184 runner (scripts/phase-4-bench.sh, 288 lines bash) clones at the pinned SHA, dispatches each system with timeout APR_TIMEOUT_S (default 900s), snapshots diff vs SHA, runs the per-fixture oracle_cmd; the M186 scorer (crates/ccpa-differ/src/project_scale_diff.rs, ~310 lines Rust) lifts the runner JSON into ProjectScaleParityReport with 5 derived metrics (per-fixture: approach_match + lines_edited_ratio; corpus-level: partial_agreement + files_jaccard_corpus + approach_match_rate); the M188 gate test (crates/ccpa-differ/tests/falsify_ccpa_017_project_scale_parity.rs, ~260 lines, 7 active + 1 #[ignore]'d) asserts partial_agreement >= 0.3 AND files_jaccard_corpus >= 0.3 with bidirectional sensitivity verified on synthetic identity (passes) and synthetic regression (fails) fixtures. CCPA-017 enters at status: PROPOSED because no operator-dispatched measurement has produced evidence/phase-4/project-scale-scores.json yet; the live-evidence test is #[ignore]'d until that exists. Threshold values (0.3/0.3) are tentative POC-tier floors — they WILL be recalibrated after first operator dispatch. Phase 4 is the SIGNAL regime, not the SATURATION regime: a CCPA-016-style "agreement = 1.0" result is implausible at project-scale per ProgramBench evidence; the goal is "do both systems make matching partial progress?" not "do both systems fully succeed?". v1.27.0 (companion-repo M167, 2026-05-14) — flips FALSIFY-CCPA-013 (first_recorded_parity_score) from `status: OPEN` → `status: ACTIVE_RUNTIME`. The gate's assertion has been satisfied since v1.1.0 (3 measured_parity blocks dating 2026-04-27 against `fixtures/canonical/` with aggregate_score = 1.0000), but the gate-level status field was never flipped — stale prose that this revision corrects. Also extends the assertion's `fixture_corpus_path` constraint to accept EITHER `fixtures/canonical/` (AUTHORED, since v1.2.0) OR `evidence/phase-3/captures/` (REAL-BINARY bilateral bench, companion-repo M150 — claude 2.1.139 + apr 0.32.0 + Qwen2.5-Coder-1.5B-Instruct-Q4_K_M, agreement = 1.0000 on MultiPL-E-Rust HumanEval/0..4). Adds a 4th measured_parity block under CCPA-013 recording M150's real-binary evidence as the strongest empirical discharge anchor. **CCPA-013 was the last gate stuck at `status: OPEN`** — its flip closes the OPEN residue. v1.26.0 (companion-repo M147+M152+M162 Phase 3 sequence, 2026-05-13) (companion-repo M147+M152+M162 Phase 3 sequence, 2026-05-13) — adds FALSIFY-CCPA-015 (ccpa_trace_subproc_output_purity) AND FALSIFY-CCPA-016 (outcome_parity_bound) to the gate registry. CCPA-015 was authored at M147 via provable-contract design (falsifying test FIRST, fix via Stdio::null()) for the ccpa-trace-subproc capture binary; PROPOSED in v1.25.0, promoted ACTIVE_RUNTIME here. CCPA-016 is the Phase 3 P3.4 outcome-parity gate authored at M152 — asserts aggregate agreement >= 0.5 on a MultiPL-E-Rust-class corpus with bidirectional sensitivity (synthetic regression fixture fails threshold; synthetic identity passes). CCPA-016 was empirically validated at M150 (real bilateral bench produced agreement = 1.0000 on 5/5 HumanEval/0..4 with real claude 2.1.139 + real apr code 0.32.0 via Qwen2.5-Coder-1.5B-Instruct-Q4_K_M). The companion-repo M162 row records that aprender#1638 MERGED upstream at squash b61b76b4 (2026-05-13), un-gating apr code from `--features code` so `cargo install apr-cli` ships it by default — the Axis 3 LlmDriver-adapter discharge is FULLY confirmed. v1.25.0 (companion-repo M136-M140 axis-2-closure-plan sequence, 2026-05-11) — adds FALSIFY-CCPA-014 (companion-repo M136-M140 axis-2-closure-plan sequence, 2026-05-11) — adds FALSIFY-CCPA-014 (os_event_parity_bound) to the gate registry, completing the axis-2 closure-plan idea (2) CLI subprocess instrumentation track. New gate consumes ccpa_subproc::OsEvent records (M136) via ccpa_differ::os_event_parity (M137) and asserts canonical-corpus score >= 0.95 + bidirectional sensitivity on regression corpus (M139). v1.24.0 (companion-repo M128-M131 sequence, 2026-05-10) — bumped from v1.23.0 to integrate the M109 cosine-vs-HF-FP16 LIVE-DISCHARGE (cos_sim 0.995384 ≥ 0.99 on lambda-vector RTX 4090, 2026-05-09; aprender PR #1597 squash 3fb04ef86 flipped `qwen3-moe-forward-v1` v1.4.0 ACTIVE_ALGORITHM_LEVEL → v1.5.0 ACTIVE_RUNTIME). Discharges the v1.23.0 status-prose claim "Cosine vs HF FP16 remains operator-confirm pending ~60 GB HF download" — the FP16 weights had been on lambda-vector at /mnt/nvme-raid0/models/Qwen3-Coder-30B-A3B-Instruct/ (57 GB / 16 safetensors shards) for ~7 days; the "60 GB download" blocker was stale by 62 days. v1.23.0 (M35 M32d discharge audit-trail bump) records the 4-bug stack landed on aprender main as commit 5235aaeb9 (#1228) plus diagnostic surface PRs #1222 (Step 2), #1226 (Step 2.5), #1401 (Step 2 JSON wire). M32d gibberish output ("%%%%%%%%") converted to coherent English answers across math/geography/translation/code domains. M34 FAST PATH 5-whys plan delivered at lucky-case bound (5 substantive PRs vs 4-6 estimated, ~6 hours wall vs 2-3 days). Component priors verified empirically: rank-3 Q/K RMSNorm (15%) + rank-4 rope_theta (10%) + chat template both correct. Cosine vs HF FP16 formal flip **DISCHARGED 2026-05-09 at companion-repo M109** (apr_argmax = hf_argmax = 3555 " What"; 555ms apr-forward; HF FP16 fixture generated in 52s).
 
 # ─────────────────────────────────────────────────────────────────────────────
 # Top-level invariants — the 12 falsifiable gates this contract asserts.
@@ -90,6 +90,7 @@ invariants:
   - { id: FALSIFY-CCPA-014, name: os_event_parity_bound,       summary: 'OS-level event parity (axis-2-closure-plan M115.4): macro-averaged Jaccard >= 0.95 per fixture in fixtures/os-canonical/; bidirectional-sensitivity gate on fixtures/os-regression/ (every fixture < 0.95 + non-empty drift records).' }
   - { id: FALSIFY-CCPA-015, name: ccpa_trace_subproc_output_purity, summary: 'Every line emitted to stdout by ccpa-trace-subproc MUST decode as a ccpa_subproc::OsEvent JSON object. Subprocess stdout MUST NOT interleave with the capture stream (use Stdio::null() not Stdio::inherit()).' }
   - { id: FALSIFY-CCPA-016, name: outcome_parity_bound,        summary: 'Outcome parity (Phase 3 P3.4): aggregate agreement on a MultiPL-E-Rust-class corpus >= 0.5 (POC-tier); bidirectional-sensitivity via synthetic regression (< 0.5 → fail) + synthetic identity (1.0 → pass) fixtures.' }
+  - { id: FALSIFY-CCPA-017, name: project_scale_parity_bound,  summary: 'Project-scale parity (Phase 4 P4.4): aggregate partial_agreement >= 0.3 AND files_jaccard_corpus >= 0.3 on a multi-file Cargo-workspace task corpus drawn from real GitHub issues (companion-repo M182). Bidirectional-sensitivity via synthetic identity (passes) + synthetic regression (fails) fixtures. PROPOSED at v1.28.0; ACTIVE_RUNTIME pending first operator-dispatched measurement.' }
 
 scope: >
   every recorded fixture under <ccpa-repo>/fixtures/, every replay run the
@@ -814,6 +815,118 @@ falsification_conditions:
       - { date: '2026-05-13', version_before: '1.25.0', version_after: '1.26.0',
           change: 'Added FALSIFY-CCPA-016 to gate registry. Companion-repo M152 ships the gate test against live evidence/phase-3/multipl-e-rust-scores.json; M150 produced the bilateral bench empirical evidence; this revision flips the contract to recognize CCPA-016 as ACTIVE_RUNTIME from authoring.' }
 
+  - id: FALSIFY-CCPA-017
+    name: project_scale_parity_bound
+    status: PROPOSED
+    assertion: |
+      Project-scale parity (Phase 4 P4.4). On a multi-file Cargo-workspace
+      task corpus where each task is drawn from a real open GitHub issue
+      (companion-repo M182: fixtures/project-scale/ initially 5 fixtures
+      across paiml/decy + paiml/bashrs + paiml/depyler), both teacher
+      (claude) and student (apr code) are dispatched in a clone of the
+      pinned pre_fix_commit SHA + given the issue body as their prompt
+      + their final repo state is scored against the per-fixture
+      completion oracle_cmd. The aggregate project-scale parity report
+      MUST satisfy BOTH:
+        - aggregate `partial_agreement` >= 0.3
+        - aggregate `files_jaccard_corpus` >= 0.3
+
+      Where derived metrics are:
+        partial_agreement     = mean over fixtures of
+                                 min(teacher.oracle_pass, student.oracle_pass)
+        files_jaccard_corpus  = mean over fixtures of
+                                 |teacher.files_touched ∩ student.files_touched|
+                                 / |teacher.files_touched ∪ student.files_touched|
+
+      Plus consistency invariants:
+        - `corpus_size >= 3` (minimum sample size for statistical meaning)
+        - `corpus_size == per_fixture.len()` (record-count match)
+
+      Bidirectional sensitivity (mandatory):
+        - A synthetic regression fixture (one side passes, other fails,
+          disjoint files-touched lists) MUST fail BOTH thresholds.
+        - A synthetic identity fixture (both sides pass on same files
+          with identical files_touched_jaccard = 1.0) MUST pass.
+        - An empty-corpus report MUST fail (prevents "no-data" from
+          being claimed as success).
+
+      Source of truth: `evidence/phase-4/project-scale-scores.json`
+      produced by `scripts/phase-4-bench.sh` on the companion repo.
+    test_harness: |
+      `cargo test -p ccpa-differ --test falsify_ccpa_017_project_scale_parity`
+      runs 7 active assertions + 1 `#[ignore]`'d live-evidence assertion:
+        - synthetic_identity_corpus_passes_gate
+        - synthetic_regression_corpus_fails_gate
+        - empty_corpus_vacuously_fails_threshold
+        - exactly_at_threshold_passes (verifies >= not >)
+        - just_below_partial_threshold_fails (single-gate sensitivity)
+        - just_below_files_threshold_fails  (single-gate sensitivity)
+        - threshold_constants_match_plan (sentinel)
+        - live_evidence_meets_project_scale_threshold (#[ignore]'d
+          until operator dispatches `bash scripts/phase-4-bench.sh`)
+
+      All 7 active GREEN on the companion-repo M188 scaffold (synthetic
+      fixtures constructed in-test, no on-disk corpus dependency).
+    rationale: |
+      The M180 Phase 4 plan operationalizes the M159 ProgramBench
+      prior-art (arXiv:2605.03546) into companion-tier project-scale
+      parity testing. ProgramBench reports 0%/200 fully-resolved across
+      Claude Opus/Sonnet/Haiku + GPT + Gemini at the project-scale
+      layer; this evidence validates the M159 caveat "function-level
+      1.0 does not extrapolate to project-scale" and establishes the
+      Phase 4 SIGNAL regime: the user-facing parity question is
+      "do both systems make matching partial progress?" not "do both
+      systems fully succeed?".
+
+      The DUAL-threshold design (partial_agreement >= 0.3 AND
+      files_jaccard_corpus >= 0.3) is intentional: project-scale parity
+      has two orthogonal signal channels — pass-rate agreement AND
+      files-touched overlap. A system could match pass rate without
+      touching the same files (different solutions to same problem);
+      or touch the same files without matching pass rate (one fixes
+      the bug, the other breaks more). Both channels must show
+      agreement for "project-scale parity" to mean anything.
+
+      Threshold values (0.3/0.3) are tentative POC-tier floors. They
+      WILL be recalibrated after first operator-dispatched measurement
+      against the M182 corpus. A 0.5/0.5 threshold à la CCPA-016 would
+      assume saturation that ProgramBench evidence shows doesn't exist
+      at project-scale; 0.3 is "at least 30% of fixtures see matching
+      progress" — a plausible POC-tier floor that the M182 corpus
+      might actually meet.
+
+      Status PROPOSED (not ACTIVE_RUNTIME) because no
+      operator-dispatched measurement has produced
+      evidence/phase-4/project-scale-scores.json yet. The
+      live-evidence test is `#[ignore]`'d until that file exists.
+      Once the operator runs `bash scripts/phase-4-bench.sh` and the
+      gate passes against real data, a v1.29.0 bump will flip
+      PROPOSED → ACTIVE_RUNTIME.
+
+      Companion-repo Phase 4 sequence (M180-M188):
+        M180 (PR #167 squash c7107b9) — phase-4-project-scale-plan.md
+          authored; P4.1-P4.5 sub-deliverables defined.
+        M182 (PR #169 squash b36ceb6) — P4.1 corpus: 5 fixtures from
+          paiml/decy#40 + paiml/decy#39 + paiml/bashrs#209 +
+          paiml/depyler#223 + paiml/depyler#224. Operator directive
+          "why not use ../decy ../bashrs and ../depy corpus" steered
+          authoring toward real GitHub issues over synthetic stretch
+          goals.
+        M184 (PR #171 squash 0f8c451) — P4.2 runner: phase-4-bench.sh
+          (288 lines bash); clones at pre_fix_commit SHA + dispatches
+          + snapshots + runs oracle + emits per-fixture and aggregate
+          JSON with files_touched_jaccard via jq set-arithmetic.
+        M186 (PR #173 squash c115966) — P4.3 scoring: project_scale_diff.rs
+          (~310 lines Rust) consumes runner JSON + adds 5 derived
+          metrics + passes_threshold predicate; 14 unit tests GREEN.
+        M188 (PR #175 squash a574655) — P4.4 gate test:
+          falsify_ccpa_017_project_scale_parity.rs (~260 lines); 7
+          synthetic-fixture tests verify bidirectional sensitivity
+          before any real measurement exists.
+    semantic_change_log:
+      - { date: '2026-05-15', version_before: '1.27.0', version_after: '1.28.0',
+          change: "Added FALSIFY-CCPA-017 to gate registry at status: PROPOSED. Companion-repo M188 ships the gate test scaffold (7 synthetic-fixture tests + 1 #[ignore]'d live-evidence test); thresholds (partial_agreement >= 0.3 AND files_jaccard_corpus >= 0.3) are tentative POC-tier floors awaiting first operator-dispatched measurement to calibrate. Phase 4 P4.5 contract bump." }
+
   - id: FALSIFY-CCPA-008
     name: parity_score_bound
     status: PLANNED_M6
@@ -972,6 +1085,92 @@ milestones:
 # ─────────────────────────────────────────────────────────────────────────────
 
 status_history:
+  - date: '2026-05-15'
+    from: 'ACTIVE_RUNTIME v1.27.0'
+    to: 'ACTIVE_RUNTIME v1.28.0'
+    note: 'companion-repo M180-M188 Phase 4 sequence — FALSIFY-CCPA-017 (project_scale_parity_bound) added to gate registry at status: PROPOSED; awaits first operator-dispatched bench to flip ACTIVE_RUNTIME'
+    reason: |
+      Adds 1 new falsification gate to the registry: CCPA-017
+      (project-scale parity bound). Gate count: 16 → 17.
+
+      Phase 4 closes the function-scale → project-scale extrapolation
+      gap the M159 ProgramBench prior-art (arXiv:2605.03546) flagged:
+      0%/200 fully-resolved across Claude Opus/Sonnet/Haiku + GPT +
+      Gemini at the project-scale layer means a CCPA-016-style "both
+      pass" assertion is implausible. CCPA-017 inverts the question:
+      partial-progress agreement, not all-or-nothing.
+
+      DUAL-threshold design: partial_agreement >= 0.3 AND
+      files_jaccard_corpus >= 0.3 — both orthogonal channels must
+      show agreement. Tentative 0.3/0.3 floors; recalibration awaits
+      first operator-dispatched measurement.
+
+      Companion-repo Phase 4 sequence (M180-M188):
+
+        M180 (PR #167 squash c7107b9) — phase-4-project-scale-plan.md
+          authored. P4.1-P4.5 sub-deliverables defined. Anchored in
+          ProgramBench prior-art (M159) for the signal-regime framing.
+
+        M182 (PR #169 squash b36ceb6) — P4.1 corpus: 5 fixtures from
+          real open GitHub issues across paiml/decy + paiml/bashrs +
+          paiml/depyler (decy#40 fix test assertions; decy#39 fix
+          clippy; bashrs#209 lint Makefile false-positives; depyler#223
+          enforce Oracle constraints; depyler#224 numeric coercion).
+          Operator directive "why not use ../decy ../bashrs and
+          ../depy corpus" steered toward real GitHub issues over
+          synthetic stretch goals — real issues = real stretch goals
+          = the operator has already triaged them as work worth doing.
+          Each fixture pins pre_fix_commit SHA in meta.toml; runner
+          clones at dispatch (no per-fixture starting-state snapshot,
+          which would be impractical at 685+ Rust files in depyler).
+
+        M184 (PR #171 squash 0f8c451) — P4.2 runner: phase-4-bench.sh
+          (288 lines bash). Per fixture × system: clone at SHA,
+          dispatch with timeout, snapshot diff, run oracle, record
+          exit + pattern match. Emits per-fixture + aggregate JSON
+          to evidence/phase-4/project-scale-scores.json with
+          files_touched_jaccard via jq set-arithmetic.
+
+        M186 (PR #173 squash c115966) — P4.3 scoring:
+          project_scale_diff.rs (~310 lines Rust). Type hierarchy
+          ProjectScaleParityReport → Vec<PerFixtureScore> →
+          (teacher: SideScore, student: SideScore). Loader
+          ProjectScaleParityReport::from_json_str() + 5 derived
+          metrics (per-fixture: approach_match + lines_edited_ratio;
+          corpus-level: partial_agreement + files_jaccard_corpus +
+          approach_match_rate). 14 unit tests GREEN.
+
+        M188 (PR #175 squash a574655) — P4.4 gate test:
+          falsify_ccpa_017_project_scale_parity.rs (~260 lines).
+          7 active synthetic-fixture tests + 1 #[ignore]'d
+          live-evidence test. Bidirectional sensitivity verified:
+          identity passes (partial=1.0, jaccard=1.0); regression
+          fails (partial=0.0, jaccard=0.0); empty corpus fails
+          (by design — prevents "no-data" from claiming success);
+          exactly-at-threshold passes (>= not > semantics);
+          just-below-either-threshold fails (single-gate
+          sensitivity). 7/7 GREEN.
+
+      Gate-level statuses post-v1.28.0: 4 ACTIVE_RUNTIME (CCPA-013/
+      014/015/016) + 1 PROPOSED (CCPA-017) — awaiting first
+      operator-dispatched bench, after which v1.29.0 will flip
+      PROPOSED → ACTIVE_RUNTIME. Rest at PLANNED_M*/IN_REVIEW/
+      HARD_BLOCKING_M16 per their lifecycle phase. No OPEN residue.
+
+      Gate registry summary post-v1.28.0:
+        FALSIFY-CCPA-001..006  PLANNED_M*    (Phase 1 RECORD scope; M2.3-rescoped)
+        FALSIFY-CCPA-007       IN_REVIEW     (coverage-floor)
+        FALSIFY-CCPA-008       PLANNED_M6    (parity_score_bound)
+        FALSIFY-CCPA-009..012  ACTIVE_ALGORITHM_LEVEL (CI gates from M0)
+        FALSIFY-CCPA-013       ACTIVE_RUNTIME (first_recorded_parity_score, at v1.27.0)
+        FALSIFY-CCPA-014       ACTIVE_RUNTIME (os_event_parity_bound, at v1.25.0)
+        FALSIFY-CCPA-015       ACTIVE_RUNTIME (ccpa_trace_subproc_output_purity, at v1.26.0)
+        FALSIFY-CCPA-016       ACTIVE_RUNTIME (outcome_parity_bound, at v1.26.0)
+        FALSIFY-CCPA-017       PROPOSED       (project_scale_parity_bound, at v1.28.0)
+
+      Pure additive bump: new gate + new status_history entry. No
+      schema bump in aprender-contracts/src/schema/. pv validate clean.
+
   - date: '2026-05-14'
     from: 'ACTIVE_RUNTIME v1.26.0'
     to: 'ACTIVE_RUNTIME v1.27.0'

From f651296bfcd050e21599946d099fac015f4fea5b Mon Sep 17 00:00:00 2001
From: Noah Gift <noah.gift@gmail.com>
Date: Fri, 15 May 2026 10:04:08 +0200
Subject: [PATCH 2/5] ci(workspace-test): recover from cancel-in-progress
 target-dir damage

Five-whys for the recurring workspace-test "no such file or directory:
'...rcgu.o'" linker failure that has caused 2+ runs to fail on
aprender#1684:

1. Why does workspace-test fail? Linker can't find yoke_derive-*.rcgu.o
   intermediate compile artifacts.
2. Why are they missing? Cargo was killed mid-compile (no .rcgu.o
   written yet) but .rmeta files already existed.
3. Why was cargo killed? GitHub Actions concurrency.cancel-in-progress
   killed the previous run on the same PR when a new commit arrived.
4. Why does this persist? The target dir is per-PR-persistent at
   /mnt/nvme-raid0/targets/aprender-ci/<PR>/, so partial-compile state
   survives across runs.
5. Root cause: cargo's incremental state is not atomic-on-kill.
   cancel-in-progress + persistent per-PR target dir = inconsistent
   rmeta/rcgu.o pairs.

Fix: wrap `cargo test` with detect-and-recover logic:
- On link-failure pattern (rcgu.o file-not-found), `cargo clean`
  and retry once.
- On any other failure, propagate exit code unchanged.

Adds latency only on the damage-recovery path; warm-cache happy
path is unchanged.

Affects only the "Workspace lib tests (25,300+)" step. Other steps
(Compute tests, Integration tests, Build.rs check) reuse the
cleaned target dir from the recovery path if it fires.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 .github/workflows/ci.yml | 47 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 2518edc9c..31399dda7 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -105,8 +105,22 @@ jobs:
         # = 41min, which JUST overruns 40 — observed on intel-clean-room-3 first run
         # after the refactor invalidated its sccache by changing workspace lints.
         # Warm runs are ~3-5min compile per the workflow comments above.
+        #
+        # Cancel-damage recovery (five-whys): when concurrency.cancel-in-progress
+        # cancels the previous run on the same PR mid-compile, cargo's
+        # incremental state is left inconsistent — .rmeta files exist (claiming
+        # "I'm built") but the matching .rcgu.o codegen artifacts were never
+        # written because cargo got SIGKILL'd. The persistent per-PR target
+        # dir at /mnt/nvme-raid0/targets/aprender-ci/${PR_OR_REF}/ preserves
+        # this damage across runs. The next run's linker then fails with:
+        #   clang: error: no such file or directory: '...rcgu.o'
+        # Fix: wrap the cargo invocation; on link-failure pattern (rcgu.o
+        # missing), `cargo clean` and retry once. Adds latency only on the
+        # damage-recovery path; warm-cache happy path unchanged.
         timeout-minutes: 55
         run: |
+          set +e
+          LOG=$(mktemp)
           docker run --rm \
           -e CI -e GITHUB_ACTIONS -e GITHUB_REF -e GITHUB_SHA -e GITHUB_REPOSITORY -e GITHUB_RUN_ID -e GITHUB_EVENT_NAME -e GITHUB_WORKFLOW \
             -v "${GITHUB_WORKSPACE}:/workspace" \
@@ -122,7 +136,38 @@ jobs:
             -e CARGO_PROFILE_TEST_DEBUG=line-tables-only \
             -e CARGO_PROFILE_DEV_DEBUG=line-tables-only \
             "$IMAGE" \
-            cargo test --workspace --lib --exclude aprender-gpu --exclude aprender-cuda-edge --exclude aprender-compute
+            cargo test --workspace --lib --exclude aprender-gpu --exclude aprender-cuda-edge --exclude aprender-compute 2>&1 | tee "$LOG"
+          TEST_EXIT=${PIPESTATUS[0]}
+          set -e
+          if [ "$TEST_EXIT" -ne 0 ] && grep -qE 'no such file or directory.*\.rcgu\.o' "$LOG"; then
+            echo "::warning::Linker failure on stale target dir from prior cancel-in-progress run; running cargo clean + retry"
+            docker run --rm \
+              -v "${GITHUB_WORKSPACE}:/workspace" \
+              -v "/mnt/nvme-raid0/targets/aprender-ci/${PR_OR_REF}:/workspace/target" \
+              -w /workspace \
+              -e CARGO_TARGET_DIR=/workspace/target \
+              "$IMAGE" \
+              cargo clean
+            docker run --rm \
+            -e CI -e GITHUB_ACTIONS -e GITHUB_REF -e GITHUB_SHA -e GITHUB_REPOSITORY -e GITHUB_RUN_ID -e GITHUB_EVENT_NAME -e GITHUB_WORKFLOW \
+              -v "${GITHUB_WORKSPACE}:/workspace" \
+              -v "/mnt/nvme-raid0/cargo-ci/registry/${PR_OR_REF}:/usr/local/cargo/registry" \
+              -v "/mnt/nvme-raid0/targets/aprender-ci/${PR_OR_REF}:/workspace/target" \
+              -v "/home/noah/data/sccache:/sccache" \
+              -w /workspace \
+              -e CARGO_TARGET_DIR=/workspace/target \
+              -e RUSTC_WRAPPER=rustc-sccache \
+              -e SCCACHE_DIR=/sccache \
+              -e CARGO_INCREMENTAL=0 \
+              -e CARGO_BUILD_JOBS=8 \
+              -e CARGO_PROFILE_TEST_DEBUG=line-tables-only \
+              -e CARGO_PROFILE_DEV_DEBUG=line-tables-only \
+              "$IMAGE" \
+              cargo test --workspace --lib --exclude aprender-gpu --exclude aprender-cuda-edge --exclude aprender-compute
+          elif [ "$TEST_EXIT" -ne 0 ]; then
+            echo "Non-recovery failure mode; propagating exit $TEST_EXIT"
+            exit "$TEST_EXIT"
+          fi
       - name: Compute tests (tolerate SIGSEGV at exit — all tests pass but harness crashes on cleanup)
         run: |
           docker run --rm \

From 6a165a647ad09a7cf5d0ee6c0c133223eac28fea Mon Sep 17 00:00:00 2001
From: Noah Gift <noah.gift@gmail.com>
Date: Fri, 15 May 2026 10:59:35 +0200
Subject: [PATCH 3/5] ci(workspace-test): expand cancel-damage recovery regex
 to 3 patterns

The initial workspace-test recovery (commit f651296bf) only matched
the rcgu.o linker-failure pattern. Aprender#1684 run 25907945016
exposed two additional cancel-damage patterns the original regex
missed:

  Pattern 2: cc-rs missing build subdirs
    cargo:warning=Fatal error: can't create
      /workspace/target/debug/build/zstd-sys-X/out/Y.o:
      No such file or directory

  Pattern 3: rustc orphan rmeta
    error: extern location for libloading does not exist:
      /workspace/target/debug/deps/liblibloading-X.rmeta

Both patterns indicate the same root cause (partial-compile state
from a SIGKILL'd previous run) but manifest in different cargo
subsystems depending on how far the cancelled build got.

Updated regex matches ANY of the three known damage patterns:
  no such file or directory.*\.rcgu\.o
  Fatal error: can.t create.*\.o: No such file
  extern location for .* does not exist.*\.rmeta

On match: cargo clean + retry once. Adds latency only on the
damage-recovery path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 .github/workflows/ci.yml | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 31399dda7..cba044b62 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -108,14 +108,17 @@ jobs:
         #
         # Cancel-damage recovery (five-whys): when concurrency.cancel-in-progress
         # cancels the previous run on the same PR mid-compile, cargo's
-        # incremental state is left inconsistent — .rmeta files exist (claiming
-        # "I'm built") but the matching .rcgu.o codegen artifacts were never
-        # written because cargo got SIGKILL'd. The persistent per-PR target
+        # incremental state is left inconsistent. The persistent per-PR target
         # dir at /mnt/nvme-raid0/targets/aprender-ci/${PR_OR_REF}/ preserves
-        # this damage across runs. The next run's linker then fails with:
-        #   clang: error: no such file or directory: '...rcgu.o'
-        # Fix: wrap the cargo invocation; on link-failure pattern (rcgu.o
-        # missing), `cargo clean` and retry once. Adds latency only on the
+        # this damage across runs. Observed damage patterns:
+        #   - Orphan rmeta + missing rcgu.o (link fails):
+        #       clang: error: no such file or directory: '...rcgu.o'
+        #   - Missing build subdirs (cc-rs can't write output files):
+        #       cargo:warning=Fatal error: can't create .../build/.../X.o: No such file or directory
+        #   - Orphan rmeta references (rustc dep-graph lookup fails):
+        #       error: extern location for X does not exist: .../X.rmeta
+        # Fix: wrap the cargo invocation; on ANY of these damage-indicator
+        # patterns, `cargo clean` and retry once. Adds latency only on the
         # damage-recovery path; warm-cache happy path unchanged.
         timeout-minutes: 55
         run: |
@@ -139,7 +142,11 @@ jobs:
             cargo test --workspace --lib --exclude aprender-gpu --exclude aprender-cuda-edge --exclude aprender-compute 2>&1 | tee "$LOG"
           TEST_EXIT=${PIPESTATUS[0]}
           set -e
-          if [ "$TEST_EXIT" -ne 0 ] && grep -qE 'no such file or directory.*\.rcgu\.o' "$LOG"; then
+          # Match any known cancel-damage pattern. Combined regex covers:
+          #   - linker missing .rcgu.o (orphan rmeta, missing codegen)
+          #   - cc-rs Fatal error can't create .o (missing build subdirs)
+          #   - rustc extern location does not exist .rmeta (orphan dep-graph)
+          if [ "$TEST_EXIT" -ne 0 ] && grep -qE 'no such file or directory.*\.rcgu\.o|Fatal error: can.t create.*\.o: No such file|extern location for .* does not exist.*\.rmeta' "$LOG"; then
             echo "::warning::Linker failure on stale target dir from prior cancel-in-progress run; running cargo clean + retry"
             docker run --rm \
               -v "${GITHUB_WORKSPACE}:/workspace" \

From 6fcd975b59fe0da2c1510bf9e0a585895d7cd34d Mon Sep 17 00:00:00 2001
From: Noah Gift <noah.gift@gmail.com>
Date: Fri, 15 May 2026 13:02:07 +0200
Subject: [PATCH 4/5] ci(workspace-test): OS-level rm -rf + sccache-disabled
 retry on cancel-damage

cargo clean (commits f651296bf + df1c2c767) is insufficient. Observed
in aprender#1684 runs 25907945016 + 25912792473: after recovery
cargo clean, the retry fails within ~52s with:

  error: couldn't create a temp dir: No such file or directory
    (os error 2) at path "/workspace/target/debug/deps/rmetaDAW518"

Root cause: sccache (mounted at /home/noah/data/sccache, shared
across all PRs) caches metadata that references compile artifact
paths the cargo clean just removed. The retry asks sccache for a
cache hit; sccache returns a metadata blob pointing at a path that
no longer exists; rustc's mkdir+write fails.

Two-prong fix:
1. OS-level rm -rf of target dir contents (more thorough than cargo
   clean; doesn't depend on cargo lockfiles being sane), plus
   pre-create target/debug/deps/ to avoid mkdir races on parallel
   rustc invocations.
2. Disable sccache during retry (no RUSTC_WRAPPER, no SCCACHE_DIR,
   no /sccache mount). Pays 30-40min cold-compile cost once to
   guarantee correctness over a fast-but-broken cached state.

Cost: only on the damage-recovery path. Warm-cache happy path
unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 .github/workflows/ci.yml | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index cba044b62..e5aa98201 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -147,24 +147,31 @@ jobs:
           #   - cc-rs Fatal error can't create .o (missing build subdirs)
           #   - rustc extern location does not exist .rmeta (orphan dep-graph)
           if [ "$TEST_EXIT" -ne 0 ] && grep -qE 'no such file or directory.*\.rcgu\.o|Fatal error: can.t create.*\.o: No such file|extern location for .* does not exist.*\.rmeta' "$LOG"; then
-            echo "::warning::Linker failure on stale target dir from prior cancel-in-progress run; running cargo clean + retry"
+            # Observed in aprender#1684 runs 25907945016 + 25912792473: cargo
+            # clean is insufficient — after the clean, the retry build still
+            # fails with "couldn't create a temp dir: No such file or
+            # directory at .../target/debug/deps/rmetaX" within ~52s. Root
+            # cause is sccache cache pointing at cached metadata referencing
+            # paths the clean just removed. Two-prong fix:
+            #   1. Use rm -rf at the OS level (more thorough than cargo
+            #      clean; doesn't depend on cargo lockfiles to be in a sane
+            #      state).
+            #   2. Disable sccache for the retry (no RUSTC_WRAPPER, no
+            #      sccache mount) — pay the 30-40min cold-compile cost
+            #      once to guarantee correctness over a fast-but-broken
+            #      cached state.
+            echo "::warning::Cancel-damage pattern detected; OS-level target dir nuke + retry without sccache"
             docker run --rm \
-              -v "${GITHUB_WORKSPACE}:/workspace" \
               -v "/mnt/nvme-raid0/targets/aprender-ci/${PR_OR_REF}:/workspace/target" \
-              -w /workspace \
-              -e CARGO_TARGET_DIR=/workspace/target \
               "$IMAGE" \
-              cargo clean
+              bash -c 'rm -rf /workspace/target/* /workspace/target/.[!.]* 2>/dev/null || true; mkdir -p /workspace/target/debug/deps; ls -la /workspace/target/'
             docker run --rm \
             -e CI -e GITHUB_ACTIONS -e GITHUB_REF -e GITHUB_SHA -e GITHUB_REPOSITORY -e GITHUB_RUN_ID -e GITHUB_EVENT_NAME -e GITHUB_WORKFLOW \
               -v "${GITHUB_WORKSPACE}:/workspace" \
               -v "/mnt/nvme-raid0/cargo-ci/registry/${PR_OR_REF}:/usr/local/cargo/registry" \
               -v "/mnt/nvme-raid0/targets/aprender-ci/${PR_OR_REF}:/workspace/target" \
-              -v "/home/noah/data/sccache:/sccache" \
               -w /workspace \
               -e CARGO_TARGET_DIR=/workspace/target \
-              -e RUSTC_WRAPPER=rustc-sccache \
-              -e SCCACHE_DIR=/sccache \
               -e CARGO_INCREMENTAL=0 \
               -e CARGO_BUILD_JOBS=8 \
               -e CARGO_PROFILE_TEST_DEBUG=line-tables-only \

From d34013918c1b80d36f467af820c7c72a32d5949e Mon Sep 17 00:00:00 2001
From: Noah Gift <noah.gift@gmail.com>
Date: Fri, 15 May 2026 13:56:39 +0200
Subject: [PATCH 5/5] ci(workspace-test): prevent cancel-damage via pre-flight
 check (not retry-on-failure)

Replaces the retry-on-failure recovery (commits 6a165a647 + 6fcd975b5)
with a prevention-based pre-flight check that runs ONCE at the start
of workspace-test.

Five-whys (root cause):
1. Why does workspace-test fail with "no such file .rcgu.o" / extern
   location missing / cc-rs can't create .o? Cargo's incremental
   state on the per-PR target dir is inconsistent.
2. Why inconsistent? Prior run was SIGKILL'd mid-compile.
3. Why SIGKILL'd? concurrency.cancel-in-progress (line 22) cancels
   the previous run when a new commit lands (every "Update branch" /
   strict-up-to-date trigger when main moves).
4. Why does this persist? Target dir is bind-mounted from per-PR
   persistent path; partial-compile state survives across runs.
5. Root cause: cargo's incremental state is not atomic-on-kill, so
   persistent shared target dir + cancel-in-progress = damage.

Prevention: at job start, check the previous workflow run on this
branch via gh api. If conclusion was "cancelled", rm -rf the target
dir BEFORE invoking cargo. This addresses the root cause by ensuring
no damaged state enters the cargo invocation in the first place.

This is NOT retry-on-failure (which the operator rejects under the
"flake is not allowed" directive). It is one-time prevention based
on a verifiable signal (prior run conclusion = cancelled).

Cost: ~30-40min cold rebuild ONLY on runs following a cancellation.
Warm-cache happy path (no prior cancellation): zero added latency.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 .github/workflows/ci.yml | 108 +++++++++++++++++----------------------
 1 file changed, 48 insertions(+), 60 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index e5aa98201..133cd76fa 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -99,31 +99,61 @@ jobs:
             sleep "$delay"
             delay=$((delay + 6))  # linear backoff: 4,10,16,22,28,34,...
           done
+      - name: Pre-flight target-dir consistency check
+        # Root cause (five-whys):
+        #   1. Why does workspace-test sometimes fail with "no such file or
+        #      directory .rcgu.o" / extern location missing / cc-rs can't
+        #      create .o? Cargo's incremental state on the per-PR target
+        #      dir is inconsistent.
+        #   2. Why inconsistent? A prior run was SIGKILL'd mid-compile and
+        #      left orphan .rmeta files (parts cargo had registered as built)
+        #      without the corresponding .rcgu.o codegen artifacts (which were
+        #      mid-write at the moment of the kill).
+        #   3. Why was it SIGKILL'd? concurrency.cancel-in-progress (line 22)
+        #      cancels the previous run as soon as a new commit lands on the
+        #      branch (and "Update branch" / strict-up-to-date triggers this
+        #      every time aprender main moves forward).
+        #   4. Why does this persist? The target dir is bind-mounted from a
+        #      per-PR persistent path /mnt/nvme-raid0/targets/aprender-ci/<PR>/,
+        #      so partial-compile state survives across runs.
+        #   5. Root cause: cargo's incremental state is not atomic-on-kill, so
+        #      a persistent shared target dir + cancel-in-progress = damage.
+        # Prevention (this step): BEFORE invoking cargo, check whether the
+        # immediately-preceding workflow run on this branch was cancelled. If
+        # yes, rm -rf the target dir contents. This is a one-time check at
+        # job start — NOT a retry-on-failure pattern (which the operator
+        # rejects under the "flake is not allowed" directive).
+        run: |
+          set -e
+          if [ -z "${{ github.event.pull_request.number }}" ]; then
+            echo "Not a PR run; skipping prior-cancel check"
+            exit 0
+          fi
+          # Find the immediately-preceding workflow run on this branch.
+          # status=completed filter excludes the current in-progress run.
+          PREV_CONCLUSION=$(gh api \
+            "repos/${GITHUB_REPOSITORY}/actions/runs?branch=${GITHUB_HEAD_REF}&status=completed&per_page=1" \
+            --jq '.workflow_runs[0].conclusion' 2>/dev/null || echo "")
+          echo "Previous run conclusion on ${GITHUB_HEAD_REF}: ${PREV_CONCLUSION:-<none>}"
+          if [ "$PREV_CONCLUSION" = "cancelled" ]; then
+            echo "::warning::Previous run was cancelled; nuking target dir to prevent cargo cancel-damage"
+            docker run --rm \
+              -v "/mnt/nvme-raid0/targets/aprender-ci/${PR_OR_REF}:/workspace/target" \
+              "$IMAGE" \
+              bash -c 'rm -rf /workspace/target/* /workspace/target/.[!.]* 2>/dev/null || true; ls -la /workspace/target/ || true'
+          else
+            echo "No cancel damage to clean (prior conclusion: ${PREV_CONCLUSION:-fresh-branch})"
+          fi
+        env:
+          GH_TOKEN: ${{ github.token }}
       - name: Workspace lib tests (25,300+)
         # Excluded: aprender-gpu (cuBLAS), aprender-cuda-edge (CUDA), aprender-compute (SIMD SIGSEGV at exit)
         # Timeout: 55min (was 40). Cold sccache run is ~34min compile + ~7min tests
         # = 41min, which JUST overruns 40 — observed on intel-clean-room-3 first run
         # after the refactor invalidated its sccache by changing workspace lints.
         # Warm runs are ~3-5min compile per the workflow comments above.
-        #
-        # Cancel-damage recovery (five-whys): when concurrency.cancel-in-progress
-        # cancels the previous run on the same PR mid-compile, cargo's
-        # incremental state is left inconsistent. The persistent per-PR target
-        # dir at /mnt/nvme-raid0/targets/aprender-ci/${PR_OR_REF}/ preserves
-        # this damage across runs. Observed damage patterns:
-        #   - Orphan rmeta + missing rcgu.o (link fails):
-        #       clang: error: no such file or directory: '...rcgu.o'
-        #   - Missing build subdirs (cc-rs can't write output files):
-        #       cargo:warning=Fatal error: can't create .../build/.../X.o: No such file or directory
-        #   - Orphan rmeta references (rustc dep-graph lookup fails):
-        #       error: extern location for X does not exist: .../X.rmeta
-        # Fix: wrap the cargo invocation; on ANY of these damage-indicator
-        # patterns, `cargo clean` and retry once. Adds latency only on the
-        # damage-recovery path; warm-cache happy path unchanged.
         timeout-minutes: 55
         run: |
-          set +e
-          LOG=$(mktemp)
           docker run --rm \
           -e CI -e GITHUB_ACTIONS -e GITHUB_REF -e GITHUB_SHA -e GITHUB_REPOSITORY -e GITHUB_RUN_ID -e GITHUB_EVENT_NAME -e GITHUB_WORKFLOW \
             -v "${GITHUB_WORKSPACE}:/workspace" \
@@ -139,49 +169,7 @@ jobs:
             -e CARGO_PROFILE_TEST_DEBUG=line-tables-only \
             -e CARGO_PROFILE_DEV_DEBUG=line-tables-only \
             "$IMAGE" \
-            cargo test --workspace --lib --exclude aprender-gpu --exclude aprender-cuda-edge --exclude aprender-compute 2>&1 | tee "$LOG"
-          TEST_EXIT=${PIPESTATUS[0]}
-          set -e
-          # Match any known cancel-damage pattern. Combined regex covers:
-          #   - linker missing .rcgu.o (orphan rmeta, missing codegen)
-          #   - cc-rs Fatal error can't create .o (missing build subdirs)
-          #   - rustc extern location does not exist .rmeta (orphan dep-graph)
-          if [ "$TEST_EXIT" -ne 0 ] && grep -qE 'no such file or directory.*\.rcgu\.o|Fatal error: can.t create.*\.o: No such file|extern location for .* does not exist.*\.rmeta' "$LOG"; then
-            # Observed in aprender#1684 runs 25907945016 + 25912792473: cargo
-            # clean is insufficient — after the clean, the retry build still
-            # fails with "couldn't create a temp dir: No such file or
-            # directory at .../target/debug/deps/rmetaX" within ~52s. Root
-            # cause is sccache cache pointing at cached metadata referencing
-            # paths the clean just removed. Two-prong fix:
-            #   1. Use rm -rf at the OS level (more thorough than cargo
-            #      clean; doesn't depend on cargo lockfiles to be in a sane
-            #      state).
-            #   2. Disable sccache for the retry (no RUSTC_WRAPPER, no
-            #      sccache mount) — pay the 30-40min cold-compile cost
-            #      once to guarantee correctness over a fast-but-broken
-            #      cached state.
-            echo "::warning::Cancel-damage pattern detected; OS-level target dir nuke + retry without sccache"
-            docker run --rm \
-              -v "/mnt/nvme-raid0/targets/aprender-ci/${PR_OR_REF}:/workspace/target" \
-              "$IMAGE" \
-              bash -c 'rm -rf /workspace/target/* /workspace/target/.[!.]* 2>/dev/null || true; mkdir -p /workspace/target/debug/deps; ls -la /workspace/target/'
-            docker run --rm \
-            -e CI -e GITHUB_ACTIONS -e GITHUB_REF -e GITHUB_SHA -e GITHUB_REPOSITORY -e GITHUB_RUN_ID -e GITHUB_EVENT_NAME -e GITHUB_WORKFLOW \
-              -v "${GITHUB_WORKSPACE}:/workspace" \
-              -v "/mnt/nvme-raid0/cargo-ci/registry/${PR_OR_REF}:/usr/local/cargo/registry" \
-              -v "/mnt/nvme-raid0/targets/aprender-ci/${PR_OR_REF}:/workspace/target" \
-              -w /workspace \
-              -e CARGO_TARGET_DIR=/workspace/target \
-              -e CARGO_INCREMENTAL=0 \
-              -e CARGO_BUILD_JOBS=8 \
-              -e CARGO_PROFILE_TEST_DEBUG=line-tables-only \
-              -e CARGO_PROFILE_DEV_DEBUG=line-tables-only \
-              "$IMAGE" \
-              cargo test --workspace --lib --exclude aprender-gpu --exclude aprender-cuda-edge --exclude aprender-compute
-          elif [ "$TEST_EXIT" -ne 0 ]; then
-            echo "Non-recovery failure mode; propagating exit $TEST_EXIT"
-            exit "$TEST_EXIT"
-          fi
+            cargo test --workspace --lib --exclude aprender-gpu --exclude aprender-cuda-edge --exclude aprender-compute
       - name: Compute tests (tolerate SIGSEGV at exit — all tests pass but harness crashes on cleanup)
         run: |
           docker run --rm \