CI: paired clean=0 / planted>=1 across 12 references + Caracal pre-flight #2
Merged
…ight

Lifts CI scope from `planted >= 1` (single-leg) to paired `clean = 0` AND `planted >= 1` across all 12 reference contracts, closing the honest-bound gap documented in 5ac98a9's commit message ("a paired clean = 0 leg would require checking in 12 clean variants - out of scope for this dispatch").

Approach: single-tree twin via scarb feature flag:

* Each reference's Scarb.toml declares `[features] clean = []`.
* Each contract module defines a cfg-gated `CLEAN_TWIN: bool` const (`#[cfg(feature: 'clean')]` -> true; otherwise false) and wraps the planted-bug body in `if CLEAN_TWIN { /* fixed */ }` / `if !CLEAN_TWIN { /* planted */ }`.
* Default `scarb build` / `snforge test` keeps the planted bug embedded, so existing on-chain deploys, Voyager source-verifications, fixtures, and findings READMEs remain bit-identical.
* `snforge test --features clean` flips the gate; the same invariant test passes (rc=0, zero `INVARIANT VIOLATED` markers).

Six test drivers (erc20, erc721, lending, multisig, timelock, vesting) carry the same cfg-gated `CLEAN_TWIN` const + a pre-call guard on the transitions that the clean variant would otherwise reject (e.g. erc721 duplicate-mint, multisig double-execute, timelock pre-delay execute, lending LTV-breaching withdraw, vesting non-monotone-ts). The other six references needed no driver change (payment_splitter, oracle, governance, single_side_amm, erc4626, staking).

CI shape:

* `references` job is now a 12x2 matrix (reference x variant). planted: rc != 0 AND `INVARIANT VIOLATED` markers >= 1. clean: rc == 0 AND `INVARIANT VIOLATED` markers == 0. All 24 legs required for green.
* New `caracal` job runs `caracal detect` against every reference per push and uploads outputs as artifacts. **Advisory, not gating** for this PR: Caracal's Cairo 2.x coverage is incomplete and the false-positive surface against pinned Cairo 2.18 / scarb 2.18 is not yet characterized. Promoting to gating requires a separate suppression-set workstream; this delivers the M1 milestone "Caracal pre-flight in CI" as a live, every-push, recorded pre-flight.

Local validation (operator's machine, scarb 2.18.0 + snforge 0.61.0): all 24 legs green. The 12 planted legs produce 1+ markers + rc!=0; the 12 clean legs produce zero markers + rc==0 across the per-test fuzzer-runs setting.

README: new "Paired clean / planted CI" section documents the matrix shape, the twin-via-feature-flag mechanism, the Caracal advisory-mode call, and the run-id pattern for downstream verification.

Pre-built ahead of Starknet Seed M1 disbursement per the CaliperForge build-to-win roadmap.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
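The 12x2 matrix described in the commit message can be sketched in a few lines of Python. This is only an illustration of the shape, not the repository's actual workflow definition: the short reference names are taken from the commit message and the real directory names may differ, and `leg_command` / `build_matrix` are hypothetical helper names. The two `snforge` invocations are the ones quoted above.

```python
from itertools import product

# Short reference names as listed in the commit message. The actual
# directory names in the repository may differ, so treat these as labels.
REFERENCES = [
    "erc20", "erc721", "lending", "multisig", "timelock", "vesting",
    "payment_splitter", "oracle", "governance", "single_side_amm",
    "erc4626", "staking",
]
VARIANTS = ["planted", "clean"]


def leg_command(variant: str) -> list[str]:
    """Return the snforge invocation for one variant (hypothetical helper)."""
    if variant == "planted":
        # Default build: the planted bug stays embedded.
        return ["snforge", "test"]
    if variant == "clean":
        # The feature flag flips the cfg-gated CLEAN_TWIN const.
        return ["snforge", "test", "--features", "clean"]
    raise ValueError(f"unknown variant: {variant}")


def build_matrix() -> list[dict]:
    """Enumerate every (reference, variant) leg of the matrix."""
    return [
        {"reference": ref, "variant": var, "command": leg_command(var)}
        for ref, var in product(REFERENCES, VARIANTS)
    ]


MATRIX = build_matrix()
```

Enumerating the legs from a single list keeps the planted and clean variants paired by construction, so a reference cannot gain one leg without the other.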
Summary
Lifts CI scope from `planted >= 1` (single-leg) to paired `clean = 0` AND `planted >= 1` across all 12 reference contracts, plus adds a Caracal static-analysis pre-flight job (advisory mode). Closes the honest-bound gap documented in 5ac98a9 ("a paired `clean = 0` leg would require checking in 12 clean variants - out of scope for this dispatch") and pre-builds the CI surface ahead of the Starknet Seed M1 milestone.

* Each reference's `Scarb.toml` declares `[features] clean = []`; each contract module defines a cfg-gated `CLEAN_TWIN: bool` const (`#[cfg(feature: 'clean')]` → true; otherwise false) and wraps the planted-bug body in `if CLEAN_TWIN { /* fixed */ }` / `if !CLEAN_TWIN { /* planted */ }`. Default `scarb build` / `snforge test` keeps the planted bug embedded, so existing on-chain deploys, Voyager source-verifications, fixtures, and findings READMEs remain bit-identical. `snforge test --features clean` flips the gate; the same invariant test passes (rc=0, zero `INVARIANT VIOLATED` markers).
* Six test drivers (erc20, erc721, lending, multisig, timelock, vesting) carry the same cfg-gated `CLEAN_TWIN` const + a pre-call guard on the transitions the clean variant rejects (e.g. erc721 duplicate-mint, multisig double-execute, timelock pre-delay execute, lending LTV-breaching withdraw, vesting non-monotone-ts). The other six references needed no driver change (payment_splitter, oracle, governance, single_side_amm, erc4626, staking).
* `references` is now a 12×2 matrix (reference × variant). planted: `rc != 0` AND markers ≥ 1. clean: `rc == 0` AND markers == 0. All 24 legs required for green.
* `caracal` job runs `caracal detect` against every reference per push and uploads outputs as artifacts. Not gating for this PR: Caracal's Cairo 2.x coverage is incomplete and the false-positive surface against pinned Cairo 2.18 / scarb 2.18 is not yet characterized. Promoting to gating requires a separate suppression-set workstream; this delivers the M1 milestone "Caracal pre-flight in CI" as a live, every-push, recorded pre-flight. If Caracal install crashes on the runner, the install step short-circuits cleanly and the job still exits 0; the 12-twin invariant legs remain the authoritative proof.

Honest call-outs
`references/erc20_planted_bug/tests/cf_invariant_total_supply.cairo`: (a) the original `deploy-with-1000-to-deployer` pattern leaked supply into a 4th address and tripped the invariant on step 0 even under a fixed `burn`; (b) `cheat_caller_address` was applied to the wrong call when followed by a view read. Both are now fixed in-line with WHY comments. These are real fixes, not gate-shape changes; calling them out so downstream reviewers see the side-effect.

Local validation
Operator's machine, `scarb 2.18.0` + `snforge 0.61.0` (`.tool-versions`-pinned):

GitHub Actions run-id will be backfilled into the proof register once CI completes.
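For readers reproducing the validation locally, the per-leg pass condition stated in this PR (planted: `rc != 0` AND markers ≥ 1; clean: `rc == 0` AND markers == 0) can be expressed as a small check. This is a minimal sketch and not the CI job's actual script: `count_markers`, `leg_passes`, and `all_green` are hypothetical names, and it assumes the marker is matched as the literal substring `INVARIANT VIOLATED` in the captured test output.

```python
MARKER = "INVARIANT VIOLATED"


def count_markers(output: str) -> int:
    """Count literal occurrences of the violation marker in test output."""
    return output.count(MARKER)


def leg_passes(variant: str, rc: int, output: str) -> bool:
    """Apply the paired gate to one leg's exit code and captured output."""
    markers = count_markers(output)
    if variant == "planted":
        # The planted bug must be caught: failing exit code AND a marker.
        return rc != 0 and markers >= 1
    if variant == "clean":
        # The fixed twin must be silent: zero exit code AND no marker.
        return rc == 0 and markers == 0
    raise ValueError(f"unknown variant: {variant}")


def all_green(results: list[tuple[str, int, str]]) -> bool:
    """True only if every (variant, rc, output) leg satisfies its gate."""
    return all(leg_passes(variant, rc, out) for variant, rc, out in results)
```

Requiring both conditions on each leg guards against a planted leg that fails for an unrelated reason (non-zero exit code but no marker) and against a clean leg that logs a violation yet still exits 0.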
Test plan
* `references` matrix: 24 legs (12 planted + 12 clean) all green on the PR's first push.
* `caracal` advisory job exits 0 (gracefully degrades on install failure if upstream pin shifts).
* Default `scarb build` / `snforge test` behavior on any reference is unchanged (planted variant remains default; fixtures + on-chain deploys + findings narratives untouched).

AI disclosure
Per the cf-invariants calm-register disclosure footer policy (see `STARKNET_SEPOLIA_FINDINGS.md`). `Co-Authored-By: Claude` line present on the commit.