CI: paired clean=0 / planted>=1 across 12 references + Caracal pre-flight #2

Merged
michael-moffett merged 1 commit into main from feat/12twin-clean-ci
Jun 7, 2026

Conversation

@michael-moffett

Contributor

Summary

Lifts CI scope from planted >= 1 (single-leg) to paired clean = 0 AND planted >= 1 across all 12 reference contracts, plus adds a Caracal static-analysis pre-flight job (advisory mode). Closes the honest-bound gap documented in 5ac98a9 ("a paired clean = 0 leg would require checking in 12 clean variants — out of scope for this dispatch") and pre-builds the CI surface ahead of the Starknet Seed M1 milestone.

  • Twin selector: each reference's Scarb.toml declares [features] clean = []; each contract module defines a cfg-gated CLEAN_TWIN: bool const (#[cfg(feature: 'clean')] → true; otherwise false) and wraps the planted-bug body in if CLEAN_TWIN { /* fixed */ } / if !CLEAN_TWIN { /* planted */ }. Default scarb build / snforge test keeps the planted bug embedded — existing on-chain deploys, Voyager source-verifications, fixtures, and findings READMEs remain bit-identical. snforge test --features clean flips the gate; the same invariant test passes (rc=0, zero INVARIANT VIOLATED markers).
  • Six test driver gates (erc20, erc721, lending, multisig, timelock, vesting): each affected driver carries the same cfg-gated CLEAN_TWIN const + a pre-call guard on the transitions the clean variant rejects (e.g. erc721 duplicate-mint, multisig double-execute, timelock pre-delay execute, lending LTV-breaching withdraw, vesting non-monotone-ts). Six references needed no driver change (payment_splitter, oracle, governance, single_side_amm, erc4626, staking).
  • CI shape: references is now a 12×2 matrix (reference × variant). planted: rc != 0 AND markers ≥ 1. clean: rc == 0 AND markers == 0. All 24 legs required for green.
  • Caracal pre-flight (advisory): new caracal job runs caracal detect against every reference per push and uploads the outputs as artifacts. Not gating for this PR: Caracal's Cairo 2.x coverage is incomplete, and its false-positive surface against the pinned Cairo 2.18 / scarb 2.18 toolchain is not yet characterized. Promoting to gating requires a separate suppression-set workstream; this delivers the M1 milestone "Caracal pre-flight in CI" as a live, every-push, recorded pre-flight. If the Caracal install fails on the runner, the install step short-circuits cleanly and the job still exits 0; the 12-twin invariant legs remain the authoritative proof.
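
The two gate predicates in the CI shape bullet can be sketched as a small check over a leg's exit code and captured output. This is a minimal illustration, not the workflow's actual implementation: `leg_passes` and its parameters are hypothetical names, and it assumes a marker is counted as one occurrence of the literal string `INVARIANT VIOLATED` in the test output.

```python
MARKER = "INVARIANT VIOLATED"


def leg_passes(variant: str, rc: int, output: str) -> bool:
    """Return True if a leg satisfies the gate for its variant.

    planted: rc != 0 AND markers >= 1
    clean:   rc == 0 AND markers == 0
    """
    markers = output.count(MARKER)
    if variant == "planted":
        return rc != 0 and markers >= 1
    if variant == "clean":
        return rc == 0 and markers == 0
    raise ValueError(f"unknown variant: {variant!r}")
```

Requiring both conditions on the planted leg means a non-zero exit without a marker (for example, a compile error) does not count as a detected bug.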

Honest call-outs

  • Caracal is advisory, not gating. Documented explicitly above and in the workflow preamble. The gating-promotion is a follow-up ticket once the suppression set across the 12 references is settled.
  • erc20 driver scope-creep. The clean=0 leg surfaced two latent test-driver bugs on references/erc20_planted_bug/tests/cf_invariant_total_supply.cairo: (a) the original deploy-with-1000-to-deployer pattern leaked supply into a 4th address and tripped the invariant on step 0 even under a fixed burn; (b) cheat_caller_address was applied to the wrong call when followed by a view read. Both are now fixed in-line with WHY comments. These are real fixes, not gate-shape changes; calling them out so downstream reviewers see the side-effect.

Local validation

Operator's machine, scarb 2.18.0 + snforge 0.61.0 (.tool-versions-pinned):

erc20_planted_bug      planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
governance             planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
single_side_amm        planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
erc4626_ref            planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
multisig_ref           planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
erc721_ref             planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
lending_ref            planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
staking_ref            planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
vesting_ref            planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
timelock_ref           planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
payment_splitter_ref   planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
oracle_ref             planted(rc=1 markers=1) clean(rc=0 markers=0) -> OK
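
Summary lines in the format above can be re-checked mechanically. The following is a rough sketch only: the regular expression is inferred from the twelve lines shown here, and `check_summary` is a hypothetical helper, not part of the repository.

```python
import re

# Pattern inferred from the summary lines above (an assumption, not a spec).
LINE = re.compile(
    r"^(?P<name>\S+)\s+planted\(rc=(?P<prc>\d+) markers=(?P<pm>\d+)\)"
    r"\s+clean\(rc=(?P<crc>\d+) markers=(?P<cm>\d+)\)\s+->\s+(?P<status>\S+)$"
)


def check_summary(text: str) -> list[str]:
    """Return the references (or unparsable lines) that break the paired gate."""
    failures = []
    for raw in text.strip().splitlines():
        line = raw.strip()
        m = LINE.match(line)
        if m is None:
            failures.append(line)  # unrecognized line: report it verbatim
            continue
        planted_ok = int(m["prc"]) != 0 and int(m["pm"]) >= 1
        clean_ok = int(m["crc"]) == 0 and int(m["cm"]) == 0
        if not (planted_ok and clean_ok and m["status"] == "OK"):
            failures.append(m["name"])
    return failures
```

An empty list means every parsed reference satisfied both halves of the pair.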

GitHub Actions run-id will be backfilled into the proof register once CI completes.

Test plan

  • references matrix: 24 legs (12 planted + 12 clean) all green on the PR's first push.
  • caracal advisory job exits 0 (gracefully degrades on install failure if upstream pin shifts).
  • No regression to default scarb build / snforge test behavior on any reference (planted variant remains default — fixtures + on-chain deploys + findings narratives untouched).
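
The "advisory, exits 0 even if the tool is unavailable" behavior in the second item could look roughly like the wrapper below. It is a sketch under stated assumptions: `run_advisory` is a hypothetical name, the real job is a CI workflow step rather than Python, and the exact `caracal detect` arguments are not given in this PR, so the command list is left to the caller.

```python
import subprocess


def run_advisory(cmd: list[str]) -> tuple[int, str]:
    """Run cmd, keep its output for upload, and always report exit code 0."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
        log = f"rc={proc.returncode}\n{proc.stdout}{proc.stderr}"
    except OSError as exc:
        # Tool missing or not executable: degrade gracefully, do not fail.
        log = f"tool unavailable: {exc}"
    return 0, log
```

The real exit code is preserved in the log rather than propagated, so findings stay recorded without gating the build.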

AI disclosure

Disclosed per the cf-invariants calm-register disclosure footer policy (see STARKNET_SEPOLIA_FINDINGS.md): a Co-Authored-By: Claude line is present on the commit.

…ight

Lifts CI scope from `planted >= 1` (single-leg) to paired
`clean = 0` AND `planted >= 1` across all 12 reference contracts,
closing the honest-bound gap documented in 5ac98a9's commit message
("a paired clean = 0 leg would require checking in 12 clean
variants — out of scope for this dispatch").

Approach — single-tree twin via scarb feature flag:
  * Each reference's Scarb.toml declares `[features] clean = []`.
  * Each contract module defines a cfg-gated `CLEAN_TWIN: bool`
    const (`#[cfg(feature: 'clean')]` -> true; otherwise false)
    and wraps the planted-bug body in `if CLEAN_TWIN { /* fixed */ }`
    / `if !CLEAN_TWIN { /* planted */ }`.
  * Default `scarb build` / `snforge test` keeps the planted bug
    embedded — existing on-chain deploys, Voyager source-verifications,
    fixtures, and findings READMEs remain bit-identical.
  * `snforge test --features clean` flips the gate; the same
    invariant test passes (rc=0, zero `INVARIANT VIOLATED` markers).

Six test drivers (erc20, erc721, lending, multisig, timelock,
vesting) carry the same cfg-gated `CLEAN_TWIN` const + a pre-call
guard on the transitions that the clean variant would otherwise
reject (e.g. erc721 duplicate-mint, multisig double-execute,
timelock pre-delay execute, lending LTV-breaching withdraw,
vesting non-monotone-ts). Six references needed no driver
change (payment_splitter, oracle, governance, single_side_amm,
erc4626, staking).
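
To illustrate why a driver needs a pre-call guard, here is a toy Python model of the erc721 duplicate-mint case. Everything in it is hypothetical: `ToyNft`, `drive`, and the "minted count equals owner count" invariant are invented for illustration and are not the repository's Cairo code or its actual planted bug.

```python
class ToyNft:
    """Toy stand-in for a reference contract; not the real erc721 code."""

    def __init__(self, clean: bool):
        self.clean = clean        # models the contract-side CLEAN_TWIN
        self.owners = {}          # token_id -> owner
        self.total_minted = 0

    def mint(self, token_id: int, to: str) -> None:
        if self.clean and token_id in self.owners:
            raise ValueError("token already minted")  # fixed branch rejects
        self.owners[token_id] = to  # planted branch overwrites silently
        self.total_minted += 1


def drive(contract: ToyNft, steps, clean_twin: bool) -> int:
    """Replay steps and count invariant violations."""
    violations = 0
    for token_id, to in steps:
        if clean_twin and token_id in contract.owners:
            continue  # pre-call guard: skip a call the clean twin would reject
        contract.mint(token_id, to)
        if contract.total_minted != len(contract.owners):
            violations += 1
    return violations
```

Without the guard, the clean twin would abort the run on the duplicate mint instead of finishing with zero violations; with it, the same step sequence exercises both variants.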

CI shape:
  * `references` job is now a 12x2 matrix (reference x variant).
    planted: rc != 0 AND `INVARIANT VIOLATED` markers >= 1.
    clean:   rc == 0 AND `INVARIANT VIOLATED` markers == 0.
    All 24 legs required for green.
  * New `caracal` job runs `caracal detect` against every reference
    per push, uploads outputs as artifacts. **Advisory, not gating**
    for this PR — Caracal's Cairo 2.x coverage is incomplete and
    the false-positive surface against pinned Cairo 2.18 / scarb 2.18
    is not yet characterized. Promoting to gating requires a separate
    suppression-set workstream; this delivers the M1 milestone
    "Caracal pre-flight in CI" as a live, every-push, recorded
    pre-flight.
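
As a rough sketch of the 12x2 shape, the matrix can be enumerated as a cross product of references and variants. The reference names are copied from the local validation table in the PR description; whether they match the directory names is an assumption, and `build_matrix` is a hypothetical helper rather than the workflow's own definition.

```python
from itertools import product

REFERENCES = [
    "erc20_planted_bug", "governance", "single_side_amm", "erc4626_ref",
    "multisig_ref", "erc721_ref", "lending_ref", "staking_ref",
    "vesting_ref", "timelock_ref", "payment_splitter_ref", "oracle_ref",
]

# Commands as stated in the commit message: default build is planted,
# `--features clean` flips the gate.
VARIANTS = {
    "planted": ["snforge", "test"],
    "clean": ["snforge", "test", "--features", "clean"],
}


def build_matrix() -> list[dict]:
    """Return one entry per (reference, variant) leg."""
    return [
        {"reference": ref, "variant": variant, "cmd": VARIANTS[variant]}
        for ref, variant in product(REFERENCES, VARIANTS)
    ]
```

`len(build_matrix())` is 24, matching the "all 24 legs required for green" condition.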

Local validation (operator's machine, scarb 2.18.0 + snforge 0.61.0):
all 24 legs green — 12 planted legs produce 1+ markers + rc!=0;
12 clean legs produce zero markers + rc==0 across the per-test
fuzzer-runs setting.

README: new "Paired clean / planted CI" section documents the
matrix shape, the twin-via-feature-flag mechanism, the Caracal
advisory-mode call, and the run-id pattern for downstream
verification.

Pre-built ahead of Starknet Seed M1 disbursement per the
CaliperForge build-to-win roadmap.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@michael-moffett michael-moffett merged commit 0bff2c4 into main Jun 7, 2026
26 checks passed
@michael-moffett michael-moffett deleted the feat/12twin-clean-ci branch June 7, 2026 15:21