perf(lottery): per-signer Taylor cache + factored compare (~46% fewer guest cycles) by Sbcdn · Pull Request #10 · blocksmithy/mithril-dwarf

Sbcdn · 2026-06-11T22:42:15Z

Two structural optimizations to the BLS lottery — the dominant per-cert proving cost — plus the test scaffolding and divergence-registry updates that prove they hold against upstream Mithril.

Changes

Per-signer Taylor cache. The (phi_plus, phi_minus) error-bound sequence is a pure function of the per-signer x and per-cert three; only the comparison against q is per-index. Build it once per signer and replay the cached bounds instead of recomputing the full series for every index (~48 indices/signer on mainnet).

Factor ev_max out of the compare. q.numer is always ev_max, so the ev_max * bound.denom side of each cross-multiply is constant per (signer, level). Precompute it once per cached bound, halving the per-level wide-muls.

Guest cycles (oaks_cert cycle_bench, mainnet SD certs)

cert	before	after	Δ
sd_a	301.2M	153.2M	-49.1%
sd_b	316.9M	171.4M	-45.9%
sd_c	276.1M	147.9M	-46.4%

Correctness

Both changes are bit-identical to the prior code by construction (the cache is a pure refactor; the factoring precomputes one operand of the same cross-multiply).

Full equivalence harness (real MithrilCertificateVerifier over the corpus + ~25 mutation axes): green.
Divergence-registry corpus verdict-equivalence gate: green.
upstream_differential: the cache/compare diffed against a faithful BigInt re-port of upstream mithril-stm's lottery over 2M realistic + 200k overflow-regime + 40k random inputs and a decision-boundary sweep — 0 regressions, 0 soundness-direction divergences.
factored_compare_*: q_gt_bound/q_lt_bound pinned bit-for-bit against crypto-ratio's gt/lt, including near-equal (tied-high-limb) constructions that end-to-end fuzz cannot reach.

Also documents three pre-existing lottery approximations vs upstream (from_float ~2^-52, ev_max 2^512-1, U512 Taylor overflow-panic) as divergence-registry entries #7-#9, and adds a guest-only guest-bench feature for per-section cycle profiling.

The per-index lottery check recomputed the full Taylor expansion of exp(x) for every index, but the (phi_plus, phi_minus) bound sequence is a pure function of the per-signer x and the per-cert `three` — only the comparison against q is per-index. Build the sequence once per signer and replay the cached bounds; the expensive Ratio512 normalize/wide-mul work now amortises across a signer's indices (47.6 indices/signer on a real mainnet cert). Guest cycle counts (oaks_cert cycle_bench, mainnet SD certs): sd_a 301.2M -> 184.7M (-38.7%) sd_b 316.9M -> 202.4M (-36.1%) sd_c 276.1M -> 173.2M (-37.3%) genesis unchanged (no lottery path). Behaviourally a no-op: the cached bounds are bit-identical to the pre-cache series by construction. Verified by differential fuzz in `upstream_differential` (cache == old series over 2M realistic + 200k overflow-regime inputs; 0 mismatches vs a faithful re-port of upstream mithril-stm's exact BigInt lottery over 40k random inputs; 0 soundness-direction divergences in the decision-boundary sweep) and by `taylor_cache_tests` (688 synthetic + 2428 real mainnet cert indices). Documents three pre-existing lottery approximations vs upstream (from_float ~2^-52, ev_max 2^512-1, U512 Taylor overflow-panic) as divergence-registry entries #7-#9; none introduced by this change.

q.numer is always ev_max, so the ev_max*bound.denom side of each compare is constant per (signer, level). Precompute it once per cached Taylor bound and reuse it across the signer's indices, halving the per-level wide-muls. ~15-17% fewer guest cycles per SD cert. Bit-identical to the unfactored compare; pinned by factored_compare_* against crypto-ratio gt/lt and the upstream differential suite. Also adds a guest-only guest-bench feature with per-section cycle_count probes in verify_bls_multisig.

Sbcdn added 3 commits June 11, 2026 22:45

chore: gitignore /.claude

f12cb99

Sbcdn merged commit 4466979 into main Jun 11, 2026
1 check failed

Sbcdn deleted the perf/taylor-per-signer-cache branch June 11, 2026 22:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf(lottery): per-signer Taylor cache + factored compare (~46% fewer guest cycles)#10

perf(lottery): per-signer Taylor cache + factored compare (~46% fewer guest cycles)#10
Sbcdn merged 3 commits into
mainfrom
perf/taylor-per-signer-cache

Sbcdn commented Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Sbcdn commented Jun 11, 2026

Changes

Guest cycles (oaks_cert cycle_bench, mainnet SD certs)

Correctness

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant