feat: standalone criterion bench workflow + self-hosted intel runner #12

Merged
noahgift merged 1 commit into main from feat/bench-workflow
Apr 28, 2026

@noahgift

Summary

  • Adds .github/workflows/bench.yml — full criterion suite on a self-hosted [intel] org runner (paiml/intel-clean-room-{1..8}), triggered manually, weekly via cron, or on main pushes that touch etl-core/ / etl-bench/ / Cargo.{toml,lock}.
  • Commits a seed run of bench-results/latest/ (Threadripper 7960X, ~9.6M rows/sec across 1k/10k/100k sizes — confirms linear scaling).
  • Updates README with a throughput badge + Benchmarks section explaining the smoke-vs-full split between gate (CI) and bench (this workflow).
  • Updates CHANGELOG under [Unreleased].
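
The linear-scaling claim in the seed-run bullet can be sanity-checked from per-size mean times. The sketch below is illustrative only: it assumes the mean time per iteration is reported in nanoseconds (as in criterion's `estimates.json`), the helper names `rows_per_sec` and `is_roughly_linear` are hypothetical, and the timings in `main` are made-up inputs, not the committed results.

```rust
/// Rows per second, given a row count and a mean time per iteration in nanoseconds.
fn rows_per_sec(rows: u64, mean_ns: f64) -> f64 {
    rows as f64 / (mean_ns * 1e-9)
}

/// True if every throughput lies within `tolerance` (relative) of the first one.
/// A roughly constant rows/sec across input sizes is what "linear in input size" means here.
fn is_roughly_linear(throughputs: &[f64], tolerance: f64) -> bool {
    match throughputs.first() {
        None => true,
        Some(&base) => throughputs
            .iter()
            .all(|t| ((t - base) / base).abs() <= tolerance),
    }
}

fn main() {
    // Hypothetical (rows, mean nanoseconds) pairs chosen for illustration only.
    let runs: [(u64, f64); 3] = [
        (1_000, 104_000.0),
        (10_000, 1_042_000.0),
        (100_000, 10_400_000.0),
    ];

    let throughputs: Vec<f64> = runs
        .iter()
        .map(|&(rows, ns)| rows_per_sec(rows, ns))
        .collect();

    for (&(rows, _), t) in runs.iter().zip(&throughputs) {
        println!("{rows} rows: {:.2}M rows/sec", t / 1e6);
    }
    println!("roughly linear (5% tolerance): {}", is_roughly_linear(&throughputs, 0.05));
}
```

The 5% tolerance is an arbitrary choice for the sketch; on a low-noise runner a tighter bound would be reasonable.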

Why a self-hosted runner?

Criterion's statistical model assumes low-variance measurements. GitHub-hosted shared VMs run with a 5-15% coefficient of variation (CV), which hides regressions below ~10%. Bare metal stays under 1%, so a 2% regression is real signal. For a teaching repo, that signal quality is worth the operational cost.
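
To make the CV argument concrete, here is a minimal sketch that computes the coefficient of variation (sample standard deviation divided by the mean) for two sets of timings. The sample values are hypothetical, and `is_detectable` is a deliberately crude rule of thumb introduced for illustration; it is not the statistical test criterion itself performs.

```rust
/// Coefficient of variation: sample standard deviation divided by the mean.
/// Expects at least two samples.
fn coefficient_of_variation(samples: &[f64]) -> f64 {
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    variance.sqrt() / mean
}

/// Crude heuristic (an assumption for this sketch): treat a relative change
/// as signal only when it is larger than the run-to-run CV.
fn is_detectable(relative_change: f64, cv: f64) -> bool {
    relative_change.abs() > cv
}

fn main() {
    // Hypothetical timings in microseconds, not measurements from this repo.
    let bare_metal = [100.0, 100.5, 99.5, 100.2, 99.8];
    let shared_vm = [100.0, 110.0, 90.0, 105.0, 95.0];

    let cv_quiet = coefficient_of_variation(&bare_metal);
    let cv_noisy = coefficient_of_variation(&shared_vm);

    println!("bare metal CV: {:.2}%", cv_quiet * 100.0);
    println!("shared VM CV:  {:.2}%", cv_noisy * 100.0);

    // A 2% regression stands out against the quiet runner but not the noisy one.
    println!("2% visible on bare metal: {}", is_detectable(0.02, cv_quiet));
    println!("2% visible on shared VM:  {}", is_detectable(0.02, cv_noisy));
}
```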

Why commit benchmark output?

  1. Teaching artifact — a learner reading the README can click through to actual numbers without cloning + installing criterion.
  2. Regression signal — git history preserves prior runs; multi-month drift on etl_throughput/100000 is a real signal.

Test plan

  • gate (existing CI) stays green
  • After merge, manually dispatch bench.yml from the Actions tab to confirm runner pickup + result commit-back
  • Verify the throughput badge in README links to a real bench-results/latest/SUMMARY.md

🤖 Generated with Claude Code

Three additions:

1. .github/workflows/bench.yml — runs the full criterion suite
   (warmup + 100 samples × 3 sizes) on a self-hosted [intel] runner
   from the paiml org runner pool. Triggered via workflow_dispatch,
   weekly cron (Sundays 06:00 UTC), or push to main when etl-core/,
   etl-bench/, or Cargo.{toml,lock} change. Pins CPU governor to
   performance during the run for low-CV measurements, captures
   host metadata, and commits curated results back to main.

2. bench-results/ — committed criterion output as a teaching
   artifact. Seed numbers from a Threadripper 7960X local run
   (~9.6M rows/sec across all three sizes — pipeline is linear in
   input size) plus a README explaining what's tracked, why
   benchmark output belongs in the repo, and how to read the JSON
   estimates. Workflow runs overwrite latest/; git history
   preserves drift.

3. README throughput badge + Benchmarks section — links to
   bench-results/latest/SUMMARY.md and explains the smoke-vs-full
   split between gate (CI) and bench (this) workflows. Covers why
   we picked a self-hosted runner (CV <1% bare metal vs 5-15%
   GitHub-hosted, so a 2% regression is real signal for a
   teaching repo).

CHANGELOG entries added under [Unreleased] / Added.
noahgift merged commit a21beb9 into main on Apr 28, 2026
3 checks passed
noahgift deleted the feat/bench-workflow branch on April 28, 2026, 11:04