feat: standalone criterion bench workflow + self-hosted intel runner by noahgift · Pull Request #12 · paiml/shipping-rust

noahgift · 2026-04-28T11:02:28Z

Summary

Adds .github/workflows/bench.yml — full criterion suite on a self-hosted [intel] org runner (paiml/intel-clean-room-{1..8}), triggered manually, weekly via cron, or on main pushes that touch etl-core/ / etl-bench/ / Cargo.{toml,lock}.
Commits a seed run of bench-results/latest/ (Threadripper 7960X, ~9.6M rows/sec across 1k/10k/100k sizes — confirms linear scaling).
Updates README with a throughput badge + Benchmarks section explaining the smoke-vs-full split between gate (CI) and bench (this workflow).
Updates CHANGELOG under [Unreleased].
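The "linear scaling" claim in the summary amounts to rows/sec staying roughly constant as the input grows. A minimal std-only sketch of that check follows; the function names and the tolerance parameter are hypothetical and are not taken from etl-bench or from criterion itself.

```rust
use std::time::Duration;

/// Throughput in rows per second for one measured run.
fn rows_per_sec(rows: u64, elapsed: Duration) -> f64 {
    rows as f64 / elapsed.as_secs_f64()
}

/// Returns true when every throughput is within `tolerance` (a relative
/// fraction, e.g. 0.05 for 5%) of the first entry. An empty slice counts
/// as linear. This is an illustrative rule, not criterion's analysis.
fn is_roughly_linear(throughputs: &[f64], tolerance: f64) -> bool {
    match throughputs.first() {
        None => true,
        Some(&base) => throughputs
            .iter()
            .all(|t| ((t - base) / base).abs() <= tolerance),
    }
}
```

Feeding it the per-size throughputs for the 1k, 10k, and 100k inputs with a tolerance of a few percent would flag a pipeline whose per-row cost grows with input size.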

Why a self-hosted runner?

Criterion's statistical model assumes low-variance measurements. GitHub-hosted shared VMs run with a 5-15% coefficient of variation (CV), which hides regressions below ~10%. Bare metal stays under 1% CV, so a 2% regression is real signal. For a teaching repo, signal quality is worth the operational cost.
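To make the CV argument concrete, here is a small sketch of how a coefficient of variation could be computed from raw timing samples, with a rough test for whether a change stands out from the noise. It uses only the standard library; `exceeds_noise` is a hypothetical rule of thumb for illustration and is not how criterion decides significance.

```rust
/// Sample mean of a slice of measurements (e.g. per-iteration times in ns).
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

/// Coefficient of variation: sample standard deviation divided by the mean.
/// Returns None for fewer than two samples or a zero mean.
fn coefficient_of_variation(samples: &[f64]) -> Option<f64> {
    if samples.len() < 2 {
        return None;
    }
    let m = mean(samples);
    if m == 0.0 {
        return None;
    }
    // Unbiased sample variance (divides by n - 1).
    let variance = samples.iter().map(|x| (x - m).powi(2)).sum::<f64>()
        / (samples.len() - 1) as f64;
    Some(variance.sqrt() / m)
}

/// Hypothetical rule of thumb: a relative change only counts as signal
/// when its magnitude is larger than the run-to-run CV.
fn exceeds_noise(relative_change: f64, cv: f64) -> bool {
    relative_change.abs() > cv
}
```

Under this simplified rule, a 2% change clears a 1% CV but disappears inside a 10% CV, which is the trade-off the paragraph above describes.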

Why commit benchmark output?

Teaching artifact — a learner reading the README can click through to actual numbers without cloning + installing criterion.
Regression signal — git history preserves prior runs; multi-month drift on etl_throughput/100000 is a real signal.
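Drift across committed runs can be read as the relative change between the oldest and newest point estimate for one benchmark. The sketch below assumes the caller has already extracted a mean time per run from git history (oldest first); `relative_drift` is a hypothetical helper, not part of this repo or of criterion.

```rust
/// Relative change from the first to the last entry of `history`.
/// With mean times as input, a positive result means the newest run is slower.
/// Returns None when there are fewer than two entries or the first is zero.
fn relative_drift(history: &[f64]) -> Option<f64> {
    let first = *history.first()?;
    let last = *history.last()?;
    if history.len() < 2 || first == 0.0 {
        return None;
    }
    Some((last - first) / first)
}
```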

Test plan

gate (existing CI) stays green
After merge, manually dispatch bench.yml from the Actions tab to confirm runner pickup + result commit-back
Verify the throughput badge in README links to a real bench-results/latest/SUMMARY.md

🤖 Generated with Claude Code

Three additions:

1. .github/workflows/bench.yml: runs the full criterion suite (warmup + 100 samples × 3 sizes) on a self-hosted [intel] runner from the paiml org runner pool. Triggered via workflow_dispatch, weekly cron (Sundays 06:00 UTC), or push to main when etl-core/, etl-bench/, or Cargo.{toml,lock} change. Pins the CPU governor to performance during the run for low-CV measurements, captures host metadata, and commits curated results back to main.

2. bench-results/: committed criterion output as a teaching artifact. Seed numbers from a Threadripper 7960X local run (~9.6M rows/sec across all three sizes, so the pipeline is linear in input size) plus a README explaining what's tracked, why benchmark output belongs in the repo, and how to read the JSON estimates. Workflow runs overwrite latest/; git history preserves drift.

3. README throughput badge + Benchmarks section: links to bench-results/latest/SUMMARY.md and explains the smoke-vs-full split between the gate (CI) and bench (this) workflows. Covers why we picked a self-hosted runner (CV <1% on bare metal vs 5-15% on GitHub-hosted VMs, so a 2% regression is real signal for a teaching repo).

CHANGELOG entries added under [Unreleased] / Added.

noahgift merged commit a21beb9 into main Apr 28, 2026
3 checks passed

noahgift deleted the feat/bench-workflow branch April 28, 2026 11:04

noahgift merged 1 commit into main from feat/bench-workflow
