ci: re-enable sccache — unblocks 240-PR cold-compile bottleneck #1632
Merged
The Phase 3 sccache pilot was disabled on 2026-04-19 because the
`sovereign-ci:stable` container image was missing the `rustc-sccache`
wrapper script. That was fixed upstream in paiml/infra commit f4fccf9
("use exec script not symlink", PR #66) the same day, but aprender's
ci.yml was never flipped back.
Verified on intel runner:
$ docker run --rm localhost:5000/sovereign-ci:stable rustc-sccache --version
sccache 0.14.0
$ docker run --rm localhost:5000/sovereign-ci:stable which rustc-sccache
/usr/local/bin/rustc-sccache
Sccache cache directory is warm: `/home/noah/data/sccache` is ~11GB
across 290 sub-dirs, shared across all 16 intel-clean-room runners and
all PRs via the existing `/home/noah/data/sccache:/sccache` bind-mount
in `paiml/.github/.github/workflows/sovereign-ci.yml`.
Why this matters:
- Per-PR target dir scheme (`/mnt/nvme-raid0/targets/aprender-ci/<PR>`)
from #1043 cold-compiles each new PR's 879 deps from scratch.
- Job timing (PR #1619 latest run): 34min build, then 4min of tests
before the 40min timeout fires. Tests never finish.
- 249-PR queue × 34min cold compile = backlog cannot drain.
- With sccache hit-rate ≥80% expected on a warm cache, cold builds
drop from 34min → ~3-5min, and the timeout becomes a non-issue.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
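A back-of-envelope check of the backlog numbers in the commit message above, as a rough sketch. It assumes one cold build per queued PR, no re-runs on new pushes, and a perfectly even spread across the 16 runners; none of those assumptions come from the PR itself, and the variable names are illustrative.

```shell
# Figures quoted in the commit message above.
prs=249        # queued PRs
cold_min=34    # cold build time per PR, in minutes (PR #1619)
warm_lo=3      # expected warm-cache build time, lower bound (minutes)
warm_hi=5      # expected warm-cache build time, upper bound (minutes)
runners=16     # intel-clean-room runners

# Total compile time in runner-minutes, builds only (tests excluded).
cold_total=$((prs * cold_min))
warm_total_lo=$((prs * warm_lo))
warm_total_hi=$((prs * warm_hi))

# Idealized wall-clock minutes if work spread evenly over all runners.
cold_wall=$((cold_total / runners))

echo "cold: ${cold_total} runner-minutes, about ${cold_wall} min wall-clock on ${runners} runners"
echo "warm: ${warm_total_lo}-${warm_total_hi} runner-minutes"
```

Even under these generous assumptions, the cold path costs 8466 runner-minutes of compile time before any test completes, versus 747 to 1245 at the expected warm-cache build times.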
The first commit on this branch flipped enable_sccache=true on the
reusable ci/{test,lint,coverage,...} jobs. That doesn't reach the
inline `workspace-test` job (the slowest one, where the 40min timeout
actually fires), so this commit wires sccache into it directly:
- Bind-mount /home/noah/data/sccache:/sccache (shared across all 16
intel-clean-room runners + all PRs; sccache handles concurrency
via per-entry atomic rename + LRU eviction).
- Set RUSTC_WRAPPER=rustc-sccache (image-baked exec shim) and
SCCACHE_DIR=/sccache env vars.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
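To see the same wiring outside of CI, the bind-mount and env vars listed above can be passed to a one-off container. This is a sketch only: the `run` and `probe_sccache` helpers are hypothetical, and it assumes the `rustc-sccache` shim forwards `--show-stats` to sccache the same way it forwards `--version`. It prints the command by default; set `RUN=1` on a runner host to execute it.

```shell
# Values taken from the commit message above.
image="localhost:5000/sovereign-ci:stable"
host_cache="/home/noah/data/sccache"

# Hypothetical helper: echo the command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

# Hypothetical helper: mirror the workspace-test wiring in a throwaway
# container and ask sccache for its hit/miss statistics.
probe_sccache() {
  run docker run --rm \
    -v "${host_cache}:/sccache" \
    -e RUSTC_WRAPPER=rustc-sccache \
    -e SCCACHE_DIR=/sccache \
    "$image" rustc-sccache --show-stats
}

probe_sccache
```

Keeping the dry-run as the default makes the snippet safe to paste on a machine that does not have the image or the cache directory.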
Summary
Diagnosis
```
$ docker run --rm localhost:5000/sovereign-ci:stable rustc-sccache --version
sccache 0.14.0
$ docker run --rm localhost:5000/sovereign-ci:stable which rustc-sccache
/usr/local/bin/rustc-sccache
```
Wrapper is in the live image. Cache dir warm:
```
$ sudo du -sh /home/noah/data/sccache
11G /home/noah/data/sccache
```
Bind-mount and `RUSTC_WRAPPER=rustc-sccache` env var are already wired in `paiml/.github/.github/workflows/sovereign-ci.yml:210`, gated on the `enable_sccache` input. Only aprender's flag was stale.
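Since the pilot was originally disabled because the wrapper was missing from the image, a job could guard against that regression with a preflight step before invoking cargo. A minimal sketch, assuming a POSIX shell inside the container; the function names are hypothetical and not part of this PR or of `sovereign-ci.yml`.

```shell
# Succeeds when the directory exists and holds at least one entry.
cache_is_warm() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Succeeds when the named wrapper resolves on PATH.
wrapper_present() {
  command -v "$1" >/dev/null 2>&1
}

# Fails with a message on stderr if either precondition is not met.
preflight() {
  dir="${SCCACHE_DIR:-/sccache}"
  if ! wrapper_present rustc-sccache; then
    echo "rustc-sccache not found on PATH" >&2
    return 1
  fi
  if ! cache_is_warm "$dir"; then
    echo "cache dir $dir is missing or empty" >&2
    return 1
  fi
  echo "sccache preflight ok: $dir"
}

# Usage inside a CI step (not executed here):
#   preflight || exit 1
```

Failing fast here would surface a missing shim or an unmounted cache in seconds, instead of after a long cold compile.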
Job timing — current state (PR #1619, last failed run)
Expected after this PR
With ~80% sccache hit rate on warm cache, cold builds are expected to drop from 34min to ~3-5min.
Test plan
🤖 Generated with Claude Code