v18 fresh full audit: reusable AUDIT_CHECKLIST + 3 HF gate fixes #33

Open
lucapinello wants to merge 1 commit into chorus-applications from audit/2026-04-21-v18-fresh-full-audit

Summary

A from-scratch audit with nothing cached or precomputed — fresh notebook re-execution, selenium IGV rendering (so the client-side JS actually loads tracks instead of a placeholder), CDF sanity across all 6 oracle normalizers, device detection across all 6 oracle envs on macOS arm64, and a trace of the HuggingFace gate that AlphaGenome users hit on first use.

Main deliverable — audits/AUDIT_CHECKLIST.md

A reusable 12-section runbook for future audits. Every check has an exact command or grep pattern, and P0/P1/P2 severity tags so you know what blocks ship vs. what's polish:

  1. Installation & environment — environment.yml, chorus setup, chorus genome download, per-oracle env existence
  2. HuggingFace auth — HF_TOKEN, whoami, license-repo URL consistency across 3 code paths
  3. GPU / device detection — per-env probe of TF / PyTorch / JAX, CUDA_VISIBLE_DEVICES respect
  4. Per-track CDF / normalization — monotonicity, p50/p95/p99 ordering, signed-flag semantics, track counts
  5. Python API sanity — create_oracle, sequence_length, ModelNotLoadedError, ref-allele warning
  6. Notebooks — cell-by-cell fresh execution, drift band
  7. Shipped HTML reports — selenium full-JS render recipe, IGV visibility, formula badges, cell-type dedup
  8. MCP server — 22-tool registry, list_oracles spec sync
  9. Error-message quality — every path triggered and inspected
  10. Repo-wide consistency — canonical numbers + formula conventions + directory naming
  11. Test suite — fast + integration gates
  12. Reproducibility — regen scripts, cache paths

Appendix: exactly what artefacts an audit should leave behind in audits/YYYY-MM-DD_vNN_<label>/ so the next auditor can diff mechanically.

Fixes

The HF-gate agent caught 3 live drifts — all in the flow a first-time AlphaGenome user follows:

  1. chorus/oracles/alphagenome.py:133 — the "no HF token" error pointed at huggingface.co/google/alphagenome, but the real gated repo is google/alphagenome-all-folds (matches README.md:631, README.md:926, environments/README.md:105). A user clicking the link would not find the license form. Fixed.
  2. chorus/oracles/alphagenome_source/templates/load_template.py:49 — env-runner load path raised with no URL at all. Appended the alphagenome-all-folds license URL so direct and env-runner paths give the same actionable guidance.
  3. chorus/oracles/alphagenome_source/alphagenome_metadata.py:4 — module docstring said "5,930+ tracks"; library reports 5,731 (matches v17 audit and post-v16 notebooks). Updated.

Also tightened tests/test_error_recovery.py:169 from a loose substring match to the full correct URL, so the test actually catches this drift in CI.
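
The tightened assertion matters because the wrong URL is a prefix of the right one, so a substring check on the short form passes for both. A minimal sketch of the difference follows. `build_no_token_message` is a hypothetical stand-in for the real error path, and the actual test body in tests/test_error_recovery.py is not reproduced here.

```python
# Illustrative only: build_no_token_message is a hypothetical stand-in
# for the code that raises the "no HF token" error.
WRONG_URL = "huggingface.co/google/alphagenome"
RIGHT_URL = "huggingface.co/google/alphagenome-all-folds"


def build_no_token_message(repo_url: str) -> str:
    """Return an error message that points the user at the license page."""
    return (
        "No HuggingFace token found. Set HF_TOKEN or run "
        f"`huggingface-cli login`, then accept the license at https://{repo_url}"
    )


def loose_check(message: str) -> bool:
    # Old-style check: passes for both URLs, because the wrong URL
    # is a prefix of the right one.
    return WRONG_URL in message


def strict_check(message: str) -> bool:
    # Tightened check: only the full gated-repo URL satisfies it.
    return RIGHT_URL in message


old_message = build_no_token_message(WRONG_URL)
new_message = build_no_token_message(RIGHT_URL)

print(loose_check(old_message), loose_check(new_message))    # True True
print(strict_check(old_message), strict_check(new_message))  # False True
```

The loose check cannot tell the two messages apart, which is how the stale link survived CI until this audit.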

What the audit found healthy

  • Fresh notebook run: exit 0, zero errors, zero warnings. The ref-allele warning that used to fire on cell 39 is gone, which confirms that PR #32 (Fix off-by-one in predict_variant_effect ref-allele check) flowed through.
  • All 6 CDF normalizers load, sort-monotonic, signed% matches expected semantics. Sei + LegNet auto-download from huggingface.co/datasets/lucapinello/chorus-backgrounds works.
  • Device detection clean on macOS arm64: TF Metal picked up for Enformer/ChromBPNet, MPS for Borzoi/Sei/LegNet, JAX device list populated for AlphaGenome. Zero hardcoded cuda:0 in live code.
  • 5 selenium-rendered HTMLs: IGV tracks load with signal overlays, 0 browser-console JS errors.

Known issues NOT fixed here

  • Genome storage hardcoded to <repo>/genomes/ in core/globals.py:13. Should be user-overridable (CHORUS_GENOMES_DIR) and default to ~/.chorus/genomes/.
  • EnvironmentManager.install_chorus_primitive can raise with empty stderr on install failure.
  • README doesn't document that chorus genome download auto-resumes after a stall.

Each is worth its own focused PR.
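
For the genome-storage item, the proposed behaviour can be sketched as follows. This is only an illustration of the override-then-default lookup: `resolve_genomes_dir` is a hypothetical function, not existing code in core/globals.py, and `CHORUS_GENOMES_DIR` is the variable name proposed above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_genomes_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Pick the genome directory: env override first, then a per-user default.

    Hypothetical helper; CHORUS_GENOMES_DIR is the proposed variable name.
    """
    env = os.environ if env is None else env
    override = env.get("CHORUS_GENOMES_DIR")
    if override:
        # Expand "~" so users can write CHORUS_GENOMES_DIR=~/data/genomes.
        return Path(override).expanduser()
    return Path.home() / ".chorus" / "genomes"


print(resolve_genomes_dir({"CHORUS_GENOMES_DIR": "/data/genomes"}))
print(resolve_genomes_dir({}))
```

Passing the environment as a parameter keeps the lookup easy to test without mutating `os.environ`.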

Test plan

  • pytest tests/ --ignore=tests/test_smoke_predict.py -q → 334 passed / 1 skipped (9m 11s)
  • Fresh jupyter nbconvert --execute single_oracle_quickstart.ipynb → exit 0, 0 errors, 0 warnings
  • Selenium rendering of 5 HTMLs at 1600×4500 → 0 SEVERE/ERROR JS messages, IGV tracks visible
  • CDF sanity script (monotonicity + p50≤p95≤p99) → all 6 oracles pass
  • Device probe across 6 oracle envs → all backends detected correctly on macOS arm64
  • Tightened test assertion still passes with the new URL
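
The CDF sanity check in the plan above can be sketched as below. The storage layout is an assumption made for illustration: one row per track, holding background values sorted ascending so that column k is the value at quantile k / (n_points - 1). The real normalizer files may be organised differently, and `check_cdf_table` is a hypothetical name.

```python
import numpy as np


def check_cdf_table(cdf: np.ndarray) -> list:
    """Return a list of problems found in a (n_tracks, n_points) CDF table."""
    problems = []

    # Monotonicity: every row must be non-decreasing.
    decreasing = (np.diff(cdf, axis=1) < 0).any(axis=1)
    for track in np.flatnonzero(decreasing):
        problems.append(f"track {track}: not monotonic")

    # Read p50/p95/p99 by column index from the stored rows instead of
    # recomputing them, so an unsorted or NaN-containing row fails here.
    n_points = cdf.shape[1]
    i50, i95, i99 = (round(q * (n_points - 1)) for q in (0.50, 0.95, 0.99))
    p50, p95, p99 = cdf[:, i50], cdf[:, i95], cdf[:, i99]
    ordered = (p50 <= p95) & (p95 <= p99)
    for track in np.flatnonzero(~ordered):
        problems.append(f"track {track}: p50/p95/p99 out of order")

    return problems


rng = np.random.default_rng(0)
good = np.sort(rng.normal(size=(3, 101)), axis=1)
bad = good.copy()
bad[1] = bad[1][::-1]  # reverse one track to simulate a corrupted row

print(check_cdf_table(good))
print(check_cdf_table(bad))
```

The percentiles are deliberately not computed with `np.percentile`, since recomputed percentiles are ordered by construction and would make the second check vacuous.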

🤖 Generated with Claude Code

Ran a from-scratch audit with nothing cached or precomputed — fresh
notebook re-execution, selenium IGV rendering (so the client-side JS
actually loads), CDF sanity across all 6 oracle normalizers, device
detection across all 6 oracle envs on macOS arm64, and a trace of the
HuggingFace gate path.

## Main deliverable

`audits/AUDIT_CHECKLIST.md` — a 12-section reusable checklist covering
Install → HF auth → GPU/device → CDFs → Python API → Notebooks → HTML
reports (incl. IGV) → MCP server → Error paths → Repo-wide consistency
→ Tests → Reproducibility. Every check has an exact command or
grep pattern. P0/P1/P2 severity on each item. Future audits should
walk this top-to-bottom and leave behind a populated
`audits/YYYY-MM-DD_vNN_<label>/` per the appendix.

## Fixes

The HF agent caught 3 live drifts in the AlphaGenome gate flow:

1. chorus/oracles/alphagenome.py:133 — the "no HF token" error pointed
   at `huggingface.co/google/alphagenome`, but the real gated repo is
   `google/alphagenome-all-folds` (README.md:631, README.md:926,
   environments/README.md:105 all agree). A user clicking the link
   would not find the license form. Fixed.

2. chorus/oracles/alphagenome_source/templates/load_template.py:49 —
   the env-runner load path raised with no URL at all ("Set HF_TOKEN
   or run huggingface-cli login"). Appended the alphagenome-all-folds
   license URL so direct and env-runner paths give the same actionable
   guidance.

3. chorus/oracles/alphagenome_source/alphagenome_metadata.py:4 —
   module docstring said "AlphaGenome predicts 5,930+ human functional
   genomic tracks"; the library reports 5,731 (matches v17 audit and
   the notebooks after v16). Updated.

Also tightened tests/test_error_recovery.py:169 from a loose substring
match on `huggingface.co/google/alphagenome` to the full correct URL
`huggingface.co/google/alphagenome-all-folds` so the test catches
future drift in CI.

## What the audit found healthy

- Fresh notebook run: exit 0, zero errors, zero warnings. The ref-
  allele warning that used to fire on cell 39 is gone (v17 PR #32 fix
  flowed through).
- All 6 CDF normalizers load, sort-monotonic, signed% matches expected
  semantics. Sei + LegNet auto-download from the HF dataset works.
- Device detection clean across TF / PyTorch / JAX envs on macOS arm64
  (Metal / MPS detected correctly; zero hardcoded cuda:0).
- 5 selenium-rendered HTML reports: IGV tracks load, 0 browser-console
  JS errors.

Full report at `audits/2026-04-21_v18_fresh_full_audit.md`.

Tests: 334 passed / 1 skipped (fast suite, 9m 11s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>