Fix v7 UX audit findings: phantom base env + docs drift #17
Open
lucapinello wants to merge 53 commits into main from
Conversation
…und distributions
- New chorus/analysis module: multi-layer scoring (scorers.py), variant reports, quantile normalization, batch scoring, causal prioritization with enriched HTML tables (gene, cell type, per-layer score columns, top-3 IGV signal tracks), cell type discovery, region swap, integration simulation
- New chorus/analysis/build_backgrounds.py: variant effect and baseline signal background distributions for quantile normalization, with batch GPU scripts
- 8 application examples with full outputs: variant analysis (SORT1, TERT, BCL11A, FTO across AlphaGenome/Enformer/ChromBPNet), causal prioritization, batch scoring, cell type discovery, sequence engineering (region swap + integration simulation)
- Validation against AlphaGenome paper: SORT1 confirmed, TERT partially confirmed (ELF1 limitation documented), HBG2 not reproduced in K562 or monocytes (ISM vs log2FC methodology difference documented with side-by-side comparison)
- Fix mamba PATH resolution in environment runner and manager
- Add gene_name, cell_type fields to CausalVariantScore and BatchVariantScore
- 500 common SNPs BED file for background computation
- 91 tests covering all analysis components
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The annotation module was re-parsing the full 1 GB GENCODE GTF file on every call to get_genes_in_region, get_gene_tss, and get_gene_exons (~11 s each). Now the GTF is loaded once per feature type (gene/transcript/exon) and cached as a DataFrame for the process lifetime. Exon lookups use a groupby index for O(1) gene-name access.
Before: ~11 s per query (full GTF scan)
After: 0.03 s genes, 0.04 s TSS, 1.5 ms exons (cached)
Full analysis test suite now completes in 2 min (was timing out at 10+ min).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
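The load-once-and-index pattern described above can be sketched with the stdlib alone. This is illustrative only: a tiny in-memory GTF stands in for the real GENCODE file, and the function names are not the repo's actual API.

```python
import io
import re
from collections import defaultdict
from functools import lru_cache

# Tiny in-memory GTF standing in for the real 1 GB GENCODE file
# (the real code would read the gencode .gtf.gz from disk instead).
_FAKE_GTF = """\
chr1\tHAVANA\tgene\t100\t500\t.\t+\t.\tgene_name "GENE1";
chr1\tHAVANA\texon\t100\t200\t.\t+\t.\tgene_name "GENE1";
chr1\tHAVANA\texon\t300\t500\t.\t+\t.\tgene_name "GENE1";
"""

@lru_cache(maxsize=None)
def load_feature(feature):
    """Parse the GTF once per feature type; cached for the process lifetime."""
    rows = []
    for line in io.StringIO(_FAKE_GTF):
        chrom, _src, feat, start, end, _sc, strand, _fr, attrs = \
            line.rstrip("\n").split("\t")
        if feat != feature:
            continue
        gene = re.search(r'gene_name "([^"]+)"', attrs).group(1)
        rows.append({"chrom": chrom, "start": int(start), "end": int(end),
                     "strand": strand, "gene_name": gene})
    return rows

@lru_cache(maxsize=1)
def exon_index():
    """gene_name -> exon list, built once, for O(1) per-gene lookups."""
    idx = defaultdict(list)
    for row in load_feature("exon"):
        idx[row["gene_name"]].append(row)
    return dict(idx)

def get_gene_exons(gene):
    return exon_index().get(gene, [])
```

The second call to `load_feature` for the same feature type returns the cached object without touching the file, which is where the 11 s → milliseconds drop comes from.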
…ores
Single-process AlphaGenome script that extracts all 3,763 valid tracks (711 cell types × 6 output types) from each forward pass. Same GPU time as K562-only (~55 min total on A100) but yields comprehensive per-layer distributions across all cell types.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comprehensive background distribution builder:
- 10K random SNPs from hg38 reference across all autosomes
- 20K protein-coding gene TSS positions (promoter state baselines)
- 5K random genomic positions (general baseline)
- Parallel GPU execution: --part variants --gpu 0 / --part baselines --gpu 1
- All 3,763 AlphaGenome tracks extracted per forward pass
Expected output: ~37M variant scores + ~94M baseline samples in ~18 hours.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…n counting
Covers all AlphaGenome output types:
- Window-based: DNASE, ATAC, CHIP_TF, CHIP_HISTONE, CAGE, PROCAP, SPLICE_SITES, SPLICE_SITE_USAGE
- Exon-counting: RNA_SEQ (sum across merged protein-coding exons per gene)
- All backgrounds unsigned (abs magnitude) for quantile ranking
Pre-loads GENCODE v48 gene annotations and builds spatial index for fast exon lookup within prediction windows.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
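The two sampling modes above could be sketched as follows. Toy flat arrays stand in for real prediction bins; actual oracles use per-bin resolutions and merged GENCODE exon coordinates, so this is only the shape of the computation.

```python
def window_score(track, center, half_width):
    """Window-based background sample: unsigned sum of prediction bins
    in a fixed window around a position (bin sizes here are illustrative)."""
    lo, hi = max(0, center - half_width), center + half_width
    return abs(sum(track[lo:hi]))

def exon_score(track, exons):
    """Exon-counting sample for RNA_SEQ: sum the signal across a gene's
    merged exon intervals within the prediction window, then take abs."""
    return abs(sum(sum(track[start:end]) for start, end in exons))

track = [1.0] * 100                       # flat toy signal, 100 bins
dnase_sample = window_score(track, 50, 10)        # 20 bins -> 20.0
rna_sample = exon_score(track, [(0, 10), (20, 25)])  # 10 + 5 bins -> 15.0
```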
…haul
Analysis framework:
- PerTrackNormalizer with per-track CDFs (effect, activity, perbin) for all 6 oracles
- Auto-download backgrounds from HuggingFace on oracle load
- AnalysisRequest dataclass preserves user's original prompt on every report
- Magnitude-gated interpretation labels ("Very strong" requires |effect| > 0.7)
- Top-10-per-layer cap in markdown reports with truncation footer
- Biological interpretation + suggested next steps on all 14 example outputs
- Literature caveats where oracle predictions diverge from published biology
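The magnitude-gated labels might look roughly like this. Only the |effect| > 0.7 gate for "Very strong" comes from the commit message; the other tiers and thresholds are illustrative assumptions.

```python
def interpretation_label(quantile, effect):
    """Percentile-based tier, gated by raw magnitude: a 99th-percentile
    score with a tiny raw effect must not read as 'Very strong'.
    Only the |effect| > 0.7 gate is documented; other cutoffs are made up."""
    if quantile >= 0.99 and abs(effect) > 0.7:
        return "Very strong"
    if quantile >= 0.95:
        return "Strong"
    if quantile >= 0.75:
        return "Moderate"
    return "Weak"
```

The point of the gate: on a background densely clustered near zero, even a trivial raw score can land at an extreme percentile, so the label is capped unless the raw magnitude backs it up.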
Bug fixes:
- Sequence.slice() missing self argument in interval.py (broke Enformer predictions)
- oracle_name="oracle" placeholder in region_swap, integration, discovery
- Cell-type column bloat in batch_scoring and causal (thousands of cell types)
- Corrupted igv.min.js in 15 HTML files from misplaced injection into JS string
- predict() called with string region instead of (chrom, start, end) tuple
MCP server:
- All 8 critical tools accept user_prompt and forward it into reports
- _safe_tool decorator returns structured {"error", "error_type"} on failure
- Improved docstrings: score_variant_batch (variant dict schema), discover_variant_cell_types (runtime + cell count), fine_map_causal_variant (composite formula + output columns)
- Causal table shows Top Layer column; batch scoring resolves track IDs to human-readable names
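A minimal version of the decorator pattern described above, with a toy tool standing in for a real MCP tool (the actual `_safe_tool` lives in the chorus MCP server and may differ in detail):

```python
import functools

def _safe_tool(fn):
    """Wrap a tool so exceptions become structured error dicts instead of
    crashing the server; functools.wraps preserves the tool's name/docstring."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            return {"error": str(exc), "error_type": type(exc).__name__}
    return wrapper

@_safe_tool
def score_variant(variant):
    """Toy tool: rejects anything that isn't a chrom-style variant string."""
    if not variant.startswith("chr"):
        raise ValueError(f"unparseable variant: {variant}")
    return {"variant": variant, "score": 0.42}
```

The caller (here, the MCP client) always gets a dict back and can branch on the presence of the `"error"` key rather than catching exceptions across the protocol boundary.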
Application examples (14 folders, all with MD/JSON/TSV/HTML):
- Regenerated all variant_analysis, validation, discovery, causal, batch, sequence_engineering
- Every report has Analysis Request header + Interpretation section
- Cleaned stale intermediate files (5 removed)
- IGV browser verified working in headless Chrome
Documentation:
- README: "Start here" applications callout, updated MCP tools list, MCP walkthrough link
- API_DOCUMENTATION: application layer section (all 6 functions + AnalysisRequest)
- MCP_WALKTHROUGH.md: 5 example conversations showing natural-language usage
- Natural-language framing notes on all 7 category READMEs
- Fixed AlphaGenome HF URL, clarified environment.yml vs chorus-base.yml
- Notebook install banners for comprehensive/advanced (all 6 oracles required)
Scripts:
- Internal scripts moved to scripts/internal/
- regenerate_examples.py + regenerate_remaining_examples.py for reproducible output generation
- scripts/README.md updated with public script descriptions
Testing:
- 268 tests passed (including new magnitude-gate and causal-table tests)
- All 3 notebooks executed end-to-end (Enformer, all 6 oracles, multi-oracle analysis)
- IGV browser rendering verified via Selenium in headless Chrome
- MCP server startup verified
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These are replaced by the per-oracle build_backgrounds_*.py scripts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove stale outputs containing machine-specific paths (/Users/lp698/...) and runtime-specific logs. Notebooks should be committed clean so new users run them fresh in their own environment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace /PHShome/lp698/chorus with REPO_ROOT (computed from __file__) in all 8 public scripts (6 build_backgrounds + 2 regenerate)
- Clear stale notebook output cells containing machine-specific paths
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ll traceability
Batch scoring:
- Per-track columns (one per assay:cell_type) with raw score + percentile
- track_scores dict preserved on BatchVariantScore for programmatic access
- display_mode parameter: "by_assay" (default), "by_cell_type", "summary"
- Track ID footnotes for tracing back to oracle data
- oracle_name parameter fixes normalizer CDF lookup (was returning None)
Causal prioritization:
- Per-track columns replacing generic "Max Effect / Top Layer"
- Each cell shows raw effect + percentile for each scored track
- track_scores dict on CausalVariantScore
Report infrastructure:
- report_title field: "Region Swap Analysis Report", "Integration Simulation Report"
- modification_region: IGV highlights full replaced/inserted region (not 2-3 bp)
- modification_description: documents what was inserted/replaced and its length
- has_quantile scoping fix (UnboundLocalError on empty allele_scores)
All examples regenerated with biologically specific tracks:
- SORT1: HepG2 DNASE + CEBPA + CEBPB + H3K27ac + CAGE (reproduces Musunuru)
- BCL11A: K562 DNASE + GATA1 + TAL1 + H3K27ac + CAGE (reproduces Bauer)
- FTO: HepG2 tracks (nearest metabolic cell type available)
- TERT: K562 tracks
- Validation: forced HepG2 CEBP tracks matching the AlphaGenome paper
Every report carries the user's original prompt (Analysis Request block).
All 13 examples verified: MD + JSON + TSV + HTML, prompt present, 268 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…racle table
- Remove ChromBPNet loading and "Combining oracles" sections (belong in main README)
- Rename "Window" to "Output window" + add Resolution column
- Separate Effect percentile and Activity percentile explanations
- Add recommendation to start with AlphaGenome
- Remove Python API details (get_normalizer) that don't belong here
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- .mcp.json: drop the /data/pinello/... PATH hardcoding so new users can use the file as-is from `curl` in any environment. mamba resolves the chorus env without an explicit PATH override.
- README.md: add LDlink token setup section under Troubleshooting — fine_map_causal_variant auto-fetch path was silently failing for users without a free LDlink API key.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Changes informed by a fresh walkthrough of the README from the perspective of a brand-new user:
- Reorder Installation: Fresh Install now comes before Upgrading (a first-time reader no longer sees "remove existing envs" before they install)
- Consolidate the Fresh Install block to cover env create, pip install, chorus setup --oracle enformer, and chorus genome download hg38, so a user copying the block ends up actually ready to run the Quick Start
- Clarify that the root environment.yml is what you install and the per-oracle YAMLs in environments/ are internal to `chorus setup`
- Quick Start: point to examples/single_oracle_quickstart.ipynb for users who prefer a notebook, and call out the setup prerequisite explicitly
- Annotate the ENCFF413AHU track ID in the DNase snippet so users know what it is before the Discovering Tracks section explains it
- HF_TOKEN: note that Claude Code inherits env from the shell where `claude` is started (the MCP server is spawned by that shell)
- Add a "Further reading" section linking the docs/ folder — previously API_DOCUMENTATION, METHOD_REFERENCE, VISUALIZATION_GUIDE, and IMPLEMENTATION_GUIDE were all invisible to a README-only reader
- Remove REQUIREMENTS_CHECKLIST.md (internal audit scratch file)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…gitignore
Real issues caught by a deeper audit pass and fixed:
- **Duplicate CAGE column headers**: batch scoring tables rendered two
"CAGE:HepG2" columns because both + and - strand tracks have
identical description fields. _track_display_name now appends (+) / (-)
when the assay_id carries a strand suffix, producing unique column
labels in markdown, HTML, and DataFrame outputs.
- **UnboundLocalError in _build_html_report**: has_quantile /
has_baseline were defined inside a for-loop that doesn't execute for
empty allele_scores, causing .to_html() to crash on minimally-populated
reports. Initialise both before the loop (matches the markdown fix).
- **docs/RELEASE_CHECKLIST.md**: internal QA checklist with stale
metrics (references 128 tests when we have 280). Removed — internal
docs shouldn't live in user-facing docs/.
- **API_DOCUMENTATION / METHOD_REFERENCE overlap**: added reciprocal
callouts clarifying that API_DOCUMENTATION is authoritative and
METHOD_REFERENCE is a one-line cheat sheet.
- **logs/ not ignored**: 82 MB of run logs at risk of being committed.
Added to .gitignore.
Test coverage added (+12 tests, 268 → 280 passed):
- TestReportMetadataFields: report_title, modification_region,
modification_description rendering in MD / HTML / dict
- TestBatchDisplayModes: by_assay, by_cell_type, track-ID footnote,
CAGE strand disambiguation in DataFrame columns
- TestSafeToolDecorator: passthrough, exception → error dict,
function name preservation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Columns and TSV headers now show CAGE:HepG2 (+) and CAGE:HepG2 (-) instead of two identical CAGE:HepG2 columns. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
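The disambiguation fix could be sketched like this. The helper name and the strand-suffix convention on `assay_id` are assumptions for illustration, not the repo's exact code:

```python
def track_display_name(assay_id, description):
    """Append (+)/(-) when the assay_id carries a strand suffix, so two
    stranded tracks with identical description fields get unique column
    labels in markdown, HTML, and DataFrame outputs."""
    if assay_id.endswith("+"):
        return f"{description} (+)"
    if assay_id.endswith("-"):
        return f"{description} (-)"
    return description
```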
A new user could miss this entirely — the previous mention was a single
feature bullet ("auto-downloaded from HuggingFace") with no detail.
This section spells out:
- The backgrounds turn raw log2FC into the effect/activity percentiles
shown in every report
- They're fetched on first oracle use from the public HF dataset
lucapinello/chorus-backgrounds and cached in ~/.chorus/backgrounds/
- File sizes per oracle (so users with limited disk know what to expect)
- **No HF_TOKEN required** for backgrounds (only AlphaGenome model is gated)
- LDlink token is separate and only needed for causal auto-fetch
- Optional pre-download snippet for users who want to avoid the first-use
wait
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a comprehensive reference appendix covering:
- What the backgrounds are (effect %ile vs activity %ile vs per-bin)
and why they exist (turn raw log2FC into genome-aware metrics)
- How they were calculated:
* Variant effect distribution: 10K random SNPs × all tracks with
layer-specific scoring formulas (log2FC, logFC, diff)
* Activity distribution: ~31.5K positions (random intergenic +
ENCODE SCREEN cCREs + protein-coding TSSs + gene-body midpoints)
* Per-bin distribution: 32 random bins per position for IGV scaling
* RNA-seq exon-precise sampling rule
* CAGE summary routing rule
- Sample sizes per oracle (track count, samples per track, NPZ size)
- Python API usage with verified signatures (get_pertrack_normalizer,
download_pertrack_backgrounds, effect_percentile, activity_percentile,
perbin_floor_rescale_batch)
- MCP / Claude usage (auto-attached, zero-config)
- Documented ranges and a sanity-check rule of thumb for interpretation
- How to reproduce or extend the backgrounds via build_backgrounds_*.py
All function signatures in the appendix were verified against the actual
implementation before committing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
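The effect/activity percentile idea the appendix documents can be sketched as an empirical CDF over a sorted unsigned background. The shipped normalizer uses precomputed per-track CDFs stored in NPZ files; this stdlib version only shows the underlying ranking step.

```python
from bisect import bisect_right

def effect_percentile(raw_score, background):
    """Rank |raw_score| against an unsigned background distribution:
    the fraction of background samples at or below it (empirical CDF).
    A sketch — the real code looks the rank up in a precomputed CDF row."""
    bg = sorted(abs(x) for x in background)
    return bisect_right(bg, abs(raw_score)) / len(bg)
```

Usage: a raw log2FC far outside the background lands near 1.0; a score indistinguishable from background noise lands near its rank among the small-magnitude samples.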
Local-only IDE/agent state (settings, scheduled tasks lock) — per-developer, not for the repo.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pies
- AUDIT_PROMPT.md: systematic end-to-end audit script for a new machine, with REPLACE_* placeholders for HF_TOKEN and LDLINK_TOKEN.
- .gitignore: block any *_WITH_TOKENS.md or AUDIT_PROMPT_WITH_TOKENS* file from ever being staged, since filled-in copies contain secrets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Records what worked and what did not on a fresh macOS 15.7.4 / arm64
clone of chorus-applications: full install, all 6 oracle smoke-predicts,
286/286 pytest pass, 3 example notebooks (0 errors), 22 MCP tools registered,
6/6 application tools producing correct outputs (rs12740374 SORT1 case
reproduces the published Musunuru-2010 finding), 18/19 application HTML
reports IGV-verified via headless Chrome, ChromBPNet smoke build completed
end-to-end on CPU.
Top issues a macOS user hits, ranked, with one-or-two-line fixes:
1. No Apple GPU (MPS / Metal / jax-metal) auto-detect — frameworks are
installed but borzoi/sei/legnet only check torch.cuda.is_available(),
SEI hardcodes map_location='cpu', chrombpnet/enformer envs lack
tensorflow-metal. Verified Borzoi runs on MPS in 4.3 s when forced.
2. SEI Zenodo download via stdlib urllib at ~80 KB/s — 3.2 GB tar takes
~11 h. curl -C - -L recovers it in ~30 min.
3. fine_map_causal_variant rsID-only crash (KeyError 'chrom' at
causal.py:355). Workaround: pass "chr1:pos REF>ALT" form.
4. Two-mamba-installs MAMBA_ROOT_PREFIX gotcha breaks chorus health.
5. Notebooks need explicit `python -m ipykernel install --user --name chorus`.
6. SEI download has no single-flight lock — concurrent inits race.
Verdict: production-ready with caveats. None of the issues block correctness;
all are operational or one-line code fixes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses every actionable item in audits/2026-04-14_macos_arm64.md.
All changes are platform-conditional — Linux CUDA paths are unchanged.
PyTorch oracles (borzoi, sei, legnet) — auto-detect MPS on Apple Silicon
- Both the in-process loader (chorus/oracles/{borzoi,sei,legnet}.py) and
the subprocess templates ({borzoi,sei,legnet}_source/templates/{load,
predict}_template.py) now resolve `device is None` (or the new 'auto'
sentinel) as: cuda > mps > cpu. Linux + CUDA box hits the cuda branch
first, no behavior change there.
- SEI: removed the hard `map_location='cpu'` device pin; 'cpu' is still
  used as the map_location to load weights into host memory before
  .to(device), which is the standard pattern across torch versions and
  works for MPS too.
- Sei BSplineTransformation lazily moved its spline matrix only when
`input.is_cuda`. Generalized to any non-CPU device so the matmul works
on MPS as well. Verified: 286/286 pytest still pass.
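The cuda > mps > cpu resolution order can be sketched with injected availability flags. In the real loaders these flags come from torch.cuda.is_available() and torch.backends.mps.is_available(); injecting them here keeps the sketch testable without torch installed.

```python
def resolve_device(device=None, cuda_available=False, mps_available=False):
    """Resolve device=None (or the 'auto' sentinel) as cuda > mps > cpu.
    An explicit user choice always wins, so a Linux + CUDA box that passes
    device='cuda:1' is untouched by the auto-detect path."""
    if device not in (None, "auto"):
        return device
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

On a Linux CUDA box the first branch of the auto path fires, which is why the commit can claim "no behavior change there"; only machines without CUDA fall through to MPS or CPU.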
TensorFlow oracles (chrombpnet, enformer) — Metal backend on Apple Silicon
- chorus/core/platform.py macos_arm64 adapter now adds
`tensorflow-metal>=1.1.0` to pip_add. Once installed, Apple's plugin
registers a 'GPU' physical device, so the oracles' existing
tf.config.list_physical_devices('GPU') auto-detect picks it up with no
code change. Linux paths don't see the macos_arm64 adapter so CUDA stays
intact.
JAX oracle (alphagenome) — unchanged
- Already explicitly skips Metal in auto-detect (jax-metal still missing
`default_memory_space` for AlphaGenome). README updated to document
this trade-off.
MCP fix — fine_map_causal_variant rsID-only crash
- Calling `fine_map_causal_variant(lead_variant="rs12740374")` previously
raised KeyError: 'chrom' at chorus/analysis/causal.py:355 because
`_parse_lead_variant("rs12740374")` returns {"id": ...} only.
- Backfill chrom/pos/ref/alt onto the sentinel from the LDlink response
(which always carries them) before invoking prioritize_causal_variants.
- Verified end-to-end: rs12740374 ranked #1 with composite=1.000 of 12 LD
variants on AlphaGenome (matches the published Musunuru-2010 finding).
SEI Zenodo download — chunked + resume + single-flight lock
- Replaced urllib.request.urlretrieve with a stdlib chunked urlopen loop
that supports HTTP Range resume and an fcntl exclusive lock so two
concurrent SeiOracle inits don't race the same partial file. Original
observed throughput on macOS was ~80 KB/s (would take ~11 hours for the
3.2 GB tar); the new path resumes interrupted downloads and logs
progress every 100 MB.
README — macOS troubleshooting + Apple GPU policy table + kernel install
- Documented the two-mamba-installs MAMBA_ROOT_PREFIX gotcha that breaks
`chorus health` when the new chorus env lands in a different mamba root
than the per-oracle envs.
- Added the per-oracle macOS GPU support matrix (MPS / Metal / CPU) with
explicit `device=` examples.
- Added the missing `python -m ipykernel install --user --name chorus`
step to Fresh Install so examples/*.ipynb find the chorus kernel.
Validation on macOS 15.7.4 / Apple Silicon (CPU + MPS + Metal):
- 286/286 pytest pass (incl. all 6 oracle smoke-predict tests)
- chorus.create_oracle('borzoi') auto-detects mps:0
- chorus.create_oracle('sei') auto-detects mps:0 + smoke-predict ok
- chrombpnet env now reports tf.config.list_physical_devices('GPU') = [GPU:0]
- fine_map_causal_variant(lead_variant='rs12740374') ranks rs12740374
composite=1.000 of 12 LD variants
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…EI resumable download, rsID backfill)
Verified on Linux CUDA: 285/285 code tests pass. AlphaGenome smoke errors are due to an expired HF token in the chorus-alphagenome env (unrelated to this PR).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- tests/test_mcp.py: TestFineMapRsidBackfill verifies fine_map_causal_variant backfills chrom/pos/ref/alt when a caller passes only an rsID lead_variant. Regression test for the macOS audit crash (KeyError: 'chrom').
- examples/*.ipynb: re-executed all three notebooks end-to-end on Linux CUDA to refresh outputs against the merged audit branch.
Full suite now: 286/286 tests pass (including alphagenome real-oracle smoke).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Regeneration on Linux CUDA GPU 1 (GPU 0 was full):
- AlphaGenome variant + validation (5 examples) — 28 min
- Remaining AlphaGenome (batch/causal/discovery/seq) — ~14 min
- Enformer SORT1 — 1 example
- ChromBPNet SORT1 — 1 example
Discovery HTML filenames now use oracle_name "alphagenome" (was placeholder "oracle").
Verification:
- 289/289 tests pass across combined runs (6 oracle smoke tests green on GPU 1)
- Selenium screenshot sweep: 18/19 HTML render cleanly; the 1 "NO-IGV" is the batch_scoring HTML, which is a scoring table by design (no browser view)
- Hardcoded /PHShome/lp698/chorus paths in notebook log outputs redacted to /path/to/chorus
.gitignore: ignore examples/applications/**/*_screenshot.png so selenium artifacts don't pollute the repo.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lper
Adds audits/2026-04-15_macos_arm64_post_merge.md — the second-pass end-to-end audit on a fully wiped + fresh-cloned install, after the PR #7 macOS-support changes were merged. Every fix from v1 is confirmed working on a clean setup:
* chrombpnet + enformer envs now pull in tensorflow-metal automatically → `Auto-detected 1 GPU(s) … name: METAL`
* borzoi/sei/legnet auto-detect mps:0
* fine_map_causal_variant("rs12740374") rsID-only returns rs12740374 composite=0.963 of 12 LD variants (was KeyError in v1)
* analyze_variant_multilayer reproduces Musunuru-2010 biology (CEBPA strong binding gain +0.37, DNASE strong opening +0.43)
* 286/286 pytest, 0 notebook errors, 19/19 IGV reports ok
Two download-reliability findings surfaced on this clean run (both pre-existing, both the same bug class as the SEI fix that landed in PR #7): chorus/utils/genome.py stalled at 36% of the hg38 download, and chorus/oracles/chrombpnet.py has no single-flight lock, so two concurrent callers race the ENCODE tar and hit EOFError.
This commit also adds chorus/utils/http.py — the resume+lock helper that previously lived inside SeiOracle, now extracted as a shared stdlib-only utility so genome + chrombpnet can reuse it. The sei.py helper shim keeps the old public API working.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…elper
Three call sites fetch large files from the public internet with plain
urllib.request.urlretrieve (no resume, no concurrency lock). The
2026-04-15 v2 audit on a fresh install hit two of them the hard way:
UCSC cut the hg38 connection at ~36% of the 938 MB download
(urllib.error.URLError: retrieval incomplete: got only 363743871 out
of 983659424 bytes), and two concurrent callers of
_download_chrombpnet_model raced the same partial ENCODE .tar.gz so
one read it mid-write and hit
EOFError: Compressed file ended before the end-of-stream marker was
reached
inside tarfile.extractall.
Re-use the resume+lock helper introduced for SEI in PR #7, lifted
into chorus/utils/http.py in the preceding commit:
chorus/oracles/sei.py
_download_with_resume staticmethod becomes a thin shim that
forwards to chorus.utils.http.download_with_resume. No behaviour
change and no API break.
chorus/utils/genome.py
GenomeManager.download_genome swaps urllib.request.urlretrieve
for download_with_resume. Fixes the UCSC stall observed in the
v2 audit; partial .fa.gz is now resumable across retries.
chorus/oracles/chrombpnet.py
_download_chrombpnet_model (ENCODE tar) and _download_jaspar_motif
(JASPAR motif) both route through download_with_resume. The fcntl
lock on <dest>.lock serialises concurrent callers so the pytest
smoke fixture and a background build_backgrounds_chrombpnet.py
job can no longer corrupt each other's download.
All three changes are platform-agnostic; the helper is stdlib-only
(urllib + fcntl). Linux CUDA is not touched.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
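The HTTP Range resume step of the helper could be sketched as below. This is illustrative: the real download_with_resume also holds an fcntl lock on `<dest>.lock` and streams the body in chunks, and `resume_request` is a hypothetical name.

```python
import os
import urllib.request

def resume_request(url, dest):
    """Build a urllib Request that resumes a partial download: if a partial
    file exists at dest, ask the server for bytes from its current size
    onward via an HTTP Range header. Returns (request, resume_offset)."""
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    req = urllib.request.Request(url)
    if offset:
        req.add_header("Range", f"bytes={offset}-")
    return req, offset
```

The caller would open the request, append to `dest` in chunks, and retry on disconnect; because the offset is recomputed from the file each attempt, a stalled transfer (like the UCSC hg38 cut at 36%) picks up where it left off instead of restarting the 938 MB download.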
…S audit
v2 audit confirmed post-merge macOS works end-to-end (286/286 tests, 19/19 HTML, reproduces Musunuru-2010 biology, rsID backfill verified). Adds chorus/utils/http.py (resume + fcntl lock) and routes the hg38 genome, chrombpnet ENCODE tar, and JASPAR motif downloads through it. The SEI helper becomes a shim for backward compatibility.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Third-pass audit going one level deeper than v1 (pre-merge smoke) and
v2 (post-merge fresh install). Scope: 14 application examples × 19 HTML
reports × 6 per-track normalizer NPZs + the scoring/normalization stack.
Read-only deliverable — the fixes identified here belong in a separate
follow-up PR after review.
Findings (5, ranked by severity):
1. HIGH — chrombpnet_pertrack.npz:DNASE:hindbrain has 0 background
samples and an all-zeros CDF. PerTrackNormalizer.effect_percentile()
silently returns 1.0 for every raw_score (including 0.0) because
np.searchsorted on a zeros row ranks everything at the end, and
_get_denominator falls through to cdf_width=10000 when counts[idx]=0.
Same bug class as the v2 concurrent-download race that landed in
PR #8 — the hindbrain model download failed silently and left a
zero-count reservoir. Impact: any variant scored against
DNASE:hindbrain in ChromBPNet gets a false "100th percentile".
2. MEDIUM — every committed HTML report loads igv.min.js from
cdn.jsdelivr.net at view time. 2/19 reports flaked on
net::ERR_CERT_AUTHORITY_INVALID during this audit; any user
behind a corporate proxy / airgapped network / jsdelivr outage
will see IGV silently fail with no fallback. No SRI either.
3-5. LOW — documentation improvements:
- TERT_promoter example doesn't caveat that C228T's published
biology is melanoma-specific; K562 result (all negative) is
correctly modelled but reads as "no effect" without context
- AlphaGenome DNASE vs ChromBPNet ATAC disagree on rs12740374
direction in HepG2 (+0.45 vs -0.11); no application note
teaches this real cross-oracle divergence
- HBG2_HPFH footer notes BCL11A/ZBTB7A catalog absence; could
be tightened
Normalization stack verified clean:
- CDF monotonicity: 0 bad rows across 18,159 tracks × 10,000 points
- signed_flags match LAYER_CONFIGS.signed exactly (AG 667 RNA-seq,
Borzoi 1543 stranded RNA, SEI 40/40 regulatory_classification,
LegNet 3/3 MPRA; Enformer 0 signed is correct — no RNA-seq)
- Build-vs-scoring window_bp bit-identical via shared LAYER_CONFIGS
- Pseudocount/formula: _compute_effect reproduces reference
implementation with diff=0.0 across all test cases
- perbin_floor_rescale_batch math verified at all edges
- Edge cases: unknown oracle → None, unknown track → None,
raw=0 → 0.0, raw=huge → 1.0
Phase A rerun on 4 AlphaGenome literature-checked cases (SORT1, TERT,
FTO, BCL11A) confirms biology is preserved but results are NOT bit-
identical — raw_score drift ~1-2% on dominant tracks, larger quantile
swings on near-zero tracks due to AlphaGenome's JAX CPU non-
determinism. No committed example is stale. Noise-floor handling for
|raw_score| < ~1e-3 added to follow-up recommendation list.
Artifacts:
- audits/2026-04-16_application_and_normalization_audit.md (main report)
- audits/2026-04-16_screenshots/*.png (19 full-page PNGs)
- audits/2026-04-16_data/*.json (per-app cards + normalization/selenium/rerun JSON)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ixes
Addresses the findings in audits/2026-04-16_application_and_normalization_audit.md (PR #9). Four categories of change:
1. Delete two example applications the audit recommends removing:
- examples/applications/variant_analysis/TERT_promoter/
  C228T is a melanoma-specific gain-of-function mutation; the example runs it in K562 (erythroleukemia) and shows all-negative effects. The biology is correct for the model but inverts the published direction. Rather than add a "wrong cell type" caveat, drop the example — SORT1 / FTO / BCL11A cover variant_analysis without teaching the reader a misleading result.
- examples/applications/validation/HBG2_HPFH/
  Already self-documented as "Not reproduced" in validation/README.md: BCL11A / ZBTB7A aren't in AlphaGenome's track catalog, so the repressor-loss mechanism isn't visible. Keeping a "validation failed" example alongside the working SORT1_rs12740374_with_CEBP confuses readers. Drop it.
Also updated: root README.md (replaces HBG2_HPFH link with SORT1_rs12740374_with_CEBP), examples/applications/variant_analysis/README.md (drops TERT prompt + section), examples/applications/validation/README.md (drops HBG2 row + section + reproduce snippet), scripts/regenerate_examples.py + scripts/internal/inject_analysis_request.py (both lose their TERT_promoter/HBG2_HPFH entries).
2. Normalizer: guard against zero-count CDF rows (chorus/analysis/normalization.py).
Audit finding #1 (HIGH): the committed chrombpnet_pertrack.npz has DNASE:hindbrain with effect_counts[idx] == 0 and a zero-filled CDF row. effect_percentile() / activity_percentile() silently returned 1.0 for every raw_score (including 0.0) because np.searchsorted on a zeros row returns len(row) for any non-negative probe and the denominator falls through to cdf_width. Same bug class as the v2 chrombpnet concurrent-download race that landed in PR #8 — the hindbrain ENCODE tar must have failed to extract cleanly during the original background build.
New private helper _has_samples() returns False when counts[idx] == 0, which makes _lookup / _lookup_batch return None. Callers already render None as "—" in MD/HTML tables, so users now see "no background" instead of a silent false "100th percentile". Counts-less NPZs (older format, no counts field) are treated as valid — no regression.
3. Report: suppress quantile_score when raw_score is in the noise floor (chorus/analysis/variant_report.py).
Audit finding #6 (LOW): when |raw_score| < 1e-3 the effect CDF is so densely clustered around 0 that a 1-2% raw-score drift can swing the quantile by 0.5+ (observed in the Phase A rerun: committed quantile=1.0 vs rerun=0.21 for a CEBPB track with raw_score ~1e-4). Set quantile_score = None in that regime so the HTML/MD tables render "—" and readers don't misread noise as signal. Threshold chosen conservatively to cover both log2fc (pc=1.0) and logfc RNA (pc=0.001) without hiding real effects.
4. IGV.js: lazy-download the bundle into ~/.chorus/lib on first use (chorus/analysis/_igv_report.py + chorus/analysis/causal.py).
Audit finding #2 (MEDIUM): reports embed a <script src="..."> to cdn.jsdelivr.net that gets evaluated every time the HTML is opened in a browser. Any viewer on an airgapped network / corporate proxy that MITMs TLS / during a jsdelivr outage sees IGV silently fail (2/19 audit reports hit ERR_CERT_AUTHORITY_INVALID). The local-cache code path already existed but was opt-in (the user had to drop a file in ~/.chorus/lib/igv.min.js manually). New _ensure_igv_local() helper runs on the first report generation and populates the cache via chorus.utils.http.download_with_resume (the helper that landed in v2 PR #8). Reports written after the first successful download inline the JS directly — self-contained HTML that opens anywhere without network. Download failure is logged at WARNING and the CDN <script> tag is used as fallback, preserving the current behaviour for anyone who can't reach jsdelivr at generation time.
All changes are platform-agnostic; 287/287 pytest continue to pass. Fix verified behaviourally:
>>> norm.effect_percentile('chrombpnet', 'DNASE:hindbrain', 0.0)
None  # was: 1.0
>>> norm.effect_percentile('chrombpnet', 'DNASE:HepG2', 0.0)
0.0   # unchanged
>>> ts = TrackScore(raw_score=0.0005, ...)
>>> _apply_normalization(ts, ...); ts.quantile_score
None  # noise floor
See audits/2026-04-16_application_and_normalization_audit.md (PR #9) for full context, per-app screenshots, and the Phase A / B / C methodology behind each finding.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
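The zero-count guard could be sketched like this, with a plain sorted list standing in for the NPZ CDF row; the function name and shape are illustrative, not the repo's exact code.

```python
def guarded_percentile(samples, count, raw_score):
    """Percentile lookup guarded against zero-count background rows.
    A track whose background build silently failed has count == 0 and an
    all-zeros CDF row; an unguarded rank would report the 100th percentile
    for any score. Return None instead so report tables render an em dash."""
    if count == 0:
        return None  # no background: don't fake a rank
    below = sum(1 for v in samples if v <= abs(raw_score))
    return below / count
```

Callers that already treat None as "no data" need no change, which is what makes this a safe drop-in guard rather than an API break.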
The 2026-04-16 deep application audit proposed 6 fixes; commit 5ebb328 implemented 5 of them. This commit adds the missing LOW-priority Fix #4: a short application note on why AlphaGenome DNASE and ChromBPNet ATAC can report different effects for the same variant in the same cell type.

Three reasons documented: different training data (DNase vs Tn5), different receptive fields (1 Mb vs 2 kb), different effect aggregation (binned sum vs peak height).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two issues surfaced during the 2026-04-15 scorched-earth Linux audit (all envs wiped + HF/chorus caches deleted + full chorus setup from scratch):

1. environments/chorus-enformer.yml — pin tensorflow==2.13.* (was <2.15). A fresh setup resolved tensorflow to 2.14.1, which hits "JIT compilation failed" on the Enformer saved_model's cross_replica_batch_norm/Rsqrt op when XLA tries to compile the forward pass on CUDA. TF 2.13.1 loads and runs the same saved_model cleanly against the same nvidia-cu11 pip packages.

2. chorus/oracles/chrombpnet.py _download_chrombpnet_model — wrap tar.extractall in an fcntl lock on <extract_folder>.extract.lock and skip when fold_0 already exists. tarfile.extractall creates subdirectories without exist_ok=True, so two concurrent callers (pytest smoke + build_backgrounds_chrombpnet.py) race and the loser hits `FileExistsError: [Errno 17] File exists: '.../models'`. Same bug-class as the v2-audit download race, but one step later in the pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
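The lock-plus-sentinel pattern in item 2 looks roughly like the following. A sketch only: `safe_extract` is a hypothetical name, and the real _download_chrombpnet_model has more context; the commit only specifies the fcntl lock on `<extract_folder>.extract.lock` and the fold_0 skip.

```python
import fcntl
import tarfile
from pathlib import Path

def safe_extract(tar_path: str, extract_folder: str, sentinel: str = "fold_0") -> None:
    """Extract tar_path into extract_folder exactly once across concurrent processes."""
    dest = Path(extract_folder)
    if (dest / sentinel).exists():
        return  # a previous run already finished extracting
    lock_path = str(dest) + ".extract.lock"
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks while another process extracts
        try:
            if not (dest / sentinel).exists():  # re-check under the lock
                with tarfile.open(tar_path) as tf:
                    tf.extractall(dest)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

The double check (before and after taking the lock) is what makes the loser of the race a no-op instead of a FileExistsError.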
…ecute

- get_normalizer() now prefers PerTrackNormalizer over the legacy .npy scan (root cause of "no baselines available" in notebook outputs — the new per-track NPZ format was ignored).
- PerTrackNormalizer.summary() accepts an optional oracle_name so it matches QuantileNormalizer.summary() and callers don't need to branch on type.
- single_oracle_quickstart: remove hardcoded /home/penzard/... !ls cell (errored for every other user) and a dead duplicate assignment.
- All 3 example notebooks: kernelspec normalized to "chorus" and re-executed in place — 0 errors, 0 stale messages, per-track CDFs loading cleanly across all oracles exercised.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scorched-earth audit: all envs deleted, caches wiped, fresh clone on SSD, every README step followed verbatim. Results:

- 6/6 oracle envs built from scratch (borzoi needed prefix.dev mirror)
- 287/287 pytest pass (including all 6 oracle smoke tests)
- 3/3 notebooks execute with 0 errors
- MCP tools verified (list → load → predict → unload)
- 14/16 HTML render in Selenium (2 expected: batch=table, causal=CDN blocked)
- ChromBPNet backgrounds rebuilt: DNASE:hindbrain zero-count row fixed, NPZ uploaded to HuggingFace

3 HIGH issues found and fixed during audit:
- TF 2.14 XLA crash → pinned enformer to TF 2.13
- chrombpnet tar extraction race → fcntl lock
- chrombpnet hindbrain zero-count CDF → guard + full rebuild

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two remaining items from the fresh-install audit:

1. Causal prioritization HTML now inlines igv.min.js (9 MB) instead of loading it from cdn.jsdelivr.net. Resolves the JS-ERR finding on corporate networks with SSL inspection (ERR_CERT_AUTHORITY_INVALID).

2. load_oracle MCP response now reports the resolved device instead of null when auto-detect was used. Uses an nvidia-smi probe on Linux and a platform check on macOS, so callers can confirm the GPU is in use.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
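The device auto-detect in item 2 could be sketched like this. Hedged: `resolve_device` is an illustrative helper, not the actual MCP server code; the commit only states that an nvidia-smi probe is used on Linux and a platform check on macOS.

```python
import platform
import shutil
import subprocess

def resolve_device() -> str:
    """Best-effort auto-detect: 'cuda:0' if an NVIDIA GPU answers, else 'cpu'."""
    if platform.system() == "Linux" and shutil.which("nvidia-smi"):
        try:
            # `nvidia-smi -L` lists GPUs; a zero exit code means a working driver
            subprocess.run(["nvidia-smi", "-L"], check=True,
                           capture_output=True, timeout=5)
            return "cuda:0"
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired, OSError):
            pass
    return "cpu"  # macOS and GPU-less Linux fall through here
```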
From the 2026-04-16 new-user usability audit:

HIGH:
- H1: Fix wrong FTO coordinate in variant_analysis/README.md (53800954 → 53767042, matching all other references)
- H2: Add @_safe_tool to all 22 MCP tools (was missing on 14; unhandled exceptions now return structured error dicts)
- H3: Add htslib to environment.yml (provides bgzip for coolbox)
- H4: Fix python_requires to >=3.10 in setup.py (code uses 3.10+ syntax like str | None)

MEDIUM:
- M2: Add README.md to the marquee SORT1_rs12740374 example directory
- M8: Harmonize discover_variant to accept alt_alleles: list[str] (was singular alt_allele, inconsistent with other discovery tools)
- M9: Fix upgrade instruction order in README (remove oracle envs before removing the base chorus env that provides the CLI)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Second-pass usability audit findings:

1. Badge colors now match interpretation labels (was: red "Minimal effect" badges because _score_color_class used the percentile directly; now derives the class from the interpretation string, which applies raw-score gating)
2. IGV CHIP track names now show TF/mark (was: all "CHIP:HepG2"; now uses _track_description enrichment for "CHIP:CEBPA:HepG2" etc.)
3. Percentile display: ≥99th / ≤1st instead of "1.000" / "0.000" to avoid implying false precision when the background CDF is saturated (correct behavior — random SNPs mostly have near-zero effects)
4. getting_started() MCP prompt now recommends high-level tools first (analyze_variant_multilayer, discover_variant, score_variant_batch, fine_map_causal_variant) instead of only listing low-level primitives

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
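The saturation rule in item 3 amounts to a small formatter along these lines (a sketch with an assumed name, not the real _fmt_percentile, whose exact signature and mid-range formatting this commit doesn't show):

```python
def fmt_percentile(p: float) -> str:
    """Render a background-CDF percentile without implying false precision."""
    if p >= 0.99:
        return "≥99th"  # saturated top bucket: don't print "1.000"
    if p <= 0.01:
        return "≤1st"   # saturated bottom bucket: don't print "0.000"
    return f"{100 * p:.0f}th"
```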
Second-pass usability audit cleanup:
C5: Discovery sub-reports now carry AnalysisRequest with user prompt
(regen script patches each per-cell-type report)
C6: Borzoi targets file: strip /home/drk/tillage/datasets/human/ prefix
from file column (upstream training paths, not used at inference)
C7: HTML <title> now includes report_title + gene_name + position
(e.g. "Multi-Layer Variant Report — SORT1 — chr1:109274968")
P8: Remove scripts/internal/ from repo (8 machine-specific dev scripts)
P9: Remove audits/2026-04-16_screenshots/ (~18 MB of PNGs)
Both added to .gitignore to prevent re-commit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BLOCKS_USER:
- Add missing __init__.py to 5 oracle source/template dirs (borzoi, enformer, alphagenome templates) — a non-editable pip install would fail with ModuleNotFoundError
- Fix setup.py package_data: replace the invalid ../environments/* escape with proper per-package data globs + data_files for env YAMLs

CONFUSING:
- Misspelled oracle name now raises ValueError listing valid names (was: misleading "not yet implemented" message)
- LegNet cell_types in list_tracks: add WTC11 (was: only HepG2, K562)
- list_tracks with an unknown oracle now lists valid names in the error
- README: fix AlphaGenome track count 5,930 → 5,731 (actual loaded)
- README: fix Sei class count 41 → 40
- README: fix Borzoi track count 7,610 → 7,611
- README: fix oracle.list_tracks() → list_tracks() MCP tool

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cherry-picked from audit/2026-04-16-fresh-install-v4 (preserving our second/third-pass audit fixes, which that branch didn't have).

The macOS CPU-forcing guard in AlphaGenome's load and predict templates only fired when device was None or started with "cpu". Callers passing device='cuda:0' bypassed it, letting jax-metal initialize and crash with "UNIMPLEMENTED: default_memory_space". Fix: on Darwin, always force JAX_PLATFORMS=cpu unless the caller explicitly requests Metal. Applied to all three code paths:
- alphagenome.py:_load_direct (env var set before import jax)
- load_template.py
- predict_template.py

Includes macOS v4 audit report: clean-slate install, 6-oracle GPU verification, 12 example regenerations, 3 notebooks (0 errors), 13 HTML Selenium checks, 7-check normalization audit (all pass).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
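The corrected guard reduces to something like the sketch below (hypothetical helper name; the actual templates inline this logic). The key points from the commit: it must run before `import jax`, and only an explicit Metal request escapes the CPU forcing.

```python
import os
import platform
from typing import Optional

def force_cpu_on_darwin(device: Optional[str]) -> None:
    # Must run BEFORE `import jax`: JAX reads JAX_PLATFORMS at import time.
    # Any device request other than an explicit "metal" gets CPU on macOS,
    # including device="cuda:0" (which previously bypassed the guard).
    if platform.system() == "Darwin" and device != "metal":
        os.environ["JAX_PLATFORMS"] = "cpu"
```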
Walked through Chorus as 4 personas (clinician, bioinformatician, PhD student, computational biologist). Key changes:

For clinicians:
- Add "Key terms" glossary box at top of README (oracle, track, assay_id, effect percentile, log2FC — defined before first use)
- Reword Start-here table: "I have a variant but don't know the relevant tissue" instead of bioinformatics framing
- Add Interpretation sections to SORT1, BCL11A, FTO example outputs with clinical/biological narrative (LDL cholesterol, sickle cell, tissue-specificity explanation)

For bioinformaticians:
- Add VCF parsing snippet to batch scoring README
- Document oracle_name param for normalization
- Note AlphaGenome full track IDs vs short names

For contributors:
- Replace Borzoi→mymodel throughout CONTRIBUTING.md (Borzoi is already implemented, was confusing)
- Update Current Priorities to reflect all 6 oracles done

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
User feedback: batch scoring only showed the effect (log2FC + percentile) per track, hiding the absolute ref and alt values. A +0.4 effect could mean 10→14 (active region) or 0.001→0.0014 (noise) — impossible to tell without seeing both alleles.

All three output formats now show full transparency per track:
- TSV/DataFrame: columns _ref, _alt, _log2fc, _effect_pctile, _activity_pctile (was: _raw, _pctile, _activity)
- Markdown: 4 sub-columns per track (Ref | Alt | log2FC | Effect %ile)
- HTML: grouped header with colspan, same 4 sub-columns per track

Also uses the ≥99th / ≤1st percentile display from the earlier audit fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 new READMEs:
- causal_prioritization/SORT1_locus/
- validation/SORT1_rs12740374_with_CEBP/
- validation/TERT_chr5_1295046/

2 new interpretation sections:
- SORT1_chrombpnet: notes cross-oracle comparison with AlphaGenome
- SORT1_enformer: notes cross-tissue DNASE pattern + 114 kb window

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All 12 application examples regenerated on GPU 0 with:
- AlphaGenome: 4 variant + 1 validation + batch + causal + discovery + 2 seq engineering + TERT validation (28 min + 15 min)
- Enformer: 3 SORT1 examples (5 min)
- ChromBPNet: 1 SORT1 example (2 min)

New in regenerated outputs:
- ≥99th / ≤1st percentile display (no more "1.000")
- Badge colors match interpretation labels in HTML
- IGV CHIP track names show TF/mark (CHIP:CEBPA:HepG2)
- HTML titles include report_title + gene + position
- Self-contained igv.min.js (no CDN dependency)
- Batch scoring TSV: expanded ref/alt/log2fc/pctile columns

Re-added interpretation sections to 5 variant analysis examples (SORT1, BCL11A, FTO, SORT1_chrombpnet, SORT1_enformer) after regeneration overwrote them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: discover_variant_effects() writes HTML internally BEFORE analysis_request is patched in regen scripts. Fixed by re-writing the HTML after patching, targeting the exact filenames.

Removed 6 stale HTML files:
- 3 discovery sub-reports (per-cell-type) that duplicated the main discovery report but lacked the user prompt
- 1 orphaned CELSR2 validation report (unreferenced)
- 1 enformer validation duplicate (enformer_report.html, kept RAW_autoscale)
- 1 enformer discovery duplicate in the SORT1_enformer dir

Final screenshot audit: 13/13 HTML reports CLEAN — all have an Analysis Request with the user prompt, correct badge colors, enriched CHIP track names, self-contained IGV, and ≥99th percentile display.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… docstring

From the read-only v5 audit (PR #12) — 3 low-severity findings:

1. MEDIUM: tests/test_analysis.py referenced old batch_scoring column names ('_raw', '_pctile') that commit 01d8446 renamed to '_ref', '_alt', '_log2fc', '_effect_pctile', '_activity_pctile'. Updated assertions to match the current scheme; added checks for _ref and _alt that weren't previously verified. 279 → 281 passing.

2. MEDIUM: scripts/regenerate_examples.py hardcoded "N HepG2/K562 tracks" regardless of the actual cell type. Added "cell_type" to each of the 4 AlphaGenome variant examples; the tracks_requested string is now derived from that field. Also mechanically patched the 4 affected committed example outputs (SORT1, BCL11A, FTO, SORT1_CEBP) in MD/JSON/HTML so readers see the correct per-example label (e.g. "6 K562 tracks" for BCL11A) without waiting for the next regen. All 12 occurrences removed.

3. MINOR: chorus/analysis/batch_scoring.py:82 docstring still listed the old _raw / _pctile columns. Updated to reflect the current output schema.

Verified:
- pytest tests/ --ignore=tests/test_smoke_predict.py → 281 passed
- Selenium re-render of the 4 patched HTMLs confirms the new labels appear and no "HepG2/K562" stragglers remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hrough
From the post-v5-merge audit:
C1: MCP server tracks_requested now derives cell-type label dynamically
(matches regen script output). Added _describe_tracks_requested
helper that inspects the variant_result to extract cell types;
labels uniform cell-type sets as "N HepG2 tracks" and mixed as
"N tracks". Applied to all 5 tools using the pattern.
C2/C3: Added 15 new tests covering _fmt_percentile (≥99th boundary),
_score_color_class (interpretation-label-based color), resolved
device detection (nvidia-smi probe), and _describe_tracks_requested
(uniform vs mixed cell-type labels). Test count 235 → 250.
P2: MCP_WALKTHROUGH.md — added "Manage loaded oracles" tip covering
oracle_status and unload_oracle. Fixed stale 5,930 → 5,731 track
count for AlphaGenome.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
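The C1 labeling rule above can be sketched as follows. Assumed: the real _describe_tracks_requested inspects a variant_result object; this stand-in takes the extracted cell-type list directly, which is the only part the commit specifies (uniform sets become "N HepG2 tracks", mixed sets "N tracks").

```python
def describe_tracks_requested(cell_types: list[str]) -> str:
    """Label scored tracks: a uniform cell-type set gets named, mixed stays generic."""
    n = len(cell_types)
    unique = set(cell_types)
    if len(unique) == 1:
        return f"{n} {unique.pop()} tracks"
    return f"{n} tracks"
```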
Root cause (2 layers):

1. discover_variant_effects and discover_and_report wrote HTML to output_path BEFORE an AnalysisRequest could be attached (since the functions didn't accept one). Regen scripts then attached the prompt post-hoc and did a second report.to_html() under a pretty filename, leaving the first HTML behind as an orphan.

2. Commit df7d613 deleted the 3 per-cell-type discovery HTMLs from the repo, but those files are the actual drill-down output of discover_and_report — they were mislabeled as duplicates. There is no 'main' discovery HTML; the per-cell-type HTMLs are it.

Clean fix (no glob+remove hack):
- discovery.py: discover_variant_effects gains `analysis_request` and `output_filename` kwargs; discover_and_report gains `user_prompt` and `tool_name` kwargs. When provided, the AnalysisRequest is attached before the first HTML write.
- regenerate_examples.py::regenerate_enformer_discovery and regenerate_remaining_examples.py::{regen_discovery, regen_tert_chr5} now pass the AnalysisRequest + pretty filename in. The post-hoc report.to_html() rewrite and the per-cell-type for-loop that patched analysis_request by glob are removed.
- The 3 legitimate discovery HTMLs are re-committed to the repo with the user prompt baked in on first write.
- tests/test_analysis.py: 2 new tests verify the new kwargs exist.

Verified:
- pytest: 298 passed (was 296)
- SORT1_enformer dir contains only rs12740374_SORT1_enformer_report.html (no orphan chr*.html)
- All 3 discovery sub-reports render the "Screen all cell types…" prompt in their Analysis Request section
- `git status` after a fresh regen is clean (no untracked files)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four polish fixes from the first-user UX audit (PR #16):

1. MEDIUM: `chorus list` no longer shows a phantom `base` entry. EnvironmentManager.list_available_oracles() now filters out chorus-base.yml (an internal template — the user-facing base env is 'chorus' from the root environment.yml). Added a guard in `chorus setup --oracle base` that prints a helpful message pointing to `mamba env create -f environment.yml` instead of silently trying to create a chorus-base env. Test added in tests/test_core.py.

2. MEDIUM: docs/MCP_WALKTHROUGH.md:38 — fixed `alt_allele="T"` (wrong, singular string) → `alt_alleles=["T"]` (plural, list) to match the actual MCP tool signature. The adjacent Example 2 on line 70 already used the correct form; now both match.

3. LOW: examples/advanced_multi_oracle_analysis.ipynb cell 1 — replaced the stale "using the Enformer oracle" subtitle (copy-pasted from single_oracle_quickstart) with a proper multi-oracle description that matches the title.

4. LOW: examples/applications/variant_analysis/SORT1_rs12740374/README.md — the "Key results" table had stale graduated percentiles (99/98/95/90/88); the current example_output.md has all five tracks at ≥99th after the v5 _fmt_percentile update. Refreshed the table with current effect sizes and enriched track names (CHIP:CEBPA:HepG2 etc.), and added an explanation of why the top bucket shows as "≥99th".

Verified: pytest 299 passed (was 298); `chorus list` output no longer includes the phantom 'base' entry; `chorus setup --oracle base` prints the friendly error message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
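The phantom-`base` filter in item 1 could look roughly like this. Hedged sketch: the function signature, the `environments/` glob pattern, and the stem-stripping are assumptions about EnvironmentManager internals; the commit only states that chorus-base.yml is excluded from the user-facing oracle list.

```python
from pathlib import Path

def list_available_oracles(env_dir: str) -> list[str]:
    """List installable oracle names from chorus-<name>.yml env files."""
    names = []
    for yml in sorted(Path(env_dir).glob("chorus-*.yml")):
        name = yml.stem.removeprefix("chorus-")
        if name == "base":  # internal template, not a user-installable oracle
            continue
        names.append(name)
    return names
```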
Fixes the four findings from audit PR #16.
Changes
MEDIUM — `chorus list` phantom `base` entry (`chorus/core/environment/manager.py`, `chorus/cli/main.py`)

- `list_available_oracles()` now excludes `chorus-base.yml` from the oracle list — it's an internal template, not a user-installable oracle.
- `chorus setup --oracle base` now prints a friendly error pointing to `mamba env create -f environment.yml` instead of silently trying to create a `chorus-base` env.
- Test added in `tests/test_core.py`.

Before:

After:
MEDIUM — walkthrough kwarg typo (`docs/MCP_WALKTHROUGH.md:38`)

- `alt_allele="T"` (singular string, wrong) → `alt_alleles=["T"]` (plural list, matches the actual signature). Example 2 in the same file already had it correct; now both examples match.

LOW — NB3 stale subtitle (`examples/advanced_multi_oracle_analysis.ipynb` cell 1)

- Replaced the stale "using the Enformer oracle" subtitle with a multi-oracle description that matches the title.

LOW — SORT1 README stale percentile table (`examples/applications/variant_analysis/SORT1_rs12740374/README.md`)

- `example_output.md` has all five tracks at ≥99th since the v5 `_fmt_percentile` update. Refreshed with current effect sizes + enriched track names (`CHIP:CEBPA:HepG2` etc.), plus a note explaining why the top bucket shows as ≥99th.

Verified
- `pytest tests/ --ignore=tests/test_smoke_predict.py` → 299 passed (was 298)
- `chorus list` no longer includes the `base` entry
- `chorus setup --oracle base` prints the friendly error
- `git diff` shows only the 4 targeted fixes + 1 test

Relationship to audit PR #16
This PR fixes all 4 findings from audit PR #16. Both can be merged independently. Once both are in, `chorus-applications` has the complete round-trip: audit report + fixes.

🤖 Generated with Claude Code