Conversation
📝 Walkthrough
This PR introduces a comprehensive benchmarking framework for protein quantification methods, adding multiple benchmark directories with documentation, configuration infrastructure, data acquisition pipelines, quantification runners, metrics computation, and visualization generation capabilities.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (inconclusive)
Actionable comments posted: 15
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@benchmarks/batch-quartet-multilab/scripts/comprehensive_analysis.py`:
- Around line 55-128: The analyze_protein_coverage function can raise when
batch_proteins, any_batch or overlaps are empty; update the logic to guard all
min/max and percentage calculations and prints: check if batch_proteins before
calling min(...) or max(...), use 0 (or an appropriate default) when empty;
check if any_batch before computing len(core)/len(any_batch) and format the
percent as 0% when empty; check overlaps before taking np.mean and default to 0;
apply these guards in both the results.append dictionary entries and the
subsequent print statements so no division or min/max is called on empty
collections.
- Around line 131-178: In analyze_missing_values, avoid calling min()/max() on
an empty batch_missing dict which raises ValueError: check if batch_missing is
empty before computing min/max and variance (e.g., set Min_Batch_Missing,
Max_Batch_Missing, Missing_Variance to np.nan or 0 when empty), update the
results.append and the print statements to use the safe values, and replace the
unnecessary f-string prefix on the WARNING message with a normal string literal
(no interpolation).
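For the two `comprehensive_analysis.py` items above, a minimal sketch of the empty-collection guards; the helper name `safe_coverage_stats` is illustrative and not part of the script:

```python
import numpy as np


def safe_coverage_stats(batch_proteins: dict, core: set, any_batch: set, overlaps: list) -> dict:
    """Compute the coverage summary without calling min/max/mean on empty collections."""
    detected = [bp["detected"] for bp in batch_proteins.values()]
    return {
        "Core_Pct": len(core) / len(any_batch) * 100 if any_batch else 0.0,
        "Mean_Jaccard": float(np.mean(overlaps)) if overlaps else 0.0,
        "Min_Batch_Detection": min(detected, default=0),
        "Max_Batch_Detection": max(detected, default=0),
    }
```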
In `@benchmarks/batch-quartet-multilab/scripts/run_benchmark.py`:
- Around line 110-156: The function run_maxlfq_quantification creates a
MaxLFQQuantification instance but never uses it; instead it computes a median
which is incorrect. Replace the manual per-protein median logic by calling
MaxLFQQuantification.quantify(...) with the original peptide-level DataFrame
(peptide_df) or the appropriately formatted DataFrame (columns: peptide,
protein, run_id, intensity), capture its return (quant_df), and then
convert/return quant_df with columns renamed to match expected output
("protein","run_id","intensity"); keep the existing try/except around the
quantify call and remove the manual log/median computation using
intensity_matrix, log_intensities, and protein_intensities. Ensure you still
handle empty results and NaNs as needed.
In `@benchmarks/quant-hela-method-comparison/scripts/01_download_data.py`:
- Around line 65-72: The check_url_exists function uses a bare except and
doesn't close the urlopen response; update check_url_exists to open the request
with urllib.request.urlopen(request, timeout=10) assigned to a variable or used
as a context manager so the response is closed, and replace the bare except with
explicit exception handling (e.g., except urllib.error.HTTPError,
urllib.error.URLError, socket.timeout as e) to return False while preserving the
exception variable for possible logging; reference the Request creation and
urlopen call in check_url_exists when making these changes.
- Around line 75-87: In list_available_files, replace the bare except and
single-letter variable by catching specific errors (e.g., urllib.error.URLError
/ urllib.error.HTTPError, socket.timeout, and UnicodeDecodeError) instead of
Exception, rename the comprehension variable from l to link for clarity, and
keep the same fallback return []; also keep or move the re import as
needed—ensure you log or print the caught exception object in the handler (e.g.,
process the caught exception as err) so the error message is not lost.
- Around line 22-63: download_file currently assumes the destination directory
exists and accepts any URL scheme; before calling urllib.request.urlretrieve in
download_file, ensure dest.parent exists by creating it (Path.mkdir with
parents=True, exist_ok=True) and validate the URL scheme using
urllib.parse.urlparse(url).scheme, allowing only expected schemes (e.g., 'http',
'https', 'ftp'); if the scheme is invalid, print an informative message and
return False. Make these checks near the start of download_file (before the
download attempt) so failures are handled early and directories are prepared.
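For the `01_download_data.py` items above, a minimal sketch of the suggested hardening (scheme whitelist, directory creation, explicit exception handling). The exact structure is an assumption that only mirrors the helpers described:

```python
import socket
import urllib.error
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https", "ftp"}


def check_url_exists(url: str) -> bool:
    """HEAD the URL, closing the response and catching specific errors."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10):
            return True
    except (urllib.error.HTTPError, urllib.error.URLError, socket.timeout) as err:
        print(f"  URL check failed for {url}: {err}")
        return False


def download_file(url: str, dest: Path) -> bool:
    """Download after validating the scheme and creating the target directory."""
    if urlparse(url).scheme not in ALLOWED_SCHEMES:
        print(f"  Unsupported URL scheme: {url}")
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    try:
        urllib.request.urlretrieve(url, dest)
        return True
    except (urllib.error.HTTPError, urllib.error.URLError) as err:
        print(f"  Download failed: {err}")
        return False
```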
In `@benchmarks/quant-hela-method-comparison/scripts/02_prepare_peptides.py`:
- Around line 338-392: The variable base_name in load_raw_data is assigned but
never used, causing a lint error (Ruff F841); remove the unused assignment (the
line starting with "base_name =") from the load_raw_data function so the
function logic is unchanged but the unused-variable warning is eliminated.
- Around line 132-205: The local variable targets in normalize_columns is
declared but never used (causing Ruff F841); remove the unused assignment
"targets = [\"ProteinName\", \"PeptideSequence\", \"SampleID\",
\"NormIntensity\"]" from the function (or alternatively use it to validate
post-rename presence) so the variable is not unused and linting passes.
In `@benchmarks/quant-hela-method-comparison/scripts/04_compute_metrics.py`:
- Around line 459-541: In save_results, several print calls use f-strings even
though they contain no placeholders (Ruff F541); replace these unnecessary
f-strings with regular string literals: change print(f" Saved:
cv_comparison.csv"), print(f" Saved: cross_experiment_corr_*.csv"), print(f"
Saved: rank_consistency_*.csv"), print(f" Saved: expression_stability_*.csv"),
print(f" Saved: tmt_lfq_comparison.csv, tmt_lfq_values_*.csv"), and print(f"
Saved: summary_metrics.csv") to plain strings (keep the f-string for the first
print that interpolates output_dir).
- Around line 315-383: In compute_expression_stability, you are double‑logging
already log‑transformed iBAQ values and dropping valid zero/negative log values
by applying np.log2(val + 1) and filtering val > 0; instead detect when
intensity_col corresponds to a log scale (e.g., INTENSITY_COLUMNS entries like
"IbaqLog" or column names containing "ibaqlog"/"log") and skip the np.log2
conversion for those columns, and change the validity check to only
pd.notna(val) (or np.isfinite) rather than val > 0 so valid logged values aren’t
discarded; update the handling around intensity_col selection, median_expr
extraction, and the append logic to apply logging conditionally based on the
intensity_col name.
- Around line 68-103: The compute_cv_per_protein function can produce inf/NaN
when Mean == 0; update the CV computation to guard against division by zero by
replacing zero means with NaN (or explicitly setting CV to NaN) before dividing:
e.g., compute a safe_mean = result["Mean"].replace(0, np.nan) and then set
result["CV"] = result["Std"] / safe_mean, or alternatively set
result.loc[result["Mean"] == 0, "CV"] = np.nan after computing CV; modify the
compute_cv_per_protein function accordingly and ensure numpy (np) is available
where used.
- Around line 198-264: The function compute_tmt_lfq_correlation should validate
that at least two paired samples remain after filtering non-finite values (the
boolean mask valid applied to tmt_vals and lfq_vals) before calling
stats.pearsonr/stats.spearmanr; add a check (e.g., if len(tmt_vals) < 2) and
return None (or a suitable empty result) to avoid ValueError when fewer than 2
samples remain, keeping the rest of the return structure unchanged.
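For the two numeric guards above in `04_compute_metrics.py`, a minimal sketch; the column and helper names are assumptions based on the descriptions, not the script's actual API:

```python
import numpy as np
import pandas as pd
from scipy import stats


def add_cv(result: pd.DataFrame) -> pd.DataFrame:
    """CV = Std / Mean, with zero means mapped to NaN instead of inf."""
    safe_mean = result["Mean"].replace(0, np.nan)
    result["CV"] = result["Std"] / safe_mean
    return result


def paired_correlations(tmt_vals: np.ndarray, lfq_vals: np.ndarray):
    """Return (pearson_r, spearman_r), or None when fewer than 2 finite pairs remain."""
    valid = np.isfinite(tmt_vals) & np.isfinite(lfq_vals)
    tmt_vals, lfq_vals = tmt_vals[valid], lfq_vals[valid]
    if len(tmt_vals) < 2:
        return None
    pearson_r, _ = stats.pearsonr(tmt_vals, lfq_vals)
    spearman_r, _ = stats.spearmanr(tmt_vals, lfq_vals)
    return pearson_r, spearman_r
```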
In
`@benchmarks/quant-hela-method-comparison/scripts/06_method_correlation_plots.py`:
- Around line 342-448: The code in create_method_comparison_grid assumes axes is
a 2D array but plt.subplots returns a single Axes when n_methods==1 (or a 1D
array when one dimension is 1), causing axes[i, j] to fail; after creating fig,
axes = plt.subplots(...), coerce axes into a 2D array (e.g., axes =
np.atleast_2d(axes) or replace with axes = np.array([[axes]]) when n_methods ==
1) so subsequent indexing axes[i, j] works reliably for all n_methods values;
update the axes variable immediately after the plt.subplots call (before the
for-loops) and do not change later logic.
In
`@benchmarks/quant-hela-method-comparison/scripts/07_intensity_distribution_plots.py`:
- Around line 51-85: The try/except in plot_intensity_distribution currently
swallows all errors (bare except: pass); change it to catch Exception as e, log
the failure with identifying context (e.g., sample name and intensity_col) and
the exception message (using the module logger or a descriptive print) and then
continue so plotting proceeds for other samples; update the same pattern in the
other KDE try/except blocks referenced (lines 87-135, 138-230) to use Exception
as e and emit an informative log/print before continuing.
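A minimal sketch of the logged-failure pattern suggested for the KDE blocks; the helper and its arguments are placeholders rather than the script's real signatures:

```python
import logging

logger = logging.getLogger(__name__)


def plot_sample_kdes(df, ax, sample_col, intensity_col):
    """Plot one KDE per sample, logging per-sample failures instead of swallowing them."""
    for sample in df[sample_col].unique():
        try:
            values = df.loc[df[sample_col] == sample, intensity_col].dropna()
            values.plot.kde(ax=ax, label=str(sample))
        except Exception as e:
            logger.warning("KDE failed for sample %s (%s): %s", sample, intensity_col, e)
            continue
```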
In
`@benchmarks/quant-hela-method-comparison/scripts/generate_workflow_diagram.py`:
- Around line 6-7: The script fails when saving figures because the output
directory "plots" may not exist; import os (or pathlib) and create the directory
before the save calls by calling os.makedirs("plots", exist_ok=True) (or
Path("plots").mkdir(parents=True, exist_ok=True)) immediately before the
plt.savefig/fig.savefig calls that write "plots/mokume_workflow.png" and
"plots/mokume_workflow.pdf" so the directory exists.
🟡 Minor comments (20)
benchmarks/quant-hela-method-comparison/README.md-17-24 (1)
17-24: ⚠️ Potential issue | 🟡 Minor
Fix table pipe spacing to satisfy markdownlint (MD060).
The tables flagged by markdownlint need aligned pipes/spaces. Apply the same spacing fix to each affected table.
✏️ Example fix for the first table (apply similarly to others)
-| Method | Mean Cross-Corr | Mean Rank-Corr |
-|--------|-----------------|----------------|
-| **iBAQ** | **0.741** | **0.742** |
-| iBAQ raw | 0.656 | 0.652 |
-| DirectLFQ | 0.620 | 0.612 |
-| Sum | 0.594 | 0.578 |
-| Top3 | 0.530 | 0.503 |
-| TopN | 0.511 | 0.470 |
+| Method    | Mean Cross-Corr | Mean Rank-Corr |
+|-----------|-----------------|----------------|
+| **iBAQ**  | **0.741**       | **0.742**      |
+| iBAQ raw  | 0.656           | 0.652          |
+| DirectLFQ | 0.620           | 0.612          |
+| Sum       | 0.594           | 0.578          |
+| Top3      | 0.530           | 0.503          |
+| TopN      | 0.511           | 0.470          |

Also applies to: 84-90, 107-113
benchmarks/quant-pxd007683-tmt-vs-lfq/README.md-21-25 (1)
21-25: ⚠️ Potential issue | 🟡 Minor
Fix table pipe spacing to satisfy markdownlint (MD060).
Apply consistent pipe spacing to the tables listed.
✏️ Example fix for the dataset table (apply similarly to others)
-| Method | Samples | Proteins | Peptides | Features |
-|--------|---------|----------|----------|----------|
-| TMT | 11 | 9,423 | 77,439 | 1,409,771 |
-| LFQ | 11 | 8,213 | 54,939 | 505,906 |
+| Method | Samples | Proteins | Peptides | Features  |
+|--------|---------|----------|----------|-----------|
+| TMT    | 11      | 9,423    | 77,439   | 1,409,771 |
+| LFQ    | 11      | 8,213    | 54,939   | 505,906   |

Also applies to: 75-80, 151-160
benchmarks/batch-quartet-multilab/README.md-25-31 (1)
25-31: ⚠️ Potential issue | 🟡 Minor
Fix table pipe spacing to satisfy markdownlint (MD060).
Apply consistent spacing to all affected tables.
✏️ Example fix (apply similarly to other tables)
-| Method | Raw Correlation | After ComBat |
-|--------|-----------------|--------------|
-| **DirectLFQ** | **0.816** | **0.977** |
-| MaxLFQ | 0.780 | 0.980 |
-| iBAQ | 0.771 | 0.971 |
-| Top3 | 0.753 | 0.973 |
+| Method        | Raw Correlation | After ComBat |
+|---------------|-----------------|--------------|
+| **DirectLFQ** | **0.816**       | **0.977**    |
+| MaxLFQ        | 0.780           | 0.980        |
+| iBAQ          | 0.771           | 0.971        |
+| Top3          | 0.753           | 0.973        |

Also applies to: 43-48, 54-60, 69-73
benchmarks/quant-hela-method-comparison/README.md-33-33 (1)
33-33: ⚠️ Potential issue | 🟡 Minor
Hyphenate “Best-performing” for grammar.
LanguageTool flags this as needing a hyphen.
✏️ Suggested edit
-Best performing datasets (CV < 3%):
+Best-performing datasets (CV < 3%):
ROADMAP.md-138-138 (1)
138-138: ⚠️ Potential issue | 🟡 Minor
Use descriptive link text (MD059).
Replace the generic “link” with descriptive text.
✏️ Suggested edit
-3. Vanderaa C & Gatto L. "scp: Mass Spectrometry-Based Single-Cell Proteomics Data Analysis" *Bioconductor* [link](https://bioconductor.org/packages/scp)
+3. Vanderaa C & Gatto L. "scp: Mass Spectrometry-Based Single-Cell Proteomics Data Analysis" *Bioconductor* [scp Bioconductor package](https://bioconductor.org/packages/scp)
benchmarks/README.md-43-49 (1)
43-49: ⚠️ Potential issue | 🟡 Minor
Add a language to the fenced code block (MD040).
This avoids markdownlint warnings and improves readability.
✏️ Suggested edit
-```
+```text
 benchmarks/<benchmark-name>/
 ├── README.md     # Overview, results, methodology (expandable)
 ├── scripts/      # Benchmark scripts
 ├── data/         # GIT-IGNORED: large data files
 ├── results/      # CSV metrics
 └── figures/      # PNG visualizations

benchmarks/README.md-7-11 (1)
7-11: ⚠️ Potential issue | 🟡 Minor
Fix table pipe spacing (MD060).
markdownlint flags the table column spacing.
✏️ Suggested edit
-| Benchmark | Description | Key Finding |
-|-----------|-------------|-------------|
-| [quant-hela-method-comparison](./quant-hela-method-comparison/) | Cross-experiment consistency using HeLa datasets | iBAQ (log) best for cross-experiment (r=0.74) |
-| [batch-quartet-multilab](./batch-quartet-multilab/) | Batch effect correction with Quartet multi-lab data | DirectLFQ + ComBat most reliable |
-| [quant-pxd007683-tmt-vs-lfq](./quant-pxd007683-tmt-vs-lfq/) | TMT vs LFQ comparison from Gygi lab | `median-cov` gives lowest CV |
+| Benchmark | Description | Key Finding |
+|-----------|-------------|-------------|
+| [quant-hela-method-comparison](./quant-hela-method-comparison/) | Cross-experiment consistency using HeLa datasets | iBAQ (log) best for cross-experiment (r=0.74) |
+| [batch-quartet-multilab](./batch-quartet-multilab/) | Batch effect correction with Quartet multi-lab data | DirectLFQ + ComBat most reliable |
+| [quant-pxd007683-tmt-vs-lfq](./quant-pxd007683-tmt-vs-lfq/) | TMT vs LFQ comparison from Gygi lab | `median-cov` gives lowest CV |
benchmarks/quant-hela-method-comparison/scripts/run_benchmark.py-86-95 (1)
86-95: ⚠️ Potential issue | 🟡 Minor
`--force` has no effect in phase 5.
The flag is ignored, so plot regeneration can’t be enforced. Consider forwarding it or removing the parameter.
benchmarks/batch-quartet-multilab/scripts/comprehensive_analysis.py-349-427 (1)
349-427: ⚠️ Potential issue | 🟡 Minor
`biological_var` is computed but never surfaced.
Consider adding it to the results output (or drop the computation).
🔧 Proposed fix (add to results)
         results.append({
             "Method": method.upper(),
             "Correction": correction,
             "Total_Variance": total_variance,
             "Within_Batch_Variance": within_batch_var,
             "Between_Batch_Variance": between_batch_var,
+            "Biological_Variance": biological_var,
             "Batch_Effect_Pct": batch_effect_pct,
         })
benchmarks/quant-hela-method-comparison/scripts/run_benchmark.py-121-140 (1)
121-140: ⚠️ Potential issue | 🟡 Minor
Guard against conflicting dataset flags.
If both `--hela-only` and `--comparison-only` are passed, the current logic silently prefers HeLa. Consider rejecting the combination.
🔧 Proposed guard
     args = parser.parse_args()

+    if args.hela_only and args.comparison_only:
+        raise SystemExit("--hela-only and --comparison-only are mutually exclusive")
+
     # Select datasets
     if args.hela_only:
         datasets = HELA_DATASETS
benchmarks/quant-hela-method-comparison/scripts/run_benchmark.py-73-83 (1)
73-83: ⚠️ Potential issue | 🟡 Minor
`--force` has no effect in phase 4.
`force` is unused, so recomputation can’t be requested from the CLI. Consider passing it into the metrics runner if supported or dropping the arg.
🔧 Suggested change (if supported)
-    results = mod.run_all_metrics(datasets=datasets)
+    results = mod.run_all_metrics(datasets=datasets, force=force)
benchmarks/quant-hela-method-comparison/scripts/06_method_correlation_plots.py-31-58 (1)
31-58: ⚠️ Potential issue | 🟡 Minor
Guard CCC against zero-variance inputs.
If both series are constant, the denominator becomes zero.
🔧 Proposed fix
-    ccc = numerator / denominator
-
-    return ccc
+    if denominator == 0:
+        return 0
+    return numerator / denominator
benchmarks/batch-quartet-multilab/scripts/plot_results.py-237-306 (1)
237-306: ⚠️ Potential issue | 🟡 Minor
Avoid divide-by-zero when raw CV is zero.
If a method’s raw CV is 0, the improvement calculation will blow up.
🔧 Proposed fix
-    if len(raw_cv) > 0 and len(combat_cv) > 0:
-        cv_improvement = (raw_cv[0] - combat_cv[0]) / raw_cv[0] * 100
+    if len(raw_cv) > 0 and len(combat_cv) > 0 and raw_cv[0] != 0:
+        cv_improvement = (raw_cv[0] - combat_cv[0]) / raw_cv[0] * 100
     else:
         cv_improvement = 0
benchmarks/quant-hela-method-comparison/scripts/run_benchmark.py-37-46 (1)
37-46: ⚠️ Potential issue | 🟡 Minor
`--force` has no effect in phase 1.
`force` is unused, so the CLI flag doesn’t impact data acquisition. Consider threading it through if the downloader supports it (or remove it to avoid misleading behavior).
🔧 Suggested change (if supported)
-    results = mod.download_all_datasets(datasets=datasets)
+    results = mod.download_all_datasets(datasets=datasets, force=force)
benchmarks/batch-quartet-multilab/scripts/comprehensive_analysis.py-556-675 (1)
556-675: ⚠️ Potential issue | 🟡 Minor
Drop unused `method_lower` assignment.
It’s unused and trips Ruff F841.
🔧 Proposed fix
-    method_lower = method.lower()
-
     # Get metrics
     core_pct = coverage_df[coverage_df["Method"] == method]["Core_Pct"].values[0]
benchmarks/quant-hela-method-comparison/scripts/05_generate_plots.py-564-742 (1)
564-742: ⚠️ Potential issue | 🟡 Minor
Don’t silently swallow parquet read failures.
A bare `except Exception: continue` can hide data loss and lead to incomplete plots without any hint to the user.
🔧 Proposed fix
-        try:
-            df = pd.read_parquet(quant_path)
-        except Exception:
-            continue
+        try:
+            df = pd.read_parquet(quant_path)
+        except Exception as e:
+            print(f" WARNING: Failed to read {quant_path}: {e}")
+            continue
benchmarks/quant-hela-method-comparison/scripts/06_method_correlation_plots.py-450-520 (1)
450-520: ⚠️ Potential issue | 🟡 Minor
Remove redundant f-prefix in static print statement.
The string literal contains no expressions, triggering Ruff F541.
🔧 Proposed fix
-    print(f"\n Saved: method_correlations.csv")
+    print("\n Saved: method_correlations.csv")
benchmarks/quant-hela-method-comparison/scripts/06_method_correlation_plots.py-106-162 (1)
106-162: ⚠️ Potential issue | 🟡 Minor
Use the computed p-value and clear unused variables.
Currently the plot hardcodes the p-value and leaves `rho` and `cb` unused. Change `rho` to `_rho` to indicate intentional non-use, remove the `cb` assignment, and use the computed `p` variable in the stats text instead of the hardcoded value.
🔧 Proposed fix
     r, p = pearsonr(x, y)
-    rho, _ = spearmanr(x, y)
+    _rho, _ = spearmanr(x, y)
     ccc = concordance_correlation_coefficient(x, y)

     # Create hexbin density plot
     hb = ax.hexbin(x, y, gridsize=bins, cmap='jet', mincnt=1,
                    norm=LogNorm(), alpha=0.9)

     # Add colorbar
-    cb = plt.colorbar(hb, ax=ax, label='n_neighbors')
+    plt.colorbar(hb, ax=ax, label='n_neighbors')

     # Add diagonal line (y = x)
     lims = [
         min(ax.get_xlim()[0], ax.get_ylim()[0]),
         max(ax.get_xlim()[1], ax.get_ylim()[1]),
     ]
     ax.plot(lims, lims, 'k--', alpha=0.5, linewidth=1)
     ax.set_xlim(lims)
     ax.set_ylim(lims)

     # Add statistics text
-    stats_text = f"R = {r:.2f}, p < 2.2e-16\nLin's CCC: {ccc:.3f}"
+    stats_text = f"R = {r:.2f}, p = {p:.2e}\nLin's CCC: {ccc:.3f}"
     ax.text(0.05, 0.95, stats_text, transform=ax.transAxes,
             fontsize=10, verticalalignment='top',
             bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))
benchmarks/batch-quartet-multilab/scripts/comprehensive_analysis.py-238-346 (1)
238-346: ⚠️ Potential issue | 🟡 Minor
Remove redundant f-prefixes on static warning strings.
Lines 343 and 345 have f-strings with no interpolation, triggering Ruff F541 (f-string-missing-placeholders).
🔧 Proposed fix
-    print(f" WARNING: Low LFC correlation indicates quantification instability across labs")
+    print(" WARNING: Low LFC correlation indicates quantification instability across labs")
-    print(f" WARNING: Low DE concordance - same experiment may give different conclusions in different labs")
+    print(" WARNING: Low DE concordance - same experiment may give different conclusions in different labs")
benchmarks/quant-hela-method-comparison/scripts/03_run_quantification.py-33-104 (1)
33-104: ⚠️ Potential issue | 🟡 Minor
Use `dataset_id` in error messages to fix unused parameter and f-string lint issues.
The function accepts `dataset_id` but never uses it, triggering ARG001. Additionally, the second error message is an f-string with no placeholders, triggering F541.
🔧 Proposed fix
     if not fasta_path.exists():
-        print(f" WARNING: FASTA file not found: {fasta_path}")
+        print(f" WARNING [{dataset_id}]: FASTA file not found: {fasta_path}")
         return None

     else:
-        print(f" ERROR: iBAQ output file not generated")
+        print(f" ERROR [{dataset_id}]: iBAQ output file not generated")
         return None
🧹 Nitpick comments (3)
benchmarks/quant-hela-method-comparison/scripts/03_run_quantification.py (1)
222-311: Verify whether MaxLFQ should be part of the default method set.
`run_maxlfq()` exists, but `QUANTIFICATION_METHODS` (and the CLI `--method` choices) omit `"maxlfq"`, so it won’t run unless injected manually. Confirm intent: add it to `QUANTIFICATION_METHODS` or remove the runner to avoid dead code.
benchmarks/quant-hela-method-comparison/scripts/run_benchmark.py (1)
29-34: Defensive guard for module spec/loader is good practice but not required for current usage.
`spec_from_file_location()` can return `None` and `spec.loader` can be `None` when the file suffix isn't recognized by Python's import machinery. However, all current callers pass `.py` files, which always receive a `SourceFileLoader`. That said, adding the guard prevents silent failures if paths change to non-standard extensions in the future.
🔧 Suggested addition for robustness
 def import_module(name, path):
     """Import a module from a file path."""
     spec = importlib.util.spec_from_file_location(name, path)
+    if spec is None or spec.loader is None:
+        raise ImportError(f"Cannot load module '{name}' from {path}")
     module = importlib.util.module_from_spec(spec)
     spec.loader.exec_module(module)
     return module
benchmarks/quant-hela-method-comparison/scripts/02_prepare_peptides.py (1)
247-335: Optional: use `dataset_id` to avoid unused-arg warnings.
This keeps logs more informative and removes the unused parameter warning.
♻️ Suggested tweak
-    print(f" Input shape: {df.shape}")
+    print(f" [{dataset_id}] Input shape: {df.shape}")
```python
def analyze_protein_coverage(meta, data):
    """Analyze protein detection across labs and methods."""
    print("\n" + "=" * 70)
    print("1. PROTEIN COVERAGE ANALYSIS")
    print("=" * 70)

    results = []

    for method, matrices in data.items():
        matrix = matrices["raw"]

        # Per-batch coverage
        batches = meta["batch"].unique()
        batch_proteins = {}

        for batch in batches:
            batch_samples = meta[meta["batch"] == batch]["run_id"].tolist()
            batch_cols = [c for c in matrix.columns if c in batch_samples]

            if batch_cols:
                batch_matrix = matrix[batch_cols]
                # Protein detected in at least 1 sample in batch
                detected = (batch_matrix.notna().sum(axis=1) > 0).sum()
                # Protein detected in ALL samples in batch
                complete = (batch_matrix.notna().sum(axis=1) == len(batch_cols)).sum()
                batch_proteins[batch] = {
                    "detected": detected,
                    "complete": complete,
                    "samples": len(batch_cols)
                }

        # Calculate overlap between batches
        batch_sets = {}
        for batch in batches:
            batch_samples = meta[meta["batch"] == batch]["run_id"].tolist()
            batch_cols = [c for c in matrix.columns if c in batch_samples]
            if batch_cols:
                detected_proteins = matrix.index[matrix[batch_cols].notna().any(axis=1)].tolist()
                batch_sets[batch] = set(detected_proteins)

        # Pairwise overlap
        overlaps = []
        for b1, b2 in combinations(batches, 2):
            if b1 in batch_sets and b2 in batch_sets:
                intersection = len(batch_sets[b1] & batch_sets[b2])
                union = len(batch_sets[b1] | batch_sets[b2])
                jaccard = intersection / union if union > 0 else 0
                overlaps.append(jaccard)

        # Core proteome (detected in ALL batches)
        if batch_sets:
            core = set.intersection(*batch_sets.values())
            any_batch = set.union(*batch_sets.values())
        else:
            core = set()
            any_batch = set()

        results.append({
            "Method": method.upper(),
            "Total_Proteins": len(matrix),
            "Core_Proteome": len(core),
            "Core_Pct": len(core) / len(any_batch) * 100 if any_batch else 0,
            "Mean_Jaccard": np.mean(overlaps) if overlaps else 0,
            "Min_Batch_Detection": min(bp["detected"] for bp in batch_proteins.values()) if batch_proteins else 0,
            "Max_Batch_Detection": max(bp["detected"] for bp in batch_proteins.values()) if batch_proteins else 0,
        })

        print(f"\n{method.upper()}:")
        print(f" Total proteins: {len(matrix)}")
        print(f" Core proteome (all batches): {len(core)} ({len(core)/len(any_batch)*100:.1f}%)")
        print(f" Mean Jaccard similarity: {np.mean(overlaps):.3f}")
        print(f" Per-batch detection range: {min(bp['detected'] for bp in batch_proteins.values())} - {max(bp['detected'] for bp in batch_proteins.values())}")

    return pd.DataFrame(results)
```
Guard against empty batch mappings before min/max and % calculations.
If no matrix columns match metadata batches, the current min/max and len(core)/len(any_batch) prints will raise.
🔧 Proposed fix
- print(f"\n{method.upper()}:")
- print(f" Total proteins: {len(matrix)}")
- print(f" Core proteome (all batches): {len(core)} ({len(core)/len(any_batch)*100:.1f}%)")
- print(f" Mean Jaccard similarity: {np.mean(overlaps):.3f}")
- print(f" Per-batch detection range: {min(bp['detected'] for bp in batch_proteins.values())} - {max(bp['detected'] for bp in batch_proteins.values())}")
+ print(f"\n{method.upper()}:")
+ print(f" Total proteins: {len(matrix)}")
+ if batch_proteins and any_batch:
+ core_pct = len(core) / len(any_batch) * 100
+ print(f" Core proteome (all batches): {len(core)} ({core_pct:.1f}%)")
+ print(f" Mean Jaccard similarity: {np.mean(overlaps):.3f}")
+ print(
+ " Per-batch detection range: "
+ f"{min(bp['detected'] for bp in batch_proteins.values())} - "
+ f"{max(bp['detected'] for bp in batch_proteins.values())}"
+ )
+ else:
+ print(" No matching batch samples found in the matrix")
```python
def analyze_missing_values(meta, data):
    """Analyze missing value patterns."""
    print("\n" + "=" * 70)
    print("2. MISSING VALUE ANALYSIS")
    print("=" * 70)

    results = []

    for method, matrices in data.items():
        matrix = matrices["raw"]

        total_values = matrix.size
        missing_values = matrix.isna().sum().sum()
        missing_pct = missing_values / total_values * 100

        # Per-batch missing rate
        batches = meta["batch"].unique()
        batch_missing = {}

        for batch in batches:
            batch_samples = meta[meta["batch"] == batch]["run_id"].tolist()
            batch_cols = [c for c in matrix.columns if c in batch_samples]
            if batch_cols:
                batch_matrix = matrix[batch_cols]
                batch_miss_pct = batch_matrix.isna().sum().sum() / batch_matrix.size * 100
                batch_missing[batch] = batch_miss_pct

        # Variance in missing rates across batches
        missing_variance = np.var(list(batch_missing.values()))

        results.append({
            "Method": method.upper(),
            "Total_Missing_Pct": missing_pct,
            "Min_Batch_Missing": min(batch_missing.values()),
            "Max_Batch_Missing": max(batch_missing.values()),
            "Missing_Variance": missing_variance,
        })

        print(f"\n{method.upper()}:")
        print(f" Overall missing: {missing_pct:.1f}%")
        print(f" Batch missing range: {min(batch_missing.values()):.1f}% - {max(batch_missing.values()):.1f}%")
        print(f" Missing variance: {missing_variance:.2f}")

        # CRITIQUE: Flag high missing value variance as a problem
        if missing_variance > 100:
            print(f" WARNING: High variance in missing values across batches suggests batch-specific quantification issues")

    return pd.DataFrame(results)
```
Handle empty batch_missing mapping to prevent ValueError on min/max.
If no columns match any batch, batch_missing remains empty and calling min() or max() on its values will crash. Additionally, the warning message f-string prefix is unnecessary since it contains no variable interpolation.
🔧 Proposed fix
- results.append({
+ if batch_missing:
+ min_missing = min(batch_missing.values())
+ max_missing = max(batch_missing.values())
+ else:
+ min_missing = max_missing = 0
+
+ results.append({
"Method": method.upper(),
"Total_Missing_Pct": missing_pct,
- "Min_Batch_Missing": min(batch_missing.values()),
- "Max_Batch_Missing": max(batch_missing.values()),
+ "Min_Batch_Missing": min_missing,
+ "Max_Batch_Missing": max_missing,
"Missing_Variance": missing_variance,
})
@@
- print(f" Batch missing range: {min(batch_missing.values()):.1f}% - {max(batch_missing.values()):.1f}%")
+ print(f" Batch missing range: {min_missing:.1f}% - {max_missing:.1f}%")
@@
- print(f" WARNING: High variance in missing values across batches suggests batch-specific quantification issues")
+ print(" WARNING: High variance in missing values across batches suggests batch-specific quantification issues")
🧰 Tools
🪛 Ruff (0.14.14)
[error] 176-176: f-string without any placeholders
Remove extraneous f prefix
(F541)
```python
def run_maxlfq_quantification(peptide_df: pd.DataFrame) -> pd.DataFrame:
    """Run MaxLFQ quantification using mokume."""
    print(" Running MaxLFQ...")

    # Pivot to wide format for MaxLFQ
    wide_df = peptide_df.pivot_table(
        index=["peptide", "protein"],
        columns="run_id",
        values="intensity",
        aggfunc="sum"
    ).reset_index()

    sample_cols = [c for c in wide_df.columns if c not in ["peptide", "protein"]]

    # Run MaxLFQ per protein
    maxlfq = MaxLFQQuantification(min_peptides=1)

    proteins = wide_df["protein"].unique()
    results = []

    for protein in proteins:
        protein_data = wide_df[wide_df["protein"] == protein]
        intensity_matrix = protein_data[sample_cols].values

        # Skip if too few peptides
        if len(protein_data) < 1:
            continue

        try:
            # Use log intensities for MaxLFQ
            log_intensities = np.log(intensity_matrix + 1)
            log_intensities[~np.isfinite(log_intensities)] = np.nan

            # Simple MaxLFQ: median of peptide ratios
            protein_intensities = np.nanmedian(log_intensities, axis=0)

            for i, sample in enumerate(sample_cols):
                if not np.isnan(protein_intensities[i]):
                    results.append({
                        "protein": protein,
                        "run_id": sample,
                        "intensity": protein_intensities[i]
                    })
        except Exception:
            continue

    return pd.DataFrame(results)
```
MaxLFQ path doesn't use MaxLFQQuantification.
The current implementation creates a MaxLFQQuantification object but never calls it, instead computing a median of log intensities—which deviates significantly from the MaxLFQ algorithm and skews the benchmark.
Call MaxLFQQuantification.quantify() directly on the peptide DataFrame:
Proposed fix
- # Pivot to wide format for MaxLFQ
- wide_df = peptide_df.pivot_table(
- index=["peptide", "protein"],
- columns="run_id",
- values="intensity",
- aggfunc="sum"
- ).reset_index()
-
- sample_cols = [c for c in wide_df.columns if c not in ["peptide", "protein"]]
-
- # Run MaxLFQ per protein
- maxlfq = MaxLFQQuantification(min_peptides=1)
-
- proteins = wide_df["protein"].unique()
- results = []
-
- for protein in proteins:
- protein_data = wide_df[wide_df["protein"] == protein]
- intensity_matrix = protein_data[sample_cols].values
-
- # Skip if too few peptides
- if len(protein_data) < 1:
- continue
-
- try:
- # Use log intensities for MaxLFQ
- log_intensities = np.log(intensity_matrix + 1)
- log_intensities[~np.isfinite(log_intensities)] = np.nan
-
- # Simple MaxLFQ: median of peptide ratios
- protein_intensities = np.nanmedian(log_intensities, axis=0)
-
- for i, sample in enumerate(sample_cols):
- if not np.isnan(protein_intensities[i]):
- results.append({
- "protein": protein,
- "run_id": sample,
- "intensity": protein_intensities[i]
- })
- except Exception:
- continue
-
- return pd.DataFrame(results)
+ maxlfq = MaxLFQQuantification(min_peptides=1)
+ result = maxlfq.quantify(
+ peptide_df,
+ protein_column="protein",
+ peptide_column="peptide",
+ intensity_column="intensity",
+ sample_column="run_id",
+ )
+
+ # Rename to standard output column
+ result = result.rename(columns={"MaxLFQIntensity": "intensity"})
+ return result[["protein", "run_id", "intensity"]]
🧰 Tools
🪛 Ruff (0.14.14)
[error] 125-125: Local variable maxlfq is assigned to but never used
Remove assignment to unused variable maxlfq
(F841)
[error] 153-154: try-except-continue detected, consider logging the exception
(S112)
[warning] 153-153: Do not catch blind exception: Exception
(BLE001)
```python
def download_file(url: str, dest: Path, verbose: bool = True) -> bool:
    """
    Download a file from URL to destination.

    Parameters
    ----------
    url : str
        Source URL
    dest : Path
        Destination file path
    verbose : bool
        Print progress messages

    Returns
    -------
    bool
        True if download successful, False otherwise
    """
    if dest.exists():
        if verbose:
            print(f" Already exists: {dest.name}")
        return True

    if verbose:
        print(f" Downloading: {url}")

    try:
        urllib.request.urlretrieve(url, dest)
        if verbose:
            size_mb = dest.stat().st_size / (1024 * 1024)
            print(f" Saved to: {dest.name} ({size_mb:.1f} MB)")
        return True
    except urllib.error.HTTPError as e:
        print(f" HTTP Error {e.code}: {url}")
        return False
    except urllib.error.URLError as e:
        print(f" URL Error: {e.reason}")
        return False
    except Exception as e:
        print(f" Error: {e}")
        return False
```
Create destination directories and validate URL scheme before download.
Without ensuring dest.parent exists, downloads can fail; also restrict schemes to expected protocols.
🔧 Proposed fix
@@
-import urllib.request
-import urllib.error
+import urllib.request
+import urllib.error
+from urllib.parse import urlparse
@@
def download_file(url: str, dest: Path, verbose: bool = True) -> bool:
@@
- if dest.exists():
+ if dest.exists():
if verbose:
print(f" Already exists: {dest.name}")
return True
+
+ # Ensure output directory exists
+ dest.parent.mkdir(parents=True, exist_ok=True)
+
+ # Allow only expected URL schemes
+ scheme = urlparse(url).scheme
+ if scheme not in {"http", "https", "ftp", "ftps"}:
+ print(f" Unsupported URL scheme: {scheme}")
+ return False
🧰 Tools
🪛 Ruff (0.14.14)
[error] 49-49: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[warning] 53-53: Consider moving this statement to an else block
(TRY300)
[warning] 60-60: Do not catch blind exception: Exception
(BLE001)
```python
def check_url_exists(url: str) -> bool:
    """Check if a URL exists without downloading."""
    try:
        request = urllib.request.Request(url, method='HEAD')
        urllib.request.urlopen(request, timeout=10)
        return True
    except:
        return False
```
Avoid bare except and close the response.
This keeps errors explicit and avoids resource leaks.
🔧 Proposed fix
def check_url_exists(url: str) -> bool:
"""Check if a URL exists without downloading."""
try:
request = urllib.request.Request(url, method='HEAD')
- urllib.request.urlopen(request, timeout=10)
- return True
- except:
+ with urllib.request.urlopen(request, timeout=10):
+ return True
+ except (urllib.error.HTTPError, urllib.error.URLError, ValueError):
return False
🧰 Tools
🪛 Ruff (0.14.14)
[error] 68-68: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[error] 69-69: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[warning] 70-70: Consider moving this statement to an else block
(TRY300)
[error] 71-71: Do not use bare except
(E722)
```python
def compute_expression_stability(
    datasets: Dict[str, DatasetInfo],
    method: str,
    proteins: List[str] = None,
) -> pd.DataFrame:
    """
    Compute expression stability metrics (MAD, IQR) for selected proteins.

    Parameters
    ----------
    datasets : dict
        Datasets to analyze
    method : str
        Quantification method
    proteins : list, optional
        Proteins to analyze. Defaults to PROTEINS_OF_INTEREST.

    Returns
    -------
    pd.DataFrame
        Stability metrics per protein: MAD, IQR, range, n_datasets
    """
    if proteins is None:
        proteins = PROTEINS_OF_INTEREST

    intensity_col = INTENSITY_COLUMNS.get(method)

    # Collect expression values per protein
    protein_values = {p: [] for p in proteins}

    for dataset_id, dataset in datasets.items():
        df = load_quantification_result(dataset_id, method)
        if df is None:
            continue

        if intensity_col not in df.columns:
            for col in df.columns:
                if "intensity" in col.lower() or "ibaq" in col.lower():
                    intensity_col = col
                    break

        if intensity_col not in df.columns:
            continue

        # Get median expression per protein
        median_expr = df.groupby("ProteinName")[intensity_col].median()

        for protein in proteins:
            if protein in median_expr.index:
                val = median_expr.loc[protein]
                if pd.notna(val) and val > 0:
                    protein_values[protein].append(np.log2(val + 1))

    # Compute stability metrics
    results = []
    for protein, values in protein_values.items():
        if len(values) >= 2:
            values = np.array(values)
            results.append({
                "ProteinName": protein,
                "MAD": np.median(np.abs(values - np.median(values))),
                "IQR": np.percentile(values, 75) - np.percentile(values, 25),
                "Range": values.max() - values.min(),
                "Mean": values.mean(),
                "Std": values.std(),
                "N_datasets": len(values),
            })

    return pd.DataFrame(results)
```
Avoid double‑logging iBAQ and dropping valid log values.
IbaqLog is already log-transformed; applying log2 again and filtering val > 0 can discard valid data.
🔧 Proposed fix
if proteins is None:
proteins = PROTEINS_OF_INTEREST
intensity_col = INTENSITY_COLUMNS.get(method)
+ log_needed = "log" not in method.lower() and method != "ibaq"
@@
for protein in proteins:
if protein in median_expr.index:
val = median_expr.loc[protein]
- if pd.notna(val) and val > 0:
- protein_values[protein].append(np.log2(val + 1))
+ if pd.notna(val) and (val > 0 or not log_needed):
+ protein_values[protein].append(np.log2(val + 1) if log_needed else val)
🧰 Tools
🪛 Ruff (0.14.14)
[warning] 318-318: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
[warning] 345-345: Loop control variable dataset not used within loop body
Rename unused dataset to _dataset
(B007)
```python
def save_results(results: dict, output_dir: Path = None):
    """Save all results to files."""
    if output_dir is None:
        output_dir = ANALYSIS_DIR

    output_dir.mkdir(parents=True, exist_ok=True)

    print(f"\nSaving results to {output_dir}...")

    # CV Summary
    if results["cv_summary"]:
        cv_df = pd.DataFrame(results["cv_summary"])
        cv_df.to_csv(output_dir / "cv_comparison.csv", index=False)
        print(f" Saved: cv_comparison.csv")

    # Cross-experiment correlation
    for method, corr_matrix in results["cross_experiment_corr"].items():
        corr_matrix.to_csv(output_dir / f"cross_experiment_corr_{method}.csv")
        print(f" Saved: cross_experiment_corr_*.csv")

    # Rank consistency
    for method, rank_matrix in results["rank_consistency"].items():
        rank_matrix.to_csv(output_dir / f"rank_consistency_{method}.csv")
        print(f" Saved: rank_consistency_*.csv")

    # Expression stability
    for method, stability in results["expression_stability"].items():
        stability.to_csv(output_dir / f"expression_stability_{method}.csv", index=False)
        print(f" Saved: expression_stability_*.csv")

    # TMT vs LFQ correlation
    if results.get("tmt_lfq"):
        tmt_lfq_summary = []
        for method, data in results["tmt_lfq"].items():
            # Save raw values for plotting
            plot_df = pd.DataFrame({
                "ProteinName": data["proteins"],
                "TMT": data["tmt_values"],
                "LFQ": data["lfq_values"],
            })
            plot_df.to_csv(output_dir / f"tmt_lfq_values_{method}.csv", index=False)

            # Summary stats
            tmt_lfq_summary.append({
                "method": method,
                "n_proteins": data["n_proteins"],
                "pearson_r": data["pearson_r"],
                "spearman_r": data["spearman_r"],
            })

        if tmt_lfq_summary:
            pd.DataFrame(tmt_lfq_summary).to_csv(output_dir / "tmt_lfq_comparison.csv", index=False)
            print(f" Saved: tmt_lfq_comparison.csv, tmt_lfq_values_*.csv")

    # Summary metrics
    summary = []
    for method in QUANTIFICATION_METHODS:
        row = {"method": method}

        # Mean CV
        cv_data = [x for x in results["cv_summary"] if x["method"] == method]
        if cv_data:
            row["mean_cv"] = np.mean([x["mean_cv"] for x in cv_data])

        # Mean cross-experiment correlation
        if method in results["cross_experiment_corr"]:
            corr = results["cross_experiment_corr"][method]
            row["mean_cross_corr"] = corr.values[np.triu_indices_from(corr.values, k=1)].mean()

        # Mean rank consistency
        if method in results["rank_consistency"]:
            rank = results["rank_consistency"][method]
            row["mean_rank_corr"] = rank.values[np.triu_indices_from(rank.values, k=1)].mean()

        # Mean expression stability
        if method in results["expression_stability"]:
            row["mean_mad"] = results["expression_stability"][method]["MAD"].mean()

        summary.append(row)

    summary_df = pd.DataFrame(summary)
    summary_df.to_csv(output_dir / "summary_metrics.csv", index=False)
    print(f" Saved: summary_metrics.csv")
```
Remove f-strings without placeholders to satisfy linting.
Ruff F541 will fail CI for these lines.
🔧 Proposed fix
- print(f" Saved: cv_comparison.csv")
+ print(" Saved: cv_comparison.csv")
@@
- print(f" Saved: cross_experiment_corr_*.csv")
+ print(" Saved: cross_experiment_corr_*.csv")
@@
- print(f" Saved: rank_consistency_*.csv")
+ print(" Saved: rank_consistency_*.csv")
@@
- print(f" Saved: expression_stability_*.csv")
+ print(" Saved: expression_stability_*.csv")
@@
- print(f" Saved: tmt_lfq_comparison.csv, tmt_lfq_values_*.csv")
+ print(" Saved: tmt_lfq_comparison.csv, tmt_lfq_values_*.csv")
@@
- print(f" Saved: summary_metrics.csv")
+ print(" Saved: summary_metrics.csv")
🧰 Tools
🪛 Ruff (0.14.14)
[warning] 459-459: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
[error] 472-472: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 477-477: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 482-482: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 487-487: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 511-511: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 541-541: f-string without any placeholders
Remove extraneous f prefix
(F541)
```python
def create_method_comparison_grid(
    methods: List[str] = None,
    datasets: Dict = None,
    output_dir: Path = None,
):
    """
    Create a grid of correlation plots comparing all method pairs.
    """
    if methods is None:
        methods = ["ibaq", "ribaq", "directlfq", "top3", "sum"]
    if datasets is None:
        datasets = ALL_DATASETS
    if output_dir is None:
        output_dir = PLOTS_DIR

    n_methods = len(methods)
    fig, axes = plt.subplots(n_methods, n_methods, figsize=(4 * n_methods, 4 * n_methods))

    for i, method1 in enumerate(methods):
        for j, method2 in enumerate(methods):
            ax = axes[i, j]

            if i == j:
                # Diagonal - show method name
                ax.text(0.5, 0.5, method1.upper(), transform=ax.transAxes,
                        ha='center', va='center', fontsize=14, fontweight='bold')
                ax.set_xlim(0, 1)
                ax.set_ylim(0, 1)
                ax.axis('off')
                continue

            if i > j:
                # Lower triangle - hide
                ax.axis('off')
                continue

            # Upper triangle - correlation plot
            all_x = []
            all_y = []

            for dataset_id in datasets:
                df1 = load_method_data(dataset_id, method1)
                df2 = load_method_data(dataset_id, method2)

                if df1 is None or df2 is None:
                    continue

                col1 = get_intensity_column(method1)
                col2 = get_intensity_column(method2)

                if col1 not in df1.columns or col2 not in df2.columns:
                    continue

                expr1 = get_median_expression(df1, col1)
                expr2 = get_median_expression(df2, col2)

                common = expr1.index.intersection(expr2.index)

                if len(common) < 10:
                    continue

                x = expr1.loc[common].values
                y = expr2.loc[common].values

                if not is_log_transformed(method1):
                    x = np.log2(x + 1)
                if not is_log_transformed(method2):
                    y = np.log2(y + 1)

                mask = np.isfinite(x) & np.isfinite(y)
                x = zscore_normalize(x[mask])
                y = zscore_normalize(y[mask])
                all_x.extend(x)
                all_y.extend(y)

            if len(all_x) > 100:
                all_x = np.array(all_x)
                all_y = np.array(all_y)

                # Simple hexbin without colorbar for grid
                ax.hexbin(all_x, all_y, gridsize=50, cmap='jet', mincnt=1,
                          norm=LogNorm(), alpha=0.9)

                # Stats
                r, _ = pearsonr(all_x, all_y)
                ax.text(0.05, 0.95, f"R={r:.2f}", transform=ax.transAxes,
                        fontsize=9, verticalalignment='top',
                        bbox=dict(boxstyle='round', facecolor='white', alpha=0.7))

                # Diagonal
                lims = [min(all_x.min(), all_y.min()), max(all_x.max(), all_y.max())]
                ax.plot(lims, lims, 'k--', alpha=0.5, linewidth=0.5)

            if i == 0:
                ax.set_title(method2.upper())
            if j == 0:
                ax.set_ylabel(method1.upper())

    plt.suptitle("Method Correlation Comparison (All Datasets)", fontsize=16, y=1.02)
    plt.tight_layout()

    plt.savefig(output_dir / "method_correlation_grid.png", dpi=150, bbox_inches="tight")
    plt.savefig(output_dir / "method_correlation_grid.pdf", bbox_inches="tight")
    plt.close()

    print(" Saved: method_correlation_grid.png/pdf")
```
🧩 Analysis chain
🌐 Web query:
matplotlib plt.subplots(1, 1) return value single axes vs 2d array
💡 Result:
plt.subplots(nrows, ncols) returns (fig, ax) where ax’s type/shape depends on the squeeze parameter (default True):
plt.subplots(1, 1)→ a singlematplotlib.axes.Axesobject (scalar), not an array. [1]plt.subplots(N, 1)orplt.subplots(1, M)(withN>1orM>1) → a 1D NumPy object array ofAxes. [1]plt.subplots(N, M)withN>1andM>1→ a 2D NumPy array ofAxes. [1]
If you always want a 2D array (even for 1x1), set squeeze=False:
fig, axs = plt.subplots(1, 1, squeeze=False)
# axs.shape == (1, 1)
ax = axs[0, 0]  # [1]
Sources
[1] Matplotlib docs: matplotlib.pyplot.subplots (squeeze behavior) (matplotlib.org)
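To see this concretely, here is a minimal standalone check of the squeeze behavior (it assumes only numpy and matplotlib are installed and is not part of the PR's scripts):

import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the check runs without a display
import matplotlib.pyplot as plt

_, ax = plt.subplots(1, 1)
print(type(ax))                  # a single Axes instance, not an array

_, grid = plt.subplots(3, 3)
print(grid.shape)                # (3, 3) object array of Axes

_, grid_1x1 = plt.subplots(1, 1, squeeze=False)
print(grid_1x1.shape)            # (1, 1), so grid_1x1[i, j] always works

print(np.atleast_2d(ax).shape)   # (1, 1): normalizes the scalar case after the fact

Either squeeze=False at creation time or an np.atleast_2d-style coercion afterwards makes the axes[i, j] indexing safe for a 1x1 grid.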
🏁 Script executed:
# First, let's examine the actual file to understand the full context
cd /tmp && git clone --depth 1 https://github.com/user/repo.git 2>/dev/null || true
# Look for the file
find . -name "06_method_correlation_plots.py" 2>/dev/nullRepository: bigbio/mokume
Length of output: 39
🏁 Script executed:
# Let's check the file size and content around the relevant lines
wc -l benchmarks/quant-hela-method-comparison/scripts/06_method_correlation_plots.py 2>/dev/null || \
find . -path "*benchmarks*" -name "*.py" 2>/dev/null | head -20Repository: bigbio/mokume
Length of output: 139
🏁 Script executed:
# Search for the file more broadly
fd -type f "06_method_correlation_plots.py"Repository: bigbio/mokume
Length of output: 227
🏁 Script executed:
# Let's see what files exist in benchmarks directory
find . -type d -name benchmarks 2>/dev/null | head -5
Repository: bigbio/mokume
Length of output: 69
🏁 Script executed:
# List Python files in the repo
fd -type f -name "*.py" | grep -E "(benchmark|method|correlation)" | head -20Repository: bigbio/mokume
Length of output: 227
🏁 Script executed:
# Read the file to see the actual implementation and defaults
cat -n benchmarks/quant-hela-method-comparison/scripts/06_method_correlation_plots.py | sed -n '340,450p'
Repository: bigbio/mokume
Length of output: 4503
🏁 Script executed:
# Also check if there are any calls to this function that might pass custom methods
rg "create_method_comparison_grid" benchmarks/quant-hela-method-comparison/scripts/06_method_correlation_plots.py -A 5 -B 5Repository: bigbio/mokume
Length of output: 618
🏁 Script executed:
# Check the beginning of the file for imports and context
head -50 benchmarks/quant-hela-method-comparison/scripts/06_method_correlation_plots.py
Repository: bigbio/mokume
Length of output: 1187
Add edge case handling for single-method grids.
When methods contains only one item, plt.subplots(1, 1) returns a single Axes object instead of a 2D array, causing axes[i, j] indexing at line 362 to fail. The proposed fix is correct:
Fix
n_methods = len(methods)
fig, axes = plt.subplots(n_methods, n_methods, figsize=(4 * n_methods, 4 * n_methods))
+    if n_methods == 1:
+        axes = np.array([[axes]])
🧰 Tools
🪛 Ruff (0.14.14)
[warning] 343-343: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
[warning] 344-344: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
[warning] 345-345: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
[warning] 358-358: Unpacked variable fig is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🤖 Prompt for AI Agents
In
`@benchmarks/quant-hela-method-comparison/scripts/06_method_correlation_plots.py`
around lines 342 - 448, The code in create_method_comparison_grid assumes axes
is a 2D array but plt.subplots returns a single Axes when n_methods==1 (or a 1D
array when one dimension is 1), causing axes[i, j] to fail; after creating fig,
axes = plt.subplots(...), coerce axes into a 2D array (e.g., axes =
np.atleast_2d(axes) or replace with axes = np.array([[axes]]) when n_methods ==
1) so subsequent indexing axes[i, j] works reliably for all n_methods values;
update the axes variable immediately after the plt.subplots call (before the
for-loops) and do not change later logic.
def plot_intensity_distribution(df, intensity_col, sample_col, title, output_path,
                                xlabel='log10(Intensity)', colors=None):
    """Plot intensity distribution for each sample using KDE."""
    fig, ax = plt.subplots(figsize=(10, 7))

    samples = sorted(df[sample_col].unique())

    if colors is None:
        cmap = plt.cm.get_cmap('tab20', len(samples))
        colors = [cmap(i) for i in range(len(samples))]

    for i, sample in enumerate(samples):
        sample_data = df[df[sample_col] == sample][intensity_col].dropna()

        if len(sample_data) > 10:
            # KDE estimation
            try:
                kde = stats.gaussian_kde(sample_data)
                x_range = np.linspace(sample_data.min() - 0.5, sample_data.max() + 0.5, 200)
                y = kde(x_range)
                ax.plot(x_range, y, label=sample, color=colors[i], linewidth=1.5)
            except:
                pass

    ax.set_xlabel(xlabel, fontsize=12)
    ax.set_ylabel('Density', fontsize=12)
    ax.set_title(title, fontsize=14, fontweight='bold')
    ax.legend(title='Sample', bbox_to_anchor=(1.02, 1), loc='upper left', fontsize=8)
    ax.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.savefig(output_path, dpi=200, bbox_inches='tight', facecolor='white')
    plt.close()
    print(f"Saved: {output_path}")
|
|
Don’t swallow KDE failures silently.
Bare except: pass hides real data issues and violates lint rules; log and continue instead.
🔧 Example fix (apply to all KDE try/except blocks)
- try:
- kde = stats.gaussian_kde(sample_data)
- x_range = np.linspace(sample_data.min() - 0.5, sample_data.max() + 0.5, 200)
- y = kde(x_range)
- ax.plot(x_range, y, label=sample, color=colors[i], linewidth=1.5)
- except:
- pass
+ try:
+ kde = stats.gaussian_kde(sample_data)
+ x_range = np.linspace(sample_data.min() - 0.5, sample_data.max() + 0.5, 200)
+ y = kde(x_range)
+ ax.plot(x_range, y, label=sample, color=colors[i], linewidth=1.5)
+ except Exception as exc:
+ print(f" KDE skipped for {sample}: {exc}")Also applies to: 87-135, 138-230
🧰 Tools
🪛 Ruff (0.14.14)
[warning] 54-54: Unpacked variable fig is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
[error] 72-72: Do not use bare except
(E722)
[error] 72-73: try-except-pass detected, consider logging the exception
(S110)
🤖 Prompt for AI Agents
In
`@benchmarks/quant-hela-method-comparison/scripts/07_intensity_distribution_plots.py`
around lines 51 - 85, The try/except in plot_intensity_distribution currently
swallows all errors (bare except: pass); change it to catch Exception as e, log
the failure with identifying context (e.g., sample name and intensity_col) and
the exception message (using the module logger or a descriptive print) and then
continue so plotting proceeds for other samples; update the same pattern in the
other KDE try/except blocks referenced (lines 87-135, 138-230) to use Exception
as e and emit an informative log/print before continuing.
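As one possible shape for that logged-and-continue pattern, here is a standalone sketch (the kde_curve helper and logger setup are illustrative only, not code from this PR):

import logging

import numpy as np
from scipy import stats

logger = logging.getLogger(__name__)


def kde_curve(sample_data, n_points=200):
    """Return (x, y) for a KDE curve, or None if estimation fails."""
    try:
        kde = stats.gaussian_kde(sample_data)
        x_range = np.linspace(sample_data.min() - 0.5, sample_data.max() + 0.5, n_points)
        return x_range, kde(x_range)
    except Exception as exc:
        # gaussian_kde can fail on tiny or constant samples (singular covariance);
        # record why instead of silently dropping the curve.
        logger.warning("KDE skipped for %d points: %s", len(sample_data), exc)
        return None

The plotting loop then only draws curves for samples where kde_curve returns a value, and the log explains every missing trace.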
import matplotlib.pyplot as plt
from matplotlib.patches import FancyBboxPatch
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, find and examine the file
find . -path "*benchmarks/quant-hela-method-comparison/scripts/generate_workflow_diagram.py" -type f
# Check if the file exists
if [ -f "benchmarks/quant-hela-method-comparison/scripts/generate_workflow_diagram.py" ]; then
echo "=== File found ==="
wc -l "benchmarks/quant-hela-method-comparison/scripts/generate_workflow_diagram.py"
echo ""
echo "=== Lines around 6-7 (imports) ==="
head -15 "benchmarks/quant-hela-method-comparison/scripts/generate_workflow_diagram.py"
echo ""
echo "=== Lines around 233-239 (alleged savefig calls) ==="
sed -n '230,245p' "benchmarks/quant-hela-method-comparison/scripts/generate_workflow_diagram.py"
else
echo "File not found at benchmarks/quant-hela-method-comparison/scripts/generate_workflow_diagram.py"
# Try to find similar files
find . -name "generate_workflow_diagram.py" -type f
fi
Repository: bigbio/mokume
Length of output: 1135
🏁 Script executed:
#!/bin/bash
# Check for directory creation logic in the file
rg -n "mkdir|makedirs|Path.*plots|os.path.join.*plots" benchmarks/ -g '*.py' --context 2Repository: bigbio/mokume
Length of output: 5093
🏁 Script executed:
#!/bin/bash
# Check the entire directory structure
ls -la benchmarks/quant-hela-method-comparison/ 2>/dev/null || echo "Directory not found"
ls -la benchmarks/quant-hela-method-comparison/scripts/ 2>/dev/null || echo "Scripts directory not found"
ls -la benchmarks/quant-hela-method-comparison/plots/ 2>/dev/null || echo "Plots directory not found or doesn't exist"
Repository: bigbio/mokume
Length of output: 1288
Create the output directory before saving figures.
The script attempts to save to plots/mokume_workflow.png and plots/mokume_workflow.pdf at lines 237-239 without first creating the directory. Running this script in an environment where plots/ does not already exist will fail with FileNotFoundError.
🛠️ Suggested fix
import matplotlib.pyplot as plt
from matplotlib.patches import FancyBboxPatch
+from pathlib import Path
@@
if __name__ == "__main__":
fig = create_workflow_diagram()
+ output_dir = Path("plots")
+ output_dir.mkdir(parents=True, exist_ok=True)
- fig.savefig('plots/mokume_workflow.png', dpi=250, bbox_inches='tight',
+ fig.savefig(output_dir / 'mokume_workflow.png', dpi=250, bbox_inches='tight',
facecolor='white', edgecolor='none')
- fig.savefig('plots/mokume_workflow.pdf', bbox_inches='tight',
+ fig.savefig(output_dir / 'mokume_workflow.pdf', bbox_inches='tight',
facecolor='white', edgecolor='none')
🤖 Prompt for AI Agents
In `@benchmarks/quant-hela-method-comparison/scripts/generate_workflow_diagram.py`
around lines 6 - 7, The script fails when saving figures because the output
directory "plots" may not exist; import os (or pathlib) and create the directory
before the save calls by calling os.makedirs("plots", exist_ok=True) (or
Path("plots").mkdir(parents=True, exist_ok=True)) immediately before the
plt.savefig/fig.savefig calls that write "plots/mokume_workflow.png" and
"plots/mokume_workflow.pdf" so the directory exists.
Summary by CodeRabbit
Release Notes
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.