Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Warning: A newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.5.1. For documentation on how to update your pipeline, please see the nf-core documentation and the Synchronisation documentation.
Resolve conflicts:
- CHANGELOG.md: keep entries from both sides
- fastq_preprocess_gatk/main.nf: keep our fix removing the old CRAM_TO_BAM_RECAL block
The process was removed in #2154, so the sample log output in usage.md should no longer list it.
Instead of always producing CRAM and converting back to BAM:
- Make GATK4_MARKDUPLICATES/GATK4SPARK_MARKDUPLICATES ext.prefix
conditional on save_output_as_bam (same pattern as APPLYBQSR)
- Emit unified `alignment` channel from bam_markduplicates subworkflows
- Remove CRAM_TO_BAM conversion step at markduplicates stage
- Fix BAM_TO_CRAM_MAPPING ext.when to skip conversion when
save_output_as_bam is set with skip_markduplicates
- Fix CSV create subworkflows to derive file type from actual
filenames instead of using fragile .minus(".cram") hack
- Add BAM publishDir patterns for markduplicates configs
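The conditional ext.prefix change above can be sketched as a config fragment. This is illustrative only — the selector and prefix strings are assumptions, not the pipeline's actual conf/modules.config:

```nextflow
// Sketch, not the real config: choose the markduplicates output format
// up front instead of emitting CRAM and converting back to BAM later.
process {
    withName: 'GATK4_MARKDUPLICATES|GATK4SPARK_MARKDUPLICATES' {
        // ext.prefix carries the extension, so the module writes
        // <sample>.md.bam or <sample>.md.cram directly.
        ext.prefix = { params.save_output_as_bam ? "${meta.id}.md.bam" : "${meta.id}.md.cram" }
    }
}
```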
- Remove dead BAM_TO_CRAM config block (no process uses this alias)
- Remove unused save_output_as_bam parameter from CHANNEL_MARKDUPLICATES_CREATE_CSV and CHANNEL_BASERECALIBRATOR_CREATE_CSV (type is now derived from the filename, making the parameter redundant)
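Deriving the CSV type from the filename could look like the following sketch (variable name is an assumption, not the subworkflow's actual code):

```nextflow
// Sketch: read the CSV "type" column off the actual output file.
def type = alignment_file.name.endsWith('.bam') ? 'bam' : 'cram'

// Fragile original pattern — only behaves as intended for CRAM inputs:
// def type = alignment_file.name.minus('.cram')
```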
GATK4_MARKDUPLICATES does not auto-index BAM output (it only indexes when converting to CRAM). Add an explicit INDEX_MARKDUPLICATES step for the BAM path, matching the pattern already used in the spark variant.
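Assuming a samtools-index module aliased as INDEX_MARKDUPLICATES, the BAM path could be wired up roughly like this (a sketch, not the actual subworkflow code):

```nextflow
// GATK4_MARKDUPLICATES only auto-indexes when converting to CRAM,
// so index the BAM output explicitly, as the spark variant already does.
INDEX_MARKDUPLICATES(GATK4_MARKDUPLICATES.out.bam)

// Pair each BAM with its index for downstream consumers.
bam_indexed = GATK4_MARKDUPLICATES.out.bam.join(INDEX_MARKDUPLICATES.out.bai)
```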
The skip QC/recal/md test now correctly skips BAM_TO_CRAM_MAPPING when --save_output_as_bam is set, reducing the process count from 10 to 9.
- alignment_from_everything: remove CRAM_TO_BAM, add INDEX_MARKDUPLICATES, update file listings and stable_content md5s for BAM output
- alignment_to_fastq: same structural changes; update multiqc aggregate md5s
- save_output_as_bam: fix the warning field to match CI behavior
- alignment_from_everything/alignment_to_fastq: fix .bam.metrics md5s (values were extracted from the wrong side of the CI diff)
- save_output_as_bam: add the missing variant calling snapshot, fix warnings
GATK4_MARKDUPLICATES .bam.metrics output includes timestamps, making md5 values change between CI runs. Add it to .nftignore (same as .cram.metrics) and remove it from the snapshot stable_content.
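The unstable-md5 files can be excluded with glob patterns in the test's .nftignore; the exact patterns below are assumptions, not the repository's actual file:

```
# Metrics files embed run timestamps, so their md5s differ between
# CI runs and must not be snapshotted.
**/*.bam.metrics
**/*.cram.metrics
```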
maxulysse left a comment:
All good, but the order of the PR entry in the changelog needs fixing.
Hi, what's the time estimate for merging & releasing this fix? Thanks!
I asked the issue authors above to test. @amizeranschi ran into something that I haven't had time to investigate: #2064. It would be good to first rule out whether the changes in this PR are causing the stuck issue, and/or to validate that all the other scenarios are now working, if you have time to run any of them.
Summary
Fixes the --save_output_as_bam flag, which was broken in multiple ways. The core approach: instead of always producing CRAM and converting back to BAM, make processes output the correct format directly.

APPLYBQSR (recalibrate stage)
- Emit a unified alignment channel from bam_applybqsr — both variants run unconditionally (one is always empty) and are mixed into a single alignment emit
- Remove the CRAM_TO_BAM_RECAL conversion step (no longer needed — APPLYBQSR outputs BAM directly when the flag is set via ext.suffix)

Markduplicates stage
- Make GATK4_MARKDUPLICATES/GATK4SPARK_MARKDUPLICATES ext.prefix conditional on save_output_as_bam — produces .md.bam or .md.cram directly
- Emit a unified alignment channel from the bam_markduplicates subworkflows (same pattern as applybqsr)
- Remove the CRAM_TO_BAM conversion step at the markduplicates stage entirely
- Fix BAM_TO_CRAM_MAPPING ext.when to skip conversion when save_output_as_bam is set with skip_markduplicates
- Run CRAM_TO_BAM in an explicit if block instead of the deprecated ext.when

CSV restart files
- Derive the file type from actual filenames (file.name/index.name) instead of the fragile .minus(".cram") hack that broke with BAM inputs
- Remove the unused save_output_as_bam parameter from CHANNEL_MARKDUPLICATES_CREATE_CSV and CHANNEL_BASERECALIBRATOR_CREATE_CSV

Cleanup
- Remove dead BAM_TO_CRAM and CRAM_TO_BAM config blocks
- Remove the CRAM_TO_BAM_RECAL reference from the docs for --save_output_as_bam

Known limitation
- BAM_SENTIEON_DEDUP still always outputs CRAM — --save_output_as_bam does not yet produce BAMs for the sentieon dedup path (no crash, just no BAM output)

Closes #2136, #2064, #2149, #2148

Test plan
- nf-test test tests/save_output_as_bam.nf.test --profile debug,test,docker — both scenarios pass
- nf-test test tests/default.nf.test --profile debug,test,docker — default test passes (no BAM artifacts without the flag)
- With --save_output_as_bam: BAM files appear in preprocessing/markduplicates/ and preprocessing/recalibrated/, and variant calling runs
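The BAM_TO_CRAM_MAPPING skip condition described in the summary could look like the sketch below (the skip_tools check is an assumption about how markduplicates skipping is flagged, not the pipeline's actual config):

```nextflow
// Sketch: skip the BAM-to-CRAM conversion of mapping output when the user
// asked for BAM and markduplicates is skipped — there is nothing to convert.
process {
    withName: 'BAM_TO_CRAM_MAPPING' {
        ext.when = { !(params.save_output_as_bam && params.skip_tools?.split(',')?.contains('markduplicates')) }
    }
}
```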