feat: bcftools fixploidy wrapper#5176

Open
tdayris wants to merge 10 commits into snakemake:master from tdayris:bcftools_fixploidy

@tdayris
Contributor

@tdayris tdayris commented Mar 23, 2026

This PR adds a wrapper for BCFTools' plugin fixploidy.

This plugin does not handle compression by itself, and therefore needs to be wrapped with bcftools view commands to handle gzipped VCF and BCF files.

The parameters from the BCFTools Snakemake helper function are forwarded to the bcftools view compression step, not to the plugin itself. The plugin does not have any of the common IO options present in BCFTools' main tools.
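
The wrapping described above could be sketched as follows. This is only an illustration under the premise stated in this description (the plugin itself reads and writes plain VCF streams); `view_output_flag` and `build_pipeline` are hypothetical names, not the wrapper's actual code.

```python
def view_output_flag(path: str) -> str:
    """Map an output file extension to a `bcftools view` output-type flag."""
    if path.endswith(".bcf"):
        return "-Ob"  # compressed BCF
    if path.endswith(".vcf.gz"):
        return "-Oz"  # bgzip-compressed VCF
    return "-Ov"      # plain-text VCF


def build_pipeline(incall: str, outcall: str, extra: str = "") -> str:
    """Return a shell pipeline: decode input, run the plugin, encode output.

    Assumption: the plugin reads VCF from stdin and writes VCF to stdout,
    so both ends are handled by `bcftools view`.
    """
    plugin = " ".join(p for p in ("bcftools plugin fixploidy", extra) if p)
    stages = [
        f"bcftools view {incall}",
        plugin,
        f"bcftools view {view_output_flag(outcall)} -o {outcall}",
    ]
    return " | ".join(stages)
```

For instance, `build_pipeline("a.vcf.gz", "b.bcf", "-- -f 2")` yields a three-stage pipeline whose last stage writes BCF. Each `bcftools view` stage is its own process, which is why such a layout would need more than one thread.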

QC

While the contributions guidelines are more extensive, please particularly ensure that:

  • test.py was updated to call any added or updated example rules in a Snakefile
  • input: and output: file paths in the rules can be chosen arbitrarily
  • wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:)
  • temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to
  • the meta.yaml contains a link to the documentation of the respective tool or command under url:
  • conda environments use a minimal amount of channels and packages, in recommended ordering

Summary by CodeRabbit

  • New Features
    • Added a bcftools fixploidy wrapper with configurable extra options and multi-threaded support for compressed and uncompressed VCF/BCF IO.
  • Chores
    • Added reproducible Conda environment definitions (YAML + explicit pin) with pinned packages including bcftools 1.23 and Python 3.14.
  • Tests
    • Added an automated wrapper test plus an example test rule/input to validate end-to-end behavior.

@coderabbitai
Contributor

coderabbitai bot commented Mar 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a Snakemake wrapper for bcftools plugin fixploidy with Conda environment specs, wrapper metadata, implementation, and tests; the wrapper constructs and executes the bcftools plugin fixploidy command using options derived from snakemake_wrapper_utils and snakemake.params.extra.

Changes

Cohort / File(s) Summary
Environment configuration
bio/bcftools/fixploidy/environment.yaml, bio/bcftools/fixploidy/environment.linux-64.pin.txt
Adds a Conda environment spec and an explicit linux-64 pin file enumerating pinned packages (e.g., bcftools=1.23, Python 3.14.x, htslib and runtime artifacts).
Wrapper metadata
bio/bcftools/fixploidy/meta.yaml
Adds wrapper manifest: name, upstream URL, author, description, input/output formats, and an extra parameter declaration; notes threading requirement for IO compression.
Wrapper implementation
bio/bcftools/fixploidy/wrapper.py
New wrapper that imports get_bcftools_opts, computes bcftools_opts with parse_ref=False, parse_memory=False, reads params.extra, formats Snakemake logging, and runs bcftools plugin fixploidy {bcftools_opts} {extra} {input} {log} via shell.
Tests (rule + data)
bio/bcftools/fixploidy/test/Snakefile, bio/bcftools/fixploidy/test/a.bcf
Adds a Snakemake test rule (test_bcftools_fixploidy) using threads=3 and a test input reference file.
Test runner
test_wrappers.py
Adds pytest entry test_bcftools_fixploidy(run) that invokes the wrapper test with --cores 3 and --use-conda.

Sequence Diagram

sequenceDiagram
    participant User
    participant Snakemake
    participant Wrapper as Wrapper (wrapper.py)
    participant Bcftools as Bcftools (plugin fixploidy)
    participant FS as FileSystem

    User->>Snakemake: run test_bcftools_fixploidy (--cores 3, --use-conda)
    Snakemake->>Wrapper: invoke with input=a.bcf, output=a.fixed.bcf, params.extra
    Wrapper->>Wrapper: get_bcftools_opts(parse_ref=False, parse_memory=False)
    Wrapper->>Wrapper: read params.extra and build log flags
    Wrapper->>Bcftools: execute `bcftools plugin fixploidy {opts} {extra} {input}`
    Bcftools->>FS: read input and write output stream
    Bcftools-->>Wrapper: exit status / logs
    Wrapper-->>Snakemake: return status and logs
    Snakemake-->>User: report results

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: bcftools fixploidy wrapper' directly and clearly describes the main change: adding a new Snakemake wrapper for the BCFTools fixploidy plugin.
Description check ✅ Passed The description explains the purpose of the wrapper, its technical constraints (compression handling, lack of IO options), and confirms all QC checklist items are completed with proper documentation and test updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bio/bcftools/fixploidy/wrapper.py`:
- Around line 17-21: The code inconsistently accesses inputs: it assigns incall
from snakemake.input[0] but checks snakemake.input["call"]; update the logic to
use the same named input key for both operations (e.g., read incall from
snakemake.input["call"] before testing its extension and potentially wrapping it
with "< <(bcftools view {})"), and adjust min_threads and incall only based on
that named entry to ensure the extension test and the file used match.
- Around line 23-29: The code initializes outcall from snakemake.output[0] but
then checks snakemake.output["call"], causing inconsistent access; change the
initialization to use the named key (set outcall = snakemake.output["call"]) so
both the assignment and the endswith check use snakemake.output["call"] (keep
the rest of the logic using bcftools_opts and min_threads unchanged), and ensure
the named key "call" is present in snakemake.output before use.
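
One way to avoid the mismatch flagged in both comments is to resolve the path once and derive everything else from that single value. A minimal sketch, assuming a hypothetical helper `wrap_input` and the `< <(bcftools view …)` wrapping mentioned above; the thread accounting is an assumption, not the PR's code:

```python
def wrap_input(path: str) -> tuple[str, int]:
    """Return (shell fragment, extra threads) for one input path.

    Hypothetical helper: the extension test and the file that is
    actually read both use the same `path` value.
    """
    if path.endswith((".bcf", ".vcf.gz")):
        # Compressed input: decode through a bash process substitution,
        # which costs one additional thread for `bcftools view`.
        return f"< <(bcftools view {path})", 1
    # Plain VCF: redirect the file directly, no helper process needed.
    return f"< {path}", 0


# In a wrapper this would be fed from a single lookup, for example:
# incall, extra_threads = wrap_input(snakemake.input["call"])
```

The same pattern would apply to the output side: read `snakemake.output["call"]` once, then test and use that variable.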

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 672a3cb6-ed63-4ca5-b972-78552cc1033e

📥 Commits

Reviewing files that changed from the base of the PR and between 85f00cc and a74314f.

📒 Files selected for processing (3)
  • bio/bcftools/fixploidy/meta.yaml
  • bio/bcftools/fixploidy/test/Snakefile
  • bio/bcftools/fixploidy/wrapper.py
✅ Files skipped from review due to trivial changes (2)
  • bio/bcftools/fixploidy/meta.yaml
  • bio/bcftools/fixploidy/test/Snakefile

@fgvieira
Collaborator

But doesn't bcftools already support reading/writing bcf and vcf.gz?

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bio/bcftools/fixploidy/wrapper.py`:
- Line 15: The bcftools plugin call in wrapper.py places the input file after
`{extra}`, which can break plugin option parsing; update the shell invocation
(the shell(...) call invoking "bcftools plugin fixploidy") to move
`{snakemake.input[0]}` immediately after `{bcftools_opts}` and before `{extra}`
so that the input file is passed before the `--`/plugin options (keep `{log}` at
the end).
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 437cd957-28b4-476d-b343-f2780458408f

📥 Commits

Reviewing files that changed from the base of the PR and between c208e23 and 26bc07b.

📒 Files selected for processing (1)
  • bio/bcftools/fixploidy/wrapper.py

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

shell("bcftools plugin fixploidy {bcftools_opts} {extra} {snakemake.input[0]} {log}")
Contributor

@coderabbitai coderabbitai bot Mar 27, 2026

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n bio/bcftools/fixploidy/wrapper.py

Repository: snakemake/snakemake-wrappers

Length of output: 692


🏁 Script executed:

find bio/bcftools/fixploidy -type f

Repository: snakemake/snakemake-wrappers

Length of output: 268


🏁 Script executed:

# Search for bcftools fixploidy usage patterns in the repository
rg "fixploidy" -A 5 -B 5

Repository: snakemake/snakemake-wrappers

Length of output: 2153


🏁 Script executed:

cat -n bio/bcftools/fixploidy/test/Snakefile

Repository: snakemake/snakemake-wrappers

Length of output: 394


🏁 Script executed:

# Search for other bcftools plugin wrappers to see argument ordering patterns
find bio/bcftools -name "wrapper.py" -type f | head -10 | xargs -I {} sh -c 'echo "=== {} ===" && cat {}'

Repository: snakemake/snakemake-wrappers

Length of output: 7360


🏁 Script executed:

# Check bcftools documentation for fixploidy plugin argument parsing
web_search "bcftools fixploidy plugin argument parsing order"

Repository: snakemake/snakemake-wrappers

Length of output: 122


🌐 Web query:

bcftools fixploidy plugin documentation argument parsing

💡 Result:

The bcftools +fixploidy plugin uses standard C getopt_long for argument parsing in its init function. Plugin options (after the "--" separator) are parsed using a static struct option long_options[] defining:

  • {"default-ploidy",1,0,'d'}
  • {"force-ploidy",1,0,'f'}
  • {"ploidy",1,0,'p'}
  • {"samples",1,0,'s'}

Short options: -d, -f, -p, -s. The getopt_long loop processes these:

  • case 'd': default_ploidy = strtod(optarg,&tmp);
  • case 'f': force_ploidy = strtod(optarg,&tmp);
  • case 'p': ploidy_fname = strdup(optarg);
  • case 's': sex_fname = strdup(optarg);

It initializes ploidy with defaults for chrX PAR regions if no -p file given, reads samples/sex from -s file, maps samples to sex, and sets sex-to-ploidy mapping.

To view full usage: bcftools +fixploidy -h

Source code: https://github.com/samtools/bcftools/blob/develop/plugins/fixploidy.c

Citations:


🌐 Web query:

bcftools plugin command line syntax "--" separator

💡 Result:

In BCFtools, the "--" separator is used in plugin command lines to distinguish between general/common options (shared across many bcftools commands, such as -o, -O, -r) that come before it, and plugin-specific options that come after it.

The general syntax for running a plugin is:

bcftools +plugin_name [general options] input.vcf -- [plugin options]

Examples from official documentation:

  • bcftools +fill-tags in.bcf -Ob -o out.bcf -- -t AN,AC
  • bcftools +split-vep file.vcf.gz -- -a BCSQ
  • bcftools +tag2tag in.vcf -- --LXX-to-XX -s 3

Some plugins do not accept general options and their usage is bcftools +plugin_name [input] [plugin options], without the "--" separator. Always check the plugin's usage with bcftools +plugin_name -h.

To use plugins, set BCFTOOLS_PLUGINS=/path/to/bcftools/plugins or use full path like bcftools +/path/to/plugin.so.

Citations:


🏁 Script executed:

# Search for actual usage examples of bcftools plugin fixploidy in the codebase
rg "bcftools.*fixploidy" -A 3 -B 3

Repository: snakemake/snakemake-wrappers

Length of output: 1627


🏁 Script executed:

# Check bcftools documentation examples and common patterns
web_search "bcftools +fixploidy usage example input file position"

Repository: snakemake/snakemake-wrappers

Length of output: 122


🌐 Web query:

bcftools +fixploidy usage example input file position argument order

💡 Result:

The standard usage of bcftools +fixploidy follows the general bcftools plugin syntax: bcftools +fixploidy [general options] [input.vcf[.gz]] [-- [plugin options]].

Input file position: The VCF/BCF input file is provided as a positional argument after general options (like -O u for uncompressed BCF output) and before the -- separator. If omitted, it reads from standard input (stdin).

Examples from source code and usage:

  • bcftools +fixploidy in.vcf -- -s samples.txt
    Here, "in.vcf" is the input file.
  • bcftools +fixploidy haploid.vcf -- -f 2 > diploid.vcf
    Here, "haploid.vcf" is the input file, -f 2 forces diploid ploidy for all sites.

General syntax template:

bcftools +fixploidy -Ou -o output.bcf input.vcf -- -s samples.txt -p ploidy.txt

Plugin options (after --):

  • -d, --default-ploidy [default: 2]
  • -f, --force-ploidy Ignore -p, set same ploidy everywhere
  • -p, --ploidy CHROM FROM TO SEX PLOIDY
  • -s, --sex NAME SEX (M/F)
  • -t, --tags [GT]

Ploidy file format example:

X 1 60000 M 1
X 2699521 154931043 M 1
Y 1 59373566 M 1
Y 1 59373566 F 0

Sex file example:

sampleName1 M

Citations:


Move input file before {extra} to follow bcftools plugin argument syntax.

The bcftools fixploidy plugin requires input files to be positioned before the -- separator and plugin-specific options. If {extra} contains plugin options (e.g., -- -s samples.txt), the current order places the input file after these options, causing incorrect parsing.

Suggested fix
-shell("bcftools plugin fixploidy {bcftools_opts} {extra} {snakemake.input[0]} {log}")
+shell("bcftools plugin fixploidy {bcftools_opts} {snakemake.input[0]} {extra} {log}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bio/bcftools/fixploidy/wrapper.py` at line 15, The bcftools plugin call in
wrapper.py places the input file after `{extra}`, which can break plugin option
parsing; update the shell invocation (the shell(...) call invoking "bcftools
plugin fixploidy") to move `{snakemake.input[0]}` immediately after
`{bcftools_opts}` and before `{extra}` so that the input file is passed before
the `--`/plugin options (keep `{log}` at the end).

Collaborator

@tdayris is this relevant?

Contributor

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

shell("bcftools plugin fixploidy {bcftools_opts} {extra} {snakemake.input[0]} {log}")
Collaborator

Is this what it is suggesting?

Suggested change
-shell("bcftools plugin fixploidy {bcftools_opts} {extra} {snakemake.input[0]} {log}")
+shell("bcftools plugin fixploidy {bcftools_opts} {snakemake.input[0]} -- {extra} {log}")
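
A hard-coded `--` would be doubled if a user already writes `-- -f 2` in `extra`. The sketch below is one possible way to handle both cases; `build_fixploidy_cmd` is a hypothetical helper, and it assumes that `extra` carries only plugin options (general bcftools options would have to go into `bcftools_opts`).

```python
import shlex


def build_fixploidy_cmd(bcftools_opts: str, input_path: str,
                        extra: str = "", log: str = "") -> str:
    """Assemble the command with the input file before the `--` separator."""
    tokens = shlex.split(extra)
    # Add the separator only when the user supplied plugin options
    # without one; an existing `--` is left untouched.
    if tokens and "--" not in tokens:
        extra = "-- " + extra.strip()
    parts = ["bcftools plugin fixploidy", bcftools_opts,
             shlex.quote(input_path), extra.strip(), log]
    # Drop empty pieces so the command contains no stray double spaces.
    return " ".join(p for p in parts if p)
```

With this layout, `extra="-f 2"` and `extra="-- -f 2"` produce the same command line, and an empty `extra` produces no separator at all.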

2 participants