ADR: module tool arguments by pditommaso · Pull Request #6770 · nextflow-io/nextflow

pditommaso · 2026-01-28T13:45:21Z

Summary

Introduces ADR for typed tool arguments in module definitions, replacing the convoluted task.ext.args pattern.

Problem: Current ext.args approach uses opaque strings with no validation, documentation, or IDE support.

Solution: Extend tools section in meta.yaml with typed args component:

tools:
  - bwa:
      args:
        K:
          type: integer
          prefix: '-'
        Y:
          type: boolean
          prefix: '-'

Key features:

Argument attributes: type, enum, prefix, default, description
Script usage: ${tools.bwa.args} for all args, ${tools.bwa.args.K} for single
Config: tools.bwa.args.K = 100000000
CLI: --tools.bwa.K=value
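
To make the intended rendering concrete, here is a minimal sketch of how a typed spec could be validated and expanded into a command-line string. It is written in Python for illustration only, not as the Nextflow implementation; the names `BWA_ARGS_SPEC`, `render_arg`, and `render_args` are hypothetical, and the spec dict simply mirrors the YAML above.

```python
from typing import Any

# Hypothetical in-memory form of the `args` section shown in the YAML above.
BWA_ARGS_SPEC = {
    "K": {"type": "integer", "prefix": "-"},
    "Y": {"type": "boolean", "prefix": "-"},
}

# Assumed mapping from spec type names to Python types.
_PY_TYPES = {"integer": int, "boolean": bool, "string": str}


def render_arg(name: str, value: Any, spec: dict) -> str:
    """Validate one value against its spec entry and format it as a CLI fragment."""
    entry = spec[name]
    expected = _PY_TYPES[entry["type"]]
    # bool is a subclass of int in Python, so reject it explicitly for integers.
    wrong_type = not isinstance(value, expected)
    if wrong_type or (expected is int and isinstance(value, bool)):
        raise TypeError(f"{name}: expected {entry['type']}, got {type(value).__name__}")
    if "enum" in entry and value not in entry["enum"]:
        raise ValueError(f"{name}: {value!r} is not one of {entry['enum']}")
    flag = f"{entry['prefix']}{name}"
    if expected is bool:
        # Boolean options are bare flags: present when true, omitted when false.
        return flag if value else ""
    return f"{flag} {value}"


def render_args(values: dict, spec: dict) -> str:
    """Merge spec defaults with user values and render all args as one string."""
    merged = {n: e["default"] for n, e in spec.items() if "default" in e}
    merged.update(values)
    fragments = (render_arg(n, v, spec) for n, v in merged.items())
    return " ".join(f for f in fragments if f)


print(render_args({"K": 100000000, "Y": True}, BWA_ARGS_SPEC))
```

In this sketch `render_args` plays the role of `${tools.bwa.args}` and `render_arg` the role of `${tools.bwa.args.K}`.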

Open problems:

Subcommand argument collision (same option in different subcommands)

Introduces specification for typed tool arguments in module definitions:
- Extend tools section in meta.yaml with args component
- Argument attributes: type, enum, prefix, default, description
- Script usage via tools implicit variable
- Configuration and CLI override mechanisms
- Migration guide from ext.args pattern

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

netlify · 2026-01-28T13:45:28Z

✅ Deploy Preview for nextflow-docs-staging canceled.

Name	Link
🔨 Latest commit	`0c38de2`
🔍 Latest deploy log	https://app.netlify.com/projects/nextflow-docs-staging/deploys/697a12f5a6cdbe0008ab9111

pinin4fjords · 2026-01-28T14:34:27Z

The typed validation and IDE support would be really nice for common options.

However, the rationale behind ext.args in nf-core is deliberately to avoid maintaining option lists. From the module guidelines:

"The justification behind using ext.args is to provide more flexibility to users. As ext.args is derived from the configuration, advanced users can overwrite the default ext.args and supply their own arguments to modify the behaviour of a module. This can increase the capabilities of a pipeline beyond what the original developers intended."

This means users can pass any tool option without waiting for it to be enumerated, and enables patterns like dynamic closures and sample-specific args:

// Parameter-dependent
ext.args = { params.fastqc_kmer_size ? "-k ${params.fastqc_kmer_size}" : '' }

// Sample-specific
ext.args = { "--id ${meta.id}" }

My concern is how we match this flexibility without either:

Keeping meta.yaml in sync with complex, shifting tool parameter landscapes, or
Cutting off access to options that aren't enumerated

I noticed the earlier module-system ADR had this language: "This list does not need to be exhaustive. It should include any arguments known to be used by pipelines or that could be expected to be used by users." - but this ADR doesn't clarify what happens with unlisted options.

Could we take a **kwargs-style approach? In Python, functions can define explicit parameters while still accepting arbitrary additional arguments:

def func(a, b, **kwargs):
    # a and b are typed/documented
    # kwargs catches everything else
    ...

Similarly:

tools.bwa.args.K = 100000000  // documented in meta.yaml → validated
tools.bwa.args.B = 3          // not in meta.yaml → passed through as-is

The challenge is command-line formatting - documented args have prefix defined in meta.yaml, but for undocumented args we wouldn't know how to format them. Options:

Require prefix in the key for undocumented args:

tools.bwa.args.K = 100000000       // documented → uses prefix from meta.yaml
tools.bwa.args['-B'] = 3           // undocumented → prefix explicit in key

Single passthrough field for pre-formatted args:

tools.bwa.args.K = 100000000            // documented, validated
tools.bwa.args._passthrough = '-B 3'    // raw string

Option 2 is cleaner in terms of explicit expectations, but it's essentially ext.args again. Maybe that's unavoidable - the realistic goal being typed/validated/documented args for common options, with a raw passthrough for everything else?
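
To show how the two options could coexist, here is a rough Python sketch. The `build_args` function, its parameters, and the error behaviour are all hypothetical; it only models the idea of validating documented args while letting undocumented ones through.

```python
def build_args(values: dict, documented: dict, passthrough: str = "") -> str:
    """Render documented args with validation, undocumented ones as given.

    `documented` stands in for the args declared in meta.yaml.
    `passthrough` models option 2: a raw, pre-formatted string.
    """
    parts = []
    for key, value in values.items():
        if key in documented:
            entry = documented[key]
            # Only integers are checked here to keep the sketch short.
            if entry["type"] == "integer" and (
                isinstance(value, bool) or not isinstance(value, int)
            ):
                raise TypeError(f"{key}: expected integer, got {value!r}")
            parts.append(f"{entry['prefix']}{key} {value}")
        elif key.startswith("-"):
            # Option 1: undocumented arg, the prefix is explicit in the key.
            parts.append(f"{key} {value}")
        else:
            raise KeyError(f"undocumented arg {key!r} needs an explicit prefix")
    if passthrough:
        # Option 2: appended verbatim, with no validation at all.
        parts.append(passthrough)
    return " ".join(parts)


documented = {"K": {"type": "integer", "prefix": "-"}}
print(build_args({"K": 100000000, "-B": 3}, documented))
print(build_args({"K": 100000000}, documented, passthrough="-B 3"))
```

Both calls produce the same command-line fragment; the difference is only in how much structure the undocumented part keeps.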

ewels · 2026-01-28T15:57:50Z

Do we need the prefix stuff at all? If we have YAML dict keys that are in quotes we can just have the full flag, no?

tools:
  - bwa:
      args:
        "-K":
          type: integer

tools.bwa.args['-K'] = 3

I like the simplicity of that. Then in the module we just do $args and the list gets collapsed nicely and we don't need to worry about what keys were in it.
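
As a rough illustration of the "collapse" step, here is a Python sketch that joins a map keyed by literal flags into a single string. The `collapse` helper and the handling of booleans are assumptions, not part of the proposal.

```python
import shlex


def collapse(args: dict) -> str:
    """Join a {flag: value} map into one shell-safe string.

    Assumed conventions: True means a bare flag, False or None means
    the flag is omitted, anything else is rendered as `flag value`.
    """
    tokens = []
    for flag, value in args.items():
        if value is False or value is None:
            continue
        tokens.append(flag)
        if value is not True:
            tokens.append(str(value))
    # shlex.join quotes any token that contains shell-special characters.
    return shlex.join(tokens)


print(collapse({"-K": 3, "-Y": True, "-M": False}))
```

Because the key already is the full flag, no prefix lookup is needed and undocumented flags are handled the same way as documented ones.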

bentsherman · 2026-01-28T19:28:37Z

It might be enough to just have tools.bwa.args as a drop-in replacement for ext.args, and use the spec only as a "documentation hint" for users / agents. It doesn't have to be exhaustive, but the more options you document, the better

The tool spec could just be a list of argument descriptions. Using nf-core/bwa/mem as an example:

tools:
  - bwa:
      documentation: https://bio-bwa.sourceforge.net/bwa.shtml
      args:
        - shortName: '-t'
          longName: '--threads' # bwa mem doesn't actually have long option names, this is just to illustrate
          type: integer
          default: 1
          description: 'Number of threads'
        - shortName: '-k'
          type: integer
          default: 19
          description: 'Minimum seed length'
        # ...

So, users and agents would still construct the desired string of args, but could use the tool spec for guidance. No need for an additional layer of typed parameters, which might introduce more rough edges
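
For example, a tool spec in this list form could be turned into a short help text with very little code. The Python sketch below is only an illustration; `ARGS` mirrors the YAML above (including the illustrative `--threads` long name) and `describe` is a hypothetical helper.

```python
# Mirrors the two documented entries from the YAML example above.
ARGS = [
    {"shortName": "-t", "longName": "--threads", "type": "integer",
     "default": 1, "description": "Number of threads"},
    {"shortName": "-k", "type": "integer",
     "default": 19, "description": "Minimum seed length"},
]


def describe(args: list) -> str:
    """Render one human-readable line per documented argument."""
    lines = []
    for arg in args:
        # Show whichever of the short and long names are present.
        names = ", ".join(
            n for n in (arg.get("shortName"), arg.get("longName")) if n
        )
        lines.append(
            f"{names} <{arg['type']}> (default: {arg.get('default')}): "
            f"{arg['description']}"
        )
    return "\n".join(lines)


print(describe(ARGS))
```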

Note that nf-core/bwa/mem already has a documentation URL, which just provides the man-page for bwa. So in this case, an agent might not even need the tool spec if it can just consult the tool documentation. But other tools might not be as well documented, and either way it can be nice to have the most useful options documented locally

pditommaso · 2026-01-29T09:16:03Z

Agree with both of the last comments. We could explicitly target only GNU CLI conventions, and infer - vs -- from the length of the option name.
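
The length-based inference could be as small as the Python sketch below. The `gnu_flag` name is hypothetical, and the rule only holds for tools that follow the GNU convention; tools with single-dash long options (for example `find -name`) would not fit.

```python
def gnu_flag(name: str) -> str:
    """Infer the dash prefix from the option name length.

    One-character names get a single dash, longer names a double dash.
    """
    if not name or name.startswith("-"):
        raise ValueError(f"expected a bare option name, got {name!r}")
    return f"-{name}" if len(name) == 1 else f"--{name}"


print(gnu_flag("K"))
print(gnu_flag("threads"))
```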

Agree, also the list should not be exhaustive, it's meant to expose the tool options that can be used to parametrise the process execution.

What I dislike in the current form is the "subcommand argument collision":

Subcommand argument collision
A tool having the same option name in two different subcommands cannot be managed with the current design. Arguments are defined at the tool level, not at the subcommand level.
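
A small Python sketch of why a tool-level map cannot represent this. The subcommands `align` and `index` and the two meanings of `-k` are invented for illustration; they do not describe any real tool.

```python
# Tool-level definitions: one flat map keyed by option name.
flat = {}
flat["-k"] = {"type": "integer", "description": "seed length (align)"}
# The second definition silently replaces the first one.
flat["-k"] = {"type": "string", "description": "key file (index)"}
print(len(flat), flat["-k"]["type"])

# Keying by subcommand first keeps both definitions apart.
nested = {
    "align": {"-k": {"type": "integer", "description": "seed length"}},
    "index": {"-k": {"type": "string", "description": "key file"}},
}


def lookup(spec: dict, subcommand: str, option: str) -> dict:
    """Resolve an option within the scope of one subcommand."""
    return spec[subcommand][option]


print(lookup(nested, "align", "-k")["type"])
print(lookup(nested, "index", "-k")["type"])
```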

bentsherman · 2026-01-29T15:51:51Z

From our discussion today, people seem to like the prospect of specifying args with a map. Phil's example using the literal option name seems like the safest approach:

tools:
  - bwa:
      args:
        "-K":
          type: integer

tools.bwa.args['-K'] = 3

With a method like tools.bwa.args.toString() to concatenate the args.

Some challenges that came up:

Converting between floating-point numbers and strings can create artifacts. But I just remembered, we can just parse floats as BigDecimal and it will be safe
We could provide tools.bwa.args.K as a further shorthand, but this is not as flexible, and might not be worth adding due to the "multiple ways to do the same thing" problem
Positional args vs named args. Most tools expect positional args after named args, but I don't think we can assume this. We could provide e.g. tools.bwa.args (list of positional args) and tools.bwa.kwargs (map of named args) so that the user can inject them independently.
Subcommand argument collision. To Paolo's comment above, this could be handled by treating the command and subcommand as separate tools, e.g. tools.bwa.args vs tools.bwa.subcommand.args
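
Two of these points can be sketched quickly in Python, using `Decimal` as a stand-in for BigDecimal. The `render_command` helper and the file names are hypothetical, and placing positional args last is just one possible choice.

```python
from decimal import Decimal

# Binary floats can change the text the user wrote; decimals keep it.
print(str(float("0.10")), str(Decimal("0.10")))
print(str(0.1 + 0.2), str(Decimal("0.1") + Decimal("0.2")))


def render_command(tool: str, kwargs: dict, args: list) -> str:
    """Build a command from named args (a map) and positional args (a list).

    The two collections are kept separate so the module author decides
    where each one is injected; here positional args simply come last.
    """
    tokens = [tool]
    for flag, value in kwargs.items():
        tokens += [flag, str(value)]
    tokens += [str(a) for a in args]
    return " ".join(tokens)


print(render_command("bwa mem", {"-K": 100000000}, ["ref.fa", "reads.fq"]))
```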

bentsherman · 2026-01-29T15:56:41Z

The args / kwargs distinction actually addresses my fundamental concern about trying to model CLI args as a map. Since CLI args are just a list of strings at the end of the day, users can always fall back on the args list if they need to, while using the kwargs map where they can.

ewels mentioned this pull request Jan 28, 2026

ADR: module parameters #6769

Draft

bentsherman changed the title ~~Add tools arguments ADR~~ ADR: module tool arguments Jan 28, 2026

pditommaso wants to merge 1 commit into master from 260128-tools-arguments

4 participants
