DSL2 Nextflow pipeline for multi-locus ML phylogenomics using BUSCO single-copy orthologs. Wraps PHYling, PhyKIT, ModelTest-NG, IQ-TREE 3, and RAxML-NG into two named workflows:
| Workflow | --mode |
Input | Data type | RAxML-NG bootstraps |
|---|---|---|---|---|
CDS_TREE |
cds_tree |
.fa per taxon |
DNA / CDS | 500 |
PROTEIN_TREE |
protein_tree |
.fa.gz per taxon |
Amino acid | 100 |
Both workflows run AIC and BIC partition schemes through IQ-TREE and RAxML-NG in parallel, producing four independently-supported trees per markerset.
INPUT: one sequence file per taxon
↓ (per markerset, in parallel)
phyling align BUSCO marker search + MSA
phyling filter Remove loci below occupancy threshold (default 80% of taxa)
phyling tree Per-gene FastTree (exploratory gene trees)
↓
phykit create_concat Concatenate filtered MSAs; write partition file
sed AUTO→DNA/PROT Fix data-type for modeltest-ng
↓
modeltest-ng Partitioned model selection → AIC and BIC partition files
↓ (AIC and BIC run in parallel from here)
iqtree3 MF+MERGE Partition merging + ML tree
iqtree3 UFBoot Bootstrap (-B 1000 --alrt 1000 --bnni --wbtl)
raxml-ng --parse Produce binary alignment; extract thread recommendation
raxml-ng --all ML search + bootstraps
nextflow run stajichlab/phyling-phylogenomics \
-profile slurm,ucr_hpcc \
--mode protein_tree \
--input /path/to/pep \
--prefix my_project \
--markerset fungi_odb12,mucoromycota_odb12 \
--outdir results/pepgit clone https://github.com/stajichlab/phyling-phylogenomics
cd phyling-phylogenomics
nextflow run main.nf \
-profile local \
--mode cds_tree \
--input /path/to/cds \
--prefix my_project \
--markerset fungi_odb12 \
--outdir results/test# Install pixi if not already present
curl -fsSL https://pixi.sh/install.sh | bash
# Install all pipeline tools
pixi installTools installed: nextflow, modeltest-ng, iqtree ≥2.3 (provides iqtree3),
raxml-ng from bioconda; phykit and phyling from PyPI.
Module names expected by -profile slurm:
phyling, phykit, modeltest-ng, iqtree, raxml-ng
Profiles can be combined with commas: -profile slurm,ucr_hpcc
| Profile | Executor | Environment | Use when |
|---|---|---|---|
slurm |
SLURM | module load per tool |
HPC with environment modules |
ucr_hpcc |
— | — | UCR HPCC queue names (combine with slurm or pixi_slurm) |
pixi |
local | pixi env | local workstation with pixi |
pixi_slurm |
SLURM | pixi env | HPC without modules; combine with a site profile |
local |
local | tools in PATH | testing |
Queue names are cluster-specific. Create a minimal site config and pass it with -c:
// my_cluster.config
process {
withName: 'PHYLING_ALIGN' { queue = 'bigmem' }
withName: 'MODELTEST_NG|IQTREE_MF|IQTREE_BOOTSTRAP|RAXMLNG_ALL' { queue = 'compute' }
withName: 'PHYLING_FILTER|PHYLING_TREE|PHYKIT_CONCAT|RAXMLNG_PARSE' { queue = 'short' }
}nextflow run stajichlab/phyling-phylogenomics \
-profile slurm -c my_cluster.config \
--mode protein_tree --input pep/ --prefix my_project \
--markerset fungi_odb12| Parameter | Default | Description |
|---|---|---|
--mode |
protein_tree |
Workflow: cds_tree or protein_tree |
--input |
(required) | Directory of .fa (CDS) or .fa.gz (protein) files |
--prefix |
(required) | Base name for output files, e.g. my_project_v1 |
--markerset |
fungi_odb12,mucoromycota_odb12 |
Comma-separated BUSCO lineage names |
--outdir |
results |
Directory for published outputs |
--publish_mode |
copy |
publishDir mode: copy, link, or symlink |
--top_n_to_keep |
80 |
Number of top markers to retain (TOP_N_TOVERR in phyling filter -n) |
--rcluster |
10 |
IQ-TREE partition merging aggressiveness |
--bs_count |
1000 |
IQ-TREE UFBoot replicates (-B) |
--alrt_count |
1000 |
IQ-TREE SH-aLRT replicates (--alrt) |
--pars_trees |
10 |
RAxML-NG parsimony starting trees |
--bs_trees_cds |
500 |
RAxML-NG bootstrap replicates for CDS mode |
--bs_trees_pep |
100 |
RAxML-NG bootstrap replicates for protein mode |
| Markerset | Scope |
|---|---|
fungi_odb10 / fungi_odb12 |
All Fungi (ODB10 / ODB12) |
mucoromycota_odb10 / mucoromycota_odb12 |
Mucoromycota-specific |
basidiomycota_odb10 |
Basidiomycota |
ascomycota_odb10 |
Ascomycota |
Multiple markersets run as independent parallel branches.
nextflow run stajichlab/phyling-phylogenomics \
-profile slurm,ucr_hpcc \
--mode protein_tree \
--input pep/ \
--prefix my_project_v1 \
--markerset fungi_odb12,mucoromycota_odb12 \
--outdir results/pepnextflow run stajichlab/phyling-phylogenomics \
-profile slurm,ucr_hpcc \
--mode cds_tree \
--input cds/ \
--prefix my_project_v1 \
--markerset fungi_odb12,mucoromycota_odb12 \
--outdir results/cdsnextflow run stajichlab/phyling-phylogenomics -resume \
-profile slurm,ucr_hpcc \
--mode protein_tree \
--input pep/ \
--prefix my_project_v1 \
--markerset fungi_odb12,mucoromycota_odb12 \
--outdir results/pepresults/
└── {cds|pep}/
├── align/{markerset}/ phyling align output
├── filter/{markerset}/ filtered .mfa files (one per locus)
├── tree/{markerset}/ per-gene FastTree trees
├── buildtree/{markerset}/
│ ├── *.fa concatenated alignment
│ ├── *.partition partition file (data-type fixed)
│ ├── *.part.aic / *.part.bic ModelTest-NG AIC and BIC schemes
│ ├── *.aic.bs.treefile IQ-TREE AIC bootstrap consensus
│ ├── *.bic.bs.treefile IQ-TREE BIC bootstrap consensus
│ ├── *.aic.raxml.support RAxML-NG AIC support tree
│ └── *.bic.raxml.support RAxML-NG BIC support tree
└── pipeline_info/
├── timeline.html
├── report.html
└── dag.svg
Tree files (.treefile, .raxml.support) are Newick format — open in FigTree, iTOL, or ggtree.
| Measure | Flag | Strong support threshold |
|---|---|---|
| UFBoot2 (ultrafast bootstrap) | -B 1000 |
≥ 95 |
| SH-aLRT | --alrt 1000 |
≥ 80 |
| RAxML-NG bootstrap | --bs-trees |
≥ 70 |
UFBoot values are systematically higher than standard bootstrap — do not compare directly. A clade is robustly supported when all three exceed their thresholds.
phyling-phylogenomics/
├── main.nf entry point
├── nextflow.config params + profiles
├── pixi.toml conda/PyPI environment
├── conf/
│ ├── base.config per-process resource labels
│ └── ucr_hpcc.config UCR HPCC queue assignments (site example)
├── workflows/
│ ├── cds_tree.nf
│ └── protein_tree.nf
└── modules/local/
├── phyling/{align,filter,tree}/
├── phykit/concat/
├── modeltestng/
├── iqtree/{mf,bootstrap}/
└── raxmlng/{parse,all}/
If you use this pipeline, please cite the underlying tools:
- PHYling: Tsai et al. (2026) G3 jkag062 https://doi.org/10.1093/g3journal/jkag062
- PhyKIT: Steenwyk et al. (2021) Bioinformatics 37:2325–2328 https://doi.org/10.1093/bioinformatics/btab096
- ModelTest-NG: Darriba et al. (2020) Mol Biol Evol 37:291–294
- IQ-TREE 3: Minh et al. (2020) Mol Biol Evol 37:1530–1534
- RAxML-NG: Kozlov et al. (2019) Bioinformatics 35:4453–4455
- BUSCO: Manni et al. (2021) Mol Biol Evol 38:4647–4654