The Metagenomics-Toolkit is a scalable, data-agnostic workflow that automates the analysis of short and long metagenomic reads obtained from Illumina and Oxford Nanopore Technologies devices, respectively. Beyond the standard features expected of a metagenome workflow, such as quality control, assembly, binning, and annotation, the Toolkit offers distinctive features: plasmid identification based on several tools, the recovery of unassembled microbial community members, and the discovery of microbial interdependencies through a combination of dereplication, co-occurrence analysis, and genome-scale metabolic modeling. Furthermore, the Metagenomics-Toolkit includes a machine learning-optimized assembly step that tailors the peak RAM requested by a metagenome assembler to its actual requirements, thereby reducing the dependency on dedicated high-memory hardware.
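To illustrate the idea behind the resource-aware assembly step, the following is a minimal sketch of how a predicted peak RAM value could be turned into a memory request. It is not the Toolkit's implementation: the prediction is treated as an input from some model, and the memory tiers, the safety margin, and the function names `choose_memory_request` and `next_tier_after_failure` are all hypothetical, chosen only for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical memory tiers in GiB (e.g. cloud flavors); not taken from the Toolkit.
MEMORY_TIERS_GIB = [16, 32, 64, 128, 256, 512]


def choose_memory_request(predicted_peak_gib, tiers=MEMORY_TIERS_GIB, safety_margin=0.1):
    """Return the smallest tier covering the predicted peak RAM plus a margin.

    predicted_peak_gib is assumed to come from some trained model; this
    sketch does not implement the prediction itself.
    """
    if predicted_peak_gib <= 0:
        raise ValueError("predicted peak RAM must be positive")
    required = predicted_peak_gib * (1.0 + safety_margin)
    index = bisect_left(tiers, required)  # first tier >= required
    if index == len(tiers):
        raise ValueError("no tier is large enough for the predicted peak RAM")
    return tiers[index]


def next_tier_after_failure(failed_request_gib, tiers=MEMORY_TIERS_GIB):
    """Return the next larger tier after an out-of-memory failure, or None."""
    index = bisect_right(tiers, failed_request_gib)  # first tier > failed request
    return tiers[index] if index < len(tiers) else None
```

Under these assumptions, a predicted peak of 50 GiB with a 10% margin would be scheduled on the 64 GiB tier instead of the largest available machine, and an underestimate could be handled by retrying on the next tier.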
Quickstart and documentation can be found here.
Peter Belmann, Benedikt Osterholz, Nils Kleinbölting, Alfred Pühler, Andreas Schlüter, Alexander Sczyrba, Metagenomics-Toolkit: the flexible and efficient cloud-based metagenomics workflow featuring machine learning-enabled resource allocation, NAR Genomics and Bioinformatics, Volume 7, Issue 3, September 2025, lqaf093, https://doi.org/10.1093/nargab/lqaf093