HyDRA – Hybrid De novo Assembly and Resistance Analysis

HyDRA (Hybrid De novo Assembly and Resistance Analysis) is a user-friendly Snakemake workflow for the analysis of whole-genome sequencing (WGS) data. It integrates multiple bioinformatics tools to cover the key steps of WGS analysis: quality control of sequencing reads, hybrid assembly, taxonomic classification, gene prediction and annotation, and identification of plasmids and antibiotic resistance genes (ARGs).

Key Features

Comprehensive Quality Control: Assess and filter the sequencing reads to remove errors and low-quality data, so that downstream analysis runs on reliable input.

Hybrid Assembly: Combine short and long sequencing reads to create more accurate and complete genome assemblies, leveraging the strengths of both read types.

Taxonomic Classification: Assign taxonomic labels to the isolate.

Gene Prediction and Annotation: Identify gene locations within a genome assembly and annotate their functions, to help understand the biological roles and interactions of the genes.

Plasmid Identification: Detect and characterize plasmid DNA within a genomic dataset and differentiate it from chromosomal DNA, to study its role in horizontal gene transfer and other functions.

Antibiotic Resistance Gene Identification: Detect and characterize antibiotic resistance genes within the sample, providing insight into its antimicrobial resistance profile.

Overview

%%{init: {
   'theme':'base',
   'themeVariables': {
      'secondaryColor': '#fff',
      'tertiaryColor': '#fff',
      'tertiaryBorderColor' : '#fff'}
   }}%%

flowchart TB;

   subgraph " "
      direction TB

      %% Nodes
      A[/short reads/]
      B[/long reads/]
      C["<b>Quality Control</b> <br> <i>fastp, FastQC, Porechop, Chopper</i>"]
      D[/MultiQC report/]
      E["<b>Hybrid <i>de novo</i> Assembly</b> <br> <i>Unicycler</i>"]
      F["<b>Assembly Control</b> <br> <i>CheckM2, QUAST</i>"]
      G[/Genome Assembly/]
      H["<b>Gene Prediction & Annotation</b> <br> <i>Prokka</i>"]
      I["<b>Taxonomic Classification</b> <br> <i>GTDB-Tk</i>"]
      J["<b>Plasmid Identification</b> <br> <i>geNomad</i>"]
      K["<b>Resistance Analysis</b> <br> <i>CARD & RGI</i>"]
      
      

      %% input & output node design
      classDef in_output fill:#fff,stroke:#cde498,stroke-width:4px
      class A,B,D,G in_output
      %% rule node design
      classDef rule fill:#cde498,stroke:#000
      class C,E,F,H,I,J,K rule

      %% Node links
      A --> C
      B --> C
      C --> E
      C --- D
      E --> F
      E ---- G
      G --> H
      G --> I
      G --> J
      G --> K

   end
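
The stage order implied by the arrows in the diagram can also be written down as a small dependency table. The following is only an illustrative sketch: the stage names are hypothetical labels chosen here, not the rule names used by the workflow, and it assumes Python 3.9 or newer for the standard-library graphlib module.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on, following the arrows
# in the diagram. The stage names are hypothetical labels, not rule names.
DEPENDS_ON = {
    "quality_control": set(),
    "hybrid_assembly": {"quality_control"},
    "assembly_control": {"hybrid_assembly"},
    "annotation": {"hybrid_assembly"},
    "taxonomic_classification": {"hybrid_assembly"},
    "plasmid_identification": {"hybrid_assembly"},
    "resistance_analysis": {"hybrid_assembly"},
}

# static_order() yields every stage after all of its dependencies.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(order)
```

Quality control comes first, assembly second, and the remaining stages only need the finished assembly, so they can run in any order relative to each other.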

Usage

Preparations

To prepare the workflow:

- Clone it to your desired working directory via git or your preferred IDE
- Edit the config/config.yaml file:
  - Specify the run date or project name (run_name)
- Provide the sample information in the config/pep/samples.csv file, keeping the header and the .csv format:

sample_name,long,short1,short2,assembly
sample1,path/to/your/long_read/fastq/sample1.fastq.gz,path/to/your/short_read/fastq/sample1_R1.fastq.gz,path/to/your/short_read/fastq/sample1_R2.fastq.gz,path/to/your/assembly/sample1.fna.gz
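
Before starting a run, it can help to check that the sample sheet still has the expected header. The sketch below is not part of HyDRA; it reads a samples.csv-style file with Python's csv module and compares the header with the one shown above. The function name read_samples, the duplicate-name check, and the example paths are assumptions made for illustration.

```python
import csv
import io

# Column names taken from the samples.csv header shown above.
EXPECTED_HEADER = ["sample_name", "long", "short1", "short2", "assembly"]


def read_samples(handle):
    """Parse a samples.csv-style file and return one dict per sample."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = list(reader)
    # Assumption: every sample_name should appear only once.
    names = [row["sample_name"] for row in rows]
    if len(names) != len(set(names)):
        raise ValueError("duplicate sample_name entries")
    return rows


# Inline example data with hypothetical paths, used instead of a real file.
example = io.StringIO(
    "sample_name,long,short1,short2,assembly\n"
    "sample1,reads/sample1.fastq.gz,reads/sample1_R1.fastq.gz,"
    "reads/sample1_R2.fastq.gz,\n"
)
samples = read_samples(example)
print(samples[0]["sample_name"], samples[0]["short1"])
```

For a real file, pass an open handle instead, for example `open("config/pep/samples.csv", newline="")`.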

Generate samples.csv

Instead of filling in samples.csv manually, you can generate it. Run the Python script config/sample_file_generator.py to generate a samples.csv for all files in a folder. The arguments folder_input, assembly_type_input and file_output have to be set. assembly_type_input accepts exactly the same values as assembly_type (see Settings). Example:

python config/sample_file_generator.py /path/to/short/read/data short config/pep/samples.csv
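
To illustrate what generating a sample sheet from a folder of short reads involves, here is a minimal sketch. It is not the code of config/sample_file_generator.py. It assumes paired files are named with the `_R1.fastq.gz` and `_R2.fastq.gz` suffixes used in the example row above, and the function name short_read_rows is hypothetical.

```python
from pathlib import PurePosixPath

HEADER = ["sample_name", "long", "short1", "short2", "assembly"]


def short_read_rows(paths):
    """Group *_R1/*_R2 FASTQ paths into one samples.csv row per sample."""
    pairs = {}
    for path in sorted(paths):
        name = PurePosixPath(path).name
        for tag, column in (("_R1", "short1"), ("_R2", "short2")):
            marker = f"{tag}.fastq.gz"
            if name.endswith(marker):
                # The sample name is the file name without the read suffix.
                sample = name[: -len(marker)]
                pairs.setdefault(sample, {})[column] = path
    rows = []
    for sample, reads in sorted(pairs.items()):
        # Columns not used for short reads (long, assembly) stay empty.
        rows.append(
            [sample, "", reads.get("short1", ""), reads.get("short2", ""), ""]
        )
    return rows


rows = short_read_rows([
    "/data/sample1_R2.fastq.gz",
    "/data/sample1_R1.fastq.gz",
])
print(",".join(HEADER))
for row in rows:
    print(",".join(row))
```

Files that match neither suffix are ignored in this sketch; the real script may handle naming differently.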

Dependencies:

Settings

Set the assembly_type in config/config.yaml to choose which input files are used:

short: Assemble only with short reads
long: Assemble only with long reads
hybrid: Assemble with long and short reads
none: Skip the assembly; assembled files are already provided

Make sure that the paths to all files required by the chosen assembly_type are specified in config/pep/samples.csv
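
A quick way to catch missing paths is to check each sample row against the columns its assembly_type needs. The sketch below is an illustration only: the mapping from assembly_type to columns is inferred from the option descriptions above, not taken from the workflow code, and missing_columns is a hypothetical helper.

```python
# Assumed mapping from assembly_type to the samples.csv columns it needs;
# inferred from the option descriptions, not taken from the workflow code.
REQUIRED_COLUMNS = {
    "short": ["short1", "short2"],
    "long": ["long"],
    "hybrid": ["long", "short1", "short2"],
    "none": ["assembly"],
}


def missing_columns(row, assembly_type):
    """Return the required columns that are empty or absent in one sample row."""
    if assembly_type not in REQUIRED_COLUMNS:
        raise ValueError(f"unknown assembly_type: {assembly_type!r}")
    return [col for col in REQUIRED_COLUMNS[assembly_type] if not row.get(col)]


# Example row with hypothetical paths: short reads only.
row = {
    "sample_name": "sample1",
    "long": "",
    "short1": "a_R1.fastq.gz",
    "short2": "a_R2.fastq.gz",
    "assembly": "",
}
print(missing_columns(row, "short"))   # nothing is missing for short
print(missing_columns(row, "hybrid"))  # the long-read path is missing
```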

Run the workflow

snakemake --use-conda --cores all

The usage of this workflow is described in the Snakemake Workflow Catalog.

Performance

It can be useful to cap the number of threads that any single rule may use:

snakemake --use-conda --cores all --max-threads 4

Whether this helps depends heavily on which rules are run.
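
When runs are launched from a script, the two invocations above can be assembled programmatically. This is a minimal sketch; the helper name snakemake_command is hypothetical, and it only uses the flags already shown in this README.

```python
import shlex


def snakemake_command(cores="all", max_threads=None):
    """Build the snakemake invocation shown above as an argument list."""
    cmd = ["snakemake", "--use-conda", "--cores", str(cores)]
    if max_threads is not None:
        # Only add the per-rule thread cap when one is requested.
        cmd += ["--max-threads", str(max_threads)]
    return cmd


cmd = snakemake_command(max_threads=4)
print(shlex.join(cmd))

# To launch it from the workflow directory:
# import subprocess
# subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids shell quoting issues when it is passed to subprocess.run.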

License

HyDRA is released under the BSD 2-Clause License. Please review the license file for more details.

Contact Information

For any questions or feedback, please contact the project maintainer at [email protected]. We appreciate your input and support in using and improving HyDRA.

Tools

CARD & RGI
CheckM2
Chopper
fastp
FastQC
geNomad
minimap2
MultiQC
NanoPlot
pandas
PLM-ARG
Porechop_ABI
Prokka
QUAST
samtools
Unicycler

Literature

Not here yet

Citation

A paper is on its way. If you use HyDRA in your work before the paper is published, please consider citing this GitHub repository.

IKIM-Essen/HyDRA

Folders and files

Latest commit

History

Repository files navigation

HyDRA – Hybrid De novo Assembly and Resistance Analysis

Key Features

Overview

Usage

Preparations

Generate samples.csv

Settings

Run the workflow

Performance

License

Contact Information

Tools

Literature

Citation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages