Handling ITS reads with variable lengths: Should ASVs be normalized to avoid redundancy? #2166

@CarolTMartins

Hello all,

I am working with ITS amplicon data, which naturally contains fragments of variable lengths. In my dataset, some ASVs are identical to others except that they are shorter versions of the same sequence. This raised a question about how DADA2 handles ITS length variability.
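
To make the redundancy concrete, here is a minimal Python sketch of how ASVs that are exact substrings of a longer ASV could be flagged. It is an illustration only and not part of DADA2; the function name `find_contained_asvs` and the toy sequences are hypothetical, made up for this example.

```python
def find_contained_asvs(asvs):
    """Map each ASV that is an exact substring of a longer ASV to its longest container."""
    # Deduplicate while keeping input order, then sort longest first.
    ordered = sorted(dict.fromkeys(asvs), key=len, reverse=True)
    contained = {}
    for i, short in enumerate(ordered):
        # Only sequences earlier in the list can be longer than this one.
        for long in ordered[:i]:
            if len(long) > len(short) and short in long:
                contained[short] = long
                break
    return contained


# Toy sequences (hypothetical, not real ITS data).
asvs = [
    "ACGTTGCAAGGCTTAACG",
    "ACGTTGCAAGGCTT",      # shorter version of the first sequence
    "ACGTTGCATGGCTTAACG",  # differs by one base, so a distinct variant
]

redundant = find_contained_asvs(asvs)
for short, long in redundant.items():
    print(f"{short} is contained in {long}")
```

This only detects exact containment. It says nothing about whether a length difference is biological or technical, which is the point I am unsure about.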

Given this, I would like to ask:

1. Should ITS sequences be length-standardized (trimmed or padded) to avoid generating redundant ASVs that are biologically the same but differ only in length?
2. Or does DADA2 internally handle such cases so that shorter fragments do not artificially inflate ASV diversity?

In our case, the samples consist almost exclusively of Saccharomyces cerevisiae yeast, with extremely low diversity. Our goal is to detect very small differences (ideally down to the strain level). We are concerned that the presence of ITS fragments of different lengths might affect our ability to resolve fine-scale variation, especially given the biology of our samples.

Any guidance on best practices for ITS processing with DADA2 in such low-diversity, strain-focused datasets would be greatly appreciated.

Thank you very much!
