Increase tolerable collision rate for low quality read libraries #172

@EmeraldSama94

Hello @arangrhie! First of all, thank you for all your work and support!

I'm implementing meryl/merqury in a pipeline to evaluate a number of public genome assemblies of neglected parasites. While some of them were built with newer sequencing technologies, a good portion relied on rather low-quality reads, especially early Illumina and PacBio reads, and even some now-discontinued technologies such as 454 and IonTorrent. As many scientists in our community still use these assemblies, we need to evaluate them as well. I know that you do not recommend running meryl/merqury on such low-quality read libraries. However, I noticed that the tolerable collision rate parameter in the best k script is set to 0.001, the same as the error rate generally expected from Illumina sequencing (equivalent to Q30). So I was wondering whether, if this association holds, I could increase the tolerable collision rate for lower-quality read libraries and continue to run meryl/merqury without major issues. Does this make sense, or am I misunderstanding the meaning of the collision rate completely?

Thank you in advance!
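
For reference, here is a minimal sketch of the arithmetic behind my question. It assumes, from my reading of the best k script, that k is computed as log(G(1-p)/p)/log(4), where G is the genome size and p is the tolerable collision rate; that formula is my assumption, not something confirmed in this thread. The function names and the 30 Mb genome size are hypothetical, chosen only for illustration. The Phred conversion (Q30 = 0.001) is the standard definition. The sketch only shows how k would move if p were raised to match a lower read quality; whether p can be interpreted as a sequencing error rate at all is exactly what I am asking.

```python
import math


def phred_to_error_rate(q: float) -> float:
    """Standard Phred definition: Q = -10 * log10(p), so p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def best_k(genome_size: float, collision_rate: float = 0.001) -> float:
    """Assumed best-k formula: k = log(G * (1 - p) / p) / log(4).

    genome_size    -- haploid genome size in bases (G)
    collision_rate -- tolerable collision rate (p), strictly between 0 and 1
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    if not 0 < collision_rate < 1:
        raise ValueError("collision_rate must be between 0 and 1")
    return math.log(genome_size * (1 - collision_rate) / collision_rate) / math.log(4)


if __name__ == "__main__":
    genome_size = 30_000_000  # hypothetical 30 Mb parasite genome
    for q in (30, 20, 10):
        p = phred_to_error_rate(q)
        print(f"Q{q}: p = {p:g}, k = {best_k(genome_size, p):.3f}")
```

Under this assumed formula, raising p lowers the suggested k, so the practical effect of a higher tolerable collision rate would be shorter k-mers.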
