Hi Arang!
I'm trying to assess the completeness of mobile elements in Trypanosoma cruzi genomes, and I thought of using Merqury to do it. Instead of running the read set against the assembly, I'm running the read set and the assembly (separately) against a multi-FASTA file containing all the sequences annotated as mobile elements for this parasite. My goal is to compare the completeness percentage obtained with the read set against the one obtained with the assembly: if the two values differ, that would suggest the mobile elements are not well represented in the assembly.
To do this, I generated 20-mer databases for the read library and for the assembly using meryl, and then ran Merqury with each one, treating the multi-FASTA file as the assembly. I expected to get comparable completeness values, but I got these results instead:
db_mobile [reads] all 429004 48535497 0.883897
db_mobile [assembly] all 15456 169646 9.11074
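As a sanity check on how I'm reading these lines, here is a minimal Python sketch that recomputes the last column from the two before it. It assumes the last three fields are (solid k-mers found in db_mobile, total solid k-mers in the k-mer database, completeness in %); that layout is my assumption about the output, and the function name `parse_completeness` is just for illustration.

```python
def parse_completeness(line):
    """Split one result line and recompute the percentage.

    Assumed layout (not confirmed): the last three whitespace-separated
    fields are k-mers found, total k-mers, and completeness in percent.
    """
    fields = line.split()
    found = int(fields[-3])
    total = int(fields[-2])
    reported = float(fields[-1])
    recomputed = 100.0 * found / total
    return {"found": found, "total": total,
            "reported": reported, "recomputed": recomputed}


lines = [
    "db_mobile [reads] all 429004 48535497 0.883897",
    "db_mobile [assembly] all 15456 169646 9.11074",
]

for line in lines:
    r = parse_completeness(line)
    print(f"{r['found']} / {r['total']} = {r['recomputed']:.4f}% "
          f"(reported {r['reported']}%)")
```

Under that assumption the reported percentages match found / total for both runs, so the two percentages are computed over very different denominators (48535497 vs. 169646).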
I have some questions about these results:
- I thought that the third column, representing the solid k-mers in the assembly (db_mobile), should be the same regardless of the read set (in this case, the actual reads or the assembly), but that does not seem to be the case. Why is this happening?
- Do you think this use of Merqury could work? I also thought of generating simulated reads from the assembly, with the same parameters as the original read library, and then running Merqury with these two read sets, but I'm kind of stuck :/
Any help is much appreciated!
Best regards,
Samuel