
Reporting query k-mer positions? #20

@bananabenana


Hi,

Thank you for developing this tool. Really enjoyed your preprint.

Query

I was wondering if there is a way to export the query k-mer position for each reference position in the output VCF files? Or is there some way I can expose this as a temp file? The reference contig and position are present in the VCF files as expected, but the query positional information is not.
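
For context, this is roughly how the reference side can be read out today. It is a minimal sketch, assuming a plain-text, tab-delimited VCF whose first two columns are CHROM and POS as in the VCF specification; the function name `reference_sites` is hypothetical and nothing here is part of kbo.

```python
def reference_sites(vcf_lines):
    """Collect (contig, position) pairs from the records of a VCF.

    `vcf_lines` is any iterable of text lines, e.g. an open file handle.
    Header lines (starting with '#') and blank lines are skipped.
    """
    sites = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # Columns 1 and 2 of a VCF record are CHROM and POS (1-based).
        chrom, pos = line.rstrip("\n").split("\t")[:2]
        sites.add((chrom, int(pos)))
    return sites
```

There is no equivalent column to read for the query, which is what this request is about.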

Use case

I've been using kbo on assemblies in a pairwise manner, where each genome serves as both the reference and the query against every other genome.
Example:

---Genome_dir
┕-----Genome_1.fasta
┕-----Genome_2.fasta
┕-----Genome_3.fasta

However, when I swap which genome is the reference and which is the query, I get different results for the same pair:

Ref       Query     SNPs  Indels
Genome_1  Genome_2  79    10
Genome_2  Genome_1  86    11

I understand why this is occurring, but I'd really like to be able to see which query positions these variants correspond to, so I can collapse the SNPs common to both comparisons and get the true number of coalesced pairwise SNPs: (Genome_1 SNPs ∩ Genome_2 SNPs) + SNPs found only with Genome_1 as reference + SNPs found only with Genome_2 as reference, i.e. the union of the two sets. I'd guess this would be more accurate than taking the mean of the two counts.

In these cases, the majority of these SNPs will overlap, but I cannot determine which ones do without the query positional information. Since I am using assemblies and not raw reads, this should be possible, right?
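
To make the calculation concrete, here is a minimal Python sketch of what I have in mind. It assumes the SNPs from both comparisons have already been translated into one shared coordinate system, which is exactly the step that needs the query positions. The function name and the toy positions are made up for illustration and are not kbo output.

```python
def coalesced_snp_count(snps_a, snps_b):
    """Count SNPs across both directions, counting shared sites once.

    Each argument is an iterable of (contig, position) pairs that are
    ASSUMED to be expressed in the same coordinate system.
    """
    a, b = set(snps_a), set(snps_b)
    shared = a & b   # sites called in both comparisons
    only_a = a - b   # sites called only with Genome_1 as reference
    only_b = b - a   # sites called only with Genome_2 as reference
    # Equivalent to len(a | b); written out to mirror the description.
    return len(shared) + len(only_a) + len(only_b)


# Toy data, not real results: one shared site, one unique to each side.
genome_1_as_ref = [("contig_1", 10), ("contig_1", 20)]
genome_2_as_ref = [("contig_1", 20), ("contig_1", 30)]
total = coalesced_snp_count(genome_1_as_ref, genome_2_as_ref)
```

Without the query positions I can only build one of the two sets in a given coordinate system, so the intersection cannot be computed.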

Thanks
