bug in tclust() when clustering threshold is small #7

@killidude

@boopsboops

There is a bug in tclust() that duplicates some sequence-pair distances when the clustering threshold is small.

Running delimtools::hap_collapse() on the data
geophagus_haps <- hap_collapse(geophagus)
results in 137 unique haplotypes.

Running dist.dna() gives the expected lower-triangular distance matrix for 137 × 137 taxa (9316 distance pairs)
mat <- dist.dna(geophagus_clean, model="raw", pairwise.deletion=TRUE)
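
The 9316 figure can be checked independently of the data, since a lower-triangular matrix over n taxa holds n(n-1)/2 pairs. A minimal sketch, with the variable names chosen here for illustration only:

```r
# Number of unique haplotypes reported by hap_collapse() in this issue.
n_haps <- 137

# Pairwise distances in a lower-triangular matrix: n * (n - 1) / 2.
expected_pairs <- choose(n_haps, 2)
print(expected_pairs)

# A "dist" object stores exactly that many values, so with the real data
# length(mat) should match expected_pairs.
toy_dist <- dist(matrix(seq_len(n_haps), ncol = 1))
print(length(toy_dist) == expected_pairs)
```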

locmin <- localMinima(mat)
[1] 0.007782468 0.021110956 0.056472251 0.081769178

Then running delimtools::locmin_tbl() with the first (smallest) local minimum as the threshold
locmin_df <- locmin_tbl(mat, threshold=locmin$localMinima[1], haps=geophagus_hap_tbl$labels)
results in a 165 × 2 tibble instead of the expected 137 × 2 tibble

Running delimtools::locmin_tbl() again with the second local minimum as the threshold
locmin_df <- locmin_tbl(mat, threshold=locmin$localMinima[2], haps=geophagus_hap_tbl$labels)
results in the expected 137 × 2 tibble
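
If the 28 extra rows (165 - 137) are repeated haplotype labels, they could be located with base R along the following lines. This is only a sketch on toy data: the column names and values below are made up, and the first column is assumed to hold the labels, which may not match the real locmin_tbl() output.

```r
# Toy stand-in for the tibble returned by locmin_tbl(); column names and
# contents are assumptions for illustration, not the real output.
toy_tbl <- data.frame(
  labels = c("hap1", "hap2", "hap2", "hap3", "hap3", "hap3"),
  cluster = c(1, 1, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Count how often each label occurs and keep those seen more than once.
label_counts <- table(toy_tbl[[1]])
dup_labels <- names(label_counts[label_counts > 1])

# Rows beyond one per unique label; for the real data this would be
# compared against 165 - 137.
n_extra <- nrow(toy_tbl) - length(unique(toy_tbl[[1]]))

print(dup_labels)
print(n_extra)
```

Applying the same steps to locmin_df would show whether the surplus rows are repeated labels and which haplotypes are affected.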
