Detect potential duplicates #12

@MatthieuBizien


Roam does not seem to have an advanced de-duplication algorithm for notes:

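  1. A trailing space at the end of a note title is usually unintentional, yet it produces a separate note
  2. Unicode is not normalized, so two titles that render identically can be stored as different notes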

E.g. for 2.: [[Charlène]] and [[Charlène]] look identical, but if we print the bytes they are different: b'Charle\xcc\x80ne' (an "e" followed by a combining grave accent) versus b'Charl\xc3\xa8ne' (a precomposed "è"). If I use unicodedata.normalize on both, they become identical.
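The byte-level difference can be reproduced in a few lines of Python. This is only a minimal sketch: the issue does not say which normalization form to use, so NFC here is an assumption, and the variable names are illustrative.

```python
import unicodedata

# Same visible title, two encodings: "e" + combining grave accent (U+0300)
# versus the precomposed "è" (U+00E8).
decomposed = "Charle\u0300ne"
precomposed = "Charl\u00e8ne"

print(decomposed.encode("utf-8"))   # b'Charle\xcc\x80ne'
print(precomposed.encode("utf-8"))  # b'Charl\xc3\xa8ne'
print(decomposed == precomposed)    # False

# NFC is an assumed choice; any single form applied to both sides would work.
same_after_nfc = (
    unicodedata.normalize("NFC", decomposed)
    == unicodedata.normalize("NFC", precomposed)
)
print(same_after_nfc)  # True
```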

We could detect them and save the list of potential duplicates in a dedicated file.
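One possible shape for such a check is sketched below, assuming note titles are available as a plain list of strings. The helper names (`canonical_title`, `find_potential_duplicates`, `write_report`) and the report format are hypothetical and not part of the project; NFC and `rstrip()` are assumed choices covering the two cases listed above.

```python
import unicodedata
from collections import defaultdict


def canonical_title(title: str) -> str:
    """Hypothetical helper: NFC-normalize and drop trailing whitespace."""
    return unicodedata.normalize("NFC", title).rstrip()


def find_potential_duplicates(titles):
    """Group distinct raw titles that share the same canonical form."""
    groups = defaultdict(list)
    for title in titles:
        key = canonical_title(title)
        if title not in groups[key]:
            groups[key].append(title)
    # Keep only groups with more than one distinct raw spelling.
    return {key: raw for key, raw in groups.items() if len(raw) > 1}


def write_report(duplicates, path):
    """Write one line per group; ascii() exposes trailing spaces and combining marks."""
    with open(path, "w", encoding="utf-8") as f:
        for key, raw_titles in sorted(duplicates.items()):
            f.write(f"{key}: " + ", ".join(ascii(t) for t in raw_titles) + "\n")


# Illustrative input, not real data from a Roam export.
titles = ["Charle\u0300ne", "Charl\u00e8ne", "Project ", "Project", "Unique"]
duplicates = find_potential_duplicates(titles)
```

Calling `write_report(duplicates, "duplicates.txt")` (the file name is arbitrary) would then list each canonical title next to its conflicting raw spellings, so that the user can decide which notes to merge.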

Labels: enhancement (New feature or request)
