Closed
Description
When comparing strings containing complex Unicode characters like emojis with ZWJ sequences, the diff functions produce incorrect results due to improper character segmentation.
Affected Unicode Characters
Emojis with ZWJ: 👨‍🍳, 👩‍💻, 👨‍🎨, etc.
Multi-byte characters: accented characters, CJK characters
Surrogate pairs: any emoji or character outside BMP
Combining characters: characters with diacritics
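
For illustration, here is a minimal sketch of how the three ways of counting "characters" diverge for these inputs. It assumes a runtime that ships `Intl.Segmenter` (recent Node.js and browsers); `splitGraphemes` is a hypothetical helper name, not a function of this library, and this is not necessarily how a fix would be implemented.

```javascript
// Hypothetical helper (not part of the library discussed here):
// split a string into user-perceived characters (grapheme clusters).
function splitGraphemes(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

// MAN + ZERO WIDTH JOINER + COOKING, rendered as a single "man cook" emoji.
const cook = "\u{1F468}\u200D\u{1F373}";
// "e" followed by COMBINING ACUTE ACCENT, rendered as a single accented letter.
const accented = "e\u0301";

console.log(cook.length);                 // number of UTF-16 code units
console.log([...cook].length);            // number of code points
console.log(splitGraphemes(cook).length); // number of grapheme clusters
console.log(splitGraphemes(accented).length);
```

A diff that iterates by code unit or by code point can split such a sequence in the middle, whereas comparing the arrays returned by a grapheme-aware splitter keeps each visible character intact.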
Related
Unicode Standard: https://unicode.org/
Intl.Segmenter MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/S