Merged
7 changes: 5 additions & 2 deletions unicodetools/data/security/dev/IdentifierType.txt
@@ -1,5 +1,5 @@
# IdentifierType.txt
# Date: 2025-09-12, 03:24:49 GMT
# Date: 2025-10-09, 03:26:38 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -4492,6 +4492,7 @@ A8F8..A8FA ; Obsolete Not_XID # 5.2 [3] DEVANAGARI SIGN PUSH
20BF ; Not_XID # 10.0 BITCOIN SIGN
20C0 ; Not_XID # 14.0 SOM SIGN
20C1 ; Not_XID # 17.0 SAUDI RIYAL SIGN
20C3 ; Not_XID # 18.0 UAE DIRHAM SIGN
2104 ; Not_XID # 1.1 CENTRE LINE SYMBOL
2108 ; Not_XID # 1.1 SCRUPLE
2114 ; Not_XID # 1.1 L B BAR SYMBOL
@@ -4814,8 +4815,10 @@ FFFD ; Not_XID # 1.1 REPLACEMENT CHARACTE
1F780..1F7D4 ; Not_XID # 7.0 [85] BLACK LEFT-POINTING ISOSCELES RIGHT TRIANGLE..HEAVY TWELVE POINTED PINWHEEL STAR
1F7D5..1F7D8 ; Not_XID # 11.0 [4] CIRCLED TRIANGLE..NEGATIVE CIRCLED SQUARE
1F7D9 ; Not_XID # 15.0 NINE POINTED WHITE STAR
1F7DB ; Not_XID # 18.0 BULLET IN DOUBLE CIRCLE
1F7E0..1F7EB ; Not_XID # 12.0 [12] LARGE ORANGE CIRCLE..LARGE BROWN SQUARE
1F7F0 ; Not_XID # 14.0 HEAVY EQUALS SIGN
1F7F1..1F7FF ; Not_XID # 18.0 [15] CIRCLE WITH DOUBLE VERTICAL AND HORIZONTAL LINE..RHOMBUS
1F800..1F80B ; Not_XID # 7.0 [12] LEFTWARDS ARROW WITH SMALL TRIANGLE ARROWHEAD..DOWNWARDS ARROW WITH LARGE TRIANGLE ARROWHEAD
1F810..1F847 ; Not_XID # 7.0 [56] LEFTWARDS ARROW WITH SMALL EQUILATERAL ARROWHEAD..DOWNWARDS HEAVY ARROW
1F850..1F859 ; Not_XID # 7.0 [10] LEFTWARDS SANS-SERIF ARROW..UP DOWN SANS-SERIF ARROW
@@ -4916,7 +4919,7 @@ FFFD ; Not_XID # 1.1 REPLACEMENT CHARACTE
1FBCB..1FBEF ; Not_XID # 16.0 [37] WHITE CROSS MARK..TOP LEFT JUSTIFIED LOWER RIGHT QUARTER BLACK CIRCLE
1FBFA ; Not_XID # 17.0 ALARM BELL SYMBOL

# Total code points: 6487
# Total code points: 6504

# Identifier_Type: Not_NFKC
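
The change in the `Total code points` footer (6487 to 6504) can be cross-checked against the entries in the hunks above. The following is a minimal sketch, not the generator's own logic: the helper `count_code_points` is hypothetical, and the list of added lines is an assumption read off the three entries tagged `18.0`.

```python
def count_code_points(line: str) -> int:
    """Return how many code points a data line covers (0 for comments or blanks)."""
    # Drop the trailing comment, then keep only the first ';'-separated field.
    field = line.split("#", 1)[0].split(";", 1)[0].strip()
    if not field:
        return 0
    start, _, end = field.partition("..")
    first = int(start, 16)
    last = int(end, 16) if end else first  # a single code point has no ".." part
    return last - first + 1


# Assumed to be the lines added in this file (the entries tagged 18.0).
added_lines = [
    "20C3 ; Not_XID # 18.0 UAE DIRHAM SIGN",
    "1F7DB ; Not_XID # 18.0 BULLET IN DOUBLE CIRCLE",
    "1F7F1..1F7FF ; Not_XID # 18.0 [15] CIRCLE WITH DOUBLE VERTICAL AND HORIZONTAL LINE..RHOMBUS",
]

added = sum(count_code_points(line) for line in added_lines)
print(added)         # 17
print(6487 + added)  # 6504
```

Under that assumption, the 17 added code points account for the whole difference between the old and new totals.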

155 changes: 78 additions & 77 deletions unicodetools/data/security/dev/confusables.txt

Large diffs are not rendered by default.

30 changes: 15 additions & 15 deletions unicodetools/data/security/dev/confusablesSummary.txt
@@ -1,5 +1,5 @@
# confusablesSummary.txt
# Date: 2025-09-12, 03:24:49 GMT
# Date: 2025-10-09, 03:26:38 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -5778,10 +5778,20 @@
(‎ ̭ ‎) 032D COMBINING CIRCUMFLEX ACCENT BELOW
← (‎ ᳙ ‎) 1CD9 VEDIC TONE YAJURVEDIC KATHAKA INDEPENDENT SVARITA SCHROEDER

# ̮
# 𑭢 ̮ ॖ ੁ
(‎ ̮ ‎) 032E COMBINING BREVE BELOW
← (‎ 𑭢 ‎) 11B62 SHARADA VOWEL SIGN UE # →ॖ→
← (‎ ॖ ‎) 0956 DEVANAGARI VOWEL SIGN UE
← (‎ ੁ ‎) 0A41 GURMUKHI VOWEL SIGN U # →ॖ→
← (‎ ᳘ ‎) 1CD8 VEDIC TONE CANDRA BELOW

# 𑭢𑭢 ̮̮ 𑭣 ॗ ੂ
(‎ ̮̮ ‎) 032E 032E COMBINING BREVE BELOW, COMBINING BREVE BELOW
← (‎ 𑭢𑭢 ‎) 11B62 11B62 SHARADA VOWEL SIGN UE, SHARADA VOWEL SIGN UE
← (‎ 𑭣 ‎) 11B63 SHARADA VOWEL SIGN UUE # →ॗ→
← (‎ ॗ ‎) 0957 DEVANAGARI VOWEL SIGN UUE
← (‎ ੂ ‎) 0A42 GURMUKHI VOWEL SIGN UU # →ॗ→

# ̳ ͇
(‎ ̳ ‎) 0333 COMBINING DOUBLE LOW LINE
← (‎ ͇ ‎) 0347 COMBINING EQUALS SIGN BELOW
@@ -8688,16 +8698,6 @@
← (‎ ੍ ‎) 0A4D GURMUKHI SIGN VIRAMA
← (‎ ્ ‎) 0ACD GUJARATI SIGN VIRAMA

# 𑭢 ॖ ੁ
(‎ ॖ ‎) 0956 DEVANAGARI VOWEL SIGN UE
← (‎ 𑭢 ‎) 11B62 SHARADA VOWEL SIGN UE
← (‎ ੁ ‎) 0A41 GURMUKHI VOWEL SIGN U

# 𑭣 ॗ ੂ
(‎ ॗ ‎) 0957 DEVANAGARI VOWEL SIGN UUE
← (‎ 𑭣 ‎) 11B63 SHARADA VOWEL SIGN UUE
← (‎ ੂ ‎) 0A42 GURMUKHI VOWEL SIGN UU

# । ꠰
(‎ । ‎) 0964 DEVANAGARI DANDA
← (‎ ꠰ ‎) A830 NORTH INDIC FRACTION ONE QUARTER
@@ -8885,9 +8885,9 @@
← (‎ ੳ𑭢 ‎) 0A73 11B62 GURMUKHI URA, SHARADA VOWEL SIGN UE # →ੳੁ→
← (‎ ੳੁ ‎) 0A73 0A41 GURMUKHI URA, GURMUKHI VOWEL SIGN U

# ੳ𑭣 ੳੂ ਊ
# ੳ𑭢𑭢 ੳੂ ਊ
(‎ ਊ ‎) 0A0A GURMUKHI LETTER UU
← (‎ ੳ𑭣 ‎) 0A73 11B63 GURMUKHI URA, SHARADA VOWEL SIGN UUE # →ੳੂ→
← (‎ ੳ𑭢𑭢 ‎) 0A73 11B62 11B62 GURMUKHI URA, SHARADA VOWEL SIGN UE, SHARADA VOWEL SIGN UE # →ੳੂ→
← (‎ ੳੂ ‎) 0A73 0A42 GURMUKHI URA, GURMUKHI VOWEL SIGN UU

# અા આ
@@ -17834,5 +17834,5 @@
(‎ 𪘀 ‎) 2A600 CJK UNIFIED IDEOGRAPH-2A600
← (‎ 𪘀 ‎) 2FA1D CJK COMPATIBILITY IDEOGRAPH-2FA1D

# total : 7579
# total : 7582

Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# confusablesSummaryIdentifier.txt
# Date: 2025-09-12, 03:24:49 GMT
# Date: 2025-10-09, 03:26:38 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -556,6 +556,14 @@
← (‎ ઼ ‎) 0ABC GUJARATI SIGN NUKTA
← (‎ ଼ ‎) 0B3C ORIYA SIGN NUKTA

# ॖ ੁ
(‎ ॖ ‎) 0956 DEVANAGARI VOWEL SIGN UE
← (‎ ੁ ‎) 0A41 GURMUKHI VOWEL SIGN U

# ॗ ੂ
(‎ ॗ ‎) 0957 DEVANAGARI VOWEL SIGN UUE
← (‎ ੂ ‎) 0A42 GURMUKHI VOWEL SIGN UU

# Γ Г
(‎ Γ ‎) 0393 GREEK CAPITAL LETTER GAMMA
← (‎ Г ‎) 0413 CYRILLIC CAPITAL LETTER GHE
@@ -925,14 +933,6 @@
← (‎ ੍ ‎) 0A4D GURMUKHI SIGN VIRAMA
← (‎ ્ ‎) 0ACD GUJARATI SIGN VIRAMA

# ॖ ੁ
(‎ ॖ ‎) 0956 DEVANAGARI VOWEL SIGN UE
← (‎ ੁ ‎) 0A41 GURMUKHI VOWEL SIGN U

# ॗ ੂ
(‎ ॗ ‎) 0957 DEVANAGARI VOWEL SIGN UUE
← (‎ ੂ ‎) 0A42 GURMUKHI VOWEL SIGN UU

# २ ર ૨
(‎ २ ‎) 0968 DEVANAGARI DIGIT TWO
← (‎ ર ‎) 0AB0 GUJARATI LETTER RA # →૨→
7 changes: 4 additions & 3 deletions unicodetools/data/security/dev/data/draft-restrictions.txt
@@ -59697,6 +59697,7 @@ E0100..E01EF ; Allowed ; Recommended # [240] (U+E0100..U+E01EF) VARIATION SELE
2074..208E ; ~Unicode Identifier # [27] (⁴..₎) SUPERSCRIPT FOUR..SUBSCRIPT RIGHT PARENTHESIS
2090..209C ; ~Unicode Identifier # [13] (ₐ..ₜ) LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER T
20A0..20C1 ; ~Unicode Identifier # [34] (₠..⃁) EURO-CURRENCY SIGN..SAUDI RIYAL SIGN
20C3 ; ~Unicode Identifier # (⃃) UAE DIRHAM SIGN
20DD..20E0 ; ~Unicode Identifier # [4] (⃝..⃠) COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH
20E2..20E4 ; ~Unicode Identifier # [3] (⃢..⃤) COMBINING ENCLOSING SCREEN..COMBINING ENCLOSING UPWARD POINTING TRIANGLE
2100..2117 ; ~Unicode Identifier # [24] (℀..℗) ACCOUNT OF..SOUND RECORDING COPYRIGHT
@@ -59998,9 +59999,9 @@ FFF9..FFFD ; ~Unicode Identifier # [5] (U+FFF9..�) INTERLINEAR ANNOTATION
1F6DC..1F6EC ; ~Unicode Identifier # [17] (🛜..🛬) WIRELESS..AIRPLANE ARRIVING
1F6F0..1F6FC ; ~Unicode Identifier # [13] (🛰..🛼) SATELLITE..ROLLER SKATE
1F700..1F7D9 ; ~Unicode Identifier # [218] (🜀..🟙) ALCHEMICAL SYMBOL FOR QUINTESSENCE..NINE POINTED WHITE STAR
1F7DB ; ~Unicode Identifier # (🟛) BULLET IN DOUBLE CIRCLE
1F7E0..1F7EB ; ~Unicode Identifier # [12] (🟠..🟫) LARGE ORANGE CIRCLE..LARGE BROWN SQUARE
1F7F0 ; ~Unicode Identifier # (🟰) HEAVY EQUALS SIGN
1F800..1F80B ; ~Unicode Identifier # [12] (🠀..🠋) LEFTWARDS ARROW WITH SMALL TRIANGLE ARROWHEAD..DOWNWARDS ARROW WITH LARGE TRIANGLE ARROWHEAD
1F7F0..1F80B ; ~Unicode Identifier # [28] (🟰..🠋) HEAVY EQUALS SIGN..DOWNWARDS ARROW WITH LARGE TRIANGLE ARROWHEAD
1F810..1F847 ; ~Unicode Identifier # [56] (🠐..🡇) LEFTWARDS ARROW WITH SMALL EQUILATERAL ARROWHEAD..DOWNWARDS HEAVY ARROW
1F850..1F859 ; ~Unicode Identifier # [10] (🡐..🡙) LEFTWARDS SANS-SERIF ARROW..UP DOWN SANS-SERIF ARROW
1F860..1F887 ; ~Unicode Identifier # [40] (🡠..🢇) WIDE-HEADED LEFTWARDS LIGHT BARB ARROW..WIDE-HEADED SOUTH WEST VERY HEAVY BARB ARROW
@@ -60023,4 +60024,4 @@ FFF9..FFFD ; ~Unicode Identifier # [5] (U+FFF9..�) INTERLINEAR ANNOTATION
E0001 ; ~Unicode Identifier # (U+E0001) LANGUAGE TAG
E0020..E007F ; ~Unicode Identifier # [96] (U+E0020..U+E007F) TAG SPACE..CANCEL TAG

# Total code points: 14287
# Total code points: 14304
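
In this file the separate entries `1F7F0` and `1F800..1F80B` are replaced by a single `1F7F0..1F80B` entry of 28 code points, which is what one would expect if the newly assigned `1F7F1..1F7FF` fills the gap and adjacent ranges with the same value are merged. The sketch below illustrates that kind of coalescing. The function `coalesce` is hypothetical and is not taken from the tool that generates this file.

```python
def coalesce(ranges):
    """Merge inclusive (first, last) ranges that overlap or are directly adjacent."""
    merged = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            # Extends or touches the previous range: grow it in place.
            prev_first, prev_last = merged[-1]
            merged[-1] = (prev_first, max(prev_last, last))
        else:
            merged.append((first, last))
    return merged


ranges = [
    (0x1F7F0, 0x1F7F0),  # HEAVY EQUALS SIGN (previously a single entry)
    (0x1F7F1, 0x1F7FF),  # assumed to be the code points that fill the gap
    (0x1F800, 0x1F80B),  # previously a separate arrow range
]

for first, last in coalesce(ranges):
    print(f"{first:04X}..{last:04X} [{last - first + 1}]")
# prints: 1F7F0..1F80B [28]
```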
21 changes: 19 additions & 2 deletions unicodetools/data/security/dev/data/review.txt
@@ -1,5 +1,5 @@
# review.txt
# Date: 2025-09-12, 03:25:00 GMT
# Date: 2025-10-09, 03:27:02 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -71181,6 +71181,7 @@ E0100..E01EF ; Restricted ; output-disallowed # [240] (U+E0100..U+E01EF) VARIA
2029 ; Restricted ; not in XID+ # (U+2029) PARAGRAPH SEPARATOR
202A..202E ; Restricted ; not in XID+ # [5] (U+202A..U+202E) LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT OVERRIDE
2066..2069 ; Restricted ; not in XID+ # [4] (U+2066..U+2069) LEFT-TO-RIGHT ISOLATE..POP DIRECTIONAL ISOLATE
20C3 ; Restricted ; not in XID+ # (⃃) UAE DIRHAM SIGN
2488..249B ; Restricted ; not in XID+ # [20] (⒈..⒛) DIGIT ONE FULL STOP..NUMBER TWENTY FULL STOP
2FF0 ; Restricted ; not in XID+ # (⿰) IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT
2FF1 ; Restricted ; not in XID+ # (⿱) IDEOGRAPHIC DESCRIPTION CHARACTER ABOVE TO BELOW
@@ -71230,7 +71231,23 @@ FFFD ; Restricted ; not in XID+ # (�) REPLACEMENT CHARACTER
1343E ; Restricted ; not in XID+ # (U+1343E) EGYPTIAN HIEROGLYPH BEGIN WALLED ENCLOSURE
1343F ; Restricted ; not in XID+ # (U+1343F) EGYPTIAN HIEROGLYPH END WALLED ENCLOSURE
1F100 ; Restricted ; not in XID+ # (🄀) DIGIT ZERO FULL STOP
1F7DB ; Restricted ; not in XID+ # (🟛) BULLET IN DOUBLE CIRCLE
1F7F1 ; Restricted ; not in XID+ # (🟱) CIRCLE WITH DOUBLE VERTICAL AND HORIZONTAL LINE
1F7F2 ; Restricted ; not in XID+ # (🟲) DOUBLE CIRCLE WITH DOUBLE HORIZONTAL LINE
1F7F3 ; Restricted ; not in XID+ # (🟳) CIRCLED BOTTOM RIGHT OBLIQUE HALF BLACK CIRCLE
1F7F4 ; Restricted ; not in XID+ # (🟴) LEFT HALF WHITE CIRCLE
1F7F5 ; Restricted ; not in XID+ # (🟵) RIGHT HALF WHITE CIRCLE
1F7F6 ; Restricted ; not in XID+ # (🟶) TRANSPARENT CUBE
1F7F7 ; Restricted ; not in XID+ # (🟷) WHITE CUBE
1F7F8 ; Restricted ; not in XID+ # (🟸) HORIZONTAL DOUBLE WHITE SMALL SQUARE
1F7F9 ; Restricted ; not in XID+ # (🟹) VERTICAL DOUBLE WHITE SMALL SQUARE
1F7FA ; Restricted ; not in XID+ # (🟺) WHITE SQUARE WITH BOTTOM HALF BISECTED
1F7FB ; Restricted ; not in XID+ # (🟻) WHITE SQUARE WITH TOP HALF BISECTED
1F7FC ; Restricted ; not in XID+ # (🟼) WHITE SQUARE WITH HORIZONTAL AND VERTICAL BISECTING LINES
1F7FD ; Restricted ; not in XID+ # (🟽) LOWER LEFT FLATTENED RIGHT TRIANGLE
1F7FE ; Restricted ; not in XID+ # (🟾) LOWER RIGHT FLATTENED RIGHT TRIANGLE
1F7FF ; Restricted ; not in XID+ # (🟿) RHOMBUS
E0001 ; Restricted ; not in XID+ # (U+E0001) LANGUAGE TAG
E0020..E007F ; Restricted ; not in XID+ # [96] (U+E0020..U+E007F) TAG SPACE..CANCEL TAG

# Total code points: 226
# Total code points: 243
Original file line number Diff line number Diff line change
@@ -5724,3 +5724,7 @@ A7F1 ; 02E2 # ( ꟱ → ˢ ) MODIFIER LETTER CAPITAL S → MODIFIER LETTER SMAL

# Confusable Katakana-Han pair (PAG ref #442)
1B122 ; 4E8E

# Confusables for Devanagari UE and UUE (PAG ref #449)
0956 ; 032E
0957 ; 032E 032E
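
The two new source mappings explain why the summary now lists the Sharada and Gurmukhi vowel signs under `032E`: they already mapped to `0956` and `0957`, so they reach `032E` transitively (the `# →ॖ→` annotations). The sketch below follows such chains to a fixed point. The table `RAW` is a hypothetical subset read off the hunks in this diff, and the direction assumed for the Sharada and Gurmukhi entries is inferred from the summary comments. This is not the full skeleton algorithm of UTS #39, which also involves normalization.

```python
# Hypothetical subset of mappings, read off the hunks in this diff.
RAW = {
    0x0956: (0x032E,),          # added here: DEVANAGARI VOWEL SIGN UE
    0x0957: (0x032E, 0x032E),   # added here: DEVANAGARI VOWEL SIGN UUE
    0x11B62: (0x0956,),         # SHARADA VOWEL SIGN UE (assumed direction)
    0x0A41: (0x0956,),          # GURMUKHI VOWEL SIGN U (assumed direction)
    0x11B63: (0x0957,),         # SHARADA VOWEL SIGN UUE (assumed direction)
    0x0A42: (0x0957,),          # GURMUKHI VOWEL SIGN UU (assumed direction)
}


def resolve(cp, table, seen=()):
    """Follow mappings until no code point in the result maps any further."""
    if cp not in table or cp in seen:  # the 'seen' guard stops on cycles
        return (cp,)
    out = []
    for target in table[cp]:
        out.extend(resolve(target, table, seen + (cp,)))
    return tuple(out)


def fmt(seq):
    """Format code points the way the data files do, e.g. '032E 032E'."""
    return " ".join(f"{c:04X}" for c in seq)


for cp in (0x11B62, 0x0A42):
    print(f"{cp:04X} -> {fmt(resolve(cp, RAW))}")
# 11B62 -> 032E
# 0A42 -> 032E 032E
```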
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# formatted-source.txt
# Date: 2025-09-12, 03:24:47 GMT
# Date: 2025-10-09, 03:26:35 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1265,8 +1265,11 @@

032D ; 1CD9 # ( ̭ ~ ᳙ ) COMBINING CIRCUMFLEX ACCENT BELOW ~ VEDIC TONE YAJURVEDIC KATHAKA INDEPENDENT SVARITA SCHROEDER

032E ; 0956 # ( ̮ ~ ॖ ) COMBINING BREVE BELOW ~ DEVANAGARI VOWEL SIGN UE
032E ; 1CD8 # ( ̮ ~ ᳘ ) COMBINING BREVE BELOW ~ VEDIC TONE CANDRA BELOW

032E 032E ; 0957 # ( ̮̮ ~ ॗ ) COMBINING BREVE BELOW, COMBINING BREVE BELOW ~ DEVANAGARI VOWEL SIGN UUE

0331 ; 0320 # ( ̱ ~ ̠ ) COMBINING MACRON BELOW ~ COMBINING MINUS SIGN BELOW
0331 ; 0952 # ( ̱ ~ ॒ ) COMBINING MACRON BELOW ~ DEVANAGARI STRESS SIGN ANUDATTA
