makotokato
Member

In #6367, @eggrobin suggested using the Indic_Conjunct_Break (InCB) property for Grapheme Cluster Break.
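
For readers unfamiliar with how InCB feeds into grapheme cluster segmentation, here is a minimal sketch of rule GB9c from UAX #29: no break before a Consonant when it is preceded by a Consonant followed by a run of Extend/Linker code points containing at least one Linker (for example, a consonant, virama, consonant sequence). The `InCB` enum and the `gb9c_no_break_before` function are hypothetical names for illustration; this is not the segmenter code in this PR, which works from precomputed rule tables rather than a per-position scan.

```rust
/// Indic_Conjunct_Break values from the Unicode Character Database.
/// Hypothetical illustration type, not the ICU4X API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InCB {
    None,
    Consonant,
    Extend,
    Linker,
}

/// Returns true if GB9c forbids a grapheme cluster break immediately
/// before `props[idx]`.
///
/// GB9c: Consonant [Extend Linker]* Linker [Extend Linker]* x Consonant
pub fn gb9c_no_break_before(props: &[InCB], idx: usize) -> bool {
    // The rule only applies in front of a Consonant that has a predecessor.
    if idx == 0 || idx >= props.len() || props[idx] != InCB::Consonant {
        return false;
    }
    let mut seen_linker = false;
    // Walk backwards over the run of Extend / Linker code points.
    for &p in props[..idx].iter().rev() {
        match p {
            InCB::Linker => seen_linker = true,
            InCB::Extend => {}
            // The run must start at a Consonant and contain a Linker.
            InCB::Consonant => return seen_linker,
            InCB::None => return false,
        }
    }
    // Reached the start of the text without finding a Consonant.
    false
}
```

Calling `gb9c_no_break_before(&[InCB::Consonant, InCB::Linker, InCB::Consonant], 2)` reports that the break is suppressed, while replacing the Linker with Extend does not suppress it under this rule.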

Also, InCB.toml is still incomplete, as shown below, because the property was added in ICU 76 as a draft API.

values = [
  {discr = 0, long = "None", short = "None"},
]

This means that the names (short / long / parse) are empty in this implementation.

@makotokato
Member Author

After https://unicode-org.atlassian.net/browse/ICU-23092 / unicode-org/icu#3457 is fixed, icuexport will export valid values for Indic_Conjunct_Break.

Member

@Manishearth Manishearth left a comment

Code lgtm

Given that this doesn't have names and is a draft property, should we make this doc(hidden) for now until it has names? This is only needed internally for segmenter, so we don't have to commit to the property enum just yet if we don't want to.

cc @sffc @robertbastian @hsivonen

@robertbastian
Member

We can remove names from the API. I'm happy to expose this as a stable (nameless) property.

@makotokato
Member Author

Does that mean we can remove parse, short name, and long name?

@robertbastian
Member

I think that would be the best option for now.
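
To illustrate what a nameless property could look like, here is a rough sketch: a value type that exposes its variants and discriminants but has no long name, short name, or parse function. Everything in it is an assumption for illustration, including the type name, the method names, and the discriminants 1 to 3 (only `None` = 0 appears in InCB.toml above); it is not the code merged in this PR.

```rust
/// Hypothetical stand-in for a nameless property value type.
/// Only discriminant 0 (None) is taken from InCB.toml above;
/// the other discriminants are assumed for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct IndicConjunctBreak(u8);

impl IndicConjunctBreak {
    pub const NONE: Self = Self(0);
    pub const CONSONANT: Self = Self(1); // assumed discriminant
    pub const EXTEND: Self = Self(2); // assumed discriminant
    pub const LINKER: Self = Self(3); // assumed discriminant

    /// Builds a value from a raw discriminant, rejecting unknown ones.
    pub const fn from_discriminant(d: u8) -> Option<Self> {
        if d <= 3 {
            Some(Self(d))
        } else {
            None
        }
    }

    /// Returns the raw discriminant.
    pub const fn to_discriminant(self) -> u8 {
        self.0
    }

    // Deliberately no `long_name`, `short_name`, or `parse`:
    // the exported data does not provide names yet.
}
```

With this shape, callers such as the segmenter can still match on values like `IndicConjunctBreak::LINKER`, and name lookup can be added later without changing the existing constants.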

robertbastian previously approved these changes Apr 3, 2025
Member

@Manishearth Manishearth left a comment

Also cc @sffc for doc(hidden)

@makotokato makotokato merged commit 4146498 into unicode-org:main Apr 7, 2025
29 checks passed
@makotokato makotokato deleted the InCB branch April 15, 2025 11:11