@roubert
Member

@roubert roubert commented Jun 6, 2025

All the most commonly used Locale objects have very little payload: most of them don't use any extensions, don't use a language tag longer than 3 characters and don't use more than a single variant.

There's room for all that data in a simple 32-byte payload object, which can be nested directly in the Locale object.

Any payload larger than that can instead be heap allocated as needed, in order to save storage for the most commonly used objects while retaining the ability to create arbitrarily large and complex Locale objects.
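
As a rough illustration of the nest-or-heap split described above (a minimal sketch, not the code in this PR): `Nest`, `Heap`, `makePayload()` and `getId()` are hypothetical names, and the sketch stores a raw ID string purely for brevity.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>
#include <string_view>
#include <variant>

// Hypothetical stand-ins for illustration; not ICU's actual types.
struct Nest {              // small payload, stored inline in the owner
    char id[32] = {};      // zero-filled, so a shorter ID stays NUL-terminated
};
struct Heap {              // arbitrarily large payload, allocated on demand
    std::string id;
};

// std::monostate models the "bogus" (invalid) state.
using Payload = std::variant<std::monostate, Nest, std::unique_ptr<Heap>>;

// Store the ID inline when it fits (leaving room for the NUL), else allocate.
inline Payload makePayload(std::string_view id) {
    if (id.size() < sizeof(Nest::id)) {
        Nest nest;
        std::memcpy(nest.id, id.data(), id.size());
        return nest;
    }
    return std::make_unique<Heap>(Heap{std::string(id)});
}

// Returns a view into the payload; empty for the bogus state.
inline std::string_view getId(const Payload& p) {
    if (const Nest* nest = std::get_if<Nest>(&p)) { return nest->id; }
    if (const auto* heap = std::get_if<std::unique_ptr<Heap>>(&p)) { return (*heap)->id; }
    return {};
}
```

Callers only ever see `getId()`, so whether a given ID ended up inline or on the heap stays an implementation detail.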

This reduces the storage requirements for all Locale objects.

For nested payloads, this reduction is from 224 bytes to 48 bytes.

For payloads that need to be heap allocated, the reduction depends on several factors, but in most cases there's some reduction. There are also cases where this refactoring actually increases the storage used, because CharString allocates more storage than necessary. This could be improved in a number of ways, such as optimizing CharString to not allocate more than necessary when copying a string of known length, not allocating any empty CharString objects, or possibly replacing CharString with a new class for fixed-length strings.
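
To make the last of those options concrete, here is a minimal sketch of what a fixed-length string class could look like. `ExactString` is a hypothetical name, not an existing ICU class, and real ICU code would report U_MEMORY_ALLOCATION_ERROR through a UErrorCode instead of relying on `std::make_unique` throwing.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string_view>

// Hypothetical class for illustration only.
class ExactString {
public:
    ExactString() = default;   // empty string: no allocation at all

    explicit ExactString(std::string_view s) {
        if (s.empty()) { return; }
        // The length is known up front, so allocate exactly size + 1 bytes.
        // make_unique<char[]> zero-fills, which provides the NUL terminator.
        data_ = std::make_unique<char[]>(s.size() + 1);
        std::memcpy(data_.get(), s.data(), s.size());
    }

    bool isEmpty() const { return data_ == nullptr; }
    const char* c_str() const { return data_ ? data_.get() : ""; }

private:
    std::unique_ptr<char[]> data_;   // one pointer, no capacity bookkeeping
};
```

The object itself is a single pointer, and an empty value costs nothing beyond that pointer, which addresses both the over-allocation and the empty-object cases mentioned above.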

The public API remains unchanged but the operations which can lead to U_MEMORY_ALLOCATION_ERROR change.

Checklist

  • Required: Issue filed: ICU-20392
  • Required: The PR title must be prefixed with a JIRA Issue number. Example: "ICU-1234 Fix xyz"
  • Required: Each commit message must be prefixed with a JIRA Issue number. Example: "ICU-1234 Fix xyz"
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

ALLOW_MANY_COMMITS=true

@roubert roubert force-pushed the 20392 branch 12 times, most recently from e348eb0 to 33362b4 June 10, 2025 14:40
@roubert roubert marked this pull request as ready for review July 22, 2025 13:41
@roubert roubert requested a review from markusicu July 22, 2025 13:41
Member

@markusicu markusicu left a comment

Very nice!

Except, I was hoping for sizeof(Locale) to go down even more.

I think with these changes you are getting 48 bytes on a 64-bit machine for

  • vtable pointer
  • variant payload
    • variant discriminator
    • union of Nest / unique_ptr, 32B and 8B-aligned for the pointer

This makes the variant discriminator take up 8 bytes (because of the alignment/padding), distinguishing three states (bogus, Nest, unique_ptr).
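
The padding arithmetic can be checked with a small layout sketch. The types below are hypothetical stand-ins that only mimic what a variant has to store, and the concrete byte counts in the comments assume a typical 64-bit ABI.

```cpp
#include <cstddef>
#include <cstdint>

struct Nest { char bytes[32]; };   // size 32, alignment 1

union NestOrPointer {              // size 32, but aligned like a pointer
    Nest nest;
    void* heap;
};

struct Tagged {                    // roughly what a variant needs to hold
    NestOrPointer storage;
    std::uint8_t index;            // one byte of actual information ...
};                                 // ... padded up to the union's alignment

struct WithVtable {                // stand-in for a class with virtual functions
    virtual ~WithVtable() = default;
    Tagged payload;
};

// Bytes the discriminator ends up costing once padding is included.
constexpr std::size_t discriminatorCost() {
    return sizeof(Tagged) - sizeof(NestOrPointer);
}

static_assert(sizeof(NestOrPointer) == 32, "the union itself is 32 bytes");
static_assert(discriminatorCost() == alignof(NestOrPointer),
              "the 1-byte index is rounded up to the pointer alignment");
// On a typical 64-bit ABI: sizeof(Tagged) == 40 and sizeof(WithVtable) == 48.
```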

Idea: Replace the std::variant with an explicit/manual one, using a union for the heap pointer vs. some other fields.

enum Which : uint8_t { BOGUS, NEST, HEAP };
struct Payload {
    union {
        struct {
            char language[4];
            char region[4];
        } langRegion;
        std::unique_ptr<Heap> heapPtr;
    } langRegionOrHeap;
    Which which : 2;
    uint8_t variantBegin : 6;  // 6 to fill the byte, 5 would suffice
    char script[5];
    char baseName[18];

    Payload() {
        which = BOGUS;
    }
    ~Payload() {
        // if which == HEAP: release langRegionOrHeap.heapPtr
    }
    ...
};

That should get us to 8B vtable + 32B payload = 40B.
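
A compilable variant of the idea, as a minimal sketch rather than a proposed implementation: `Heap` is a placeholder, the trailing-underscore member names and the `which()` accessor are made up here, and the tag is stored as a plain `uint8_t` bit-field (an assumption of this sketch, to sidestep compiler warnings about enum bit-fields).

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <new>
#include <utility>

struct Heap { int placeholder = 0; };   // hypothetical stand-in for the large payload

enum Which : std::uint8_t { BOGUS, NEST, HEAP };

class Payload {
public:
    Payload() : which_(BOGUS), variantBegin_(0), script_{}, baseName_{} {}

    explicit Payload(std::unique_ptr<Heap> heap) : Payload() {
        assert(heap != nullptr);
        // Make the pointer the union's active member.
        new (&u_.heapPtr) std::unique_ptr<Heap>(std::move(heap));
        which_ = HEAP;
    }

    ~Payload() {
        // Only the pointer member needs explicit destruction.
        if (which_ == HEAP) { std::destroy_at(&u_.heapPtr); }
    }

    Payload(const Payload&) = delete;             // copying omitted in this sketch
    Payload& operator=(const Payload&) = delete;

    Which which() const { return static_cast<Which>(which_); }

private:
    union U {
        struct { char language[4]; char region[4]; } langRegion;
        std::unique_ptr<Heap> heapPtr;
        U() : langRegion{} {}
        ~U() {}                    // the active member is destroyed by ~Payload()
    } u_;
    std::uint8_t which_ : 2;       // holds a Which value
    std::uint8_t variantBegin_ : 6;
    char script_[5];
    char baseName_[18];
};

// 8 (union) + 1 (bit-fields) + 5 + 18 = 32 bytes on common ABIs.
static_assert(sizeof(Payload) == 32, "payload should fit in 32 bytes");
```

The cost of dropping std::variant shows up in the constructor and destructor: the class has to track the active union member by hand, which is the "localized fiddling" in question.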

I think that shaving off another 8B in every Locale instance is worth some localized fiddling with the discriminator & union.

WDYT?

@roubert
Member Author

roubert commented Aug 6, 2025

For the first step here, I'd very much like to use the well-tested and type-safe standard library std::variant<>, as that means less new code that needs to be written, adding no risk of introducing any additional bugs.

The next most important optimization after that would then be to eliminate the storage currently wasted by CharString allocating more than needed for strings of known size.

With that done, these changes will have resulted in significant size reductions for all kinds of Locale objects, finally making them small enough to use everywhere and making it possible to then start removing workarounds (such as ICU-23005) that currently use different kinds of strings instead of using Locale objects directly (saving some storage at the cost of having to repeatedly re-parse these strings).

At that point the primary problem will have been fully solved and then we can start looking into further improvements. I myself would then like to start out by taking a critical look at Locale::initBaseName(), which seems overly convoluted and maybe could be eliminated altogether by some thoughtful refactoring. And then, sure, replacing std::variant<> with something specialized could both save a few more bytes and be quite interesting to implement.

@markusicu
Member

> For the first step here, I'd very much like to use the well-tested and type-safe standard library std::variant<>, as that means less new code that needs to be written, adding no risk of introducing any additional bugs.

Ok for a first step / first PR.

> The next most important optimization after that would then be to eliminate the storage currently wasted by CharString allocating more than needed for strings of known size.

I partially disagree. I think that making the Locale object smaller for commonly used locale IDs is most important.
I am ok if that's in a second PR.

CharString is "only" used for less-common locale IDs. sizeof(Heap) should be reduced after sizeof(Locale).

I agree that then there are other opportunities, such as making the parser/initBaseName() less convoluted, and even storing keywords in an at least slightly structured way, so that we need not traipse through the string all the time.
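
One possible shape for "slightly structured" keywords, as a minimal sketch under stated assumptions: `Keywords`, `parseKeywords()` and `findKeyword()` are hypothetical names, the input is the part of an ICU-style locale ID after the `@`, and malformed items are simply skipped with no case normalization.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

using Keywords = std::vector<std::pair<std::string, std::string>>;

// Parses e.g. "collation=phonebook;calendar=gregorian" once,
// returning key/value pairs sorted by key.
inline Keywords parseKeywords(std::string_view s) {
    Keywords result;
    while (!s.empty()) {
        const std::size_t end = s.find(';');
        const std::string_view item = s.substr(0, end);
        const std::size_t eq = item.find('=');
        if (eq != std::string_view::npos && eq > 0) {
            result.emplace_back(std::string(item.substr(0, eq)),
                                std::string(item.substr(eq + 1)));
        }
        if (end == std::string_view::npos) { break; }
        s.remove_prefix(end + 1);
    }
    std::sort(result.begin(), result.end());
    return result;
}

// Binary search over the sorted pairs; returns an empty view if the key is absent.
inline std::string_view findKeyword(const Keywords& kw, std::string_view key) {
    const auto it = std::lower_bound(
        kw.begin(), kw.end(), key,
        [](const auto& entry, std::string_view k) { return entry.first < k; });
    return (it != kw.end() && it->first == key) ? std::string_view(it->second)
                                                : std::string_view();
}
```

With the pairs parsed once and kept sorted, each lookup is a binary search instead of another pass over the raw string.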

roubert added 3 commits August 7, 2025 19:42
@jira-pull-request-webhook

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@roubert roubert merged commit 00c199b into unicode-org:main Aug 7, 2025
94 checks passed
@roubert roubert deleted the 20392 branch August 7, 2025 18:07