Skip to content

Conversation

andrewrk
Copy link
Member

@andrewrk andrewrk commented Dec 4, 2024

Renames "AssumeCapacity" to "Reserved", following the convention that a single word used to describe a function variant is desired if possible. Similarly, renames "ensure" to "reserve", so that the same word appears on both ends of those fundamentally connected functions.

try list.ensureUnusedCapacity(gpa, 1);
list.appendAssumeCapacity("foo");

⬇️

try list.reserveUnused(gpa, 1);
list.appendReserved("foo");

I generally like this, since I think it's a slimmer and more coherent API but I can think of one reason not to do this, which is that "ensure unused capacity" succeeds in communicating that a secondary call is a no-op while "reserve unused" incorrectly implies that there is an additional reservation counter.

try list.ensureUnusedCapacity(gpa, 1);
try list.ensureUnusedCapacity(gpa, 1); // no-op

⬇️

try list.reserveUnused(gpa, 1);
try list.reserveUnused(gpa, 1); // no-op. Is that still obvious to you?

I think this naming convention shines when combined with more complicated stuff, for example, the ArrayHashMap method putAssumeCapacityNoClobber would become putReservedNew.

Follows the convention that a single word used to describe a function
variant is desired if possible.
Matches the same word used for the corresponding append functions.
@andrewrk andrewrk added standard library This issue involves writing Zig code for the standard library. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. release notes This PR should be mentioned in the release notes. labels Dec 4, 2024
This word is shorter and has a more accurate definition.
@RetroDev256
Copy link
Contributor

I like the change.

@T-727
Copy link

T-727 commented Dec 4, 2024

Would it make sense to always push "Reserved" to the end of the name? and also not abolish "ensure"?
I find ensureReserved easier to read than reserveUnused, for the more complex example putReservedNew and putNewReserved read about the same to me but the second imo has the advantages of prioritizing the more important difference (New) and matching the rest of the functions in name

@dweiller
Copy link
Contributor

dweiller commented Dec 4, 2024

I find ensureReserved easier to read than reserveUnused

I also like the idea of calling the function that may no-op ensureReserved to help communicate that it may be a no-op:

try list.ensureReserved(gpa, 1);
try list.ensureReserved(gpa, 1);

I think it plays well with the usage of 'ensure' in ensureTotalCapacity.

@squeek502
Copy link
Collaborator

squeek502 commented Dec 4, 2024

I also like the idea of calling the function that may no-op ensureReserved to help communicate that it may be a no-op

I think it plays well with the usage of 'ensure' in ensureTotalCapacity.

But ensureTotalCapacity is renamed to reserveTotal in this PR. And ensureReserved would reintroduce the unused vs total ambiguity (c8ae581 and #9788).

@dweiller
Copy link
Contributor

dweiller commented Dec 4, 2024

But ensureTotalCapacity is renamed to reserveTotal in this PR.

Whoops, didn't read the actual diff. I don't see how ensureReserved reintroduces ambiguity if we keep the ensureTotalCapacity naming. That combination is more clear to me.

In my mind, 'reserved' space is talking about only the unused elements, so maybe I would just need to adjust how I view it. This means the semantics of reserveTotal is less clear than ensureTotalCapacity to me, based on just the name. When I first see reserveTotal(count) I think it could be adding (at least) count to the capacity, rather than ensuring the capacity is at least count.

@mlugg
Copy link
Member

mlugg commented Dec 4, 2024

I find the naming introduced in this PR less clear. To me, a call like try list.reserveUnused(gpa, 5) sounds like it does something like what _ = try list.addManyAsSlice(gpa, 5) does. I think the status quo names are better because they make direct reference to "capacity", which is the key thing to understand about these functions: they affect the ArrayList's capacity, not its actual contents.

@matklad
Copy link
Contributor

matklad commented May 27, 2025

My color of the bike shed (assuming ArrayListUnmanaged is the way):

pub fn push(self: *ArrayList, item: T) ?void
pub fn push_alloc(self: *ArrayList, allocator: Allocator, item: T) !void

Expanding:

  • push/pop is a natural pair, much better than append/pop.
  • Zig should favor up-front reservation pattern, so punish allocating methods with _alloc suffix (to be fair, allocating methods already have an extra parameters and a try, so we don't necessarily need to salt it even more, but I do think we should make non-allocating sweeter).
  • As a special case of the above, BoundedArray only has non-allocating methods, assumeCapacity/reserved would be redundant for it, and it's nice if allocating array is a super-set of bounded array.
  • Use ?void instead of assume, forcing .? at the call site, to make it syntactically obvious that there's a precondition to uphold, rather than relying on informal semantic meaning of the word assume.

Call site:

list.push("foo").?;
litt.push("bar") orelse {
    log.warn("dropping an item, not enough capacity", .{});
    return;
};

EDIT: as applied to TigerBeetle codebase: tigerbeetle/tigerbeetle#2990

@mlugg
Copy link
Member

mlugg commented May 27, 2025

Ooh, that's really quite an interesting idea. My concern with those names would be that they would confuse people by making the capacity assumption unclear... but perhaps the ?void is enough to stop that? The idea of pushing the IB onto the caller is definitely interesting.

Actually, I do have a concern with that. Beginners do often have a habit of discarding return values they don't understand, without reading docs. If they do that here, their ArrayList will just silently remain empty, leading to confusion. That... doesn't seem like a great experience for a first-time user, even if we could freely say "well, they misused the API!".

Thinking about that further... I think having the caller trigger the IB is a worse API. I can't think of many (any?) realistic cases where the null case isn't programmer error (of course, your snippet just logs an error, but I don't think that's a common situation in practice!). Making it the caller's responsibility to invoke IB just means we're maybe putting a little more pressure on the optimizer (to inline the call to avoid the inner branch), and that it's now possible for someone misusing the API to silently get unexpected behaviour.

I almost think your idea would be better if the assertion were kept inside alloc. This has the problem that beginners will almost certainly use the API wrong at first, but they will at least be guaranteed to immediately trigger a safety check when they do so, which seems preferable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. release notes This PR should be mentioned in the release notes. standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants