Skip to content

admission: set minimums on cpu time granter buckets#164370

Open
joshimhoff wants to merge 1 commit intocockroachdb:masterfrom
joshimhoff:toonegative
Open

admission: set minimums on cpu time granter buckets#164370
joshimhoff wants to merge 1 commit intocockroachdb:masterfrom
joshimhoff:toonegative

Conversation

@joshimhoff
Copy link
Collaborator

admission: set minimums on cpu time granter buckets

This commit introduces minimums on all the token buckets in cpu time token AC's granter. This prevents higher priority work from putting lower priority buckets into unbounded token debt.

Fixes: #158539

Release note: None

@joshimhoff joshimhoff requested a review from a team as a code owner February 25, 2026 17:52
@trunk-io
Copy link
Contributor

trunk-io bot commented Feb 25, 2026

Merging to master in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

This commit introduces minimums on all the token buckets
in cpu time token AC's granter. This prevents higher
priority work from putting lower priority buckets into
unbounded token debt.

Fixes: cockroachdb#158539

Release note: None
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a question about the "floors".

@tbg reviewed 4 files and all commit messages, and made 2 comments.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on joshimhoff and sumeerbhola).


pkg/util/admission/cpu_time_token_filler.go line 249 at r1 (raw file):

// computeMinimums computes per-bucket minimums from refill rates. The top
// priority bucket (tier0/canBurst) has a floor of 0. Each subsequent bucket's
// floor is its rate minus the top priority rate, which is negative. This ensures

Can you say just a little more here about the "why"? Perhaps add an example. It makes sense to me that when there is tons of tier0/burst traffic over extended periods of time, the other buckets will basically sink to -infinity. So even when the burst traffic stops, no lower-tier tokens will be available "forever", which is just undesirable.
The current choice is aesthetically pleasing in that it prevents the spacing between the buckets (if all buckets are initially full and then they get drained continuously, the spacing is what we get here, and if we say the top bucket bottoms out at zero, which makes sense, the other bottoms have to be chosen according to the current code).
But - why is that the right choice? If I understand correctly, in the state in which the buckets are all at the minimums, we'll have something like this (durations are made up):

  • for the first 0.25s, only t0b serves
  • in [0.25,0.5), only t0b and t0nb serve
  • in [0.5,0.75) only t0b, t0nb, t1b
  • after 0.75 everyone serves
    but this also means that whenever there is a higher-priority burst, even if that burst stops completely, the other tiers get "stalled" for some time, potentially leaving CPU under-utilized.
    Is there some invariant that I'm missing that makes the solution you've chosen the "obviously correct" one?
    Whatever we end up settling on, please add an explanation of why the choice was made (and pointing out arbitrary choices/trade-offs if there are any).

Btw I riffed on this with claude and it wasn't too sure either, which makes me comfortable saying this definitely needs more words 😆

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tbg made 1 comment.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on joshimhoff and sumeerbhola).


pkg/util/admission/cpu_time_token_filler.go line 249 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Can you say just a little more here about the "why"? Perhaps add an example. It makes sense to me that when there is tons of tier0/burst traffic over extended periods of time, the other buckets will basically sink to -infinity. So even when the burst traffic stops, no lower-tier tokens will be available "forever", which is just undesirable.
The current choice is aesthetically pleasing in that it prevents the spacing between the buckets (if all buckets are initially full and then they get drained continuously, the spacing is what we get here, and if we say the top bucket bottoms out at zero, which makes sense, the other bottoms have to be chosen according to the current code).
But - why is that the right choice? If I understand correctly, in the state in which the buckets are all at the minimums, we'll have something like this (durations are made up):

  • for the first 0.25s, only t0b serves
  • in [0.25,0.5), only t0b and t0nb serve
  • in [0.5,0.75) only t0b, t0nb, t1b
  • after 0.75 everyone serves
    but this also means that whenever there is a higher-priority burst, even if that burst stops completely, the other tiers get "stalled" for some time, potentially leaving CPU under-utilized.
    Is there some invariant that I'm missing that makes the solution you've chosen the "obviously correct" one?
    Whatever we end up settling on, please add an explanation of why the choice was made (and pointing out arbitrary choices/trade-offs if there are any).

Btw I riffed on this with claude and it wasn't too sure either, which makes me comfortable saying this definitely needs more words 😆

s/prevents/preserves/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

admission: revisit deltaRefillRates in CPU time token AC

3 participants