admission: set minimums on cpu time granter buckets#164370
admission: set minimums on cpu time granter buckets#164370joshimhoff wants to merge 1 commit intocockroachdb:masterfrom
Conversation
|
Merging to
|
This commit introduces minimums on all the token buckets in cpu time token AC's granter. This prevents higher priority work from putting lower priority buckets into unbounded token debt. Fixes: cockroachdb#158539 Release note: None
f13437e to
6c219cf
Compare
tbg
left a comment
There was a problem hiding this comment.
Looks good! Just a question about the "floors".
@tbg reviewed 4 files and all commit messages, and made 2 comments.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on joshimhoff and sumeerbhola).
pkg/util/admission/cpu_time_token_filler.go line 249 at r1 (raw file):
// computeMinimums computes per-bucket minimums from refill rates. The top // priority bucket (tier0/canBurst) has a floor of 0. Each subsequent bucket's // floor is its rate minus the top priority rate, which is negative. This ensures
Can you say just a little more here about the "why"? Perhaps add an example. It makes sense to me that when there is tons of tier0/burst traffic over extended periods of time, the other buckets will basically sink to -infinity. So even when the burst traffic stops, no lower-tier tokens will be available "forever", which is just undesirable.
The current choice is aesthetically pleasing in that it prevents the spacing between the buckets (if all buckets are initially full and then they get drained continuously, the spacing is what we get here, and if we say the top bucket bottoms out at zero, which makes sense, the other bottoms have to be chosen according to the current code).
But - why is that the right choice? If I understand correctly, in the state in which the buckets are all at the minimums, we'll have something like this (durations are made up):
- for the first 0.25s, only t0b serves
- in [0.25,0.5), only t0b and t0nb serve
- in [0.5,0.75) only t0b, t0nb, t1b
- after 0.75 everyone serves
but this also means that whenever there is a higher-priority burst, even if that burst stops completely, the other tiers get "stalled" for some time, potentially leaving CPU under-utilized.
Is there some invariant that I'm missing that makes the solution you've chosen the "obviously correct" one?
Whatever we end up settling on, please add an explanation of why the choice was made (and pointing out arbitrary choices/trade-offs if there are any).
Btw I riffed on this with claude and it wasn't too sure either, which makes me comfortable saying this definitely needs more words 😆
tbg
left a comment
There was a problem hiding this comment.
@tbg made 1 comment.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on joshimhoff and sumeerbhola).
pkg/util/admission/cpu_time_token_filler.go line 249 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Can you say just a little more here about the "why"? Perhaps add an example. It makes sense to me that when there is tons of tier0/burst traffic over extended periods of time, the other buckets will basically sink to -infinity. So even when the burst traffic stops, no lower-tier tokens will be available "forever", which is just undesirable.
The current choice is aesthetically pleasing in that it prevents the spacing between the buckets (if all buckets are initially full and then they get drained continuously, the spacing is what we get here, and if we say the top bucket bottoms out at zero, which makes sense, the other bottoms have to be chosen according to the current code).
But - why is that the right choice? If I understand correctly, in the state in which the buckets are all at the minimums, we'll have something like this (durations are made up):
- for the first 0.25s, only t0b serves
- in [0.25,0.5), only t0b and t0nb serve
- in [0.5,0.75) only t0b, t0nb, t1b
- after 0.75 everyone serves
but this also means that whenever there is a higher-priority burst, even if that burst stops completely, the other tiers get "stalled" for some time, potentially leaving CPU under-utilized.
Is there some invariant that I'm missing that makes the solution you've chosen the "obviously correct" one?
Whatever we end up settling on, please add an explanation of why the choice was made (and pointing out arbitrary choices/trade-offs if there are any).Btw I riffed on this with claude and it wasn't too sure either, which makes me comfortable saying this definitely needs more words 😆
s/prevents/preserves/
admission: set minimums on cpu time granter buckets
This commit introduces minimums on all the token buckets in cpu time token AC's granter. This prevents higher priority work from putting lower priority buckets into unbounded token debt.
Fixes: #158539
Release note: None