- 
                Notifications
    You must be signed in to change notification settings 
- Fork 11
labels: Adjust cost estimates #990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Map access is constant time, not dependent on map size. Updated the cost model to use a fixed cost for map operations rather than per-element cost. <!-- TODO: Add detailed benchmark results here --> Changes: - Fixed map cost model to be constant time (4.0) instead of per-element - Added comprehensive benchmarks for string equality, hasPrefix, slice contains, and map contains operations - Added postings iteration benchmarks across different sizes - Removed outdated TODO comments that are now addressed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adjusts the cost estimation model for matcher operations in Prometheus labels, moving from per-element costs to fixed costs for map operations. It also introduces comprehensive benchmarks to compare the performance characteristics of different string matching operations.
- Updated cost constants to better reflect actual performance characteristics
- Changed map operations from per-element to fixed cost model
- Added extensive benchmarks for string equality, prefix matching, slice/map contains operations
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description | 
|---|---|
| model/labels/cost.go | Updated cost estimation constants and changed map matcher cost calculation from per-element to fixed cost | 
| model/labels/cost_test.go | Added comprehensive benchmarks for string operations, slice/map contains, and postings iteration | 
| model/labels/postings_bench_test.go | Added benchmarks for iterating over postings lists of various sizes | 
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Extracted the test cases into a shared matcherTestCases variable that can be reused by other functions.
Creates sub-benchmarks for each matcher from matcherTestCases, measures actual runtime vs theoretical cost, and outputs ranking comparison. Runtime is rounded to multiples of 3 nanoseconds and uses competition-style ranking where equal values get the same rank.
### Problem We've seen that ingesters can end up doing more work when this optimization is enabled. This is due to fetching significantly more series and checking them for sharding. This is not something our cost model accounted for. ### What this PR does this PR introduces the cost of retrieving a single series from the index. The cost of doing this depends a lot on whether the block is an in-memory or an on-disk block. In-memory blocks have much more efficient sharding code. For now we're sticking with a single cost for all blocks, but we can change that later. ### why 10? I added a benchmark to help me come up with the new number. I'm comparing the baseline of 2ns on my machine with the 2ns that it takes to do a string comparison (see grafana/mimir-prometheus#990). This means our cost should be around 10-40. 10 is 3 orders of magnitude higher than what we had before (0.01) already, so I was cautious not to move this too much <details><summary>Cost of sharded vs non-sharded postings iteration</summary> <p> false: not sharded true: sharded ``` benchstat -col=/sharded -row=/size -filter='(/sharded:false OR /reuseCache:true)' │ false │ true │ │ sec/op │ sec/op vs base │ 128 326.1n ± ∞ ¹ 2937.0n ± ∞ ¹ ~ (p=0.100 n=3) ² 128K 301.5µ ± ∞ ¹ 4943.5µ ± ∞ ¹ ~ (p=0.100 n=3) ² 1M 2.435m ± ∞ ¹ 83.056m ± ∞ ¹ ~ (p=0.100 n=3) ² geomean 62.10µ 1.064m +1614.11% ¹ need >= 6 samples for confidence interval at level 0.95 ² need >= 4 samples to detect a difference at alpha level 0.05 │ false │ true │ │ sec/posting │ sec/posting vs base │ 128 2.548n ± ∞ ¹ 22.940n ± ∞ ¹ ~ (p=0.100 n=3) ² 128K 2.300n ± ∞ ¹ 37.720n ± ∞ ¹ ~ (p=0.100 n=3) ² 1M 2.322n ± ∞ ¹ 79.210n ± ∞ ¹ ~ (p=0.100 n=3) ² geomean 2.387n 40.92n +1614.16% ¹ need >= 6 samples for confidence interval at level 0.95 ² need >= 4 samples to detect a difference at alpha level 0.05 ``` </p> </details> ### Potential next steps 1. detect whether the block is in-memory or on-disk and adjust our plans. 2. adjust the cost based on whether we have query sharding enabled or not; hashing series is expensive 1\. will probably give higher returns <!-- Thanks for sending a pull request! Before submitting: 1. Read our CONTRIBUTING.md guide 3. Rebase your PR if it gets out of sync with main --> #### What this PR does #### Which issue(s) this PR fixes or relates to related to #11920 --------- Signed-off-by: Dimitar Dimitrov <[email protected]>
Cost model primitives
Updated the cost model to use a fixed cost for map operations rather than per-element cost. Also added benchmarks to help compare the cost of these with each other.
apple silicon arm64 benchmarks
intel amd64 benchmarks
Matcher benchmarking
I wanted to implement some way in which we can validate our cost model and track improvements. What I went with are average rank diff + Kendall's Tau.
Below are the results from before and after these adjustments. I think we didn't fix some major discrepancies, but we did move the average.
Before
After
Advice to reviewers
It's probably easier to review without whitespace diff
related to grafana/mimir#11920