Adds multi-tier partitioning to randM vector track #1093
Open
ah89 wants to merge 2 commits into elastic:master
Introduces a realistic multi-partition model with small, medium, and large partition tiers, supporting configurable counts and reproducible sizing via seeded RNG. Updates benchmarking to separately measure search performance per partition tier, improving analysis of index filtering and routing efficiency. Enhances documentation and parameterization for clarity and reproducible nightly runs.
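The seeded, tiered sizing described above could be sketched roughly as follows. This is a minimal illustration, not the track's actual code: the `TIER_WEIGHT_RANGES` values, the `build_partitions` and `make_assigner` helpers, and the partition ID format are all assumptions made for the example.

```python
import random
from bisect import bisect_right
from itertools import accumulate

# Hypothetical relative-size ranges per tier; the real track's values may differ.
TIER_WEIGHT_RANGES = {"small": (1, 10), "medium": (50, 200), "large": (1000, 5000)}


def build_partitions(counts, seed):
    """Return a list of (partition_id, tier, weight) tuples, reproducible for a given seed."""
    rng = random.Random(seed)
    partitions = []
    for tier in ("small", "medium", "large"):
        low, high = TIER_WEIGHT_RANGES[tier]
        # A tier with a count of 0 contributes no partitions.
        for i in range(counts.get(tier, 0)):
            partitions.append((f"{tier}-{i}", tier, rng.randint(low, high)))
    return partitions


def make_assigner(partitions, seed):
    """Return a function that picks a partition ID with probability proportional to its weight."""
    if not partitions:
        raise ValueError("at least one partition is required")
    rng = random.Random(seed)
    ids = [p[0] for p in partitions]
    cumulative = list(accumulate(p[2] for p in partitions))
    total = cumulative[-1]

    def assign():
        # Inverse-CDF sampling over the cumulative weights.
        idx = bisect_right(cumulative, rng.random() * total)
        return ids[min(idx, len(ids) - 1)]

    return assign
```

Because both the sizing and the assignment draw from `random.Random(seed)`, two runs with the same seed and counts produce the same partition layout and the same document-to-partition sequence, which is the property a reproducible nightly run would rely on.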
This updates the random_vector track to model a more realistic multi-partition workload with variable-sized small, medium, and large partitions. Documents are assigned to partitions using deterministic weighted sampling, routed by `partition_id`, and benchmarked with separate search phases for each partition tier so latency and QPS can be measured independently.

The change also renames the public configuration and task surface from `tenant` to `partition`, adds tier-specific search operations and conditional challenge phases, and documents the new parameters in the README. It also fixes the index template and query generation needed for the new flow by enabling custom routing for the data stream, requiring routing in mappings, and only sending `rescore_vector` when oversampling is greater than `0`.

Validated with Rally test mode on Elasticsearch `9.3.1`, which is the latest installable `darwin-aarch64` release available in this environment. The track also passed a sparse configuration with `small_partitions:3`, `medium_partitions:0`, `large_partitions:0`, completing successfully with `0%` error rate; empty partition tiers are now skipped instead of causing the benchmark to fail.
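The query-generation and tier-skipping behaviour described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: the helper names `build_knn_search` and `active_tiers`, the `emb` field name, and the `term` filter on `partition_id` are hypothetical, and the request shape assumes the `rescore_vector.oversample` option of the Elasticsearch kNN search.

```python
def build_knn_search(query_vector, partition_id, k=10, num_candidates=100,
                     oversample=0.0, field="emb"):
    """Build search parameters for one partition (field name "emb" is hypothetical)."""
    knn = {
        "field": field,
        "query_vector": query_vector,
        "k": k,
        "num_candidates": num_candidates,
        # Assumed filter so that results stay within the target partition.
        "filter": {"term": {"partition_id": partition_id}},
    }
    # Only send rescore_vector when oversampling is enabled (> 0).
    if oversample > 0:
        knn["rescore_vector"] = {"oversample": oversample}
    # Route the request to the shard that holds this partition's documents.
    return {"body": {"knn": knn}, "routing": partition_id}


def active_tiers(counts):
    """Return only the tiers with at least one partition, so empty tiers are skipped."""
    return [tier for tier in ("small", "medium", "large") if counts.get(tier, 0) > 0]
```

With the sparse configuration from the validation run, `active_tiers({"small": 3, "medium": 0, "large": 0})` yields only `["small"]`, so no search phase would be scheduled for the medium or large tiers.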