-
Notifications
You must be signed in to change notification settings - Fork 99
Feat/over saturation stopping #438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feat/over saturation stopping #438
Conversation
85cf65e to
f996254
Compare
## Summary E2E tests which check basic GuideLLM functionality, using vLLM simulator. ## Details - [x] Max requests test - [x] Max duration test - [ ] Over-saturation stopping test - skipped for now, will be enabled when #438 lands ## Test Plan - [x] Local testing - [x] GitHub action --- - [x] "I certify that all code in this PR is my own, except as noted below." ## Use of AI - [x] Includes AI-assisted code completion - [ ] Includes code generated by an AI application - [ ] Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes `## WRITTEN BY AI ##`)
ce85b1c to
46bc491
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prompt-python_module_docs.md
Overall looks good. I have a few comments/discussion points to run through around names, types, package layout, etc. I haven't gone through the full logic yet to validate, will do that soon, but adding a writeup of the algorithm would help at the top.
Also, can you ensure all public methods have docs? I'll attach the prompt I use to generate docs that is generally fairly self sufficient and accurate in case you'd like to use that so it doesn't take too much time.
| sophisticated benchmark stopping criteria through composable constraint types. | ||
| """ | ||
|
|
||
| from .base import ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we merge this and protocols into a single file called constraint.py to better merge some other new patterns across the repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| ConstraintInitializer, | ||
| SerializableConstraintInitializer, | ||
| ) | ||
| from .standard import ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we break this apart into request.py and error.py or something along those lines? That way we have some clearer delineation of where future constraints might go particularly when looking at in the context of adding over_saturation.py module
| OverSaturationConstraintInitializer, | ||
| OverSaturationDetector, | ||
| ) | ||
| from .protocols import ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note above on merging
| max_global_error_rate: float | None = Field( | ||
| default=None, description="Maximum global error rate (0-1) before stopping" | ||
| ) | ||
| stop_over_saturated: bool = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename this and retype this a bit? Something along the lines of just over_saturation, detect_saturation, etc? And then can ideally can support either a bool or dict input to resolve with arguments for the constraint/initializer implementations. This way we can keep these configurations local and stored within each benchmark command rather than global envs which can change across systems. This last parte being more important since we are saving this entrypoints class in the output report which can then be reloaded again later for replication / other use
| max_errors: int | None, | ||
| max_error_rate: float | None, | ||
| max_global_error_rate: float | None, | ||
| stop_over_saturated: bool | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same point here as earlier on the naming and type settings
| return (slope > 0) and (margin_of_error < self.moe_threshold) | ||
|
|
||
|
|
||
| class OverSaturationDetector(OverSaturationDetectorBase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might make sense to fold this class in with the Constraint class to simplify the flow. That way the initializer instantiates a constraint which actively enforces detection rather than chaining another level down. The initializer would then be responsible for instantiating the correct over saturation constraint based on settings passed in if we add some more later on
| @@ -0,0 +1,467 @@ | |||
| """ | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: maybe rename to just saturation.py for the module if we are looking to include under saturation and other general variants within
| @@ -0,0 +1,467 @@ | |||
| """ | |||
| Over-saturation detection constraint implementation. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we provide a short writeup of the method/algorithm here for what's implemented down below and how the classes interface?
src/guidellm/__main__.py
Outdated
| help="Maximum global error rate across all benchmarks.", | ||
| ) | ||
| @click.option( | ||
| "--stop-over-saturated", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note on the previous comment for entrypoints args for name and type
| @@ -0,0 +1,256 @@ | |||
| # Over-Saturation Feature Test Coverage | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this doc would become hard to keep updated and reflective of the test cases so they could quickly become out of sync. Also, not sure how much value it adds since it should be able to be fairly reproducible by feeding the code back into an LLM with a prompt to generate an overview/description
Add over-saturation detection and stopping capability to GuideLLM CLI. - Implement OverSaturationDetector with statistical slope detection - Add OverSaturationConstraint for scheduler integration - Add CLI flags --stop-over-saturated and --stop-osd - Integrate with benchmark entrypoints and main CLI Signed-off-by: Alon Kellner <[email protected]>
Add unit tests for over-saturation detection and constraint functionality. Signed-off-by: Alon Kellner <[email protected]>
Add comprehensive test suite for over-saturation detection algorithm. Signed-off-by: Alon Kellner <[email protected]>
Enable end-to-end test for over-saturation stopping functionality. Signed-off-by: Alon Kellner <[email protected]>
Refactor the single constraints.py file into a package structure where each constraint type has its own file: - protocols.py: Protocol definitions (Constraint, ConstraintInitializer, SerializableConstraintInitializer) - factory.py: ConstraintsInitializerFactory for creating and managing constraints - base.py: Base classes (PydanticConstraintInitializer, UnserializableConstraintInitializer) - standard.py: Standard constraints (MaxNumber, MaxDuration, MaxErrors, MaxErrorRate, MaxGlobalErrorRate, RequestsExhausted) - over_saturation.py: Over-saturation detection constraint implementation This improves code organization and maintainability while preserving backward compatibility through the package's __init__.py exports. Signed-off-by: Alon Kellner <[email protected]>
…Args The field was referenced in __main__.py but missing from the schema definition, causing ValueError when trying to get default values. Signed-off-by: Alon Kellner <[email protected]>
- Fix first_iteration -> first_token_iteration attribute name - Add type ignore for OverSaturationConstraint return type - Fix validated_kwargs type handling for stop_over_saturated parameter Signed-off-by: Alon Kellner <[email protected]>
When is_flag=True, Click automatically handles boolean values. Specifying type=bool can cause Pydantic validation errors. Signed-off-by: Alon Kellner <[email protected]>
- Fix import paths from advanced_constraints to constraints package - Fix line length errors (E501) in test files - Fix type error for request_start None check - Update test coverage documentation Signed-off-by: Alon Kellner <[email protected]>
Signed-off-by: Alon Kellner <[email protected]>
Signed-off-by: Samuel Monson <[email protected]> Signed-off-by: Alon Kellner <[email protected]>
Signed-off-by: Alon Kellner <[email protected]>
Signed-off-by: Alon Kellner <[email protected]>
Signed-off-by: Alon Kellner <[email protected]>
59ffa16 to
3ebdcaa
Compare
Signed-off-by: Alon Kellner <[email protected]>
Signed-off-by: Alon Kellner <[email protected]>
Signed-off-by: Alon Kellner <[email protected]>
Summary
This PR adds over-saturation stopping to the GuideLLM CLI.
It's based on the OSD (Over-Saturation Detection) algorithm we developed and evaluated at Jounce.
Use
--stop-over-saturatedor--stop-osdto enable.Details
This PR adds:
--stop-over-saturated)Test Plan
Use of AI
## WRITTEN BY AI ##)