[GPTQ] Change actorder default to "static" #1425
nice!
@kylesayrs quick question from my side. Since the old default was None, do we risk defaulting to an incorrect value for older recipes that don't include it? Especially for ones that specified it in the quantization scheme?
## Purpose ##
* Make actorder option more intuitive for users
* Enable easier adjustment of actorder default #1425
* This change is conceptually intuitive because activation ordering is a concept that only applies to the GPTQ algorithm (the only algorithm for which quantization group order matters)

## Changes ##
* Add `actorder` argument to `GPTQModifier`
* Override `resolve_quantization_config` method to resolve config groups with `actorder` argument
* (Misc) rearrange method order to match the typical order in which they are called in the modifier lifecycle

## Testing ##
* Ran llama w4a16 example to completion

Signed-off-by: Kyle Sayers <[email protected]>
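As a rough illustration of the "resolve config groups with `actorder` argument" change in this commit, the sketch below pushes a modifier-level `actorder` into every config group that does not set one itself. It is not the code from this PR: `WeightArgs` and `apply_modifier_actorder` are hypothetical stand-ins, and the real `resolve_quantization_config` operates on the library's own config objects.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class WeightArgs:
    """Hypothetical stand-in for the weight arguments of one config group."""
    num_bits: int = 4
    actorder: Optional[str] = None


def apply_modifier_actorder(
    groups: Dict[str, WeightArgs], actorder: Optional[str]
) -> Dict[str, WeightArgs]:
    """Return a copy of `groups` with `actorder` filled in where it is missing.

    Assumption for this sketch: a group that already names an ordering keeps it.
    """
    resolved = {}
    for name, args in groups.items():
        if args.actorder is None:
            # Group has no ordering of its own: take the modifier-level value.
            resolved[name] = replace(args, actorder=actorder)
        else:
            # Group-level setting is left untouched.
            resolved[name] = args
    return resolved


groups = {
    "group_0": WeightArgs(num_bits=4),
    "group_1": WeightArgs(num_bits=8, actorder="group"),
}
resolved = apply_modifier_actorder(groups, "static")
```

Returning a new mapping rather than mutating the input keeps the recipe's original groups available for comparison, which is convenient when checking what the modifier changed.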
The base branch was changed.
8cf408e to 8b9f795
8b9f795 to f6a0e25
## Purpose ##
* Fix false assumption that `actorder` field is of enum type
  * Despite the fact that actorder passes through a [field_validator](https://github.com/neuralmagic/compressed-tensors/blob/main/src/compressed_tensors/quantization/quant_args.py#L200), `QuantizationArgs` has the [use_enum_values](https://github.com/neuralmagic/compressed-tensors/blob/main/src/compressed_tensors/quantization/quant_args.py#L128) configuration set, meaning that enum values are converted to strings.
  * This was done in relation to [this fix](neuralmagic/sparseml#2327)
* Remove conflict with recipes which manually specify activation ordering by using a sentinel value

## Follow ups ##
* #1425

## Testing ##
* Ran llama3 example with manually specified `actorder=group`

---------

Signed-off-by: Kyle Sayers <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
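The two fixes in this commit can be pictured with the standard library alone. Everything named below is an assumption made for illustration: `ActivationOrdering`, `UNSET`, `normalize`, and `resolve_actorder` are stand-ins rather than the classes or helpers in compressed-tensors or llm-compressor, and the precedence rules are only one plausible reading of "remove conflict with recipes which manually specify activation ordering". The sketch treats every value as a plain string (as it would be once enum values have been converted) and uses a sentinel to tell "not passed" apart from an explicit `None`.

```python
from enum import Enum
from typing import Optional, Union


class ActivationOrdering(str, Enum):
    """Stand-in enum; only the two values named in this thread are listed."""
    GROUP = "group"
    STATIC = "static"


class _Unset:
    """Sentinel type meaning 'actorder was not passed to the modifier'."""

    def __repr__(self) -> str:
        return "UNSET"


UNSET = _Unset()


def normalize(value: Union[ActivationOrdering, str, None]) -> Optional[str]:
    """Accept an enum member or its string form and return a plain string."""
    if value is None:
        return None
    # Raises ValueError for strings that are not valid orderings.
    return ActivationOrdering(value).value


def resolve_actorder(
    modifier_value: Union[ActivationOrdering, str, None, _Unset] = UNSET,
    recipe_value: Union[ActivationOrdering, str, None] = None,
    default: Optional[str] = "static",
) -> Optional[str]:
    """Combine the modifier argument with a value already in the recipe."""
    existing = normalize(recipe_value)
    if modifier_value is UNSET:
        # Nothing was passed: the default applies only if the recipe is silent.
        return existing if existing is not None else normalize(default)
    requested = normalize(modifier_value)
    if existing is None or requested == existing:
        return requested
    raise ValueError(
        f"actorder={requested!r} conflicts with recipe value {existing!r}"
    )
```

Under these assumptions, a recipe that already specifies `group` keeps it when the modifier argument is omitted, while a recipe that says nothing picks up the default.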
Waiting for next weekly to run before merging
The base branch was changed.
@dsikka I've updated all of the tests to explicitly retain original actorder behavior
I think this file was added after and may have been missed?
https://github.com/vllm-project/llm-compressor/blob/main/tests/e2e/vLLM/recipes/WNA16/recipe_w4a16_awq_sym.yaml
Test cases that do not use a recipe (i.e. that use a scheme directly) will now have act order enabled, so we may want to update these to use a recipe:
tests/e2e/vLLM/configs/w4a16_grouped_quant.yaml
tests/e2e/vLLM/configs/w8a16_grouped_quant.yaml
tests/lmeval/configs/w4a16_grouped_quant.yaml
We definitely do want to maintain at least one test with no act order
Everything else looks good.
Signed-off-by: Kyle Sayers <[email protected]>
The test you linked to is an AWQ recipe, not a GPTQ recipe?
I've updated the
Not sure why this has taken me so long to add 🙃. I added
brian-dellabetta left a comment
seems like there is unreachable code we would want to delete with these changes?
Signed-off-by: Kyle Sayers <[email protected]>
409d1b5
## Purpose ##
* Use best defaults for GPTQ quantization

## Prerequisites ##
* #1453
* #1468

## Changes ##
* Set gptq actorder default to "static"

## Testing ##
* Ran llama w4a16 example to completion and validated the correct activation ordering

---------

Signed-off-by: Kyle Sayers <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>