Conversation

@kylesayrs
Collaborator

@kylesayrs kylesayrs commented May 12, 2025

Purpose

  • Use best defaults for GPTQ quantization

Prerequisites

Changes

  • Set gptq actorder default to "static"
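
In practice, this means a `GPTQModifier` that does not set `actorder` now resolves to static activation ordering, while passing `actorder=None` explicitly opts out (see the test discussion below). A minimal sketch, assuming llm-compressor's public `GPTQModifier` API:

```python
from llmcompressor.modifiers.quantization import GPTQModifier

# Omitting actorder now picks up the new "static" default
default_modifier = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

# Passing actorder=None explicitly retains the old (disabled) behavior
legacy_modifier = GPTQModifier(
    targets="Linear", scheme="W4A16", ignore=["lm_head"], actorder=None
)
```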

Testing

  • Ran llama w4a16 example to completion and validated the correct activation ordering

@dsikka
Collaborator

dsikka commented May 13, 2025

nice!

@markurtz
Collaborator

@kylesayrs quick question from my side. Since the old default was None, do we risk defaulting to an incorrect value for older recipes that don't include it? Especially for ones that specified it in the quantization scheme?

@kylesayrs kylesayrs changed the base branch from kylesayrs/gptq-actorder to main May 15, 2025 04:40
@kylesayrs kylesayrs dismissed brian-dellabetta’s stale review May 15, 2025 04:40

The base branch was changed.

@kylesayrs kylesayrs changed the base branch from main to kylesayrs/gptq-actorder May 15, 2025 04:40
@kylesayrs
Collaborator Author

@markurtz

  1. "older" recipes which did not specify will now use "static". This is (more or less) inevitable and imho acceptable, since a user which does not specify actorder probably does not care and just wants the best configuration
  2. For recipes which specify anything except "static", they will see this error encouraging them to modify their recipe to disable global actorder

rahul-tuli previously approved these changes May 16, 2025
Base automatically changed from kylesayrs/gptq-actorder to main May 19, 2025 16:39
kylesayrs added a commit that referenced this pull request May 19, 2025
## Purpose ##
* Make actorder option more intuitive for users
* Enable easier adjustment of actorder default #1425
* This change is conceptually intuitive because activation ordering is a
concept that only applies to the GPTQ algorithm (the only algorithm for
which quantization group order matters)

## Changes ##
* Add `actorder` argument to `GPTQModifier`
* Override `resolve_quantization_config` method to resolve config groups
with `actorder` argument
* (Misc) rearrange method order to match the typical order in which they
are called in the modifier lifecycle
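
A rough sketch of the override described above, using toy stand-ins for the compressed-tensors types (the real method lives on `GPTQModifier` and its exact body is an assumption):

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Weights:  # stand-in for QuantizationArgs
    actorder: Optional[str] = None


@dataclass
class ConfigGroup:  # stand-in for QuantizationScheme
    weights: Optional[Weights] = None


@dataclass
class QuantizationConfig:
    config_groups: Dict[str, ConfigGroup] = field(default_factory=dict)


def resolve_quantization_config(
    config: QuantizationConfig, actorder: Optional[str]
) -> QuantizationConfig:
    # Push the modifier-level actorder into every weight group that did not
    # set one itself, so the default lives in one place on the modifier
    for group in config.config_groups.values():
        if group.weights is not None and group.weights.actorder is None:
            group.weights.actorder = actorder
    return config
```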

## Testing ##
* Ran llama w4a16 example to completion

Signed-off-by: Kyle Sayers <[email protected]>
@kylesayrs kylesayrs dismissed stale reviews from rahul-tuli and brian-dellabetta May 19, 2025 16:39

The base branch was changed.

@kylesayrs kylesayrs force-pushed the kylesayrs/gptq-actorder-default branch from 8cf408e to 8b9f795 May 20, 2025 20:58
@kylesayrs kylesayrs changed the base branch from main to kylesayrs/fix-default-actorder May 20, 2025 20:58
@kylesayrs kylesayrs force-pushed the kylesayrs/gptq-actorder-default branch from 8b9f795 to f6a0e25 May 20, 2025 21:11
Base automatically changed from kylesayrs/fix-default-actorder to main May 22, 2025 18:09
kylesayrs added a commit that referenced this pull request May 22, 2025
## Purpose ##
* Fix the false assumption that the `actorder` field is of enum type
* Although `actorder` passes through a
[field_validator](https://github.com/neuralmagic/compressed-tensors/blob/main/src/compressed_tensors/quantization/quant_args.py#L200),
`QuantizationArgs` has the
[use_enum_values](https://github.com/neuralmagic/compressed-tensors/blob/main/src/compressed_tensors/quantization/quant_args.py#L128)
configuration set, meaning that enum values are converted to strings
* This was done in relation to [this
fix](neuralmagic/sparseml#2327)
* Use a sentinel value to remove the conflict with recipes which
manually specify activation ordering
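
To see the coercion concretely, here is a self-contained toy model; the real `QuantizationArgs` and `ActivationOrdering` live in compressed-tensors, and this enum is abbreviated for illustration:

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel, ConfigDict


class ActivationOrdering(str, Enum):  # abbreviated stand-in
    GROUP = "group"
    WEIGHT = "weight"


class ToyQuantizationArgs(BaseModel):
    # Mirrors the use_enum_values setting on the real QuantizationArgs
    model_config = ConfigDict(use_enum_values=True)
    actorder: Optional[ActivationOrdering] = None


args = ToyQuantizationArgs(actorder=ActivationOrdering.GROUP)
print(type(args.actorder))  # <class 'str'>: the enum member is stored as its value
assert args.actorder == "group"
```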

## Follow ups ##
* #1425

## Testing ##
* Ran llama3 example with manually specified `actorder=group`

---------

Signed-off-by: Kyle Sayers <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
@kylesayrs kylesayrs changed the base branch from main to kylesayrs/actorder-test May 29, 2025 18:23
@kylesayrs kylesayrs requested a review from rahul-tuli May 29, 2025 18:29
rahul-tuli previously approved these changes May 30, 2025
@kylesayrs
Collaborator Author

Waiting for the next weekly to run before merging

Base automatically changed from kylesayrs/actorder-test to main May 30, 2025 04:57
@kylesayrs kylesayrs dismissed stale reviews from rahul-tuli and brian-dellabetta May 30, 2025 04:57

The base branch was changed.

@kylesayrs
Collaborator Author

@dsikka I've updated all of the tests to explicitly retain the original actorder behavior

Collaborator

@dsikka dsikka left a comment


I think this file was added after and may have been missed?
https://github.com/vllm-project/llm-compressor/blob/main/tests/e2e/vLLM/recipes/WNA16/recipe_w4a16_awq_sym.yaml

For test cases that do not use a recipe (i.e. use a scheme directly), they will now have act order enabled, so we may want to update them to use a recipe:

tests/e2e/vLLM/configs/w4a16_grouped_quant.yaml
tests/e2e/vLLM/configs/w8a16_grouped_quant.yaml
tests/lmeval/configs/w4a16_grouped_quant.yaml

We definitely do want to maintain at least one test with no act order

Everything else looks good.

Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Kyle Sayers <[email protected]>
@kylesayrs
Collaborator Author

@dsikka

I think this file was added after and may have been missed?
https://github.com/vllm-project/llm-compressor/blob/main/tests/e2e/vLLM/recipes/WNA16/recipe_w4a16_awq_sym.yaml

The test you linked to is an AWQ recipe, not a GPTQ recipe?

For test cases that do not use a recipe / use a scheme directly, they will have act order. So we may want to update to use a recipe

I've updated the e2e_utils to explicitly set `actorder=None` in order to maintain the existing behavior.

We definitely do want to maintain at least one test with no act order

Not sure why this has taken me so long to add 🙃. I added recipe_w4a16_actorder_weight.yaml
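
For reference, the kind of pinning described above might look like this in a test helper (hypothetical function; the actual e2e_utils change may differ):

```python
from llmcompressor.modifiers.quantization import GPTQModifier


def build_scheme_modifier(scheme: str) -> GPTQModifier:
    # Explicitly pin actorder=None so scheme-based tests keep their
    # pre-change behavior instead of inheriting the new "static" default
    return GPTQModifier(targets="Linear", scheme=scheme, actorder=None)
```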

@kylesayrs kylesayrs added the ready When a PR is ready for review label Sep 9, 2025
dsikka previously approved these changes Sep 9, 2025
Collaborator

@brian-dellabetta brian-dellabetta left a comment


seems like there is unreachable code we would want to delete with these changes?

Signed-off-by: Kyle Sayers <[email protected]>
dsikka previously approved these changes Sep 11, 2025
Signed-off-by: Kyle Sayers <[email protected]>
@kylesayrs kylesayrs dismissed stale reviews from brian-dellabetta and dsikka via 409d1b5 September 11, 2025 14:51
@dsikka dsikka merged commit 9e916b6 into main Sep 11, 2025
7 of 8 checks passed
@dsikka dsikka deleted the kylesayrs/gptq-actorder-default branch September 11, 2025 15:51
dsikka added a commit that referenced this pull request Sep 21, 2025
## Purpose ##
* Use best defaults for GPTQ quantization

## Prerequisites ##
* #1453
* #1468

## Changes ##
* Set gptq actorder default to "static"

## Testing ##
* Ran llama w4a16 example to completion and validated the correct
activation ordering

---------

Signed-off-by: Kyle Sayers <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>