Allow torchao quant to support quantization configs relying on module swap
Summary:
The current torchao integration quantizes a weight by wrapping it in a top-level
linear module and calling quantize_ on that module. This works for quantization
methods that modify the weight in place, such as int4 and float8, but not for
quantization configs that require module swap, such as AWQ. To support those,
we wrap the linear in nn.Sequential so that the linear is no longer a top-level
module and can be swapped out for another module.
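
For illustration, a minimal sketch of the wrapping trick using torchao's quantize_ API. The shapes and the Int4WeightOnlyConfig are placeholders standing in for whatever config the integration passes (including module-swap configs such as AWQ); this is not the integration's exact code.
```
import torch
import torch.nn as nn
from torchao.quantization import quantize_, Int4WeightOnlyConfig

weight = torch.randn(64, 128, dtype=torch.bfloat16, device="cuda")

# Build a dummy linear that owns the weight we want to quantize.
dummy = nn.Linear(128, 64, bias=False, dtype=torch.bfloat16, device="cuda")
dummy.weight = nn.Parameter(weight)

# Before this change: quantize_(dummy, config) only works for configs that
# mutate dummy.weight in place; a module-swap config cannot replace the
# top-level module it was handed, because the caller still holds `dummy`.
#
# After: wrap the linear in nn.Sequential so it becomes a child module,
# which quantize_ is free to swap out for a different module type.
wrapped = nn.Sequential(dummy)
quantize_(wrapped, Int4WeightOnlyConfig(group_size=128))

# Read the result back from the (possibly swapped) child module.
quantized_weight = wrapped[0].weight
```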
Test Plan:
Uploaded an AWQ checkpoint: https://huggingface.co/torchao-testing/Phi-4-mini-instruct-int4wo-awq-0.13-dev
and we test by loading the checkpoint:
```
python tests/quantization/test_torchao.py
```
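
A manual smoke test could also load the checkpoint end to end. The snippet below is a hypothetical example, assuming this is vLLM's torchao integration (as the test path suggests) and an environment with a torchao build that includes the AWQ config (0.13-dev per the checkpoint name); LLM and SamplingParams are standard vLLM entrypoints.
```
from vllm import LLM, SamplingParams

llm = LLM(model="torchao-testing/Phi-4-mini-instruct-int4wo-awq-0.13-dev")
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```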
Reviewers:
Subscribers:
Tasks:
Tags:
Signed-off-by: Jerry Zhang <[email protected]>