4bit quantization for arbitrary nn.Parameter #1720
base: main
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Will there need to be any changes in PEFT to apply LoRA adapters to quantized parameters once this lands?
We'll have to test, but at the very least, huggingface/peft#2710 needs to be merged in PEFT for this to work properly.
It's merged.
This PR is in the same spirit as the recently introduced feature in huggingface/peft#2638.

Several models exist in the Hugging Face ecosystem with MoE layers that use `nn.Parameter` and are not compatible with the default quantization approach of replacing `nn.Linear`. Such example models include, but are not limited to:

A new utility, `bitsandbytes.nn.parametrize.replace_parameter_4bit()`, is introduced. It quantizes and replaces an `nn.Parameter` with a parametrization layer which automatically dequantizes the parameter when it is accessed.

Additional work will be done on the HF Transformers side to enable integration with options in `BitsAndBytesConfig`.
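For illustration, here is a minimal sketch of how a dequantize-on-access parametrization can be built from bitsandbytes' functional 4-bit API (`quantize_4bit` / `dequantize_4bit`) and `torch.nn.utils.parametrize`. The helper `replace_parameter_4bit_sketch` below, its signature, and the example module are assumptions for demonstration only; they are not the actual API introduced by this PR.

```python
# Sketch only: dequantize-on-access via a torch parametrization.
# The real replace_parameter_4bit() in this PR may differ in name and signature.
import torch
import torch.nn as nn
from torch.nn.utils import parametrize
import bitsandbytes.functional as F


class DequantizeNF4(nn.Module):
    """Parametrization that holds a quant_state and dequantizes the packed data on access."""

    def __init__(self, quant_state):
        super().__init__()
        self.quant_state = quant_state

    def forward(self, packed: torch.Tensor) -> torch.Tensor:
        # Reconstruct a full-precision tensor every time module.<name> is read.
        return F.dequantize_4bit(packed, self.quant_state)


def replace_parameter_4bit_sketch(module: nn.Module, name: str = "weight") -> None:
    """Illustrative stand-in for the PR's replace_parameter_4bit() utility."""
    param = getattr(module, name)
    # 4-bit quantization requires a CUDA tensor here.
    packed, quant_state = F.quantize_4bit(param.data.cuda(), quant_type="nf4")
    # Swap the full-precision parameter for the packed 4-bit storage...
    setattr(module, name, nn.Parameter(packed, requires_grad=False))
    # ...and register a parametrization so reads of module.<name> dequantize it.
    parametrize.register_parametrization(
        module, name, DequantizeNF4(quant_state), unsafe=True
    )


# Usage: quantize a raw nn.Parameter (e.g. a stacked MoE expert weight) in place.
experts = nn.Module()
experts.weight = nn.Parameter(torch.randn(8, 1024, 4096))
replace_parameter_4bit_sketch(experts, "weight")
print(experts.weight.shape)  # dequantized view with the original shape
```

In this sketch, storing the packed tensor as the underlying parameter and registering the parametrization with `unsafe=True` allows the stored data to change shape and dtype, while reads of `module.weight` still return a full-precision tensor.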