
Conversation

matthewdouglas
Member

What does this PR do?

This PR adds a new option to BitsAndBytesConfig called bnb_4bit_target_parameters, in the same spirit as target_parameters in huggingface/peft#2638. The intent is to allow quantization of nn.Parameter weights that do not live inside an nn.Linear, e.g. those commonly found in certain MoE model implementations.
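For illustration, the layout this targets can be sketched in plain PyTorch (hypothetical class, not the actual Granite implementation): the experts' weights live in one stacked `nn.Parameter`, so quantization that only swaps out `nn.Linear` modules never touches them.

```python
import torch
import torch.nn as nn

class ParallelExperts(nn.Module):
    """Hypothetical MoE expert bank: one stacked 3D weight, no nn.Linear."""
    def __init__(self, num_experts: int, in_features: int, out_features: int):
        super().__init__()
        # All expert weights sit in a single raw parameter, so replacing
        # nn.Linear modules with Linear4bit cannot quantize them.
        self.weight = nn.Parameter(torch.empty(num_experts, out_features, in_features))
        nn.init.normal_(self.weight, std=0.02)

    def forward(self, x: torch.Tensor, expert_idx: int) -> torch.Tensor:
        # Matmul against the selected expert's weight slice.
        return x @ self.weight[expert_idx].t()

experts = ParallelExperts(num_experts=8, in_features=16, out_features=32)
y = experts(torch.randn(4, 16), expert_idx=3)
print(y.shape)  # torch.Size([4, 32])
```

Because no `nn.Linear` appears anywhere in this module, targeting the parameter by name (as `bnb_4bit_target_parameters` does) is the only way to reach these weights.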

Requires bitsandbytes-foundation/bitsandbytes#1720, which is being developed concurrently.

Example usage with a Granite MoE:

```python
import torch
from transformers import BitsAndBytesConfig, GraniteMoeForCausalLM

model = GraniteMoeForCausalLM.from_pretrained(
    "ibm-granite/granite-3.1-3b-a800m-base",
    torch_dtype=torch.bfloat16,
    device_map="cuda:0",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_use_double_quant=False,
        bnb_4bit_target_parameters=["block_sparse_moe.input_linear.weight", "block_sparse_moe.output_linear.weight"],
        llm_int8_skip_modules=["lm_head", "block_sparse_moe.router"],
    ),
)
```

Memory Usage - BF16

| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---|---|---|---|---|
| Allocated memory | 6291 MiB | 6292 MiB | 12583 MiB | 6292 MiB |
| Active memory | 6291 MiB | 6292 MiB | 12583 MiB | 6292 MiB |
| Requested memory | 6291 MiB | 6291 MiB | 12583 MiB | 6291 MiB |

Memory Usage - Before PR

| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---|---|---|---|---|
| Allocated memory | 6019 MiB | 6027 MiB | 9935 MiB | 3916 MiB |
| Active memory | 6019 MiB | 6027 MiB | 9935 MiB | 3916 MiB |
| Requested memory | 6015 MiB | 6024 MiB | 9929 MiB | 3913 MiB |

Memory Usage - After PR

| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---|---|---|---|---|
| Allocated memory | 1894 MiB | 2054 MiB | 9424 MiB | 7530 MiB |
| Active memory | 1894 MiB | 2054 MiB | 9424 MiB | 7530 MiB |
| Requested memory | 1875 MiB | 2035 MiB | 9389 MiB | 7513 MiB |
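The drop from 6019 MiB to 1894 MiB current allocated is roughly in line with NF4 storage costs. A back-of-the-envelope estimate (illustrative parameter count, not the exact model size; assumes 2 bytes per bf16 weight, 4 bits per NF4 weight, plus one fp32 absmax scale per 64-weight block, since double quantization is disabled here):

```python
def nf4_bytes(num_params: int, blocksize: int = 64) -> float:
    """Approximate NF4 storage: 4 bits per weight plus a 4-byte
    fp32 absmax scale per block (double quantization disabled)."""
    packed = num_params / 2               # two 4-bit codes per byte
    absmax = (num_params / blocksize) * 4 # one fp32 scale per block
    return packed + absmax

def bf16_bytes(num_params: int) -> int:
    return 2 * num_params

n = 3_000_000_000  # illustrative ~3B parameters
ratio = bf16_bytes(n) / nf4_bytes(n)
print(f"~{ratio:.1f}x smaller")  # ~3.6x
```

The observed ratio is somewhat lower than this ideal because the skipped modules (lm_head, router) remain in bf16.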

Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case. (See Slack discussion)
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?

Who can review?

@SunMarc @MekkCyber @BenjaminBossan

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Rocketknight1
Member

cc @MekkCyber


@SunMarc SunMarc left a comment


Nice! It would be great to add some tests (inference / saving) with the gptoss model.
