feat: Add ShortGPT pruning algorithm #436

sky-2002 · 2025-11-12T15:48:59Z

Description

Implements the ShortGPT pruning algorithm.
Refers to #418.

Related Issue

Fixes #418

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Additional Notes

I am still figuring out how to write tests for this. I found the test framework a little tricky and would appreciate some help there. In particular, I did not understand how run_full_integration obtains the dataset, tokenizer, and other resources required to run the tests.

I also found SmashConfigPrefixWrapper confusing, so I tested the algorithm directly with a plain dict instead:

from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

from pruna.algorithms.shortgpt import ShortGPT

algo = ShortGPT()

# Load the tokenizer and the model to prune.
llama_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
llama_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B", torch_dtype="auto")

# Small calibration set: the first 100 rows of the WikiText-2 training split.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:100]")

dl = DataLoader(dataset, batch_size=2, shuffle=True)

# Call _apply directly with a plain dict in place of a SmashConfigPrefixWrapper.
qmodel = algo._apply(llama_model, {
    "tokenizer": llama_tokenizer,
    "device": "cpu",
    "prune_ratio": 0.25,
    "angular": True,
    "train_dataloader": dl,
})
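
For reviewers unfamiliar with the options above: the sketch below shows roughly what prune_ratio and angular are expected to control, following the block-influence idea from the ShortGPT paper (score each layer by how much it changes its input, then drop the lowest-scoring layers). It is a standalone NumPy illustration, not the code in this PR; the function names block_influence and layers_to_prune are hypothetical, and the angular variant shown (arccos of the cosine similarity divided by pi) is an assumption about what the flag means.

```python
import numpy as np


def block_influence(h_in, h_out, angular=False):
    """Score one layer from its input/output hidden states, each of shape (tokens, dim)."""
    cos = np.sum(h_in * h_out, axis=-1) / (
        np.linalg.norm(h_in, axis=-1) * np.linalg.norm(h_out, axis=-1)
    )
    cos = np.clip(cos, -1.0, 1.0)  # guard against rounding just outside [-1, 1]
    if angular:
        # Assumed meaning of "angular": normalized angle between the two states.
        return float(np.mean(np.arccos(cos) / np.pi))
    return float(np.mean(1.0 - cos))


def layers_to_prune(scores, prune_ratio):
    """Return the indices of the lowest-scoring layers, in ascending order."""
    n_prune = int(len(scores) * prune_ratio)
    return sorted(np.argsort(scores)[:n_prune].tolist())
```

Under these assumptions, a prune_ratio of 0.25 on a model with 16 decoder layers would select the 4 layers whose output is most similar to their input.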


cursor · 2025-11-12T15:52:23Z

src/pruna/algorithms/shortgpt.py

+                bis[i] += bi
+            counts += 1
+
+        bis /= counts


Bug: NaNs: The Silent Killer of Predictable Behavior

If the dataloader is empty or yields no batches, counts remains 0, causing bis /= counts to produce NaN values. These NaN values propagate through np.argsort, resulting in unpredictable pruning behavior instead of a clear error message.
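
One possible fix is to fail fast when no batches were seen. The sketch below is a standalone illustration of that guard, not the PR's actual code: average_scores and its parameters are hypothetical names, and it assumes the per-batch scores arrive as one array of per-layer values per batch.

```python
import numpy as np


def average_scores(batch_scores, num_layers):
    """Average per-layer scores over batches, raising if there were no batches."""
    totals = np.zeros(num_layers, dtype=np.float64)
    count = 0
    for scores in batch_scores:
        totals += np.asarray(scores, dtype=np.float64)
        count += 1
    if count == 0:
        # Without this check, totals / count would be 0 / 0 and yield NaN.
        raise ValueError(
            "Calibration dataloader yielded no batches; cannot compute layer scores."
        )
    return totals / count
```

With this check, an empty dataloader surfaces as a ValueError at scoring time instead of NaN scores reaching np.argsort.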

github-actions · 2025-11-23T00:09:01Z

This PR has been inactive for 10 days and is now marked as stale.

sky-2002 added 4 commits November 9, 2025 23:36

add initial implementation for shortgpt

e3fb475

support dataloder

393163d

simplify capturing hidden states

2b8b019

add todo

15bc073

cursor bot reviewed Nov 12, 2025

github-actions bot added the stale label Nov 23, 2025
