-
Notifications
You must be signed in to change notification settings - Fork 279
Add Intel AutoRound algorithm support #1994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
kylesayrs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. There's a few more things I'd like to point out, will give a full review soon.
| if cur_layer_idx >= len(state.model.model.layers): | ||
| # skip the lm_head layer | ||
| return | ||
| decoding_layer = state.model.model.layers[cur_layer_idx] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sequential pipeline is not guaranteed to break a model into decoder layers. If a user specifies sequential_targets="Linear", then each SEQUENTIAL_EPOCH_END will trigger on each linear layer of the model.
One way to generalize this would be to have the sequential pipeline return the modules that were in the sequential layer
class Subgraph:
def get_modules(self, model: torch.nn.Module, recurse: bool = False) -> Set[torch.nn.Module]:
nodes = self.graph.find_nodes(op="call_module")
modules = set(model.get_submodule(node.target) for node in nodes)
if recurse:
modules = set(module.modules() for module in modules)
return modulesclass SequentialPipeline:
...
for subgraph in subgraphs:
LifecycleCallbacks.sequential_epoch_end(subgraph)def apply_autoround(self, state, subgraph):
decoding_layer = torch.nn.ModuleList(list(subgraph.modules()))There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI these changes are implemented here, and this PR can potentially rebase on them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation — this approach is definitely more robust. I’ve tested it locally, and it works well.
Will #1998 be merged soon? If so, I’d prefer to rebase on main and update my PR accordingly to avoid introducing too much code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have rebased the main branch and updated it accordingly.
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
…ve `layer_sequential` pipeline (#1998) ## Purpose ## * Enable better targeting of modules by modifiers such as [AutoRound](#1994) * Remove legacy pipeline (which is incompatible with this change) ## Changes ## * Pass subgraph to `sequential_epoch_end`, allowing modifiers to view all of the module that were called in the subgraph * Implement `submodules` method on `Subgraph` which returns all the modules called by this subgraph * Remove `LayerSequentialPipeline`, which does not use the `Subgraph` API and has been superseded by the sequential pipeline --------- Signed-off-by: Kyle Sayers <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Dipika Sikka <[email protected]>
brian-dellabetta
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! This is really cool. I have a few comments/questions in an initial review
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Co-authored-by: Brian Dellabetta <[email protected]> Signed-off-by: Yi Liu <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
…mpressor-fork into autoround-support
brian-dellabetta
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my comments! A few more small things:
| if BUILD_TYPE == "release" | ||
| else "compressed-tensors>=0.12.3a2" | ||
| ), | ||
| # TODO: replace it with the release version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: yiliu30 <[email protected]>
Co-authored-by: Brian Dellabetta <[email protected]> Signed-off-by: Yi Liu <[email protected]>
Hi @brian-dellabetta , We're planning to release the next version within the next 1–2 weeks—hope that works for you! |
Resolve #1968
Highlights
AutoRoundModifierto enable AutoRound quantization forwNa16.Attached eval cmd FYI.
Next stage (in later PRs)
cc @hshen14 @thuang6 @wenhuach21