xe: gemm: disable swapab for weights decompression #4867
Open
Simonsays095 wants to merge 1 commit into uxlfoundation:main from
atkassen
approved these changes
Mar 20, 2026
kealan-barbieri
approved these changes
Mar 20, 2026
Contributor
Author
make test perf-gpu
Addresses MFDNN-14797. We generally don't have type-transposed versions of weights-only quantization layers in the kernel catalog. When we attempt to use the `swap_ab` mechanic on these shapes, we fail to find a kernel and dispatch to the reference kernel instead.

Fixes the issue by disabling `swap_ab` in all weights-only quantization cases, so we look for the familiar A/B types in the catalog. The long-term solution will eventually be to make the catalog aware of when swapping is possible, and return a kernel that's tagged with `swap_ab` so we can search for both patterns and use the better one.
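
For reviewers less familiar with the dispatch path, here is a toy model of the failure and the fix. It is only a sketch, not the actual oneDNN code: the catalog contents, the type names, the `wants_swap_ab` heuristic, and every function name are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy catalog keyed by (A type, B type). Note there is no
# type-transposed ("u4", "f16") entry, mirroring the gap described above.
CATALOG = {
    ("f16", "f16"): "gemm_f16_f16",
    ("f16", "u4"): "gemm_f16_u4_wdecomp",
}

# Assumption: these are the low-precision types used for quantized weights.
QUANT_TYPES = {"u4", "s4", "u8", "s8"}


@dataclass(frozen=True)
class Problem:
    a_type: str
    b_type: str
    m: int
    n: int


def is_weights_decompression(p: Problem) -> bool:
    # Weights-only quantization: exactly one operand is a quantized type.
    return (p.a_type in QUANT_TYPES) != (p.b_type in QUANT_TYPES)


def wants_swap_ab(p: Problem) -> bool:
    # Hypothetical heuristic standing in for the real swap_ab decision.
    return p.m < p.n


def select_kernel(p: Problem, disable_swap_for_wdecomp: bool = True):
    swap = wants_swap_ab(p)
    if disable_swap_for_wdecomp and is_weights_decompression(p):
        # The fix sketched here: never swap in weights-only quantization
        # cases, so the lookup uses the familiar A/B types.
        swap = False
    key = (p.b_type, p.a_type) if swap else (p.a_type, p.b_type)
    # A failed lookup falls back to the reference kernel.
    return CATALOG.get(key, "reference"), swap


if __name__ == "__main__":
    wdecomp = Problem("f16", "u4", m=1, n=4096)
    print(select_kernel(wdecomp, disable_swap_for_wdecomp=False))
    print(select_kernel(wdecomp))
```

In this model, the swapped lookup for the weights-only quantization problem misses the catalog and lands on the reference kernel, while the lookup with swapping disabled finds the catalog entry. Problems with matching A/B types are unaffected and may still swap.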