xe: gemm: disable swapab for weights decompression #4867

Open
Simonsays095 wants to merge 1 commit into uxlfoundation:main from Simonsays095:no_swapab_woq
Conversation

@Simonsays095
Contributor

Addresses MFDNN-14797. We generally don't have type-transposed versions of weights-only quantization layers in the kernel catalog. When we attempt to use the swap_ab mechanism on these shapes, the catalog lookup fails to find a kernel and we dispatch to the reference kernel instead.

Fixes the issue by disabling swap_ab in all weights-only quantization cases, so we look for the familiar A/B types in the catalog. The long-term solution is to make the catalog aware of when swapping is possible and have it return a kernel tagged with swap_ab, so we can search for both patterns and use the better one.
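
For readers unfamiliar with the mechanism, here is a minimal sketch of the gating logic described above. It is not the code in this PR: `data_type`, `gemm_problem`, `swap_profitable`, `is_weights_decompression`, and `use_swap_ab` are hypothetical names, and it assumes weights-only quantization can be recognized as one operand being an integer type while the other is floating point.

```cpp
// Hypothetical, simplified data types; not the library's real enum.
enum class data_type { f32, f16, bf16, s8, u8, s4, u4 };

constexpr bool is_float(data_type t) {
    return t == data_type::f32 || t == data_type::f16
            || t == data_type::bf16;
}

constexpr bool is_int(data_type t) {
    return !is_float(t);
}

// Hypothetical problem descriptor used only for this illustration.
struct gemm_problem {
    data_type a_type;
    data_type b_type;
    // Stand-in for whatever heuristic would otherwise request the swap.
    bool swap_profitable;
};

// Assumption: weights-only quantization means exactly one operand is an
// integer type and the other is floating point.
constexpr bool is_weights_decompression(const gemm_problem &p) {
    return is_int(p.a_type) != is_int(p.b_type);
}

// Never swap A and B for weights decompression, so the catalog lookup uses
// the original A/B type pattern; otherwise defer to the heuristic.
constexpr bool use_swap_ab(const gemm_problem &p) {
    return !is_weights_decompression(p) && p.swap_profitable;
}
```

With this gate, a problem such as f16 activations with u4 weights keeps its original operand order even when the heuristic would prefer swapping, while same-kind type pairs (for example f16 with f16) are unaffected.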

@Simonsays095 Simonsays095 requested a review from a team as a code owner March 20, 2026 16:32
@github-actions github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Mar 20, 2026
@Simonsays095
Contributor Author

make test perf-gpu
set primitive=gpu:gemm

3 participants