remove xpu_fused_moe weights handling #163
mayuyuace wants to merge 1 commit into vllm-project:main from
Conversation
Signed-off-by: mayuyuace <qiming1.zhang@intel.com>
Pull request overview
This PR removes implicit weight-layout mutation/caching from xpu_fused_moe and updates tests to prepare weights in the expected layout externally, while also loosening activation-name matching for the SWIGLUOAI path.
Changes:
- Removed in-function transpose/caching for non-int4/non-mxfp4 weights and adjusted `inter_size` derivation based on the expected layout.
- Updated fused MoE tests to transpose `w13`/`w2` before calling `xpu_fused_moe`.
- Expanded SWIGLUOAI activation matching logic.
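The test-side change described above can be sketched as follows. This is a minimal illustration, not the actual test code; the expert count and sizes are made up, and the documented "original" layout (`[E, N, K]`-style, with the transposed `[E, K, N]` layout now expected by `xpu_fused_moe`) is an assumption based on the review discussion below.

```python
import torch

# Illustrative sizes (assumptions, not from the PR):
E, K, N = 4, 64, 128  # experts, hidden size, intermediate size

# Weights created in the previously documented layout...
w13 = torch.randn(E, 2 * N, K)  # gate+up projection, fused
w2 = torch.randn(E, K, N)       # down projection

# ...must now be transposed externally before calling xpu_fused_moe,
# since the kernel no longer transposes/caches weights internally.
w13 = w13.transpose(-1, -2).contiguous()
w2 = w2.transpose(-1, -2).contiguous()

# inter_size is now derived from the last dim of w13 (gate+up fused, so // 2).
inter_size = w13.shape[-1] // 2
```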
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| vllm_xpu_kernels/fused_moe_interface.py | Stops mutating weights for layout, changes inter_size inference, and adjusts SWIGLUOAI activation detection. |
| tests/fused_moe/test_fused_moe.py | Pre-transposes weights in tests to match the new expected layout. |
```python
    inter_size = list(w13.shape)[-1] // 2
else:
    inter_size = list(w13.shape)[-2] // 2

assert w13.is_contiguous() and w2.is_contiguous()
```
xpu_fused_moe now derives inter_size from w13.shape[-1] for non-int4/non-mxfp4, which implicitly requires w13/w2 to already be in the [E, K, N] layout (transposed vs the shapes documented above and previously handled in this function). To avoid silent wrong results/crashes when callers pass the old layout, please (1) update the parameter shape docs to match the new expectation and (2) add explicit shape/layout validation (e.g., check the dimension that should equal hidden_size, and that w13.shape[-1] is divisible by 2) before calling the GEMM op.
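The validation the reviewer asks for could look like the following. This is a hypothetical helper (name and exact checks are assumptions, not code from the PR); it covers the two checks named in the comment: that the dimension expected to equal `hidden_size` actually does, and that `w13.shape[-1]` is divisible by 2.

```python
import torch

def validate_moe_weight_layout(w13: torch.Tensor, w2: torch.Tensor,
                               hidden_size: int) -> int:
    """Hypothetical pre-GEMM layout checks, assuming the new [E, K, N] layout."""
    assert w13.is_contiguous() and w2.is_contiguous(), \
        "w13/w2 must be contiguous"
    assert w13.shape[-2] == hidden_size, (
        f"w13.shape[-2]={w13.shape[-2]} != hidden_size={hidden_size}; "
        "caller may have passed weights in the old (untransposed) layout")
    assert w13.shape[-1] % 2 == 0, \
        "w13 last dim must be divisible by 2 (fused gate+up)"
    return w13.shape[-1] // 2  # inter_size
```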
```diff
 elif activation == "gelu":
     torch.ops._C.gelu_and_mul(act_output, gemm1_output)
-elif activation == "swigluoai":
+elif activation == "swigluoai" or ("SWIGLUOAI" in str(activation)):
```
The condition ("SWIGLUOAI" in str(activation)) is overly broad: it will also match unrelated values like "NOT_SWIGLUOAI" and route them to the swiglu kernel. Safer is to normalize once (e.g., str(activation).lower()) and compare against an explicit allowlist of supported names (including any enum/string variants you intend to support).
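A sketch of the normalize-plus-allowlist approach suggested above. The helper name and the set of accepted variants are assumptions for illustration; the point is that substring matching is replaced with an exact comparison after normalization.

```python
# Assumed set of accepted spellings; extend with any enum/string
# variants the kernel actually intends to support.
_SWIGLUOAI_NAMES = {"swigluoai"}

def is_swigluoai(activation) -> bool:
    # str() also covers enum values like "ActivationType.SWIGLUOAI":
    # keep only the last dotted component, then lowercase once.
    name = str(activation).rsplit(".", 1)[-1].lower()
    return name in _SWIGLUOAI_NAMES
```

Unlike the substring check, this rejects unrelated values such as `"NOT_SWIGLUOAI"` while still accepting case and enum variants.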
```python
w13.data = w13.transpose(-1, -2).contiguous()
w2.data = w2.transpose(-1, -2).contiguous()
```
Avoid assigning through .data here. It bypasses autograd and can lead to hard-to-debug issues if this test is adapted/reused; since these are local variables, prefer rebinding w13/w2 to the transposed contiguous tensors (or otherwise copying safely) instead of mutating .data.
Suggested change:

```diff
-w13.data = w13.transpose(-1, -2).contiguous()
-w2.data = w2.transpose(-1, -2).contiguous()
+w13 = w13.transpose(-1, -2).contiguous()
+w2 = w2.transpose(-1, -2).contiguous()
```
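A minimal illustration of why rebinding is safer than mutating `.data` (a generic PyTorch sketch, not code from this PR): a rebound tensor stays on the autograd graph, whereas `.data` assignment bypasses autograd tracking entirely.

```python
import torch

w = torch.randn(2, 3, requires_grad=True)

# Rebinding creates a new tensor that autograd still tracks;
# transpose(...).contiguous() is a differentiable op chain.
w_t = w.transpose(-1, -2).contiguous()  # shape [3, 2]

# Gradients flow back to the original w as expected.
w_t.sum().backward()
```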
```python
w13.data = w13.transpose(-1, -2).contiguous()
w2.data = w2.transpose(-1, -2).contiguous()
```

Avoid assigning through .data here. It bypasses autograd and can lead to hard-to-debug issues if this test is adapted/reused; since these are local variables, prefer rebinding w13/w2 to the transposed contiguous tensors (or otherwise copying safely) instead of mutating .data.

Suggested change:

```diff
-w13.data = w13.transpose(-1, -2).contiguous()
-w2.data = w2.transpose(-1, -2).contiguous()
+w13 = w13.transpose(-1, -2).contiguous()
+w2 = w2.transpose(-1, -2).contiguous()
```
We should probably add a contiguous/stride check here to ensure correctness.
jikunshang left a comment
LGTM, though I'd like to put this on hold at least for this week so it doesn't block the weekly test.
Sure, I can add the check later.
Align with https://github.com/intel-innersource/applications.ai.gpu.vllm-xpu/pull/145