
support qwen3.5 input layout#190

Open
mayuyuace wants to merge 3 commits into vllm-project:main from mayuyuace:qiming/qwen3.5

Conversation


@mayuyuace mayuyuace commented Mar 12, 2026

Qwen3-Next and Qwen3.5 have different input layouts; this PR modifies the kernel to support Qwen3.5. The unit test is updated accordingly.
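The actual layouts are defined in the SYCL kernels; as a purely hypothetical illustration (the packing order, names, and shapes below are assumptions, not the real formats), splitting a packed QKVZ tensor under two such conventions might look like:

```python
import torch

# Hypothetical sketch of two packed-QKVZ conventions. Assumed (not taken
# from the real kernels): the legacy (Qwen3-Next-style) layout interleaves
# [q|k|v|z] per head, while the reordered (Qwen3.5-style) layout groups
# all q, then all k, then all v, then all z.
def split_qkvz(mixed, num_heads, head_dim, reorder_input):
    """Split [tokens, num_heads * 4 * head_dim] into q, k, v, z."""
    tokens = mixed.shape[0]
    if reorder_input:
        # grouped: [Q | K | V | Z], each block num_heads*head_dim wide
        q, k, v, z = mixed.split(num_heads * head_dim, dim=-1)
    else:
        # interleaved: per head, [q | k | v | z]
        per_head = mixed.view(tokens, num_heads, 4 * head_dim)
        q, k, v, z = (t.reshape(tokens, -1)
                      for t in per_head.split(head_dim, dim=-1))
    return q, k, v, z
```

A kernel that hard-codes one of these interpretations reads the wrong values when handed the other, which is why the layout has to be selected explicitly.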

vllm-xpu PR:
https://github.com/intel-innersource/applications.ai.gpu.vllm-xpu/pull/161

bash run-lm-eval-gsm-vllm-baseline.sh -m Qwen/Qwen3.5-9B,dtype=bfloat16 -b 20 -l 250 -f 5 -t 2

triton gdn: (lm-eval results screenshot)

sycl gdn: (lm-eval results screenshot)

Signed-off-by: mayuyuace <qiming1.zhang@intel.com>
Copilot AI review requested due to automatic review settings March 12, 2026 06:07

Copilot AI left a comment


Pull request overview

Adds support for an alternate QKVZ/BA input layout (Qwen 3.5) to the XPU GDN attention path by threading a reorder_input flag through the Torch op binding, C++ interface, and SYCL kernels, and updating the unit test to validate both layouts.

Changes:

  • Extend gdn_attention Torch op signature + C++ interface to accept reorder_input.
  • Update causal-conv1d (native + XE2 chunk) kernels to interpret inputs in either legacy (Qwen Next) or reordered (Qwen 3.5) layout.
  • Expand the GDN attention unit test to run with reorder_input=True/False and reorder the reference inputs accordingly.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Per-file summary:

  • tests/gdn_attn/test_gdn_attn.py — adds reorder_input parametrization and reference-side input reshaping to validate both layouts.
  • csrc/xpu/torch_bindings.cpp — updates the Torch library schema for gdn_attention to include reorder_input.
  • csrc/xpu/ops.h — extends the C++ op declaration with the reorder_input flag.
  • csrc/xpu/gdn_attn/gdn_attn_interface.cpp — wires reorder_input through to the causal-conv1d (native + XE2) launch paths.
  • csrc/xpu/gdn_attn/causal_conv1d.hpp — adds compile-time specialization for the reordered layout and dispatch based on reorder_input.
  • csrc/xpu/gdn_attn/xe_2/chunk_causal_conv1d_xe2.hpp — adds the reordered-layout specialization and runtime dispatch to the XE2 chunk kernels.
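The test-side change in the first file can be sketched with pytest's parametrize pattern. The parameter name reorder_input comes from this PR; the shapes and the reference-side reshaping below are illustrative assumptions, not the real test body.

```python
import pytest
import torch

# Sketch of running the same check under both layouts. Hypothetical
# dimensions and reshaping; the real test is tests/gdn_attn/test_gdn_attn.py.
@pytest.mark.parametrize("reorder_input", [False, True])
def test_gdn_attention_layouts(reorder_input):
    tokens, num_heads, head_dim = 4, 2, 8
    packed = torch.randn(tokens, num_heads, 4, head_dim)
    if reorder_input:
        # assumed reordered layout: gather each of q/k/v/z across heads
        ref = packed.permute(0, 2, 1, 3).reshape(tokens, -1)
    else:
        # assumed legacy layout: keep per-head [q|k|v|z] blocks
        ref = packed.reshape(tokens, -1)
    assert ref.shape == (tokens, num_heads * 4 * head_dim)
```

Parametrizing rather than duplicating the test keeps both layouts covered by the same assertions.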


In tests/gdn_attn/test_gdn_attn.py:

    ref_ssm_state[state_id],
    atol=atol,
    rtol=rtol)
if num_actual_tokens != 8192:

Add the following at the test-case entry?

if num_actual_tokens == 8192: 
    pytest.skip("FIXME, skip because of random error")


@jikunshang jikunshang left a comment


Overall LGTM.
