Add Sycl topk per row kernel #191

Open

wuxun-zhang wants to merge 5 commits into vllm-project:main from wuxun-zhang:dev/ds-v3.2
Conversation


@wuxun-zhang wuxun-zhang commented Mar 12, 2026

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing a test command.
  • The test results, such as pasting a before/after results comparison, or e2e results.
  • (Optional) Any necessary documentation updates, such as updating supported_models.md and examples for a new model.

PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS ABOVE HAVE BEEN CONSIDERED.

Purpose

Add a SYCL top_k_per_row kernel for prefill and decode.
Copilot adapted most of the code from https://github.com/vllm-project/vllm/blob/main/csrc/sampler.cu, except for the subgroup-level prefix-sum algorithm, since the CUDA version relies on the CUB library.
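The subgroup-level prefix sum mentioned above can be illustrated with a minimal Python sketch. This is not the kernel's actual code; it only simulates what a Hillis-Steele inclusive scan computes across the lanes of one SYCL subgroup (in real SYCL code this would typically be `sycl::inclusive_scan_over_group`), with each list index standing in for one work item:

```python
def subgroup_inclusive_scan(values):
    """Simulate an inclusive prefix sum over one subgroup's lanes.

    Uses the Hillis-Steele scan pattern: in each round, every lane
    adds the value held by the lane `step` positions to its left,
    and `step` doubles until it covers the whole subgroup.
    """
    n = len(values)
    result = list(values)
    step = 1
    while step < n:
        # All lanes update "in lockstep", emulated by building a new list.
        result = [
            result[i] + (result[i - step] if i >= step else 0)
            for i in range(n)
        ]
        step *= 2
    return result
```

For a subgroup holding `[1, 2, 3, 4]` this yields `[1, 3, 6, 10]`: each lane ends up with the sum of its own value and all values to its left, in O(log n) rounds rather than a serial O(n) pass.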

Test Plan

python -m pytest tests/test_topk_per_row.py -v

Test Result

All tests pass.

(Optional) Documentation Update

BEFORE SUBMITTING, PLEASE READ https://docs.vllm.ai/en/latest/contributing (anything written below this line will be removed by GitHub Actions)

wuxun-zhang and others added 4 commits March 12, 2026 06:44
Signed-off-by: Zhang, Wuxun <wuxun.zhang@intel.com>
Copilot AI review requested due to automatic review settings March 12, 2026 07:10
@wuxun-zhang wuxun-zhang mentioned this pull request Mar 12, 2026

Copilot AI left a comment


Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.



import torch

from tests.register_ops import topk_per_row_prefill, topk_per_row_decode
from vllm.platforms import current_platform

should not import any vllm code in this repo.


fixed

Compare results from CUDA top_k_per_row with torch.topk.
Both results should be sorted and contain the same top-k elements.
"""
num_rows = cuda_indices.shape[0]
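The comparison described in the docstring above can be sketched as follows. This is a minimal pure-Python stand-in, not the test's actual code (the real test operates on torch tensors, and the helper name is illustrative): since the kernel and `torch.topk` may break ties between equal logits differently, a robust check compares the set of selected indices per row rather than demanding identical ordering.

```python
def topk_results_match(kernel_indices, ref_indices):
    """Check that two per-row top-k index results agree.

    kernel_indices / ref_indices: one list of selected indices per row.
    Rows must select the same elements, but tie-breaking between
    equal values may order them differently, so compare per-row sets.
    """
    if len(kernel_indices) != len(ref_indices):
        return False
    for kernel_row, ref_row in zip(kernel_indices, ref_indices):
        if set(kernel_row) != set(ref_row):
            return False
    return True
```

For example, rows `[2, 0, 1]` and `[0, 1, 2]` match (same top-k elements, different tie order), while `[0, 1]` versus `[0, 2]` does not.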

Better to rename `cuda` here.


Renamed `cuda` to `xpu`.

Signed-off-by: yangqun <qun.yang@intel.com>