[0.9.1][MTP V1]MTP model adapt torchair #1840

JC-ut0 · 2025-07-17T02:50:39Z

What this PR does / why we need it?

Adapt the MTP model to torchair graph mode.

The MTP model uses the torchair graph only during the decode phase. The padding strategy runs a fixed 1+MTP tokens per batch, regardless of whether the main model accepts or rejects them. When generating the MTP hidden state, only the last index of the accepted tokens is taken for each batch.
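The index selection described above can be sketched in plain Python. This is an illustration only, not the code in this PR: the function names `last_accepted_indices` and `gather_rows` are hypothetical, and the sketch assumes that each request occupies a fixed block of 1+MTP slots in a flattened batch and that at least one token per request is always accepted.

```python
from typing import List, Sequence


def last_accepted_indices(num_accepted: Sequence[int], num_mtp_tokens: int) -> List[int]:
    """Return the flat index of the last accepted token for each request.

    Assumption (not taken from the PR): the batch is flattened so that
    request i owns slots [i * (1 + num_mtp_tokens), (i + 1) * (1 + num_mtp_tokens)).
    """
    slots = 1 + num_mtp_tokens  # fixed number of slots per request, accepted or not
    indices = []
    for req, accepted in enumerate(num_accepted):
        if not 1 <= accepted <= slots:
            raise ValueError(f"request {req}: accepted={accepted} not in [1, {slots}]")
        # Offset of the request's block plus the position of its last accepted token.
        indices.append(req * slots + accepted - 1)
    return indices


def gather_rows(hidden_states: Sequence[Sequence[float]], indices: Sequence[int]) -> List[Sequence[float]]:
    """Pick one hidden-state row per request from the padded batch."""
    return [hidden_states[i] for i in indices]


# Three requests, one MTP token each, so every request is padded to 2 slots.
accepted = [2, 1, 2]
idx = last_accepted_indices(accepted, num_mtp_tokens=1)
padded_hidden = [[float(i)] for i in range(6)]  # stand-in for real hidden states
selected = gather_rows(padded_hidden, idx)
```

Because the slot count per request is fixed, the padded batch keeps the same shape whatever the main model accepts, and only the gather indices change from step to step.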

Does this PR introduce any user-facing change?

How was this patch tested?

Tested with DP4/TP4, TP16/DP1, TP8/DP2, and prefill-decode disaggregation.

vllm_ascend/worker/mtp_proposer_v1.py

vllm_ascend/worker/model_runner_v1.py

Signed-off-by: xuyexiong <[email protected]>

JC-ut0 force-pushed the v0.9.1-dev branch from 48bd47e to eef0c30 Compare July 17, 2025 07:52

github-actions bot added the module:ops label Jul 22, 2025

NeverRaR suggested changes Jul 22, 2025

vllm_ascend/worker/mtp_proposer_v1.py (resolved)

vllm_ascend/worker/mtp_proposer_v1.py (resolved)

vllm_ascend/worker/model_runner_v1.py (resolved)

JC-ut0 force-pushed the v0.9.1-dev branch 3 times, most recently from c201ddb to aec1751 Compare July 23, 2025 06:33

JC-ut0 requested a review from NeverRaR July 23, 2025 06:57

JC-ut0 force-pushed the v0.9.1-dev branch 2 times, most recently from ec6ac82 to bb12ac5 Compare July 23, 2025 07:15

[0.9.1][MTP V1]MTP model adapt torchair

7ec2325

Signed-off-by: xuyexiong <[email protected]>

JC-ut0 force-pushed the v0.9.1-dev branch from bb12ac5 to 7ec2325 Compare July 23, 2025 07:40

ganyi1996ppo approved these changes Jul 23, 2025

ganyi1996ppo merged commit 4c369a8 into vllm-project:v0.9.1-dev Jul 23, 2025
17 checks passed
