[Bugfix] Fix broken MTP weight loading for FP8 KV Scales #27227

benchislett · 2025-10-20T21:03:38Z

Purpose

maybe_remap_kv_scale_name is part of the DeepSeek (DS) weight loading logic, but it is missing from the MTP weight loading path. This causes a crash when loading, for example, nvidia/DeepSeek-R1-0528-FP4 with MTP, since that checkpoint uses FP8 KV scales by default.
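For readers unfamiliar with the remap step, here is a minimal standalone sketch of the idea, not the actual vLLM code. `remap_kv_scale_name_sketch` and `load_weights_sketch` are hypothetical stand-ins, and the parameter names and suffix rules are made up for illustration; the real `maybe_remap_kv_scale_name` may apply different rules.

```python
from typing import Optional


def remap_kv_scale_name_sketch(name: str, params: dict) -> Optional[str]:
    """Hypothetical stand-in for vLLM's maybe_remap_kv_scale_name.

    Maps a checkpoint-side KV scale name onto the attention-layer
    parameter name, or returns None when the model has no such parameter.
    The suffix rules below are illustrative assumptions only.
    """
    for scale in (".k_scale", ".v_scale"):
        if name.endswith(scale) and ".attn" not in name:
            # e.g. "x.self_attn.k_proj.k_scale" -> "x.self_attn.attn.k_scale"
            prefix = name[: -len(scale)].rsplit(".", 1)[0]
            remapped = f"{prefix}.attn{scale}"
            return remapped if remapped in params else None
    # Not a KV scale (or already in model form): leave the name unchanged.
    return name


def load_weights_sketch(weights, params: dict) -> set:
    """Copy checkpoint values into params, remapping KV scale names first."""
    loaded = set()
    for name, value in weights:
        remapped = remap_kv_scale_name_sketch(name, params)
        if remapped is None:
            continue  # scale has no matching parameter; skip it
        if remapped not in params:
            # Without the remap step, KV scale names would end up here.
            raise KeyError(remapped)
        params[remapped] = value
        loaded.add(remapped)
    return loaded


# Made-up parameter and checkpoint names, for illustration only.
params = {"layers.0.self_attn.attn.k_scale": 1.0, "layers.0.mlp.weight": 0.0}
weights = [
    ("layers.0.self_attn.k_proj.k_scale", 0.5),
    ("layers.0.mlp.weight", 2.0),
    ("layers.0.self_attn.v_proj.v_scale", 0.25),  # no matching parameter
]
loaded = load_weights_sketch(weights, params)
```

In this sketch, dropping the remap call from the loop makes the first entry raise a `KeyError`, which is the same kind of failure described above for a loader that omits the step.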

Test Plan

LM Eval GSM8k of nvidia/DeepSeek-R1-0528-FP4 on 4xB200:

Test Result

local-completions (base_url=http://0.0.0.0:8049/v1/completions,model=nvidia/DeepSeek-R1-0528-FP4,tokenized_requests=False,tokenizer_backend=None,num_concurrent=128,timeout=120,max_retries=5), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 1
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.9530|±  |0.0058|
|     |       |strict-match    |     5|exact_match|↑  |0.9484|±  |0.0061|

Signed-off-by: Benjamin Chislett <[email protected]>

gemini-code-assist

Code Review

This pull request addresses a bug that caused a crash when loading models with FP8 KV scales using Multi-Token Prediction (MTP). The fix involves incorporating the maybe_remap_kv_scale_name function into the MTP weight loading process, which was previously missing. This change aligns the MTP weight loading logic with the existing implementation for the base model, ensuring that KV scale names are correctly remapped. The implementation is correct and directly resolves the reported issue. The code is clean and there are no further issues.

…t#27227) Signed-off-by: Benjamin Chislett <[email protected]> Signed-off-by: Alberto Perdomo <[email protected]>

apply maybe_remap_kv_scale_name in MTP load_weights

9d307ec

Signed-off-by: Benjamin Chislett <[email protected]>

benchislett requested a review from luccafong as a code owner October 20, 2025 21:03

benchislett added the bug Something isn't working label Oct 20, 2025

mergify bot added the deepseek Related to DeepSeek models label Oct 20, 2025

gemini-code-assist bot reviewed Oct 20, 2025

aarnphm enabled auto-merge (squash) October 20, 2025 21:12

github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 20, 2025

mgoin approved these changes Oct 20, 2025

vllm-bot merged commit f381cf2 into vllm-project:main Oct 21, 2025
55 of 58 checks passed

Zhuul pushed a commit to Zhuul/vllm that referenced this pull request Oct 21, 2025

[Bugfix] Fix broken MTP weight loading for FP8 KV Scales (vllm-projec…

323d7e8

…t#27227) Signed-off-by: Benjamin Chislett <[email protected]>

baonudesifeizhai pushed a commit to baonudesifeizhai/vllm that referenced this pull request Oct 21, 2025

[Bugfix] Fix broken MTP weight loading for FP8 KV Scales (vllm-projec…

f90c393

…t#27227) Signed-off-by: Benjamin Chislett <[email protected]>

Kay-Tian mentioned this pull request Oct 23, 2025

vLLM PR #27227 core file change alert Kay-Tian/vllm#12

Closed

0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025

[Bugfix] Fix broken MTP weight loading for FP8 KV Scales (vllm-projec…

9d41021

…t#27227) Signed-off-by: Benjamin Chislett <[email protected]> Signed-off-by: 0xrushi <[email protected]>

0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025

[Bugfix] Fix broken MTP weight loading for FP8 KV Scales (vllm-projec…

960d83a

…t#27227) Signed-off-by: Benjamin Chislett <[email protected]> Signed-off-by: 0xrushi <[email protected]>
