Commit f381cf2

Authored by Benjamin Chislett
[Bugfix] Fix broken MTP weight loading for FP8 KV Scales (#27227)
Signed-off-by: Benjamin Chislett <[email protected]>
1 parent 5ff5d94 commit f381cf2

File tree: 1 file changed (+8 / -1 lines)

vllm/model_executor/models/deepseek_mtp.py

Lines changed: 8 additions & 1 deletion
@@ -16,7 +16,10 @@
     ParallelLMHead,
     VocabParallelEmbedding,
 )
-from vllm.model_executor.model_loader.weight_utils import default_weight_loader
+from vllm.model_executor.model_loader.weight_utils import (
+    default_weight_loader,
+    maybe_remap_kv_scale_name,
+)
 from vllm.platforms import current_platform
 from vllm.sequence import IntermediateTensors
 
@@ -278,6 +281,10 @@ def load_weights(self, weights: Iterable[tuple[str, torch.Tensor]]) -> set[str]:
             if name.endswith(".bias") and name not in params_dict:
                 continue
 
+            name = maybe_remap_kv_scale_name(name, params_dict)
+            if name is None:
+                continue
+
             # According to DeepSeek-V3 Technical Report, MTP modules
             # shares embedding layer. We only load the first weights.
             if (
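
For context, the sketch below shows the weight-loading pattern this change restores for FP8 KV scales. The imports of maybe_remap_kv_scale_name and default_weight_loader come from the diff above; the wrapper function name, the surrounding loop, and the assumption that unmatched scale entries are simply skipped are illustrative, not code from this commit.

# Minimal sketch (not the actual vLLM implementation) of a load_weights-style
# loop that tolerates FP8 KV-scale entries in a checkpoint.
# Assumption: maybe_remap_kv_scale_name(name, params_dict) returns a parameter
# name present in params_dict, or None when the scale has no matching
# parameter and should be skipped rather than raising a KeyError.
from collections.abc import Iterable

import torch

from vllm.model_executor.model_loader.weight_utils import (
    default_weight_loader,
    maybe_remap_kv_scale_name,
)


def load_weights_sketch(
    model: torch.nn.Module,
    weights: Iterable[tuple[str, torch.Tensor]],
) -> set[str]:
    params_dict = dict(model.named_parameters())
    loaded: set[str] = set()
    for name, loaded_weight in weights:
        # Without this remap, checkpoint names such as "...self_attn.k_scale"
        # may not match any registered parameter and would break loading.
        name = maybe_remap_kv_scale_name(name, params_dict)
        if name is None:
            continue  # scale entry with no matching parameter: skip it
        param = params_dict[name]
        # Fall back to the default loader if the parameter has no custom one.
        weight_loader = getattr(param, "weight_loader", default_weight_loader)
        weight_loader(param, loaded_weight)
        loaded.add(name)
    return loaded

With this guard in place inside DeepSeek MTP's load_weights, checkpoints that ship FP8 KV scales are routed to the matching attention parameters instead of breaking MTP weight loading.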
