[HS-6944] Fix for deepseek distill models #359

nazneenn · 2025-09-10T15:01:24Z

In the current v1.22.0 release and in the future branch, DeepSeek distill models fail because they incorrectly match the DeepSeek if condition, which enables expert parallelism and results in the error: “Value error, Number of experts in the model must be greater than 0 when expert parallelism is enabled.”
This fix removes that dependency for DeepSeek distill models and ensures that expert parallelism is enabled only for non-distill DeepSeek models.
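
For reviewers, the intended behavior can be sketched roughly as follows. This is a minimal illustration, not the actual diff: the function names `is_deepseek_non_distill` and `should_enable_expert_parallel`, the name-based matching, and the expert counts are all assumptions made for this example.

```python
def is_deepseek_non_distill(model_name: str) -> bool:
    """Hypothetical helper: True for DeepSeek models, excluding distill variants."""
    name = model_name.lower()
    return "deepseek" in name and "distill" not in name


def should_enable_expert_parallel(model_name: str, num_experts: int) -> bool:
    """Hypothetical check: enable expert parallelism only when the model is a
    non-distill DeepSeek model and actually reports at least one expert."""
    return is_deepseek_non_distill(model_name) and num_experts > 0


if __name__ == "__main__":
    # Expert counts below are illustrative values, not read from a real config.
    print(should_enable_expert_parallel("deepseek-ai/DeepSeek-R1", 256))
    print(should_enable_expert_parallel("deepseek-ai/DeepSeek-R1-Distill-Llama-8B", 0))
```

The extra `num_experts > 0` guard is a defensive choice in this sketch: even if a model name matched unexpectedly, a model with no experts would not enable expert parallelism and so would not hit the validation error quoted above.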

root added 2 commits September 10, 2025 07:55

deepseek-distill changes

f04f5c2

deepseek-distill changes

92e494f

nazneenn requested review from afierka-intel, jikunshang, kzawora-intel, madamczyk-intel, mgawarkiewicz-intel, michalkuligowski, mswiniarsk, tzielinski-habana and xuechendi as code owners September 10, 2025 15:01
