@luccafong luccafong commented Sep 30, 2025

Fall back to eager mode for the MTP model when using v32.

The CUDA graph capture issue is still being debugged, so the MTP model falls back to eager mode to avoid crashing when the user enables MTP. The target model is not affected.
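
For readers unfamiliar with the pattern, here is a minimal sketch of the intended behavior, not the actual change in this PR. `RunnerConfig`, `draft_config_for`, and the `cuda_graph_broken` flag are hypothetical names made up for illustration; they are not vLLM classes or options. The sketch only shows that the draft (MTP) config is forced to eager while the target config is left untouched.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RunnerConfig:
    """Hypothetical per-model runner settings (not vLLM's real config class)."""
    model_name: str
    enforce_eager: bool = False  # True means: skip CUDA graph capture


def draft_config_for(target: RunnerConfig, draft_name: str,
                     cuda_graph_broken: bool) -> RunnerConfig:
    """Derive the draft (MTP) config from the target config.

    The draft inherits the target's settings, but is forced to eager mode
    while graph capture is known to be broken for it. The target config is
    frozen and never modified.
    """
    draft = replace(target, model_name=draft_name)
    if cuda_graph_broken:
        draft = replace(draft, enforce_eager=True)
    return draft


# Assumed scenario: the target runs with CUDA graphs, the draft cannot yet.
target = RunnerConfig(model_name="target")
draft = draft_config_for(target, "mtp-draft", cuda_graph_broken=True)
print(target.enforce_eager, draft.enforce_eager)
```

Keeping the override on a copy of the config is one way to make sure the workaround stays scoped to the draft model and can be removed once capture is fixed.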

@lichaojacobs

Is this fix associated with the issue mentioned here: vllm-project/recipes#73?
