[Bugfix] Adopt the new changes on disaggregated pd from vllm main branch #2122
Conversation
Signed-off-by: ganyi <[email protected]>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Looks good if CI passes.
Signed-off-by: ganyi <[email protected]>
    output = new_output

    assert isinstance(output, ModelRunnerOutput)
    return output if self.is_driver_worker else None
Can this be removed? I suppose this is useful in the SPMD case.
vLLM's main branch removed this in https://github.com/vllm-project/vllm/pull/21473/files; now every TP rank returns its own result, so the TP scenario would hit the None-input case with our previous code.
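To make the failure mode concrete, here is a minimal sketch of the behavior change under discussion. The class and method names are illustrative assumptions, not the actual vLLM Ascend worker code:

```python
# Illustrative sketch only; not the actual vLLM Ascend implementation.
from typing import Any, Optional


class ModelRunnerOutput:
    """Stand-in for vLLM's ModelRunnerOutput."""


class Worker:
    def __init__(self, is_driver_worker: bool):
        self.is_driver_worker = is_driver_worker

    def _run_model(self, scheduler_output: Any) -> ModelRunnerOutput:
        return ModelRunnerOutput()  # placeholder for the real forward pass

    # Previous pattern: only the driver worker returned its result, so
    # every non-driver TP rank handed back None.
    def execute_model_old(self, scheduler_output: Any) -> Optional[ModelRunnerOutput]:
        output = self._run_model(scheduler_output)
        assert isinstance(output, ModelRunnerOutput)
        return output if self.is_driver_worker else None

    # After vllm-project/vllm#21473: every TP rank returns its own
    # result, so the driver-only gate above would feed None into code
    # that now expects a per-rank ModelRunnerOutput.
    def execute_model_new(self, scheduler_output: Any) -> ModelRunnerOutput:
        output = self._run_model(scheduler_output)
        assert isinstance(output, ModelRunnerOutput)
        return output
```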
Hmm, but it may break the PP path, I guess. @MengqingCao can you take a look as well?
I think this is OK with PP, LGTM.
Double-checked with the contributors for the PP and RL cases; it's OK to make this change.
Codecov Report
✅ All modified and coverable lines are covered by tests.

    @@            Coverage Diff             @@
    ##             main    #2122      +/-   ##
    ==========================================
    + Coverage   74.41%   76.67%    +2.26%
    ==========================================
      Files         100      107        +7
      Lines       11208    11968      +760
    ==========================================
    + Hits         8340     9177      +837
    + Misses       2868     2791       -77
    hidden_states: torch.Tensor,
    num_scheduled_tokens: int,
    num_scheduled_tokens_np: np.ndarray,
    finished_sending: Optional[set[str]],
vLLM now uses kv_connector_metadata to store finished_sending and finished_receiving; @Potabk will fix it in a new PR.
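A hedged sketch of that direction: instead of threading finished_sending / finished_receiving through as separate function parameters (as in the diff above), the state rides along on a connector-metadata object attached to the runner output. The dataclass names and fields below are assumptions for illustration, not vLLM's exact API:

```python
# Hypothetical sketch of connector state carried on the output object.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class KVConnectorMetadata:  # hypothetical container
    finished_sending: Optional[set[str]] = None
    finished_receiving: Optional[set[str]] = None


@dataclass
class RunnerOutput:  # hypothetical stand-in for ModelRunnerOutput
    kv_connector_metadata: KVConnectorMetadata = field(
        default_factory=KVConnectorMetadata)


def attach_transfer_state(
        output: RunnerOutput,
        finished_sending: Optional[set[str]],
        finished_receiving: Optional[set[str]]) -> RunnerOutput:
    # The KV-transfer state travels with the output instead of being
    # extra positional parameters on the model-runner call.
    output.kv_connector_metadata.finished_sending = finished_sending
    output.kv_connector_metadata.finished_receiving = finished_receiving
    return output
```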
### What this PR does / why we need it?
vLLM's main branch merged vllm-project/vllm#21072 and vllm-project/vllm#21473 to support the Ray backend and fix some rebase bugs from earlier changes. Those changes break disaggregated prefill/decode in vLLM Ascend in some scenarios. In this PR, we adopt those changes to make sure the `llmdatddist_c_mgr_connector` works fine on the newest vLLM main branch.

### Does this PR introduce any user-facing change?
No user-facing change.

### How was this patch tested?
Relevant unit tests will be added to verify the functionality of these changes.

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@ad57f23