paulyu12
Collaborator

@paulyu12 paulyu12 commented Sep 4, 2025

What this PR does / why we need it?

This PR adds the deployment guide for prefiller/decoder (PD) disaggregation.

The scenario of the guide is:

  • 3 nodes in total, with 2 NPUs on each node
  • Qwen3-30B-A3B
  • 1P2D (1 prefiller, 2 decoders)
  • Expert Parallel

The deployment can be used to verify the PD disaggregation and Expert Parallel features with slightly fewer resources.
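
As a rough illustration of the scenario above, the sketch below expands the "1P2D" ratio into a list of instances. It assumes one instance per node and that each instance uses both NPUs on its node; that mapping, and the names `Instance`, `parse_pd_ratio`, and `build_layout`, are hypothetical and are not taken from the guide or from vLLM Ascend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    role: str  # "prefiller" or "decoder"
    node: int  # hypothetical node index, one instance per node (assumption)
    npus: int  # NPUs assigned to this instance


def parse_pd_ratio(spec: str) -> tuple[int, int]:
    """Parse a ratio string such as '1P2D' into (prefillers, decoders)."""
    prefillers, rest = spec.upper().split("P", 1)
    decoders = rest.rstrip("D")
    return int(prefillers), int(decoders)


def build_layout(spec: str, npus_per_node: int) -> list[Instance]:
    """Assign one instance to each node, prefillers first, then decoders."""
    prefillers, decoders = parse_pd_ratio(spec)
    roles = ["prefiller"] * prefillers + ["decoder"] * decoders
    return [Instance(role, node, npus_per_node) for node, role in enumerate(roles)]


layout = build_layout("1P2D", npus_per_node=2)
for inst in layout:
    print(inst)
```

Under these assumptions the layout has three instances on three nodes and uses six NPUs in total, which matches the node and NPU counts listed in the scenario.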

Does this PR introduce any user-facing change?

No.

How was this patch tested?

No.

github-actions bot commented Sep 4, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
  • Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

@github-actions github-actions bot added the documentation label Sep 4, 2025
@paulyu12 paulyu12 marked this pull request as ready for review September 4, 2025 14:03
@wangxiyuan
Collaborator

@Potabk please follow this guide to test locally and review it. Thanks.

@Potabk
Collaborator

Potabk commented Sep 5, 2025

@paulyu12 Have you tried deploying all instances on the same node?

@paulyu12
Collaborator Author

paulyu12 commented Sep 5, 2025

> @paulyu12 Have you tried deploying all instances on the same node?

Not yet, but I can try that scenario soon if needed.

@paulyu12 paulyu12 changed the title [DOC] Qwen3 PD disaggregateion user guide [DOC] Qwen3 PD disaggregation user guide Sep 5, 2025
@paulyu12 paulyu12 added the ready label Sep 5, 2025
@wangxiyuan
Collaborator

Thanks for the contribution

@wangxiyuan wangxiyuan merged commit a746f82 into vllm-project:main Sep 7, 2025
16 checks passed
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Sep 10, 2025
### What this PR does / why we need it?
This PR adds the deployment guide for prefiller/decoder (PD)
disaggregation.

The scenario of the guide is:
- 3 nodes in total, with 2 NPUs on each node
- Qwen3-30B-A3B
- 1P2D
- Expert Parallel

The deployment can be used to verify the PD disaggregation and Expert
Parallel features with slightly fewer resources.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
No.


- vLLM version: v0.10.1.1
- vLLM main:
vllm-project/vllm@e599e2c

---------

Signed-off-by: paulyu12 <[email protected]>
offline893 pushed a commit to offline893/vllm-ascend that referenced this pull request Sep 16, 2025
Signed-off-by: paulyu12 <[email protected]>
Signed-off-by: offline0806 <[email protected]>