Codex/topology separation research #1655

haowu1234 wants to merge 12 commits into vllm-project:main
Conversation
Force-pushed from 9b36500 to 19773e6
Signed-off-by: haowu1234 <13258260125@163.com>
.github/workflows/docker-publish.yml (outdated)

```diff
  strategy:
    matrix:
-     image: [dashboard, extproc, extproc-rocm, llm-katan, vllm-sr, vllm-sr-rocm, vllm-sr-sim]
+     image: [dashboard, extproc, extproc-rocm, llm-katan, vllm-sr, vllm-sr-envoy, vllm-sr-rocm, vllm-sr-router, vllm-sr-router-rocm, vllm-sr-sim]
```
we shouldn't increase the number of images, I think; we already have vllm-sr (now: router only) and the dashboard image, and Envoy has its own images that we shouldn't maintain
ok
Force-pushed from 1e27e46 to b874bc4
Signed-off-by: qingyangwu <qingyangwu@tencent.com>

Summary
This PR adds an opt-in split local runtime topology for `vllm-sr serve` while preserving the existing local default path. Instead of replacing the current monolithic `vllm-sr-container` flow, the split topology is gated behind explicit local-dev controls:

- `vllm-sr serve --topology split`
- `VLLM_SR_TOPOLOGY=split`

This keeps the blast radius small for existing local, Helm, Kubernetes, and operator workflows while allowing us to iterate on the split topology safely.
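The gating above might resolve roughly as follows. This is a minimal illustrative sketch, not the PR's actual implementation: the function name, the flag-over-environment precedence, and the `"single"` default value are all assumptions; only `--topology split` and `VLLM_SR_TOPOLOGY=split` come from the PR text.

```python
def resolve_topology(cli_flag, env):
    """Resolve the local runtime topology for `vllm-sr serve`.

    Assumed precedence: an explicit --topology flag wins, then the
    VLLM_SR_TOPOLOGY environment variable, then the legacy
    single-container default ("single" here is a hypothetical name).
    """
    if cli_flag is not None:
        return cli_flag
    return env.get("VLLM_SR_TOPOLOGY", "single")

# Workflows that pass neither control keep the legacy path:
print(resolve_topology(None, {}))                             # legacy default
print(resolve_topology(None, {"VLLM_SR_TOPOLOGY": "split"}))  # opt-in via env
print(resolve_topology("split", {}))                          # opt-in via flag
```

Because neither control is set by default, existing Helm/Kubernetes/operator paths never see the split branch.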
What changed

1. Add opt-in split local topology
   - `vllm-sr-container` is split into `router` / `envoy` / `dashboard` containers
2. Add role-specific local runtime images
   - `router`, `envoy`, `dashboard`
3. Preserve setup/config/deploy behavior under the split topology
4. Fix split-topology local runtime regressions discovered during rollout
   - `huggingface-cli` in router runtime images
5. Make split local dev faster and safer
   - `VLLM_SR_TOPOLOGY=split` now defaults local dev to skipping compatibility-image rebuilds
   - `SKIP_COMPAT_IMAGE=0` re-enables the compatibility-image rebuild when needed

Why
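As an illustration only, the role-specific containers in items 1 and 2 could be wired up along these lines. This compose-style fragment is a hypothetical sketch, not a file from the PR; the `router`/`envoy`/`dashboard` roles come from the description above, and the image names appear in the workflow matrix diff, but the service layout is assumed.

```yaml
# Hypothetical sketch of the split local topology
services:
  router:                    # role-specific router runtime
    image: vllm-sr-router
  envoy:                     # Envoy proxy in its own container
    image: vllm-sr-envoy
  dashboard:                 # dashboard UI split out of the monolith
    image: dashboard
```

The legacy default keeps everything in the single `vllm-sr-container` instead of these three services.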
The goal is to make local topology separation production-like enough for deep development and debugging without breaking the current developer path; this PR intentionally optimizes for that trade-off.
Validation

Validated through the repo-native harness and local smoke paths, including:

- `make agent-validate`
- `make vllm-sr-test`

Usage
Legacy local runtime remains unchanged:
AMD legacy local runtime:
Opt into split local topology explicitly:
Opt into split local topology on AMD:
If a split-topology debug loop still needs the compatibility image rebuilt:
Useful local runtime commands:
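The command blocks for the items above were lost in this excerpt; a plausible sketch follows. Only `--topology split`, `VLLM_SR_TOPOLOGY=split`, and `SKIP_COMPAT_IMAGE=0` appear in the PR text; the bare `vllm-sr serve` invocations are assumptions, and the exact AMD/ROCm selector is not shown here, so AMD variants are indicated only in comments.

```shell
# Legacy local runtime (unchanged default):
vllm-sr serve

# Opt into the split local topology explicitly:
vllm-sr serve --topology split
# ...or via the environment variable:
VLLM_SR_TOPOLOGY=split vllm-sr serve

# Re-enable the compatibility-image rebuild in a split debug loop
# (split mode skips it by default):
SKIP_COMPAT_IMAGE=0 vllm-sr serve --topology split

# AMD/ROCm runtimes use the same controls; the AMD-specific
# selector is not shown in this excerpt.
```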
Notes
This PR is intentionally scoped to local runtime behavior.
It does not migrate Helm / Kubernetes / operator deployment topologies to the split model. The change is designed so those surfaces are not directly restructured by this rollout.
Related to #1508