
Codex/topology separation research#1655

Open
haowu1234 wants to merge 12 commits into vllm-project:main from haowu1234:codex/topology-separation-research

Conversation

Collaborator

@haowu1234 haowu1234 commented Mar 25, 2026

Summary

This PR adds an opt-in split local runtime topology for vllm-sr serve while preserving the existing local default path.

Instead of replacing the current monolithic vllm-sr-container flow, the split topology is now gated behind explicit local-dev controls:

  • vllm-sr serve --topology split
  • VLLM_SR_TOPOLOGY=split

This keeps the blast radius small for existing local, Helm, Kubernetes, and operator workflows while allowing us to iterate on the split topology safely.
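The gating described above can be sketched as a small resolution helper (the precedence order, CLI flag over environment variable over legacy default, and the function name are assumptions for illustration, not the actual CLI code):

```python
import os

LEGACY, SPLIT = "legacy", "split"

def resolve_topology(cli_flag=None, env=None):
    """Pick the local runtime topology.

    Assumed precedence: explicit --topology flag, then the
    VLLM_SR_TOPOLOGY environment variable, then the legacy default.
    """
    env = os.environ if env is None else env
    value = cli_flag or env.get("VLLM_SR_TOPOLOGY") or LEGACY
    if value not in (LEGACY, SPLIT):
        raise ValueError(f"unknown topology: {value!r}")
    return value
```

With no flag and no environment variable, this resolves to `legacy`, which preserves the existing single-container path as the default.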

What changed

1. Add opt-in split local topology

  • Keep legacy single-container local runtime as the default
  • Add explicit split topology selection in the CLI
  • Route local startup through either:
    • legacy vllm-sr-container
    • split router/envoy/dashboard containers

2. Add role-specific local runtime images

  • Introduce dedicated local runtime images for:
    • router
    • envoy
    • dashboard
  • Keep compatibility image support for legacy local runtime
  • Wire image selection through the local CLI and Make-based workflows
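Using the image names visible in the workflow matrix later in this PR (vllm-sr-router, vllm-sr-router-rocm, vllm-sr-envoy, dashboard), the per-role selection might look roughly like this; the function name, tag handling, and exact mapping are illustrative assumptions:

```python
def local_images(topology, platform="cpu", tag="latest"):
    """Map runtime roles to local image names (illustrative sketch).

    The legacy topology keeps the single compatibility image; the
    split topology uses one image per role, with a ROCm variant of
    the router image for the AMD platform.
    """
    suffix = "-rocm" if platform == "amd" else ""
    if topology == "legacy":
        return {"all-in-one": f"vllm-sr{suffix}:{tag}"}
    return {
        "router": f"vllm-sr-router{suffix}:{tag}",
        "envoy": f"vllm-sr-envoy:{tag}",
        "dashboard": f"dashboard:{tag}",
    }
```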

3. Preserve setup/config/deploy behavior under split topology

  • Support dashboard-first setup mode with router/envoy on standby
  • Switch split setup activation to container lifecycle operations
  • Preserve config apply / runtime config sync semantics
  • Preserve router hot reload behavior after config updates

4. Fix split-topology local runtime regressions discovered during rollout

  • Restore dashboard auth SQLite bootstrap
  • Mount the dashboard Docker socket correctly before image args
  • Include huggingface-cli in router runtime images
  • Build dashboard WASM assets in both dev and runtime image paths
  • Harden Envoy readiness/status detection in split topology

5. Make split local dev faster and safer

  • VLLM_SR_TOPOLOGY=split now defaults local dev to skipping compatibility-image rebuilds
  • explicit override remains available with SKIP_COMPAT_IMAGE=0
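The default-flip with explicit override can be sketched as follows (the helper name and the Make-to-CLI plumbing are assumptions; only the variable names come from this PR):

```python
import os

def skip_compat_image(env=None):
    """Decide whether to skip rebuilding the compatibility image.

    SKIP_COMPAT_IMAGE, when set, always wins ("0" forces a rebuild,
    anything else skips). Otherwise the split topology defaults to
    skipping and the legacy topology defaults to rebuilding.
    """
    env = os.environ if env is None else env
    override = env.get("SKIP_COMPAT_IMAGE")
    if override is not None:
        return override != "0"
    return env.get("VLLM_SR_TOPOLOGY") == "split"
```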

Why

The goal is to make local topology separation production-like enough for deep development and debugging without breaking the current developer path.

This PR intentionally optimizes for:

  • minimal disruption to existing users
  • explicit rollout control
  • easier regression isolation
  • safer future extension into other deployment surfaces

Validation

Validated through the repo-native harness and local smoke paths, including:

  • make agent-validate
  • make vllm-sr-test
  • focused pytest coverage for CLI topology/image selection
  • local smoke on legacy topology
  • local smoke on explicit split topology
  • dashboard / setup / runtime handler tests for split container lifecycle

Usage

Legacy local runtime remains unchanged:

make vllm-sr-dev
vllm-sr serve --image-pull-policy never

AMD legacy local runtime:

make vllm-sr-dev VLLM_SR_PLATFORM=amd
vllm-sr serve --image-pull-policy never --platform amd

Opt into split local topology explicitly:

make vllm-sr-dev VLLM_SR_TOPOLOGY=split
vllm-sr serve --image-pull-policy never --topology split

Opt into split local topology on AMD:

make vllm-sr-dev VLLM_SR_PLATFORM=amd VLLM_SR_TOPOLOGY=split
vllm-sr serve --image-pull-policy never --platform amd --topology split

If a split-topology debug loop still needs the compatibility image rebuilt:

make vllm-sr-dev VLLM_SR_TOPOLOGY=split SKIP_COMPAT_IMAGE=0
make vllm-sr-dev VLLM_SR_PLATFORM=amd VLLM_SR_TOPOLOGY=split SKIP_COMPAT_IMAGE=0

Useful local runtime commands:

vllm-sr status all
vllm-sr logs router -f
vllm-sr logs envoy -f
vllm-sr logs dashboard -f
vllm-sr stop

Notes

This PR is intentionally scoped to local runtime behavior.

It does not migrate Helm / Kubernetes / operator deployment topologies to the split model. The change is designed so those surfaces are not directly restructured by this rollout.

Related to #1508


netlify bot commented Mar 25, 2026

Deploy Preview for vllm-semantic-router ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 50df862 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vllm-semantic-router/deploys/69c4cdacd0130b0008cedfe5 |
| 😎 Deploy Preview | https://deploy-preview-1655--vllm-semantic-router.netlify.app |

Contributor

github-actions bot commented Mar 25, 2026

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • .github/workflows/docker-publish.yml
  • .github/workflows/integration-test-memory.yml
  • .github/workflows/integration-test-vllm-sr-cli.yml
  • docs/agent/environments.md
  • docs/agent/plans/README.md
  • docs/agent/plans/pl-0015-local-runtime-topology-separation.md
  • docs/agent/plans/pl-0016-local-runtime-three-image-rollout.md

📁 dashboard

Owners: @JaredforReal, @Xunzhuo
Files changed:

  • dashboard/backend/Dockerfile
  • dashboard/backend/handlers/logs.go
  • dashboard/backend/handlers/logs_test.go
  • dashboard/backend/handlers/openclaw.go
  • dashboard/backend/handlers/openclaw_helpers.go
  • dashboard/backend/handlers/openclaw_image_test.go
  • dashboard/backend/handlers/openclaw_provision.go
  • dashboard/backend/handlers/openclaw_test.go
  • dashboard/backend/handlers/runtime_config_apply.go
  • dashboard/backend/handlers/runtime_config_apply_test.go
  • dashboard/backend/handlers/runtime_config_sync.go
  • dashboard/backend/handlers/runtime_config_sync_test.go
  • dashboard/backend/handlers/runtime_managed_container_lifecycle.go
  • dashboard/backend/handlers/runtime_managed_container_lifecycle_test.go
  • dashboard/backend/handlers/runtime_managed_services.go
  • dashboard/backend/handlers/runtime_managed_services_test.go
  • dashboard/backend/handlers/setup.go
  • dashboard/backend/handlers/setup_test.go
  • dashboard/backend/handlers/status_collectors.go
  • dashboard/backend/handlers/status_collectors_test.go
  • dashboard/backend/handlers/status_runtime.go

📁 e2e

Owners: @Xunzhuo
Files changed:

  • e2e/config/config.agent-smoke.amd.yaml
  • e2e/config/config.agent-smoke.cpu.yaml
  • e2e/testing/run_memory_integration.sh
  • e2e/testing/vllm-sr-cli/README.md
  • e2e/testing/vllm-sr-cli/cli_test_base.py
  • e2e/testing/vllm-sr-cli/run_cli_tests.py
  • e2e/testing/vllm-sr-cli/test_integration.py
  • e2e/testing/vllm-sr-cli/test_unit_runtime_topology.py
  • e2e/testing/vllm-sr-cli/test_unit_serve.py

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/modeldownload/config_parser_test.go
  • src/vllm-sr/Dockerfile
  • src/vllm-sr/Dockerfile.envoy
  • src/vllm-sr/Dockerfile.rocm
  • src/vllm-sr/Dockerfile.router
  • src/vllm-sr/Dockerfile.router.rocm
  • src/vllm-sr/Makefile
  • src/vllm-sr/cli/commands/runtime.py
  • src/vllm-sr/cli/commands/runtime_support.py
  • src/vllm-sr/cli/config_generator.py
  • src/vllm-sr/cli/consts.py
  • src/vllm-sr/cli/core.py
  • src/vllm-sr/cli/docker_backend.py
  • src/vllm-sr/cli/docker_images.py
  • src/vllm-sr/cli/docker_run_command.py
  • src/vllm-sr/cli/docker_runtime.py
  • src/vllm-sr/cli/docker_services.py
  • src/vllm-sr/cli/docker_start.py
  • src/vllm-sr/cli/runtime_lifecycle.py
  • src/vllm-sr/cli/runtime_stack.py
  • src/vllm-sr/cli/runtime_topology.py
  • src/vllm-sr/cli/templates/envoy.template.yaml
  • src/vllm-sr/rebuild-and-test.sh
  • src/vllm-sr/start-dashboard.sh
  • src/vllm-sr/start-envoy.sh
  • src/vllm-sr/start-router.sh
  • src/vllm-sr/tests/test_cli_main.py
  • src/vllm-sr/tests/test_config_generator.py
  • src/vllm-sr/tests/test_core_status.py
  • src/vllm-sr/tests/test_deployment_backend.py
  • src/vllm-sr/tests/test_docker_images.py
  • src/vllm-sr/tests/test_docker_runtime.py
  • src/vllm-sr/tests/test_local_dev_make_surface.py
  • src/vllm-sr/tests/test_makefile_surface.py
  • src/vllm-sr/tests/test_openclaw_shared_network.py
  • src/vllm-sr/tests/test_router_dockerfile_surface.py
  • src/vllm-sr/tests/test_runtime_topology.py
  • src/vllm-sr/tests/test_split_runtime_stack.py

📁 tools

Owners: @yuluo-yx, @rootfs, @Xunzhuo
Files changed:

  • tools/agent/repo-manifest.yaml
  • tools/make/agent.mk
  • tools/make/dashboard.mk
  • tools/make/docker.mk


🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@haowu1234 haowu1234 requested review from Xunzhuo and removed request for Xunzhuo March 25, 2026 13:06
Contributor

github-actions bot commented Mar 25, 2026

✅ Supply Chain Security Report — All Clear

| Scanner | Findings |
| --- | --- |
| AST Codebase Scan (Py, Go, JS/TS, Rust) | 28 finding(s) — MEDIUM: 21 · LOW: 7 |
| AST PR Diff Scan | 1 finding(s) — LOW: 1 |
| Regex Fallback Scan | No issues detected |

Findings in this PR's diff

1 finding(s)

| Severity | File | Line | Description |
| --- | --- | --- | --- |
| 🔵 LOW | dashboard/backend/handlers/openclaw_test.go | 529 | Function 'TestResolveOpenClawModelBaseURL_TargetEnvoyWinsOverRouterConfig' has high source entropy (5.51 bits/byte) |

Scanned at 2026-03-26T04:34:30.537Z · View full workflow logs

@haowu1234 haowu1234 force-pushed the codex/topology-separation-research branch from 9b36500 to 19773e6 Compare March 25, 2026 13:32
Signed-off-by: haowu1234 <13258260125@163.com>
strategy:
  matrix:
-   image: [dashboard, extproc, extproc-rocm, llm-katan, vllm-sr, vllm-sr-rocm, vllm-sr-sim]
+   image: [dashboard, extproc, extproc-rocm, llm-katan, vllm-sr, vllm-sr-envoy, vllm-sr-rocm, vllm-sr-router, vllm-sr-router-rocm, vllm-sr-sim]
Member
I don't think we should add more images. We already have vllm-sr (now router-only) and the dashboard image, and Envoy ships its own upstream images that we shouldn't maintain ourselves.

Collaborator Author

> we shouldnt increase the images i think, we already have vllm-sr(now: router only), and dashboard image, and envoy has its own images and we shouldnt maintain

ok

@haowu1234 haowu1234 force-pushed the codex/topology-separation-research branch from 1e27e46 to b874bc4 Compare March 26, 2026 02:38
qingyangwu and others added 4 commits March 26, 2026 11:21
Signed-off-by: qingyangwu <qingyangwu@tencent.com>
Signed-off-by: haowu1234 <13258260125@163.com>
Signed-off-by: qingyangwu <qingyangwu@tencent.com>
Signed-off-by: qingyangwu <qingyangwu@tencent.com>