[Doc] Refresh homepage architecture and research content #1659
Xunzhuo merged 5 commits into vllm-project:main
Conversation
Signed-off-by: xunzhuo <xunzhuo@vllm-semantic-router.ai>
✅ Deploy Preview for vllm-semantic-router ready!
👥 vLLM Semantic Team Notification: The following members have been identified for the changed files in this PR and have been automatically assigned.
✅ Supply Chain Security Report — All Clear
Pull request overview
Refreshes the website homepage + docs architecture narrative to reflect the updated “signal → projection → decision → plugin” mental model, and updates the publications dataset used across the site.
Changes:
- Update homepage stats/capability copy and add a new “projection” capability layer (including a new glyph).
- Expand PaperFigureShowcase and Chinese i18n strings to describe the four-layer architecture and 14-signal taxonomy.
- Add a new research paper entry and update overview/tutorial docs to reference the 14 maintained signal families.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| website/src/pages/index.tsx | Updates homepage stats (signals/papers) and capability copy to match the refreshed architecture. |
| website/src/data/researchContent.js | Adds a new research paper entry used by homepage/publications components. |
| website/src/components/site/CapabilityGlyph.tsx | Adds a new projection glyph kind and renderer. |
| website/src/components/PaperFigureShowcase/index.tsx | Updates figure copy and the interactive architecture panel to a 4-layer flow and 14-signal taxonomy. |
| website/src/components/PaperFigureShowcase/index.module.css | Adjusts the figure layer grid to support 4 layers (extra connector/column). |
| website/i18n/zh-Hans/code.json | Updates zh-Hans translations for the revised figure copy and adds new figure keys. |
| website/docs/tutorials/signal/overview.md | Updates signal catalog summary and table formatting for 14 families (5 heuristic, 9 learned). |
| website/docs/overview/semantic-router-overview.md | Updates the overview architecture diagram/sections to include projection coordination and plugin dispatch. |
| website/docs/overview/goals.md | Refreshes goals copy to reference 14 families and projection coordination. |
| website/docs/overview/collective-intelligence.md | Updates diagrams and examples to include projection coordination and updated signal counts. |
| website/docs/intro.md | Refreshes the “Core System” framing to include projections and updated signal families list. |
| website/blog/2026-03-25-vllm-sr-on-amd-developer-cloud.md | Updates blog copy to match the refreshed routing architecture language and linking. |
In `website/blog/2026-03-25-vllm-sr-on-amd-developer-cloud.md`, under "### 3. Personal AI and Local Personal Agents", the PR changes:

```diff
- The third opportunity is personal AI. Once routing, privacy, and reasoning are expressed as policy, an AMD-hosted stack can support assistants that feel more personal and more controlled. A personal AI system can keep ordinary tasks, memory-aware follow-ups, and private context on a local lane, while only escalating special cases when explicitly permitted.
+ The third opportunity is personal AI like deploying a personal model on AMD AI MAX+ and connecting to external Models as needed. Once routing, privacy, and reasoning are expressed as policy, an AMD-hosted stack can support assistants that feel more personal and more controlled. A personal AI system can keep ordinary tasks, memory-aware follow-ups, and private context on a local lane, while only escalating special cases when explicitly permitted.
```

Copilot comment: Minor style/grammar: "connecting to external Models" capitalizes "models" mid-sentence, and the sentence reads as a run-on. Consider lowercasing "models" and splitting it into two sentences for clarity.

Suggested change:

```diff
- The third opportunity is personal AI like deploying a personal model on AMD AI MAX+ and connecting to external Models as needed. Once routing, privacy, and reasoning are expressed as policy, an AMD-hosted stack can support assistants that feel more personal and more controlled. A personal AI system can keep ordinary tasks, memory-aware follow-ups, and private context on a local lane, while only escalating special cases when explicitly permitted.
+ The third opportunity is personal AI. You might deploy a personal model on AMD AI MAX+ and connect to external models as needed. Once routing, privacy, and reasoning are expressed as policy, an AMD-hosted stack can support assistants that feel more personal and more controlled. A personal AI system can keep ordinary tasks, memory-aware follow-ups, and private context on a local lane, while only escalating special cases when explicitly permitted.
```
In the intelligent-routing section, the PR changes:

```diff
  The most immediate opportunity is intelligent routing. A single ROCm backend on AMD Developer Cloud can serve as the physical execution layer for multiple logical lanes. That means teams can prototype a Mixture-of-Models experience, cost-aware routing, replay-driven debugging, and tiered product behavior without first standing up a large multi-backend fleet.
- In the AMD reference profile, the cheapest, medium, complex, reasoning, and premium lanes all resolve onto one self-hosted Qwen backend. The router still gives you differentiated behavior because the policy lives in signals, projections, and decisions, not only in the number of containers you run.
+ In the AMD reference profile, the cheapest, medium, complex, reasoning, and premium lanes all resolve onto different models. The router still gives you differentiated behavior because the policy lives in signals, projections, and decisions, not only in the number of containers you run.
```

Copilot comment: The post says the SIMPLE/MEDIUM/COMPLEX/REASONING/PREMIUM lanes "resolve onto different models", but the referenced deploy/recipes/balance.yaml maps each of those providers.models[*] entries to the same backend_refs endpoint (vllm_primary at vllm:8000). If the intent is the single-backend alias demo, please update this sentence to say the lanes resolve onto one backend via multiple served model names/aliases; if the intent is truly different models, the earlier "single ROCm backend" section and recipe reference should be updated to match.

Suggested change:

```diff
- In the AMD reference profile, the cheapest, medium, complex, reasoning, and premium lanes all resolve onto different models. The router still gives you differentiated behavior because the policy lives in signals, projections, and decisions, not only in the number of containers you run.
+ In the AMD reference profile, the cheapest, medium, complex, reasoning, and premium lanes each resolve onto distinct served model names/aliases that all point to the same ROCm-backed vLLM endpoint. The router still gives you differentiated behavior because the policy lives in signals, projections, and decisions, not only in the number of containers you run.
```
Under "### 2. Privacy Routing and Local-First Governance", the PR changes:

```diff
- The second opportunity is privacy routing. This repository already includes a maintained privacy recipe that keeps PII, private code, internal documents, and suspicious prompts on a local lane while only escalating clearly non-sensitive reasoning work when policy allows it. That pattern is especially meaningful on AMD because it supports a local-first deployment story: keep sensitive traffic on infrastructure you control, audit every decision, and make cloud escalation a governed exception instead of the default.
+ The second opportunity is privacy routing, that keeps PII, private code, internal documents, and suspicious prompts on a local lane while only escalating clearly non-sensitive reasoning work when policy allows it. That pattern is especially meaningful on AMD because it supports a local-first deployment story: keep sensitive traffic on infrastructure you control, audit every decision, and make cloud escalation a governed exception instead of the default.
```

Copilot comment: Grammar/readability: "privacy routing, that keeps …" reads like an incorrect relative clause. Consider removing the comma ("privacy routing that keeps …") or rewriting the sentence to avoid the comma splice.

Suggested change:

```diff
- The second opportunity is privacy routing, that keeps PII, private code, internal documents, and suspicious prompts on a local lane while only escalating clearly non-sensitive reasoning work when policy allows it. That pattern is especially meaningful on AMD because it supports a local-first deployment story: keep sensitive traffic on infrastructure you control, audit every decision, and make cloud escalation a governed exception instead of the default.
+ The second opportunity is privacy routing that keeps PII, private code, internal documents, and suspicious prompts on a local lane while only escalating clearly non-sensitive reasoning work when policy allows it. That pattern is especially meaningful on AMD because it supports a local-first deployment story: keep sensitive traffic on infrastructure you control, audit every decision, and make cloud escalation a governed exception instead of the default.
```
In the docs, the PR compresses two pseudocode examples onto single lines:

```diff
 # Traditional: Simple keyword matching
-if "math" in query:
-    route_to_math_model()
+if "math" in query: route_to_math_model()
```

Signal-driven routing uses multiple signals:

```diff
 # Signal-driven: Multiple signals combined
-if (has_math_keywords AND is_math_domain) OR has_high_math_embedding:
-    route_to_math_model()
+if (has_math_keywords AND is_math_domain) OR has_high_math_embedding: route_to_math_model()
```

Copilot comment: This section uses fenced code blocks labeled as yaml, but the contents are pseudo-Python (`if ...: route_to_math_model()`) and not valid YAML. Please change the fence language to something appropriate (e.g., `python` or `text`) or rewrite the examples into valid YAML so syntax highlighting and copy/paste behavior aren't misleading.
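If the docs go the `python` route, the snippets could also be made genuinely runnable. A minimal sketch of the signal-combination idea follows; the signal names and the 0.8 embedding threshold are illustrative assumptions, not part of the semantic-router API.

```python
def route(query: str, is_math_domain: bool, math_embedding_score: float) -> str:
    """Toy signal-driven router: combine several signals instead of one keyword check."""
    # Heuristic signal: keyword match (the "traditional" approach on its own)
    has_math_keywords = "math" in query.lower()
    # Learned signal: embedding similarity to the math domain (assumed threshold)
    has_high_math_embedding = math_embedding_score > 0.8

    # Signal-driven: multiple signals combined, mirroring the docs' pseudocode
    if (has_math_keywords and is_math_domain) or has_high_math_embedding:
        return "math_model"
    return "general_model"


print(route("help with my math homework", True, 0.3))   # keywords + domain agree
print(route("integrate x^2 from 0 to 1", False, 0.92))  # embedding alone triggers
```

Either signal path is sufficient on its own, which is the point the docs' OR-clause is making: keyword misses (no "math" in the second query) no longer cause misroutes.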
Signed-off-by: xunzhuo <xunzhuo@vllm-semantic-router.ai>

Summary
Validation
- `cd website && npm run lint`
- `make markdown-lint`
- `make agent-report` (classified this change as `documentation-only` and reported validation commands: none)

Checklist

- `[Doc]` title prefix
- `git commit -s`