[Doc] Refresh homepage architecture and research content #1659

Merged
Xunzhuo merged 5 commits into vllm-project:main from Xunzhuo:vsr/docs-homepage-research-refresh
Mar 26, 2026

Conversation

Xunzhuo (Member) commented Mar 26, 2026

Summary

  • Scope: refresh homepage stats, architecture copy, current docs architecture descriptions, and research publications content
  • Primary skill: cross-stack-bugfix
  • Impacted surfaces: docs_examples
  • Conditional surfaces intentionally skipped: local_e2e, ci_e2e (documentation-only homepage/docs change)
  • Behavior-visible change: yes
  • Debt entry: none

Validation

  • Environment: not run
  • Fast gate: cd website && npm run lint; make markdown-lint
  • Feature gate: not run (make agent-report classified this change as documentation-only and reported validation commands: none)
  • Local smoke / E2E: not run
  • CI expectations / blockers: none beyond standard docs/site checks

Checklist

  • PR title uses the repo prefix format: [Doc]
  • If the PR spans multiple categories, the title includes all relevant prefixes
  • Commits in this PR are signed off with git commit -s
  • Source-of-truth docs or indexed debt entries were updated when applicable
  • The validation results above reflect the actual commands or blockers for this change

Xunzhuo added 2 commits March 26, 2026 10:29
Signed-off-by: xunzhuo <xunzhuo@vllm-semantic-router.ai>
Signed-off-by: xunzhuo <xunzhuo@vllm-semantic-router.ai>
Xunzhuo requested a review from rootfs as a code owner March 26, 2026 02:56
Copilot AI review requested due to automatic review settings March 26, 2026 02:56
netlify bot commented Mar 26, 2026

Deploy Preview for vllm-semantic-router ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 42cfd6f |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vllm-semantic-router/deploys/69c4b2e8959fe10008d8a2a6 |
| 😎 Deploy Preview | https://deploy-preview-1659--vllm-semantic-router.netlify.app |

To edit notification comments on pull requests, go to your Netlify project configuration.

github-actions bot (Contributor) commented Mar 26, 2026

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 website

Owners: @Xunzhuo, @rootfs, @yuluo-yx
Files changed:

  • website/blog/2026-03-25-vllm-sr-on-amd-developer-cloud.md
  • website/docs/intro.md
  • website/docs/overview/collective-intelligence.md
  • website/docs/overview/goals.md
  • website/docs/overview/semantic-router-overview.md
  • website/docs/tutorials/signal/overview.md
  • website/docusaurus.config.ts
  • website/i18n/zh-Hans/code.json
  • website/i18n/zh-Hans/docusaurus-theme-classic/footer.json
  • website/i18n/zh-Hans/docusaurus-theme-classic/navbar.json
  • website/src/components/PaperFigureShowcase/index.module.css
  • website/src/components/PaperFigureShowcase/index.tsx
  • website/src/components/PaperViewerPage/index.tsx
  • website/src/components/ResearchPaperCarousel/index.tsx
  • website/src/components/site/CapabilityGlyph.tsx
  • website/src/data/researchContent.js
  • website/src/data/socialPreview.ts
  • website/src/pages/index.tsx
  • website/src/pages/publications.js
  • website/src/pages/vision-paper.tsx
  • website/src/pages/white-paper.tsx
  • website/static/img/vllm-sr-logo.social.png
  • website/static/vision-paper.pdf

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

github-actions bot (Contributor) commented Mar 26, 2026

✅ Supply Chain Security Report — All Clear

Scanner Status Findings
AST Codebase Scan (Py, Go, JS/TS, Rust) 27 finding(s) — MEDIUM: 21 · LOW: 6
AST PR Diff Scan No issues detected
Regex Fallback Scan No issues detected

Scanned at 2026-03-26T04:18:51.540Z · View full workflow logs

Copilot AI (Contributor) left a comment

Pull request overview

Refreshes the website homepage + docs architecture narrative to reflect the updated “signal → projection → decision → plugin” mental model, and updates the publications dataset used across the site.

Changes:

  • Update homepage stats/capability copy and add a new “projection” capability layer (including a new glyph).
  • Expand PaperFigureShowcase and Chinese i18n strings to describe the four-layer architecture and 14-signal taxonomy.
  • Add a new research paper entry and update overview/tutorial docs to reference the 14 maintained signal families.
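
The “signal → projection → decision → plugin” flow named in the overview can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: every function, signal key, model name, and plugin name below is a hypothetical placeholder chosen only to show how the four layers hand off to each other.

```python
from dataclasses import dataclass, field


@dataclass
class RoutingResult:
    model: str
    plugins: list[str] = field(default_factory=list)


def extract_signals(query: str) -> dict[str, float]:
    """Layer 1 (signal): turn the raw query into named scores (hypothetical signals)."""
    text = query.lower()
    return {
        "math_keyword": 1.0 if "integral" in text or "solve" in text else 0.0,
        "long_prompt": 1.0 if len(text.split()) > 50 else 0.0,
    }


def project(signals: dict[str, float]) -> dict[str, float]:
    """Layer 2 (projection): coordinate raw signals into a derived score."""
    return {"difficulty": max(signals["math_keyword"], signals["long_prompt"])}


def decide(projections: dict[str, float]) -> str:
    """Layer 3 (decision): pick a model name from the projected scores."""
    return "reasoning-model" if projections["difficulty"] >= 0.5 else "small-model"


def dispatch_plugins(model: str) -> list[str]:
    """Layer 4 (plugin): choose which plugins run for the selected model."""
    plugins = ["system_prompt"]
    if model == "reasoning-model":
        plugins.append("reasoning_mode")
    return plugins


def route(query: str) -> RoutingResult:
    signals = extract_signals(query)
    projections = project(signals)
    model = decide(projections)
    return RoutingResult(model=model, plugins=dispatch_plugins(model))


if __name__ == "__main__":
    print(route("Solve this integral for me"))
    print(route("hello there"))
```

Keeping each layer a separate function mirrors the point of the refreshed copy: policy can change in one layer (for example, the projection thresholds) without touching the others.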

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| website/src/pages/index.tsx | Updates homepage stats (signals/papers) and capability copy to match the refreshed architecture. |
| website/src/data/researchContent.js | Adds a new research paper entry used by homepage/publications components. |
| website/src/components/site/CapabilityGlyph.tsx | Adds a new projection glyph kind and renderer. |
| website/src/components/PaperFigureShowcase/index.tsx | Updates figure copy and the interactive architecture panel to a 4-layer flow and 14-signal taxonomy. |
| website/src/components/PaperFigureShowcase/index.module.css | Adjusts the figure layer grid to support 4 layers (extra connector/column). |
| website/i18n/zh-Hans/code.json | Updates zh-Hans translations for the revised figure copy and adds new figure keys. |
| website/docs/tutorials/signal/overview.md | Updates signal catalog summary and table formatting for 14 families (5 heuristic, 9 learned). |
| website/docs/overview/semantic-router-overview.md | Updates the overview architecture diagram/sections to include projection coordination and plugin dispatch. |
| website/docs/overview/goals.md | Refreshes goals copy to reference 14 families and projection coordination. |
| website/docs/overview/collective-intelligence.md | Updates diagrams and examples to include projection coordination and updated signal counts. |
| website/docs/intro.md | Refreshes the “Core System” framing to include projections and updated signal families list. |
| website/blog/2026-03-25-vllm-sr-on-amd-developer-cloud.md | Updates blog copy to match the refreshed routing architecture language and linking. |

### 3. Personal AI and Local Personal Agents

-The third opportunity is personal AI. Once routing, privacy, and reasoning are expressed as policy, an AMD-hosted stack can support assistants that feel more personal and more controlled. A personal AI system can keep ordinary tasks, memory-aware follow-ups, and private context on a local lane, while only escalating special cases when explicitly permitted.
+The third opportunity is personal AI like deploying a personal model on AMD AI MAX+ and connecting to external Models as needed. Once routing, privacy, and reasoning are expressed as policy, an AMD-hosted stack can support assistants that feel more personal and more controlled. A personal AI system can keep ordinary tasks, memory-aware follow-ups, and private context on a local lane, while only escalating special cases when explicitly permitted.

Copilot AI Mar 26, 2026

Minor style/grammar: “connecting to external Models” capitalizes “Models” mid-sentence and the overall sentence reads a bit run-on. Consider lowercasing “models” and splitting into two sentences for clarity.

Suggested change
-The third opportunity is personal AI like deploying a personal model on AMD AI MAX+ and connecting to external Models as needed. Once routing, privacy, and reasoning are expressed as policy, an AMD-hosted stack can support assistants that feel more personal and more controlled. A personal AI system can keep ordinary tasks, memory-aware follow-ups, and private context on a local lane, while only escalating special cases when explicitly permitted.
+The third opportunity is personal AI. You might deploy a personal model on AMD AI MAX+ and connect to external models as needed. Once routing, privacy, and reasoning are expressed as policy, an AMD-hosted stack can support assistants that feel more personal and more controlled. A personal AI system can keep ordinary tasks, memory-aware follow-ups, and private context on a local lane, while only escalating special cases when explicitly permitted.

Copilot uses AI. Check for mistakes.
The most immediate opportunity is intelligent routing. A single ROCm backend on AMD Developer Cloud can serve as the physical execution layer for multiple logical lanes. That means teams can prototype a Mixture-of-Models experience, cost-aware routing, replay-driven debugging, and tiered product behavior without first standing up a large multi-backend fleet.

-In the AMD reference profile, the cheapest, medium, complex, reasoning, and premium lanes all resolve onto one self-hosted Qwen backend. The router still gives you differentiated behavior because the policy lives in signals, projections, and decisions, not only in the number of containers you run.
+In the AMD reference profile, the cheapest, medium, complex, reasoning, and premium lanes all resolve onto different models. The router still gives you differentiated behavior because the policy lives in signals, projections, and decisions, not only in the number of containers you run.

Copilot AI Mar 26, 2026

The post says the SIMPLE/MEDIUM/COMPLEX/REASONING/PREMIUM lanes “resolve onto different models”, but the referenced deploy/recipes/balance.yaml maps each of those providers.models[*] entries to the same backend_refs endpoint (vllm_primary at vllm:8000). If the intent is the single-backend alias demo, please update this sentence to say the lanes resolve onto one backend via multiple served model names/aliases; if the intent is truly different models, the earlier “single ROCm backend” section and recipe reference should be updated to match.

Suggested change
-In the AMD reference profile, the cheapest, medium, complex, reasoning, and premium lanes all resolve onto different models. The router still gives you differentiated behavior because the policy lives in signals, projections, and decisions, not only in the number of containers you run.
+In the AMD reference profile, the cheapest, medium, complex, reasoning, and premium lanes each resolve onto distinct served model names/aliases that all point to the same ROCm-backed vLLM endpoint. The router still gives you differentiated behavior because the policy lives in signals, projections, and decisions, not only in the number of containers you run.
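
To make the single-backend alias point concrete, here is a small sketch of how one might check that several served model names share one endpoint. The dictionary is a hypothetical, simplified stand-in for the recipe, not a copy of `deploy/recipes/balance.yaml`: only `providers.models`, `backend_refs`, `vllm_primary`, and `vllm:8000` come from the review comment, while the `name` and `endpoint` keys are assumptions.

```python
# Lane names as quoted in the review comment.
LANES = ["SIMPLE", "MEDIUM", "COMPLEX", "REASONING", "PREMIUM"]

# Hypothetical, simplified stand-in for the recipe; the real schema may differ.
config = {
    "providers": {
        "models": [
            {
                "name": lane,
                "backend_refs": [{"name": "vllm_primary", "endpoint": "vllm:8000"}],
            }
            for lane in LANES
        ]
    }
}


def distinct_endpoints(cfg: dict) -> set[str]:
    """Collect every endpoint referenced by any served model name."""
    return {
        ref["endpoint"]
        for model in cfg["providers"]["models"]
        for ref in model["backend_refs"]
    }


if __name__ == "__main__":
    endpoints = distinct_endpoints(config)
    print(f"{len(LANES)} lanes resolve onto {len(endpoints)} endpoint(s): {sorted(endpoints)}")
```

If the set has a single element, the wording "one backend via multiple served model names/aliases" is the accurate description; more than one element would support "different models" instead.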

### 2. Privacy Routing and Local-First Governance

-The second opportunity is privacy routing. This repository already includes a maintained privacy recipe that keeps PII, private code, internal documents, and suspicious prompts on a local lane while only escalating clearly non-sensitive reasoning work when policy allows it. That pattern is especially meaningful on AMD because it supports a local-first deployment story: keep sensitive traffic on infrastructure you control, audit every decision, and make cloud escalation a governed exception instead of the default.
+The second opportunity is privacy routing, that keeps PII, private code, internal documents, and suspicious prompts on a local lane while only escalating clearly non-sensitive reasoning work when policy allows it. That pattern is especially meaningful on AMD because it supports a local-first deployment story: keep sensitive traffic on infrastructure you control, audit every decision, and make cloud escalation a governed exception instead of the default.

Copilot AI Mar 26, 2026

Grammar/readability: “privacy routing, that keeps …” reads like an incorrect relative clause. Consider removing the comma ("privacy routing that keeps …") or rewriting the sentence to avoid the comma splice.

Suggested change
-The second opportunity is privacy routing, that keeps PII, private code, internal documents, and suspicious prompts on a local lane while only escalating clearly non-sensitive reasoning work when policy allows it. That pattern is especially meaningful on AMD because it supports a local-first deployment story: keep sensitive traffic on infrastructure you control, audit every decision, and make cloud escalation a governed exception instead of the default.
+The second opportunity is privacy routing that keeps PII, private code, internal documents, and suspicious prompts on a local lane while only escalating clearly non-sensitive reasoning work when policy allows it. That pattern is especially meaningful on AMD because it supports a local-first deployment story: keep sensitive traffic on infrastructure you control, audit every decision, and make cloud escalation a governed exception instead of the default.
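
The local-first policy described in this paragraph can be sketched as a small decision function. This is an illustration only, not the maintained privacy recipe: the `Signals` fields, the lane names, and the `allow_cloud_escalation` flag are all hypothetical. It returns a reason alongside the lane so that every decision can be audited.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """Hypothetical detector outputs for one request."""
    has_pii: bool = False
    has_private_code: bool = False
    is_internal_document: bool = False
    is_suspicious: bool = False
    needs_reasoning: bool = False


def choose_lane(signals: Signals, allow_cloud_escalation: bool) -> tuple[str, str]:
    """Return (lane, reason); local is the default, cloud is the governed exception."""
    sensitive = (
        signals.has_pii
        or signals.has_private_code
        or signals.is_internal_document
        or signals.is_suspicious
    )
    if sensitive:
        return "local", "sensitive content stays on controlled infrastructure"
    if signals.needs_reasoning and allow_cloud_escalation:
        return "cloud", "non-sensitive reasoning work, escalation permitted by policy"
    return "local", "default lane"


if __name__ == "__main__":
    print(choose_lane(Signals(has_pii=True, needs_reasoning=True), allow_cloud_escalation=True))
    print(choose_lane(Signals(needs_reasoning=True), allow_cloud_escalation=True))
    print(choose_lane(Signals(needs_reasoning=True), allow_cloud_escalation=False))
```

Checking sensitivity before the escalation flag means a permissive policy can never override a privacy signal, which is the "governed exception" ordering the paragraph describes.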

Comment on lines 144 to 154
```yaml
# Traditional: Simple keyword matching
-if "math" in query:
-    route_to_math_model()
+if "math" in query: route_to_math_model()
```
Signal-driven routing uses multiple signals:
```yaml
# Signal-driven: Multiple signals combined
-if (has_math_keywords AND is_math_domain) OR has_high_math_embedding:
-    route_to_math_model()
+if (has_math_keywords AND is_math_domain) OR has_high_math_embedding: route_to_math_model()
```

Copilot AI Mar 26, 2026

This section uses fenced code blocks labeled as yaml, but the contents are pseudo-Python (if ...: route_to_math_model()) and not valid YAML. Please change the fence language to something appropriate (e.g., python or text) or rewrite the examples into valid YAML so syntax highlighting and copy/paste behavior aren’t misleading.
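
For reference, the two snippets under review could be written as runnable Python along these lines, which is one way to take the `python` fence option. This is a sketch, not the docs' actual example: the model names, the `math_embedding_score` input, and the `0.8` threshold are assumptions added so the code executes, and the pseudo-operators `AND`/`OR` become Python's `and`/`or`.

```python
def route_keyword(query: str) -> str:
    """Traditional: simple keyword matching."""
    return "math-model" if "math" in query.lower() else "default-model"


def route_signal_driven(
    has_math_keywords: bool,
    is_math_domain: bool,
    math_embedding_score: float,
    threshold: float = 0.8,  # hypothetical cutoff for "high" embedding similarity
) -> str:
    """Signal-driven: multiple signals combined."""
    has_high_math_embedding = math_embedding_score >= threshold
    if (has_math_keywords and is_math_domain) or has_high_math_embedding:
        return "math-model"
    return "default-model"


if __name__ == "__main__":
    print(route_keyword("help with my math homework"))
    # Keyword present but the domain signal disagrees and the embedding score is low.
    print(route_signal_driven(True, False, 0.2))
    # No keyword at all, but the embedding signal alone is strong enough.
    print(route_signal_driven(False, False, 0.95))
```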

Xunzhuo added 3 commits March 26, 2026 11:05
Signed-off-by: xunzhuo <xunzhuo@vllm-semantic-router.ai>
Signed-off-by: xunzhuo <xunzhuo@vllm-semantic-router.ai>
Signed-off-by: xunzhuo <xunzhuo@vllm-semantic-router.ai>
Xunzhuo merged commit ab2aa16 into vllm-project:main Mar 26, 2026
13 checks passed
4 participants