@Sophie8 Sophie8 commented Nov 1, 2025

What type of PR is this?

What this PR does / why we need it:
Implement in-tree, embedding-based classification, following the design roadmap:

  • Reuse existing candle-binding embedding models
  • Embedding similarity scoring
  • Configurable thresholds per category (e.g. category A: 80%, category B: 60%)
  • Support mean/max/any aggregation methods to find the best match
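
The bullets above can be illustrated with a minimal sketch. It is not the PR's implementation: the function names, the data layout, and the toy 2-D vectors are hypothetical, cosine similarity is assumed as the similarity score, and "any" is assumed to mean "accept the first candidate that reaches the category threshold" (which gives the same pass/fail decision as "max"). In the real router the vectors would come from the candle-binding embedding models.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_category(query: Vector, candidates: List[Vector],
                   threshold: float, method: str) -> Tuple[float, bool]:
    """Aggregate per-candidate similarities and compare against the threshold."""
    sims = [cosine_similarity(query, c) for c in candidates]
    if not sims:
        return 0.0, False
    if method == "mean":
        score = sum(sims) / len(sims)
    elif method == "max":
        score = max(sims)
    elif method == "any":
        # Assumption: the first candidate reaching the threshold wins;
        # if none does, fall back to the maximum so the score is still reported.
        score = next((s for s in sims if s >= threshold), max(sims))
    else:
        raise ValueError(f"unknown aggregation method: {method}")
    return score, score >= threshold


def classify(query: Vector,
             categories: Dict[str, Tuple[List[Vector], float]],
             method: str = "max") -> Optional[Tuple[str, float]]:
    """Return (category, score) for the best category that passes its own threshold."""
    best: Optional[Tuple[str, float]] = None
    for name, (candidates, threshold) in categories.items():
        score, matched = score_category(query, candidates, threshold, method)
        if matched and (best is None or score > best[1]):
            best = (name, score)
    return best


# Hypothetical data: category A requires 80% similarity, category B 60%.
categories = {
    "A": ([[1.0, 0.0], [0.0, 1.0]], 0.80),
    "B": ([[1.0, 1.0]], 0.60),
}
query = [1.0, 0.0]

for method in ("mean", "max", "any"):
    print(method, classify(query, categories, method))
```

With this toy data the aggregation method changes the outcome: averaging dilutes category A's single strong candidate below its 80% threshold, while "max" and "any" let that one candidate decide.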

Which issue(s) this PR fixes:

Feature # 336

Release Notes: Yes/No

netlify bot commented Nov 1, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 1762bd0
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/6905519f46c6650008f69ef9
😎 Deploy Preview https://deploy-preview-567--vllm-semantic-router.netlify.app

github-actions bot commented Nov 1, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 website

Owners: @Xunzhuo, @rootfs, @yuluo-yx
Files changed:

  • website/src/components/ScrollToTop/index.tsx
  • website/src/components/ScrollToTop/styles.module.css
  • website/src/theme/Root.tsx

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/api/server.go

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@Xunzhuo Xunzhuo changed the title from "fead: Implement In-Tree Embedding Similarity Matching" to "feat: Implement In-Tree Embedding Similarity Matching" on Nov 3, 2025

Xunzhuo commented Nov 3, 2025

can you design the API first?

Sophie8 commented Nov 4, 2025

> can you design the API first?

Sure.
Todos:

  1. This PR will be refactored once the embedding model in candle-binding is finalized; see "feat: add embedding model continuous batching scheduler" (#564).
  2. A design doc for the API, benchmarks, and unit tests will be added.

Xunzhuo commented Nov 7, 2025

fixes: #366

Sophie8 commented Nov 7, 2025

Closing this draft PR: the changes have been refactored and migrated to #606 for better code structure.

@Sophie8 Sophie8 closed this Nov 7, 2025