feat: add taxonomy classifier platform integration #1654
✅ Deploy Preview for vllm-semantic-router ready!
✅ Supply Chain Security Report — All Clear
Pull request overview
Adds end-to-end “taxonomy classifier” support across config, DSL, router runtime, APIs, dashboard surfaces, and packaging so routing can bind named taxonomy tier/category matches (and taxonomy metrics) into signals/decisions.
Changes:
- Introduces `global.model_catalog.classifiers[]` taxonomy classifier registry, `routing.signals.taxonomy[]`, and `taxonomy_metric` projection inputs.
- Implements runtime taxonomy classification (category KB classifier), signal propagation, headers, DSL compile/decompile updates, and validation logic.
- Adds router API + dashboard proxy/UI for taxonomy classifier CRUD; updates Docker images to ship classifier assets and updates docs/recipes.
Reviewed changes
Copilot reviewed 111 out of 111 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| website/sidebars.ts | Adds Taxonomy docs page to sidebar navigation. |
| website/docs/tutorials/signal/overview.md | Documents new taxonomy learned-signal family. |
| website/docs/tutorials/signal/learned/taxonomy.md | New tutorial page describing taxonomy signals and config examples. |
| website/docs/proposals/unified-config-contract-v0-3.md | Updates contract doc to include global.model_catalog.classifiers. |
| website/docs/installation/configuration.md | Documents classifier registry and taxonomy classifiers in config. |
| tools/linter/go/.golangci.agent.yml | Exempts new/updated test files from complexity linters. |
| tools/docker/Dockerfile.extproc-rocm | Ships config/classifiers/ into extproc ROCm image. |
| tools/docker/Dockerfile.extproc | Ships config/classifiers/ into extproc image. |
| src/vllm-sr/tests/test_deployment_backend.py | Updates deploy backend tests to assert config path plumbing. |
| src/vllm-sr/tests/test_cli_main.py | Updates CLI tests to assert config path plumbing. |
| src/vllm-sr/cli/models.py | Extends projection input schema; adds taxonomy signal models. |
| src/vllm-sr/cli/docker_backend.py | Passes source/runtime config paths through deploy backend. |
| src/vllm-sr/cli/config_contract.py | Adds taxonomy to signal-family contract mapping. |
| src/vllm-sr/cli/commands/runtime.py | Passes both source and runtime config file paths to backend deploy. |
| src/vllm-sr/Dockerfile.rocm | Ships config/classifiers/ into vllm-sr ROCm image. |
| src/vllm-sr/Dockerfile | Ships config/classifiers/ into vllm-sr image. |
| src/semantic-router/pkg/services/classification_signal_contract.go | Adds taxonomy matched signals to contract serialization. |
| src/semantic-router/pkg/headers/headers.go | Defines x-vsr-matched-taxonomy response header. |
| src/semantic-router/pkg/extproc/tool_scope_test.go | Adds tests for tool-scope filtering behavior and constants. |
| src/semantic-router/pkg/extproc/request_context.go | Tracks matched taxonomy signals in request context. |
| src/semantic-router/pkg/extproc/req_filter_tools.go | Adds decision tool-scope enforcement + refactors tool selection helpers. |
| src/semantic-router/pkg/extproc/req_filter_looper_response.go | Emits matched taxonomy header to looper response. |
| src/semantic-router/pkg/extproc/req_filter_classification_signal.go | Applies taxonomy matches into request context + aggregation. |
| src/semantic-router/pkg/extproc/processor_res_header_mutation.go | Emits matched taxonomy header on response. |
| src/semantic-router/pkg/dsl/taxonomy_e2e_test.go | E2E DSL tests ensuring taxonomy classifiers/signals appear in YAML/AST. |
| src/semantic-router/pkg/dsl/taxonomy_dsl_test.go | DSL parse/compile/decompile/roundtrip tests for taxonomy + tool_scope. |
| src/semantic-router/pkg/dsl/routing_contract.go | Decompiler emits taxonomy signals and classifier/metric projection inputs. |
| src/semantic-router/pkg/dsl/privacy_recipe_roundtrip_test.go | Validates maintained privacy recipe round-trips with taxonomy/tool_scope. |
| src/semantic-router/pkg/dsl/parser.go | Parses projection classifier/metric fields and route TOOL_SCOPE. |
| src/semantic-router/pkg/dsl/emitter_yaml.go | Emits canonical YAML when taxonomy classifiers present; adds merge-with-base support. |
| src/semantic-router/pkg/dsl/dsl_test.go | Updates CLI compile tests for new basePath parameter. |
| src/semantic-router/pkg/dsl/decompiler.go | Adds taxonomy + tool_scope emission; supports classifier/metric projection input formatting. |
| src/semantic-router/pkg/dsl/compiler.go | Compiles taxonomy signals and propagates tool_scope + projection classifier/metric. |
| src/semantic-router/pkg/dsl/cli.go | Adds --base/basePath support and routing merge into base YAML. |
| src/semantic-router/pkg/dsl/ast.go | Extends AST for TOOL_SCOPE and projection classifier/metric fields. |
| src/semantic-router/pkg/decision/engine_taxonomy_test.go | Adds decision-engine tests for taxonomy signal matching. |
| src/semantic-router/pkg/decision/engine.go | Adds taxonomy matches into decision-engine signal matching. |
| src/semantic-router/pkg/config/validator_taxonomy.go | Adds taxonomy classifier + binding validation (manifest-based when available). |
| src/semantic-router/pkg/config/validator_projection.go | Supports taxonomy_metric projection inputs + taxonomy signal declaration tracking. |
| src/semantic-router/pkg/config/validator_decision.go | Validates decision tool_scope values and warns on ineffective configs. |
| src/semantic-router/pkg/config/validator.go | Runs taxonomy validation as part of config validation pipeline. |
| src/semantic-router/pkg/config/taxonomy_config_test.go | Adds taxonomy/tool_scope config tests incl. legacy rejection. |
| src/semantic-router/pkg/config/taxonomy_config.go | Adds taxonomy classifier/signal config types and asset-path resolution. |
| src/semantic-router/pkg/config/signal_config.go | Adds taxonomy signals to signals config struct. |
| src/semantic-router/pkg/config/routing_surface_catalog.go | Registers taxonomy as supported signal type. |
| src/semantic-router/pkg/config/reference_config_routing_surface_test.go | Adds taxonomy signal key mapping to reference-config coverage. |
| src/semantic-router/pkg/config/reference_config_public_surface_test.go | Adds taxonomy signal coverage assertions for reference config. |
| src/semantic-router/pkg/config/reference_config_global_test.go | Adds classifier registry coverage assertions for reference config global. |
| src/semantic-router/pkg/config/projection_config.go | Makes projection input name optional; adds classifier/metric fields. |
| src/semantic-router/pkg/config/loader.go | Tracks base-dir for asset loading; rejects legacy category_kb signal block. |
| src/semantic-router/pkg/config/docs_contract_signal_test.go | Classifies taxonomy tutorial bucket as learned. |
| src/semantic-router/pkg/config/decision_config.go | Adds tool_scope/allow_tools/block_tools to decision schema + constants. |
| src/semantic-router/pkg/config/config.go | Adds taxonomy signal type constant; stores loaded taxonomy classifiers + base dir. |
| src/semantic-router/pkg/config/canonical_global.go | Adds canonical global.model_catalog.classifiers export/import. |
| src/semantic-router/pkg/config/canonical_export.go | Exports taxonomy signals + classifier registry into canonical config. |
| src/semantic-router/pkg/config/canonical_defaults.go | Adds built-in privacy classifier to canonical defaults. |
| src/semantic-router/pkg/config/canonical_config.go | Adds canonical routing.signals.taxonomy normalization/serialization. |
| src/semantic-router/pkg/classification/classifier_signal_taxonomy.go | Evaluates taxonomy classifiers and maps bindings to matched taxonomy signals. |
| src/semantic-router/pkg/classification/classifier_signal_eval.go | Tracks taxonomy signal types/metrics and threads taxonomy matches to decision engine. |
| src/semantic-router/pkg/classification/classifier_signal_dispatch.go | Adds taxonomy signal dispatcher. |
| src/semantic-router/pkg/classification/classifier_signal_context.go | Marks taxonomy readiness when classifiers are initialized. |
| src/semantic-router/pkg/classification/classifier_projections.go | Adds taxonomy matches for projection + taxonomy_metric values; defines taxonomy metric key. |
| src/semantic-router/pkg/classification/classifier_construction.go | Initializes taxonomy classifiers at startup from global classifier registry. |
| src/semantic-router/pkg/classification/classifier_composers.go | Enables taxonomy conditions in composer leaf evaluation. |
| src/semantic-router/pkg/classification/classifier.go | Stores taxonomy classifier instances on Classifier struct. |
| src/semantic-router/pkg/classification/category_kb_classifier_test.go | Adds unit tests for taxonomy manifest parsing and KB classifier behavior. |
| src/semantic-router/pkg/classification/category_kb_classifier.go | Implements taxonomy/category KB classifier with exemplar embeddings + contrastive scoring. |
| src/semantic-router/pkg/apiserver/server.go | Registers taxonomy classifier CRUD endpoints. |
| src/semantic-router/pkg/apiserver/route_taxonomy_classifiers_test.go | Adds lifecycle + mutation-blocking tests for classifier CRUD endpoints. |
| src/semantic-router/pkg/apiserver/route_taxonomy_classifiers.go | Implements taxonomy classifier CRUD endpoints + persistence integration. |
| src/semantic-router/pkg/apiserver/route_api_doc.go | Adds API doc registry entries for classifier CRUD endpoints. |
| src/semantic-router/cmd/dsl/main.go | Adds --base option for DSL compile to merge routing into base config. |
| docs/agent/plans/pl-0015-taxonomy-classifier-platform-loop.md | Adds execution plan / loop closure record for taxonomy integration. |
| docs/agent/plans/README.md | Links new plan document. |
| deploy/recipes/privacy/providers.yaml | Adds base providers YAML used for merged config example. |
| deploy/recipes/privacy/privacy-router.yaml | Updates recipe to use taxonomy signals/metrics and tool_scope changes. |
| deploy/recipes/privacy/privacy-router.dsl | Updates recipe DSL to include taxonomy + tool_scope + taxonomy_metric projection. |
| deploy/recipes/privacy/config.yaml | Adds full runnable canonical config example including taxonomy classifier registry and signals. |
| deploy/recipes/privacy/README.md | Documents taxonomy classifier usage, tool scopes, and privacy override behavior. |
| dashboard/frontend/src/pages/ConfigPageTaxonomyClassifiers.module.css | Adds styles for taxonomy classifier management UI. |
| dashboard/frontend/src/pages/ConfigPageRouterConfigSection.tsx | Integrates taxonomy classifier UI into router config section (visual mode). |
| dashboard/frontend/src/components/chatRequestSupport.ts | Adds matched taxonomy response header capture. |
| dashboard/frontend/src/components/HeaderReveal.tsx | Adds matched taxonomy header metadata for reveal UI. |
| dashboard/frontend/src/components/HeaderDisplay.tsx | Adds matched taxonomy header badge in header display. |
| dashboard/backend/router/core_routes.go | Registers dashboard proxy endpoints for router classifier CRUD. |
| dashboard/backend/handlers/config_classifier_proxy_test.go | Tests dashboard proxy forwarding and read-only blocking. |
| dashboard/backend/handlers/config_classifier_proxy.go | Implements dashboard proxy for router taxonomy classifier endpoints. |
| config/signal/taxonomy/privacy.yaml | Adds taxonomy signal fragments for default privacy classifier bindings. |
| config/config.yaml | Adds taxonomy signals, taxonomy_metric projection, decision tool scopes, and classifier registry entry. |
| config/classifiers/privacy/trade_secret_ip.json | Adds privacy taxonomy category exemplars (trade secrets/IP). |
| config/classifiers/privacy/taxonomy.json | Adds privacy taxonomy manifest defining tiers/categories/groups. |
| config/classifiers/privacy/system_prompt_extraction.json | Adds security taxonomy category exemplars. |
| config/classifiers/privacy/simple_task.json | Adds local-standard taxonomy category exemplars. |
| config/classifiers/privacy/root_cause_analysis.json | Adds frontier-reasoning taxonomy category exemplars. |
| config/classifiers/privacy/proprietary_code.json | Adds privacy taxonomy category exemplars. |
| config/classifiers/privacy/prompt_injection.json | Adds security taxonomy category exemplars. |
| config/classifiers/privacy/pii.json | Adds privacy taxonomy category exemplars. |
| config/classifiers/privacy/operational_infrastructure.json | Adds privacy taxonomy category exemplars. |
| config/classifiers/privacy/multi_step_tradeoffs.json | Adds frontier-reasoning taxonomy category exemplars. |
| config/classifiers/privacy/locality_directive.json | Adds privacy taxonomy category exemplars. |
| config/classifiers/privacy/jailbreak_role.json | Adds security taxonomy category exemplars. |
| config/classifiers/privacy/internal_document.json | Adds privacy taxonomy category exemplars. |
| config/classifiers/privacy/generic_coding.json | Adds local-standard taxonomy category exemplars. |
| config/classifiers/privacy/general_knowledge.json | Adds local-standard taxonomy category exemplars. |
| config/classifiers/privacy/customer_data.json | Adds privacy taxonomy category exemplars. |
| config/classifiers/privacy/credential_exfiltration.json | Adds security taxonomy category exemplars. |
| config/classifiers/privacy/business_strategy.json | Adds privacy taxonomy category exemplars. |
| config/classifiers/privacy/architecture_analysis.json | Adds frontier-reasoning taxonomy category exemplars. |
| config/README.md | Documents classifier registry vs taxonomy signal bindings. |
Comments suppressed due to low confidence (4)
src/semantic-router/pkg/apiserver/server.go:1
- The classifier route handlers are registered unconditionally, but their implementations are introduced in files guarded by `//go:build !windows && cgo`. In builds where those files are excluded (e.g., Windows or `CGO_ENABLED=0`), this will fail to compile due to missing methods. Fix by either removing/relaxing the build tags on the route implementation files, or adding stub implementations under the complementary build tags that return `501 Not Implemented` (and only registering these routes when supported).
src/semantic-router/pkg/classification/category_kb_classifier.go:1
- Similarity maxima and `bestSim` are initialized to zero, which breaks classification when cosine similarities are negative (a common case): `maxSim` will stay at 0, and `bestCat` can remain empty, producing incorrect best-category/tier outcomes. Fix by initializing with a sentinel (e.g., `maxSim = -1`/`math.SmallestNonzeroFloat32` depending on cosine range, and `bestSim = -math.MaxFloat64`) or by using a `found` boolean to seed from the first observed similarity.
src/semantic-router/pkg/classification/category_kb_classifier.go:1
- Similarity maxima and `bestSim` are initialized to zero, which breaks classification when cosine similarities are negative (a common case): `maxSim` will stay at 0, and `bestCat` can remain empty, producing incorrect best-category/tier outcomes. Fix by initializing with a sentinel (e.g., `maxSim = -1`/`math.SmallestNonzeroFloat32` depending on cosine range, and `bestSim = -math.MaxFloat64`) or by using a `found` boolean to seed from the first observed similarity.
src/vllm-sr/cli/models.py:1
- By making `name` optional for all projection inputs, the CLI model now accepts invalid configurations for non-`taxonomy_metric` inputs (where `name` is required), and it also doesn’t enforce that `classifier`/`metric` are present when `type == taxonomy_metric`. Add a Pydantic model validator that enforces: (a) `name` is required unless `type == taxonomy_metric`, and (b) `classifier` and `metric` are required (and `name` should likely be absent/ignored) when `type == taxonomy_metric`.

```go
resp, err := http.DefaultClient.Do(proxyReq)
if err != nil {
	http.Error(w, fmt.Sprintf("Router API request failed: %v", err), http.StatusBadGateway)
	return
}
defer resp.Body.Close()
```

The proxy forwards/copies headers verbatim and uses `http.DefaultClient` (no timeouts). Two concrete issues: (1) hop-by-hop headers (e.g., `Connection`, `Transfer-Encoding`, `Upgrade`, etc.) should not be forwarded per RFC 7230; copying them can cause request/response smuggling and proxy interoperability bugs, and (2) using `http.DefaultClient` risks hanging requests under network stalls. Fix by filtering hop-by-hop headers in `copyProxyHeaders` (and optionally restricting which inbound headers are forwarded), and by using a dedicated `http.Client` with reasonable timeouts (and potentially a size limit on `io.ReadAll`).

```go
func copyProxyHeaders(dst, src http.Header) {
	for key, values := range src {
		dst.Del(key)
		for _, value := range values {
			dst.Add(key, value)
		}
	}
}
```

The proxy forwards/copies headers verbatim and uses `http.DefaultClient` (no timeouts). Two concrete issues: (1) hop-by-hop headers (e.g., `Connection`, `Transfer-Encoding`, `Upgrade`, etc.) should not be forwarded per RFC 7230; copying them can cause request/response smuggling and proxy interoperability bugs, and (2) using `http.DefaultClient` risks hanging requests under network stalls. Fix by filtering hop-by-hop headers in `copyProxyHeaders` (and optionally restricting which inbound headers are forwarded), and by using a dedicated `http.Client` with reasonable timeouts (and potentially a size limit on `io.ReadAll`).
Signed-off-by: xunzhuo <xunzhuo@vllm-semantic-router.ai>
b46a3bb to 4ccf363
Performance Benchmark Results
Component benchmarks completed successfully.
Details: See attached benchmark artifacts for detailed results and profiles.
