Description
Describe the feature
Make the router-side state surfaces that already look product-facing restart-safe and, where appropriate, explicitly durable, instead of leaving them dependent on process memory or temp-file conventions.
Primary layer
global level
Why this layer?
This spans router defaults, extproc runtime storage, vector-store metadata, file registries, startup status, and dashboard-facing operator APIs. The problem is intentionally cross-cutting rather than belonging to one signal, plugin, or single package.
Why do you need this feature?
The current router runtime still mixes durable-looking features with process-local ownership:
- global.services.response_api.store_backend defaults to memory
- global.services.router_replay.store_backend defaults to memory
- src/semantic-router/pkg/vectorstore/manager.go keeps vector-store metadata in an in-memory registry even when backend collections persist elsewhere
- src/semantic-router/pkg/vectorstore/filestore.go keeps file metadata in memory while file bytes live on disk
- src/semantic-router/pkg/startupstatus/status.go falls back to temp-owned JSON for runtime readiness and model download progress
That means restart behavior is inconsistent today: a router can retain some bytes or backend collections but still lose the metadata and control-plane truth required to recover cleanly.
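To make the gap concrete, here is a minimal sketch of what a restart-safe metadata registry could look like, assuming a JSON file next to the stored bytes is an acceptable persistence target. `FileMeta`, `DurableRegistry`, `OpenRegistry`, and `runRestartDemo` are hypothetical names for illustration; they are not the existing filestore.go or manager.go API, and the real metadata fields may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// FileMeta is a hypothetical record; the real filestore fields may differ.
type FileMeta struct {
	ID       string `json:"id"`
	Filename string `json:"filename"`
	Bytes    int64  `json:"bytes"`
}

// DurableRegistry mirrors every metadata change to a JSON file on disk.
// It is not goroutine-safe; a real registry would guard files with a mutex.
type DurableRegistry struct {
	path  string
	files map[string]FileMeta
}

// OpenRegistry reloads metadata from path, or starts empty if no file exists.
func OpenRegistry(path string) (*DurableRegistry, error) {
	r := &DurableRegistry{path: path, files: map[string]FileMeta{}}
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return r, nil
	}
	if err != nil {
		return nil, err
	}
	if err := json.Unmarshal(data, &r.files); err != nil {
		return nil, fmt.Errorf("decode registry: %w", err)
	}
	if r.files == nil { // the file contained JSON null
		r.files = map[string]FileMeta{}
	}
	return r, nil
}

// Put records metadata in memory, then rewrites the file through a temp file
// and a rename so a crash never leaves a half-written registry behind.
func (r *DurableRegistry) Put(m FileMeta) error {
	r.files[m.ID] = m
	data, err := json.Marshal(r.files)
	if err != nil {
		return err
	}
	tmp := r.path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, r.path)
}

// Get returns the metadata for id, if present.
func (r *DurableRegistry) Get(id string) (FileMeta, bool) {
	m, ok := r.files[id]
	return m, ok
}

// runRestartDemo writes one record, then reopens the registry from disk with
// fresh in-memory state, which stands in for a process restart.
func runRestartDemo() (FileMeta, error) {
	dir, err := os.MkdirTemp("", "registry-demo-")
	if err != nil {
		return FileMeta{}, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "files.json")

	first, err := OpenRegistry(path)
	if err != nil {
		return FileMeta{}, err
	}
	if err := first.Put(FileMeta{ID: "file-1", Filename: "a.txt", Bytes: 12}); err != nil {
		return FileMeta{}, err
	}

	second, err := OpenRegistry(path) // nothing shared with first except the file
	if err != nil {
		return FileMeta{}, err
	}
	m, ok := second.Get("file-1")
	if !ok {
		return FileMeta{}, errors.New("metadata lost across restart")
	}
	return m, nil
}

func main() {
	m, err := runRestartDemo()
	if err != nil {
		fmt.Println("recovery failed:", err)
		return
	}
	fmt.Printf("recovered %s (%d bytes)\n", m.Filename, m.Bytes)
}
```

The sketch skips fsync and locking to stay short. Whether a JSON sidecar, a relational store, or the existing backend is the right home for this metadata is exactly the contract this issue asks to define.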
Additional context
Child of #1606.
Repository evidence:
- docs/agent/tech-debt/td-034-runtime-and-dashboard-state-durability-and-telemetry-contract.md
- docs/agent/state-taxonomy-and-inventory.md
- src/semantic-router/pkg/responsestore/{factory.go,memory_store.go}
- src/semantic-router/pkg/routerreplay/store/{factory.go,memory.go}
- src/semantic-router/pkg/extproc/router_storage.go
- src/semantic-router/pkg/extproc/router_replay_setup.go
- src/semantic-router/pkg/vectorstore/{manager.go,filestore.go}
- src/semantic-router/pkg/startupstatus/status.go
Related issues to keep aligned, not duplicated:
- #1601, feature: introduce a shared Milvus lifecycle seam across runtime stores
- #1516, feature: harden RAG and memory usability for production workflows
Suggested acceptance:
- define the intended durability contract for Response API storage, router replay, vector-store metadata, file metadata, and startup status
- make router-visible metadata restart-safe without requiring process memory to remain intact
- keep cache-like surfaces as caches; do not force semantic cache or RAG cache into relational storage when a shared cache is the correct abstraction
- expose typed startup and recovery status through a documented persistence seam rather than temp-file-only behavior
- add at least one restart-recovery test that proves router-side metadata survives a process restart
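As a rough illustration of the last two acceptance items, the sketch below shows one possible shape for a typed startup status behind a small persistence seam, plus a simulated restart check. Everything here is an assumption for discussion: `Phase`, `Status`, `StatusStore`, `NewFileStatusStore`, and `recoverAfterRestart` are hypothetical names, not the current startupstatus package API, and a file-backed store is only one candidate implementation of the seam.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Phase is a hypothetical typed readiness phase.
type Phase string

const (
	PhaseDownloading Phase = "downloading_models"
	PhaseReady       Phase = "ready"
)

// Status is a hypothetical typed startup record.
type Status struct {
	Phase            Phase `json:"phase"`
	ModelsDownloaded int   `json:"models_downloaded"`
	ModelsTotal      int   `json:"models_total"`
}

// StatusStore is the hypothetical persistence seam: callers depend on this
// interface, not on where the status happens to be written.
type StatusStore interface {
	Save(Status) error
	Load() (st Status, found bool, err error)
}

// fileStatusStore writes status under an explicitly configured directory
// instead of relying on an implicit temp location.
type fileStatusStore struct{ path string }

func NewFileStatusStore(dir string) StatusStore {
	return fileStatusStore{path: filepath.Join(dir, "startup-status.json")}
}

func (s fileStatusStore) Save(st Status) error {
	data, err := json.Marshal(st)
	if err != nil {
		return err
	}
	tmp := s.path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, s.path) // replace the old status in one step
}

func (s fileStatusStore) Load() (Status, bool, error) {
	var st Status
	data, err := os.ReadFile(s.path)
	if errors.Is(err, os.ErrNotExist) {
		return st, false, nil
	}
	if err != nil {
		return st, false, err
	}
	if err := json.Unmarshal(data, &st); err != nil {
		return st, false, err
	}
	return st, true, nil
}

// recoverAfterRestart saves a status, then loads it through a second store
// value that shares only the directory, standing in for a restarted process.
// The temp directory is used for the demo only; a deployment would pass a
// configured state directory.
func recoverAfterRestart() (Status, error) {
	dir, err := os.MkdirTemp("", "router-state-")
	if err != nil {
		return Status{}, err
	}
	defer os.RemoveAll(dir)

	before := NewFileStatusStore(dir)
	want := Status{Phase: PhaseDownloading, ModelsDownloaded: 2, ModelsTotal: 5}
	if err := before.Save(want); err != nil {
		return Status{}, err
	}

	after := NewFileStatusStore(dir)
	got, found, err := after.Load()
	if err != nil {
		return Status{}, err
	}
	if !found {
		return Status{}, errors.New("startup status missing after restart")
	}
	return got, nil
}

func main() {
	got, err := recoverAfterRestart()
	if err != nil {
		fmt.Println("recovery failed:", err)
		return
	}
	fmt.Printf("recovered phase=%s progress=%d/%d\n", got.Phase, got.ModelsDownloaded, got.ModelsTotal)
}
```

A real restart-recovery test would exercise the router's actual stores rather than a stand-in, but the pattern is the same: write through one instance, drop all in-memory state, and assert the second instance sees the same typed values.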