Description
Describe the feature
Make dashboard workflow state restart-safe and server-owned for ML pipeline, model research, and OpenClaw collaboration or control surfaces, instead of relying on in-memory maps, workspace-local JSON, or browser-only state.
Primary layer
global level
Why this layer?
This crosses dashboard backend persistence, SSE and WebSocket delivery, browser assumptions, local workspace mounts, and operator-facing recovery behavior. It is a control-plane state problem, not one isolated UI or backend handler change.
Why do you need this feature?
Several dashboard-native workflows already look like product surfaces, but their state ownership is still fragmented:
- dashboard/backend/mlpipeline/runner.go keeps jobs in an in-memory map and pushes progress through an in-memory channel
- dashboard/backend/modelresearch/manager.go persists JSON snapshots, but still keeps active campaign truth in memory and marks running work as failed after a dashboard restart
- dashboard/backend/handlers/openclaw.go and openclaw_rooms.go keep registry, team, worker, room, and message state in workspace-local JSON files
- OpenClaw room message appends rewrite whole JSON files, which will not scale to larger rooms or longer histories
- the frontend still keeps some chat or auth state in localStorage
As a result, live connection state, durable workflow state, and browser convenience state are not cleanly separated today.
Additional context
Child of #1606.
Repository evidence:
- docs/agent/tech-debt/td-034-runtime-and-dashboard-state-durability-and-telemetry-contract.md
- docs/agent/state-taxonomy-and-inventory.md
- dashboard/backend/{mlpipeline/runner.go,modelresearch/manager.go,auth/store.go}
- dashboard/backend/handlers/{mlpipeline.go,evaluation.go,modelresearch.go,openclaw.go,openclaw_rooms.go}
- dashboard/frontend/src/{hooks/useConversationStorage.ts,utils/authFetch.ts}
Related issues to coordinate with, not replace:
- #1509 feature: persist dashboard control-plane and config state in a database
- #1515 feature: restore system eval and signal eval as stable dashboard-native workflows
Suggested acceptance:
- keep SSE and WebSocket client registries in memory only, but move workflow jobs, typed progress, terminal state, and collaboration entities into server-owned durable records
- make ML pipeline and model research progress reconstructable after restart
- move OpenClaw teams, workers, rooms, and room messages off ad hoc JSON files into a persistence seam that can support larger histories and multiple operators
- decide explicitly which browser chat or auth surfaces remain demo-only or ephemeral and which become supported server-owned state
- add at least one restart-aware dashboard workflow test and one typed progress/health contract that does not depend on log scraping