
v0.2.0 #26

Merged
Timwood0x10 merged 31 commits into master from improve on Jun 12, 2026
Timwood0x10 (Owner) commented Jun 12, 2026

v0.2.0

New Features

  • Leader Failover: Checkpoint-based recovery with LeaderSupervisor detecting leader failure, recovering stale tasks from last checkpoint, and reassigning work to available sub-agents. ColdRestartStrategy for deterministic recovery.
  • Runtime Dynamic Graph: MutableDAG with thread-safe mutation (add/remove nodes and edges at runtime). DynamicExecutor with ApplyMode for hot-reload without stopping execution. Incremental cycle detection on edge insertion.
  • Human-in-the-Loop: InterruptConfig on workflow steps for human approval gates. InterruptHandler blocks execution until approved. InterruptStore provides crash recovery of pending approvals.
  • Agent Resurrection Plugin: Pluggable HealthChecker interface for custom health detection. HeartbeatAdapter for heartbeat-based liveness. Supervisor for automatic agent restart on failure.
  • Event Sourcing: EventStore interface with optimistic concurrency control. MemoryEventStore for dev/test, PostgresEventStore for production. 17 event types covering agent lifecycle, tasks, sessions, workflows, and failover. Pub/sub via Subscribe with filtered event channels. DLQ auto-retry with configurable retry budgets.
  • Pluggable Vector Store: VectorStore interface replacing concrete *VectorSearcher in Repository. PostgreSQL + pgvector for production, in-memory for dev/test. Drop-in replacement support for Qdrant, Milvus, SQLite, or custom backends.
  • WorkflowService API: High-level workflow orchestration abstraction over the DAG engine.
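
To make the optimistic concurrency control in the Event Sourcing item concrete, here is a minimal sketch of a version-checked append. It is not the PR's `MemoryEventStore`: the `memoryStore` type, the `Append` signature, the `Event` fields, and the event type strings are all assumptions made for illustration. Only the name `ErrVersionConflict` comes from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrVersionConflict is named in this PR; this definition is an assumption.
var ErrVersionConflict = errors.New("event store: version conflict")

// Event is a hypothetical, minimal event record.
type Event struct {
	StreamID string
	Version  int
	Type     string
}

// memoryStore is an illustrative stand-in, not the PR's MemoryEventStore.
type memoryStore struct {
	mu      sync.Mutex
	streams map[string][]Event
}

func newMemoryStore() *memoryStore {
	return &memoryStore{streams: make(map[string][]Event)}
}

// Append writes the events only if the stream is still at expectedVersion.
// It returns the stream version after the call.
func (s *memoryStore) Append(streamID string, expectedVersion int, types ...string) (int, error) {
	s.mu.Lock()
	defer s.mu.Unlock()

	current := len(s.streams[streamID])
	if current != expectedVersion {
		// Another writer got there first; the caller must reload and retry.
		return current, ErrVersionConflict
	}
	for _, t := range types {
		current++
		s.streams[streamID] = append(s.streams[streamID],
			Event{StreamID: streamID, Version: current, Type: t})
	}
	return current, nil
}

func main() {
	store := newMemoryStore()

	v, err := store.Append("agent-1", 0, "agent.started")
	fmt.Println(v, err) // 1 <nil>

	// A stale writer still believes the stream is empty.
	_, err = store.Append("agent-1", 0, "task.assigned")
	fmt.Println(errors.Is(err, ErrVersionConflict)) // true
}
```

A caller that receives the conflict would typically re-read the stream, rebuild its state, and retry with the new version.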

Bug Fixes (46 fixes)

Storage (12 fixes)

  • C1: Embedding queue dedup key mismatch causing duplicate embeddings
  • C2: Write buffer data loss on Stop() before flush completes
  • C3: Embedding enqueue outside transaction leading to orphaned records
  • C4: FetchPendingTasks lock ineffective with FOR UPDATE SKIP LOCKED
  • C5: Reconcile threshold time arithmetic off by orders of magnitude
  • M1: ManagedRow connection leak on error paths
  • M2: Missing migration tables in migrate.go
  • M3: Circuit breaker halfOpenInflight counter leak
  • M4: VectorSearcher missing dimension validation
  • M6: FileWatcher TOCTOU race in scanAndLoad
  • M7: Map reference shared unsafely in callbacks
  • M8: Graph Edge() no validation of node endpoints
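
For M4, a dimension check at the store boundary might look like the sketch below. The two-method `VectorStore` interface, the `memoryVectorStore` type, and `ErrDimensionMismatch` are assumptions for illustration; the real interface in this PR may have a different shape. The sketch is also not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// ErrDimensionMismatch is a hypothetical sentinel error.
var ErrDimensionMismatch = errors.New("vector store: dimension mismatch")

// VectorStore is an assumed, simplified interface shape.
type VectorStore interface {
	Upsert(id string, vec []float32) error
	Search(query []float32, k int) ([]string, error)
}

type memoryVectorStore struct {
	dim  int
	vecs map[string][]float32
}

var _ VectorStore = (*memoryVectorStore)(nil)

func newMemoryVectorStore(dim int) *memoryVectorStore {
	return &memoryVectorStore{dim: dim, vecs: make(map[string][]float32)}
}

// checkDim rejects vectors whose length differs from the store's dimension.
func (s *memoryVectorStore) checkDim(vec []float32) error {
	if len(vec) != s.dim {
		return fmt.Errorf("%w: got %d, want %d", ErrDimensionMismatch, len(vec), s.dim)
	}
	return nil
}

func (s *memoryVectorStore) Upsert(id string, vec []float32) error {
	if err := s.checkDim(vec); err != nil {
		return err
	}
	s.vecs[id] = append([]float32(nil), vec...) // store a private copy
	return nil
}

// Search returns up to k ids ranked by dot product, ties broken by id.
func (s *memoryVectorStore) Search(query []float32, k int) ([]string, error) {
	if err := s.checkDim(query); err != nil {
		return nil, err
	}
	type hit struct {
		id    string
		score float32
	}
	hits := make([]hit, 0, len(s.vecs))
	for id, vec := range s.vecs {
		var dot float32
		for i := range vec {
			dot += vec[i] * query[i]
		}
		hits = append(hits, hit{id: id, score: dot})
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].score != hits[j].score {
			return hits[i].score > hits[j].score
		}
		return hits[i].id < hits[j].id
	})
	if k < 0 {
		k = 0
	}
	if k > len(hits) {
		k = len(hits)
	}
	ids := make([]string, 0, k)
	for _, h := range hits[:k] {
		ids = append(ids, h.id)
	}
	return ids, nil
}

func main() {
	store := newMemoryVectorStore(2)
	fmt.Println(store.Upsert("a", []float32{1, 0}))
	err := store.Upsert("bad", []float32{1, 2, 3})
	fmt.Println(errors.Is(err, ErrDimensionMismatch)) // true
}
```

Validating on both write and query keeps a mismatched vector from reaching the distance computation, where it would otherwise index out of range or silently compare truncated vectors.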

Workflow (8 fixes)

  • C6: Panic recovery ordering in executor.go (recovery after cleanup)
  • C7: Graph executor in-degree tracking incorrect after node removal
  • H1: Deadlock false positive in executor.go (errgroup misuse)
  • H2: DynamicExecutor hang on node removal during execution
  • H3: stepEg.Wait() concurrent with Go() causing race
  • H4: NewDAG silently dropping duplicate step IDs
  • M5: MaxAttempts=0 skips execution entirely
  • M9: recomputeOrder version-check race on concurrent access

AHP Protocol (7 fixes)

  • C8: Queue send on closed channel panic during shutdown
  • C9: HeartbeatSender Start/Stop race condition
  • H5: getRandomSuffix nil dereference on empty slice
  • H6: SendMessage swallows all errors silently
  • H7: Protocol has no Close() method (resource leak)
  • M10: Peek() non-atomic read (race under concurrent access)
  • M12: DLQ.Remove leaks trailing pointer after deletion
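
The trailing-pointer leak in M12 is a general Go pitfall: deleting from a slice of pointers by shifting and truncating leaves the last slot of the backing array still pointing at a message, so the garbage collector cannot free it. A minimal sketch of the usual remedy follows. The `DLQ` and `Message` types here are simplified stand-ins, not the PR's definitions, and the sketch omits locking.

```go
package main

import "fmt"

// Message is a hypothetical dead-letter entry.
type Message struct{ ID string }

// DLQ is a simplified stand-in for the dead-letter queue.
type DLQ struct{ items []*Message }

// Remove deletes the first message with the given ID and reports whether
// one was found.
func (q *DLQ) Remove(id string) bool {
	for i, m := range q.items {
		if m.ID != id {
			continue
		}
		last := len(q.items) - 1
		copy(q.items[i:], q.items[i+1:]) // shift the tail left by one
		q.items[last] = nil              // clear the stale slot so it can be collected
		q.items = q.items[:last]
		return true
	}
	return false
}

func main() {
	q := &DLQ{items: []*Message{{ID: "a"}, {ID: "b"}, {ID: "c"}}}
	backing := q.items // same backing array, original length

	fmt.Println(q.Remove("b"))      // true
	fmt.Println(len(q.items))       // 2
	fmt.Println(backing[2] == nil)  // true
}
```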

Agent System (8 fixes)

  • L6: Start partial validation cleanup (inconsistent state on error)
  • L7: SubAgent ProcessStream goroutine leak on context cancellation
  • L8: doFailover uses cancelled ctx for Stop (fails to clean up)
  • L9: Dispatcher partial results misleading (reports success on partial failure)
  • L10: Dynamic executor uses bare go instead of stepEg
  • L11: Missing SnapshotWithSteps() on MutableDAG
  • M11: NewTaskMessage allows nil payload
  • BUG-5: Dead verification in pg_store_test ConcurrentAppend

Event Sourcing (5 fixes)

  • BUG-1: FromVersion inclusive/exclusive boundary wrong in memory_store.go
  • BUG-2: Since filter inclusive boundary wrong in pg_store.go
  • BUG-3: ReadAll incorrectly applies FromVersion filter in memory_store
  • BUG-4: Append does not return ErrVersionConflict on unique violation in pg_store
  • STYLE-1: Bare go keyword without context cancellation in event store

Other (6 fixes)

  • L1: safeFormatTable returning empty string on valid input
  • L2: Missing immediate retry after flush failure in write buffer
  • F1: workflow_test.go config mismatch with actual types
  • Executor nil pointer check on context cancellation (commit 9565f79)
  • STYLE-2: Channel buffer size 64 changed to 1 (backpressure)
  • STYLE-3: Empty slice literals replaced with nil returns

Infrastructure

  • CI/CD pipeline via GitHub Actions (lint, test, race detection, build)
  • Integration tests for workflow engine (5 test cases)
  • Benchmark suite (32 benchmarks across 8 categories)
  • Bilingual documentation (English + Chinese) in docs/en/ and docs/zh/
  • Reorganized documentation into language-specific directories
  • .golangci.yml configuration for consistent linting

Breaking Changes

  • NewLeaderSupervisor signature changed: added eventStore parameter
  • NewColdRestartStrategy signature changed: added checkpoint parameter
  • MemoryManager interface added GetLatestSessionForLeader method
  • VectorStore interface replaces concrete *VectorSearcher in Repository
  • NewResultAggregator signature changed: added sortBy string parameter
  • TaskPlanner.Plan signature changed: added inputText string parameter
  • ResultAggregator.Aggregate signature changed: added tasks []*models.Task parameter
  • Domain types renamed: FashionFilters -> ResourceFilters, FashionItem -> ResourceItem, AgentProfile -> AgentUserProfile, AgentRecommendation -> TaskRecommendation, OutfitSuggestion -> Suggestion, AgentTrend -> Trend

Rename domain types, upgrade default models, implement TTL query cache, fix tests, add migration guide
…framework

- Add ProcessStream for SSE streaming support
- Implement LoopConfig and Evaluator for iterative refinement
- Create ToolFactory and PluginRegistry for dynamic tool loading
- Add evaluation framework with YAML test suites and report generation
- Fix security issues: path traversal, file permissions, context usage

🤖 Generated with CodeArts Agent
…c graph engine

  Leader Failover:
  - HeartbeatMonitor: add TimeoutCallback and RegisterCallback for agent timeout notifications
  - CheckpointRepository: persist leader state (sessionID) to leader_checkpoints table for recovery
  - TaskRecovery: mark orphaned pending/running tasks as failed after leader crash
  - LeaderSupervisor: monitor leader health, trigger ColdRestartStrategy on failure, recover session
  - Leader agent: initMemoryContext attempts checkpoint recovery before creating new session

  Runtime Dynamic Graph:
  - MutableDAG: thread-safe AddNode/RemoveNode/AddEdge/RemoveEdge with incremental cycle detection
  - GraphEventHub: pub/sub for graph mutation events with non-blocking publish
  - DynamicExecutor: ExecuteDynamic with ApplyAtCheckpoint/ApplyImmediate modes, DAG version tracking
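
  The incremental cycle check can be sketched as follows: adding an edge from `from` to `to` creates a cycle exactly when `to` can already reach `from`, so a single BFS from `to` is enough and the whole graph never needs to be re-sorted. The `dag` type, its method signatures, and both sentinel errors are assumptions for illustration and do not reproduce the PR's `MutableDAG`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Hypothetical sentinel errors for this sketch.
var (
	ErrCycle       = errors.New("dag: edge would create a cycle")
	ErrUnknownNode = errors.New("dag: unknown node")
)

// dag is an illustrative stand-in for a thread-safe mutable DAG.
type dag struct {
	mu  sync.RWMutex
	out map[string]map[string]struct{} // adjacency: node -> successors
}

func newDAG() *dag {
	return &dag{out: make(map[string]map[string]struct{})}
}

func (d *dag) AddNode(id string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, ok := d.out[id]; !ok {
		d.out[id] = make(map[string]struct{})
	}
}

// AddEdge validates both endpoints, then rejects the edge if it would
// close a cycle.
func (d *dag) AddEdge(from, to string) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, ok := d.out[from]; !ok {
		return ErrUnknownNode
	}
	if _, ok := d.out[to]; !ok {
		return ErrUnknownNode
	}
	if d.reachable(to, from) {
		return ErrCycle
	}
	d.out[from][to] = struct{}{}
	return nil
}

// reachable runs a BFS from src looking for dst. The caller holds d.mu.
func (d *dag) reachable(src, dst string) bool {
	seen := map[string]bool{src: true}
	queue := []string{src}
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		if n == dst {
			return true
		}
		for next := range d.out[n] {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, next)
			}
		}
	}
	return false
}

func main() {
	g := newDAG()
	for _, id := range []string{"a", "b", "c"} {
		g.AddNode(id)
	}
	fmt.Println(g.AddEdge("a", "b")) // <nil>
	fmt.Println(g.AddEdge("b", "c")) // <nil>
	fmt.Println(g.AddEdge("c", "a")) // dag: edge would create a cycle
}
```

  The endpoint validation in `AddEdge` also illustrates the kind of check that the M8 fix adds for edges that reference missing nodes.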

  Infrastructure:
  - Add leader_checkpoints migration with status index
  - MemoryManager: add GetLatestSessionForLeader interface method
  - Fix bare go statements: replace with errgroup (distillEg, streamEg, stepEg)
  - Fix Stop() race: move status transitions into cleanupOnce.Do
  - Extract hardcoded default_user to config.UserID
  - Add tenant guard to GetLatestSessionForLeader
  - Add sentinel errors for checkpoint operations
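
  The `Stop()` fix above relies on `sync.Once`: when the status transition happens inside `cleanupOnce.Do`, concurrent callers cannot interleave transitions or run cleanup twice. A minimal sketch follows, assuming a hypothetical `agent` type with a two-state `Status`. Only the `cleanupOnce` name comes from this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// Status is a hypothetical two-state lifecycle.
type Status int

const (
	StatusRunning Status = iota
	StatusStopped
)

// agent is an illustrative stand-in for a component with a Stop method.
type agent struct {
	mu          sync.Mutex
	status      Status
	cleanups    int // counts how many times cleanup actually ran
	cleanupOnce sync.Once
}

// Stop is safe to call from many goroutines. The status transition and the
// cleanup both happen inside Do, so they run exactly once.
func (a *agent) Stop() {
	a.cleanupOnce.Do(func() {
		a.mu.Lock()
		defer a.mu.Unlock()
		a.status = StatusStopped
		a.cleanups++
	})
}

func (a *agent) snapshot() (Status, int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.status, a.cleanups
}

// stopConcurrently simulates n racing callers; it exists only for the demo.
func stopConcurrently(a *agent, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			a.Stop()
		}()
	}
	wg.Wait()
}

func main() {
	a := &agent{}
	stopConcurrently(a, 16)
	status, cleanups := a.snapshot()
	fmt.Println(status == StatusStopped, cleanups) // true 1
}
```

  Running the sketch with `go run -race` is a quick way to confirm that the transition is free of data races.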
…chmarks

  Leader Failover:
  - HeartbeatMonitor: add TimeoutCallback for agent timeout notifications
  - CheckpointRepository: persist leader state to leader_checkpoints table
  - TaskRecovery: mark orphaned tasks as failed after leader crash
  - LeaderSupervisor: auto-detect failure, create successor, recover session
  - Leader agent: checkpoint recovery in initMemoryContext

  Runtime Dynamic Graph:
  - MutableDAG: thread-safe AddNode/RemoveNode/AddEdge/RemoveEdge with incremental BFS cycle detection
  - DynamicExecutor: ExecuteDynamic with ApplyAtCheckpoint/ApplyImmediate modes
  - GraphEventHub: pub/sub for graph mutation events

  API Abstraction:
  - WorkflowService interface (Execute, ExecuteStream, ListWorkflows, GetWorkflow)
  - Workflow service implementation composing MutableDAG + DynamicExecutor
  - Client.Workflow() accessor

  Infrastructure:
  - Replace bare go statements with errgroup (distillEg, streamEg, stepEg)
  - Fix Stop() race: move status transitions into cleanupOnce.Do
  - Extract hardcoded default_user to config.UserID
  - Add tenant guard to GetLatestSessionForLeader
  - Add sentinel errors for checkpoint operations
  - Fix SA5011 nil-pointer warnings in 17 test files (28 replacements)

  Documentation:
  - Reorganize docs into en/ and zh/ directories
  - Add leader-failover, dynamic-graph, v2-architecture docs (bilingual)
  - Update framework-comparison with GoAgent positioning

  Benchmarks:
  - Run all 54 benchmarks with -count=3
  - Update benchmark report with v2 component results
  - Save raw logs to benchmarks/logs/
…gable vector store

New Features:
- Runtime layer: agent lifecycle management, resurrection, event-sourced recovery
- Event Store: MemoryEventStore + PostgresEventStore with optimistic concurrency
- Dynamic Workflow: MutableDAG, DynamicExecutor, GraphEventHub
- Human-in-the-Loop: InterruptConfig, InterruptHandler, InterruptStore
- Agent Resurrection Plugin: pluggable HealthChecker, factory-based recovery
- Pluggable VectorStore: interface-based, PostgreSQL + in-memory implementations
- WorkflowService API: Execute, ExecuteStream, ListWorkflows, GetWorkflow
- StatefulAgent interface: RestoreState, ReplayEvents, Snapshot

Bug Fixes (50):
- Storage: dedup key, write buffer, transactional enqueue, FOR UPDATE SKIP LOCKED
- Workflow: panic recovery, in-degree tracking, deadlock false positive
- AHP: closed channel panic, HeartbeatSender race, error preservation
- Agent: WaitGroup panic, Start/Stop TOCTOU, Process mutual exclusion
- Runtime: nil errgroup, Stop() data race, unbounded replay

Infrastructure:
- CI/CD pipeline, integration tests, benchmarks, bilingual docs
- 6 runnable examples in examples/advanced/

Tests: 2642 pass with -race across 50 packages, 0 lint issues
…n example

  - Fix 32 errcheck/ineffassign/SA1012/unused lint issues across 7 test files
  - Add GetAgent(agentID) method to runtime.Manager for agent lookup
  - Wire verifyRestoredState() call in runtime_resurrection example
  - Remove ineffectual assignment in pg_store_test.go
  - Add nolint directive for intentional nil context test
- Implement HITL workflow integration tests in hitl_workflow_test.go to validate agent execution, approval/rejection handling, and data modification during interrupts.
- Create runtime_resurrection_test.go to test the full resurrection flow of agents, including state recovery, concurrent kills, and handling of maximum restart limits.
- Introduce a new knowledge-base file for documentation purposes.
- Add a PID file for the embedding service to track the running process.
Timwood0x10 merged commit 296d12f into master on Jun 12, 2026
11 checks passed