OpenCode Swarm Architecture

Design Philosophy

OpenCode Swarm is built on a simple premise: multi-agent systems fail when they're unstructured.

Most frameworks throw agents at a problem and hope coherence emerges. It doesn't. You get race conditions, conflicting changes, lost context, and code that doesn't work.

Swarm enforces discipline:

One Architect owns all decisions
One task executes at a time
Every task gets QA'd before the next starts
Project state persists in files, not memory

Control Model

                    ┌─────────────┐
                    │  ARCHITECT  │
                    │  (control)  │
                    └──────┬──────┘
                           │
        ┌──────────────────┼──────────────────┐
        │                  │                  │
        ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│   EXPLORER    │  │     SMEs      │  │   PIPELINE    │
│  (discovery)  │  │  (advisory)   │  │ (execution)   │
└───────────────┘  └───────────────┘  └───────────────┘
                                              │
                                   ┌──────────┴──────────┐
                                   │                     │
                                   ▼                     ▼
                           ┌─────────────┐       ┌─────────────┐
                           │    CODER    │       │     QA      │
                           │ (implement) │       │  (verify)   │
                           └─────────────┘       └─────────────┘

Architect: The Brain

Owns the plan
Makes all delegation decisions
Synthesizes inputs from other agents
Handles failures and escalations
Maintains project memory

Explorer: The Eyes

Fast codebase scanner and factual mapping agent. Explorer is strictly observational — it reports what is observed without judgment, verdict, or directive.

Identity Rules

Explorer analyzes directly — does NOT delegate to sub-agents
Output is under 2000 characters
Read-only: cannot write, edit, or patch code

Analysis Protocol (Phase 2 Hardening)

Explorer systematically reports across four dimensions:

Dimension	Content
STRUCTURE	Entry points and call chains (≤3 levels), public API surface (exports with signatures), internal/external dependencies
PATTERNS	Design patterns (factory, observer, strategy), error handling approach (throw/Result/error callback), state management (global/module/passed), configuration pattern (env/config/hardcoded)
COMPLEXITY INDICATORS	Structural complexity — cyclomatic complexity, deep nesting, complex control flow, large files (>500 lines), deep inheritance/type hierarchies
RUNTIME/BEHAVIORAL CONCERNS	Missing error handling paths, platform-specific assumptions (path separators, line endings, OS APIs)

Output Sections (Phase 2)

Section	When Populated	Example
`OBSERVED CHANGES`	INPUT referenced specific files	What changed in those targets
`CONSUMERS_AFFECTED`	Integration impact mode	Files importing changed symbols
`RELEVANT CONSTRAINTS`	Always	Architectural patterns, error handling coverage, platform assumptions, established conventions
`FOLLOW-UP CANDIDATE AREAS`	Always	Observable conditions warranting later review (e.g., "function has 12 parameters — consider splitting")
`DOMAINS`	Always	Relevant SME domains (typescript, nodejs, powershell, etc.)

Judgmental Language Removed (Phase 2)

Explorer output uses observational language only. The following were removed:

VERDICT → replaced with COMPATIBILITY SIGNALS (COMPATIBLE / INCOMPATIBLE / UNCERTAIN)
MIGRATION_NEEDED → replaced with MIGRATION_SURFACE (observable call signatures affected)
REVIEW NEEDED, dead, missing labels → removed
All directive language → recast as observable conditions

Integration Impact Analysis Mode

Activates when delegated with "Integration impact analysis" or INPUT lists contract changes. Uses diff + imports tools to classify each change as BREAKING or COMPATIBLE and list all affected consumer files.

Documentation Discovery Mode

Automatically activates during codebase reality check at plan ingestion. Uses doc_scan to build a manifest index of project docs, then doc_extract to surface relevant constraints per task using Jaccard bigram similarity scoring.

Curator Modes

Explorer also operates in two curator modes for phase boundary consolidation:

Mode	Trigger	Purpose
`CURATOR_INIT`	Session start	Consolidates prior session knowledge into architect briefing; flags contradictions
`CURATOR_PHASE`	Phase complete	Extends running digest with phase outcomes; observes workflow deviations and knowledge update candidates

Both curator modes are dispatched via the standard MODE: EXPLORER with the appropriate curator mode trigger (CURATOR_INIT or CURATOR_PHASE). Explorer uses OBSERVATIONS (not KNOWLEDGE_UPDATES) and reports with observational language — no directives.

Role Boundary Hardening (Phase 3–5)

Explorer's behavioral boundaries were tightened across Phases 3–5 to eliminate residual routing authority and preserve its discovery function.

Explorer's scope is strictly factual. Explorer reports what is observed without judgment, verdict, or directive. Explorer does not route tasks, prioritize findings, or declare areas out-of-scope.

Architect owns all routing decisions. DOMAINS and FOLLOW-UP CANDIDATE AREAS are Explorer output fields — they are informational hints for the architect to consider. Explorer does not delegate, dispatch, or determine next steps. Only the architect decides what to act on.

Reviewer must validate Explorer findings. Explorer observations are not authoritative. Reviewer and other consumers of Explorer output must independently verify factual claims before using them in decisions or implementations.

Critic remains a pure challenge layer. Critic does not map code, scan files, or make routing decisions. Its role is to challenge the architect's plan — not to replace the explorer.

Speed preservation is a design goal. Explorer's efficiency is intentional. Lightweight, fast scans serve the architect better than exhaustive analysis that delays routing decisions.

Explorer output is for architect consumption, not autonomous action. No agent other than the architect should treat Explorer output as a directive or verdict.

Test coverage. Role boundary hardening is validated by explorer-role-boundary.test.ts (25 tests) and explorer-consumer-contract.test.ts (58 tests). All 182 explorer tests pass. The test suite enforces that Explorer output contains only observational language, that VERDICT and MIGRATION_NEEDED never appear, and that routing authority is never asserted by Explorer.

Before and after examples:

The following judgmental language was systematically removed from Explorer output and replaced with observational language only:

Removed (judgmental)	Added (observational)
`VERDICT: INCOMPATIBLE — needs rewrite`	`COMPATIBILITY SIGNALS: INCOMPATIBLE — observable call signatures affected: 3`
`VERDICT: COMPATIBLE — safe to use`	`COMPATIBILITY SIGNALS: COMPATIBLE — no observable call signature conflicts detected`
`REVIEW NEEDED — dead code`	`FOLLOW-UP CANDIDATE AREAS: function has 12 parameters, 8 of which are unused in current call sites`
`MIGRATION_NEEDED: X must be rewritten`	`MIGRATION_SURFACE: 5 call sites reference the affected function signature`
`REVIEW NEEDED — missing labels`	`FOLLOW-UP CANDIDATE AREAS: 3 exported functions lack doc comments`

All directive language (must, should, needs, verdict, review needed, dead) was recast as observable conditions. Explorer reports facts; the architect decides what to do with them.

Designer: The Blueprint

UI/UX specification agent
Generates component scaffolds and design tokens before coding begins on UI-heavy tasks
Runs in MODE: EXECUTE before Coder (Rule 9)

SME: The Advisor

Single open-domain expert (any domain: security, ios, rust, kubernetes, etc.)
Consulted serially, one call per domain
Guidance cached in context.md
Read-only (cannot write code)

Pipeline Agents: The Hands

Coder: Implements one task at a time
Reviewer: Dual-pass review — general correctness first, then automatic security-only pass for security-sensitive files (OWASP Top 10 categories)
Test Engineer: Generates verification tests + adversarial tests (attack vectors, boundary violations, injection attempts)
Gates: Automated diff, imports, lint, and secretscan tools verify contracts, dependencies, style, and security before/during review.

Critic: The Gate

Reviews architect's plan BEFORE implementation begins
Returns APPROVED / NEEDS_REVISION / REJECTED
Read-only (cannot write code)

Docs: The Scribe

Documentation synthesizer
Automatically updates READMEs, API docs, and guides based on implementation changes
Runs in Phase 6 as part of project wrap-up

Execution Flow

Phase 0: Initialize or Resume

Is .swarm/plan.md present?
├── YES → Read plan.md and context.md
│         Find current phase and task
│         Resume execution
│
└── NO  → New project
          Proceed to Phase 1

Phase 1: Clarify

Is the user request clear?
├── YES → Proceed to Phase 2
│
└── NO  → Ask up to 3 clarifying questions
          Wait for answers
          Then proceed

Phase 2: Discover

@explorer analyzes codebase
    │
    ├── Project structure
    ├── Languages and frameworks
    ├── Key files
    ├── Patterns observed
    └── Relevant SME domains

Phase 3: Consult SMEs

For each relevant domain:
    │
    ├── Check context.md for cached guidance
    │   └── If cached → Skip this SME
    │
    └── If not cached:
        ├── Delegate to @sme with DOMAIN: [domain]
        ├── Wait for response
        └── Cache guidance in context.md

Phase 4: Plan

Create/Update .swarm/plan.md:
    │
    ├── Project overview
    ├── Phases (logical groupings)
    │   └── Tasks (atomic units of work)
    │       ├── Dependencies
    │       ├── Acceptance criteria
    │       └── Complexity estimate
    │
    └── Status tracking

Phase 4.5: Critic Gate

MODE: CRITIC-GATE

@critic reviews plan
    │
    ├── APPROVED → Proceed to MODE: EXECUTE
    ├── NEEDS_REVISION → Revise plan, resubmit (max 2 cycles)
    └── REJECTED → Escalate to user

MODE: EXECUTE

For each task in current phase:
    │
    ├── Check dependencies complete
    │   └── If blocked → Skip, mark [BLOCKED]
    │
    ├── 5a. @coder implements (ONE task only)
    │       └── → REQUIRED: Print task start confirmation
    │
    ├── 5b. diff + imports tools analyze changes + semantic diff injection
    │       ├── Detect contract changes (exports, interfaces, types)
    │       ├── Track import dependencies across files
    │       ├── system-enhancer injects AST-based semantic diff summary into reviewer context
    │       │   - Computes AST diffs for changed files (up to 10, from `declaredCoderScope`)
    │       │   - Enriches with blast radius (consumers count from repo graph)
    │       │   - Generates risk-ranked markdown for reviewer prioritization
    │       │   - Fully error-resilient: never throws, returns null on failure
    │       └── → REQUIRED: Print change summary
    │
    ├── 5c. syntax_check validates code syntax (v6.9.0)
    │       ├── Tree-sitter parse validation for 20 languages
    │       ├── Catches syntax errors before review
    │       └── → REQUIRED: Print syntax status
    │
    ├── 5d. placeholder_scan detects incomplete code (v6.9.0)
    │       ├── Scans for TODO/FIXME comments
    │       ├── Detects placeholder text and stub implementations
    │       └── → REQUIRED: Print placeholder scan results
    │
    ├── 5e. lint fix → lint:check (auto-fix then verify)
    │       ├── Run `lint` tool with fix mode, then check mode
    │       └── → REQUIRED: Print lint status
    │
    ├── 5f. imports audit analyzes dependencies (AST-based)
    │       ├── Track import dependencies across files
    │       └── → REQUIRED: Print import analysis
    │
    ├── 5g. build_check verifies compilation (v6.9.0)
    │       ├── Runs repo-native build/typecheck commands
    │       ├── Validates code compiles correctly
    │       └── → REQUIRED: Print build status
    │
    ├── 5h. pre_check_batch runs parallel verification (v6.10.0)
    │       ├── Runs 4 tools in parallel with p-limit (max 4 concurrent):
    │       │   ├── lint:check (code quality verification - hard gate)
    │       │   ├── secretscan (secret detection - hard gate)
    │       │   ├── sast_scan (security analysis - hard gate)
    │       │   └── quality_budget (maintainability metrics)
    │       ├── Returns unified result with gates_passed boolean
    │       ├── If gates_passed === false → Return to coder with specific failures
    │       └── → REQUIRED: Print gates_passed status

    ├── 5i. @reviewer reviews (correctness, edge-cases, performance)
    │       ├── APPROVED → Continue
    │       └── REJECTED → Retry from 5a (max 5)
    │       └── → REQUIRED: Print approval decision
    │
    ├── 5j. @reviewer security-only pass (if file matches security globs
    │       or coder output contains security keywords)
    │       ├── Security globs: auth, crypto, session, token, middleware, api, security
    │       ├── Uses OWASP Top 10 2021 categories
    │       └── → REQUIRED: Print security approval
    │
    ├── 5k. @test_engineer generates AND runs verification tests
    │       ├── PASS → Continue
    │       └── FAIL → Send failures to @coder, retry from 5a with RETRY protocol
    │       └── → REQUIRED: Print test results
    │
├── 5l. @test_engineer adversarial testing pass
│ ├── Attack vectors, boundary violations, injection attempts
│ ├── PASS → Continue
│ └── FAIL → Send failures to @coder, retry from 5a with RETRY protocol
│ └── → REQUIRED: Print adversarial test results
│
├── 5m. architect regression sweep (scope:"graph")
│ ├── Runs test_runner with scope:"graph" to find cross-task test regressions
│ ├── If no related tests beyond task scope → SKIP. Print "regression-sweep: SKIPPED — no related tests beyond task scope"
│ ├── If additional tests pass → PASS. Print "regression-sweep: PASS [N additional tests, M files]"
│ ├── If additional tests FAIL → return to coder: "REGRESSION DETECTED: Your changes broke [N] tests in [test files]"
│ └── → REQUIRED: Print regression-sweep result
│
├── 5n. ⛔ HARD STOP: Pre-commit checklist (v6.11.0)
    │       ├── [ ] All QA gates passed (lint:check, secretscan, sast_scan)
    │       ├── [ ] Reviewer approval documented
    │       ├── [ ] Tests pass with evidence
    │       └── [ ] No security findings
    │       └── → REQUIRED: Print checklist completion
    │       **No override. A commit without completed QA gate is a workflow violation.**
    │
    └── 5o. TASK COMPLETION CHECKLIST (v6.11.0)
            ├── Evidence written to .swarm/evidence/{taskId}/
            ├── update_task_status called with status='completed' (advances state machine to 'complete')
            └── → REQUIRED: Print completion confirmation

MODE: PHASE-WRAP

All tasks in phase done
│
├── 1. @explorer - Rescan codebase after changes
├── 2. @docs - Update documentation for all changes in this phase
├── 3. Update context.md with learnings and decisions
├── 4. Write retrospective evidence via write_retro tool
├── 4.5. Run evidence_check to verify all completed tasks have required evidence
├── 5. Run sbom_generate with scope='changed' for dependency snapshot
├── 5.5. Defense-in-depth drift check: Delegate to @critic_drift_verifier BEFORE phase_complete
│ - Returns early feedback on plan drift
│ - Architect calls `write_drift_evidence(phase, verdict, summary)` tool after critic_drift_verifier returns
│ - Writes drift verification evidence to .swarm/evidence/{phase}/drift-verifier.json
│ - Verdict automatically normalized: APPROVED → approved, NEEDS_REVISION → rejected
│ - Skip this step if spec.md does not exist
├── 5.6. Verify mandatory gate evidence exists:
│         - .swarm/evidence/{phase}/completion-verify.json (auto-written by completion-verify gate)
│         - .swarm/evidence/{phase}/drift-verifier.json (written by @critic_drift_verifier)
│         - .swarm/evidence/{phase}/hallucination-guard.json (if hallucination_guard enabled; written by write_hallucination_evidence)
│         - .swarm/evidence/{phase}/mutation-gate.json (if mutation_test enabled; written by write_mutation_evidence after generate_mutants + mutation_test)
│         If either missing: run the missing gate first
│         Note: All gates automatically bypassed in turbo mode
├── 6. Call phase_complete (enforces four phase gates automatically)
│         - Gate 1: completion-verify — deterministic identifier check in source files
│         - Gate 2: drift verifier evidence — reads drift-verifier.json for approved verdict
│         - Gate 3: hallucination guard — reads hallucination-guard.json for approved verdict (if enabled)
│         - Gate 4: mutation gate — reads mutation-gate.json for pass/warn/fail verdict (if enabled)
│         - All four gates bypassed when turbo mode is active
└── 7. Ask user: "Ready for Phase [N+1]?"

Phase Completion Gates (v6.33.4)

The phase_complete tool enforces two mandatory gates before marking a phase complete:

Gate	Purpose	Blocking Reason	Turbo Bypass
`completion-verify`	Deterministic check that plan task identifiers exist in source files	`COMPLETION_INCOMPLETE` — zero identifiers found in target files	Yes
`drift-verifier`	Evidence-based check that `critic_drift_verifier` approved the implementation	`DRIFT_VERIFICATION_MISSING` or `DRIFT_VERIFICATION_REJECTED`	Yes
`hallucination-guard`	Evidence-based check that `critic_hallucination_verifier` approved plan/implementation claims	`HALLUCINATION_VERIFICATION_MISSING` or `HALLUCINATION_VERIFICATION_REJECTED`	Yes
`mutation-test`	Evidence-based check that mutation tests achieved a passing kill rate	`MUTATION_GATE_MISSING` or `MUTATION_GATE_FAIL`	Yes

Gate 1: Completion Verify

Parses plan task descriptions for identifiers (backtick, camelCase, PascalCase, config keys)
Scans target source files for matches
Blocks if zero identifiers found in any task's target files
Non-blocking on errors — treats as warning and continues

Gate 2: Drift Verifier Evidence

Reads .swarm/evidence/{phase}/drift-verifier.json
Checks for entry with type containing 'drift' and verdict of 'approved'
Blocks if evidence missing or verdict is 'rejected'
Defense-in-depth: architect should delegate to @critic_drift_verifier BEFORE calling phase_complete for early feedback
Uses write_drift_evidence tool to persist verification results (verdict is normalized automatically)

Gate 3: Hallucination Guard (optional)

Only enforced when hallucination_guard: true in the QA gate profile
Reads .swarm/evidence/{phase}/hallucination-guard.json
Checks for entry with type containing 'hallucination' and verdict of 'approved'
Blocks if evidence missing or verdict is 'rejected'
Architect should delegate to @critic_hallucination_verifier and call write_hallucination_evidence before phase_complete
Enabled via set_qa_gates tool with hallucination_guard: true

Gate 4: Mutation Gate (optional)

Only enforced when mutation_test: true in the QA gate profile (default: OFF)
Reads .swarm/evidence/{phase}/mutation-gate.json
Checks for entry with type of 'mutation-gate' and verdict of 'pass', 'warn', or 'skip'
Blocks if verdict is 'fail' — resolve surviving mutants or lower threshold before continuing
Warns (proceeds) if verdict is 'warn'
Allows if verdict is 'pass' or 'skip'
Architect workflow: call generate_mutants to create patches, pipe patches to mutation_test tool, then call write_mutation_evidence with verdict and kill rate metrics before phase_complete
Enabled via set_qa_gates tool with mutation_test: true
Expensive: requires one LLM call per mutation phase due to LLM-based patch generation; OFF by default to avoid cost on all projects

Turbo Mode Behavior: When hasActiveTurboMode() returns true, all four gates are automatically bypassed (completion-verify, drift-verifier, hallucination-guard, mutation-test). The phase_complete tool logs a warning and proceeds without enforcement.

MODE Labels (v6.11)

Plan Cursor (v6.13)

The plan_cursor config enables a compact representation of the project plan that is injected into the LLM context, keeping the total token count under a configurable limit.

{
  "plan_cursor": {
    "enabled": true,
    "max_tokens": 1500,
    "lookahead_tasks": 2
  }
}

enabled – When true (default) Swarm injects a plan cursor instead of the full plan.md.
max_tokens – Upper bound on tokens emitted for the cursor (default 1500). The cursor includes the current phase summary, the full current task, and up to lookahead_tasks upcoming tasks. Earlier phases are reduced to one‑line summaries.
lookahead_tasks – Number of future tasks to include in full detail (default 2). Set to 0 to show only the current task.

Disabling ("enabled": false) falls back to the pre‑v6.13 behavior of injecting the entire plan text.

Tool Output Truncation (v6.13)

Controls the size of tool outputs sent back to the LLM.

{
  "tool_output": {
    "truncation_enabled": true,
    "max_lines": 150,
    "per_tool": {
      "diff": 200,
      "symbols": 100,
      "search": 200,
      "batch_symbols": 150,
      "suggest_patch": 50
    }
  }
}

truncation_enabled – Global switch (default true).
max_lines – Default line limit for any tool output.
per_tool – Overrides max_lines for specific tools. diff, symbols, search, and batch_symbols are truncated by default because their outputs can be very large. suggest_patch uses a conservative limit since patch output is typically compact.

When truncation is active, a footer is appended to the output:

---
[output truncated to {maxLines} lines – use `tool_output.per_tool.<tool>` to adjust]

Structured Search Tool — `search` (v6.45.0)

Added as the default structured search path for workspace pattern lookup. Replaces shell grep workarounds with a machine-readable JSON output.

Property	Type	Default	Description
`pattern`	`string`	required	Search pattern (literal or regex)
`workspace`	`string`	required	Root directory to search
`mode`	`"literal" \| "regex"`	`"literal"`	Match mode
`glob`	`string[]`	`[]`	Include globs (e.g. `["*/.ts"]`)
`exclude`	`string[]`	`[]`	Exclude globs
`maxResults`	`number`	`100`	Hard cap on returned matches
`maxLines`	`number`	`10`	Max lines per match snippet
`context`	`number`	`0`	Surrounding lines per match (only when > 0)

Returns structured JSON with file, line, text, and truncated fields per hit. Falls back to a graceful error when ripgrep is unavailable. Not a structural AST search — combine with symbols and imports for full module analysis.

Registered for: architect, coder, reviewer, explorer, test_engineer.

Reviewer-Safe Patch Suggestion Tool — `suggest_patch` (v6.45.0)

Generates structured diff hunks for a target file without modifying it. Used by the reviewer agent to deliver actionable remediation artifacts in read-only review passes.

Property	Type	Default	Description
`file`	`string`	required	Target file path
`patch`	`string`	required	Unified diff body
`context`	`number`	`3`	Context lines per hunk

Returns structured patch output with hunk anchors. Detects and reports stale context when the file content has changed since the patch was authored. Not a write tool — no file is modified. Not registered for coder agents.

Registered for: reviewer, architect.

Batched Symbol Extraction — `batch_symbols` (v6.45.0)

Extracts exported symbols from multiple files in a single call. Each file is processed independently with per-file error isolation — a parse failure in one file does not affect others.

Property	Type	Default	Description
`files`	`string[]`	required	List of file paths to analyse
`includeTopLevel`	`boolean`	`false`	Include non-exported top-level definitions

Returns a per-file symbol summary with exports, topLevel (optional), parseError, and empty fields. Benchmarks show 75–98% call reduction versus sequential single-file calls. Reuses symbols.ts parsing logic.

Registered for: architect, explorer, reviewer.

Tool-Name Normalization — `normalizeToolName` (v6.45.0)

Canonical helper for stripping namespace prefixes from tool names (e.g. mega:search → search, mega.search → search). Exposed via src/hooks/normalize-tool-name.ts.

Function	Description
`normalizeToolName(name)`	Returns the bare tool name string
`normalizeToolNameLowerCase(name)`	Returns the bare tool name in lowercase

Replaces 13 inline replace(/^[^:]+[:.]/, '') regex patterns that were duplicated across guardrails.ts, scope-guard.ts, index.ts, delegation-gate.ts, self-review.ts, and delegation-ledger.ts. Test-file sites are excluded — they validate the raw pattern behavior directly.

Mode Detection (v6.13)

Swarm now explicitly distinguishes five architect modes, which affect what injection blocks are added to the LLM prompt.

Mode	When Injected
`DISCOVER`	After the explorer finishes scanning the codebase.
`PLAN`	When the architect writes or updates the plan.
`EXECUTE`	During task implementation (the normal pipeline).
`PHASE-WRAP`	After all tasks in a phase are completed, before docs are updated.
`UNKNOWN`	Fallback when the current state does not match any known mode.

The Architect workflow uses explicit MODE labels internally to distinguish architect execution phases from project plan phases:

MODE	Description
`MODE: RESUME`	Detect and restore previous session state
`MODE: CLARIFY`	Ask clarifying questions for ambiguous requirements
`MODE: DISCOVER`	Explore codebase structure and patterns
`MODE: CONSULT`	Consult SMEs for domain guidance
`MODE: PLAN`	Create or update project plan (includes CODEBASE REALITY CHECK on brownfield)
`MODE: CRITIC-GATE`	Plan review checkpoint before execution
`MODE: EXECUTE`	Task implementation with QA gates
`MODE: PHASE-WRAP`	Phase completion and retrospective

CODEBASE REALITY CHECK (v6.29.2): Before any planning (MODE: PLAN), spec generation (MODE: SPECIFY), or plan import, the Architect dispatches Explorer in targeted scoped chunks to verify the current state of every referenced item. Produces a CODEBASE REALITY REPORT with statuses: NOT STARTED, PARTIALLY DONE, ALREADY COMPLETE, or ASSUMPTION INCORRECT. This prevents planning against stale assumptions (e.g., assuming a function exists when it was already fixed). Skipped for purely greenfield projects with no existing codebase references.

NAMESPACE RULE: MODE labels refer to the architect's internal workflow phases. Project plan phases (in .swarm/plan.md) remain as "Phase N" to avoid confusion.

Observable Output (v6.11)

All blocking steps require explicit printed output for visibility:

→ REQUIRED: Print {description}

This ensures:

Clear progress tracking through gates
Determinable failure points
Evidence of execution for debugging

Retry Protocol (v6.11)

On gate failure, emit structured rejection:

RETRY #{count}/5
FAILED GATE: {gate_name}
REASON: {specific failure}
REQUIRED FIX: {actionable instruction}
RESUME AT: {step_5x}

Failure Counting: Track retry count, escalate to user after 5 failures.

Anti-Exemption Rules (v6.11)

The following rationalization patterns are explicitly blocked:

"It's a simple change"
"Just updating docs"
"Only a config tweak"
"Hotfix, no time for QA"
"The tests pass locally"
"I'll clean it up later"
"No logic changes"
"Already reviewed the pattern"

Rule: There are NO simple changes. There are NO exceptions to the QA gate sequence.

Pre-Commit Rule (v6.11)

⛔ HARD STOP before marking any task complete:

All QA gates passed (no overrides)
Reviewer approval documented
Tests pass with evidence
No security findings

There is no override. A commit without a completed QA gate is a workflow violation.

Task Granularity Rules (v6.11)

Tasks classified by size with strict decomposition rules:

Size	Criteria	Decomposition Required
SMALL	1 file, single verb, <2 hours	No
MEDIUM	1-2 files, compound action, <4 hours	No
LARGE	>2 files OR compound verbs	Yes

Task Atomicity Checks (Critic validates):

Max 2 files per task (otherwise decompose)
No compound verbs ("and", "plus", "with") in task descriptions
Clear acceptance criteria required

File Structure

project/
├── .swarm/
│   ├── plan.md            # Legacy phased roadmap (migrated to plan.json)
│   ├── plan.json          # Machine-readable plan with Zod-validated schema
│   ├── context.md         # Project knowledge, SME cache
│   ├── evidence/          # Per-task execution evidence
│   │   ├── 1.1/           # Evidence for task 1.1
│   │   └── 2.3/           # Evidence for task 2.3
│   └── history/
│       ├── phase-1.md     # Archived phase summaries
│       └── phase-2.md
│
├── src/
│   ├── index.ts           # Plugin entry — registers 8 hook types
│   ├── state.ts           # Shared swarm state singleton (zero imports)
│   ├── agents/            # Agent definitions and factory
│   ├── config/            # Schema, constants, loader
│   ├── commands/          # Slash command handlers (12 commands)
│   │   ├── index.ts       # Factory + dispatcher (createSwarmCommandHandler)
│   │   ├── status.ts      # /swarm status
│   │   ├── plan.ts        # /swarm plan [N]
│   │   ├── agents.ts      # /swarm agents
│   │   ├── evidence.ts    # /swarm evidence [task]
│   │   ├── archive.ts     # /swarm archive [--dry-run]
│   │   └── reset.ts       # /swarm reset --confirm
│   ├── hooks/             # Hook handlers
│   │   ├── index.ts       # Barrel exports
│   │   ├── utils.ts       # safeHook, composeHandlers, readSwarmFileAsync, estimateTokens
│   │   ├── extractors.ts  # Plan/context file parsers
│   │   ├── pipeline-tracker.ts      # Message transform (pipeline logging)
│   │   ├── context-budget.ts        # Message transform (token budget warnings)
│   │   ├── system-enhancer.ts       # System prompt transform + cross-agent context
│   │   ├── compaction-customizer.ts # Session compaction enrichment
│   │   ├── agent-activity.ts        # Tool hooks (activity tracking + flush)
│   │   ├── delegation-tracker.ts    # Chat message hook (active agent tracking)
│   │   └── cc-command-intercept.ts  # CC command interception (hard-block /reset, soft-correct /plan)
│   ├── lang/             # Language profiles, framework detector (PHP, Laravel, etc.)
│   ├── tools/             # Domain detector, file extractor, gitingest, diff, retrieve-summary
│   ├── plan/              # Plan management
│   │   └── manager.ts     # load/save/migrate/derive plan operations
│   └── evidence/          # Evidence bundle management
│       ├── index.ts       # Barrel exports
│       └── manager.ts     # CRUD: save/load/list/delete/archive evidence
│
├── tests/unit/            # 1211 tests across 54+ files (bun test)
│   ├── agents/            # creation (64), factory (20), architect-v6-prompt (15),
│   │                      # security-categories (12)
│   ├── config/            # constants (14), schema (35), loader (17), plan-schema (40),
│   │                      # evidence-schema (23), evidence-config (8),
│   │                      # review-integration-schemas (20)
│   ├── hooks/             # pipeline-tracker (16), utils (25), system-enhancer (58),
│   │                      # compaction-customizer (26), context-budget (23),
│   │                      # extractors (32), agent-activity (14), delegation-tracker (16),
│   │                      # guardrails (39), system-enhancer-v6 (18), cc-command-intercept (23 unit + 22 adversarial)
│   ├── commands/          # status (6), plan (9), agents (28), index (11),
│   │                      # archive (8), benchmark (5)
│   ├── evidence/          # manager (25)
│   ├── plan/              # manager (40)
│   ├── tools/             # domain-detector (30), file-extractor (16), gitingest (5),
│   │                      # diff (22), retrieve-summary (28)
│   ├── smoke/             # packaging (8)
│   └── state.test.ts      # Shared state (31)
│
├── tests/adversarial/     # 22 adversarial tests for cc-command-intercept (bun test)
└── dist/                  # Build output (ESM)

plan.md Schema

# Project: [Name]
Created: [ISO date]
Last Updated: [ISO date]
Current Phase: [N]

## Overview
[1-2 paragraph project summary]

## Phase 1: [Name] [STATUS]
Estimated: [SMALL/MEDIUM/LARGE]

- [x] Task 1.1: [Description] [SIZE]
  - Acceptance: [Criteria]
- [ ] Task 1.2: [Description] [SIZE] (depends: 1.1)
  - Acceptance: [Criteria]
  - Attempt 1: REJECTED - [Reason]
  - Attempt 2: REJECTED - [Reason]
- [BLOCKED] Task 1.3: [Description]
  - Reason: [Why blocked]

## Phase 2: [Name] [PENDING]
...

context.md Schema

# Project Context: [Name]

## Summary
[What the project does, who it's for]

## Technical Decisions
- [Decision]: [Rationale]

## Architecture
[Key patterns, file organization]

## SME Guidance Cache
### [Domain] (Phase [N])
- [Guidance point]

## Patterns Established
- [Pattern]: [Where/how used]

## Known Issues / Tech Debt
- [ ] [Issue to address later]

## File Map
- [path]: [Purpose]

Agent Permissions

Agent	Read	Write	Execute	Delegate
architect	✅	✅	✅	✅
explorer	✅	❌	❌	❌
sme	✅	❌	❌	❌
coder	✅	✅	✅	❌
reviewer	✅	❌	❌	❌
critic	✅	❌	❌	❌
test_engineer	✅	✅	✅	❌

Architect-Only Tools

Some tools are restricted to the architect and cannot be called by other agents:

Tool	Purpose
`update_task_status`	Mark plan tasks as `pending \| in_progress \| completed \| blocked`
`write_retro`	Write retrospective evidence bundles before `phase_complete`
`declare_scope`	Pre-declare which files the coder may modify for a given task

File Locking for Concurrent Write Safety

Swarm uses file locking to prevent concurrent writes from corrupting shared state files. Even though tasks execute serially, agents may still race on plan.json and events.jsonl during phase transitions or when multiple sessions interact with the same project.

Implementation

Library: proper-lockfile with retries: 0 (fail-fast — no polling)
Lock acquisition: tryAcquireLock(directory, filename, agentName, taskId) before every write to a shared state file
Lock release: _release() called in a finally block to ensure cleanup even on error
Tagging: Each lock records the agent name and task context for diagnostics

Protected Files

File	Tool	Lock Behavior
`.swarm/plan.json`	`update_task_status`	Exclusive lock acquired before calling `updateTaskStatus()`. Lock losers return `success: false` with `recovery_guidance: "retry"`.
`.swarm/events.jsonl`	`phase_complete`	Lock acquired before `appendFileSync`. If lock is unavailable, logs a warning and proceeds without lock protection (non-blocking).

`update_task_status` Lock Semantics

Caller A: tryAcquireLock(plan.json) → acquired=true → write → _release()
Caller B: tryAcquireLock(plan.json) → acquired=false → returns { success: false, recovery_guidance: "retry" }

Only one call wins. The loser receives:

{
  "success": false,
  "message": "Task status write blocked: plan.json is locked by architect (task: update-task-status-1.1-1234567890)",
  "errors": ["Concurrent plan write detected — retry after the current write completes"],
  "recovery_guidance": "Wait a moment and retry update_task_status. The lock will expire automatically if the holding agent fails."
}

The architect should retry after a short delay. Sequential calls (no contention) always succeed.

Why Not Just Serial Execution?

Tasks execute serially, but the architect may dispatch multiple concurrent agent sessions or tool calls that race on plan/event file writes. Locking provides a second layer of protection against corruption from:

Concurrent update_task_status calls from different sessions
phase_complete concurrent appends to events.jsonl
Race conditions during session rehydration or recovery

Lock Recovery

If the lock holder crashes, the OS or lock library will eventually clean up the stale lock file. On retry, the next call will acquire the lock and proceed. Swarm does not auto-retry on lock contention — the architect receives the error and decides when to retry.

Failure Handling

Task Rejection

Attempt 1: @coder implements
           @reviewer rejects with feedback
           
Attempt 2: @coder fixes based on feedback
           @reviewer rejects again
           
Attempt 3: @coder fixes again
           @reviewer rejects
           
Escalation: Architect handles directly
            OR re-scopes task
            Document in plan.md

Blocked Tasks

Task cannot proceed (external dependency):
├── Mark [BLOCKED] in plan.md
├── Record reason
├── Skip to next unblocked task
└── Inform user

Agent Failure

Agent times out or errors:
├── Retry once
├── If still failing:
│   └── Architect handles directly
└── Document in context.md

Model Fallback (v6.33)

When an agent encounters a transient model error (rate limit, 429, 503, timeout, overloaded, model not found), the guardrails hook detects the failure and triggers fallback behavior:

Detection: The toolAfter hook in guardrails.ts checks for null/undefined tool output combined with error messages matching TRANSIENT_MODEL_ERROR_PATTERN:

const TRANSIENT_MODEL_ERROR_PATTERN =
  /rate.?limit|429|503|timeout|overloaded|model.?not.?found|temporarily unavailable|server error/i;

Behavior on transient error:

Increment session.model_fallback_index (position in fallback_models array)
Set session.modelFallbackExhausted = true (prevents advisory spam)
Inject MODEL FALLBACK advisory into session.pendingAdvisoryMessages

Behavior on success:

Reset session.model_fallback_index = 0
Reset session.modelFallbackExhausted = false

Configuration: Per-agent fallback_models array (max 3) in AgentOverrideConfigSchema:

export const AgentOverrideConfigSchema = z.object({
  model: z.string().optional(),
  temperature: z.number().min(0).max(2).optional(),
  disabled: z.boolean().optional(),
  fallback_models: z.array(z.string()).max(3).optional(),
});

Bounded Coder Revisions (v6.33)

When a task requires multiple coder attempts (e.g., reviewer rejections), the guardrails hook tracks revision counts and warns when limits are approached.

State fields:

session.coderRevisions — Number of times the coder has been re-delegated for the current task
session.revisionLimitHit — Boolean flag set when coderRevisions >= max_coder_revisions

Detection: The toolAfter hook in guardrails.ts increments session.coderRevisions when a coder delegation completes:

// In guardrails.ts toolAfter handler
if (delegation.isDelegation && delegation.targetAgent === 'coder') {
  if (!session.revisionLimitHit) {
    session.coderRevisions++;
    const maxRevisions = cfg.max_coder_revisions ?? 5;
    if (session.coderRevisions >= maxRevisions) {
      session.revisionLimitHit = true;
      session.pendingAdvisoryMessages.push(
        `CODER REVISION LIMIT: Agent has been revised ${session.coderRevisions} times ` +
          `(max: ${maxRevisions}) for task ${session.currentTaskId ?? 'unknown'}. ` +
          `Escalate to user or consider a fundamentally different approach.`
      );
    }
  }
}

Reset: coderRevisions resets to 0 when a new coder delegation is dispatched (unless revisionLimitHit is already true).

Configuration: guardrails.max_coder_revisions in GuardrailsConfigSchema (default: 5, range: 1–20).

Snapshot persistence: Both coderRevisions and revisionLimitHit are serialized in SerializedAgentSession and restored on session rehydration.

Why Serial Execution?

Parallel execution causes:

Race conditions in file modifications
Context inconsistency between agents
Non-deterministic outputs
Debugging nightmares

Serial execution provides:

Predictable order of operations
Clear causal chain
Reproducible results
Easy debugging

Correctness > Speed

Why QA Per Task?

QA at the end causes:

Accumulated bugs
Cascading failures (Task 3 builds on buggy Task 2)
Massive rework
Lost context on what each task was supposed to do

QA per task provides:

Immediate feedback
Issues fixed while context is fresh
No bug accumulation
Clear task boundaries

v6.9.0 Quality Gates (6 New Gates)

v6.9.0 "Quality & Anti-Slop Tooling" adds 6 automated gates to the pre-reviewer pipeline. v6.10.0 adds parallel batch execution for faster QA gates:

Gate	Purpose	Local-Only
`syntax_check`	Tree-sitter parse validation across 20 languages	✅
`placeholder_scan`	Anti-slop detection for TODO/FIXME/stubs	✅
`sast_scan`	Static security analysis with 63+ rules	✅
`sbom_generate`	CycloneDX SBOM generation for dependencies	✅
`build_check`	Build/typecheck verification	✅
`pre_check_batch`	Parallel verification batch (4x faster)	✅
`quality_budget`	Maintainability threshold enforcement	✅

Local-Only Guarantee: All v6.9.0 gates run without Docker, network connections, external APIs, or cloud services. Optional enhancement via Semgrep (if already on PATH).

Why Persistent Files?

Session-only memory causes:

Lost progress on session end
No way to resume projects
Re-explaining context every time
No institutional knowledge

Persistent .swarm/ files provide:

Resume any project instantly
Knowledge transfer between sessions
Audit trail of decisions
Cached SME guidance (no re-asking)

Hooks System

The hooks system is the foundation of v5.1.x+, extended in v6.0.0 with config-aware hint injection. All features are built as hook handlers registered on OpenCode's Plugin API.

Core Utilities

safeHook(handler) — Wraps any hook handler in a try/catch. Errors are logged at warning level; the original payload is returned unchanged. This ensures no hook can crash the plugin.
composeHandlers<I,O>(...handlers) — Composes multiple handlers for the same hook type into a single handler. Runs handlers sequentially on shared mutable output. Each handler is individually wrapped in safeHook.
readSwarmFileAsync(directory, filename) — Reads .swarm/ files using Bun.file().text(). Returns empty string on missing files.
estimateTokens(text) — Conservative token estimation: Math.ceil(text.length * 0.33).

Hook Registration Table

Hook Type	Handler	Purpose
`experimental.chat.messages.transform`	`composeHandlers(pipelineTracker, contextBudget, ccCommandInterceptHook)`	Pipeline logging + token budget warnings; progressive task disclosure; deliberation preamble injection; tier-based behavioral prompt trimming; CC command interception (hard-block /reset, soft-correct /plan, log HIGH severity)
`experimental.chat.system.transform`	`systemEnhancerHook`	Inject phase/task/decisions + cross-agent context; reviewer receives semantic AST diff summary with blast radius (consumers count) from `buildSemanticDiffBlock()`
`experimental.session.compacting`	`compactionHook`	Enrich compaction with plan.md + context.md data
`command.execute.before`	`safeHook(commandHandler)`	Handle `/swarm` slash commands
`tool.execute.before`	`safeHook(activityHooks.toolBefore)`	Track tool usage per agent; append written file paths to `modifiedFilesThisCoderTask`; reset tracking on coder delegation
`tool.execute.after`	`safeHook(activityHooks.toolAfter)`	Record tool results + trigger flush; advance per-task state machine on gate completions; populate `lastGateOutcome`; check scope containment after coder task; set `lastScopeViolation` on drift
`chat.message`	`safeHook(delegationHandler)`	Track active agent per session; extract FILE: directives into `declaredCoderScope`; advance state to `coder_delegated` on new coder delegation

Composition Constraint

The OpenCode Plugin API allows one handler per hook type. When multiple features need the same hook type (e.g., pipeline-tracker and context-budget both use experimental.chat.messages.transform), they must be composed via composeHandlers() into a single registered handler.

Intelligence & Audit Tools (v6.5)

Five new tools extend the architect's decision-making capabilities with intelligence gathering and QA auditing, plus a framework detection layer for first-class framework-specific tooling.

`todo_extract` — Annotation Scanner

Extracts TODO, FIXME, and HACK annotations across the codebase using regex matching and file discovery (Node.js native glob for cross-platform safety).

Usage: Phase 0 (resume check) or Phase 2 (discovery) to identify pre-existing work items and prioritize planning.

Input: paths (directory whitelist), tags (annotation types), exclude (directory patterns)
Output: Structured JSON with file, line, tag, and content for each annotation

Safety: Validates paths against workspace root, rejects shell metacharacters, enforces file size limits

`evidence_check` — Completeness Auditor

Audits completed tasks in .swarm/evidence/ against required evidence types (review, test, diff, approval) and ensures a valid retrospective evidence bundle exists before a phase can be completed.

Usage: Phase 6 (phase complete) to verify every task has sufficient QA artifacts

Input: Task ID pattern (wildcard support)
Output: JSON with per-task evidence status, missing types, and overall completeness score

Safety: Validates task ID format (N.M, N.M.P, retro-N, or internal tool IDs), skips symlinks, reads JSON with size limits

`check_gate_status` — Task Gate Status Query

Read-only tool to query the gate status of a specific task. Reads .swarm/evidence/{taskId}.json and EvidenceBundle entries and returns structured JSON describing which gates have passed, which are missing, and the overall task status.

Usage: Any phase to check task completion status without mutating evidence

Input: task_id (task identifier in N.M, N.M.P, retro-N format, or internal tool ID like "sast_scan", "quality_budget", etc.)
Output: JSON with taskId, status (all_passed|incomplete|no_evidence), required_gates, passed_gates, missing_gates, gates map, message, todo_scan (advisory TODO count if available), and secretscan_verdict (v6.33)

Secretscan verdict (v6.33): The tool now scans EvidenceBundle entries for secretscan type evidence using the isSecretscanEvidence type guard. When secretscan entries are found, it reports:

pass — No secrets found (verdict: pass, approved, or info)
fail — Secrets detected (verdict: fail or rejected), status becomes incomplete and task is BLOCKED
not_run — No secretscan evidence found for this task

Safety: Validates task ID format against three accepted patterns (canonical N.M or N.M.P numeric format, retrospective format retro-N, or internal tool IDs like sast_scan/quality_budget/syntax_check/placeholder_scan/sbom_generate/build/secretscan), enforces path containment within workspace .swarm/evidence/ directory, reads-only (no writes)

`pkg_audit` — Vulnerability Scanner

Wraps npm audit, pip-audit, cargo audit, and composer audit via Bun.spawn to identify security vulnerabilities in project dependencies.

Usage: Phase 2 (discovery) or Phase 6 (phase complete) to scope security risk and feed results to reviewer

Input: ecosystem (npm|pip|cargo|composer), days (vulnerability age), top_n (limit results)
Output: Structured CVE data with severity, patched versions, and advisory URLs

Safety: Validates enum args strictly, bounds-checks integers (1-365 days, 1-100 results), enforces timeout via Promise.race

`complexity_hotspots` — Risk Mapper

Combines cyclomatic complexity analysis with git churn metrics to identify high-risk modules before implementation.

Usage: Phase 0/2 (early warning) or Phase 6 (post-implementation assessment) to flag modules needing stricter QA

Input: paths (file patterns), metrics (complexity|churn|both)
Output: Ranked list of risky files with complexity score, recent commits, and risk level

Safety: Uses Bun.spawn for git commands (not shell pipes), parses output in JavaScript, cross-platform path handling

`schema_drift` — API Contract Auditor

Compares OpenAPI specification files against actual route implementations to surface undocumented routes and phantom spec paths.

Usage: Phase 6 (when API routes were modified) to catch documentation drift before release

Input: spec_file (path to OpenAPI JSON/YAML), routes_dir (implementation directory)
Output: Drift report with missing implementations, extra routes, and parameter mismatches

Safety: Validates spec file extension whitelist and size limits (<10MB), uses lstatSync to skip symlinks, YAML parsing with regex g flag for multi-line patterns

`doc_scan` + `doc_extract` — Two-Pass Documentation Discovery

Implements a two-pass progressive disclosure for documentation: Pass 1 indexes docs at plan time, Pass 2 extracts relevant constraints at task start.

Pass 1: `doc_scan` — Documentation Indexer

Scans project documentation files and builds an index manifest at .swarm/doc-manifest.json.

Usage: Phase 4 (PLAN mode) to build documentation index before tasks begin

Input: force (optional boolean, force re-scan even if cache is valid)
Output: JSON with success, files_count, cached, and full manifest object

Manifest schema:

interface DocManifest {
  schema_version: 1;
  scanned_at: string;        // ISO timestamp
  files: DocManifestFile[];  // Sorted by path
}

interface DocManifestFile {
  path: string;     // Relative to project root
  title: string;    // First # heading or filename
  summary: string;  // First non-empty paragraph (max 200 chars)
  lines: number;    // Total line count
  mtime: number;   // fs.statSync().mtimeMs for cache invalidation
}

Discovery patterns: Uses doc_patterns from DocsConfigSchema plus extras (ARCHITECTURE.md, CLAUDE.md, AGENTS.md, .github/*.md, doc/**/*.md). Skips node_modules, .git, .swarm, test files, and type definitions.

Caching: mtime-based — only re-scans if any indexed file has changed since last scan.

Pass 2: `doc_extract` — Constraint Extractor

Reads the manifest, scores docs against task context using Jaccard bigram similarity, and extracts actionable constraints into .swarm/knowledge.jsonl.

Usage: Phase 5 (EXECUTE mode) at task start to load relevant documentation constraints

Input: task_files (array of file paths), task_description (string)
Output: JSON with success, extracted count, skipped count, and per-doc details with score and constraints

Algorithm:

Read .swarm/doc-manifest.json (or generate via doc_scan if missing)
Build task context from task_files + task_description
Compute Jaccard bigram similarity between task context and each doc's {path, title, summary}
For docs with score > 0.1 relevance threshold, read full content and extract constraints
Extract lines matching constraint patterns: MUST, MUST NOT, SHOULD, SHOULD NOT, DO NOT, ALWAYS, NEVER, REQUIRED; or bullet points with action words
Dedup against existing .swarm/knowledge.jsonl entries via findNearDuplicate (0.6 similarity threshold)
Append non-duplicate constraints as SwarmKnowledgeEntry objects

Constraints: Max 5 per document, 15-200 characters each, markdown stripped.

`framework_detector` — Laravel Framework Detection (Phase 3)

Detects Laravel projects via multi-signal logic in src/lang/framework-detector.ts. Requires 2-of-3 signals for a positive detection:

Signal	Check
`artisan_file`	`artisan` CLI script exists at project root
`laravel_dep`	`laravel/framework` present in `composer.json`
`app_config`	`config/app.php` exists

Usage: During discovery (Phase 2) or before test command resolution, to activate Laravel-specific tooling overlays.

Output:

interface LaravelDetectionSignals {
  hasArtisanFile: boolean;
  hasLaravelDep: boolean;
  hasAppConfig: boolean;
  signals: ('artisan_file' | 'laravel_dep' | 'app_config')[];
  detected: boolean;  // true when 2+ signals present
}

interface LaravelCommandOverlay {
  test: string;           // "php artisan test"
  lint: string[];         // ["vendor/bin/pint", "vendor/bin/php-cs-fixer fix"]
  analyze: string[];       // ["vendor/bin/phpstan analyse", "vendor/bin/phpstan analyse --memory-limit=1G"]
  audit: string;          // "composer audit --locked --format=json"
}

SAST rules active for Laravel projects: sast/php-laravel-sql-injection, sast/php-laravel-mass-assignment, sast/php-laravel-destructive-migration

**Common Security Patterns

All five tools follow strict security practices:

Path validation: path.resolve() + startsWith(workspaceRoot + path.sep) prevents traversal bypass
Command execution: Bun.spawn with array args (never string concat) to prevent shell injection
Timeout protection: Promise.race on all async operations to prevent hangs
Input validation: Enum/range checks on all user-supplied arguments
File access: Node.js native fs (not shell grep/find) for cross-platform safety

Context Pruning

The context pruning system now incorporates several new controls introduced in v6.14.12:

Provider‑aware model limits – token limits are looked up per model via context_budget.model_limits (e.g., a higher limit for Claude Sonnet). This allows the system to adapt the budget based on the active provider.
Priority‑based pruning tiers – messages are classified using the MessagePriority tiers (CRITICAL, HIGH, MEDIUM, LOW, DISPOSABLE). Lower‑priority messages are removed first when the token budget is exceeded.
Agent‑switch enforcement – when enforce_on_agent_switch is true, a hard context reset is triggered whenever the active agent changes (e.g., from explorer to coder).
Tool‑output masking – large tool outputs are masked/truncated once they exceed tool_output_mask_threshold tokens, preventing budget overruns.

These enhancements work together to keep the architect’s context within limits while preserving the most important information.

Context pruning manages the architect's context window to prevent overflow.

Token Budget Tracker

Registered on experimental.chat.messages.transform (composed with pipeline-tracker):

Estimates total tokens across all message parts using estimateTokens()
Looks up model-specific token limit from context_budget.model_limits config (default: 128,000)
At warn_threshold (default 70%): injects [CONTEXT WARNING] message
At critical_threshold (default 90%): injects [CONTEXT CRITICAL] message

Compaction Enhancement

Registered on experimental.session.compacting:

Reads .swarm/plan.md: extracts current phase + incomplete tasks
Reads .swarm/context.md: extracts decisions + patterns
Injects these as compaction context strings (max 500 chars each)
When .swarm/summaries/ exists and contains files:
- Injects [CONTEXT OPTIMIZATION] hint: instructs LLM to replace large tool output blocks (bash, test_runner, lint, diff) with retrieve references, preserving tool name, exit status, and errors
- Injects [STORED OUTPUTS] count showing how many tool outputs are stored
Guides OpenCode's built-in compaction to preserve swarm-relevant context

System Prompt Enhancement

Registered on experimental.chat.system.transform:

Injects current phase + task from plan.md (~200 chars)
Injects top 3 most recent decisions from context.md
Keeps agents focused even after conversation history is compacted
Respects max_injection_tokens budget (default: 4,000 tokens)
Priority ordering: phase → task → decisions → agent context
Lower-priority items dropped when budget is exhausted
v6.0.0: Injects config override hints for always_security_review and integration_analysis.enabled when non-default values are detected

Evidence System

The evidence system persists verifiable execution artifacts per task.

Evidence Types

Type	Fields	Purpose
`review`	risk, issues[]	Reviewer findings
`test`	tests_passed, tests_failed	Test engineer results
`diff`	files_changed[], additions, deletions	Code change summary
`approval`	(base fields only)	Explicit approval record
`note`	(base fields only)	Free-form annotation
`secretscan`	findings_count, scan_directory, files_scanned, skipped_files	Secret scan results (v6.33)
`mutation-gate`	verdict, killRate, adjustedKillRate, summary, survivedMutants[]	Phase-close mutation testing gate results — written by `write_mutation_evidence` tool; blocks phase completion when verdict is `fail` (v6.68+)

Storage

.swarm/evidence/
├── 1.1/
│   └── evidence.json    # EvidenceBundleSchema (array of entries)
└── 2.3/
    ├── evidence.json
    └── diff.patch       # Optional raw diff

Security

Task IDs are sanitized: regex ^[\w-]+(\.[\w-]+)*$, rejects .., null bytes, control chars
Two-layer path validation: sanitize task ID + validateSwarmPath() on full path
Size limits: JSON 500KB, diff.patch 5MB, total per task 20MB
Atomic writes via temp+rename pattern

Retention

Configurable via evidence config:

max_age_days: Archive evidence older than N days (default: 90)
max_bundles: Maximum evidence bundles before auto-archive (default: 1000)
auto_archive: Enable automatic archiving (default: false)

Quality Gates (v6.9.0)

Six new automated gates enforce code quality before human review. All gates run locally without Docker or network dependencies.

Gate Overview

Gate	Function	Fail Action
`syntax_check`	Tree-sitter parse validation	Return to coder for fix
`placeholder_scan`	Detect TODO/FIXME/stubs	Return to coder to complete
`sast_scan`	Static security analysis (66 rules)	Return to coder for fix
`sbom_generate`	CycloneDX SBOM generation	Log for audit trail
`build_check`	Build/typecheck verification	Return to coder for fix
`pre_check_batch`	Parallel verification (v6.10.0)	Return to coder for fix
`quality_budget`	Maintainability enforcement	Return to coder or adjust limits

syntax_check - Parse Validation

Uses Tree-sitter grammars for 20 languages:

TypeScript/JavaScript
Python
Rust
Go
Java
C/C++
Ruby
PHP
C#

Fail condition: Parse errors, unclosed brackets, invalid syntax Resolution: Coder fixes syntax errors before review

placeholder_scan - Anti-Slop Detection

Detects patterns indicating incomplete implementation:

TODO, FIXME, XXX, HACK comments
Placeholder strings (placeholder, stub, implement me)
Empty function bodies
Hardcoded dummy values

Fail condition: Any placeholder pattern in changed files Resolution: Coder completes implementation before review

sast_scan - Static Security Analysis

66+ security rules across 9 languages covering:

SQL injection vectors (including Laravel-specific DB::raw() concatenation)
Path traversal patterns
Hardcoded secrets
Insecure crypto usage
XSS vulnerabilities
Command injection
Laravel-specific: mass-assignment via empty $guarded, destructive migrations without rollback

Offline operation: Built-in rule engine, no external API calls Optional enhancement: Semgrep Tier B rules if available on PATH Fail condition: High/critical severity findings Resolution: Coder fixes security issues before review

sbom_generate - Dependency Tracking

Generates CycloneDX SBOMs from manifest files:

package.json + package-lock.json (npm)
requirements.txt, Pipfile, poetry.lock (Python)
Cargo.toml + Cargo.lock (Rust)
go.mod + go.sum (Go)
pom.xml, build.gradle (Java)
Gemfile.lock (Ruby)
composer.lock (PHP)
.csproj + packages.lock.json (C#)

Output: CycloneDX JSON format Purpose: Security auditing, license compliance Fail condition: None (informational gate)

build_check - Compilation Verification

Runs repository-native build commands:

npm run build / tsc --noEmit (TypeScript)
cargo build / cargo check (Rust)
go build (Go)
javac / Maven / Gradle (Java)
python -m py_compile (Python)

Fail condition: Build errors, type check failures Resolution: Coder fixes build errors before review

quality_budget - Maintainability Enforcement

Enforces configurable thresholds on code changes:

Budget	Default	Description
`max_complexity_delta`	5	Maximum cyclomatic complexity increase
`max_public_api_delta`	10	Maximum new public API surface
`max_duplication_ratio`	0.05	Maximum code duplication ratio (5%)
`min_test_to_code_ratio`	0.3	Minimum test-to-code ratio (30%)

Fail condition: Budget exceeded Resolution: Refactor code or adjust budget thresholds

pre_check_batch - Parallel Verification (v6.10.0)

Runs four verification tools in parallel for 4x faster gate execution:

Tool	Purpose	Gate Type
`lint:check`	Code quality verification	Hard gate
`secretscan`	Secret/credential detection	Hard gate
`sast_scan`	Static security analysis	Hard gate
`quality_budget`	Maintainability metrics	Hard gate

Parallel Execution:

Uses p-limit with max 4 concurrent operations
60-second timeout per tool
500KB combined output limit
Individual failures don't cascade

Return Value:

{
  "gates_passed": true,
  "lint": { "ran": true, "result": {}, "duration_ms": 1200 },
  "secretscan": { "ran": true, "result": {}, "duration_ms": 800 },
  "sast_scan": { "ran": true, "result": {}, "duration_ms": 2500 },
  "quality_budget": { "ran": true, "result": {}, "duration_ms": 400 },
  "total_duration_ms": 3200
}

Configuration:

{
  "pipeline": {
    "parallel_precheck": true  // default: true
  }
}

Fail condition: Any hard gate fails (lint errors, secrets found, SAST findings, budget exceeded) Resolution: Fix specific failures identified in tool results and retry

Local-Only Guarantee

All v6.9.0 quality gates:

✅ Run entirely locally
✅ No Docker containers required
✅ No network connections
✅ No external APIs
✅ No cloud services

Optional enhancement:

Semgrep CLI (if already installed on PATH)

Slash Commands

Twelve commands registered under /swarm:

Command	Description
`/swarm status`	Shows current phase, task progress (completed/total), and agent count
`/swarm plan`	Displays full plan.md content
`/swarm plan N`	Displays only Phase N from plan.md
`/swarm agents`	Lists all registered agents with model, temperature, read-only status, and guardrail profiles
`/swarm history`	View completed phases with status icons
`/swarm config`	View current resolved plugin configuration
`/swarm diagnose`	Health check for .swarm/ files, plan structure, and evidence completeness
`/swarm export`	Export plan and context as portable JSON
`/swarm reset --confirm`	Clear swarm state files (with safety gate)
`/swarm evidence [task]`	View evidence bundles for a task or list all tasks with evidence
`/swarm archive [--dry-run]`	Archive old evidence bundles with retention policy
`/swarm benchmark`	Run performance benchmarks and display metrics
`/swarm retrieve [id]`	Retrieve auto-summarized tool outputs by ID

Implementation

Commands are registered in two steps:

config hook — Adds swarm command to OpenCode's command registry
command.execute.before hook — Intercepts /swarm commands and routes to handlers

The command handler uses a factory pattern: createSwarmCommandHandler(directory, agents) creates a closure over the project directory and agent definitions, returning a handler function.

Context Engineering (v6.21 Phase 4)

Four features added to delegation-gate.ts and guardrails.ts reduce context waste and improve model decision quality.

Progressive Task Disclosure

When the last user message contains more than 5 task lines (matching - [ ] or - [x] with task IDs), messagesTransform trims the list to a context window: the current task (from session.currentTaskId), 2 tasks before it, and 3 tasks after it. All other message content is preserved. A [Task window: showing N of M tasks] comment marks the trim point. If no currentTaskId is set, no trimming occurs.

Deliberation Preamble

At the end of every messagesTransform invocation, a preamble is prepended to the last user message:

[Last gate: {tool} {result} for task {taskId}]
[DELIBERATE: Before proceeding — what is the SINGLE next task? What gates must it pass?]

The preamble is sourced from session.lastGateOutcome. On first invocation (lastGateOutcome is null), an introductory variant fires instead:

[DELIBERATE: Identify the first task from the plan. What gates must it pass before marking complete?]

Low-Capability Model Detection

src/config/constants.ts exports:

LOW_CAPABILITY_MODELS: string[] — Substrings that identify smaller models: 'mini', 'nano', 'small', 'free'
isLowCapabilityModel(modelId: string): boolean — Returns true if modelId (case-insensitive) contains any entry in LOW_CAPABILITY_MODELS. Uses session.activeModel as the input, the same field used for rate limiting in model-limits.ts.

Tier-Based Behavioral Prompt Trimming

Three  …  marker pairs wrap verbose behavioral sections in src/agents/architect.ts:

The BATCHING DETECTION section
The ARCHITECT CODING BOUNDARIES section
The QA gate behavioral description paragraphs

When isLowCapabilityModel(session.activeModel) returns true, guardrails.ts messagesTransform strips all text between every marker pair (inclusive) and replaces each removed block with [Enforcement: programmatic gates active]. If session.activeModel is null or undefined, the prompt is left unchanged.

Rationale: Smaller models benefit from shorter, more directive prompts. The programmatic enforcement mechanisms (state machine, hard blocks, scope containment) provide equivalent safety guarantees without verbose behavioral instructions consuming context.

Structural Scope Declaration (v6.21 Phase 5)

`declare_scope` Tool

Architect-only tool that pre-declares which files a coder delegation is allowed to modify.

Input:

{
  taskId: string,          // N.M or N.M.P format
  files: string[],         // file paths the coder may modify
  whitelist?: string[],    // additional allowed paths
  working_directory?: string
}

Validation:

taskId must match N.M or N.M.P format
files must be non-empty; no null bytes, path traversal (..), or overly long paths
working_directory must exist and contain .swarm/plan.json
taskId must exist in the plan; task must not already be in 'complete' state

On success: Sets session.declaredCoderScope = mergedFiles (files + whitelist) and session.lastScopeViolation = null on ALL active architect sessions. Returns { success: true, taskId, fileCount }.

On failure: Returns structured error with { success: false, message, errors[] }.

Automatic alternative: delegation-gate.ts extracts FILE: directive values from any coder delegation envelope and stores them as session.declaredCoderScope automatically. An explicit declare_scope call is only needed when scope must be declared before the delegation text is composed.

Scope Containment Enforcement

Tracking (guardrails.ts toolBefore):

When the architect uses any file-modifying tool (write, edit, patch, create_file, insert, replace), the target file path is appended to session.modifiedFilesThisCoderTask.
When a coder Task delegation is dispatched, modifiedFilesThisCoderTask resets to [].

Checking (guardrails.ts toolAfter):

After a coder Task delegation completes, modifiedFilesThisCoderTask is compared against declaredCoderScope.
If declaredCoderScope is non-null and more than 2 files in modifiedFilesThisCoderTask are outside the declared scope: session.lastScopeViolation is set to a message listing the undeclared file paths.
modifiedFilesThisCoderTask resets to [] after the check.

Warning injection (guardrails.ts messagesTransform):

If session.scopeViolationDetected is set, a scope violation warning is injected into the next architect message.
The flag clears immediately (before nested conditionals) to prevent stale state across turns.

Agent Awareness

Agent awareness tracks what each agent is doing and shares relevant context across agents via system prompts. The architect remains the sole orchestrator — there is no direct inter-agent communication.

Shared State

src/state.ts exports a module-scoped singleton (swarmState) with:

activeAgent: Map<sessionId, agentName> — Which agent is active in each session (updated by chat.message hook)
agentSessions: Map<sessionId, AgentSessionState> — Per-session guardrail tracking. Key fields:
- toolCallCount, startTime, delegationActive — Guardrail counters
- taskWorkflowStates: Map<string, TaskWorkflowState> — Per-task state machine. States: 'idle' | 'coder_delegated' | 'pre_check_passed' | 'reviewer_run' | 'tests_run' | 'complete'. Transitions are forward-only; complete can only be reached from tests_run.
- lastGateOutcome: { gate, taskId, passed, timestamp } | null — Most recent gate result, populated by guardrails.ts toolAfter. Used for deliberation preamble injection in Phase 4 context engineering.
- declaredCoderScope: string[] | null — File paths from the coder delegation FILE: directives or explicit declare_scope tool call (Phase 5). Null means no scope has been declared.
- lastScopeViolation: string | null — Last scope containment violation message (Phase 5). Set when coder modifies >2 files outside declared scope; cleared after warning is injected.
- modifiedFilesThisCoderTask: string[] — File paths the architect has written during the current coder task (Phase 5). Reset to [] when the next coder delegation starts.
- scopeViolationDetected?: boolean — One-shot flag (Phase 5). Set when a scope violation is found; cleared immediately after the warning is injected into the next architect message.
- model_fallback_index: number — Current index into the fallback_models array (v6.33). Incremented on transient model failure; resets to 0 on success.
- modelFallbackExhausted: boolean — Flag set when all fallback models have been exhausted (v6.33). Prevents advisory spam on consecutive transient errors.
eventCounter: number — Tracks events for flush threshold
flushLock: Promise | null — Serializes context.md writes
resetSwarmState() — Clears all state (used in tests)

State machine helpers: advanceTaskState(session, taskId, newState) enforces forward-only transitions (throws INVALID_TASK_STATE_TRANSITION on illegal transitions); getTaskState(session, taskId) returns 'idle' for unknown tasks.

Scope containment helper: isInDeclaredScope(filePath, scopeEntries) in guardrails.ts resolves both the candidate path and each scope entry with path.resolve() then checks containment with path.relative(). This handles directory entries correctly (a scope entry of src/ covers all files below it) without brittle string includes() matching.

The module has zero imports — it's pure TypeScript with no project dependencies.

Stale Delegation Detection

When a subagent finishes and returns control to the architect, there's a race condition between the chat.message hook (which updates activeAgent) and the tool.execute.before hook (which checks guardrails). To prevent the architect from inheriting subagent limits during this transition:

Stale delegation window: If lastToolCallTime is >10 seconds old, the session is considered stale and reverts to architect
Delegation active flag: If delegationActive=false (subagent finished), immediately revert to architect
Early exemption: Three name-based architect checks in the guardrails hook provide defense-in-depth

The 10-second window is tight enough to prevent architect misidentification but loose enough to allow slow subagent operations (file I/O, network).

Activity Tracking Flow

chat.message hook                tool.execute.before hook         tool.execute.after hook
─────────────────               ────────────────────────         ───────────────────────
│                                │                                │
├─ Extract agent name            ├─ Read active agent from        ├─ Record tool result
│  (strip prefix:                │  swarmState                    │  (success heuristic)
│   paid_, local_,               ├─ Log: "agent X using          ├─ Increment event counter
│   mega_, default_)             │  tool Y"                       ├─ If counter >= 20:
├─ Update activeAgent            │                                │  └─ Flush to context.md
│  map                           │                                │     (promise-based lock)
│                                │                                │

Cross-Agent Context Injection

The system-enhancer reads the ## Agent Activity section from context.md and maps agent names to context labels:

coder → implementation context
reviewer → review findings
test_engineer → test results
Other agents → general context

Injected text is truncated to hooks.agent_awareness_max_chars (default: 300 characters).

v6.7 Background Automation Framework

v6.7 is GUI-first and background-first: Slash commands remain the primary control surface, but background automation provides autonomous operation when enabled.

Automation Modes

Three modes control background-first rollout:

Mode	Behavior	Use Case
`manual`	No background automation, all actions via slash commands (default)	Conservative rollout, full control
`hybrid`	Background automation for safe operations, slash commands for sensitive ones	Gradual feature rollout
`auto`	Full background automation (target state)	Future production use

Default: manual for backward compatibility. Enable automation via config.

Per-Capability Feature Flags

All v6.7 automation features are gated behind explicit feature flags (all default false):

Feature Flag	Description	Security
`plan_sync`	Plan auto-heal: regenerate plan.md from canonical plan.json when out of sync	Safe - read-only regeneration
`phase_preflight`	Phase-boundary preflight checks before agent execution	Safe - validation-only
`config_doctor_on_startup`	Config Doctor runs on startup to validate/fix configuration	Moderate - auto-fix requires explicit opt-in
`config_doctor_autofix`	Auto-fix mode for Config Doctor (requires config_doctor_on_startup)	Safety: Defaults to false - autofix requires explicit opt-in
`evidence_auto_summaries`	Generate automatic summaries for evidence bundles	Safe - read-only aggregation
`decision_drift_detection`	Detect drift between planned and actual decisions	Moderate - drift detection only

Core Automation Components

1. Event Bus (`AutomationEventBus`)

Typed event system for internal automation events. Events:

Queue events (enqueued, dequeued, completed, failed, retry scheduled)
Worker events (started, stopped, error)
Circuit breaker events (opened, half-open, closed, callSuccess, callFailure)
Loop protection events (triggered)
Preflight events (requested, triggered, skipped, completed)
Phase boundary events (detected, checked)
Task events (completed)
Evidence summary events (generated, error)

All events include timestamp, payload, and optional source identifier.

2. Queue (`AutomationQueue`)

Lightweight in-process queue with:

Priority levels: critical > high > normal > low
FIFO ordering within priority
Exponential backoff retry (configurable max)
Max queue size protection (default 1000)
Retry metadata tracking (attempts, next attempt time, backoff)

3. Worker Manager (`WorkerManager`)

Lifecycle manager for background workers:

Register workers with handler functions
Configurable concurrency (default 1)
Auto-start support
Processing loop with idle detection
Statistics tracking (processed count, error count, queue size)

4. Circuit Breaker (`CircuitBreaker`)

Fault tolerance primitive:

States: closed (normal) → open (fail fast) → half-open (testing recovery)
Configurable failure threshold (default 5) and reset timeout (default 30s)
Call timeout support (default 10s)
Success threshold for half-open → closed (default 3)

Prevents cascading failures during background automation.

5. Loop Protection (`LoopProtection`)

Infinite loop prevention:

Tracks operation frequency over time window (default 10s)
Configurable max iterations (default 5)
Operation key for tracking specific operations
Automatic detection and abort on threshold exceed

6. Status Artifact (`AutomationStatusArtifact`)

Passive status writer for GUI visibility:

Writes to .swarm/automation-status.json
Tracks mode, capabilities, phase, triggers, outcomes
GUI-friendly summary with readable status text

Plan Sync Auto-Heal

v6.7 Task 5.1: Automatic plan.json ↔ plan.md synchronization.

Auto-Heal Flow

loadPlan(directory):
  1. Try to load plan.json
     ├─ VALID → Check if plan.md in sync
     │  ├─ In sync → Return plan
     │  └─ Out of sync → Auto-regenerate plan.md from plan.json
     │
     └─ INVALID → Try to migrate from plan.md
        ├─ plan.md exists → Migrate, save both files, return plan
        └─ plan.md doesn't exist → Fall through
  2. Try to load plan.md only (no auto-migration)
     ├─ Exists → Return migrated plan
     └─ Doesn't exist → Fall through
  3. Neither exists → Return null

Deterministic Hashing

Content hash uses natural numeric sorting for task IDs:

"1.2" < "1.10" (not "1.2" < "1.10")
Example: 1.1, 1.2, 1.10, 1.11, 2.1 (sorted correctly)
Hash stored in plan.md header as

Atomic Writes

Plan.json writes use temp+rename pattern for atomicity:

Write to plan.json.tmp.{timestamp}
Atomic rename to plan.json
Derive plan.md with hash comment

Services (Extracted from Commands)

Preflight Service

Validates project state before agent execution:

Checks plan completeness
Validates evidence requirements per task
Detects blockers and missing dependencies
Returns actionable findings with severity levels

Gated behind: automation.capabilities.phase_preflight

Config Doctor

Startup service that validates and fixes configuration:

Validates config schema and types
Detects stale/invalid settings
Classifies findings by severity (info/warn/error)
Proposes safe auto-fixes

Security: Defaults to scan-only mode. Autofix requires explicit automation.capabilities.config_doctor_autofix = true.

Backups: Creates encrypted backups in .swarm/ before auto-fix. Supports restore via /swarm config doctor --restore <backup-id>.

Decision Drift Analyzer

Detects drift between planned and actual decisions:

Stale decisions (age/phase mismatch)
Contradictory decisions (use vs don't use, keep vs remove, etc.)
Caches decisions from ## Decisions section in context.md
Returns structured drift signals for context injection

Gated behind: automation.capabilities.decision_drift_detection

Evidence Summary Service

Aggregates evidence per task and phase:

Machine-readable JSON summary in .swarm/evidence-summary.json
Human-readable markdown in .swarm/evidence-summary.md
Per-task completion status
Phase-level blockers (missing evidence, incomplete tasks, blocked tasks)

Gated behind: automation.capabilities.evidence_auto_summaries

Slash Command Adapters

Commands expose service functionality without blocking UI:

Command	Function	Security
`/swarm preflight`	Run preflight checks on current plan	Safe - validation-only
`/swarm config doctor [--fix] [--restore <id>]`	Config Doctor with optional auto-fix and restore	Moderate - auto-fix opt-in
`/swarm doctor tools`	Tool registration coherence, AGENT_TOOL_MAP alignment, and Class 3 binary readiness check	Safe - validation-only
`/swarm sync-plan`	Force plan.md regeneration from plan.json	Safe - read-only

All commands:

Non-blocking (fire and forget for background ops)
Async execution (don't block OpenCode UI)
Log results to console
Store artifacts in .swarm/

GUI Visibility

AutomationStatusArtifact provides passive status for GUI:

{
  "timestamp": 1234567890,
  "mode": "manual",
  "enabled": false,
  "currentPhase": 2,
  "lastTrigger": null,
  "pendingActions": 0,
  "lastOutcome": null,
  "capabilities": {
    "plan_sync": false,
    "phase_preflight": false,
    "config_doctor_on_startup": false,
    "config_doctor_autofix": false,
    "evidence_auto_summaries": false,
    "decision_drift_detection": false
  }
}

GUI uses getGuiSummary() for display:

Status (Disabled / Hybrid / Auto)
Current phase
Last trigger time
Pending actions count
Last outcome (success/failure/skipped)

v6.8 Enhanced Automation & Background Workers

v6.8 builds on v6.7 with automatic execution triggers and persistent background workers.

Auto-Trigger System

Phase Monitor Hook

createPhaseMonitorHook() in src/hooks/phase-monitor.ts:

Detects phase transitions in the execution pipeline
Automatically triggers preflight checks when phase changes
Registers on PhaseChangeHook event hook
Configurable via automation.capabilities.phase_preflight (requires explicit enable)

Preflight Integration

createPreflightIntegration() wires the phase monitor into the hook chain:

Phase Monitor Hook → Preflight Service → Auto-trigger if phase changes

Benefits:

No manual /swarm preflight command needed during execution
Consistent preflight checks at every phase boundary
Automatic blocker detection before agent execution

Background Workers

Plan Sync Worker

PlanSyncWorker class in src/background/plan-sync-worker.ts:

File Watching:

fs.watch on plan.json for real-time synchronization
2-second polling fallback if fs.watch fails (network/mount issues)

Debouncing:

300ms debounce before processing changes
Prevents multiple rapid updates from triggering unnecessary regenerations
Batch writes for better performance

Overlap Lock:

Exclusive lock on plan.json during regeneration
Concurrent reads during regeneration (readers/writer pattern)
Safe shutdown with grace period

Safe Shutdown:

Graceful shutdown on plugin unload
Cancel pending operations
Release all locks before exit

Evidence Summary Worker

Background service that auto-generates evidence summaries:

Scheduled generation for long-running tasks
Aggregates per-task evidence into phase-level summaries
Writes to .swarm/evidence-summary.json
Triggers via /swarm evidence summary command

Configuration Updates

New Defaults

Configuration	v6.7 Default	v6.8 Default	Reason
`evidence_auto_summaries`	`false`	`true`	Long-running tasks benefit from automatic summaries
`plan_sync`	`false`	`true`	Auto-healing plan.json ↔ plan.md is safe and recommended

Migration Path

v6.7 configs are fully compatible with v6.8:

Set evidence_auto_summaries: true in config to enable automation
Set plan_sync: true in config to enable background synchronization
Previous configs remain valid (defaults preserved for disabled features)

Testing Coverage

808 new tests across 6 new test files:

evidence-summary-init.test.ts — Evidence summary service initialization
evidence-summary-init.adversarial.test.ts — Error handling and recovery
evidence-summary-automation.test.ts — Auto-generation triggers
phase-preflight-auto.test.ts — Phase monitor and auto-trigger
plan-sync-worker.test.ts — Worker core functionality
plan-sync-worker.adversarial.test.ts — Edge cases and failures
plan-sync-init.test.ts — Worker initialization

Total: 4008 tests across 136 files

Integration Details

Hook Chain

Plugin Init
├── EvidenceSummaryIntegration (auto-generates summaries)
├── PhaseMonitorHook (detects phase changes)
└── PreflightIntegration (wires phase monitor to preflight)

Background Worker Registration

Plugin Init
└── WorkerManager
    └── Register PlanSyncWorker (auto-sync plan.json → plan.md)

Slash Command: `/swarm evidence summary`

Manual trigger for evidence summary generation:

// Command handler in src/commands/evidence.ts
evidenceCommand.execute(async (args) => {
  await EvidenceSummaryService.generate();
  return { success: true, message: 'Evidence summary generated' };
});

Benefits

For Architects:

Less manual intervention during long-running tasks
Automatic plan synchronization without refresh
Consistent preflight checks at every phase boundary

For Projects:

Resumable, maintainable execution state
Automatic evidence aggregation
Reduced risk of plan drift
Comprehensive audit trail

For Users:

"set it and forget it" automation
No breaking changes to existing configs
Clear visibility via status artifacts
Graceful degradation on failures

Recommended .swarm/ Memory Architecture (v6.14+)

Research findings (Claude Code, Windsurf, JetBrains, Trajectory Miner) support separating .swarm/context.md into distinct concerns:

.swarm/
├── context.md          ← Human-authored rules & architecture (static, version-controlled)
├── patterns.md         ← Agent-discovered patterns (auto-updated, 200-line limit)
├── plan.json / plan.md ← Current plan state
├── evidence/
│   ├── retro-1/evidence.json   ← Phase 1 retrospective
│   ├── retro-2/evidence.json   ← Phase 2 retrospective
│   └── {task-id}/evidence.json ← Task-level evidence bundles
├── events.jsonl        ← Event log
└── telemetry.jsonl     ← Session observability (JSONL, 10MB rotation)

Key principles:

Static rules (context.md) always override auto-learned patterns (patterns.md)
Retrospectives are keyed by retro-{phase_number} convention
User directives with scope: project are persisted to context.md
Agent-discovered patterns require 2+ session frequency before persisting

Note: The actual restructuring of context.md is deferred to v6.14. This section documents the target architecture for planning purposes.

Parallelization Foundation (PR 1 — Dark)

PR 1 of the stacked parallelization release lands dark infrastructure with zero behavior change. See docs/archive/dev/pr1-foundation.md for the full design note covering:

What is dark now vs. deferred to PR 2 and PR 3
Process-global variable audit and isolation classification
Invariants that prove runtime behavior is unchanged

FilesExpand file tree

architecture.md

Latest commit

History

architecture.md

File metadata and controls

OpenCode Swarm Architecture

Design Philosophy

Control Model

Architect: The Brain

Explorer: The Eyes

Identity Rules

Analysis Protocol (Phase 2 Hardening)

Output Sections (Phase 2)

Judgmental Language Removed (Phase 2)

Integration Impact Analysis Mode

Documentation Discovery Mode

Curator Modes

Role Boundary Hardening (Phase 3–5)

Designer: The Blueprint

SME: The Advisor

Pipeline Agents: The Hands

Critic: The Gate

Docs: The Scribe

Execution Flow

Phase 0: Initialize or Resume

Phase 1: Clarify

Phase 2: Discover

Phase 3: Consult SMEs

Phase 4: Plan

Phase 4.5: Critic Gate

MODE: EXECUTE

MODE: PHASE-WRAP

Phase Completion Gates (v6.33.4)

MODE Labels (v6.11)

Plan Cursor (v6.13)

Tool Output Truncation (v6.13)

Structured Search Tool — search (v6.45.0)

Reviewer-Safe Patch Suggestion Tool — suggest_patch (v6.45.0)

Batched Symbol Extraction — batch_symbols (v6.45.0)

Tool-Name Normalization — normalizeToolName (v6.45.0)

Mode Detection (v6.13)

Observable Output (v6.11)

Retry Protocol (v6.11)

Anti-Exemption Rules (v6.11)

Pre-Commit Rule (v6.11)

Task Granularity Rules (v6.11)

File Structure

plan.md Schema

context.md Schema

Agent Permissions

Architect-Only Tools

File Locking for Concurrent Write Safety

Implementation

Protected Files

update_task_status Lock Semantics

Why Not Just Serial Execution?

Lock Recovery

Failure Handling

Task Rejection

Blocked Tasks

Agent Failure

Model Fallback (v6.33)

Bounded Coder Revisions (v6.33)

Why Serial Execution?

Why QA Per Task?

v6.9.0 Quality Gates (6 New Gates)

Why Persistent Files?

Hooks System

Core Utilities

Hook Registration Table

Composition Constraint

Intelligence & Audit Tools (v6.5)

todo_extract — Annotation Scanner

evidence_check — Completeness Auditor

check_gate_status — Task Gate Status Query

pkg_audit — Vulnerability Scanner

complexity_hotspots — Risk Mapper

schema_drift — API Contract Auditor

doc_scan + doc_extract — Two-Pass Documentation Discovery

Structured Search Tool — `search` (v6.45.0)

Reviewer-Safe Patch Suggestion Tool — `suggest_patch` (v6.45.0)

Batched Symbol Extraction — `batch_symbols` (v6.45.0)

Tool-Name Normalization — `normalizeToolName` (v6.45.0)

`update_task_status` Lock Semantics

`todo_extract` — Annotation Scanner

`evidence_check` — Completeness Auditor

`check_gate_status` — Task Gate Status Query

`pkg_audit` — Vulnerability Scanner

`complexity_hotspots` — Risk Mapper

`schema_drift` — API Contract Auditor

`doc_scan` + `doc_extract` — Two-Pass Documentation Discovery

Pass 1: `doc_scan` — Documentation Indexer

Pass 2: `doc_extract` — Constraint Extractor

`framework_detector` — Laravel Framework Detection (Phase 3)

`declare_scope` Tool

1. Event Bus (`AutomationEventBus`)

2. Queue (`AutomationQueue`)

3. Worker Manager (`WorkerManager`)

4. Circuit Breaker (`CircuitBreaker`)

5. Loop Protection (`LoopProtection`)

6. Status Artifact (`AutomationStatusArtifact`)