agent-contracts

Design multi-agent systems as contracts.

agent-contracts is a toolkit for declaratively defining multi-agent development workflows in a YAML DSL, with static validation, semantic linting, and prompt rendering.

It is designed for teams that need more than “agents that happen to work”. It helps you define, validate, and evolve:

who each agent is
what tasks can be delegated
which artifacts exist and who owns them
what validations are required
how handoffs are structured
how prompts are rendered from the design itself

Instead of letting workflow rules live only in prompts and code, agent-contracts makes the system explicit, reviewable, and CI-checkable.

Why agent-contracts?

Most agent frameworks focus on runtime execution.

agent-contracts focuses on design-time guarantees.

As multi-agent systems grow, teams usually run into the same problems:

agent responsibilities become ambiguous
handoff rules drift across prompts
artifact ownership is unclear
validation logic is inconsistent
prompts diverge from the intended workflow
shared team conventions stop being enforceable

agent-contracts addresses this by treating your agent workflow as a contract, not just a set of prompts.

You can think of it as:

OpenAPI for multi-agent workflows
a contract layer above runtime orchestration
a source of truth for agent roles, handoffs, and artifact flows

Who this is for

agent-contracts is a strong fit for teams that build or operate:

multi-agent coding workflows
spec → implement → audit → release style pipelines
internal agent platforms
review-heavy or gate-heavy delivery processes
agent systems where artifact ownership matters
reusable team definitions shared across projects

Typical users include:

platform teams standardizing agent workflows
engineering teams building internal coding/review agents
products that require explicit validation and handoff policies
teams that want CI enforcement for agent design consistency

Who this is not for

agent-contracts is probably not the right starting point if you want:

a single-agent chatbot
a quick prompt prototype
an all-in-one hosted agent runtime
built-in scheduling, memory, tracing, or hosting
a purely code-first orchestration style with no declarative spec
maximum flexibility with minimal process constraints

In short:

if you want to run agents quickly, start with a runtime framework
if you want to design multi-agent systems that stay coherent over time, use agent-contracts

What makes it different?

agent-contracts does not try to replace every agent framework.

It occupies a different layer.

Positioning

Product / approach	Primary focus	Best fit	How `agent-contracts` differs
OpenAI Agents SDK	runtime execution with instructions, tools, and handoffs	apps built around agent runtime behavior	`agent-contracts` focuses on design contracts, static guarantees, and artifact relationships
CrewAI	agent/task workflow orchestration	teams that want runtime task execution in YAML	`agent-contracts` goes deeper on validation, ownership, inheritance, and renderable design specs
AutoGen	code-first multi-agent programming	research or custom orchestration flows	`agent-contracts` is more declarative, reviewable, and CI-oriented
Google ADK style patterns	choosing runtime interaction patterns	production systems built around runtime composition	`agent-contracts` is framework-agnostic and centered on workflow design as a contract

The key distinction is simple:

Other frameworks mainly answer: How do I run these agents?
agent-contracts answers: What is the allowed structure of this agent system, and how do we keep it correct as it evolves?

This positioning matches a common split in the ecosystem: some frameworks center on the agent runtime, while others separate agent definition from task invocation. agent-contracts sits across those execution models as a design-time contract layer.

Quick Start

Define your system in a single YAML file:

# agent-contracts.yaml
version: 1
system:
  id: my-project
  name: My Agent Workflow
  default_workflow_order: [design, implement]

agents:
  architect:
    role_name: "Architect"
    purpose: "Drive phases and delegate work"
    can_invoke_agents: [implementer]

  implementer:
    role_name: "Implementer"
    purpose: "Implement features based on specs"

tasks:
  implement-feature:
    description: "Delegate feature implementation"
    target_agent: implementer
    allowed_from_agents: [architect]
    workflow: implement
    input_artifacts: [spec-md]
    invocation_handoff: task-delegation
    result_handoff: implementation-result

artifacts:
  spec-md:
    type: document
    owner: architect
    producers: [architect]
    editors: [architect]
    consumers: [implementer]
    states: [draft, reviewed, approved]

Validate and generate:

agent-contracts validate
agent-contracts generate -c agent-contracts.config.yaml

A working example is available in sample/.

A multi-team example is available in sample/multi-team/, demonstrating cross-team interface declaration and consumption.

Core concepts

Agent

An Agent defines who an execution entity is:

role name
purpose
capabilities
permissions
constraints
behavioral rules
structured content sections (reference material, procedures, criteria)
memory — optional capability declaration for session resume support (resumable, ref_required, emits_memory_ref)

Task

A Task defines a delegatable unit of work:

target agent
allowed callers
workflow
input artifacts
invocation/result handoffs
task-specific execution expectations
model_class — optional LLM capability requirement (fast, standard, thinking)

Artifact

An Artifact defines the objects that move through the workflow:

owner
producers
editors
consumers
states
required validations
visibility

Tool

A Tool defines an invokable CLI/MCP tool:

kind (cli, mcp, etc.)
input/output artifacts
invokable_by (which agents can use it)
extends — inherit from a base tool definition
command — single command name (alternative to commands[])
commands — structured list of sub-commands with category, reads, writes, and purpose
cli_contract — path to a CLI contract YAML (for CLI/MCP adapter invocation)
component_contract — path to an AaaC Component contract YAML (for in-process / SDK / MCP Component invocation). Mutually exclusive with cli_contract.
artifact_bindings — maps contract slot names to project artifact IDs
effects (on agents/tasks) — optional narrow-only override of capability effects derived from executable tools

Workflow

A Workflow defines a phase-level execution sequence:

description — human-readable summary
entry_conditions
trigger
external_participants — actors/participants outside the agent system (e.g., User, external advisory)
ordered steps (delegate, gate, team_task, decision; legacy: handoff, validation)

Workflow steps support additional properties:

group — consecutive steps with the same group are rendered as par (parallel) blocks in sequence diagrams
depends_on — list of step task IDs that must complete before this step starts. When specified, the runtime can execute independent steps in parallel. When omitted, the step implicitly depends on all preceding steps (sequential execution)
max_retries (delegate steps) — maximum number of full task re-executions (new sessions) allowed per step. Defaults to 0 (no retries), or 1 when a retry block is present
max_follow_ups (delegate steps) — maximum number of lightweight same-session follow-up messages for output format corrections
retry (delegate steps) — defines a conditional retry loop with condition, fix_task, and optional revalidate_task. These are rendered as recovery instructions in the LLM prompt
routing_key (decision steps) — the field that determines branch selection. The legacy field `on` is still accepted but deprecated, because `on` collides with a YAML 1.1 reserved word
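
The depends_on rule above (explicit dependencies when present, an implicit dependency on every preceding step when omitted) can be illustrated with a small scheduling sketch. This is not the tool's implementation: the function name `execution_waves` and the use of a `task` field as the step ID are assumptions made for the example.

```python
def execution_waves(steps: list[dict]) -> list[list[str]]:
    """Group step IDs into waves; steps in one wave could run in parallel."""
    ids = [step["task"] for step in steps]  # assumption: 'task' is the step ID
    deps: dict[str, set[str]] = {}
    for index, step in enumerate(steps):
        if "depends_on" in step:
            deps[step["task"]] = set(step["depends_on"])
        else:
            # Omitted depends_on: implicit dependency on all preceding steps.
            deps[step["task"]] = set(ids[:index])

    done: set[str] = set()
    waves: list[list[str]] = []
    while len(done) < len(ids):
        ready = [i for i in ids if i not in done and deps[i] <= done]
        if not ready:
            raise ValueError("dependency cycle or unknown step ID")
        waves.append(ready)
        done.update(ready)
    return waves


steps = [
    {"task": "specify"},
    {"task": "implement", "depends_on": ["specify"]},
    {"task": "write-tests", "depends_on": ["specify"]},
    {"task": "audit"},  # no depends_on: waits for everything before it
]
waves = execution_waves(steps)
```

Here `implement` and `write-tests` land in the same wave because both depend only on `specify`, while `audit` falls back to sequential behavior.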

Validation

A Validation defines a verification step for an artifact:

target_artifact — the artifact being verified
kind — the type of verification (see below)
executor_type — tool (automated) or agent (agent-driven)
executor — the tool or agent that runs the validation
blocking — whether the validation must pass before proceeding
produces_evidence — optional artifact produced as evidence

Validation kinds

Kind	Purpose	Example
`schema`	Structural schema check	JSON Schema validation, OpenAPI lint, SQL syntax
`mechanical`	Automated tool check	CLI linters, diff checks, coverage reports
`semantic`	Meaning-level review	Agent-based review of spec intent, plan coherence
`approval`	Human/agent sign-off gate	Architect approval before implementation
`provenance`	Source derivation verification	Confirm generated artifact derives from its canonical source (e.g., manifest from API contracts)
`traceability`	Cross-artifact link completeness	Verify every spec requirement reaches contracts, tests, and code
`fidelity`	Semantic faithfulness to source	Confirm tests actually verify spec intent, not just structural compliance

schema and mechanical are best suited for automated checks via tools. semantic, fidelity, and approval are typically agent-driven. provenance and traceability can be either tool or agent-based depending on the verification complexity.

Guardrail

A Guardrail declares a cross-cutting constraint:

description — what is protected
scope — which DSL entities it applies to (agents, tasks, tools, artifacts, workflows)
rationale — why the constraint exists
tags — classification for filtering
exemptions — glob patterns or entity IDs exempt from the guardrail

Guardrail Policy

A Guardrail Policy defines enforcement strategy for guardrails:

rules — array of enforcement rules mapping guardrails to actions
Each rule specifies: severity (critical/mandatory/warning/info), action (block/warn/shadow/info), override permissions
action supports a conditional form for state-dependent enforcement: { default: "block", when: { maintenance: "shadow" } }
Available states are declared system-wide via system.states
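
As a rough illustration of the conditional form, the sketch below picks the effective action for a given workspace state. The function name `resolve_action` and the fallback to `default` for unlisted states are assumptions for the example, not documented behavior.

```python
def resolve_action(action, current_state=None):
    """Return the effective guardrail action for the current workspace state."""
    if isinstance(action, str):
        # Plain form: the same action applies in every state.
        return action
    overrides = action.get("when", {})
    # Conditional form: use the state-specific action, else the default.
    return overrides.get(current_state, action["default"])


conditional = {"default": "block", "when": {"maintenance": "shadow"}}
in_maintenance = resolve_action(conditional, "maintenance")
in_other_state = resolve_action(conditional, "release")
```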

Handoff Type

A Handoff Type defines the schema for inter-agent messages:

schema — a JSON Schema object describing the full message structure
description
example
version

Schemas can use allOf with $ref: "#/components/schemas/..." to compose shared fields (e.g., common envelope) with type-specific properties.

Components

Components provide reusable definitions, following the OpenAPI pattern:

components.schemas — named JSON Schema fragments that can be referenced from anywhere via $ref: "#/components/schemas/<name>"

Why teams adopt it

1. Explicit workflow design

Your architecture stops living only in prompts, code, and tribal knowledge.

2. Static guarantees before runtime

You can catch broken references, invalid ownership, missing validations, and workflow inconsistencies before execution.

3. Prompt generation from source of truth

Rendered prompts come from the same DSL that defines roles, tasks, artifacts, and policies.

4. Reuse across teams and projects

Shared base definitions can be extended safely with extends.

5. Better CI discipline

Design regressions become testable.

Features

Declarative YAML DSL for multi-agent development workflows
Agent sections for embedding structured reference material, procedures, and criteria directly in agent definitions
Static schema validation
Reference integrity checks
Semantic linting
Structured handoff definitions with formal JSON Schema and allOf composition
Reusable schema components via components.schemas and JSON Pointer $ref
Artifact ownership and lifecycle modeling
Config-driven prompt rendering with skip_empty support for conditional file generation
Variable substitution via ${vars.xxx} in DSL values
Inheritance with merge operators via extends
Guardrail definitions for cross-cutting process constraints
Guardrail policies with configurable enforcement (block/warn/shadow/info)
State-dependent guardrail action — action accepts a conditional form { default, when } keyed by system.states for workspace-mode-aware enforcement
Software bindings (DI) for tool-specific guardrail implementation (Cursor, Git, GitHub)
Guardrail generation from DSL + policy + bindings via generate guardrails
Navigation index — compile-time artifact-centric model mapping artifacts to operations, agent permissions, relations, and action routes
Artifact coverage — measure what percentage of project files are covered by artifact path_patterns definitions, with CI gating via --min-coverage
Tool extends — tool inheritance for sharing cli_contract, artifact_bindings, and other metadata across related tool definitions
Interface generation from DSL via generate interface for cross-team contracts
Flexible file splitting via $ref (replacement), $refs (import + deep-merge), and JSON Pointer $ref (in-document)
Multi-team collaboration via team_interface (public boundary), imports (team consumption), and team_task (cross-team delegation)
YAML safety linting for reserved word collision detection across YAML 1.1/1.2
extensions declarations with scope, schema validation, and strict enforcement for custom x-* fields
resolve --expand-defaults to materialize all Zod schema defaults in output
DSL completeness scoring with 7 dimensions, text/JSON output, and --threshold CI gate
LLM-based semantic audit — design coherence, prompt fidelity, and completeness checks via Claude, OpenAI, Gemini, or Cursor adapters
JSON Schema for editor support and external tooling
CI-friendly workflow checks

DSL structure

Entities are defined as maps keyed by ID.

version: 1
extends: "./base/"

system:
  id: my-project
  name: My Agent Workflow
  default_workflow_order:
    - analyze
    - specify
    - plan
    - implement
    - audit
    - release
    - reflect
  states: []                       # optional — named workspace states for conditional guardrail action

agents: {}
tasks: {}
artifacts: {}
tools: {}
validations: {}
handoff_types: {}
team_interface:             # optional — multi-team public boundary
  version: 1
  accepts:
    workflows: {}
  exposes:
    artifacts: []
imports: {}                 # optional — consumed team interfaces
workflow: {}
policies: {}
guardrails: {}
guardrail_policies: {}
components:
  schemas: {}

extensions:
  x-flags:
    type: array
    items: string
    description: "CLI flags for tool commands"
  x-path-hint:
    type: string
    description: "Filesystem path hint"
    scope: [artifact]
    schema:
      type: string
      minLength: 1
    required: true
extensions_strict: false

This makes definitions easy to merge, extend, and reference by stable identifiers.

Single-file format

version: 1
system: { ... }
agents: { ... }
tasks: { ... }
artifacts: { ... }

Multi-file format (section-level `$ref`)

version: 1
extends: "./base/"
system:
  id: my-project
  name: My Agent Workflow
  default_workflow_order: [analyze, specify, plan, implement, audit, release, reflect]

agents: { $ref: "./agents.yaml" }
tasks: { $ref: "./tasks.yaml" }
artifacts: { $ref: "./artifacts.yaml" }
tools: { $ref: "./tools.yaml" }
validations: { $ref: "./validations.yaml" }
handoff_types: { $ref: "./handoff-types.yaml" }
workflow: { $ref: "./workflow.yaml" }
policies: { $ref: "./policies.yaml" }

Per-entry `$ref`

$ref can be used at any object position. This allows splitting individual entries into separate files:

agents:
  architect: { $ref: "./agents/architect.yaml" }
  implementer: { $ref: "./agents/implementer.yaml" }
  test-writer: { $ref: "./agents/test-writer.yaml" }

Each referenced file contains the agent definition directly (without the key):

# agents/architect.yaml
role_name: "Architect"
purpose: "Drive phases and delegate work"
can_invoke_agents: [implementer]

Directory `$ref`

When $ref points to a directory, all *.yaml / *.yml files in the directory are loaded and merged:

agents: { $ref: "./agents/" }

Each file in the directory contains one or more keyed entries:

# agents/architect.yaml
architect:
  role_name: "Architect"
  purpose: "Drive phases and delegate work"

Files are loaded in alphabetical order. Conflicting leaf values across files result in an error.

`$refs` (import and merge)

$refs imports multiple files and deep-merges them into the containing map. Unlike $ref (which replaces an object entirely), $refs allows mixing inline definitions with external files.

agents:
  inline-agent:
    role_name: "Inline Agent"
    purpose: "Defined right here"
  $refs:
    - "./agents/architect.yaml"
    - "./agents/implementer.yaml"
    - "./more-agents/"           # directories are also supported

Each referenced file uses the same keyed format:

# agents/architect.yaml
architect:
  role_name: "Architect"
  purpose: "Drive phases and delegate work"

$refs can also be used at the root level to compose a DSL from multiple aspect-oriented files:

version: 1
system:
  id: my-project
  name: My Agent Workflow
  default_workflow_order: [analyze, implement]
$refs:
  - "./agents-core.yaml"        # agents + artifacts definitions
  - "./agents-constraints.yaml"  # constraints for the same agents
  - "./tasks.yaml"

Overlapping map keys are deep-merged recursively. Conflicting leaf values (scalar or array) result in an error.

Directive	Type	Behavior
`$ref`	string	Replace the object at that position with file contents
`$ref` (`#/...`)	string	Replace with the value at the given JSON Pointer path within the document
`$refs`	array	Import files and deep-merge into the containing map
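
The deep-merge rule used by $refs (maps merge recursively, conflicting leaves are an error) can be sketched as follows. The function name `deep_merge` is hypothetical, and the sketch treats any leaf defined twice as a conflict, even when both values are equal; the actual tool may be more lenient.

```python
def deep_merge(base: dict, incoming: dict, path: str = "") -> dict:
    """Merge two maps recursively; raise if the same leaf is defined twice."""
    merged = dict(base)
    for key, value in incoming.items():
        where = f"{path}/{key}"
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            # Overlapping map keys are merged recursively.
            merged[key] = deep_merge(merged[key], value, where)
        else:
            # Scalar or array defined in both inputs: treated as a conflict.
            raise ValueError(f"conflicting value at {where}")
    return merged


core = {"agents": {"architect": {"role_name": "Architect"}}}
constraints = {"agents": {"architect": {"constraints": ["Never write code directly"]}}}
merged = deep_merge(core, constraints)
```

Merging the two aspect files yields a single `architect` entry carrying both `role_name` and `constraints`.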

JSON Pointer `$ref`

$ref also supports in-document references using JSON Pointer syntax (RFC 6901). When the value starts with #/, it resolves against the root document instead of the file system.

components:
  schemas:
    handoff-common:
      type: object
      required: [from_agent, to_agent]
      properties:
        from_agent: { type: string }
        to_agent: { type: string }

handoff_types:
  task-delegation:
    version: 1
    schema:
      allOf:
        - $ref: "#/components/schemas/handoff-common"
        - type: object
          required: [payload]
          properties:
            payload:
              type: object
              required: [objective]
              properties:
                objective: { type: string }

This is particularly useful for sharing common schema fragments across multiple handoff_types entries via components.schemas.

JSON Pointer references are resolved in the same processing phase as file $ref — before Zod validation. They can be used anywhere in the document, not just within handoff_types.
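
A minimal sketch of in-document reference expansion, assuming the document has already been parsed into plain dicts and lists. The helper names `resolve_pointer` and `expand_refs` are made up for illustration, and the sketch does not guard against circular references.

```python
def resolve_pointer(document, pointer: str):
    """Resolve an in-document reference such as '#/components/schemas/x'."""
    if not pointer.startswith("#/"):
        raise ValueError(f"not an in-document pointer: {pointer}")
    node = document
    for raw in pointer[2:].split("/"):
        # RFC 6901 unescaping: '~1' becomes '/', then '~0' becomes '~'.
        token = raw.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def expand_refs(node, document):
    """Replace every {'$ref': '#/...'} object with the value it points to."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/"):
            return expand_refs(resolve_pointer(document, ref), document)
        return {key: expand_refs(value, document) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_refs(item, document) for item in node]
    return node


document = {
    "components": {"schemas": {"handoff-common": {
        "type": "object", "required": ["from_agent", "to_agent"]}}},
    "handoff_types": {"task-delegation": {"schema": {"allOf": [
        {"$ref": "#/components/schemas/handoff-common"},
        {"type": "object", "required": ["payload"]},
    ]}}},
}
expanded = expand_refs(document, document)
```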

Example: Agent definition

agents:
  main-architect:
    role_name: "Architect"
    purpose: "Drive phases, delegate, make gate decisions, integrate audits"
    dispatch_only: true
    mode: read-only
    can_read_artifacts:
      - spec-md
      - codebase
      - test-report
    can_write_artifacts:
      - review-note
    can_execute_tools:
      - spec-impact-check
    can_perform_validations:
      - evidence-gate-review
    can_invoke_agents:
      - implementer
      - test-writer
    can_return_handoffs:
      - evidence-gate-verdict

    responsibilities:
      - "Manage phase progression and gate decisions"
    constraints:
      - "Never write code directly"

    memory:
      resumable: true
      emits_memory_ref: true

    sections:
      - title: "Delegation Protocol"
        content: |
          You act as the Architect. You NEVER implement or test directly.
          Instead you delegate to specialist sub-agents.

Example: Task definition

tasks:
  implement-feature:
    description: "Delegate feature implementation"
    target_agent: implementer
    allowed_from_agents:
      - main-architect
    workflow: implement
    model_class: standard            # optional: fast | standard | thinking
    input_artifacts:
      - spec-md
      - plan-md
    invocation_handoff: task-delegation
    result_handoff: dependency-evidence
    responsibilities:
      - "Implement all requirements from spec-md"
    execution_steps:
      - id: read-specs
        action: "Read spec-md and design-docs"
        reads_artifact: spec-md
      - id: implement
        action: "Implement changes in codebase"
        produces_artifact: codebase
        depends_on: [read-specs]
      - id: run-db-lint
        action: "Run db-lint"
        uses_tool: db-lint
        x-timeout: 120
    completion_criteria:
      - "canonical artifacts updated"

x- prefixed custom properties work at any nesting level — including inside execution_steps, rules, workflow.steps, and other nested objects.

Extension declarations

Projects can declare their custom x-* extension fields in the DSL using extensions. This makes extensions discoverable, self-documenting, and — optionally — machine-validated:

extensions:
  x-flags:
    type: array
    items: string
    description: "CLI flags for tool commands"
  x-path-hint:
    type: string
    description: "Filesystem path hint"
    scope: [artifact]
    schema:
      type: string
      minLength: 1
    required: true

extensions_strict: true  # undeclared x-* properties become errors

Each key must start with x- (validated at schema level). The declaration supports:

Field	Type	Default	Description
`type`	`string`	(required)	Informational type descriptor
`items`	`string`	—	Item type (for array-typed extensions)
`description`	`string`	—	Human-readable description
`scope`	`string[]`	all node types	Restricts which DSL node types this extension may appear on
`schema`	`object`	—	JSON Schema to validate the extension value
`required`	`boolean`	`false`	Whether the extension must be present on every in-scope entity

Scope values: root, system, agent, task, execution_step, artifact, tool, tool_command, validation, handoff_type, workflow, workflow_step, policy, guardrail, guardrail_policy, rule, escalation_criterion, prerequisite

extensions_strict: When true, any x-* property not declared in extensions is an error. When false (default), undeclared extensions produce a warning.

Diagnostics:

Code	Severity	Trigger
`extension-scope-mismatch`	error	Extension used on a node type outside its declared `scope`
`extension-schema-violation`	error	Extension value fails the declared JSON Schema
`extension-required-missing`	error	Required extension missing on an in-scope entity
`undeclared-extension`	warning/error	Extension not declared in `extensions` (error when `extensions_strict: true`)
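
The diagnostics in the table can be approximated with a short checker. This sketch covers scope, required, and undeclared checks only; JSON Schema validation of values (extension-schema-violation) is left out. The function name `check_extensions` and the tuple-shaped diagnostics are assumptions for the example.

```python
def check_extensions(declarations, node_type, node, strict=False):
    """Return (code, severity, key) diagnostics for x-* fields on one DSL node."""
    diagnostics = []
    for key in node:
        if not key.startswith("x-"):
            continue
        declaration = declarations.get(key)
        if declaration is None:
            severity = "error" if strict else "warning"
            diagnostics.append(("undeclared-extension", severity, key))
        elif "scope" in declaration and node_type not in declaration["scope"]:
            diagnostics.append(("extension-scope-mismatch", "error", key))

    for key, declaration in declarations.items():
        # An omitted scope means the extension applies to all node types.
        in_scope = "scope" not in declaration or node_type in declaration["scope"]
        if declaration.get("required", False) and in_scope and key not in node:
            diagnostics.append(("extension-required-missing", "error", key))
    return diagnostics


declarations = {
    "x-flags": {"type": "array", "items": "string"},
    "x-path-hint": {"type": "string", "scope": ["artifact"], "required": True},
}
missing = check_extensions(declarations, "artifact", {"type": "document"})
misplaced = check_extensions(declarations, "tool", {"x-path-hint": "src/"})
```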

Backward compatibility: x-extensions and x-extensions-strict are still accepted as deprecated aliases. They produce a deprecated-property warning and are normalized to extensions / extensions_strict during validation.

Example: Artifact definition

artifacts:
  spec-md:
    type: document
    description: "Specification document"
    owner: main-architect
    producers: [main-architect]
    editors: [main-architect]
    consumers: [implementer, test-writer]
    states: [draft, reviewed, approved]
    required_validations: [spec-semantic-review]
    visibility: internal

artifact-contracts integration

agent-contracts integrates with artifact-contracts and cli-contracts to provide a unified artifact governance model.

Design principle

Relationships flow in one direction: agent → artifact.

Agents declare which artifacts they own (own_artifacts), read (can_read_artifacts), or write (can_write_artifacts)
Tools declare which abstract slots map to project artifacts (artifact_bindings)
Artifact definitions themselves do not reference agents (the legacy owner/producers/editors/consumers fields are deprecated)

How it works

1. Define project artifacts in artifact-contracts.yaml (project-specific):

artifacts:
  api-specs:
    type: source
    authority: canonical
    path_patterns: ["specs/**/*.yaml"]
  api-contracts:
    type: generated-code
    authority: generated
    path_patterns: ["src/generated/**/*.ts"]

2. Import artifacts into your agent-contracts DSL via $ref:

artifacts: { $ref: "./artifact-contracts.yaml#/artifacts" }

3. cli-contracts define tools with domain-agnostic slot names (reusable across projects):

# cli-contract.yaml (tool's interface)
artifactSlots:
  source-specs:
    description: "Source specification files"
    direction: read
  contract-output:
    description: "Generated contract output"
    direction: write

commandSets:
  tool-name:
    commands:
      generate:
        summary: Generate contracts
        effects:
          reads: [source-specs]
          writes: [contract-output]
        exits:
          '0':
            description: Success

4. Map abstract slots to project artifacts using artifact_bindings on tools:

tools:
  micro-contracts:
    kind: cli
    cli_contract: tools/micro-contracts/cli-contract.yaml
    artifact_bindings:
      source-specs: api-specs
      contract-output: api-contracts

5. Agents reference tools and artifacts directly:

agents:
  architect:
    own_artifacts: [api-contracts, api-specs]
    can_read_artifacts: [api-specs, api-contracts]
    can_write_artifacts: [api-contracts]
    can_execute_tools: [micro-contracts]

Validation and linting

own_artifacts entries are validated to exist in the artifacts section
artifact_bindings values are validated to exist in the artifacts section
A lint rule warns if own_artifacts entries are not included in can_read_artifacts
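
The three checks above can be sketched against a parsed DSL document. The function name `check_artifact_references` and the message wording are illustrative assumptions; only the field names come from the examples in this section.

```python
def check_artifact_references(dsl):
    """Return (errors, warnings) for own_artifacts and artifact_bindings."""
    known = set(dsl.get("artifacts", {}))
    errors, warnings = [], []

    for agent_id, agent in dsl.get("agents", {}).items():
        readable = set(agent.get("can_read_artifacts", []))
        for artifact_id in agent.get("own_artifacts", []):
            if artifact_id not in known:
                errors.append(f"agent {agent_id}: unknown artifact {artifact_id}")
            elif artifact_id not in readable:
                # Lint only: ownership without read access is suspicious.
                warnings.append(f"agent {agent_id}: owns {artifact_id} but cannot read it")

    for tool_id, tool in dsl.get("tools", {}).items():
        for slot, artifact_id in tool.get("artifact_bindings", {}).items():
            if artifact_id not in known:
                errors.append(f"tool {tool_id}: slot {slot} binds unknown artifact {artifact_id}")
    return errors, warnings


dsl = {
    "artifacts": {"api-specs": {}, "api-contracts": {}},
    "agents": {"architect": {
        "own_artifacts": ["api-contracts", "api-specs"],
        "can_read_artifacts": ["api-specs"],
    }},
    "tools": {"micro-contracts": {
        "artifact_bindings": {"source-specs": "api-specs", "contract-output": "api-docs"},
    }},
}
errors, warnings = check_artifact_references(dsl)
```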

Example: Workflow definition

workflow:
  specify:
    description: "Externalize requirements — create spec.md from user stories"
    entry_conditions:
      - User story or feature request received
    trigger: "User invokes /speckit.specify or asks to create a feature spec."
    steps:
      - type: delegate
        task: specify-feature
        from_agent: main-architect
      - type: validation
        validation: spec-semantic-review
      - type: decision
        routing_key: evidence-gate-verdict.verdict
        branches:
          PASS: [plan]
          REVISE: [specify-feature]

Decision steps use routing_key to specify the field that determines branching. The legacy `on` field is still accepted but deprecated; see YAML safety below.

Example: Handoff type definition

Handoff types define the schema for inter-agent messages using JSON Schema.

handoff_types:
  task-delegation:
    version: 1
    description: "Delegate a task to a sub-agent"
    schema:
      type: object
      required: [task, objective]
      properties:
        task: { type: string }
        objective: { type: string }
        constraints:
          type: array
          items: { type: string }

Using `components.schemas` with `allOf`

Common fields (e.g., from_agent, to_agent, run_id) can be shared across handoff types by placing them in components.schemas and composing via allOf:

components:
  schemas:
    handoff-common:
      type: object
      required: [from_agent, to_agent]
      properties:
        from_agent: { type: string }
        to_agent: { type: string }
        run_id: { type: string }

handoff_types:
  task-delegation:
    version: 1
    description: "Delegate a task"
    schema:
      allOf:
        - $ref: "#/components/schemas/handoff-common"
        - type: object
          required: [payload]
          properties:
            payload:
              type: object
              required: [objective]
              properties:
                objective: { type: string }

  implementation-result:
    version: 1
    description: "Return implementation results"
    schema:
      allOf:
        - $ref: "#/components/schemas/handoff-common"
        - type: object
          required: [payload]
          properties:
            payload:
              type: object
              required: [result]
              properties:
                result: { type: string }
                evidence:
                  type: array
                  items: { type: string }

The $ref: "#/..." references are resolved during loading, before validation. The resulting merged schema is then meta-validated as valid JSON Schema.

Inheritance and merge operators

agent-contracts supports shared base definitions with project-level overrides through extends.

extends: "./base/"

agents:
  implementer:
    constraints:
      $append:
        - "Use only approved external libraries"

  designer:
    role_name: "Designer"
    purpose: "UI design"

tasks:
  implement-feature:
    execution_steps:
      $insert_after:
        target: run-db-lint
        items:
          - id: run-contract-pipeline
            action: "Run contract pipeline"
            uses_tool: api-pipeline

`$clone` — resolve-time entity duplication

$clone creates a new entity by copying an existing entity within the same section and optionally applying a merge diff:

agents:
  implementer.api:
    $clone:
      from: implementer
      merge:
        purpose: "API-specialized implementer"
        can_write_artifacts:
          $replace: [openapi-spec]
        responsibilities:
          $append: ["Validate schema changes"]

$clone is processed during resolve (after extends, before tool inheritance). All merge operators ($append, $prepend, $replace, $remove, $insert_after) work within merge. The base entity is preserved; chained clones (A→B→C) are resolved via topological sort. Circular clones are rejected.
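
To make the ordering rule concrete, here is a sketch of how chained clones might be ordered and circular ones rejected, using a depth-first topological sort. The function name `clone_order` is hypothetical, and the sketch only computes the order; it does not apply the merge diff.

```python
def clone_order(section: dict) -> list[str]:
    """Order cloned entity IDs so that each clone source is resolved first."""
    sources = {
        entity_id: entity["$clone"]["from"]
        for entity_id, entity in section.items()
        if isinstance(entity, dict) and "$clone" in entity
    }
    order: list[str] = []
    state: dict[str, str] = {}

    def visit(entity_id: str) -> None:
        if state.get(entity_id) == "done":
            return
        if state.get(entity_id) == "visiting":
            raise ValueError(f"circular $clone involving {entity_id}")
        state[entity_id] = "visiting"
        source = sources[entity_id]
        if source in sources:
            visit(source)  # the source is itself a clone: resolve it first
        elif source not in section:
            raise ValueError(f"$clone source {source} not found")
        state[entity_id] = "done"
        order.append(entity_id)

    for entity_id in sources:
        visit(entity_id)
    return order


agents = {
    "implementer.api.v2": {"$clone": {"from": "implementer.api"}},
    "implementer.api": {"$clone": {"from": "implementer"}},
    "implementer": {"role_name": "Implementer"},
}
order = clone_order(agents)
```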

Supported merge operators:

Operator	Behavior
`$append`	Append entries to end of map/array
`$prepend`	Prepend entries to beginning of map/array
`$insert_after`	Insert after element with specified key/id
`$replace`	Replace entire value
`$remove`	Remove entries by key/id
direct value	Override scalar field
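
The operators in the table can be illustrated for array fields with a small sketch. The names `item_key` and `apply_list_operators`, and the order in which combined operators are applied, are assumptions for the example; map fields are not handled here.

```python
def item_key(item):
    """Identify an element by its 'id' when it is a mapping, else by value."""
    return item["id"] if isinstance(item, dict) else item


def apply_list_operators(base: list, override):
    """Apply merge operators from a project override to a base array."""
    if not isinstance(override, dict):
        return override  # direct value: plain override
    if "$replace" in override:
        return list(override["$replace"])
    items = list(base)  # never mutate the base definition
    if "$remove" in override:
        targets = set(override["$remove"])
        items = [item for item in items if item_key(item) not in targets]
    if "$insert_after" in override:
        spec = override["$insert_after"]
        keys = [item_key(item) for item in items]
        if spec["target"] not in keys:
            raise ValueError(f"$insert_after target {spec['target']} not found")
        position = keys.index(spec["target"]) + 1
        items[position:position] = spec["items"]
    return list(override.get("$prepend", [])) + items + list(override.get("$append", []))


base_steps = [{"id": "read-specs"}, {"id": "implement"}, {"id": "run-db-lint"}]
merged_steps = apply_list_operators(base_steps, {
    "$insert_after": {"target": "run-db-lint", "items": [{"id": "run-contract-pipeline"}]},
})
```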

Multi-team collaboration

agent-contracts supports multi-team workflows where teams declare public interfaces and consume each other's capabilities.

Team Interface

A team_interface declares what a team exposes to the outside:

team_interface:
  version: 1
  description: "Backend team public interface"
  accepts:
    workflows:
      implement:
        internal_workflow: feature-implement
        input_handoff: feature-request
        output_handoff: implementation-result
        description: "Request a feature implementation"
  exposes:
    artifacts:
      - api-contract
      - build-report
  constraints:
    - "feature-request must include acceptance_criteria"

Key design decisions:

Workflow-level accepts — external callers invoke a workflow, not individual tasks
Explicit mapping — internal_workflow separates the stable public name from the internal workflow ID
Listing-based exposure — an entity is external only if listed in team_interface

Imports

A team consumes another team's generated interface via imports:

imports:
  backend:
    interface: ./teams/backend/team-interface.yaml
    version: ">=1"

Imported entities are referenced as {team_id}.{public_name} in cross-team workflow steps.

`team_task` workflow step

Cross-team delegation uses the team_task step type:

workflow:
  execute-tests:
    steps:
      - type: team_task
        to_team: backend
        workflow: implement
        handoff: feature-request
        expects: implementation-result
        description: "Delegate implementation to backend team"

Field	Description
`to_team`	Team ID from `imports`
`workflow`	Public workflow name from the imported interface
`handoff`	Handoff type for the request
`expects`	Handoff type for the response

Generating a team interface

The generate interface command produces a self-contained team-interface.yaml:

agent-contracts generate interface -c agent-contracts.config.yaml
agent-contracts generate interface -c agent-contracts.config.yaml --team backend
agent-contracts generate interface -c agent-contracts.config.yaml -o custom-output.yaml
agent-contracts generate interface -c agent-contracts.config.yaml --dry-run

The output includes:

Workflow entries with handoff key references
A handoff_types section containing only schemas referenced by external workflows
An exposes.artifacts section with type, description, and states
Metadata (team_id, team_name, version, generated_at)

Interface drift detection

The check command detects drift between the declared team_interface and the generated team-interface.yaml:

agent-contracts check -c agent-contracts.config.yaml

If a team-interface.yaml exists and differs from what would be regenerated, the check reports drift.

For managing multiple teams from a single configuration file (shared bindings, vars, and --team filtering), see Multi-team configuration.

Variable substitution

When using extends to share a base DSL across projects, base definitions often contain values that differ per project (project name, language, repository URL, etc.).

vars in agent-contracts.config.yaml lets you define project-specific values that are substituted into DSL string values using ${vars.xxx} syntax.

Defining vars

Add a vars section to your config file. Values must be flat string key-value pairs.

# agent-contracts.config.yaml
vars:
  project_name: "my-service"
  language: "TypeScript"
  repo_url: "https://github.com/org/my-service"

Using placeholders in DSL

Use ${vars.<key>} in any string value within the DSL YAML (base or project).

# base/agent-contracts.yaml
agents:
  implementer:
    purpose: "Implements features for ${vars.project_name}"
    constraints:
      - "Use ${vars.language} for all implementations"
      - "Repository: ${vars.repo_url}"

Processing order

Variable substitution happens after DSL resolution (extends merge) and before schema validation:

Load config (including vars)
Resolve DSL (load + merge extends)
Substitute ${vars.xxx} in all string values
Validate schema
Render / lint / check

This ensures that merged strings from both base and project are substituted, and the resulting values pass schema validation.

Error handling

If a placeholder references an undefined variable, the command exits with an error:

VarsSubstitutionError: Undefined variable "repo_url" in value "Repository: ${vars.repo_url}"
  Defined vars: project_name, language

Notes

Only string values are substituted; object keys are not affected.
vars is optional. If omitted, no substitution occurs.
Patterns that do not match ${vars.<key>} (e.g. ${env.HOME}, $vars.xxx, {{vars.xxx}}) are left unchanged.
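
The substitution rules above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the toolkit's implementation: the function names are hypothetical, the allowed key characters are an assumption, and a plain `Error` stands in for `VarsSubstitutionError`.

```typescript
// Illustrative sketch only; names are hypothetical, not the toolkit's API.
type Vars = Record<string, string>;

// Assumption: keys consist of letters, digits, and underscores.
const VARS_PLACEHOLDER = /\$\{vars\.([A-Za-z0-9_]+)\}/g;

function substituteString(value: string, vars: Vars): string {
  return value.replace(VARS_PLACEHOLDER, (_match: string, key: string): string => {
    const replacement = Object.prototype.hasOwnProperty.call(vars, key)
      ? vars[key]
      : undefined;
    if (replacement === undefined) {
      const defined = Object.keys(vars).join(", ");
      throw new Error(
        `Undefined variable "${key}" in value "${value}" (defined vars: ${defined})`
      );
    }
    return replacement;
  });
}

// Walks the resolved DSL: string values are substituted, object keys are not.
function substituteVars(node: unknown, vars: Vars): unknown {
  if (typeof node === "string") return substituteString(node, vars);
  if (Array.isArray(node)) return node.map((item) => substituteVars(item, vars));
  if (node !== null && typeof node === "object") {
    const source = node as Record<string, unknown>;
    const copy: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      copy[key] = substituteVars(source[key], vars);
    }
    return copy;
  }
  return node; // numbers, booleans, and null pass through unchanged
}

const resolvedDsl = {
  agents: {
    implementer: {
      purpose: "Implements features for ${vars.project_name}",
      constraints: ["Use ${vars.language} for all implementations", "Home is ${env.HOME}"],
    },
  },
};
const substitutedDsl = substituteVars(resolvedDsl, {
  project_name: "my-service",
  language: "TypeScript",
});
```

In this sketch `${env.HOME}` survives untouched because it does not match the `${vars.<key>}` pattern, while a reference to an undefined key throws before any schema validation would run.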

CLI

For the full CLI reference with all commands, options, arguments, exit codes, and AI agent policies, see the CLI Reference.

The CLI contract specification is defined in cli-contract.yaml using CLI Contracts. Commands that have side effects declare structured effects metadata, and the --introspect global option outputs the derived policy as JSON without executing the command.

Installation

npm install -g agent-contracts   # global install
npm install -D agent-contracts   # per-project dev dependency
npx agent-contracts              # run without installing

Main commands

Command	Description
`agent-contracts resolve [path]`	Resolve `extends` inheritance and output resolved YAML
`agent-contracts validate [path]`	Validate schema and references
`agent-contracts lint [path]`	Run semantic lint
`agent-contracts generate`	Generate all artifacts (templates + guardrails + interface)
`agent-contracts generate templates`	Render template outputs from config
`agent-contracts generate guardrails`	Generate guardrail artifacts from bindings
`agent-contracts generate interface`	Generate team interface YAML from DSL
`agent-contracts score [path]`	Calculate DSL completeness score
`agent-contracts audit <type>`	Run LLM-based semantic audit (render/dsl/prompt/all)
`agent-contracts check`	Run resolve → validate → lint → generate templates --check
`agent-contracts navigation-index`	Build artifact-centric navigation index
`agent-contracts artifact-coverage`	Measure file coverage by artifact definitions
`agent-contracts extract`	Extract embedded CLI contract specification
`agent-contracts render`	(deprecated) Alias for `generate templates`

The [path] argument defaults to agent-contracts.yaml in the current directory. If -c / --config is specified, the DSL path from the config file is used.

All commands also accept --team <id> to limit execution to a single team when using a multi-team configuration.

`--introspect` (global)

Any command can be invoked with --introspect to output the derived policy as JSON without executing the command. This lets AI agents inspect a command's side effects before deciding whether to run it.

agent-contracts generate --introspect
agent-contracts audit --introspect
agent-contracts validate --introspect

The output follows the CLI Contracts IntrospectionResult shape:

{
  "command": "generate",
  "activeOptions": ["format"],
  "policy": {
    "riskLevel": "low",
    "requiresConfirmation": false,
    "idempotent": true,
    "sideEffects": ["file_write"],
    "reads": [],
    "writes": [
      {
        "kind": "semantic",
        "target": "configured render, guardrail, and interface output paths",
        "idempotent": true,
        "source": "command:generate"
      }
    ]
  }
}
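
A caller could consume this output along the following lines. The interfaces simply mirror the fields in the JSON above, and the `canRunUnattended` rule is a hypothetical policy chosen for illustration; it is not something the CLI or CLI Contracts prescribes.

```typescript
// Field names mirror the JSON shown above; the decision rule is an assumption.
interface WriteEffect {
  kind: string;
  target: string;
  idempotent: boolean;
  source: string;
}

interface IntrospectionPolicy {
  riskLevel: string;
  requiresConfirmation: boolean;
  idempotent: boolean;
  sideEffects: string[];
  reads: unknown[];
  writes: WriteEffect[];
}

interface IntrospectionResult {
  command: string;
  activeOptions: string[];
  policy: IntrospectionPolicy;
}

// Hypothetical rule: run without asking only for low-risk, idempotent commands.
function canRunUnattended(introspection: IntrospectionResult): boolean {
  const policy = introspection.policy;
  return policy.riskLevel === "low" && !policy.requiresConfirmation && policy.idempotent;
}

// Stand-in for the stdout of `agent-contracts generate --introspect`.
const introspectStdout = JSON.stringify({
  command: "generate",
  activeOptions: ["format"],
  policy: {
    riskLevel: "low",
    requiresConfirmation: false,
    idempotent: true,
    sideEffects: ["file_write"],
    reads: [],
    writes: [],
  },
});
const introspection = JSON.parse(introspectStdout) as IntrospectionResult;
const safeToRun = canRunUnattended(introspection);
```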

`resolve` options

Option	Description
`--format <text|json>`	Output format (default: `text`)
`--expand-defaults`	Expand all Zod default values in output. Fields like `required_validations: []`, `tags: []`, and `can_read_artifacts: []` are written explicitly instead of being silently applied by schema defaults.
`-c, --config <path>`	Path to `agent-contracts.config.yaml`
`--team <id>`	Limit to one team (multi-team config only)

`score` options

Option	Description
`--format <text|json>`	Output format (default: `text`)
`--threshold <number>`	Minimum score; exit 1 if below (for CI gates)
`-c, --config <path>`	Path to `agent-contracts.config.yaml`
`--team <id>`	Limit to one team (multi-team config only)

`audit` options

Option	Description
`--format <text|json|markdown>`	Output format (default: `text`)
`--scope <filter>`	Limit audit scope (e.g. `agents:architect,implementer`)
`--dry-run`	Output the audit prompt without calling the LLM
`--adapter <name>`	SDK adapter: `claude`, `openai`, `gemini`, `cursor` (overrides config)
`--model <name>`	LLM model override (overrides config)
`-l, --log-file <path>`	Write structured agent progress log to a file
`-c, --config <path>`	Path to `agent-contracts.config.yaml`
`--team <id>`	Limit to one team (multi-team config only)

The audit command requires agent-contracts-runtime (optional peer dependency) to be installed. Configure the default adapter and model in agent-contracts.config.yaml:

audit:
  adapter: openai
  model: gpt-4.1

`artifact-coverage` options

Option	Description
`--format <text|json>`	Output format (default: `text`)
`--min-coverage <number>`	Minimum coverage %; exit 1 if below (for CI gates)
`-c, --config <path>`	Path to `agent-contracts.config.yaml`
`--team <id>`	Limit to one team (multi-team config only)

Configure additional exclude patterns in agent-contracts.config.yaml:

artifact_coverage:
  exclude_patterns:
    - "*.lock"
    - "**/*.snap"
    - ".cursor/**"

The score command evaluates 7 dimensions:

Dimension	What it measures	Weight
Artifact validation coverage	% of artifacts with non-empty `required_validations`	High
Task validation coverage	% of tasks with at least one entry in `validations`	High
Guardrail policy coverage	% of guardrails referenced by at least one policy rule	Medium
Workflow validation integration	% of blocking validations referenced in workflow steps or tasks	High
Schema completeness	% of optional fields filled (description, rationale, trigger, etc.)	Low
Cross-reference bidirectionality	% of agent↔artifact, agent↔tool refs that are reciprocated	Medium
Guardrail scope resolution	% of guardrail scope entries that resolve to existing entities	Medium

Common usage

agent-contracts resolve
agent-contracts resolve --expand-defaults --format json
agent-contracts validate
agent-contracts lint --strict
agent-contracts score
agent-contracts score -c agent-contracts.config.yaml --threshold 70
agent-contracts score --format json
agent-contracts generate -c agent-contracts.config.yaml
agent-contracts generate templates -c agent-contracts.config.yaml
agent-contracts generate templates -c agent-contracts.config.yaml --check
agent-contracts check -c agent-contracts.config.yaml --strict
agent-contracts generate interface -c agent-contracts.config.yaml
agent-contracts generate interface -c agent-contracts.config.yaml --dry-run
agent-contracts generate interface -c agent-contracts.config.yaml --format json
agent-contracts audit dsl -c agent-contracts.config.yaml
agent-contracts audit render -c agent-contracts.config.yaml --format json
agent-contracts audit all -c agent-contracts.config.yaml --adapter claude
agent-contracts audit dsl -c agent-contracts.config.yaml --dry-run
agent-contracts navigation-index
agent-contracts navigation-index --format yaml
agent-contracts navigation-index --artifact api-contracts
agent-contracts artifact-coverage
agent-contracts artifact-coverage --format json
agent-contracts artifact-coverage --min-coverage 80
agent-contracts artifact-coverage -c agent-contracts.config.yaml

Config-driven rendering

Rendering is configured via agent-contracts.config.yaml.

dsl: ./agent-contracts.yaml

vars:
  project_name: "my-service"
  language: "TypeScript"
  repo_url: "https://github.com/org/my-service"

renders:
  - template: ./templates/agent-prompt.md.hbs
    context: agent
    output: ./output/{agent.id}.md

  - template: ./templates/overview.md.hbs
    context: system
    output: ./output/overview.md

This lets you generate static outputs for:

agent prompts
task specs
overviews
artifact docs
validation docs
workflow docs

all from the same resolved DSL.

Multi-team configuration

When several teams (for example backend, QA, infra) are managed from one workspace, you can list every team in a single config file instead of maintaining separate configs.

This complements the DSL-level multi-team collaboration features (team_interface, imports, team_task).

teams:
  _defaults:
    bindings:
      - ./bindings/cursor.yaml
    vars:
      language: TypeScript
    paths:
      cursor_root: .cursor
    active_guardrail_policy: default-enforcement

  backend:
    dsl: ./teams/backend/agent-contracts.yaml
    interface_output: ./teams/backend/team-interface.yaml
    bindings:
      - ./teams/backend/bindings/observability.yaml
    vars:
      team_name: backend

  qa:
    dsl: ./teams/qa/agent-contracts.yaml
    vars:
      team_name: qa

_defaults: Reserved meta-entry in the teams map. It uses the same schema as team entries except dsl is not required. Values are inherited by all teams. The underscore prefix avoids colliding with real team IDs.

Merge with _defaults:

bindings — _defaults bindings are prepended before team-specific bindings
vars — shallow merge; team values win
paths — shallow merge; team values win
active_guardrail_policy — team wins when present
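
The merge rules above can be sketched as a small function. This is an illustration under stated assumptions: `TeamConfig`, `ResolvedTeamConfig`, and `applyDefaults` are hypothetical names, and only the four documented fields plus dsl are modeled.

```typescript
// Hypothetical types; only the fields discussed above are modeled.
interface TeamConfig {
  dsl?: string;
  bindings?: string[];
  vars?: Record<string, string>;
  paths?: Record<string, string>;
  active_guardrail_policy?: string;
}

interface ResolvedTeamConfig {
  dsl: string | undefined;
  bindings: string[];
  vars: Record<string, string>;
  paths: Record<string, string>;
  active_guardrail_policy: string | undefined;
}

function applyDefaults(defaults: TeamConfig, team: TeamConfig): ResolvedTeamConfig {
  return {
    dsl: team.dsl,
    // _defaults bindings come first, team-specific bindings after them.
    bindings: (defaults.bindings ?? []).concat(team.bindings ?? []),
    // Shallow merges: a team value replaces the default with the same key.
    vars: { ...(defaults.vars ?? {}), ...(team.vars ?? {}) },
    paths: { ...(defaults.paths ?? {}), ...(team.paths ?? {}) },
    // The team's policy wins when present.
    active_guardrail_policy: team.active_guardrail_policy ?? defaults.active_guardrail_policy,
  };
}

// Values taken from the example configuration above.
const teamDefaults: TeamConfig = {
  bindings: ["./bindings/cursor.yaml"],
  vars: { language: "TypeScript" },
  paths: { cursor_root: ".cursor" },
  active_guardrail_policy: "default-enforcement",
};
const backendTeam = applyDefaults(teamDefaults, {
  dsl: "./teams/backend/agent-contracts.yaml",
  bindings: ["./teams/backend/bindings/observability.yaml"],
  vars: { team_name: "backend" },
});
```

Here `backendTeam` ends up with both binding files (the shared one first), both vars, the inherited paths, and the inherited policy.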

All commands accept --team <id> to run against a single team:

agent-contracts validate -c config.yaml              # all teams
agent-contracts validate -c config.yaml --team backend  # one team
agent-contracts check -c config.yaml --team qa          # one team

The check command also validates that imported interface files exist on disk (cross-team references).

Design constraints:

dsl and teams are mutually exclusive at the config root
Every team except _defaults must specify dsl
Existing single-team configs (top-level dsl only) remain valid unchanged

Artifact binding (config-level)

The artifact_binding config field connects DSL artifact definitions to an external artifact registry (e.g., artifact-contracts.yaml). Registry values override DSL defaults using deep-merge semantics.

Two forms are supported:

# Simple form (IDs match between DSL and registry)
artifact_binding: ./artifact-contracts.yaml

# Explicit mapping form (IDs differ)
artifact_binding:
  source: ./artifact-contracts.yaml
  mappings:
    openapi-spec: billing_api_contract

Merge semantics:

Registry fields override DSL fields (deep-merge at field level)
DSL-only fields are preserved
{var} templates in path_patterns are substituted using config.paths
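
One way to picture the field-level deep merge is the sketch below. It is not the toolkit's code: `mergeArtifact` is a hypothetical name, the sample fields other than `type` and `path_patterns` are invented for the example, and treating arrays as replaced rather than concatenated is an assumption.

```typescript
// Illustrative deep merge: registry values override, DSL-only fields survive.
type PlainObject = { [key: string]: unknown };

function isPlainObject(value: unknown): value is PlainObject {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function mergeArtifact(dsl: PlainObject, registry: PlainObject): PlainObject {
  const merged: PlainObject = { ...dsl };
  for (const key of Object.keys(registry)) {
    const base = merged[key];
    const override = registry[key];
    // Recurse into nested objects; scalars and arrays are replaced (assumption).
    merged[key] =
      isPlainObject(base) && isPlainObject(override)
        ? mergeArtifact(base, override)
        : override;
  }
  return merged;
}

// Hypothetical artifact records for illustration.
const dslArtifact: PlainObject = {
  type: "api-contract",
  description: "Billing API contract",
  metadata: { owner: "architect", format: "yaml" },
};
const registryArtifact: PlainObject = {
  metadata: { format: "json" },
  path_patterns: ["specs/billing/**"],
};
const boundArtifact = mergeArtifact(dslArtifact, registryArtifact);
```

In `boundArtifact`, `description` and `metadata.owner` come from the DSL, while `metadata.format` and `path_patterns` come from the registry.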

Diagnostics:

Rule	Severity	Description
`unbound-artifact`	warning	DSL artifact has no registry counterpart
`orphan-binding`	warning	Registry artifact has no DSL counterpart
`type-mismatch`	warning	DSL and registry disagree on `type`/`authority`

Placement: Top-level for single-team configs, or per-team in teams (inheritable from _defaults).

artifact_binding is currently consumed only by the navigation-index and artifact-coverage commands. When it is not configured, behavior is unchanged (full backward compatibility).

Render target options

Each entry in renders supports these fields:

Field	Type	Required	Description
`template`	string	yes	Path to Handlebars template
`context`	string	yes	Context type (see below)
`output`	string	yes	Output file path (supports `{<context>.id}` placeholder)
`include`	string[]	no	Only render these entity IDs (not with `system`)
`exclude`	string[]	no	Skip these entity IDs (not with `system`)
`skip_empty`	boolean	no	When `true`, if the rendered output is empty or whitespace-only, the file is not written. If the file already exists, it is deleted.

`skip_empty` usage

skip_empty is useful when a single template applies to all entities of a context type, but only some entities produce meaningful output.

For example, when using context: tool to generate per-tool scripts, tools without an x-script property would produce empty files. With skip_empty: true, those files are simply not created:

renders:
  - template: ./templates/tool-script.sh.hbs
    context: tool
    output: ./output/scripts/{tool.id}.sh
    skip_empty: true

{{!-- tool-script.sh.hbs --}}
{{#if tool.x-script}}
{{{tool.x-script}}}
{{/if}}

Tools with x-script get a generated script file; tools without it produce no file at all.

skip_empty also works with generate templates --check (drift detection): when the expected output is empty, the check expects the file to not exist and reports drift if it does.
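
The write and drift rules for skip_empty can be summarized as two small decision functions. This is a sketch of the behavior described above, with hypothetical function names; it assumes that drift for non-empty output means the file content differs from the rendered text.

```typescript
// Hypothetical helpers modeling the documented skip_empty behavior.
type FileAction = "write" | "delete" | "none";

function isBlank(rendered: string): boolean {
  return rendered.trim().length === 0;
}

// What a generate run would do with one render target.
function planWrite(rendered: string, skipEmpty: boolean, fileExists: boolean): FileAction {
  if (!skipEmpty || !isBlank(rendered)) return "write";
  return fileExists ? "delete" : "none";
}

// What a --check run would report; `existing` is null when the file is absent.
function hasDrift(rendered: string, skipEmpty: boolean, existing: string | null): boolean {
  if (skipEmpty && isBlank(rendered)) return existing !== null;
  return existing !== rendered;
}

const staleFileAction = planWrite("  \n", true, true); // "delete"
const staleFileDrift = hasDrift("", true, "#!/bin/sh\n"); // true
```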

Available context types

Each context type provides a different rendering scope:

Context	Scope	Output	Key variables
`system`	Single file	`output` as-is	`system`, `dsl`, `guardrailEnforcement`, `bindings`
`navigation_index`	Single file	`output` as-is	`version`, `generated_at`, `system`, `artifacts` (full `ProjectNavigationIndex`)
`agent`	Per agent	`{agent.id}` in output path	`agent`, `receivableTasks`, `delegatableTasks`, `relatedArtifacts`, `relatedTools`, `relatedHandoffTypes`, `mergedBehavior`, `relatedGuardrails`, `relatedValidations`, `dsl`
`task`	Per task	`{task.id}` in output path	`task`, `targetAgent`, `relatedGuardrails`, `relatedValidations`, `dsl`
`artifact`	Per artifact	`{artifact.id}` in output path	`artifact`, `relatedTools`, `relatedValidations`, `relatedGuardrails`, `producerAgents`, `consumerAgents`, `editorAgents`, `createdInWorkflows`, `dsl`
`tool`	Per tool	`{tool.id}` in output path	`tool`, `invokableAgents`, `inputArtifactDetails`, `outputArtifactDetails`, `relatedGuardrails`, `relatedValidations`, `dsl`
`validation`	Per validation	`{validation.id}` in output path	`validation`, `dsl`
`handoff_type`	Per handoff type	`{handoff_type.id}` in output path	`handoff_type`, `relatedTasks`, `dsl`
`workflow`	Per workflow phase	`{workflow.id}` in output path	`workflow`, `relatedAgents`, `relatedTasks`, `relatedTools`, `relatedArtifacts`, `relatedValidations`, `dsl`
`policy`	Per policy	`{policy.id}` in output path	`policy`, `dsl`
`guardrail`	Per guardrail	`{guardrail.id}` in output path	`guardrail`, `dsl`
`guardrail_policy`	Per guardrail policy	`{guardrail_policy.id}` in output path	`guardrail_policy`, `dsl`

Enriched context details

workflow context collects all entities involved in a phase:

relatedTasks — tasks where task.workflow matches this phase
relatedAgents — agents from task target_agent, allowed_from_agents, step from_agent, and validation executors
relatedTools — tools from can_execute_tools of all related agents, plus uses_tool in execution steps
relatedArtifacts — artifacts from can_read_artifacts, can_write_artifacts, input_artifacts, plus produces_artifact and reads_artifact in execution steps
relatedValidations — validations referenced in workflow steps

artifact context provides ownership and cross-reference data:

relatedTools — tools with this artifact in input_artifacts or output_artifacts
relatedValidations — validations targeting this artifact
producerAgents / consumerAgents / editorAgents — resolved agent records
createdInWorkflows — workflow phases where this artifact is written

agent context provides merged behavioral specs and cross-references:

relatedGuardrails — guardrails bound via agent.guardrails[] or guardrail scope.agents[], merged and deduplicated
relatedValidations — validations from agent.can_perform_validations, resolved into full entries (kind, target_artifact, executor_type, blocking)

task context provides execution details:

relatedGuardrails — guardrails bound via task.guardrails[] or guardrail scope.tasks[]
relatedValidations — validations from task.validations[], resolved into full entries

tool context provides invocation and artifact details:

relatedGuardrails — guardrails bound via tool.guardrails[] or guardrail scope.tools[]
relatedValidations — validations where executor_type is "tool" and executor matches this tool ID
invokableAgents — agents listed in invokable_by
inputArtifactDetails / outputArtifactDetails — resolved artifact records

system context includes binding-aware guardrail enforcement data when bindings and active_guardrail_policy are configured:

guardrailEnforcement — array of enforcement entries, each with guardrail_id, description, severity, action, scoped entities (scoped_agents, scoped_tasks, scoped_workflows, scoped_tools, scoped_artifacts), allow_override, override_requires, trigger (from binding matcher type), and escalation
bindings — array of loaded SoftwareBinding objects

These fields are only populated when the config specifies bindings and active_guardrail_policy. Existing templates that do not reference these fields are unaffected.

Matrix helpers are available in context: system templates:

guardrailCoverageMatrix — generates a Guardrail Coverage Matrix table (guardrail × severity × action × scoped entities × trigger × override × escalation)
taskGuardrailMatrix — generates a Task × Guardrail cross-reference table showing which action applies to each task

Handlebars helpers

Templates can use these built-in helpers:

Helper	Usage	Description
`eq`	`{{#if (eq a b)}}`	Strict equality
`notEmpty`	`{{#if (notEmpty obj)}}`	True when object has at least one key
`inc`	`{{inc @index}}`	Increment number by 1 (for 1-based indexing)
`yamlBlock`	`{{{yamlBlock obj}}}`	Render value as YAML-formatted text
`jsonBlock`	`{{{jsonBlock obj}}}`	Render value as pretty-printed JSON
`yamlFrontmatter`	`{{{yamlFrontmatter obj}}}`	Render value as YAML frontmatter (`---` delimiters)
`handoffPayload`	`(handoffPayload handoffType)`	Resolve handoff payload (`example` or schema skeleton)
`handoffEnvelope`	`(handoffEnvelope handoffType id=@key)`	Build `{ type, version, payload }` envelope object
`lookupPayloadFields`	`{{#each (lookupPayloadFields schema)}}`	Extract schema field info (name, type, required, enum); resolves `allOf` internally
`join`	`{{join arr ", "}}`	Join array elements with separator
`contains`	`{{#if (contains arr "x")}}`	True when array includes value
`groupBy`	`{{#with (groupBy arr "key")}}`	Group array elements by field value
`filterByField`	`{{#each (filterByField arr "field" "val")}}`	Filter array by field match
`keys`	`{{#each (keys obj)}}`	Object keys as array
`values`	`{{#each (values obj)}}`	Object values as array
`size`	`{{size obj}}`	Array length or object key count
`not`	`{{#if (not x)}}`	Boolean negation
`or`	`{{#if (or a b)}}`	Boolean OR (variadic)
`and`	`{{#if (and a b)}}`	Boolean AND (variadic)
`gt` / `gte` / `lt`	`{{#if (gt a b)}}`	Numeric comparisons
`sequenceDiagram`	`{{{sequenceDiagram}}}` or `{{{sequenceDiagram @key ../dsl}}}`	Generate Mermaid sequence diagram. Supports `external_participants`, `group` (par blocks), `retry` (opt blocks), and read-only agent separation into Audit box
`overviewFlowchart`	`{{{overviewFlowchart dsl}}}`	Generate Mermaid graph showing phases → agents/tools/artifacts relationships (system context)

Guardrail DI system

agent-contracts includes a dependency injection system for guardrails that separates what to protect from how to enforce and where to output.

Architecture

agent-contracts.yaml (DSL)        agent-contracts.config.yaml
├─ guardrails:   (what + why)     ├─ bindings: [cursor.yaml, git.yaml, ...]
├─ guardrail_policies: (how)      ├─ active_guardrail_policy: default
└─ agents, tasks, ...             ├─ paths: {cursor_root: .cursor, ...}
                                  └─ vars, renders (existing)

Guardrail definition

Guardrails declare constraints in the DSL without any implementation details:

guardrails:
  no-force-push:
    description: "Force push to protected branches is forbidden"
    scope:
      tools: [git]
    rationale: "Force push destroys commit history"
    tags: [branch-protection, safety]

Guardrail policy

Policies define enforcement strategies:

guardrail_policies:
  default-enforcement:
    rules:
      - guardrail: no-force-push
        severity: critical
        action: block
      - guardrail: branch-lock
        severity: critical
        action:
          default: block
          when:
            maintenance: shadow
      - guardrail: english-only-code
        severity: warning
        action: warn
        allow_override: true

Software bindings

Bindings define software-specific check implementations, output generation, and rendering:

# bindings/cursor.yaml
software: cursor
version: 1

guardrail_impl:
  no-force-push:
    checks:
      - hook_event: beforeShellExecution
        matcher:
          type: command_regex
          pattern: "git\\s+push\\s+.*--force"
        message: "Force push is forbidden"

outputs:
  hook-script:
    target: "{cursor_root}/hooks/evaluate-hook.sh"
    mode: write
    executable: true
    template: ./templates/cursor-hook-wrapper.sh.hbs

renders:
  - context: agent
    output: "{cursor_root}/agent-team/{agent.id}.md"
    template: ./templates/agent-prompt.md.hbs
    exclude:
      - architect
  - context: system
    output: "{cursor_root}/rules/agent-team.mdc"
    inline_template: |
      {{#each agents}}
      - {{@key}}: {{this.role_name}}
      {{/each}}
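
To see what the command_regex pattern in this binding does and does not catch, it can be tried against a few sample commands. The sketch assumes the matcher behaves like an unanchored JavaScript `RegExp` search, which the source does not state; the sample commands are made up for illustration.

```typescript
// The pattern from the binding above; YAML "\\s" unescapes to the regex token \s.
// Assumption: the matcher is applied as an unanchored JavaScript RegExp search.
const forcePushPattern = new RegExp("git\\s+push\\s+.*--force");

const sampleCommands = [
  "git push --force",
  "git push origin main --force-with-lease",
  "git push -f origin main",
  "git push origin main",
];

// Commands the check would flag under that assumption.
const flaggedCommands = sampleCommands.filter((cmd) => forcePushPattern.test(cmd));
```

Under that assumption the first two commands are flagged. Note that `--force-with-lease` also matches, while the short form `-f` does not, so a stricter policy would need an additional pattern.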

Binding inheritance

Binding files support extends for inheriting and extending a base binding, using the same mechanism as DSL-level extends.

A base binding defines shared guardrail implementations and outputs:

# skeleton/bindings/cursor.yaml (base)
software: cursor
version: 1

guardrail_impl:
  no-force-push:
    checks:
      - hook_event: beforeShellExecution
        matcher:
          type: command_regex
          pattern: "git\\s+push\\s+.*--force"
        message: "Force push is forbidden"

outputs:
  policy-bundle:
    target: "{cursor_root}/guardrails/policy.json"
    mode: write
    inline_template: "{{json resolved_checks}}"

A project binding extends the base and adds project-specific guardrail implementations:

# project/bindings/cursor.yaml
extends: ../../skeleton/bindings/cursor.yaml
software: cursor
version: 1

guardrail_impl:
  lint-on-save:
    checks:
      - hook_event: afterFileEdit
        matcher:
          type: file_glob
          pattern: "**/*.{ts,tsx}"
        message: "TS file edited — lint results attached."

The result is a single merged binding with all guardrail implementations from both base and project.

Merge behavior:

Field	Behavior
`software`	Project wins
`guardrail_impl`	Map merge by guardrail ID (new IDs added; same ID deep-merged)
`outputs`	Map merge by output ID (project overrides base)
`renders`	Array concatenation (base renders + project renders)
`reporting`	Deep merge (project fields override base)
passthrough fields	Project wins

All merge operators ($append, $prepend, $insert_after, $replace, $remove) work within binding extends, the same as DSL extends.

Chained inheritance (grandparent → parent → child) is supported, as are both local path (./, ../) and npm package references. Circular extends are detected and rejected.

When using binding extends, the config only needs to list the child binding:

# agent-contracts.config.yaml
bindings:
  - ./bindings/cursor.yaml    # extends base internally
  - ./bindings/git.yaml

Config

# agent-contracts.config.yaml
bindings:
  - ./bindings/cursor.yaml
  - ./bindings/git.yaml

active_guardrail_policy: default-enforcement

paths:
  cursor_root: .cursor
  git_hooks_root: scripts/git-hooks

Binding template context

Both outputs and renders templates have access to the full binding generation context:

Variable	Type	Description
`system`	`{ id, name }`	System metadata
`guardrails`	`Record<string, Guardrail>`	All guardrail definitions
`policy`	`GuardrailPolicy`	Active guardrail policy
`binding`	`SoftwareBinding`	Current binding
`all_bindings`	`Record<string, SoftwareBinding>`	All loaded bindings
`vars`	`Record<string, string>`	Variables from `config.vars`
`paths`	`Record<string, string>`	Path variables from `config.paths`
`reporting`	`{ commands, fail_open, timeout_ms } | null`	Reporting config
`resolved_checks`	`ResolvedCheck[]`	Resolved guardrail checks
`tasks`	`Record<string, Task>`	All DSL tasks
`artifacts`	`Record<string, Artifact>`	All DSL artifacts
`agents`	`Record<string, Agent>`	All DSL agents
`handoff_types`	`Record<string, HandoffType>`	All DSL handoff types
`workflow`	`Record<string, Workflow>`	All DSL workflows

DSL entities include passthrough fields (x-* extensions), so custom metadata defined in the DSL is accessible in templates (e.g., {{agents.implementer.x-team}}).

Binding renders

Binding renders provide entity-iteration rendering with full DSL context — the same capability as config-level renders, but defined within binding YAML files.

Each render target specifies a context type and an output path pattern:

Field	Required	Description
`context`	yes	Entity type: `agent`, `task`, `artifact`, `tool`, `workflow`, `system`, etc.
`output`	yes	Output path with `{entity.id}` and `{paths_var}` expansion
`template`	one of	Path to external `.hbs` template file
`inline_template`	one of	Inline Handlebars template string
`include`	no	Only render these entity IDs
`exclude`	no	Skip these entity IDs
`skip_empty`	no	Do not write the file when the rendered output is empty; delete the target if it already exists
`executable`	no	Set file permissions to 0755

For non-system contexts, one file is generated per entity (filtered by include/exclude). The output path supports two types of variable expansion:

{agent.id}, {task.id}, etc. — replaced with the current entity ID
{cursor_root}, {observability_root}, etc. — replaced from config.paths
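
The two kinds of expansion can be sketched as a single replace pass. `expandOutputPath` is a hypothetical helper, and leaving unknown placeholders untouched is an assumption; the real command may report an error instead.

```typescript
// Hypothetical helper illustrating the two kinds of placeholder expansion.
function expandOutputPath(
  pattern: string,
  context: string,
  entityId: string,
  paths: Record<string, string>
): string {
  return pattern.replace(
    /\{([A-Za-z0-9_.]+)\}/g,
    (placeholder: string, token: string): string => {
      // {agent.id}, {task.id}, ...: the current entity's ID.
      if (token === context + ".id") return entityId;
      // {cursor_root}, ...: looked up in config.paths.
      if (Object.prototype.hasOwnProperty.call(paths, token)) {
        const value = paths[token];
        if (value !== undefined) return value;
      }
      return placeholder; // unknown placeholders are left untouched (assumption)
    }
  );
}

const agentPromptPath = expandOutputPath(
  "{cursor_root}/agent-team/{agent.id}.md",
  "agent",
  "implementer",
  { cursor_root: ".cursor" }
);
// agentPromptPath === ".cursor/agent-team/implementer.md"
```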

When to use binding renders vs config renders vs binding outputs:

Use case	Recommended
Generate per-entity files (agent prompts, workflow docs)	Binding `renders` or config `renders`
Generate guardrail/policy runtime artifacts	Binding `outputs`
Generate files using DSL data + guardrail data	Binding `renders` (has both)
Simple config without bindings	Config `renders`

Config-level renders remain supported and are not deprecated. Binding renders have the advantage of co-locating templates with their binding definition and of having access to the full binding context (vars, paths, resolved_checks, etc.) in addition to DSL entities.

Generate command

agent-contracts generate guardrails -c agent-contracts.config.yaml
agent-contracts generate guardrails -c agent-contracts.config.yaml --binding cursor
agent-contracts generate guardrails -c agent-contracts.config.yaml --dry-run

Validation model

agent-contracts validates your system in multiple layers.

Schema validation

Checks:

required fields
types
enums
handoff schema shape (meta-validated as valid JSON Schema via ajv)
allOf composition in handoff schemas
invalid custom properties without x- prefix (checked at all nesting levels)
extensions declaration validation — scope, schema, required, and undeclared checks
extensions_strict enforcement — reject undeclared x-* properties when enabled

Custom properties with x- prefix are allowed on any object in the DSL — top-level entities (agents, tasks, artifacts, …), nested objects (rules, execution steps, workflow steps, …), and the root DSL itself.

YAML safety

The DSL is expressed in YAML, which introduces risks from YAML 1.1's implicit type coercion. The yaml-reserved-key-safety lint rule warns when reserved words appear in positions that may be misinterpreted by non-1.2 parsers.

The most notable case is the on field in decision steps. In YAML 1.1, bare on as a mapping key is interpreted as boolean true. While agent-contracts uses a YAML 1.2 parser internally, DSL consumers (CI tools, editors, other parsers) may use YAML 1.1 parsers.

To address this:

Decision steps now support routing_key as the preferred field name (replacing on)
The legacy on field is still accepted for backward compatibility but triggers a lint warning
Branch keys like yes, no, true, false also trigger warnings

# Preferred — safe across all YAML versions
- type: decision
  routing_key: evidence-gate-verdict.verdict
  branches:
    PASS: [release]
    FAIL: [fix-violations]

# Deprecated — works but triggers yaml-reserved-key-safety warning
- type: decision
  on: evidence-gate-verdict.verdict
  branches:
    PASS: [release]
    FAIL: [fix-violations]
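
A check in the spirit of yaml-reserved-key-safety can be sketched as a filter over mapping keys. The word list below covers the YAML 1.1 boolean spellings, compared case-insensitively as a conservative simplification; the actual rule's word list and matching logic may differ, and `findUnsafeKeys` is a hypothetical name.

```typescript
// YAML 1.1 boolean words; the real lint rule's list may differ (assumption).
const YAML_11_BOOLEAN_WORDS = ["y", "n", "yes", "no", "true", "false", "on", "off"];

// Returns the keys a YAML 1.1 parser could read as booleans.
function findUnsafeKeys(keys: string[]): string[] {
  return keys.filter(
    (key) => YAML_11_BOOLEAN_WORDS.indexOf(key.toLowerCase()) !== -1
  );
}

// The deprecated decision step from above, as already-parsed data.
const legacyDecisionStep = {
  type: "decision",
  on: "evidence-gate-verdict.verdict",
  branches: { PASS: ["release"], FAIL: ["fix-violations"] },
};

const unsafeKeys = findUnsafeKeys(
  Object.keys(legacyDecisionStep).concat(Object.keys(legacyDecisionStep.branches))
);
```

For the legacy step only `on` is reported; the branch keys `PASS` and `FAIL` are safe, whereas branch keys such as `yes` or `no` would be flagged.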

Reference integrity

Checks:

cross-entity references
owner / producer / editor / consumer validity
handoff schema consistency (required vs. properties alignment)
permission alignment between agents and artifacts
team_interface internal consistency (workflows, handoffs, and exposed artifacts exist in the DSL)
cross-team reference validity (team_task targets exist in imports)

Semantic lint

Checks:

bidirectional consistency
validation coverage — warns when artifacts lack validations or have empty required_validations (fails under --strict)
artifact-required-validation wiring — verifies every entry in artifact.required_validations exists, targets the correct artifact, and is referenced in a workflow step or task
task-output-validation completeness — checks that tasks producing artifacts (via execution_steps.produces_artifact or agent can_write_artifacts) cover those artifacts' required_validations
workflow graph completeness
merge integrity
read-only write violations
prerequisite readability
artifact ownership — produces_artifact/reads_artifact in execution steps vs. artifact producers/editors/consumers
tool commands — commands[].reads/commands[].writes reference valid artifacts and align with output_artifacts
semantic validation phase coverage — warns when semantic or fidelity validations only appear in late workflow phases (e.g., audit) but not earlier phases (e.g., specify, plan)
validation executor context wiring — warns when a validation's executor (agent or tool) exists in the DSL but the validation is not surfaced in the executor's prompt context
YAML safety — warns when YAML 1.1 reserved words (on, yes, no, true, false, etc.) are used in positions where they may be misinterpreted by non-1.2 parsers
naming/style issues through Spectral rules

`--strict` mode

When --strict is passed to lint or check, warnings are treated as failures (exit code 1). This is particularly relevant for artifact-centric validation rules — empty required_validations, orphaned validation wiring, and incomplete task coverage are all warnings that become blocking under --strict.
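
The effect of --strict on the exit code can be modeled as follows. The `Diagnostic` shape and `lintExitCode` are hypothetical, and the sketch assumes that errors always fail the run regardless of the flag.

```typescript
// Hypothetical model of how --strict changes the exit code.
type Severity = "error" | "warning";

interface Diagnostic {
  rule: string;
  severity: Severity;
}

function lintExitCode(diagnostics: Diagnostic[], strict: boolean): number {
  const failing = diagnostics.some(
    (d) => d.severity === "error" || (strict && d.severity === "warning")
  );
  return failing ? 1 : 0;
}

// A single warning passes by default but blocks under --strict.
const onlyWarnings: Diagnostic[] = [
  { rule: "validation-coverage", severity: "warning" },
];
const defaultExit = lintExitCode(onlyWarnings, false); // 0
const strictExit = lintExitCode(onlyWarnings, true); // 1
```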

Completeness scoring

agent-contracts score provides a quantitative assessment of the DSL's completeness. While validate checks structural correctness (pass/fail) and lint checks semantic quality (warnings/errors), score produces a numeric metric (0–100) covering validation coverage, schema completeness, cross-reference consistency, and more.

Use --threshold in CI to enforce a minimum quality bar:

agent-contracts score -c config.yaml --threshold 70

LLM-based semantic audit

Static tools (validate, lint, score) catch structural and naming issues, but cannot evaluate design quality — whether agent responsibilities are well-scoped, whether workflow gates are placed correctly, or whether generated prompts faithfully represent DSL intent. The audit command bridges this gap by using LLMs as semantic reviewers.

The audit command requires agent-contracts-runtime (an optional peer dependency) and an API key for at least one supported adapter.

Audit types

Type	What it checks
`render`	19-dimension cross-check of DSL definitions vs generated prompts — detects template gaps, data gaps, and DSL gaps
`dsl`	Design coherence — role overlap, scope breadth, gate placement, guardrail enforcement paths, handoff schema completeness
`prompt`	Prompt fidelity — missing responsibilities, hallucinated permissions, ambiguous instructions, unsafe directives
`all`	Run all three

Results are structured: each finding has a severity (critical / warning / info), a gap type classification, and prioritized recommendations (P0/P1/P2) with concrete fix proposals.
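
To make that structure concrete, here is one way a consumer might model findings in TypeScript. The field names (`gapType`, `summary`, `proposal`) are assumptions for illustration and may not match the actual JSON output; only the severity levels and the P0/P1/P2 priorities are taken from the description above.

```typescript
type Severity = "critical" | "warning" | "info";
type Priority = "P0" | "P1" | "P2";

// Hypothetical shapes; the real output schema may differ.
interface Recommendation {
  priority: Priority;
  proposal: string;
}

interface Finding {
  severity: Severity;
  gapType: string; // e.g. a template gap, data gap, or DSL gap
  summary: string;
  recommendations: Recommendation[];
}

// True when at least one finding is critical.
function hasCritical(findings: Finding[]): boolean {
  return findings.some((f) => f.severity === "critical");
}

// Flatten all recommendations and order them P0 first.
function prioritizedRecommendations(findings: Finding[]): Recommendation[] {
  const order: Priority[] = ["P0", "P1", "P2"];
  const all: Recommendation[] = [];
  for (const f of findings) all.push(...f.recommendations);
  return all.sort((a, b) => order.indexOf(a.priority) - order.indexOf(b.priority));
}
```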

Configuration

# agent-contracts.config.yaml
audit:
  adapter: openai      # claude | openai | gemini | cursor
  model: gpt-4.1       # model override (adapter-specific)

Adapter	Environment Variable
`claude`	`ANTHROPIC_API_KEY`
`openai`	`OPENAI_API_KEY`
`gemini`	`GEMINI_API_KEY`
`cursor`	`CURSOR_API_KEY`
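
A wrapper script might check the table above before invoking an audit. The sketch below is not part of the CLI; `missingApiKey` is a hypothetical helper, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Adapter-to-variable mapping copied from the table above.
const ADAPTER_ENV_VARS = {
  claude: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  gemini: "GEMINI_API_KEY",
  cursor: "CURSOR_API_KEY",
} as const;

type Adapter = keyof typeof ADAPTER_ENV_VARS;

// Returns the name of the missing variable, or null when the key is present.
function missingApiKey(
  adapter: Adapter,
  env: Record<string, string | undefined>,
): string | null {
  const name = ADAPTER_ENV_VARS[adapter];
  return env[name] ? null : name;
}
```

In a Node.js script, passing `process.env` as the second argument gives a quick preflight check before spending time on a run that would fail.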

Usage

agent-contracts audit dsl -c config.yaml
agent-contracts audit all -c config.yaml --adapter claude
agent-contracts audit render -c config.yaml --format json
agent-contracts audit dsl -c config.yaml --dry-run

Use --dry-run to inspect the prompt sent to the LLM without making an API call.

Running the same audit with multiple adapters provides cross-validation: findings reported by three or more adapters can be treated as high-confidence issues.
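
The cross-validation idea can be sketched as a small aggregation step over results collected from several runs. Everything here is an assumption for illustration: the tool is not known to ship such a helper, and the `key` field stands in for whatever identifier you use to decide that two adapters reported the same finding.

```typescript
// Hypothetical record: one finding as reported by one adapter.
interface AdapterFinding {
  adapter: string;
  key: string; // caller-defined identity of the finding
}

// Keys reported by at least `minAdapters` distinct adapters.
function highConfidenceKeys(
  findings: AdapterFinding[],
  minAdapters = 3,
): string[] {
  const adaptersByKey = new Map<string, Set<string>>();
  for (const { adapter, key } of findings) {
    const seen = adaptersByKey.get(key) ?? new Set<string>();
    seen.add(adapter); // a Set ignores repeats from the same adapter
    adaptersByKey.set(key, seen);
  }
  return Array.from(adaptersByKey.entries())
    .filter(([, adapters]) => adapters.size >= minAdapters)
    .map(([key]) => key);
}
```

Counting distinct adapters rather than raw reports keeps a single adapter that repeats itself from inflating confidence.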

All LLM commands support --log-file <path> (-l) to write structured progress logs to a file for debugging and monitoring.

Exit Code	Meaning
0	No critical findings
1	Critical findings detected
2	Invalid input or configuration
3	LLM adapter error (API failure, runtime not installed)
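
In CI it can help to separate an audit verdict (codes 0 and 1) from an operational failure (codes 2 and 3). The sketch below encodes the table above; the helper names are hypothetical and not part of the CLI.

```typescript
// Exit code meanings copied from the table above.
const AUDIT_EXIT_CODES: Record<number, string> = {
  0: "No critical findings",
  1: "Critical findings detected",
  2: "Invalid input or configuration",
  3: "LLM adapter error (API failure, runtime not installed)",
};

// Human-readable description for a given exit code.
function describeAuditExit(code: number): string {
  return AUDIT_EXIT_CODES[code] ?? `Unknown exit code ${code}`;
}

// True when the audit actually ran and produced a verdict (pass or fail),
// as opposed to failing before or during the LLM call.
function isAuditVerdict(code: number): boolean {
  return code === 0 || code === 1;
}
```

A pipeline could, for example, block a merge on code 1 but report codes 2 and 3 as infrastructure problems rather than design problems.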

A self-hosted example of agent-contracts + runtime

The audit feature is itself built on the agent-contracts ecosystem. The auditor agent, audit tasks, handoff schemas, and workflow are all defined as DSL in dsl_base/, and executed via agent-contracts-runtime adapters at runtime. This makes the audit command a concrete, working example of how to combine the two packages: define agent behavior declaratively in YAML, auto-generate typed registries, and execute tasks against real LLM adapters with structured output validation.

Best used with runtime frameworks

agent-contracts works well alongside runtime frameworks and internal agent infrastructure.

A practical model is:

define the workflow in YAML
validate and lint it in CI
generate prompts and derived docs
execute the workflow in your runtime of choice

This separation keeps runtime concerns apart from architecture concerns.

Tech stack

Category	Choice
Language	TypeScript (ESM, strict mode)
Schema	Zod + ajv (JSON Schema meta-validation)
YAML parsing	yaml
Lint	TypeScript custom rules + Spectral
Templates	Handlebars
CLI	commander
Testing	Vitest
Build	tsup

License

MIT
