Merged
16 commits
dac6ac2
♻️ (docs): Mark shipped framework examples as validated
Chisanan232 Jun 14, 2026
36c7e11
♻️ (docs): Update adapter-vs-example note for shipped examples
Chisanan232 Jun 14, 2026
b2bfc00
📝 (docs): Link central agent-assembly-examples for each framework
Chisanan232 Jun 14, 2026
5146fe3
📝 (docs): Add Examples > LangChain basic agent page
Chisanan232 Jun 14, 2026
d591f8c
📝 (docs): Add Examples > LangChain research agent page
Chisanan232 Jun 14, 2026
65e4521
📝 (docs): Add Examples > LangGraph node-level governance page
Chisanan232 Jun 14, 2026
ccb0436
📝 (docs): Add Examples > CrewAI research crew page
Chisanan232 Jun 14, 2026
60d3222
📝 (docs): Add Examples > OpenAI Agents SDK page
Chisanan232 Jun 14, 2026
0ef9a3b
📝 (docs): Add Examples > Pydantic AI page
Chisanan232 Jun 14, 2026
7a84b57
📝 (docs): Add Examples > Google ADK page
Chisanan232 Jun 14, 2026
8841174
📝 (docs): Add Examples > LlamaIndex tool policy page
Chisanan232 Jun 14, 2026
85d09b8
📝 (docs): Add Examples > custom tool policy page
Chisanan232 Jun 14, 2026
c78b11f
📝 (docs): Add Examples > Preparing the runtime environment page
Chisanan232 Jun 14, 2026
5ca37a8
♻️ (docs): Move framework-examples into Examples > Framework support
Chisanan232 Jun 14, 2026
e44cf33
📝 (docs): Add Examples section overview page
Chisanan232 Jun 14, 2026
9769063
📝 (docs): Wire Examples section into nav and inbound links
Chisanan232 Jun 14, 2026
2 changes: 1 addition & 1 deletion docs/configuration.md
Original file line number Diff line number Diff line change
@@ -110,6 +110,6 @@ For the conceptual difference between `mode` (*where* policy is enforced) and `e

## Next steps

-- [Framework examples](guides/framework-examples.md): wire the SDK into LangChain, CrewAI, and more.
+- [Examples](examples/index.md): wire the SDK into LangChain, CrewAI, and more.
 - [Handling allow/deny decisions](guides/handling-decisions.md): catch and respond to policy denials.
 - [Troubleshooting](troubleshooting.md): what each configuration error means.
209 changes: 209 additions & 0 deletions docs/examples/crewai-research-crew.md
@@ -0,0 +1,209 @@
# CrewAI: multi-agent research crew

A three-agent CrewAI-style research crew (researcher → writer → critic) governed by Agent Assembly. Every governed tool call is attributed to the acting agent, and each audit event captures the full delegation chain.

## What this example demonstrates

- A three-agent crew: **researcher → writer → critic**, each with a distinct role.
- **Agent-delegation tracking**: every governed call records an `AuditEvent` whose `call_stack` is the delegation chain (`parent → agent → tool`), built from the SDK's real `agent_assembly.types.AuditEvent` and `CallStackNode`.
- **Multi-agent governance** under one policy:
    - **File-write approval**: any agent that attempts `write_file` is gated; the decision is `pending` until an approver signs off (rejected in this demo).
    - **Shared daily budget**: tool calls across all three agents are metered against a single `$2.00 / day` cap.
- `--mock` mode: the whole crew runs offline with **no `crewai` install and no API keys**, so CI can run it.

## The framework / library

This example governs a [CrewAI](https://docs.crewai.com/)-style multi-agent crew.

Dependency pins from `pyproject.toml`:

- `agent-assembly>=0.0.1a2`: the Agent Assembly Python SDK (always required).
- The optional `live` extra pulls in `crewai>=0.30.0`, needed only for the real-crew integration. The `--mock` demo (what CI runs) needs none of it; it replays the crew's delegation trajectory offline.
- The `dev` extra provides `pytest>=8.0.0` and `pytest-mock>=3.14.0`.

The package requires Python `>=3.12`.

## How it works

`main()` initializes the SDK with `init_assembly(...)` in `mode="sdk-only"`, passing `agent_id="crewai-research-crew"` and a `gateway_url` that defaults to `http://localhost:8080`. The returned context manager exposes `ctx.client` and `ctx.network_mode`.

Governance is simulated locally by `CrewPolicyEngine` (from `src/policy.py`), wired into the SDK through `AssemblyCallbackHandler(interceptor=policy)`. The crew is described in `src/crew.py` as three `CrewMember` dataclasses, and the offline run replays a scripted `MOCK_TRAJECTORY` of `CrewStep`s. For each step, `main()` calls `policy.acting_as(agent, parent)` to set the active crew member, then fires `handler.on_tool_start(...)`.

`CrewPolicyEngine` applies the same policy to every agent's tool calls:

- **File-write approval gate.** `check_tool_start` returns `status="pending"` for any tool in `APPROVAL_REQUIRED_TOOLS` (`{"write_file"}`), deferring to `wait_for_tool_approval`. There, `MockApprover.decide(...)` returns its `auto_approve` value (`False` in the demo), so the decision becomes `deny` with the message that the crew may not persist files without sign-off.
- **Shared daily budget.** Non-approval tools are priced from `TOOL_COSTS` (defaulting to `$0.01`) and charged against one `BudgetTracker` shared across all three agents; if the cap is exhausted the call is denied.
- **Delegation call stack.** Every allow/deny call is recorded by `_emit(...)`, which constructs a `CallStackNode` chain `parent → acting agent → tool` and appends an `AuditEvent` (carrying `call_stack` plus `crew_member` / `delegated_by` labels) to `policy.audit_events`.

After the trajectory, `main()` prints each recorded `AuditEvent` (decision, action type, and the flattened delegation chain) and the final shared budget via `policy.budget.status()`.
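The two-phase approval flow described above (a `pending` check followed by an approver decision) can be sketched without the SDK. This is a minimal illustration, not the example's actual code: `StubApprover`, `StubPolicy`, and `resolve` are hypothetical stand-ins for `MockApprover`, `CrewPolicyEngine`, and the handler's dispatch, and the budget step is left out.

```python
from dataclasses import dataclass, field

APPROVAL_REQUIRED_TOOLS = frozenset({"write_file"})


@dataclass
class StubApprover:
    """Hypothetical stand-in for the example's MockApprover."""

    auto_approve: bool = False

    def decide(self, tool_name: str, agent: str) -> bool:
        # The demo approver simply returns its configured flag.
        return self.auto_approve


@dataclass
class StubPolicy:
    """Hypothetical two-phase policy: check first, then wait for approval."""

    approver: StubApprover = field(default_factory=StubApprover)

    def check_tool_start(self, tool_name: str) -> str:
        # Phase 1: gated tools are parked as "pending"; others pass.
        return "pending" if tool_name in APPROVAL_REQUIRED_TOOLS else "allow"

    def wait_for_tool_approval(self, tool_name: str, agent: str) -> str:
        # Phase 2: the approver turns "pending" into a final decision.
        return "allow" if self.approver.decide(tool_name, agent) else "deny"


def resolve(policy: StubPolicy, tool_name: str, agent: str) -> str:
    """Run both phases and return the final decision."""
    status = policy.check_tool_start(tool_name)
    if status == "pending":
        status = policy.wait_for_tool_approval(tool_name, agent)
    return status
```

With the default approver, `resolve(StubPolicy(), "write_file", "critic")` ends in `deny`; passing `StubApprover(auto_approve=True)` flips the same call to `allow`.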

## Prerequisites & running it

See [Preparing the runtime environment](preparing-the-runtime-environment.md) for the shared prerequisites.

Then, from the example directory:

```bash
cd python/crewai-research-crew
uv sync --extra dev
uv run python src/main.py --mock
```

`--mock` replays the scripted crew delegation trajectory offline: no gateway, no `crewai`, and no API keys. The example also falls back to mock mode automatically whenever `OPENAI_API_KEY` is unset.
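The mode selection described above could look roughly like the sketch below. It assumes the flag is parsed with `argparse`; the helper name `should_use_mock` is hypothetical, and the example's real `main.py` may decide differently.

```python
import argparse
import os
from collections.abc import Mapping


def should_use_mock(argv: list[str], env: Mapping[str, str] = os.environ) -> bool:
    """Return True when the run should replay the offline trajectory."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--mock", action="store_true")
    args = parser.parse_args(argv)
    # Explicit --mock wins; otherwise fall back to mock when no key is set.
    return args.mock or not env.get("OPENAI_API_KEY")
```

Passing the environment as a parameter keeps the decision easy to test without touching the real process environment.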

To drive the real CrewAI crew instead, install the optional `live` extra:

```bash
pip install -e '.[live]'
```

## Code walkthrough

The shared daily budget, the per-call cost model, and the set of approval-required tools are declared at module scope in `src/policy.py`:

```python
#: Shared per-day spend ceiling (USD) across every agent in the crew.
DAILY_BUDGET_USD: float = 2.00

#: Per-call cost model (USD) used to meter spend in offline mode.
TOOL_COSTS: dict[str, float] = {
    "web_search": 0.05,
    "compose_report": 0.10,
    "review_text": 0.05,
    "write_file": 0.00,
}

#: Tools that require human approval before execution.
APPROVAL_REQUIRED_TOOLS: frozenset[str] = frozenset({"write_file"})
```

`check_tool_start` routes a `write_file` to the approval path and meters everything else against the shared budget:

```python
# 1. File-write approval gate — defer to wait_for_tool_approval.
if tool_name in APPROVAL_REQUIRED_TOOLS:
    return {"status": "pending", "reason": (...)}

# 2. Shared daily budget — deny once the crew's cap is exhausted.
cost = TOOL_COSTS.get(tool_name, 0.01)
if not self.budget.can_afford(cost):
    self._emit(tool_name, "deny")
    return {"status": "deny", "reason": (...)}

self.budget.charge(cost)
self._emit(tool_name, "allow")

Each governed call records an `AuditEvent` whose `call_stack` is the delegation chain:

```python
tool_node = CallStackNode(id=str(uuid4()), kind="tool", label=tool_name)
acting_node = CallStackNode(
    id=str(uuid4()), kind="llm", label=self._acting_agent, children=[tool_node]
)
if self._parent_agent is not None:
    stack = [CallStackNode(id=str(uuid4()), kind="llm",
                           label=self._parent_agent, children=[acting_node])]
else:
    stack = [acting_node]
```
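To show how such a nested stack becomes the flattened `chain:` string printed in the audit replay, here is a small self-contained sketch. `Node`, `build_stack`, and `flatten` are hypothetical: `Node` keeps only the `label` and `children` fields of `CallStackNode`, and the example's real flattening code is not shown in this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for CallStackNode (label and children only)."""

    label: str
    children: List["Node"] = field(default_factory=list)


def build_stack(tool: str, agent: str, parent: Optional[str] = None) -> List[Node]:
    """Nest tool under the acting agent, and that under the parent if any."""
    acting = Node(agent, [Node(tool)])
    return [Node(parent, [acting])] if parent is not None else [acting]


def flatten(stack: List[Node]) -> str:
    """Walk the first child at each level and join the labels with arrows."""
    labels = []
    node = stack[0] if stack else None
    while node is not None:
        labels.append(node.label)
        node = node.children[0] if node.children else None
    return " → ".join(labels)
```

For the final step of the mock trajectory, `flatten(build_stack("write_file", "critic", "writer"))` yields `writer → critic → write_file`.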

The crew members and their scripted delegation trajectory live in `src/crew.py`:

```python
CREW: tuple[CrewMember, ...] = (RESEARCHER, WRITER, CRITIC)

MOCK_TRAJECTORY: tuple[CrewStep, ...] = (
    CrewStep("researcher", None, "web_search", {"query": "agent governance"}),
    CrewStep("researcher", None, "web_search", {"query": "interception layers"}),
    CrewStep("writer", "researcher", "compose_report", {"section": "summary"}),
    CrewStep("critic", "writer", "review_text", {"target": "summary"}),
    # The critic tries to persist the report — file writes require approval.
    CrewStep("critic", "writer", "write_file", {"path": "report.md"}),
)
```

## Notes & caveats

!!! tip "Mock mode needs no `crewai` and no API keys"
    The `--mock` path replays the crew's delegation trajectory entirely offline (no gateway, no `crewai` install, and no LLM provider key), which is exactly what makes it safe to run in CI.

!!! note "Seeing the approval path succeed"
    `MockApprover` rejects file writes by default (`auto_approve=False`), so the demo shows the `write_file` request denied. To see the approval path succeed instead, construct the policy with an auto-approving approver, `MockApprover(auto_approve=True)`; the `write_file` event then records an `allow` decision.

## Expected behavior

Running `uv run python src/main.py --mock` produces:

```
================================================================
Agent Assembly — CrewAI Multi-Agent Research Crew
================================================================

Initializing Agent Assembly (gateway: http://localhost:8080, sdk-only mode)...
Agent: crewai-research-crew
Gateway: http://localhost:8080
Mode: sdk-only (mock (offline))

Crew members:
• researcher — Senior Research Analyst
• writer — Technical Writer
• critic — Editorial Critic

Crew policy (local simulation of gateway policy):
APPROVAL — any agent attempting a file write must be approved
BUDGET — $2.00 / day, shared across all agents
TRACK — every call recorded with its delegation call stack

Running crew delegation trajectory:
----------------------------------------------
[researcher] (crew entry agent)
→ web_search({"query": "agent governance"})
✅ ALLOWED

[researcher] (crew entry agent)
→ web_search({"query": "interception layers"})
✅ ALLOWED

[writer] (delegated by researcher)
→ compose_report({"section": "summary"})
✅ ALLOWED

[critic] (delegated by writer)
→ review_text({"target": "summary"})
✅ ALLOWED

[critic] (delegated by writer)
→ write_file({"path": "report.md"})
❌ BLOCKED — Approval for 'write_file' by 'critic' was rejected — the crew may not persist files without sign-off.

Delegation-aware audit events recorded this run:
----------------------------------------------
✅ allow web_search chain: researcher → web_search
✅ allow web_search chain: researcher → web_search
✅ allow compose_report chain: researcher → writer → compose_report
✅ allow review_text chain: writer → critic → review_text
❌ deny write_file chain: writer → critic → write_file

Final crew budget: spent=$0.25 / limit=$2.00 (12%)

Assembly context shut down.
```

Governance-output walkthrough:

| Step | Acting agent | Delegated by | Governance control | Outcome |
|---|---|---|---|---|
| `web_search` | researcher | none (entry) | shared budget | **ALLOWED**, `$0.05` |
| `web_search` | researcher | none (entry) | shared budget | **ALLOWED**, `$0.05` |
| `compose_report` | writer | researcher | shared budget | **ALLOWED**, `$0.10` |
| `review_text` | critic | writer | shared budget | **ALLOWED**, `$0.05` |
| `write_file` | critic | writer | file-write approval | **BLOCKED**: approval rejected |

The `chain:` column in the audit replay is the delegation call stack each `AuditEvent` carries: it shows which agent delegated to which, down to the tool. This is the agent-delegation tracking that distinguishes multi-agent governance from single-agent governance. A real gateway persists the same call stack, so an operator can see exactly who delegated a blocked action.
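The final budget line follows directly from the per-call costs in the table. The sketch below reproduces that arithmetic with `Decimal` to avoid floating-point noise; the real `BudgetTracker` is not shown in this page, so the variable names and the use of `Decimal` are assumptions made for illustration.

```python
from decimal import Decimal

DAILY_BUDGET_USD = Decimal("2.00")
TOOL_COSTS = {
    "web_search": Decimal("0.05"),
    "compose_report": Decimal("0.10"),
    "review_text": Decimal("0.05"),
}

# Only the four allowed calls are charged. The write_file call stops at
# the approval gate before the budget step is reached.
ALLOWED_CALLS = ["web_search", "web_search", "compose_report", "review_text"]

spent = sum((TOOL_COSTS[name] for name in ALLOWED_CALLS), Decimal("0"))
percent_used = spent / DAILY_BUDGET_USD * 100
```

Here `spent` is `0.25` and `percent_used` is `12.5`, which matches the `spent=$0.25 / limit=$2.00 (12%)` line if the demo prints the percentage without decimals.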

## Links

- [Example directory](https://github.com/ai-agent-assembly/agent-assembly-examples/tree/master/python/crewai-research-crew)
- [Example README](https://github.com/ai-agent-assembly/agent-assembly-examples/blob/master/python/crewai-research-crew/README.md)
- [CrewAI documentation](https://docs.crewai.com/)
149 changes: 149 additions & 0 deletions docs/examples/custom-tool-policy.md
@@ -0,0 +1,149 @@
# Custom tool policy (no framework)

The simplest Agent Assembly integration: wrap plain Python functions with governance, no AI framework required.

## What this example demonstrates

This example shows how to add Agent Assembly governance to plain Python functions using the minimal `governed()` wrapper helper. It covers:

- Initializing Agent Assembly with `init_assembly()`.
- Wrapping any Python function with governance using `governed()`.
- Two **allowed** tool calls (`compute_sum`, `fetch_stock_price`).
- Two **denied** tool calls (`send_http_request`, `write_to_disk`), both blocked by policy.
- That the wrapped function body **never executes** when governance denies it.
- The `governed()` pattern as the building block for the `GovernedToolRunner` shown in the [LlamaIndex: manual tool policy](llamaindex-tool-policy.md) example.

## The framework / library

There is **no AI framework** in this example. It depends only on `agent-assembly`, plus pure Python. From `pyproject.toml`:

```toml
dependencies = [
    "agent-assembly>=0.0.1a2",
]
```

No LangChain, LlamaIndex, or any agent framework is involved; the tools are ordinary Python callables.

## How it works

1. `init_assembly()` opens an Assembly context in `sdk-only` mode (an offline demo, so no gateway or API key is needed).
2. `governed(tool_name, fn, policy)` wraps a plain callable, returning a new callable.
3. When the wrapped callable is invoked, the policy check runs **before** the function body via the `AssemblyCallbackHandler`'s `on_tool_start`.
4. If the policy denies the tool, `ToolExecutionBlockedError` surfaces and the original function (`fn`) is **never called**.

In `main.py`, the four tools from `tools.py` are wrapped, then driven through a demo loop. Allowed calls return their result; denied calls raise `ToolExecutionBlockedError`, which the loop catches and reports as blocked.
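The wrap-then-check flow above can be tried without the SDK. The following is a dependency-free sketch of the same pattern, not the example's `governed()`: `BlockedError` stands in for `ToolExecutionBlockedError`, the deny list is hard-coded, and `governed_sketch` and the `calls` log are hypothetical names used to show that a denied body never runs.

```python
import functools
from typing import Any, Callable, List

DENIED_TOOLS = frozenset({"send_http_request", "write_to_disk"})


class BlockedError(Exception):
    """Hypothetical stand-in for ToolExecutionBlockedError."""


def governed_sketch(tool_name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap fn so the policy check runs before the function body."""

    @functools.wraps(fn)
    def wrapper(**kwargs: Any) -> Any:
        if tool_name in DENIED_TOOLS:
            # Raising here means fn(**kwargs) below is never reached.
            raise BlockedError(f"Tool '{tool_name}' is blocked by policy.")
        return fn(**kwargs)

    return wrapper


calls: List[str] = []  # records which tool bodies actually executed


def compute_sum(a: float, b: float) -> float:
    calls.append("compute_sum")
    return a + b


def write_to_disk(path: str, content: str) -> str:
    calls.append("write_to_disk")
    return "written"
```

Calling `governed_sketch("write_to_disk", write_to_disk)(path="x", content="y")` raises `BlockedError` and leaves `calls` untouched, while the wrapped `compute_sum` runs normally.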

## Prerequisites & running it

See [Preparing the runtime environment](preparing-the-runtime-environment.md) for the shared prerequisites.

Then, from the example directory:

```bash
cd python/custom-tool-policy
uv sync --extra dev
uv run python src/main.py
```

No API key, no gateway, and no AI framework are required.

## Code walkthrough

The `governed()` helper from `policy.py` wraps a callable so the policy check runs before the function body:

```python
def governed(tool_name: str, fn: Any, policy: LocalPolicyEngine) -> Any:
    handler = AssemblyCallbackHandler(interceptor=policy)

    def _wrapper(**kwargs: Any) -> Any:
        import json

        handler.on_tool_start(
            serialized={"name": tool_name, "type": "tool"},
            input_str=json.dumps(kwargs),
            run_id=uuid4(),
        )
        return fn(**kwargs)

    _wrapper.__name__ = tool_name
    return _wrapper
```

A plain tool function from `tools.py`, with no framework involved, just a regular callable:

```python
def fetch_stock_price(ticker: str) -> str:
    """Return the current stock price for a ticker symbol."""
    prices = {"AAPL": 211.30, "GOOG": 178.52, "MSFT": 430.00}
    price = prices.get(ticker.upper(), 42.00)
    return f"${price:.2f} (mock)"
```

The demo loop from `main.py`, where allowed calls return a result and denied calls raise:

```python
for tool_name, kwargs in _DEMO_CALLS:
    print(f" → {tool_name}({kwargs})")
    try:
        result = tools[tool_name](**kwargs)
        print(f" ✅ ALLOWED — {result}")
    except ToolExecutionBlockedError as exc:
        print(f" ❌ BLOCKED — {exc}")
    print()
```

## Notes & caveats

!!! note
    `governed()` is the minimal building block. The wrapped function body **never runs** when the policy denies the tool: the `ToolExecutionBlockedError` is raised inside the wrapper before `fn(**kwargs)` is reached.

!!! tip
    This same `governed()` pattern is the foundation for the `GovernedToolRunner` shown in the [LlamaIndex: manual tool policy](llamaindex-tool-policy.md) example.

Troubleshooting (from the README):

| Problem | Fix |
|---|---|
| `ModuleNotFoundError: agent_assembly` | Run `uv sync` first |
| `ToolExecutionBlockedError` in tests | Expected: the deny rules for `send_http_request` and `write_to_disk` are intentional |

The gateway URL defaults to `http://localhost:8080` and can be overridden via the `AGENT_ASSEMBLY_GATEWAY_URL` environment variable; `AGENT_ASSEMBLY_API_KEY` is read but not required in this offline demo.
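One way to read these two settings is sketched below. The variable names and the default URL come from the text above; the helper `resolve_gateway_settings` is hypothetical, and the SDK or the example may resolve them differently.

```python
import os
from collections.abc import Mapping
from typing import Optional, Tuple

DEFAULT_GATEWAY_URL = "http://localhost:8080"


def resolve_gateway_settings(
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, Optional[str]]:
    """Return (gateway_url, api_key), applying the documented default URL."""
    gateway_url = env.get("AGENT_ASSEMBLY_GATEWAY_URL", DEFAULT_GATEWAY_URL)
    # The key is optional in the offline demo, so None is acceptable here.
    api_key = env.get("AGENT_ASSEMBLY_API_KEY")
    return gateway_url, api_key
```

With neither variable set, this returns the default URL and `None` for the key, which is enough for the `sdk-only` demo.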

## Expected behavior

```
==============================================================
Agent Assembly — Custom Tool Policy Demo
(no AI framework required)
==============================================================

Initializing Agent Assembly (gateway: http://localhost:8080, sdk-only mode)...
Agent: custom-tool-demo-agent
Gateway: http://localhost:8080
Mode: sdk-only (offline demo)

Policy rules (local simulation of gateway policy):
DENY — send_http_request, write_to_disk (network / disk writes)
ALLOW — everything else

Running governed tool calls:
--------------------------------------------
→ compute_sum({'a': 12.5, 'b': 7.3})
✅ ALLOWED — 19.8

→ fetch_stock_price({'ticker': 'AAPL'})
✅ ALLOWED — $211.30 (mock)

→ send_http_request({'url': 'https://example.com/data', 'method': 'POST'})
❌ BLOCKED — Tool 'send_http_request' is blocked by policy rule 'deny_network_and_disk_writes'.

→ write_to_disk({'path': '/etc/cron.d/evil', 'content': 'rm -rf /'})
❌ BLOCKED — Tool 'write_to_disk' is blocked by policy rule 'deny_network_and_disk_writes'.
```

## Links

- [Example directory](https://github.com/ai-agent-assembly/agent-assembly-examples/tree/master/python/custom-tool-policy)
- [README](https://github.com/ai-agent-assembly/agent-assembly-examples/blob/master/python/custom-tool-policy/README.md)
- [LlamaIndex: manual tool policy](llamaindex-tool-policy.md)