feat: agent safety (MCP tool annotations, guide, and alegra agent guard) #43
alegra mcp now advertises MCP tool annotations (readOnlyHint / destructiveHint) on every tool, so hosts that honor them (e.g. Codex) gate destructive operations automatically. Read commands (list/get/export, catalog, reports, doctor, version, items stock) are marked read-only; create/update/delete and custom actions (void/emit/stamp/…) are marked destructive. The hints are merged into cobra Annotations, so the completion columns annotation is never clobbered.

Adds docs/user-guide/agent-safety.md (EN + ES): a layered, honest guide to gating destructive operations. It covers annotations (advisory), per-host enforcement (Claude Code permissions & PreToolUse hooks, Codex sandbox/approval, OpenCode permission rules), and CLI built-ins. Cross-referenced from the MCP, skill, and vs-official-MCP pages.
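The "merged, never clobbered" point can be illustrated with a small sketch. It works on a plain `map[string]string`, which is the type of cobra's `Command.Annotations`; the key names (`mcp.readOnlyHint`, `mcp.destructiveHint`, `completion.columns`) and the helper `mergeHints` are hypothetical and are not taken from this PR.

```go
package main

import "fmt"

// Hypothetical annotation keys; the real keys used by alegra are not shown in this PR.
const (
	keyReadOnly    = "mcp.readOnlyHint"
	keyDestructive = "mcp.destructiveHint"
)

// mergeHints writes exactly one MCP hint into an existing annotations map
// (same shape as cobra's Command.Annotations) instead of replacing the map,
// so unrelated keys such as a completion-columns entry survive.
func mergeHints(existing map[string]string, readOnly bool) map[string]string {
	if existing == nil {
		existing = map[string]string{}
	}
	if readOnly {
		existing[keyReadOnly] = "true"
		delete(existing, keyDestructive)
	} else {
		existing[keyDestructive] = "true"
		delete(existing, keyReadOnly)
	}
	return existing
}

func main() {
	// "completion.columns" stands in for the pre-existing annotation.
	ann := map[string]string{"completion.columns": "id,name"}
	ann = mergeHints(ann, false)
	fmt.Println(ann["completion.columns"], ann[keyDestructive])
}
```

Assigning a fresh map literal to the annotations field would drop the existing entry; writing into the existing map is what keeps it.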
New command that generates the agent-safety config for Claude Code, Codex, or OpenCode, with the destructive operations derived from the live command tree (always complete, zero maintenance).

- default: hard-block irreversible actions (delete, void, emit, stamp, close, and compound *-delete actions), require approval for ordinary writes (create/update/import), allow reads
- Claude Code: emits a PreToolUse hook that definitively blocks the irreversible ops on BOTH the Bash and MCP surfaces (it inspects the real command/tool name, so glob tricks can't bypass it), plus permission deny/ask rules
- Codex: read-only sandbox + untrusted approval (it has no per-command hook; honestly noted)
- OpenCode: permission deny/ask patterns for bash and MCP tool names
- flags: --all-writes (block writes too), --write (install files, never overwriting an existing config)

Verified the generated hook blocks void/emit/delete/*-delete on both surfaces and lets reads/creates through. Documents it as the quick path in the Agent Safety guide.
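The default policy described in this commit (block irreversible actions, ask for ordinary writes, allow reads) can be sketched as a three-way classifier. This is an illustration under assumptions, not the command's actual code: the `classify` function, the `Decision` type, and the choice to treat unrecognized verbs as writes are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Decision is the gate applied to a command verb.
type Decision string

const (
	Allow Decision = "allow"
	Ask   Decision = "ask"
	Block Decision = "block"
)

// Verb lists taken from the commit message above.
var irreversible = map[string]bool{
	"delete": true, "void": true, "emit": true, "stamp": true, "close": true,
}
var reads = map[string]bool{"list": true, "get": true, "export": true}

// classify maps a verb to a decision. allWrites mirrors the --all-writes flag.
// Assumption: any verb that is not a known read is treated as a write.
func classify(verb string, allWrites bool) Decision {
	switch {
	case irreversible[verb], strings.HasSuffix(verb, "-delete"):
		return Block
	case reads[verb]:
		return Allow
	case allWrites:
		return Block
	default:
		return Ask
	}
}

func main() {
	for _, v := range []string{"list", "create", "void", "bulk-delete"} {
		fmt.Printf("%s -> %s\n", v, classify(v, false))
	}
}
```

Treating unknown verbs as writes keeps the sketch conservative: a newly added action would require approval rather than pass silently.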
Makes it easy — and honest — to stop an AI agent from running destructive accounting operations.
What's here
1. MCP tool safety annotations. `alegra mcp` now advertises standard MCP tool annotations on every tool: reads carry `readOnlyHint`, writes/actions carry `destructiveHint`. Hosts that honor them (e.g. Codex) gate destructive operations automatically, no config. Verified via `tools/list`: 121 read-only, 106 destructive. (Merged into cobra `Annotations`, so the completion-columns annotation is never clobbered.)
2. `alegra agent guard`. Generates the per-host safety config, with the destructive ops derived from the live command tree (always complete, zero maintenance):
   - `--host claude-code` → `settings.json` (deny/ask rules) + a PreToolUse hook that definitively blocks the irreversible ops on both the Bash and MCP surfaces (it inspects the real command/tool name, so glob tricks can't bypass it).
   - `--host codex` → `config.toml` (read-only sandbox + untrusted approval). Honestly notes Codex has no per-command deny hook.
   - `--host opencode` → `opencode.json` (deny/ask patterns for bash and MCP tool names).
   - Default: hard-block irreversible actions (`delete`, `void`, `emit`, `stamp`, `close`, and compound `*-delete`), require approval for ordinary writes (create/update/import), allow reads.
   - `--all-writes` blocks writes too; `--write` installs the files (never overwriting an existing config).
3. Agent Safety guide (EN + ES). A layered, honest explanation: layer 1 (annotations, advisory) + layer 2 (host config/hooks, the real gate) + layer 3 (CLI built-ins). Cross-referenced from the MCP, agent-skill, and vs-official-MCP pages.
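To make the "both surfaces" claim concrete, here is a simplified sketch of the decision a PreToolUse hook could make. Claude Code hands the hook the pending tool call as JSON (with `tool_name` and `tool_input` fields) and names MCP tools `mcp__<server>__<tool>`. Everything else is assumed for illustration: the server name `alegra`, the `<resource>_<verb>` tool naming, and the `shouldBlock` helper. The hook that `alegra agent guard` actually generates may look quite different.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// hookInput holds the fields of the PreToolUse payload this sketch needs.
type hookInput struct {
	ToolName  string `json:"tool_name"`
	ToolInput struct {
		Command string `json:"command"`
	} `json:"tool_input"`
}

var irreversible = map[string]bool{
	"delete": true, "void": true, "emit": true, "stamp": true, "close": true,
}

func isIrreversible(word string) bool {
	return irreversible[word] || strings.HasSuffix(word, "-delete")
}

// shouldBlock reports whether the pending tool call is an irreversible op.
// A malformed payload is treated as blocked (fail closed).
func shouldBlock(payload []byte) (bool, error) {
	var in hookInput
	if err := json.Unmarshal(payload, &in); err != nil {
		return true, err
	}
	switch {
	case in.ToolName == "Bash":
		// Inspect the actual words of the command, not a glob over the string.
		words := strings.Fields(in.ToolInput.Command)
		for i, w := range words {
			if w != "alegra" {
				continue
			}
			for _, arg := range words[i+1:] {
				if isIrreversible(arg) {
					return true, nil
				}
			}
		}
	case strings.HasPrefix(in.ToolName, "mcp__alegra__"): // assumed server name
		tool := strings.TrimPrefix(in.ToolName, "mcp__alegra__")
		parts := strings.Split(tool, "_") // assumed <resource>_<verb> naming
		return isIrreversible(parts[len(parts)-1]), nil
	}
	return false, nil
}

func main() {
	samples := []string{
		`{"tool_name":"Bash","tool_input":{"command":"alegra invoices void 42"}}`,
		`{"tool_name":"mcp__alegra__invoices_list","tool_input":{}}`,
	}
	for _, s := range samples {
		block, _ := shouldBlock([]byte(s))
		fmt.Println(block)
	}
}
```

A real hook would read the payload from stdin and signal a block through its exit status (exit code 2 in Claude Code). The word-level match here is deliberately conservative, so an argument that merely equals a blocked verb is also refused.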
Honesty (the framing Juan asked for)
The guide and command are explicit that annotations only advise: a host that ignores them runs everything. The hook is the definitive block on Claude Code; on OpenCode, `deny` blocks; on Codex the hard block is the read-only sandbox (approval otherwise). Verified the generated hook blocks `void`/`emit`/`delete`/`*-delete` on both surfaces and lets reads/creates through.

Verification

- `make check`: gofmt, vet, golangci-lint (0 issues), tests green.
- `TestClassifyAPICommands`, `TestGuard_*`, `TestMCPHints` (annotations contract).
- `mkdocs build --strict` passes (EN + ES).

Docs-only changes ride along; this is a feature release (0.9.0).