Aiyu MultiAgent: AI Agent Platform for Developers
Build, test, and deploy AI agents with 84 specialized agents, MCP integration, WebSocket streaming, and multi-LLM support.
| 84 Agents | 46 Skills | 78 Workflows | 10 Rules | 6 LLM Providers |
Aiyu MultiAgent is an open-source AI agent platform that helps developers automate software engineering tasks using large language models (LLMs). It features a ReAct execution engine, MCP server integration for Claude Code / Cursor / Windsurf, WebSocket real-time streaming, agent handoff orchestration, and a plugin system for extensible AI capabilities. It supports OpenAI GPT-4, Anthropic Claude, Groq, local Ollama models, and a mock mode for testing.
Latest Release: v2.7.9 (Multi-CLI PATH Scanner, Question-Form Guardrail, Anti-Slop Quality Gate, Artifact Output Format). Detects AI CLIs in `$PATH` as failover providers, injects 5-question discovery on first build/design turn, checks output for banned phrases/debug logs/secrets, and parses `<artifact>` tags into safe file writes. All changes backward compatible.
- What's new in V2.7
- What's new in V2.6
- What's new in V2.5
- Quick start
- Why Aiyu MultiAgent?
- CLI reference
- LLM providers
- Built-in tools
- Project structure
- How to use
- Security and guardrails
- Testing your agents
- Plugin system
- Customize and extend
- Contributing
- License
V2.7 brings a real-time monitoring dashboard, Groq + Frontmatter Task Runner (v2.7.6), Cursor IDE Full Support (v2.7.7), and Multi-CLI Scanner + Quality Gate + Artifacts (v2.7.9). Together these add a Next.js 14 dashboard for live agent monitoring, a formal WS event schema, a 6th LLM provider, native Cursor `.cursor/rules/*.mdc` + slash command generation, and structured output enforcement.
Four runtime features for safer, more structured agent output.
- `engines`: List CLI engines detected in `$PATH`
- `run --engine cli:claude`: Use a CLI engine as failover provider
- `run --no-form`: Skip the turn-1 question-form guardrail
- `run --strict-quality-gate`: Fail runs on quality violations
- `run --output-format artifact --write-artifacts ./out`: Parse `<artifact>` tags and write files safely
- API: `POST /jobs` and `POST /agents/run-from-note` accept `output_format`, `no_form`, `strict_quality_gate`; `GET /artifacts/:jobId` retrieves artifacts
First-class Cursor IDE integration via a new generator (`lib/commands/cursor-generator.js`) that converts `.windsurf/` artifacts → `.cursor/` natively. Coexists with Windsurf, no breaking changes.
- `init --cursor-only`: Generate `.cursor/` only (from existing `.windsurf/` or package fallback)
- `init --cursor`: Generate alongside Windsurf/.agent during regular init
- `init --cursor-only --force`: Re-sync after `.windsurf/` updates
- 84 agent rules → `.cursor/rules/agents/*.mdc` (Agent-Requested, invoke via `@<agent>`)
- 45 skill rules → `.cursor/rules/skills/*.mdc` (Agent-Requested, AI auto-applies)
- 9 domain rules → `.cursor/rules/domain/*.mdc` (Auto-Attached via heuristic globs per domain: JS/TS/Py for code-quality, `**/api/**` for api-design, `**/auth/**` + `**/*.env*` for security, etc.)
- 2 always-on rules → `.cursor/rules/00-project-overview.mdc` + `01-gemini-protocol.mdc`
- 78 slash commands → `.cursor/commands/*.md` (`/create`, `/debug`, `/deploy`, etc.)
- MCP config → `.cursor/mcp.json` (preserves `context7` + `shadcn`)
- Smart description extraction: prefers frontmatter, falls back to blockquote tagline, skips code fences/tables/lists, synthesizes from `keywords:` as last resort
- 23 unit tests in `lib/test/unit/cursor-generator.test.js`; 101 total tests passing
Full guide: docs/CURSOR-IDE.md
First-class Roo Code integration via `lib/core/roo-generator.js`, which converts `.agent/` agents → Roo custom modes.
- `init --roo-only`: Generate `.roomodes`, `.roorules`, `.roo/` from existing `.agent/`
- `init --no-roo`: Skip Roo Code generation during regular init
- 84 custom modes → `.roomodes` (one per agent, selectable via Roo mode picker)
- Project rules → `.roorules` (mirrors `.windsurfrules`)
- System prompts → `.roo/` (per-agent instructions)
- 6th LLM provider: `callGroq()` with `GROQ_API_KEY` env var, free tier (14,400 req/day)
- `run-from-file <path>`: Execute agent from markdown with frontmatter (agent, provider, model, maxSteps)
- `init --windsurf-only`: Create `.windsurf/` only (no `.agent/` directory)
- `init --agent-only`: Create `.agent/` only (no `.windsurf/` symlink)
| Area | Change | Impact |
|---|---|---|
| V2.7.9 Quality + CLI | Multi-CLI scanner, question-form guardrail, anti-slop quality gate, artifact output | Safety ⬆️ |
| V2.7.8 Cursor Output | Output Contract in 78 slash commands; agent ID enforced even without alwaysApply | Reliability ⬆️ |
| V2.7.7 Cursor IDE | `.cursor/rules/*.mdc` + `.cursor/commands/*.md` generator: 84 agents, 45 skills, 78 commands | IDE Support ⬆️ |
| V2.7.6 Groq + Frontmatter | 6th LLM provider (`callGroq`) + `run-from-file <path>` task runner | Flexibility ⬆️ |
| Dashboard | Next.js 14 real-time monitoring (`aiyu-multi-agent-dashboard/`) | Observability ⬆️ |
| WS Schema | `docs/WS-SCHEMA.md`: formal contract (6 client→server, 10 server→client) | Reliability ⬆️ |
| V2.7.0 to V2.7.4 Hardening | 150+ bug fixes across 6 audit rounds; see CHANGELOG.md for details | Stability ⬆️ |
V2.6 brings module decomposition and reliability hardening. It breaks the two largest god modules into focused, maintainable files while preserving full backward compatibility.
| Area | Change | Impact |
|---|---|---|
| Decomposition | `agent-runtime.js` (843 lines) → 8 modules | Maintainability ⬆️ |
| Decomposition | `tool-registry.js` (543 lines) → 3 modules | Maintainability ⬆️ |
| Production | Tracing `appendFileSync` → async batched queue | No event loop blocking |
| Production | MCP `run_agent` 2min timeout + maxSteps cap 20 | Prevents runaway agents |
| Production | Usage flush `beforeExit` + sync fallback | No data loss on exit |
| Docker | Non-root user + expanded `.dockerignore` | Security ⬆️ |
| CLI | `aiyu-multi-agent dev` REPL with verbose logging | Dev experience ⬆️ |
| Types | `types.d.ts` for 12 core modules | TS migration foundation |
| Karpathy | Behavioral principles in system prompt + runtime guardrails | LLM coding quality ⬆️ |
| Agent Audit | 84/84 clean-code, 84/84 Interaction Maps, frontend decomposed | Agent consistency ⬆️ |
| 7 New Tools | agent.delegate, memory.save/load, web.search, plan.create/update/list | Agent capability ⬆️ |
| Frontmatter Audit | 84/84 When to Activate, 84/84 Philosophy, 84/84 memory field | Agent completeness ⬆️ |
agent-runtime.js (re-export) →
- react-loop.js: ReAct loop
- chat-session.js: Interactive chat
- failover.js: Per-provider CB
- cache.js: LRU cache
- agent-loader.js: Agent spec loading
- prompt-builder.js: System prompt
- input-sanitizer.js: Input validation
- tool-parser.js: Tool call parsing

tool-registry.js (re-export) →
- tool-definitions.js: Tools + schemas
- search-tools.js: Grep + Glob
- command-parser.js: Shell arg parse
V2.5 brings Claude Design-inspired capabilities to the Aiyu MultiAgent platform, enabling real-time agent collaboration, external API access, and smarter project-aware AI automation. This release adds 9 major features and fixes 31 bugs for improved reliability.
V2.5.1 adds 25 system-audit bug fixes (6C+7H+12M):
- Per-provider circuit breaker keys (`llm:openai`, `llm:claude`) with `callLLMWithFailover()`
- Rate limit hard cap (200 entries) + X-Forwarded-For spoofing fix (`AIYU_TRUST_PROXY`)
- `search.grep` lastIndex reset, chat session failover + 30-min TTL
- Handoff bundle persistence + project-scoped path, cache freeze-on-fallback
- LLM retry off-by-one fix, Ollama https transport, usage flush on exit
- CORS origin config (`AIYU_CORS_ORIGIN`), fs.glob brace alternation escape
| Real-Time Streaming | Agent Handoff | Inline Intervention |
|---|---|---|
| WebSocket API at `/ws` streams agent step events live to your IDE | `POST /handoff` chains multiple AI agents with enriched context bundles | `POST /agents/intervene` or WebSocket lets you inject feedback mid-run |

| `fetch.url` Tool | Auto-Apply Context | API Key Auth |
|---|---|---|
| Agents fetch HTTP(S) URLs with 15s timeout and 100KB body limit | Auto-detects language/framework from `package.json` + `.windsurf/rules` | `AIYU_API_KEY` env var with Bearer token + `crypto.timingSafeEqual` |

| LLM Failover | Per-Tool Timeout | Smart Truncation |
|---|---|---|
| openai → claude → local → mock failover when circuit breaker opens | 30s `Promise.race` per tool call with `tool_timeout` tracing | Section-aware 8KB skill limit preserving headings and code blocks |
Get started in seconds with npx, no installation required:
# Initialize in your project (one command, smart defaults, no prompts!)
npx aiyu-multi-agent init
# Or use interactive mode for full guided setup
npx aiyu-multi-agent init --interactive
# Cursor IDE users: generate native .cursor/ rules + slash commands
npx aiyu-multi-agent init --cursor-only
# Multi-IDE projects: generate both .windsurf/ and .cursor/
npx aiyu-multi-agent init --cursor
# Roo Code (VS Code) users: generate .roomodes, .roorules, .roo/
npx aiyu-multi-agent init --roo-only
# Windsurf-only (no .agent/ directory)
npx aiyu-multi-agent init --windsurf-only

Once initialized, type any slash command in the Windsurf chat panel, Cursor chat, or terminal to activate specialized AI agents:
/create Build a task management app with Next.js
/backend Design a REST API with PostgreSQL and authentication
/security Audit my codebase for OWASP vulnerabilities
/debug Find the memory leak in my Express middleware
The platform automatically detects your project type, selects the right agent, and starts working.
Or Clone From Source
git clone https://github.com/teeprakorn1/aiyu-multi-agent.git
cd aiyu-multi-agent
npm install
aiyu-multi-agent .

Developers waste hours on repetitive tasks: setting up projects, writing boilerplate, auditing code, debugging, and orchestrating complex multi-step workflows across different tools and LLM providers.
Aiyu MultiAgent is a unified AI agent platform that brings 84 specialized agents to your fingertips. Instead of context-switching between ChatGPT, Claude, and custom scripts, you get:
- Instant Agent Activation: Type `/backend`, `/security`, or `/deploy` and a domain expert agent takes over
- Multi-LLM Support: Works with OpenAI GPT-4, Anthropic Claude, local Ollama models, and mock mode for testing
- Safety & Security: Path traversal protection, sandboxed execution, secret scanning, and permission-based skill installation
- MCP Integration: Native support for Claude Code, Cursor, and Windsurf via the Model Context Protocol
- Real-Time Streaming: WebSocket API streams agent thoughts and actions live to your IDE
- Agent Handoff: Chain multiple agents together for complex workflows (e.g., architect → backend → security auditor)
- Built-In Testing: Write declarative agent tests in Markdown, run compliance checks, and validate with CI/CD
- Publish & Share: Package your custom agents as npm modules for your team or the community
Aiyu MultiAgent provides a comprehensive command-line interface for managing AI agents, running tasks, testing, and publishing. All commands support --help for detailed usage.
aiyu-multi-agent init # Quick setup (smart defaults)
aiyu-multi-agent init --interactive # Full interactive setup
aiyu-multi-agent init --dry-run # Preview without writing
aiyu-multi-agent version # Show version + check updates
aiyu-multi-agent status # Project statistics
aiyu-multi-agent list # List all slash commands
aiyu-multi-agent inspect # Observability dashboard
aiyu-multi-agent checklist # Run master checklist

Run agents with natural language input or start an interactive chat session:
aiyu-multi-agent run "Create REST API" # Run agent with input
aiyu-multi-agent run "..." --agent backend # Specify agent
aiyu-multi-agent run "..." --provider openai # OpenAI (needs OPENAI_API_KEY)
aiyu-multi-agent run "..." --provider claude # Claude (needs ANTHROPIC_API_KEY)
aiyu-multi-agent run "..." --provider groq # Groq (needs GROQ_API_KEY, free tier)
aiyu-multi-agent run "..." --provider local # Ollama (local LLM)
aiyu-multi-agent run "..." --provider mock # Mock (testing)
aiyu-multi-agent run "..." --provider cli:claude # CLI engine as provider
aiyu-multi-agent run "..." --json # JSON output (CI/CD)
aiyu-multi-agent run "..." --max-steps 20 # Override max ReAct steps
aiyu-multi-agent run "..." --verbose # Streaming step-by-step
aiyu-multi-agent run "..." --no-cache # Skip cache
aiyu-multi-agent run "..." --output-format artifact # Parse <artifact> tags
aiyu-multi-agent run "..." --write-artifacts ./out # Write artifacts to dir
aiyu-multi-agent run "..." --no-form # Skip question-form guardrail
aiyu-multi-agent run "..." --strict-quality-gate # Fail on quality violations
aiyu-multi-agent run-from-file tasks/login.md # Run agent from markdown with frontmatter
# Frontmatter: agent, provider, model, maxSteps, outputFormat (all optional, --flags override)
# Also supports: --output-format, --no-form, --strict-quality-gate, --write-artifacts
aiyu-multi-agent dev # Dev REPL (mock provider)
aiyu-multi-agent dev --provider openai # Dev REPL with real LLM
aiyu-multi-agent dev --verbose # Dev REPL with step logging
aiyu-multi-agent chat # Interactive session
aiyu-multi-agent chat --agent backend # Chat with specific agent
aiyu-multi-agent engines # List CLI engines in PATH
aiyu-multi-agent engines --json # JSON output
aiyu-multi-agent health # System health check
aiyu-multi-agent health --json # JSON output
aiyu-multi-agent traces # View recent traces
aiyu-multi-agent traces --id <traceId> # Specific trace details
aiyu-multi-agent traces --metrics # Trace metrics summary
aiyu-multi-agent inspect # Observability dashboard
aiyu-multi-agent usage # Usage statistics
aiyu-multi-agent info <agent> # Agent details
aiyu-multi-agent update # Update config to latest
aiyu-multi-agent uninstall # Remove config directories

Start a production-ready HTTP server with REST API and WebSocket streaming:
aiyu-multi-agent serve # Start HTTP API server

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | System status (k8s probe exempt from auth) |
| `/jobs` | POST | Enqueue agent run |
| `/jobs/:id` | GET | Poll job status |
| `/metrics` | GET | Prometheus gauge format |
| `/traces` | GET | Distributed trace data |
| `/handoff` | POST | Chain agents with context bundles |
| `/agents/intervene` | POST | Inject mid-run feedback |
| `/agents/statuses` | GET | Live agent status grid |
| `/artifacts/:jobId` | GET | Retrieve parsed artifacts for a job |
| `/ws` | WebSocket | Real-time agent step streaming |
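
The enqueue-then-poll flow implied by `POST /jobs` and `GET /jobs/:id` can be sketched as follows. This is a hypothetical client, not code from the project: the base URL and port, the `agent` and `input` body fields, the `id` field in the response, and the `completed` / `failed` status values are all assumptions. Only `output_format` and the Bearer token scheme come from this README.

```javascript
// Hypothetical client sketch for the HTTP API (requires Node 18+ for global fetch).
// ASSUMPTIONS: `agent`, `input`, the response `id`, and the status strings
// are illustrative guesses; check the server docs for the real contract.

// Build the POST /jobs request without sending it (easy to inspect and test).
function buildJobRequest(baseUrl, apiKey, { agent, input, outputFormat }) {
  const body = { agent, input };
  if (outputFormat) body.output_format = outputFormat; // field named in the README
  return {
    url: new URL('/jobs', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`, // AIYU_API_KEY on the server side
      },
      body: JSON.stringify(body),
    },
  };
}

// Enqueue a job, then poll GET /jobs/:id until it reaches a terminal state.
async function submitAndPoll(baseUrl, apiKey, job, { intervalMs = 1000, maxPolls = 60 } = {}) {
  const { url, options } = buildJobRequest(baseUrl, apiKey, job);
  const created = await (await fetch(url, options)).json();
  for (let i = 0; i < maxPolls; i += 1) {
    const res = await fetch(new URL(`/jobs/${created.id}`, baseUrl), {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    const state = await res.json();
    if (state.status === 'completed' || state.status === 'failed') return state;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('job did not finish within the polling budget');
}
```

Keeping request construction separate from the network call makes the payload easy to log or unit test before pointing the client at a real server.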
First-class Cursor IDE integration via auto-generated .cursor/rules/*.mdc and .cursor/commands/*.md:
npx aiyu-multi-agent init --cursor-only # .cursor/ only
npx aiyu-multi-agent init --cursor # .windsurf/ + .cursor/ coexist
npx aiyu-multi-agent init --cursor-only --force # Re-sync after .windsurf/ changes

This generates:
- 84 agent rules (Agent-Requested): invoke via `@orchestrator`, `@backend-specialist`, etc.
- 45 skill rules (Agent-Requested): AI auto-applies based on context
- 9 domain rules (Auto-Attached): globs target relevant file types
- 78 slash commands: `/create`, `/debug`, `/deploy`, etc.
- MCP config: `.cursor/mcp.json` with `context7` + `shadcn`
Full guide: docs/CURSOR-IDE.md
First-class Roo Code (VS Code extension) integration via auto-generated .roomodes, .roorules, .roo/:
npx aiyu-multi-agent init --roo-only # .roomodes + .roorules + .roo/
npx aiyu-multi-agent init --roo-only --force # Re-generate after .agent/ changes

This generates:
- 84 custom modes: one per agent, selectable via Roo mode picker
- Project rules: `.roorules` mirrors `.windsurfrules`
- System prompts: `.roo/` per-agent instructions
npx aiyu-multi-agent init --windsurf-only # .windsurf/ only (no .agent/)
npx aiyu-multi-agent init --agent-only # .agent/ only (no .windsurf/ symlink)
npx aiyu-multi-agent init --no-roo # Skip Roo Code generation

Integrate with Claude Code, Cursor, Windsurf, and any MCP-compatible host:
aiyu-multi-agent mcp # Start MCP server (stdio)

Host Configuration
Claude Code (`claude_desktop_config.json`):
{
"mcpServers": {
"aiyu": {
"command": "npx",
"args": ["-y", "aiyu-multi-agent", "mcp"],
"cwd": "/path/to/your/project"
}
}
}
Cursor (`.cursor/mcp.json`):
{
"mcpServers": {
"aiyu": {
"command": "npx",
"args": ["-y", "aiyu-multi-agent", "mcp"]
}
}
}

| MCP Tool | Description |
|---|---|
| `list_agents` | Discover available agents |
| `run_agent` | Execute agent (pass `agent_name` + `input`) |
| `inspect_agent` | Get agent details: skills, tools, instructions |
aiyu-multi-agent test # Run agent test suite
aiyu-multi-agent test --compliance # Spec compliance (15 checks)
aiyu-multi-agent test --unit # Unit tests (41 tests)
aiyu-multi-agent test --production # Production tests (25 tests)
aiyu-multi-agent test --integration # Integration tests (12 tests)
aiyu-multi-agent test --watch # Watch mode
aiyu-multi-agent publish # Publish agent to npm
aiyu-multi-agent publish --dry-run # Validate without publishing
aiyu-multi-agent add skill <name> # Install skill from npm
aiyu-multi-agent remove skill <name> # Uninstall skill

Aiyu MultiAgent supports multiple large language model providers with automatic failover. If one provider's circuit breaker opens, the system automatically tries the next provider in the chain.
| Provider | Environment Variable | Supported Models | Best For |
|---|---|---|---|
| OpenAI | `OPENAI_API_KEY` | gpt-4, gpt-4o, gpt-3.5-turbo | General-purpose coding, reasoning, creative tasks |
| Claude | `ANTHROPIC_API_KEY` | claude-3-5-sonnet, claude-3-5-haiku | Long context, detailed analysis, safety-critical code |
| Groq | `GROQ_API_KEY` (+ optional `GROQ_MODEL`) | llama-3.3-70b-versatile, mixtral-8x7b-32768, gemma2-9b-it | Fast inference, free tier (14,400 req/day at console.groq.com) |
| Ollama | `OLLAMA_HOST` | llama3, mistral, codellama | Local/offline execution, privacy-sensitive projects |
| Mock | `AIYU_ENABLE_MOCK=1` | Canned responses | Testing, CI/CD pipelines, development without API keys |
| CLI Engines | Auto-detected in `$PATH` | claude, codex, gemini, etc. | Use installed AI CLIs as failover providers |

Failover chain: openai → claude → groq → ollama → mock
When the circuit breaker detects failures (timeouts, 5xx errors, rate limits), it automatically promotes the next provider. No manual intervention required.
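
A minimal sketch of how a per-provider circuit breaker and a failover chain can fit together is shown below. It is illustrative only and is not the project's `callLLMWithFailover()`: the class name, function name, failure threshold, and cooldown are all assumptions. It also falls through to the next provider on any single failure, which is a simplification of the behavior described above.

```javascript
// Illustrative circuit breaker: opens after `threshold` consecutive failures
// and stays open for `cooldownMs`. Names and defaults are hypothetical.
class CircuitBreaker {
  constructor({ threshold = 3, cooldownMs = 30_000, now = Date.now } = {}) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, handy for tests
    this.failures = 0;
    this.openedAt = null;
  }

  isOpen() {
    if (this.openedAt === null) return false;
    if (this.now() - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: close the breaker and allow a fresh attempt.
      this.openedAt = null;
      this.failures = 0;
      return false;
    }
    return true;
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = this.now();
  }
}

// Try each provider in order, skipping any whose breaker is currently open.
// `providers` maps a name to an async function (prompt) => result.
async function callWithFailover(chain, providers, breakers, prompt) {
  const errors = [];
  for (const name of chain) {
    const breaker = breakers[name];
    if (breaker.isOpen()) {
      errors.push(`${name}: circuit open`);
      continue;
    }
    try {
      const result = await providers[name](prompt);
      breaker.recordSuccess();
      return { provider: name, result };
    } catch (err) {
      breaker.recordFailure();
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join('; ')}`);
}
```

Keeping one breaker per provider means a run of OpenAI failures does not block calls to Claude or Groq, which is the point of the per-provider keys mentioned in the V2.5.1 notes.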
Every agent gets access to a set of sandboxed, namespaced tools for safe file system and shell operations. All tools run with path traversal protection and argument validation.
| Tool | Required Args | What It Does |
|---|---|---|
| `fs.read` | `path` | Read file contents with project-root restriction |
| `fs.write` | `path`, `content` | Atomic file write (temp → rename) with EXDEV fallback |
| `fs.edit` | `path`, `old_string`, `new_string` | Find & replace with unique match enforcement |
| `fs.glob` | `pattern` | Find files by glob pattern (brace `{a,b}` expansion supported) |
| `search.grep` | `pattern` | Search file contents (async walk, Node.js native; works on Windows) |
| `shell.exec` | `command` | Execute whitelisted shell commands via `execFileSync` (no shell) |
| `fetch.url` | `url` | Fetch HTTP(S) URLs with 15s timeout, 3-redirect follow, 100KB limit |

Legacy aliases: `Read`, `Write`, `Edit`, `Grep`, `Glob`, `Bash` auto-map to namespaced versions for backward compatibility.
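
The "temp → rename with EXDEV fallback" pattern mentioned for `fs.write` might look roughly like this. It is a sketch under assumptions, not the project's implementation: the function name `atomicWriteSync` and the temp-file naming scheme are hypothetical, and it uses only standard `node:fs` calls.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: write to a temp file, then rename it over the target.
// `tmpDir` defaults to the target's directory so the rename stays on one
// filesystem; pass another directory to exercise the EXDEV path.
function atomicWriteSync(targetPath, content, tmpDir = path.dirname(targetPath)) {
  const tmpName = `.${path.basename(targetPath)}.${process.pid}.${Date.now()}.tmp`;
  const tmpPath = path.join(tmpDir, tmpName);
  fs.writeFileSync(tmpPath, content);
  try {
    // rename is atomic when source and target are on the same filesystem.
    fs.renameSync(tmpPath, targetPath);
  } catch (err) {
    if (err.code !== 'EXDEV') {
      fs.rmSync(tmpPath, { force: true }); // clean up the temp file
      throw err;
    }
    // Cross-device rename is not possible: fall back to copy + unlink.
    // This fallback is no longer atomic, only a best effort.
    fs.copyFileSync(tmpPath, targetPath);
    fs.unlinkSync(tmpPath);
  }
}
```

Readers of the target file see either the old contents or the new contents on the same-filesystem path, never a half-written file.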
Runtime Correctness Guarantees
- Parser Fallback Chain: 4 strategies: structured JSON → `TOOL_CALL` regex → JSON code blocks → final answer
- Arg Validation: Required args checked before execution; missing args return descriptive errors
- Step Logging: Every step recorded as `{ step, thought, action, result, error, duration_ms }`
- Output Contract: `outputFormat: json` enforces valid JSON output (useful for CI/CD)
- Deterministic Mode: `temperature: 0` for reproducible test results across all providers
- Tool Timeout: 30s per tool call via `Promise.race`; tracing tags `tool_timeout` vs `tool_failure`
- LLM Retry/Backoff: Exponential backoff (max 3 retries) for HTTP 429, 503, and network timeouts
- Cross-Platform: `fs.glob` and `search.grep` use pure Node.js (no external `grep`/`find` dependency)
- Safe Write EXDEV: Atomic write handles cross-partition rename with copy+unlink fallback
- Agent Name Validation: Rejects path traversal characters: `/ \ : * ? " < > |`
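
As an illustration of the Tool Timeout item above, a per-call timeout built on `Promise.race` could be sketched like this. The names `ToolTimeoutError`, `runToolWithTimeout`, and `classifyToolError` are hypothetical; only the 30s default and the `tool_timeout` / `tool_failure` tags come from this README.

```javascript
// Hypothetical error type so a timeout can be told apart from a tool failure.
class ToolTimeoutError extends Error {
  constructor(toolName, ms) {
    super(`tool ${toolName} timed out after ${ms}ms`);
    this.name = 'ToolTimeoutError';
  }
}

// Race the tool call against a timer. Note that Promise.race does not cancel
// the losing promise: a slow tool keeps running unless it supports
// cancellation on its own.
async function runToolWithTimeout(toolName, toolFn, args, timeoutMs = 30_000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new ToolTimeoutError(toolName, timeoutMs)), timeoutMs);
  });
  try {
    // Wrapping in a promise chain turns synchronous throws into rejections.
    return await Promise.race([Promise.resolve().then(() => toolFn(args)), timeout]);
  } finally {
    clearTimeout(timer); // do not keep the event loop alive after the race
  }
}

// Map an error to the tracing tag named in the README.
function classifyToolError(err) {
  return err instanceof ToolTimeoutError ? 'tool_timeout' : 'tool_failure';
}
```

Tagging the two cases separately lets traces distinguish a hung tool from one that ran and returned an error.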
.agent/                      # Universal config (primary)
├── agents/                  # AI Agents
├── skills/
│   ├── core/                # Built-in skills
│   └── installed/           # npm-installed skills
├── workflows/               # Slash command workflows
├── rules/                   # Auto-triggered rules
├── tests/                   # Agent test files (*.test.md)
├── scripts/                 # Verification scripts
└── config.yaml              # Agent configuration
.windsurf/                   # Symlink → .agent/ (Windsurf IDE)
.cursor/                     # Auto-generated for Cursor IDE (84 agents + 45 skills + 78 commands)
.roomodes                    # Roo Code custom modes (84 agents)
.roorules                    # Roo Code project rules
.roo/                        # Roo Code system prompts
Package Structure
aiyu-multi-agent/
├── bin/
│   ├── cli.js                      # CLI entry (Commander.js)
│   ├── server.js                   # HTTP API server entry
│   └── postinstall.js              # Post-install script
├── lib/
│   ├── api/
│   │   ├── server.js               # Express HTTP server
│   │   ├── ws.js                   # WebSocket real-time streaming
│   │   ├── handoff.js              # Agent handoff + intervention API
│   │   ├── jobs.js                 # Async job queue
│   │   ├── middleware.js           # Auth, rate-limit, logging, shutdown guard
│   │   └── config.js               # API configuration
│   ├── core/
│   │   ├── agent-runtime.js        # Re-export (V2.6 decomposed)
│   │   ├── react-loop.js           # ReAct loop + tool calling + timeout
│   │   ├── chat-session.js         # Interactive chat + timeout
│   │   ├── failover.js             # Per-provider circuit breaker + failover
│   │   ├── cache.js                # LRU cache
│   │   ├── agent-loader.js         # Agent spec + skill loading
│   │   ├── prompt-builder.js       # System prompt construction
│   │   ├── input-sanitizer.js      # Input validation + injection detection
│   │   ├── tool-parser.js          # Tool call parsing
│   │   ├── tool-registry.js        # Re-export (V2.6 decomposed)
│   │   ├── tool-definitions.js     # Tools + schemas + registry
│   │   ├── search-tools.js         # Grep + Glob
│   │   ├── command-parser.js       # Shell arg parse + ReDoS-safe
│   │   ├── llm-providers.js        # OpenAI, Claude, Groq, Ollama, Mock + retry
│   │   ├── circuit-breaker.js      # Prevents cascade LLM failures
│   │   ├── request-queue.js        # Concurrency control + backpressure
│   │   ├── tracing.js              # Distributed tracing (OTel export)
│   │   ├── health-check.js         # System + Ollama health status
│   │   ├── cli-scanner.js          # Multi-CLI PATH scanner
│   │   ├── cli-adapters/           # Per-CLI adapters (claude, codex, generic)
│   │   ├── question-form.js        # Turn-1 discovery guardrail
│   │   ├── quality-gate.js         # Anti-slop output quality checker
│   │   ├── artifact-parser.js      # <artifact> tag parser
│   │   ├── roo-generator.js        # Roo Code (.roomodes, .roorules, .roo/)
│   │   ├── config.js               # Config loader (.agent/ + .windsurf/)
│   │   ├── plugin.js               # Plugin lifecycle + permission system
│   │   ├── guardrails.js           # Security & safety layer
│   │   ├── usage.js                # Usage stats + Prometheus metrics
│   │   ├── logger.js               # Structured JSON logging
│   │   └── types.d.ts              # TypeScript declarations
│   ├── commands/                   # CLI command handlers
│   ├── test/                       # Test runner + compliance + unit tests
│   ├── mcp/                        # MCP server + tools
│   └── publish/                    # Packager + validator + registry
├── templates/                      # Agent + skill scaffolds
├── docs/                           # Architecture, runtime spec, roadmap, usage
├── .windsurf/                      # 84 Agents, 46 Skills, 78 Workflows, 10 Rules (Windsurf IDE)
├── .cursor/                        # 84 agents + 45 skills + 9 domain rules + 78 commands (Cursor IDE, auto-generated)
├── .roomodes                       # 84 custom modes (Roo Code, auto-generated)
├── .roorules                       # Roo Code project rules (auto-generated)
├── .roo/                           # Roo Code system prompts (auto-generated)
└── aiyu-multi-agent-dashboard/     # Real-time monitoring dashboard (Next.js 14)
Type / followed by a command name to instantly activate a specialized AI agent. Each agent has domain-specific skills, tools, and guardrails tailored to its purpose.
| Core | Development | Frameworks |
|---|---|---|
| `/create` `/plan` `/enhance` `/brainstorm` | `/backend` `/frontend` `/fullstack` | `/nextjs` `/react` `/angular` `/sveltekit` |
| `/status` `/debug` `/deploy` `/test` | `/database` `/data-layer` `/business-logic` | `/nestjs` `/express` `/python-api` `/go` |

| Security | Infrastructure | Industrial |
|---|---|---|
| `/security` `/secure-coding` `/threat-modeling` | `/cloud` `/docker` `/linux` `/windows` | `/mechatronic` `/pneumatic` `/electric` |
| `/pentest-plan` `/kali` `/hack` `/bypass` | `/network` `/load-balancer` `/migrate` | `/chief-machine` `/plc` `/iot` |

| Orchestration | Specialist |
|---|---|
| `/orchestrate` `/junior-orchestrate` (2-3 agents) | `/math` `/elite-tech-leader` `/package-finder` |
| `/senior-orchestrate` (4-6 agents) | `/staff` `/platform` `/ux-research` `/accessibility` |
| `/elite-orchestrate` (7+ agents) | |
Just describe your task in plain English. The built-in intelligent routing system automatically selects the best AI agent for your request:
"Build me a REST API with JWT authentication and PostgreSQL"
→ Active Agent: backend-specialist
"Check my React app for XSS and CSRF vulnerabilities"
→ Active Agent: security-auditor
"Design a cloud architecture on AWS for 10k concurrent users"
→ Active Agent: cloud-architect
For complex, multi-domain projects, orchestrate multiple agents to work together:
| Orchestration Level | Agents | Best For |
|---|---|---|
| Junior `/junior-orchestrate` | 2-3 | Simple feature, quick bug fix, single-file refactor |
| Senior `/senior-orchestrate` | 4-6 | Multi-service feature, cross-team integration, architecture review |
| Elite `/elite-orchestrate` | 7+ | Mission-critical migration, enterprise platform, zero-downtime deployment |
Aiyu MultiAgent is built with security-first design for safe AI agent execution in production environments. Every tool call passes through multiple safety layers:
| Guardrail | Protection Layer | Details |
|---|---|---|
| Path Traversal | File system isolation | Blocks ../, absolute paths, symlink escapes. Uses projectRoot + path.normalize() + fs.realpathSync() |
| Safe Write | Data integrity | Atomic writes (temp → rename) with EXDEV cross-partition fallback |
| Rate Limit | DoS prevention | In-memory limiting with X-Forwarded-For support, auto-cleanup |
| Sandbox Exec | Command isolation | execFileSync only (no shell). Whitelist-only commands. path.basename() pre-check |
| Command Injection | Input sanitization | Blocks $(), `, rm -rf, mkfs, dd, destructive patterns |
| API Key Auth | Access control | AIYU_API_KEY env var. Bearer token with crypto.timingSafeEqual (timing-attack safe) |
| Env Leak Prevention | Secret protection | Strips API_KEY / TOKEN / SECRET / PASSWORD from child process env regardless of env source |
| Secret Scanning | Pre-publish safety | Detects leaked keys on publish. Blocks with --strict. Recursive scan of all .md, .yaml, .json files for ghp_, sk-, AKIA |
| Permission System | Explicit consent | Skills declare permissions: { fs, network, exec }. User must approve on install |
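
The Path Traversal row combines `projectRoot`, `path.normalize()`, and `fs.realpathSync()`. One way those pieces could fit together is sketched below. The function name `resolveInsideRoot` and its error messages are hypothetical, and the project's actual guard may differ.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical guard: resolve `requestedPath` under `projectRoot` and throw
// if the result would land outside the root (via ../, an absolute path, or
// a symlink that points elsewhere).
function resolveInsideRoot(projectRoot, requestedPath) {
  const root = fs.realpathSync(projectRoot);
  if (path.isAbsolute(requestedPath)) {
    throw new Error('absolute paths are not allowed');
  }
  const candidate = path.normalize(path.join(root, requestedPath));

  // realpathSync needs an existing path, so resolve symlinks on the deepest
  // existing ancestor and re-append the part that does not exist yet.
  let existing = candidate;
  while (!fs.existsSync(existing)) existing = path.dirname(existing);
  const real = path.join(fs.realpathSync(existing), path.relative(existing, candidate));

  const rel = path.relative(root, real);
  const escapes = rel === '..' || rel.startsWith(`..${path.sep}`) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`path escapes project root: ${requestedPath}`);
  }
  return real;
}
```

Comparing with `path.relative` instead of a string prefix check avoids treating a sibling directory such as `/project-evil` as being inside `/project`.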
Write declarative tests in Markdown, no code required. Create `.agent/tests/your-agent.test.md`:
---
name: your-agent-test
description: "Test suite for your-agent"
---
## Test 1: Agent loads correctly
- assert: config exists
- assert: agent name is "your-agent"
- assert: provider is "openai"
## Test 2: Guardrails active
- assert: path traversal protection enabled
- assert: safe write enabled
- assert: rate limit enabled
## Test 3: Skills loaded
- assert: skill clean-code loaded

Run tests with a single command:
aiyu-multi-agent test # Run all test suites
aiyu-multi-agent test --compliance # 15 spec compliance checks
aiyu-multi-agent test --unit # 41 core module unit tests
aiyu-multi-agent test --production # 25 production module tests
aiyu-multi-agent test --integration # 12 integration tests
aiyu-multi-agent test --watch # Auto-re-run on file changes

| Assertion | What It Checks |
|---|---|
| `config exists` | `.agent/` directory exists and is valid |
| `agent name is "X"` | Agent manifest name matches expected |
| `provider is "X"` | LLM provider configured correctly |
| `guardrails active/enabled` | All security guardrails initialized |
| `tool X available` | Required tool is in agent's tool list |
| `skill X loaded` | Skill directory exists and parses correctly |
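
To make the test file format concrete, the sketch below parses a `.test.md` file into frontmatter, test headings, and `assert:` lines. It is a hypothetical reader written for illustration, not the project's test runner: the function name `parseAgentTests` and the returned shape are assumptions, and the frontmatter handling covers simple `key: value` pairs only.

```javascript
// Hypothetical parser for the declarative test format shown above.
// Returns { meta: {key: value}, tests: [{ title, assertions: [...] }] }.
function parseAgentTests(markdown) {
  const suite = { meta: {}, tests: [] };
  let state = 'start'; // 'start' -> 'frontmatter' -> 'body'
  let current = null;

  for (const raw of markdown.split(/\r?\n/)) {
    const line = raw.trim();

    if (state === 'start') {
      if (line === '') continue;
      if (line === '---') { state = 'frontmatter'; continue; }
      state = 'body'; // no frontmatter: treat this line as body content
    } else if (state === 'frontmatter') {
      if (line === '---') { state = 'body'; continue; }
      const pair = line.match(/^([\w-]+):\s*(.*)$/);
      if (pair) suite.meta[pair[1]] = pair[2].replace(/^"(.*)"$/, '$1');
      continue;
    }

    // Body: "## Title" starts a test, "- assert: ..." adds an assertion.
    const heading = line.match(/^##\s+(.+)$/);
    if (heading) {
      current = { title: heading[1], assertions: [] };
      suite.tests.push(current);
      continue;
    }
    const assertion = line.match(/^-\s*assert:\s*(.+)$/);
    if (assertion && current) current.assertions.push(assertion[1]);
  }
  return suite;
}
```

Each extracted assertion string, such as `provider is "openai"`, would then be matched against a table of known assertion patterns like the one above.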
Install community skills from npm to extend your agents:
aiyu-multi-agent add skill postgres # Install aiyu-multi-agent-skill-postgres
aiyu-multi-agent add skill @org/custom # Scoped packages supported

Skills add new capabilities: database helpers, cloud APIs, testing frameworks, and more. Each skill declares its required permissions, and you approve before installation. `npm install` uses `--ignore-scripts` for safety.
Publish Your Own Skill
- Create npm package `aiyu-multi-agent-skill-<name>`:

aiyu-multi-agent-skill-my-skill/
├── SKILL.md        # Required: metadata + guidelines
├── config.json     # Optional: plugin manifest with permissions
├── scripts/        # Optional: tool functions
└── references/     # Optional: templates, docs

- Publish: `npm publish`
- Users install: `aiyu-multi-agent add skill my-skill`
Add a New Agent
Create .agent/agents/your-agent.md:
---
name: your-agent
description: What this AI agent specializes in
tools: fs.read, search.grep, fs.glob, shell.exec, fs.edit, fs.write
skills: clean-code, architecture
provider: openai
guardrails: true
---
# Your Agent Instructions
Write detailed instructions here for the LLM...

Add a New Skill
aiyu-multi-agent add skill your-skill
Or create manually in `.agent/skills/your-skill/SKILL.md`.

Add a New Rule
Create .agent/rules/your-rules.md:
---
trigger: on_request
keywords: [keyword1, keyword2]
---
# Your Rule Title
Guidelines that auto-trigger when keywords match...

We welcome contributions from the community, whether bug fixes, new agents, skills, or documentation improvements.

| Document | Description |
|---|---|
| CONTRIBUTING.md | Development setup, code style, testing guide |
| SECURITY.md | Vulnerability reporting and security policy |
| CODE_OF_CONDUCT.md | Community standards (Contributor Covenant 2.1) |
| CHANGELOG.md | Full version history and release notes |
| CODEBASE.md | Architecture overview and module documentation |
Quick contribution workflow:
# 1. Fork and clone
git clone https://github.com/YOUR_NAME/aiyu-multi-agent.git
# 2. Create a feature branch
git checkout -b feature/my-awesome-feature
# 3. Make changes and test
npm test
# 4. Commit with conventional commits
git commit -m "feat(agents): add kubernetes-orchestrator agent"
# 5. Push and open a Pull Request
git push origin feature/my-awesome-feature

Apache License 2.0 © 2026 Aiyu MultiAgent Contributors
@teeprakorn1 · @FrameHandsomez
Star us on GitHub · Report Issues · npm Package