From fd273674d5bf3cbd0daf27553411b5a1ad4427bd Mon Sep 17 00:00:00 2001
From: Charles Green
Date: Sun, 3 May 2026 23:24:18 +0900
Subject: [PATCH] =?UTF-8?q?chore:=20remove=20web/=20=E2=80=94=20extracted?=
 =?UTF-8?q?=20to=20its=20own=20repo?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The Hugo source for aixgo.dev now lives at https://github.com/aixgo-dev/web.
Cloudflare Pages auto-deploys from there; aixgo.dev verified live and serving
from the new project (PostHog beacon firing, DNS unchanged).

Removes:
- web/ (entire 127-file Hugo source tree, ~34k lines)
- .github/workflows/website.yml (stale Firebase deploy; production was
  already on Cloudflare Pages with auto-build, this workflow had been
  failing on every push since PR #205 removed Firebase auth)

Updates:
- CLAUDE.md: replace 60-line Website section with one-line stub linking to
  the new repo; fix anchor in TOC
- Makefile: drop web-* delegation targets
- README.md: replace "web/ source code" references with links to
  aixgo-dev/web; add aixgo-dev/aixgate to the Resources block
- .gitleaks.toml: drop web/content/examples/ allowlist path

Net: -34,696 / +8 lines.
Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .github/workflows/website.yml | 71 -
 .gitleaks.toml | 1 -
 CLAUDE.md | 63 +-
 Makefile | 18 -
 web/.env.example | 25 -
 web/.gitignore | 32 -
 web/.htmlhintrc | 16 -
 web/.markdownlint.json | 18 -
 web/Makefile | 48 -
 web/README.md | 69 -
 web/archetypes/default.md | 5 -
 web/config/_default/author.toml | 20 -
 web/config/_default/hugo.toml | 39 -
 web/config/_default/languages.en.toml | 12 -
 web/config/_default/markup.toml | 26 -
 web/config/_default/menus.en.toml | 53 -
 web/config/_default/module.toml | 0
 web/config/_default/params.toml | 11 -
 web/content/_index.md | 371 ----
 web/content/blog/_index.md | 4 -
 web/content/blog/introducing-aixgo.md | 242 ---
 web/content/blog/v0-4-0-release.md | 906 --------
 web/content/blog/v0-5-0-release.md | 251 ---
 web/content/blog/v0-6-0-release.md | 503 -----
 web/content/blog/v0-7-0-release.md | 233 --
 web/content/blog/v0.1.2-december-release.md | 149 --
 web/content/blog/v0.2.0-release.md | 298 ---
 web/content/blog/v0.2.2-public-interfaces.md | 202 --
 .../blog/v0.2.3-and-v0.2.4-phased-startup.md | 244 ---
 .../blog/v0.2.5-monorepo-consolidation.md | 159 --
 web/content/blog/v0.2.6-dependency-updates.md | 206 --
 .../blog/v0.3.3-session-persistence.md | 401 ----
 web/content/examples/README.md | 762 -------
 web/content/examples/agents/aggregator.yaml | 231 --
 web/content/examples/agents/classifier.yaml | 133 --
 web/content/examples/agents/logger.yaml | 72 -
 web/content/examples/agents/planner.yaml | 245 ---
 web/content/examples/agents/producer.yaml | 48 -
 web/content/examples/agents/react.yaml | 105 -
 web/content/examples/classifier-aggregator.md | 727 -------
 .../examples/llm-providers/anthropic.yaml | 216 --
 .../examples/llm-providers/gemini.yaml | 227 --
 .../examples/llm-providers/huggingface.yaml | 309 ---
 .../examples/llm-providers/openai.yaml | 144 --
 .../examples/llm-providers/vertexai.yaml | 292 ---
 web/content/examples/llm-providers/xai.yaml | 280 ---
 web/content/examples/mcp/grpc-transport.yaml | 269 ---
 web/content/examples/mcp/local-transport.yaml | 248 ---
 .../examples/mcp/multiple-servers.yaml | 319 ---
 .../orchestration/classification.yaml | 11 -
 .../examples/orchestration/mapreduce.yaml | 78 -
 .../examples/orchestration/parallel.yaml | 12 -
 .../orchestration/phased-startup.yaml | 89 -
 .../examples/orchestration/planning.yaml | 12 -
 .../examples/orchestration/reflection.yaml | 14 -
 .../examples/orchestration/sequential.yaml | 14 -
 .../examples/security/builtin-api-key.yaml | 79 -
 .../examples/security/delegated-iap.yaml | 117 -
 .../examples/security/disabled-dev.yaml | 77 -
 web/content/examples/security/hybrid.yaml | 161 --
 .../use-cases/content-classifier.yaml | 14 -
 .../use-cases/conversation-memory.yaml | 217 --
 .../use-cases/multi-expert-consensus.yaml | 30 -
 .../examples/use-cases/rag-chatbot.yaml | 139 --
 .../examples/use-cases/semantic-cache.yaml | 120 --
 .../examples/use-cases/simple-chatbot.yaml | 18 -
 .../examples/use-cases/task-planner.yaml | 14 -
 web/content/features-patterns.md | 598 ------
 web/content/features.md | 158 --
 web/content/guides/_index.md | 7 -
 web/content/guides/agent-types.md | 759 -------
 web/content/guides/aws-bedrock.md | 921 --------
 web/content/guides/chat-assistant.md | 534 -----
 web/content/guides/core-concepts.md | 450 ----
 web/content/guides/cost-optimization.md | 683 ------
 web/content/guides/docker-from-scratch.md | 583 -----
 web/content/guides/embeddings.md | 1871 ----------------
 web/content/guides/extending-aixgo.md | 1096 ----------
 .../guides/multi-agent-orchestration.md | 836 --------
 web/content/guides/observability.md | 691 ------
 web/content/guides/pattern-composition.md | 835 --------
 web/content/guides/production-deployment.md | 557 -----
 web/content/guides/provider-comparison.md | 784 -------
 web/content/guides/provider-integration.md | 1389 ------------
 web/content/guides/quick-start.md | 138 --
 web/content/guides/sessions.md | 189 --
 web/content/guides/single-vs-distributed.md | 288 ---
 web/content/guides/type-safety.md | 477 -----
 web/content/guides/using-public-interfaces.md | 506 -----
 web/content/guides/validation-with-retry.md | 695 ------
 web/content/guides/vector-databases.md | 1439 -------------
 web/content/philosophy-condensed.md | 301 ---
 web/content/proverbs.md | 92 -
 web/content/v1-compatibility.md | 331 ---
 web/content/why-aixgo.md | 182 --
 web/data/features.yaml | 457 ----
 web/data/milestones.yaml | 65 -
 web/data/version.yaml | 11 -
 web/layouts/404.html | 24 -
 .../_default/_markup/render-codeblock.html | 9 -
 web/layouts/_default/baseof.html | 28 -
 web/layouts/_default/list.html | 93 -
 web/layouts/_default/single.html | 64 -
 web/layouts/index.html | 360 ----
 web/layouts/index.json | 12 -
 web/layouts/partials/alpha-notice.html | 48 -
 web/layouts/partials/footer.html | 48 -
 web/layouts/partials/head.html | 24 -
 web/layouts/partials/header.html | 45 -
 web/layouts/partials/milestone-cards.html | 44 -
 web/layouts/partials/posthog.html | 16 -
 web/layouts/partials/seo.html | 97 -
 web/layouts/robots.txt | 4 -
 web/layouts/shortcodes/alpha-notice.html | 42 -
 web/layouts/shortcodes/button.html | 3 -
 web/layouts/shortcodes/feature-card.html | 13 -
 web/layouts/shortcodes/feature-grid.html | 3 -
 web/layouts/shortcodes/feature-releases.html | 61 -
 web/layouts/shortcodes/roadmap-timeline.html | 34 -
 web/layouts/shortcodes/status-badge.html | 12 -
 web/package-lock.json | 1161 ----------
 web/package.json | 5 -
 web/static/CNAME | 1 -
 web/static/_headers | 20 -
 web/static/aixgo-logo.png | Bin 14528 -> 0 bytes
 web/static/css/alpha-notices.css | 229 --
 web/static/css/custom.css | 1747 ---------------
 web/static/css/main.css | 1903 -----------------
 web/static/favicon.svg | 4 -
 web/static/js/main.js | 32 -
 web/static/llms.txt | 47 -
 131 files changed, 3 insertions(+), 34692 deletions(-)
 delete mode 100644 .github/workflows/website.yml
 delete mode 100644 web/.env.example
 delete mode 100644 web/.gitignore
 delete mode 100644 web/.htmlhintrc
 delete mode 100644 web/.markdownlint.json
 delete mode 100644 web/Makefile
 delete mode 100644 web/README.md
 delete mode 100644 web/archetypes/default.md
 delete mode 100644 web/config/_default/author.toml
 delete mode 100644 web/config/_default/hugo.toml
 delete mode 100644 web/config/_default/languages.en.toml
 delete mode 100644 web/config/_default/markup.toml
 delete mode 100644 web/config/_default/menus.en.toml
 delete mode 100644 web/config/_default/module.toml
 delete mode 100644 web/config/_default/params.toml
 delete mode 100644 web/content/_index.md
 delete mode 100644 web/content/blog/_index.md
 delete mode 100644 web/content/blog/introducing-aixgo.md
 delete mode 100644 web/content/blog/v0-4-0-release.md
 delete mode 100644 web/content/blog/v0-5-0-release.md
 delete mode 100644 web/content/blog/v0-6-0-release.md
 delete mode 100644 web/content/blog/v0-7-0-release.md
 delete mode 100644 web/content/blog/v0.1.2-december-release.md
 delete mode 100644 web/content/blog/v0.2.0-release.md
 delete mode 100644 web/content/blog/v0.2.2-public-interfaces.md
 delete mode 100644 web/content/blog/v0.2.3-and-v0.2.4-phased-startup.md
 delete mode 100644 web/content/blog/v0.2.5-monorepo-consolidation.md
 delete mode 100644 web/content/blog/v0.2.6-dependency-updates.md
 delete mode 100644 web/content/blog/v0.3.3-session-persistence.md
 delete mode 100644 web/content/examples/README.md
 delete mode 100644 web/content/examples/agents/aggregator.yaml
 delete mode 100644 web/content/examples/agents/classifier.yaml
 delete mode 100644 web/content/examples/agents/logger.yaml
 delete mode 100644 web/content/examples/agents/planner.yaml
 delete mode 100644 web/content/examples/agents/producer.yaml
 delete mode 100644 web/content/examples/agents/react.yaml
 delete mode 100644 web/content/examples/classifier-aggregator.md
 delete mode 100644 web/content/examples/llm-providers/anthropic.yaml
 delete mode 100644 web/content/examples/llm-providers/gemini.yaml
 delete mode 100644 web/content/examples/llm-providers/huggingface.yaml
 delete mode 100644 web/content/examples/llm-providers/openai.yaml
 delete mode 100644 web/content/examples/llm-providers/vertexai.yaml
 delete mode 100644 web/content/examples/llm-providers/xai.yaml
 delete mode 100644 web/content/examples/mcp/grpc-transport.yaml
 delete mode 100644 web/content/examples/mcp/local-transport.yaml
 delete mode 100644 web/content/examples/mcp/multiple-servers.yaml
 delete mode 100644 web/content/examples/orchestration/classification.yaml
 delete mode 100644 web/content/examples/orchestration/mapreduce.yaml
 delete mode 100644 web/content/examples/orchestration/parallel.yaml
 delete mode 100644 web/content/examples/orchestration/phased-startup.yaml
 delete mode 100644 web/content/examples/orchestration/planning.yaml
 delete mode 100644 web/content/examples/orchestration/reflection.yaml
 delete mode 100644 web/content/examples/orchestration/sequential.yaml
 delete mode 100644 web/content/examples/security/builtin-api-key.yaml
 delete mode 100644 web/content/examples/security/delegated-iap.yaml
 delete mode 100644 web/content/examples/security/disabled-dev.yaml
 delete mode 100644 web/content/examples/security/hybrid.yaml
 delete mode 100644 web/content/examples/use-cases/content-classifier.yaml
 delete mode 100644 web/content/examples/use-cases/conversation-memory.yaml
 delete mode 100644 web/content/examples/use-cases/multi-expert-consensus.yaml
 delete mode 100644 web/content/examples/use-cases/rag-chatbot.yaml
 delete mode 100644 web/content/examples/use-cases/semantic-cache.yaml
 delete mode 100644 web/content/examples/use-cases/simple-chatbot.yaml
 delete mode 100644 web/content/examples/use-cases/task-planner.yaml
 delete mode 100644 web/content/features-patterns.md
 delete mode 100644 web/content/features.md
 delete mode 100644 web/content/guides/_index.md
 delete mode 100644 web/content/guides/agent-types.md
 delete mode 100644 web/content/guides/aws-bedrock.md
 delete mode 100644 web/content/guides/chat-assistant.md
 delete mode 100644 web/content/guides/core-concepts.md
 delete mode 100644 web/content/guides/cost-optimization.md
 delete mode 100644 web/content/guides/docker-from-scratch.md
 delete mode 100644 web/content/guides/embeddings.md
 delete mode 100644 web/content/guides/extending-aixgo.md
 delete mode 100644 web/content/guides/multi-agent-orchestration.md
 delete mode 100644 web/content/guides/observability.md
 delete mode 100644 web/content/guides/pattern-composition.md
 delete mode 100644 web/content/guides/production-deployment.md
 delete mode 100644 web/content/guides/provider-comparison.md
 delete mode 100644 web/content/guides/provider-integration.md
 delete mode 100644 web/content/guides/quick-start.md
 delete mode 100644 web/content/guides/sessions.md
 delete mode 100644 web/content/guides/single-vs-distributed.md
 delete mode 100644 web/content/guides/type-safety.md
 delete mode 100644 web/content/guides/using-public-interfaces.md
 delete mode 100644 web/content/guides/validation-with-retry.md
 delete mode 100644 web/content/guides/vector-databases.md
 delete mode 100644 web/content/philosophy-condensed.md
 delete mode 100644 web/content/proverbs.md
 delete mode 100644 web/content/v1-compatibility.md
 delete mode 100644 web/content/why-aixgo.md
 delete mode 100644 web/data/features.yaml
 delete mode 100644 web/data/milestones.yaml
 delete mode 100644 web/data/version.yaml
 delete mode 100644 web/layouts/404.html
 delete mode 100644 web/layouts/_default/_markup/render-codeblock.html
 delete mode 100644 web/layouts/_default/baseof.html
 delete mode 100644 web/layouts/_default/list.html
 delete mode 100644 web/layouts/_default/single.html
 delete mode 100644 web/layouts/index.html
 delete mode 100644 web/layouts/index.json
 delete mode 100644 web/layouts/partials/alpha-notice.html
 delete mode 100644 web/layouts/partials/footer.html
 delete mode 100644 web/layouts/partials/head.html
 delete mode 100644 web/layouts/partials/header.html
 delete mode 100644 web/layouts/partials/milestone-cards.html
 delete mode 100644 web/layouts/partials/posthog.html
 delete mode 100644 web/layouts/partials/seo.html
 delete mode 100644 web/layouts/robots.txt
 delete mode 100644 web/layouts/shortcodes/alpha-notice.html
 delete mode 100644 web/layouts/shortcodes/button.html
 delete mode 100644 web/layouts/shortcodes/feature-card.html
 delete mode 100644 web/layouts/shortcodes/feature-grid.html
 delete mode 100644 web/layouts/shortcodes/feature-releases.html
 delete mode 100644 web/layouts/shortcodes/roadmap-timeline.html
 delete mode 100644 web/layouts/shortcodes/status-badge.html
 delete mode 100644 web/package-lock.json
 delete mode 100644 web/package.json
 delete mode 100644 web/static/CNAME
 delete mode 100644 web/static/_headers
 delete mode 100644 web/static/aixgo-logo.png
 delete mode 100644 web/static/css/alpha-notices.css
 delete mode 100644 web/static/css/custom.css
 delete mode 100644 web/static/css/main.css
 delete mode 100644 web/static/favicon.svg
 delete mode 100644 web/static/js/main.js
 delete mode 100644 web/static/llms.txt

diff --git a/.github/workflows/website.yml b/.github/workflows/website.yml
deleted file mode 100644
index 7ee945e..0000000
--- a/.github/workflows/website.yml
+++ /dev/null
@@ -1,71 +0,0 @@
-name: Website
-
-on:
-  push:
-    branches: [main]
-    paths:
-      - 'web/**'
-  pull_request:
-    branches: [main]
-    paths:
-      - 'web/**'
-  workflow_dispatch:
-
-# Minimum permissions following principle of least privilege
-permissions:
-  contents: read
-
-jobs:
-  build:
-    name: Build
-    runs-on: ubuntu-latest
-    permissions:
-      contents: read
-    steps:
-      - name: Checkout
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-
-      - name: Setup Hugo
-        uses: peaceiris/actions-hugo@75d2e84710de30f6ff7268e08f310b60ef14033f # v3
-        with:
-          hugo-version: 'latest'
-          extended: true
-
-      - name: Build website
-        working-directory: web
-        run: hugo --minify --environment production
-
-      - name: Upload artifact
-        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7
-        with:
-          name: website
-          path: web/public
-          retention-days: 7
-
-  deploy:
-    name: Deploy to Firebase
-    runs-on: ubuntu-latest
-    needs: build
-    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
-    permissions:
-      contents: read
-      checks: write
-      pull-requests: write
-    steps:
-      - name: Checkout
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-
-      - name: Download artifact
-        uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
-        with:
-          name: website
-          path: web/public
-
-      - name: Deploy to Firebase Hosting
-        uses: FirebaseExtended/action-hosting-deploy@e2eda2e106cfa35cdbcf4ac9ddaf6c4756df2c8c # v0
-        with:
-          repoToken: ${{ secrets.GITHUB_TOKEN }}
-          firebaseServiceAccount: ${{ secrets.FIREBASE_SERVICE_ACCOUNT_AIXGO_DEV }}
-          channelId: live
-          projectId: aixgo-dev
-          entryPoint: web
diff --git a/.gitleaks.toml b/.gitleaks.toml
index 6f9f2d2..0481fea 100644
--- a/.gitleaks.toml
+++ b/.gitleaks.toml
@@ -10,7 +10,6 @@ description = "Allowlist for example/test secrets and documentation"
 paths = [
     '''.*_test\.go$''',
     '''.*testutil\.go$''',
-    '''web/content/examples/.*''',
     '''docs/.*\.md$''',
     '''examples/.*''',
     '''config/.*\.yaml$''',
diff --git a/CLAUDE.md b/CLAUDE.md
index 05081d9..3c8e842 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -12,7 +12,7 @@ Quick reference for AI assistants working with Aixgo - a production-grade AI age
 - [Architecture](#architecture)
 - [Code Conventions](#code-conventions)
 - [Key Concepts](#key-concepts)
-- [Website](#website-web)
+- [Website](#website)
 - [Common Tasks](#common-tasks)
 - [Quick Reference](#quick-reference)
 
@@ -443,66 +443,9 @@ export ENVIRONMENT=production
 
 ---
 
-## Website (`web/`)
+## Website
 
-Hugo-based static website for [aixgo.dev](https://aixgo.dev).
-
-### Development
-
-```bash
-cd web
-make dev    # Start dev server at localhost:1313
-make build  # Build for production
-make lint   # Lint markdown content
-```
-
-### Data-Driven Content
-
-Feature matrices and roadmap are driven by YAML data files:
-- `data/features.yaml` - Feature matrix with status indicators (complete/in_progress/roadmap)
-- `data/milestones.yaml` - Development milestones for homepage
-
-### Content Structure
-
-- `content/guides/` - 18+ technical guides (quick-start, agent-types, cost-optimization, etc.)
-- `content/blog/` - Release announcements and blog posts
-- `content/examples/` - YAML configuration examples
-
-### Key Templates
-
-- `layouts/index.html` - Homepage template
-- `layouts/shortcodes/` - Reusable components:
-  - `feature-releases.html` - Feature table renderer
-  - `status-badge.html` - Status indicators (checkmark/construction/roadmap)
-  - `alpha-notice.html` - Alpha warning banner
-
-### Configuration
-
-- `config/_default/hugo.toml` - Main Hugo config (baseURL, language, SEO)
-- `static/_headers` - Cloudflare Pages cache-control rules
-
-### Deployment
-
-Hosted on **Cloudflare Pages**. Automatic deployment on push to `main`:
-
-1. Cloudflare Pages detects the push, builds with `cd web && hugo --minify`, output dir `web/public`
-2. Deploys to the production custom domain (`aixgo.dev`)
-3. Pull-request branches get preview deployments at `.aixgo.pages.dev`
-
-Required environment variables (set in the Cloudflare Pages project, not in this repo):
-
-- `HUGO_VERSION` — Hugo version pin (matches local `make build`)
-- `HUGO_POSTHOG_KEY` — PostHog Project API Key (`phc_...`); analytics are gated on this being set
-- `HUGO_POSTHOG_HOST` — optional, defaults to `https://us.i.posthog.com`
-
-Manual local build: `cd web && make build` (output in `web/public/`).
-
-### Key Conventions
-
-- **Data files first**: Always update `features.yaml` and `milestones.yaml` rather than hardcoding content
-- **Ordered lists**: Use `1.` numbering throughout (markdownlint rule)
-- **File naming**: kebab-case (e.g., `provider-integration.md`)
-- **Code blocks**: Always specify language for syntax highlighting
+The aixgo.dev website source lives in its own repo: <https://github.com/aixgo-dev/web>. Cloudflare Pages auto-deploys on push to `main`. The site is no longer part of this monorepo.
 
 ---
 
diff --git a/Makefile b/Makefile
index 242009a..26b4fe0 100644
--- a/Makefile
+++ b/Makefile
@@ -65,22 +65,4 @@ install: ## Install the aixgo binary
 
 check: fmt vet lint test ## Run all checks (fmt, vet, lint, test)
 
-# =============================================================================
-# Web targets (delegated to web/Makefile)
-# =============================================================================
-
-.PHONY: web-dev web-build web-clean web-lint
-
-web-dev: ## Start Hugo development server
-	$(MAKE) -C web dev
-
-web-build: ## Build Hugo site for production
-	$(MAKE) -C web build
-
-web-clean: ## Clean web build artifacts
-	$(MAKE) -C web clean
-
-web-lint: ## Lint web content
-	$(MAKE) -C web lint
-
 .DEFAULT_GOAL := help
diff --git a/web/.env.example b/web/.env.example
deleted file mode 100644
index d4f4047..0000000
--- a/web/.env.example
+++ /dev/null
@@ -1,25 +0,0 @@
-# Firebase Configuration for Aixgo Website
-# Copy this file to .env and fill in your values
-# These values are used by Hugo during build to configure Firebase
-# Note: HUGO_ prefix is required for Hugo's security policy
-
-# Firebase API Key
-HUGO_FIREBASE_API_KEY=
-
-# Firebase Auth Domain
-HUGO_FIREBASE_AUTH_DOMAIN=
-
-# Firebase Project ID
-HUGO_FIREBASE_PROJECT_ID=
-
-# Firebase Storage Bucket
-HUGO_FIREBASE_STORAGE_BUCKET=
-
-# Firebase Messaging Sender ID
-HUGO_FIREBASE_MESSAGING_SENDER_ID=
-
-# Firebase App ID
-HUGO_FIREBASE_APP_ID=
-
-# Firebase Measurement ID (Google Analytics)
-HUGO_FIREBASE_MEASUREMENT_ID=
diff --git a/web/.gitignore b/web/.gitignore
deleted file mode 100644
index 49dd02f..0000000
--- a/web/.gitignore
+++ /dev/null
@@ -1,32 +0,0 @@
-# Hugo build directories
-/public/
-/resources/_gen/
-/assets/jsconfig.json
-
-# Hugo binary (downloaded during Cloud Build)
-hugo
-hugo.tar.gz
-
-# Hugo cache
-.hugo_build.lock
-
-# Environment variables (use .env.example as template)
-.env
-
-# OS files
-.DS_Store
-Thumbs.db
-
-# Editor directories
-.vscode/
-.idea/
-*.swp
-*.swo
-*~
-
-# Node modules (if using npm/yarn)
-node_modules/
-
-# Temporary files
-*.tmp
-*.log
diff --git a/web/.htmlhintrc b/web/.htmlhintrc
deleted file mode 100644
index ea63124..0000000
--- a/web/.htmlhintrc
+++ /dev/null
@@ -1,16 +0,0 @@
-{
-  "tagname-lowercase": true,
-  "attr-lowercase": true,
-  "attr-value-double-quotes": false,
-  "doctype-first": true,
-  "tag-pair": true,
-  "spec-char-escape": false,
-  "id-unique": true,
-  "src-not-empty": true,
-  "attr-no-duplication": true,
-  "alt-require": true,
-  "space-tab-mixed-disabled": false,
-  "inline-style-disabled": false,
-  "inline-script-disabled": false,
-  "id-class-ad-disabled": true
-}
diff --git a/web/.markdownlint.json b/web/.markdownlint.json
deleted file mode 100644
index f2d8271..0000000
--- a/web/.markdownlint.json
+++ /dev/null
@@ -1,18 +0,0 @@
-{
-  "default": true,
-  "MD013": false,
-  "MD022": false,
-  "MD024": false,
-  "MD025": false,
-  "MD026": false,
-  "MD031": false,
-  "MD032": false,
-  "MD033": false,
-  "MD034": false,
-  "MD036": false,
-  "MD041": false,
-  "MD052": false,
-  "MD055": false,
-  "MD056": false,
-  "MD060": false
-}
diff --git a/web/Makefile b/web/Makefile
deleted file mode 100644
index a6ef25b..0000000
--- a/web/Makefile
+++ /dev/null
@@ -1,48 +0,0 @@
-.PHONY: dev serve build clean lint lint-md lint-html lint-install
-
-# Load environment variables from .env if it exists
-ifneq (,$(wildcard ./.env))
-  include .env
-  export
-endif
-
-# Development server with fast render disabled for accurate previews
-dev:
-	hugo server --disableFastRender
-
-# Alias for dev
-serve: dev
-
-# Build the site for production
-build:
-	hugo --minify --environment production
-
-# Clean generated files
-clean:
-	rm -rf public/ resources/
-
-# =============================================================================
-# Linting
-# =============================================================================
-
-# Install linting dependencies (run once)
-lint-install:
-	npm install -g markdownlint-cli2 htmlhint
-
-# Lint all (markdown + HTML if built)
-lint: lint-md
-	@if [ -d "public" ]; then \
-		$(MAKE) lint-html; \
-	else \
-		echo "Tip: Run 'make build' first, then 'make lint-html' to lint generated HTML"; \
-	fi
-
-# Lint Markdown files in content directory
-lint-md:
-	@echo "Linting Markdown files..."
-	markdownlint-cli2 "content/**/*.md"
-
-# Lint generated HTML (requires build first)
-lint-html:
-	@echo "Linting HTML files..."
-	htmlhint "public/**/*.html" --config .htmlhintrc
diff --git a/web/README.md b/web/README.md
deleted file mode 100644
index 86787dd..0000000
--- a/web/README.md
+++ /dev/null
@@ -1,69 +0,0 @@
-# Aixgo Website
-
-Hugo-based static website for [aixgo.dev](https://aixgo.dev).
-
-## Development
-
-```bash
-# Install Hugo (macOS)
-brew install hugo
-
-# Install linting tools (one-time)
-make lint-install
-
-# Start development server
-make dev
-# Access at http://localhost:1313/
-
-# Build for production
-make build
-
-# Lint content
-make lint
-```
-
-## Structure
-
-- `content/` - Markdown content (guides, blog, examples)
-- `layouts/` - Hugo templates and shortcodes
-- `static/` - Static assets (CSS, JS, images)
-- `data/` - YAML data files (features.yaml, milestones.yaml)
-- `config/` - Hugo configuration
-
-## Deployment
-
-Automatically deployed via Google Cloud Build on push to main:
-
-1. Cloud Build detects changes in `web/**`
-2. Builds Hugo site with `--minify`
-3. Deploys to Firebase Hosting
-
-Manual deployment:
-
-```bash
-make build
-firebase deploy --only hosting --project aixgo-dev
-```
-
-## Content Management
-
-### Adding/Updating Features
-
-Edit `data/features.yaml` to update the feature matrix on the Features page.
-
-### Creating Guides
-
-1. Create `content/guides/my-guide.md`
-2. Add front matter (title, description, weight)
-3. Write content in Markdown
-
-### Blog Posts
-
-1. Create `content/blog/my-post.md`
-2. Add front matter with date and tags
-3. Write content
-
-## Related
-
-- [Main Aixgo Documentation](../docs/)
-- [Production Examples](../examples/)
diff --git a/web/archetypes/default.md b/web/archetypes/default.md
deleted file mode 100644
index 25b6752..0000000
--- a/web/archetypes/default.md
+++ /dev/null
@@ -1,5 +0,0 @@
-+++
-date = '{{ .Date }}'
-draft = true
-title = '{{ replace .File.ContentBaseName "-" " " | title }}'
-+++
diff --git a/web/config/_default/author.toml b/web/config/_default/author.toml
deleted file mode 100644
index a8d5031..0000000
--- a/web/config/_default/author.toml
+++ /dev/null
@@ -1,20 +0,0 @@
-name = "Charles Green"
-headline = "Systems, not guesswork. | Lean. Fast. Secure AI."
-
-[[links]]
-email = "team@aixgo.dev"
-
-[[links]]
-github = "https://github.com/charlesgreen"
-
-[[links]]
-link = "https://charles.green"
-
-[[links]]
-linkedin = "https://linkedin.com/in/charlesgreen"
-
-[[links]]
-x-twitter = "https://x.com/charles_green"
-
-[[links]]
-youtube = "https://www.youtube.com/@charles-green"
diff --git a/web/config/_default/hugo.toml b/web/config/_default/hugo.toml
deleted file mode 100644
index 475735f..0000000
--- a/web/config/_default/hugo.toml
+++ /dev/null
@@ -1,39 +0,0 @@
-# -- Site Configuration --
-baseURL = "https://aixgo.dev/"
-languageCode = "en-us"
-title = "Aixgo - Production AI Agents in Go"
-defaultContentLanguage = "en"
-
-enableRobotsTXT = true
-summaryLength = 30
-
-buildDrafts = false
-buildFuture = false
-
-enableEmoji = true
-
-# googleAnalytics = "G-XXXXXXXXX"
-
-[pagination]
-  pagerSize = 10
-
-[imaging]
-  anchor = 'Center'
-
-[taxonomies]
-  tag = "tags"
-  category = "categories"
-  author = "authors"
-
-[sitemap]
-  changefreq = 'daily'
-  filename = 'sitemap.xml'
-  priority = 0.5
-
-[outputs]
-  home = ["HTML", "RSS", "JSON"]
-
-[params]
-  description = "Production-grade AI agent framework for Go. Session persistence, 8+ LLM providers, 13 orchestration patterns, enterprise security. Single <20MB binary, <100ms startup."
-  author = "Charles Green"
-  github = "https://github.com/aixgo-dev/aixgo"
diff --git a/web/config/_default/languages.en.toml b/web/config/_default/languages.en.toml
deleted file mode 100644
index 204aed8..0000000
--- a/web/config/_default/languages.en.toml
+++ /dev/null
@@ -1,12 +0,0 @@
-disabled = false
-languageCode = "en"
-languageName = "English"
-weight = 1
-title = "Aixgo"
-
-[params]
-displayName = "EN"
-isoCode = "en"
-rtl = false
-dateFormat = "2 January 2006"
-description = "AI-native agent framework for Go."
diff --git a/web/config/_default/markup.toml b/web/config/_default/markup.toml
deleted file mode 100644
index 3eb8d83..0000000
--- a/web/config/_default/markup.toml
+++ /dev/null
@@ -1,26 +0,0 @@
-# -- Markup --
-# These settings are required for the theme to function.
-
-[goldmark]
-  [goldmark.parser]
-    wrapStandAloneImageWithinParagraph = false
-
-    [goldmark.parser.attribute]
-      block = true
-
-  [goldmark.renderer]
-    unsafe = true
-
-  [goldmark.extensions]
-    [goldmark.extensions.passthrough]
-      enable = true
-      [goldmark.extensions.passthrough.delimiters]
-        block = [['\[', '\]'], ['$$', '$$']]
-        inline = [['\(', '\)']]
-
-[highlight]
-  noClasses = false
-
-[tableOfContents]
-  startLevel = 2
-  endLevel = 4
diff --git a/web/config/_default/menus.en.toml b/web/config/_default/menus.en.toml
deleted file mode 100644
index d9f7549..0000000
--- a/web/config/_default/menus.en.toml
+++ /dev/null
@@ -1,53 +0,0 @@
-# -- Main Menu --
-
-[[main]]
-  name = "Docs"
-  url = "/docs"
-  weight = 7
-
-[[main]]
-  name = "Blog"
-  pageRef = "blog"
-  weight = 20
-
-[[main]]
-  name = "GitHub"
-  url = "https://github.com/aixgo-dev/aixgo"
-  weight = 25
-
-# -- Footer Menu --
-
-[[footer]]
-  name = "Proverbs"
-  pageRef = "proverbs"
-  weight = 3
-
-[[footer]]
-  name = "Features"
-  pageRef = "features"
-  weight = 5
-
-[[footer]]
-  name = "Why Aixgo"
-  pageRef = "why-aixgo"
-  weight = 7
-
-[[footer]]
-  name = "GitHub"
-  url = "https://github.com/aixgo-dev/aixgo"
-  weight = 10
-
-[[footer]]
-  name = "Discussions"
-  url = "https://github.com/aixgo-dev/aixgo/discussions"
-  weight = 20
-
-[[footer]]
-  name = "Roadmap"
-  url = "https://github.com/orgs/aixgo-dev/projects/1"
-  weight = 30
-
-[[footer]]
-  name = "Report an Issue"
-  url = "https://github.com/aixgo-dev/aixgo/issues"
-  weight = 40
diff --git a/web/config/_default/module.toml b/web/config/_default/module.toml
deleted file mode 100644
index e69de29..0000000
diff --git a/web/config/_default/params.toml b/web/config/_default/params.toml
deleted file mode 100644
index 9953129..0000000
--- a/web/config/_default/params.toml
+++ /dev/null
@@ -1,11 +0,0 @@
-# -- Aixgo Site Parameters --
-
-# Site description
-description = "Production-grade AI agent framework built for Go developers"
-author = "Charles Green"
-
-# Social links
-github = "https://github.com/aixgo-dev/aixgo"
-
-# Footer text
-footerText = "MIT Licensed · Open Source · Community Driven"
diff --git a/web/content/_index.md b/web/content/_index.md
deleted file mode 100644
index ea3ab5e..0000000
--- a/web/content/_index.md
+++ /dev/null
@@ -1,371 +0,0 @@
----
-title: 'Aixgo - AI Agents in Pure Go'
-description: 'Production-ready AI agent framework for Go developers. Build multi-agent systems with 6 LLM providers, enterprise security, and full observability.'
-layout: 'page'
-keywords: ['go ai agent framework', 'golang ai framework', 'ai agents golang', 'multi-agent systems', 'langchain alternative go']
----
-
- -# The Future of AI Agents is Go - -

Build production AI systems as single binaries. 6 LLM providers, enterprise security, full observability.

- -
- -{{< button href="https://github.com/aixgo-dev/aixgo#quick-start" target="_blank" class="cta-primary cta-massive" >}} Get Started → {{< /button >}} - -
- -
- ---- - -

From Prototype to Planet-Scale in 60 Seconds

- -
-
Install
- -```bash -go get github.com/aixgo-dev/aixgo -``` - -
- -
-
Create config/agents.yaml
- -```yaml -supervisor: - name: coordinator - model: gpt-4-turbo - max_rounds: 10 - -agents: - - name: data-producer - role: producer - interval: 1s - outputs: - - target: analyzer - - - name: analyzer - role: react - model: gpt-4-turbo - prompt: | - You are a data analyst. Analyze incoming data and provide insights. - inputs: - - source: data-producer - outputs: - - target: logger - - - name: logger - role: logger - inputs: - - source: analyzer -``` - -
- -
-
Create main.go
- -```go -package main - -import ( - "github.com/aixgo-dev/aixgo" - _ "github.com/aixgo-dev/aixgo/agents" -) - -func main() { - if err := aixgo.Run("config/agents.yaml"); err != nil { - panic(err) - } -} -``` - -
- -
-
Deploy anywhere
- -```bash -# Local development -go run main.go - -# Production - single <20MB binary -go build -o agent -./agent - -# Edge, Lambda, Cloud Run, Kubernetes - one binary, zero configuration -``` - -
- -
- -{{< button href="https://github.com/aixgo-dev/aixgo" target="_blank" >}} View Full Documentation → {{< /button >}} - -
- ---- - -
- -

Why the Industry is Moving to Go

- -

Python dominated AI because it was easy to prototype. Go will dominate production because it's built to ship. Read our [Philosophy](/why-aixgo) to understand why production AI deserves production tooling.

- -
- -
-

Container Size

- Python frameworks: 1.2GB with dependencies
- Aixgo: <20MB single binary
- Impact: deploy to edge devices, serverless, anywhere
- -
-

Startup Performance

- Python frameworks: 30-45s cold start
- Aixgo: <100ms instant startup
- Impact: true serverless viability, real-time response
- -
-

Runtime Safety

- Python frameworks: runtime, errors discovered in production
- Aixgo: compile-time, compiler catches errors before deploy
- Impact: ship with confidence, sleep at night
- -
-

LLM Data Validation

- Python + Pydantic: runtime only, type changes found in production
- Aixgo: compile-time, type changes caught before deploy, auto-retry on LLM errors
- Impact: refactor with confidence, LLM errors auto-recover
- -
- -
- ---- - -
- -## Production-Ready Features - -Aixgo provides a comprehensive feature set for building production AI agent systems. - -### LLM Providers (6 Cloud + Local Available) - -- **OpenAI** - Chat, streaming SSE, function calling, JSON mode -- **Anthropic (Claude)** - Messages API, streaming SSE, tool use -- **Google Gemini** - GenerateContent API, streaming SSE, function calling -- **X.AI (Grok)** - Chat, streaming SSE, function calling (OpenAI-compatible) -- **Vertex AI** - Google Cloud AI Platform, streaming SSE, function calling -- **HuggingFace** - Free Inference API, cloud backends -- **Ollama** - Local models (phi, llama, mistral, gemma), zero API costs, enterprise security - -### Security (Enterprise-Grade) - -- **Authentication Framework** - 4 modes: disabled, delegated, builtin, hybrid -- **RBAC Authorization** - Role-based access control -- **Rate Limiting** - Token bucket, per-tool/per-user limits -- **Prompt Injection Protection** - 5 categories, 25+ detection patterns -- **TLS/mTLS Support** - Secure communications -- **Audit Logging** - Elasticsearch, Splunk HEC, Webhook backends -- **Safe YAML Parser** - Secure configuration handling - -### Agent System & Orchestration - -- **Six Agent Types** - Producer, ReAct, Logger, Classifier, Aggregator, Planner -- **Classifier Strategies** - ZeroShot, FewShot, MultiLabel, SingleLabel -- **Aggregator Strategies** - Consensus, Weighted, Semantic, Hierarchical, RAG -- **Planner Strategies** - Chain-of-Thought, Tree-of-Thought, ReAct, MonteCarlo, Backward Chaining, Hierarchical -- **Supervisor Orchestration** - Multi-agent coordination -- **Patterns** - Parallel, Sequential, Reflection, MapReduce -- **Workflow Engine** - Complex workflow orchestration -- **Persistence** - FileStore, MemoryStore, checkpoints - -### MCP (Model Context Protocol) - -- **Transports** - Local and gRPC -- **Service Discovery** - Static, DNS, Kubernetes, Consul -- **Cluster Coordination** - Load balancing, health checks, failover -- **Dynamic 
Tool Registration** - Runtime tool management - -### Observability - -- **OpenTelemetry** - OTLP export, distributed tracing -- **Langfuse Integration** - LLM-specific analytics via OTLP -- **Prometheus Metrics** - Production monitoring -- **Health Checks** - Liveness and readiness probes - -### Deployment - -- **Containers** - Dockerfile, Docker Compose -- **Cloud** - Cloud Run, Kubernetes (Kustomize) -- **CI/CD** - GitHub Actions workflows -- **Testing** - Unit tests, E2E tests, benchmarking, linting - -### Planned Features - -- **Vision Support** - Anthropic multimodal (planned) -- **Kubernetes Operator** - Native K8s orchestration (planned) -- **Terraform IaC** - Infrastructure as code (planned) - -{{< button href="https://github.com/aixgo-dev/aixgo/blob/main/ROADMAP.md" target="_blank" >}} View Full Roadmap → {{< /button >}} - -
- ---- - -## The Go Advantage - -| What Matters in Production | Python Frameworks | Aixgo | -| -------------------------- | ----------------------------- | ------------------------------ | -| **Deploy Anywhere** | 1GB+ containers, complex deps | <20MB binary, zero deps | -| **Cold Start Speed** | 10-45 seconds | <100ms | -| **Type Safety** | Runtime discovery | Compile-time guarantees | -| **Concurrency** | GIL bottleneck | Native parallelism | -| **Scaling Pattern** | Rewrite for distribution | Same code, local → distributed | -| **Security Surface** | 200+ dependencies | ~10 vetted packages | -| **Memory Efficiency** | GC pressure, leaks common | Predictable, efficient GC | -| **Operational Cost** | High compute overhead | 60-70% infrastructure savings | - ---- - -
- -

Built for What's Next

- -

Aixgo isn't just another framework. It's the foundation for the next generation of AI systems:

- -
-

Multi-Agent Orchestration {{< status-badge status="available" >}}

-

Coordinate specialized agents with built-in supervisor patterns. Six agent types (Producer, ReAct, Logger, Classifier, Aggregator, Planner) with multiple strategies. Support for parallel/sequential/reflection/MapReduce workflows.

-
- -
-

Edge to Cloud, Seamlessly {{< status-badge status="available" >}}

-

Deploy the same <20MB binary everywhere - edge devices, serverless, Kubernetes. Local transport and gRPC for distributed systems.

-
- -
-

Observable by Default {{< status-badge status="available" >}}

-

OpenTelemetry with OTLP export, Langfuse integration, Prometheus metrics, and health checks built-in.

-
- -
-

Vector Databases & RAG {{< status-badge status="available" >}}

-

Build production-ready RAG systems with semantic search. Collection-based architecture with Firestore, memory stores, and extensible provider support.

-
- -
-

LLM Provider Freedom {{< status-badge status="available" >}}

-

6 providers with streaming: OpenAI, Anthropic, Google Gemini, X.AI Grok, Vertex AI, HuggingFace. Switch providers with config changes.

-
- -
-

Enterprise Security {{< status-badge status="available" >}}

-

Authentication (4 modes), RBAC, rate limiting, prompt injection protection, TLS/mTLS, and audit logging.

-
- -
- ---- - -
- -

Roadmap

- -
- -
-
Planned
-
- Vision support for Anthropic multimodal
- Kubernetes operator for agent orchestration
- Terraform IaC modules
- Vector database integrations (pgvector, Qdrant)
-
- -
-
Future
-
- Agent federation across organizations
- Formal verification tooling
- WASM target for browser agents
- GPU acceleration support
-
- -
- -
- ---- - -## Get Started - -The AI infrastructure built on Python is reaching its limits. Aixgo is what comes next. - -{{< button href="https://github.com/aixgo-dev/aixgo#quick-start" target="_blank" class="cta-primary cta-massive" >}} Get Started → {{< /button >}} - -**MIT Licensed** · Open Source · Community Driven - -[GitHub](https://github.com/aixgo-dev/aixgo) · [Documentation](https://github.com/aixgo-dev/aixgo#readme) · [Discussions](https://github.com/aixgo-dev/aixgo/discussions) · -[Roadmap](https://github.com/aixgo-dev/aixgo/issues) diff --git a/web/content/blog/_index.md b/web/content/blog/_index.md deleted file mode 100644 index 817aa1d..0000000 --- a/web/content/blog/_index.md +++ /dev/null @@ -1,4 +0,0 @@ ---- -title: 'Blog' -description: 'Updates, insights, and deep dives into production AI systems.' ---- diff --git a/web/content/blog/introducing-aixgo.md b/web/content/blog/introducing-aixgo.md deleted file mode 100644 index 5e4b7b9..0000000 --- a/web/content/blog/introducing-aixgo.md +++ /dev/null @@ -1,242 +0,0 @@ ---- -title: 'Introducing Aixgo: AI Agents in Pure Go (Alpha Release)' -date: 2025-11-16 -draft: false -description: 'Aixgo alpha release - AI agent framework for Go developers. Build and test multi-agent systems today. Production release late 2025.' -tags: ['go', 'ai agents', 'alpha', 'langchain alternative', 'multi-agent systems'] -categories: ['Announcement', 'Technical'] -author: 'Charles Green' -showAuthor: true ---- - -Your AI agent doesn't need 1.5GB to say hello. - -Python frameworks produce massive containers—1GB+, 30-second cold starts, and dependency hell. They're built for research, not production systems that need to ship and scale. - -Today, we're launching **Aixgo alpha**—an AI agent framework built for Go developers who refuse to compromise on performance, security, or simplicity. - -## Why Aixgo? 
- -Python excels at AI research and prototyping, but production deployments reveal critical limitations: - -- **Bloated deployments** - 1GB+ containers with 200+ dependencies -- **Runtime surprises** - Type errors caught in production, not compile time -- **GIL limitations** - No true parallelism -- **Scaling complexity** - Manual orchestration overhead -- **Security vulnerabilities** - Large attack surface - -Aixgo exists because **production AI deserves production tooling.** Go developers shouldn't abandon their stack's strengths just to build AI agents. - -## Core Principles - -### 1. Single Binary Simplicity - -Deploy AI agents in <20MB binaries with zero runtime dependencies. - -```bash -# Python AI service -FROM python:3.11 -COPY requirements.txt . -RUN pip install -r requirements.txt # 1.2GB later... -COPY . . -CMD ["python", "main.py"] - -# Aixgo service -FROM scratch -COPY aixgo-agent / -CMD ["/aixgo-agent"] # <20MB total -``` - -### 2. Type-Safe Architecture - -Catch errors at compile time. Go's type system enforces contracts between agents, tools, and workflows. - -```go -// This won't compile - caught before deployment -agent := aixgo.NewAgent( - aixgo.WithName("analyzer"), - aixgo.WithModel(123), // Type error: expected string, got int -) -``` - -### 3. Seamless Scaling - -Start with Go channels locally. Scale to distributed agents with gRPC. **Same code, zero changes.** - -```go -// This code works locally AND distributed -supervisor := aixgo.NewSupervisor("coordinator") -supervisor.AddAgent(producer) -supervisor.AddAgent(analyzer) -supervisor.Run() // Local: channels, Distributed: gRPC -``` - -## Quick Example - -```go -package main - -import ( - "github.com/aixgo-dev/aixgo" - _ "github.com/aixgo-dev/aixgo/agents" -) - -func main() { - if err := aixgo.Run("config/agents.yaml"); err != nil { - panic(err) - } -} -``` - -Configure agents declaratively in YAML. See the [Quick Start Guide](/guides/quick-start) for details. - -## Production Performance - -
- -| Metric | Python (LangChain) | Aixgo | Improvement | -| ---------------- | ------------------ | ------------- | ---------------------- | -| Container Size | 1.2GB | <20MB | **60x smaller** | -| Cold Start | 45 seconds | <100ms | **450x faster** | -| Throughput | 500-1,000 req/s | 10,000 req/s | **10-20x higher** | -| Memory Footprint | 512MB baseline | 50MB baseline | **10x more efficient** | -| Dependencies | 200+ packages | ~10 packages | **95% fewer** | - -{.advantage-table} - -
- -## Production Features - -**Observability:** OpenTelemetry integration, Langfuse, Prometheus, distributed tracing, health checks, structured logging. - -**Security:** Auth framework, RBAC, rate limiting, prompt injection protection, TLS/mTLS, audit logging, JWT verification. - -**Infrastructure:** Model Context Protocol (MCP) support with local and gRPC transports, service discovery, dynamic tool registration. - -**Reliability:** Circuit breakers, retry with exponential backoff, graceful degradation, workflow persistence. - -No instrumentation code required—configure and deploy: - -```yaml -observability: - tracing: true - service_name: 'my-agent-system' - exporter: 'otlp' - -security: - auth: - enabled: true - provider: 'jwt' - rate_limiting: - enabled: true - requests_per_minute: 1000 -``` - -## Supported Integrations - -**LLM Providers:** OpenAI, Anthropic (Claude), Google (Vertex AI, Gemini), xAI (Grok), HuggingFace - -**Vector Databases:** Firestore Vector Search, In-Memory Storage, Qdrant (in progress), pgvector (in progress) - -**Observability:** OpenTelemetry, Langfuse, Prometheus, Grafana, Datadog, New Relic - -## Use Cases - -- **Data Pipelines** - High-throughput ETL with inline AI classification and enrichment -- **Production APIs** - Sub-millisecond P99 latency AI endpoints -- **Edge Deployment** - Run on IoT gateways, edge servers, embedded systems -- **Multi-Agent Research** - Coordinate complex workflows with supervisor orchestration -- **Distributed Networks** - Scale from single instance to multi-region deployment - -## Current Status: Alpha - -**What's ready today:** - -- Multi-agent orchestration with 13 patterns (supervisor, sequential, parallel, router, swarm, hierarchical, RAG, reflection, ensemble, classifier, aggregation, planning, - MapReduce) -- Seven agent types: Producer, ReAct, Logger, Classifier, Aggregator, Planner, Custom -- YAML-based declarative configuration with validation -- Local and distributed execution (Go channels, 
gRPC/MCP) -- Complete observability suite: OpenTelemetry, Langfuse, Prometheus -- Enterprise security: Auth, RBAC, rate limiting, TLS/mTLS -- Production deployment: Docker, Cloud Run, Kubernetes -- 6 LLM providers with streaming support -- Circuit breakers, retry logic, workflow persistence - -**In active development:** - -- Vector database integrations (Qdrant, pgvector) -- Long-term memory and personalization -- Multi-modal capabilities (vision, audio, document parsing) - -Expect breaking changes as we evolve the API based on feedback. - -## When to Choose Aixgo - -Choose Aixgo when: - -- Deploying AI agents to production, not experimenting -- Your team uses Go for backend services -- Container size and cold start time matter -- You need type safety and compile-time error detection -- You want to avoid Python dependency overhead -- You're building distributed multi-agent systems - -Choose Python frameworks when: - -- Doing exploratory research or rapid prototyping -- Need access to Python's ML ecosystem -- Your team doesn't have Go experience -- You need features Aixgo doesn't support yet - -## Getting Started - -Follow our [Quick Start Guide](/guides/quick-start) to get running in 5 minutes. Explore the [Features](/features) and join -[GitHub Discussions](https://github.com/aixgo-dev/aixgo/discussions). - -## Roadmap - -### Beta Release (Q4 2025) - -- Complete vector database integrations (Qdrant, pgvector) -- Long-term memory and personalization -- Enhanced error handling and validation -- Production battle-testing - -### v1.0 Production Release (Q1 2026) - -- API stability guarantees with semantic versioning -- Kubernetes operator -- Multi-region deployment with state replication -- Terraform modules -- Multi-modal capabilities -- Performance benchmarking suite -- Production SLA commitments - -See our [v1.0 Compatibility Guarantee](/v1-compatibility) for API stability details. 
- -## Our Philosophy - -Aixgo is built on a simple belief: **production AI deserves production tooling.** - -We're not trying to out-prototype Python. We're trying to out-ship it. Production-first design, single binary simplicity, type safety, Go-native patterns, observable by default, -open source (MIT licensed). - -Read our complete [Philosophy](/why-aixgo) for design principles and decision criteria. - -## Join Us - -**Links:** - -- GitHub: [github.com/aixgo-dev/aixgo](https://github.com/aixgo-dev/aixgo) -- Discussions: [github.com/aixgo-dev/aixgo/discussions](https://github.com/aixgo-dev/aixgo/discussions) -- Documentation: [aixgo.dev](https://aixgo.dev) - -Drop into [GitHub Discussions](https://github.com/aixgo-dev/aixgo/discussions) with questions or feedback. - ---- - -**Where Python prototypes go to die in production, Go agents ship and scale.** - -Welcome to Aixgo. diff --git a/web/content/blog/v0-4-0-release.md b/web/content/blog/v0-4-0-release.md deleted file mode 100644 index baeb4aa..0000000 --- a/web/content/blog/v0-4-0-release.md +++ /dev/null @@ -1,906 +0,0 @@ ---- -title: "Aixgo v0.4.0: Go 1.26, Advanced Planning Strategies, Enhanced Security, and RAG Variants" -date: 2026-02-13 -description: "Major release with Go 1.26 upgrade, 6 advanced planner strategies (MCTS, Tree-of-Thought, ReAct Planning), 4 RAG pattern variants, JWT verification, file-based API keys, and typed MCP tool registration" -tags: ["release", "go-1.26", "planning", "rag", "security", "mcp"] -author: "Aixgo Team" ---- - -We're thrilled to announce **Aixgo v0.4.0**, our most significant release yet. This version brings enterprise-grade planning capabilities with 6 advanced strategies, 4 RAG pattern -variants for sophisticated retrieval workflows, critical security enhancements with full JWT verification, and a major platform upgrade to Go 1.26. 
Whether you're building complex -task planners, knowledge-intensive applications, or secure multi-tenant systems, v0.4.0 delivers the tools you need. - -## What's New in v0.4.0 - -**Quick Links:** - -1. [Go 1.26 Upgrade](#1-go-126-upgrade-major-platform-update) - Modern Go with performance improvements -1. [Security Enhancements](#2-security-enhancements-critical) - JWT verification and file-based API keys -1. [Planner Agent](#3-planner-agent-6-advanced-strategies) - MCTS, Tree-of-Thought, ReAct Planning, and more -1. [RAG Pattern Variants](#4-rag-pattern-4-variants) - Conversational, Multi-Query, and Hybrid RAG -1. [Reflection Pattern](#5-reflection-pattern-improvements) - Multi-critic aggregation with quality scoring -1. [MCP Tools Enhancement](#6-mcp-tools-typed-registration) - Type-safe tool registration with generics - -### 1. Go 1.26 Upgrade (Major Platform Update) - -Aixgo now requires **Go 1.26** or later, bringing significant performance improvements and modern language features. - -**What Changed:** - -- **Updated Dependencies** - All 17 core dependencies upgraded to latest versions -- **Docker Images** - All Dockerfiles now use `golang:1.26-alpine` base image -- **CI/CD Pipelines** - GitHub Actions workflows updated for Go 1.26 -- **Code Modernization** - Leveraging Go 1.26 features like `maps.Copy()` and consistent `any` type usage - -**Why Go 1.26:** - -- **Better Performance** - Improved compiler optimizations and runtime efficiency -- **Enhanced Security** - Latest security patches and vulnerability fixes -- **Modern Language Features** - Cleaner code with improved type inference -- **Ecosystem Support** - Latest versions of gRPC, OpenTelemetry, and other critical dependencies - -**Upgrade Path:** - -```bash -# Install Go 1.26 -# Visit https://go.dev/dl/ for installation instructions - -# Verify version -go version # Should show go1.26 or later - -# Update your project -go get github.com/aixgo-dev/aixgo@v0.4.0 -go mod tidy -``` - -### 2. 
Security Enhancements (CRITICAL) - -This release includes two major security features for production deployments. - -#### JWT Verification (Full Implementation) - -Aixgo now includes complete JWT verification with automatic JWKS (JSON Web Key Set) fetching, supporting Google Cloud, Auth0, Okta, and any OIDC-compliant provider. - -**Key Features:** - -- **RS256 Signature Verification** - Full RSA signature validation using `crypto/rsa` -- **Automatic JWKS Fetching** - Fetches public keys from Google, Auth0, Okta, custom OIDC providers -- **1-Hour Caching** - JWKS keys cached for 1 hour to reduce latency -- **Complete Validation** - Expiration, issuer, audience, and signature checks -- **RSA Key Validation** - Minimum 2048-bit RSA keys enforced - -**Configuration:** - -```yaml -security: - jwt: - enabled: true - jwks_url: 'https://www.googleapis.com/oauth2/v3/certs' # Google - # jwks_url: "https://YOUR_DOMAIN.auth0.com/.well-known/jwks.json" # Auth0 - # jwks_url: "https://YOUR_DOMAIN.okta.com/oauth2/default/v1/keys" # Okta - issuer: 'https://accounts.google.com' - audience: 'your-client-id.apps.googleusercontent.com' - cache_ttl: '1h' -``` - -**Usage Example:** - -```go -import "github.com/aixgo-dev/aixgo/pkg/security" - -// JWT verification with Google -verifier, err := security.NewJWTVerifier(security.JWTConfig{ - JWKSUrl: "https://www.googleapis.com/oauth2/v3/certs", - Issuer: "https://accounts.google.com", - Audience: "your-app.apps.googleusercontent.com", - CacheTTL: time.Hour, -}) - -// Verify incoming JWT -token := "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9..." 
-claims, err := verifier.Verify(context.Background(), token) -if err != nil { - return fmt.Errorf("invalid token: %w", err) -} - -// Extract claims -userID := claims["sub"].(string) -email := claims["email"].(string) -``` - -**Why This Matters:** - -- **Multi-Tenant Security** - Isolate users with verified identities -- **Zero-Trust Architecture** - Every request validated with cryptographic signatures -- **Standards Compliance** - OIDC-compliant, works with any identity provider -- **Production Ready** - Battle-tested crypto/rsa implementation - -#### File-Based API Keys - -Load API keys securely from files with automatic permission validation. - -**Supported Formats:** - -1. **Line-Based Format** (default): - -```text -user1= -user2= -admin= -``` - -1. **JSON Format**: - -```json -{ - "user1": "", - "user2": "", - "admin": "" -} -``` - -**Configuration:** - -```yaml -security: - api_keys: - enabled: true - file: '/etc/secrets/api-keys.txt' - format: 'lines' # or "json" - mode: 'plain' # Future: "encrypted" -``` - -**Security Features:** - -- **Permission Validation** - Rejects world-readable files (must be 0600 or 0400) -- **Comment Support** - Lines starting with `#` are ignored -- **Empty Line Handling** - Blank lines automatically skipped -- **Hot Reload** - Watch file for changes (future enhancement) - -**Usage Example:** - -```go -// Load API keys from file -loader := security.NewFileAPIKeyLoader("/etc/secrets/api-keys.txt") -keys, err := loader.Load() -if err != nil { - log.Fatal(err) -} - -// Validate request -apiKey := r.Header.Get("X-API-Key") -if userID, ok := keys[apiKey]; ok { - // Valid API key for userID -} else { - http.Error(w, "Unauthorized", http.StatusUnauthorized) -} -``` - -**Best Practices:** - -```bash -# Create secure API key file -echo "admin=sk-$(openssl rand -hex 32)" > /etc/secrets/api-keys.txt - -# Set restrictive permissions -chmod 600 /etc/secrets/api-keys.txt - -# Verify permissions -ls -l /etc/secrets/api-keys.txt -# Should show: 
-rw------- (600) -``` - -### 3. Planner Agent: 6 Advanced Strategies - -The Planner agent now supports 6 sophisticated planning strategies for complex task decomposition. - -#### Strategy 1: Chain-of-Thought (CoT) - -Systematic step-by-step decomposition with reasoning chains. - -```yaml -agents: - - name: cot-planner - role: planner - model: gpt-4-turbo - planner_config: - strategy: chain_of_thought - max_steps: 10 - prompt: 'Break down the task into logical steps with reasoning.' -``` - -**When to Use:** Sequential tasks requiring clear logical flow (data processing pipelines, multi-step calculations). - -**Example Output:** - -```text -Step 1: Load data from database (Reasoning: Need raw data first) -Step 2: Clean and normalize data (Reasoning: Remove outliers before analysis) -Step 3: Compute statistics (Reasoning: Need metrics for comparison) -Step 4: Generate report (Reasoning: Present findings) -``` - -#### Strategy 2: Tree-of-Thought (ToT) - -Multi-branch exploration with LLM-based scoring to find optimal paths. - -```yaml -agents: - - name: tot-planner - role: planner - model: gpt-4-turbo - planner_config: - strategy: tree_of_thought - branching_factor: 3 - max_depth: 4 - beam_width: 2 # Keep top 2 branches -``` - -**When to Use:** Problems with multiple solution paths where you need to explore alternatives (design decisions, strategic planning). - -**How It Works:** - -1. Generate 3 alternative approaches at each step -1. LLM scores each branch (0.0-1.0) -1. Keep top 2 scoring branches (beam search) -1. Expand best branches until max depth -1. Return highest-scoring complete path - -**Performance:** 20-40% better solution quality for complex problems vs. single-path approaches. - -#### Strategy 3: ReAct Planning - -Reasoning-action cycles with iterative refinement. 
- -```yaml -agents: - - name: react-planner - role: planner - model: gpt-4-turbo - planner_config: - strategy: react_planning - max_iterations: 5 - tools: - - name: search_documentation - description: 'Search internal docs for information' - - name: validate_approach - description: 'Check if approach is feasible' -``` - -**When to Use:** Tasks requiring external information or validation during planning (research projects, feasibility studies). - -**Cycle Example:** - -```text -Iteration 1: - Thought: Need to understand current system architecture - Action: search_documentation("system architecture") - Observation: System uses microservices with REST APIs - -Iteration 2: - Thought: Should validate if REST APIs support new feature - Action: validate_approach("add GraphQL endpoint") - Observation: GraphQL requires new gateway component - -Iteration 3: - Thought: Plan should include gateway setup first - Action: finalize_plan() -``` - -#### Strategy 4: Monte Carlo Tree Search (MCTS) - -UCB1-based exploration with backpropagation for optimal path finding. - -```yaml -agents: - - name: mcts-planner - role: planner - model: gpt-4-turbo - planner_config: - strategy: monte_carlo_tree_search - simulations: 100 - exploration_constant: 1.414 # sqrt(2), UCB1 standard - max_depth: 5 -``` - -**When to Use:** High-stakes decisions requiring exhaustive exploration (resource allocation, risk assessment). - -**Algorithm:** - -1. **Selection** - Use UCB1 to pick promising nodes: `value + c * sqrt(ln(parent_visits) / node_visits)` -1. **Expansion** - Generate child nodes for unexplored actions -1. **Simulation** - LLM evaluates rollout quality -1. **Backpropagation** - Update parent node scores - -**Performance:** Converges to optimal solution with sufficient simulations (typically 50-200). - -#### Strategy 5: Backward Chaining - -Goal-to-steps decomposition with dependency graph construction. 
- -```yaml -agents: - - name: backward-planner - role: planner - model: gpt-4-turbo - planner_config: - strategy: backward_chaining - goal: 'Deploy application to production' - validate_dependencies: true -``` - -**When to Use:** Goal-oriented planning where you know the end state (deployment planning, project management). - -**Example:** - -```text -Goal: Deploy application to production - -Subgoal 1: Application passes all tests - Required: Write tests (Subgoal 1.1) - Required: Code is complete (Subgoal 1.2) - -Subgoal 2: Infrastructure is ready - Required: Provision servers (Subgoal 2.1) - Required: Configure load balancer (Subgoal 2.2) - -Dependency Graph: 1.1, 1.2 → 1 → 2.1, 2.2 → 2 → Goal -``` - -#### Strategy 6: Hierarchical Planning - -Multi-level task breakdown with high-level strategy decomposition. - -```yaml -agents: - - name: hierarchical-planner - role: planner - model: gpt-4-turbo - planner_config: - strategy: hierarchical - levels: 3 # High-level → Mid-level → Low-level - max_tasks_per_level: 5 -``` - -**When to Use:** Large complex projects requiring organization (software development, event planning). 
- -**Example:** - -```text -Level 1 (High-Level Strategy): - - Design system architecture - - Implement core features - - Test and deploy - -Level 2 (Mid-Level Tasks): - - Design system architecture: - - Define API contracts - - Choose technology stack - - Create system diagram - -Level 3 (Low-Level Actions): - - Define API contracts: - - Write OpenAPI spec - - Review with stakeholders - - Generate client SDKs -``` - -**Comparison Table:** - -| Strategy | Best For | Complexity | Optimality | Speed | -| ----------------- | --------------------- | ---------- | ---------- | ------ | -| Chain-of-Thought | Sequential tasks | Low | Good | Fast | -| Tree-of-Thought | Multiple solutions | Medium | Better | Medium | -| ReAct Planning | Research tasks | Medium | Good | Medium | -| MCTS | High-stakes decisions | High | Best | Slow | -| Backward Chaining | Goal-oriented | Medium | Good | Fast | -| Hierarchical | Large projects | High | Good | Medium | - -### 4. RAG Pattern: 4 Variants - -Retrieval-Augmented Generation now supports 4 sophisticated variants for different use cases. - -#### Variant 1: Standard RAG - -Basic retrieve → generate flow for simple knowledge retrieval. - -```yaml -supervisor: - name: standard-rag - pattern: rag - rag_config: - retrieval_strategy: standard - top_k: 5 - similarity_threshold: 0.7 -``` - -**When to Use:** FAQ systems, documentation lookup, simple QA. - -**Flow:** - -```text -Query: "How do I configure JWT authentication?" - ↓ -Retrieve: Top 5 docs with similarity > 0.7 - ↓ -Generate: Answer using retrieved context -``` - -#### Variant 2: Conversational RAG - -History tracking with last 100 turns stored, 5 included in context. - -```yaml -supervisor: - name: conversational-rag - pattern: rag - rag_config: - retrieval_strategy: conversational - history_size: 100 - context_turns: 5 - top_k: 3 -``` - -**When to Use:** Chatbots, customer support, interactive documentation. 
- -**Features:** - -- **Session Persistence** - Stores last 100 conversation turns -- **Context Window** - Includes 5 most recent turns in retrieval -- **Coreference Resolution** - Understands "it", "that", "the previous answer" - -**Example:** - -```text -User: "How do I enable JWT?" -Bot: "Set jwt.enabled: true in config.yaml" - -User: "What about the JWKS URL?" ← References previous context -Bot: "Set jwt.jwks_url to your identity provider's endpoint" - -User: "Can you show an example?" ← Builds on conversation -Bot: "Here's a complete example using Google..." -``` - -#### Variant 3: Multi-Query RAG - -Query expansion with reciprocal rank fusion (RRF) for comprehensive retrieval. - -```yaml -supervisor: - name: multi-query-rag - pattern: rag - rag_config: - retrieval_strategy: multi_query - num_queries: 3 - fusion_method: reciprocal_rank_fusion - rrf_k: 60 # RRF parameter -``` - -**When to Use:** Complex questions requiring multiple perspectives, research tasks. - -**Algorithm:** - -1. LLM generates 3 query variants: - -```text -Original: "How to optimize LLM performance?" - -Variant 1: "What are best practices for LLM latency reduction?" -Variant 2: "How to reduce LLM API costs while maintaining quality?" -Variant 3: "What caching strategies improve LLM performance?" -``` - -1. Retrieve top-k for each query -1. Merge results using RRF scoring: - -```text -RRF_score(doc) = Σ(1 / (k + rank_i)) -where k=60, rank_i is doc position in query_i results -``` - -**Performance:** 30-50% better recall for complex queries vs. single-query retrieval. - -#### Variant 4: Hybrid RAG - -Semantic + keyword retrieval with RRF score merging. - -```yaml -supervisor: - name: hybrid-rag - pattern: rag - rag_config: - retrieval_strategy: hybrid - semantic_weight: 0.7 - keyword_weight: 0.3 - semantic_top_k: 10 - keyword_top_k: 10 - final_top_k: 5 -``` - -**When to Use:** Technical documentation, legal documents, code search (combines semantic understanding with exact term matching). 
- -**How It Works:** - -1. **Semantic Retrieval** - Vector similarity search (embeddings) -1. **Keyword Retrieval** - BM25 or full-text search -1. **Score Fusion** - RRF combines both result sets -1. **Reranking** - Final top-k selection - -**Example:** - -```text -Query: "JWT RS256 signature verification" - -Semantic Results (cosine similarity): - 1. "Implementing JWT authentication" (0.89) - 2. "Token verification best practices" (0.85) - 3. "OAuth2 and OIDC guide" (0.78) - -Keyword Results (BM25): - 1. "JWT signature algorithms" (contains "RS256") - 2. "Crypto/rsa signature verification" (exact match) - 3. "Implementing JWT authentication" (contains "JWT") - -Merged (RRF): - 1. "Implementing JWT authentication" (high in both) - 2. "Crypto/rsa signature verification" (high keyword score) - 3. "JWT signature algorithms" (good keyword match) -``` - -**Performance:** 40-60% better precision for technical queries requiring exact terminology. - -**Comparison Table:** - -| Variant | Use Case | Complexity | Recall | Precision | -| -------------- | -------------- | ---------- | ------ | --------- | -| Standard | Simple QA | Low | Good | Good | -| Conversational | Chatbots | Medium | Good | Better | -| Multi-Query | Research | Medium | Best | Good | -| Hybrid | Technical docs | High | Better | Best | - -### 5. Reflection Pattern Improvements - -Enhanced quality assessment and multi-critic aggregation for iterative refinement. - -#### Quality Score Extraction - -Robust parsing of quality scores from LLM outputs. - -**Supported Formats:** - -1. **JSON Format**: - -```json -{ "quality_score": 0.85, "feedback": "Good structure, needs examples" } -``` - -1. **Regex Patterns**: - -```text -Quality Score: 8.5/10 -Score: 0.85 -Rating: 8.5 out of 10 -``` - -1. 
**Sentiment Analysis** - Fallback when explicit scores missing - -**Configuration:** - -```yaml -supervisor: - name: reflection-workflow - pattern: reflection - reflection_config: - max_iterations: 3 - quality_threshold: 0.8 - score_extraction_method: 'json_first' # Try JSON, then regex, then sentiment -``` - -#### Critique Combination - -Structured refinement prompts for iterative improvement. - -**Example:** - -```text -Iteration 1: - Output: "The system handles authentication." - Critique: "Too vague, lacks detail about methods" - Score: 0.4 - -Iteration 2 (with refinement): - Output: "The system supports JWT and OAuth2 authentication with RS256 signing." - Critique: "Better, but missing configuration details" - Score: 0.7 - -Iteration 3 (with refinement): - Output: "The system supports JWT (RS256) and OAuth2 authentication. Configure via jwt.enabled and jwt.jwks_url in config.yaml. See docs/SECURITY.md for examples." - Critique: "Complete and actionable" - Score: 0.9 ✓ -``` - -#### Multi-Critic Aggregation - -Parallel critics with score averaging for comprehensive evaluation. - -```yaml -supervisor: - name: multi-critic-reflection - pattern: reflection - reflection_config: - num_critics: 3 - aggregation_method: average # or weighted_average, max, consensus - critic_roles: - - 'Technical accuracy reviewer' - - 'Clarity and readability reviewer' - - 'Completeness and examples reviewer' -``` - -**Performance:** 20-50% quality improvement over single-critic reflection for complex outputs. - -**How It Works:** - -```text -Output: [Generated documentation] - -Parallel Critique: - Critic 1 (Technical): Score 0.9, "Accurate implementation details" - Critic 2 (Clarity): Score 0.7, "Some sections unclear" - Critic 3 (Examples): Score 0.6, "Needs more code examples" - -Aggregated Score: (0.9 + 0.7 + 0.6) / 3 = 0.73 -Combined Feedback: "Good technical accuracy. Improve clarity in sections 2-3 and add code examples." 
- -Refinement: [Updated documentation with clearer explanations and examples] -``` - -### 6. MCP Tools: Typed Registration - -Type-safe tool registration using Go generics with automatic schema generation. - -**Before (v0.3.x):** - -```go -// Manual schema definition -server.RegisterTool(mcp.Tool{ - Name: "get_weather", - Description: "Get weather for a location", - InputSchema: map[string]any{ - "type": "object", - "properties": map[string]any{ - "location": map[string]any{"type": "string"}, - "units": map[string]any{"type": "string", "enum": []string{"celsius", "fahrenheit"}}, - }, - "required": []string{"location"}, - }, - Handler: func(ctx context.Context, args mcp.Args) (any, error) { - location := args["location"].(string) // Type assertion - units := args["units"].(string) // Type assertion - return getWeather(location, units) - }, -}) -``` - -**After (v0.4.0):** - -```go -// Define typed input struct -type WeatherInput struct { - Location string `json:"location" jsonschema:"required,description=City name"` - Units string `json:"units,omitempty" jsonschema:"enum=celsius|fahrenheit,default=celsius"` -} - -// Type-safe registration -server.RegisterTypedTool("get_weather", "Get weather for a location", - func(ctx context.Context, input WeatherInput) (any, error) { - return getWeather(input.Location, input.Units) - }, -) -``` - -**Benefits:** - -- **Compile-Time Safety** - Catch type errors at build time -- **Auto Schema Generation** - JSON schema generated from struct tags via reflection -- **No Type Assertions** - Direct access to typed fields -- **IDE Support** - Full autocomplete and type hints -- **Runtime Validation** - Automatic input validation against schema - -**Supported Tags:** - -```go -type ComplexInput struct { - Name string `json:"name" jsonschema:"required,minLength=1,maxLength=100"` - Age int `json:"age" jsonschema:"minimum=0,maximum=120"` - Email string `json:"email" jsonschema:"format=email"` - Tags []string `json:"tags,omitempty" 
jsonschema:"uniqueItems=true"` - Priority string `json:"priority" jsonschema:"enum=low|medium|high"` -} -``` - -**Error Handling:** - -```go -// Automatic validation errors -Input: {"name": "", "age": 150} - -Error: "validation failed: - - name: must have minimum length of 1 - - age: must be at most 120" -``` - -## Test Coverage - -All new features are thoroughly tested: - -- **New Test Files**: - - `pkg/security/jwt_test.go` - JWT verification (15 test cases) - - `internal/supervisor/patterns/rag_test.go` - RAG variants (20 test cases) - - `agents/planner_test.go` - All 6 planning strategies (30 test cases) - - `internal/supervisor/patterns/hierarchical_test.go` - Hierarchical planning (12 test cases) - -- **Test Results**: - - All 36 test packages passing - - 0 linter issues (golangci-lint) - - Race detector enabled (`go test -race`) - -```bash -# Run full test suite -make test - -# Run specific feature tests -go test ./pkg/security -v -run TestJWT -go test ./internal/supervisor/patterns -v -run TestRAG -go test ./agents -v -run TestPlanner -``` - -## Breaking Changes - -### Go Version Requirement - -**Before:** Go 1.24+ **After:** Go 1.26+ - -Action required: Upgrade to Go 1.26 before updating Aixgo. - -### Dependency Updates - -17 dependencies upgraded to latest versions. Most are backward compatible, but review `go.mod` changes if you import these directly: - -- `google.golang.org/grpc` - v1.63.0 → v1.68.0 -- `google.golang.org/protobuf` - v1.34.0 → v1.35.2 -- `go.opentelemetry.io/*` - v1.26.0 → v1.32.0 -- And 14 more (see full changelog) - -## Upgrade Guide - -### 1. Upgrade Go - -```bash -# Check current version -go version - -# Install Go 1.26 -# Visit https://go.dev/dl/ for your platform - -# Verify installation -go version -# Output: go version go1.26.0 darwin/arm64 -``` - -### 2. Update Aixgo - -```bash -# Update dependency -go get github.com/aixgo-dev/aixgo@v0.4.0 - -# Update all dependencies -go mod tidy - -# Verify build -go build ./... -``` - -### 3. 
Update Dockerfiles - -```dockerfile -# Before -FROM golang:1.24-alpine - -# After -FROM golang:1.26-alpine -``` - -### 4. Test Your Application - -```bash -# Run tests with race detector -go test -race ./... - -# Check for deprecated APIs -go vet ./... - -# Run linter -golangci-lint run -``` - -### 5. Update CI/CD - -```yaml -# GitHub Actions example -jobs: - test: - runs-on: ubuntu-latest - steps: - - uses: actions/setup-go@v4 - with: - go-version: '1.26' # Update from 1.24 -``` - -## Performance Improvements - -- **Go 1.26 Runtime** - 5-10% general performance improvement -- **Optimized Dependencies** - gRPC 1.68.0 brings 15% latency reduction -- **JWKS Caching** - 1-hour cache reduces JWT verification latency by 90% -- **RRF Scoring** - Optimized reciprocal rank fusion with O(n log n) complexity - -## What's Next - -Looking ahead to v0.5.0 (Q2 2026): - -**Agent Memory & Context Management:** - -- Semantic search over agent conversation history -- Long-term memory with automatic summarization -- Context window optimization for large histories - -**Enhanced Orchestration:** - -- Dynamic workflow modification at runtime -- Conditional branching based on agent outputs -- Workflow templates and composition - -**Observability:** - -- Real-time planning visualization -- RAG retrieval analytics dashboard -- Cost tracking per planning strategy - -**Security:** - -- Encrypted session storage at rest -- API key rotation and expiration policies -- Audit logging for all security events - -## Resources - -- **Documentation**: [docs/FEATURES.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/FEATURES.md) -- **Planning Guide**: [docs/PATTERNS.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md) -- **Security Best Practices**: [docs/SECURITY_BEST_PRACTICES.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/SECURITY_BEST_PRACTICES.md) -- **API Reference**: [pkg.go.dev](https://pkg.go.dev/github.com/aixgo-dev/aixgo) -- **Release Notes**: [v0.4.0 on 
GitHub](https://github.com/aixgo-dev/aixgo/releases/tag/v0.4.0) - -## Get Involved - -We'd love to hear from you: - -- **GitHub Issues**: [github.com/aixgo-dev/aixgo/issues](https://github.com/aixgo-dev/aixgo/issues) -- **Discussions**: [github.com/orgs/aixgo-dev/discussions](https://github.com/orgs/aixgo-dev/discussions) -- **Contributing**: [docs/CONTRIBUTING.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/CONTRIBUTING.md) - -## Contributors - -Thank you to everyone who contributed to this release through code, documentation, testing, and feedback. Special thanks to the community for feature requests and bug reports that -helped shape this release. - ---- - -**Aixgo** - Production-grade AI agents in Go. - -[Website](https://aixgo.dev) | [GitHub](https://github.com/aixgo-dev/aixgo) | [Documentation](https://pkg.go.dev/github.com/aixgo-dev/aixgo) - -**Download Aixgo v0.4.0 today:** - -```bash -go get github.com/aixgo-dev/aixgo@v0.4.0 -``` diff --git a/web/content/blog/v0-5-0-release.md b/web/content/blog/v0-5-0-release.md deleted file mode 100644 index c105493..0000000 --- a/web/content/blog/v0-5-0-release.md +++ /dev/null @@ -1,251 +0,0 @@ ---- -title: "Aixgo v0.5.0: Public Provider API and Guided ReAct Workflows" -date: 2026-02-14 -description: "LLM providers now available as public API for external projects, plus guided execution workflows with step-by-step verification for improved ReAct agent reliability" -tags: ["release", "providers", "react", "api", "workflows"] -author: "Aixgo Team" ---- - -We're excited to announce **Aixgo v0.5.0**, bringing two major improvements that expand Aixgo's capabilities: a public LLM provider API that enables external projects to leverage Aixgo's multi-provider infrastructure, and guided ReAct workflows that execute all tools per iteration with optional LLM verification for more reliable agent execution. - -## What's New in v0.5.0 - -**Quick Links:** - -1. 
[Public Provider API](#1-public-provider-api) - Import LLM providers in your projects -1. [Guided ReAct Workflows](#2-guided-react-workflows) - Step-by-step execution with verification - -### 1. Public Provider API - -The LLM provider and cost calculation packages have moved from `internal/` to `pkg/llm/`, making them available for external use. External projects can now import Aixgo's provider infrastructure without running full agent orchestration. - -**What Changed:** - -- **Provider Package** - All 7+ LLM providers now accessible via `github.com/aixgo-dev/aixgo/pkg/llm/provider` -- **Cost Calculator** - Pricing data for 25+ models available via `github.com/aixgo-dev/aixgo/pkg/llm/cost` -- **Supported Providers** - OpenAI, Anthropic (Claude), Google Gemini, xAI (Grok), Vertex AI, HuggingFace, and inference services (Ollama, vLLM) - -**Quick Example:** - -```go -import ( - "github.com/aixgo-dev/aixgo/pkg/llm/provider" - "github.com/aixgo-dev/aixgo/pkg/llm/cost" -) - -// Create OpenAI provider directly -openai, err := provider.NewOpenAI(apiKey, "gpt-4-turbo") - -// Make completion request -resp, err := openai.Complete(ctx, &provider.Request{ - Messages: []provider.Message{ - {Role: "user", Content: "Explain quantum computing"}, - }, - Temperature: 0.7, -}) - -// Calculate costs -calculator := cost.NewCalculator() -totalCost := calculator.Calculate("gpt-4-turbo", resp.Usage.PromptTokens, resp.Usage.CompletionTokens) -fmt.Printf("Request cost: $%.4f\n", totalCost) -``` - -**Why This Matters:** - -- **Reusable Infrastructure** - Use Aixgo's battle-tested provider implementations in any Go project -- **Multi-Provider Support** - Switch between providers without changing application code -- **Cost Transparency** - Built-in cost calculation for budget management -- **Production Ready** - All providers include retry logic, rate limiting, and error handling - -**Use Cases:** - -- **Custom Agent Frameworks** - Build your own agent system using Aixgo's providers -- **LLM 
Comparison Tools** - Test the same prompt across multiple providers -- **Cost Analysis** - Track LLM expenses across different models and providers -- **Prototyping** - Quickly integrate LLMs without building provider clients from scratch - -### 2. Guided ReAct Workflows - -ReAct agents now support guided execution mode, which executes all tool calls in each iteration and optionally verifies results with the LLM before proceeding. This improves reliability for multi-step workflows requiring quality control. - -**How It Works:** - -Previous behavior executed only the first tool call per iteration. Guided mode now: - -1. Executes **ALL tool calls** returned by the LLM in a single iteration -1. Presents results back to the LLM for verification -1. LLM decides whether to "continue" with more work or mark as "done" -1. Configurable maximum iterations prevent infinite loops - -**Configuration:** - -```yaml -agents: - - name: mail-processor - role: react - model: gpt-4-turbo - prompt: "Process email and extract action items..." - tools: - - name: extract_sender - description: "Extract sender email address" - - name: extract_subject - description: "Extract email subject" - - name: extract_action_items - description: "Extract action items from body" - guided_config: - enabled: true - max_iterations: 5 - verification_prompt: | - Review the extracted data. Check if all fields are complete: - - Sender email address - - Email subject - - Action items list - - Respond 'continue' if more extraction needed, 'done' if complete. -``` - -**Example Execution Flow:** - -```text -Iteration 1: - LLM: "I need to extract sender, subject, and action items" - Tools Called: [extract_sender, extract_subject, extract_action_items] - Results: - - Sender: "john@example.com" - - Subject: "Project Update" - - Action Items: ["Review PR #123", "Deploy to staging"] - - Verification Prompt: "Review the extracted data..." 
- LLM Response: "done" (all fields extracted successfully) - -Final Output: {sender: "john@example.com", subject: "Project Update", ...} -``` - -**Key Features:** - -- **Parallel Tool Execution** - All tools run simultaneously in each iteration -- **Quality Control** - Optional LLM verification catches incomplete results -- **Iteration Limits** - Configurable max iterations (default: 5) prevent runaway execution -- **Custom Verification** - Define verification prompts for domain-specific quality checks - -**When to Use Guided Mode:** - -- **Multi-Step Extraction** - Tasks requiring multiple tool calls to complete -- **Quality-Critical Workflows** - When partial results are unacceptable -- **Complex Dependencies** - Tools that build on each other's outputs -- **Production Systems** - Verification adds reliability for critical operations - -**Benefits:** - -- **40-70% Improved Reliability** - Verification catches incomplete or incorrect results -- **Better Resource Usage** - Executes all needed tools per iteration instead of making multiple LLM calls -- **Explicit Quality Gates** - Custom verification prompts enforce domain requirements -- **Debugging** - Clear iteration history shows exactly what the agent did - -## Performance Impact - -**Provider API:** - -- **Zero Overhead** - Direct provider imports have no additional latency vs. 
internal usage -- **Memory Efficient** - Providers only load when imported - -**Guided Workflows:** - -- **Reduced LLM Calls** - Executing all tools per iteration typically reduces total LLM calls by 30-50% -- **Latency** - Verification adds one LLM call per iteration, but fewer iterations needed overall -- **Cost** - Slight increase per iteration, but fewer total iterations often reduces overall cost - -| Execution Mode | Avg Iterations | Avg LLM Calls | Reliability | -|----------------|----------------|---------------|-------------| -| Standard | 3-5 | 6-10 | 60-70% | -| Guided | 2-3 | 4-6 | 90-95% | - -## Migration Guide - -### Using Public Provider API - -No breaking changes for existing Aixgo users. To use providers externally: - -```bash -# Install Aixgo -go get github.com/aixgo-dev/aixgo@v0.5.0 - -# Import provider package -import "github.com/aixgo-dev/aixgo/pkg/llm/provider" -``` - -### Enabling Guided Workflows - -Add `guided_config` to existing ReAct agent definitions: - -```yaml -agents: - - name: my-agent - role: react - model: gpt-4-turbo - prompt: "Your agent prompt..." - tools: [...] - guided_config: # Add this section - enabled: true - max_iterations: 5 - verification_prompt: "Review results. Respond 'continue' or 'done'." 
-``` - -**Backward Compatibility:** - -- Guided mode is opt-in via `guided_config.enabled: true` -- Existing agents continue using standard execution by default -- No code changes required for current deployments - -## What's Next - -Looking ahead to v0.6.0 (Q2 2026): - -**Enhanced Provider Features:** - -- Streaming support for all providers -- Function calling standardization across providers -- Provider-level cost limits and alerts - -**Workflow Improvements:** - -- Visual workflow debugging dashboard -- Automatic verification prompt generation -- Workflow templates for common patterns - -**Agent Capabilities:** - -- Long-term memory integration with guided workflows -- Multi-agent collaboration in guided mode -- Workflow branching based on verification results - -## Resources - -- **Provider API Documentation**: [pkg.go.dev/github.com/aixgo-dev/aixgo/pkg/llm](https://pkg.go.dev/github.com/aixgo-dev/aixgo/pkg/llm) -- **Guided Workflows Guide**: [docs/PATTERNS.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md) -- **Feature Catalog**: [docs/FEATURES.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/FEATURES.md) -- **Examples**: [examples/](https://github.com/aixgo-dev/aixgo/tree/main/examples) - -## Get Involved - -We'd love to hear how you use the new provider API and guided workflows: - -- **GitHub Issues**: [github.com/aixgo-dev/aixgo/issues](https://github.com/aixgo-dev/aixgo/issues) -- **Discussions**: [github.com/orgs/aixgo-dev/discussions](https://github.com/orgs/aixgo-dev/discussions) -- **Contributing**: [docs/CONTRIBUTING.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/CONTRIBUTING.md) - -## Contributors - -Thank you to everyone who contributed to this release through code, documentation, testing, and feedback. - ---- - -**Aixgo** - Production-grade AI agents in Go. 
- -[Website](https://aixgo.dev) | [GitHub](https://github.com/aixgo-dev/aixgo) | [Documentation](https://pkg.go.dev/github.com/aixgo-dev/aixgo) - -**Upgrade to v0.5.0 today:** - -```bash -go get github.com/aixgo-dev/aixgo@v0.5.0 -``` diff --git a/web/content/blog/v0-6-0-release.md b/web/content/blog/v0-6-0-release.md deleted file mode 100644 index 250a408..0000000 --- a/web/content/blog/v0-6-0-release.md +++ /dev/null @@ -1,503 +0,0 @@ ---- -title: "Aixgo v0.6.0: Interactive Coding Assistant with Multi-Model Support" -date: 2026-03-08 -description: "Interactive coding assistant with file operations, git integration, and cost tracking" -tags: ["release", "cli", "chat", "assistant", "interactive"] -author: "Aixgo Team" ---- - -We're excited to announce **Aixgo v0.6.0**, introducing an interactive coding assistant that brings the power of -multi-agent AI to your command line. This release transforms Aixgo from purely an orchestration framework into a -complete AI development toolkit with the new `aixgo chat` command. - -## What's New in v0.6.0 - -**Quick Links:** - -1. [Interactive Coding Assistant](#1-interactive-coding-assistant) - Multi-model chat interface -1. [CLI Modernization](#2-cli-modernization) - Cobra-based subcommands -1. [Session Management](#3-session-management) - Persistent conversations -1. [Assistant Tools](#4-assistant-tools) - File, git, and terminal operations - -### 1. Interactive Coding Assistant - -The centerpiece of v0.6.0 is `aixgo chat` - an interactive coding assistant that combines conversational AI -with practical development tools, all in a lightweight Go binary. 
- -**Start Chatting in Seconds:** - -```bash -# Launch interactive session -aixgo chat - -# Use a specific model -aixgo chat --model gpt-4o - -# Resume a previous session -aixgo chat --session abc123 -``` - -**What Makes It Special:** - -- **Multi-Model Support** - Switch between 7+ LLM providers (Claude, GPT, Gemini, Grok) mid-conversation -- **Real-Time Streaming** - Markdown-rendered responses stream as they're generated -- **Cost Transparency** - See per-message and session-total costs automatically -- **Lightweight** - <20MB binary, <100ms startup time -- **Session Persistence** - Conversations automatically saved to `~/.aixgo/sessions/` - -**Example Session:** - -```text -╭──────────────────────────────────────────────────╮ -│ Aixgo Interactive Assistant │ -╰──────────────────────────────────────────────────╯ - Model: claude-3-5-sonnet - Type /help for commands, /quit to exit - -> Read the main.go file - -[Assistant reads and analyzes the file content...] - -> Now refactor the error handling to use wrapped errors - -[Assistant modifies the file with improved error handling...] - -✓ Updated main.go - - Added fmt.Errorf with %w wrapping - - Improved error context messages - -[Cost: $0.0023 | Session total: $0.0045] - -> /model gpt-4o -Switched to model: gpt-4o - -> Run the tests -[Confirms command execution...] -✓ Executed: go test ./... -ok github.com/example/pkg 0.123s - -[Cost: $0.0012 | Session total: $0.0057] -``` - -**In-Session Commands:** - -- `/model ` - Switch models without losing context -- `/cost` - Detailed cost breakdown by message -- `/save` - Manual session save -- `/clear` - Reset conversation (with confirmation) -- `/help` - Command reference -- `/quit` - Save and exit - -### 2. CLI Modernization - -The entire CLI has been refactored from flag-based commands to a modern Cobra framework with clear subcommands. 
- -**Before (v0.5.0):** - -```bash -aixgo -config agents.yaml -``` - -**After (v0.6.0):** - -```bash -# Run orchestrator -aixgo run -config agents.yaml - -# Interactive assistant -aixgo chat --model claude-3-5-sonnet - -# List models with pricing -aixgo models - -# Manage sessions -aixgo session list -aixgo session resume -aixgo session delete -``` - -**Benefits:** - -- **Better Discovery** - `aixgo --help` shows all subcommands -- **Clearer Purpose** - Each command has a specific function -- **Future Extensibility** - Easy to add new commands -- **Consistent UX** - Follows standard CLI conventions - -### 3. Session Management - -Sessions provide persistent conversation history with automatic cost tracking and easy resumption. - -**Session Commands:** - -```bash -# List all sessions -aixgo session list - -# Output: -# ID: a1b2c3d4 | Model: claude-3-5-sonnet | Cost: $0.0234 | Updated: 2026-03-08 -# ID: e5f6g7h8 | Model: gpt-4o | Cost: $0.0156 | Updated: 2026-03-07 - -# Resume a session -aixgo session resume a1b2c3d4 - -# Delete old sessions -aixgo session delete e5f6g7h8 -``` - -**Session Features:** - -- **Automatic Saving** - Every interaction saved to disk -- **Cost Tracking** - Per-session and per-message cost accumulation -- **Resume Anywhere** - Pick up conversations on any machine -- **JSON Storage** - Human-readable format at `~/.aixgo/sessions/` -- **Model History** - Track which models were used when - -**Storage Format:** - -```json -{ - "id": "a1b2c3d4", - "model": "claude-3-5-sonnet", - "created": "2026-03-08T10:00:00Z", - "updated": "2026-03-08T11:30:00Z", - "total_cost": 0.0234, - "messages": [ - { - "role": "user", - "content": "Read main.go", - "timestamp": "2026-03-08T10:00:00Z" - }, - { - "role": "assistant", - "content": "Here's the content of main.go...", - "timestamp": "2026-03-08T10:00:05Z", - "model": "claude-3-5-sonnet", - "cost": 0.0012 - } - ] -} -``` - -### 4. 
Assistant Tools - -The assistant comes equipped with practical development tools accessible through natural language. - -**File Operations:** - -- **read_file** - Read file contents with syntax highlighting context -- **write_file** - Create or update files with safety checks -- **glob** - Find files by pattern (e.g., "*.go", "src/**/*.ts") -- **grep** - Search file contents with regex support - -**Examples:** - -```text -> Read all Go files in the pkg directory -[Uses glob to find files, then reads each one] - -> Find all TODO comments in the codebase -[Uses grep to search across files] - -> Create a new utils.go file with helper functions -[Generates code and writes to file] -``` - -**Git Operations:** - -- **git_status** - Show working tree status -- **git_diff** - View changes in working directory -- **git_commit** - Create commits with generated messages -- **git_log** - View commit history - -**Examples:** - -```text -> What files have I changed? -[Runs git status and summarizes changes] - -> Show me the diff -[Displays git diff with context] - -> Commit these changes with an appropriate message -[Analyzes changes, generates commit message, confirms with user] -``` - -**Terminal Execution:** - -- **Safe Command Execution** - Allowlist-based security -- **User Confirmation** - Prompts before running commands -- **Output Capture** - Returns command results to assistant - -**Examples:** - -```text -> Run the test suite -Execute command: go test ./... -Continue? (y/n): y -[Runs tests and shows results] - -> What's the current Go version? -Execute command: go version -Continue? (y/n): y -go version go1.26.0 darwin/arm64 -``` - -**Security Features:** - -- **Command Allowlist** - Only approved commands execute -- **Path Validation** - Prevents directory traversal -- **Confirmation Prompts** - User approval required -- **Output Sanitization** - Removes sensitive information - -## Model Information - -View all available models and their pricing with a single command. 
- -```bash -aixgo models -``` - -**Output:** - -```text -Available LLM Models: - -Provider: Anthropic - claude-3-5-sonnet | Input: $0.003/1K | Output: $0.015/1K - claude-opus-4 | Input: $0.015/1K | Output: $0.075/1K - claude-3-haiku | Input: $0.00025/1K | Output: $0.00125/1K - -Provider: OpenAI - gpt-4o | Input: $0.005/1K | Output: $0.015/1K - gpt-4-turbo | Input: $0.01/1K | Output: $0.03/1K - gpt-3.5-turbo | Input: $0.0005/1K | Output: $0.0015/1K - -Provider: Google - gemini-1.5-pro | Input: $0.00035/1K | Output: $0.0014/1K - gemini-2.0-flash | Input: $0.000075/1K | Output: $0.0003/1K - -Provider: xAI - grok-2 | Input: $0.002/1K | Output: $0.010/1K -``` - -This helps you choose the right model for your budget and use case. - -## Performance & Efficiency - -**Binary Size & Startup:** - -- **Binary Size** - <20MB (includes all providers and tools) -- **Cold Start** - <100ms (serverless-ready) -- **Memory** - ~50MB base footprint - -**Cost Optimization:** - -Switching models mid-conversation enables cost-aware workflows: - -```text -> /model claude-3-haiku -Switched to model: claude-3-haiku - -> Run these simple code generation tasks -[Uses cheaper model for straightforward work] - -> /model claude-3-5-sonnet -Switched to model: claude-3-5-sonnet - -> Now review the code for security issues -[Uses more capable model for complex analysis] -``` - -**Comparison to Alternatives:** - -| Dimension | Aixgo Chat | Cursor/Copilot | Python CLIs | -|-----------|------------|----------------|-------------| -| Binary Size | <20MB | N/A (Editor) | 100MB+ venv | -| Startup Time | <100ms | N/A | 2-5s | -| Multi-Model | Yes (7+) | Limited | Varies | -| Cost Tracking | Built-in | No | Manual | -| Offline-First | Yes | No | Varies | -| Self-Hosted | Yes | No | Yes | - -## Migration Guide - -### Existing Users - -**No breaking changes** - The orchestrator functionality remains unchanged: - -```bash -# Old way still works -aixgo run -config agents.yaml - -# New subcommand (recommended) 
-aixgo run -config agents.yaml -``` - -**New capabilities** are purely additive via new subcommands. - -### Getting Started - -**Fresh Install:** - -```bash -# Install via go install -go install github.com/aixgo-dev/aixgo/cmd/aixgo@v0.6.0 - -# Or download binary -curl -L https://github.com/aixgo-dev/aixgo/releases/latest/download/aixgo_Linux_x86_64.tar.gz | tar xz -sudo mv aixgo /usr/local/bin/ -``` - -**Configure API Keys:** - -Set at least one provider's API key: - -```bash -export OPENAI_API_KEY=sk-... # For GPT models -export ANTHROPIC_API_KEY=sk-ant-... # For Claude models -export GOOGLE_API_KEY=... # For Gemini models -export XAI_API_KEY=xai-... # For Grok models -``` - -**Start Chatting:** - -```bash -aixgo chat -``` - -## Use Cases - -### 1. Code Review Assistant - -```text -> Review this pull request for security issues -[Reads changed files, analyzes for vulnerabilities] - -> Generate a summary of changes for the PR description -[Creates structured PR description with changes categorized] -``` - -### 2. Refactoring Helper - -```text -> Find all instances of the old API pattern -[Uses grep to locate patterns] - -> Refactor them to use the new pattern -[Generates and applies changes across files] - -> Show me the diff before committing -[Displays git diff for review] -``` - -### 3. Documentation Generator - -```text -> Read all exported functions in pkg/ -[Scans codebase] - -> Generate API documentation in docs/api.md -[Creates comprehensive documentation] -``` - -### 4. 
Multi-Model Experimentation - -```text -> /model gpt-4o -Explain this algorithm - -> /model claude-3-5-sonnet -[Same question to different model] - -> /model gemini-1.5-pro -[Compare responses across models] -``` - -## What's Next - -Looking ahead to v0.7.0 (Q2 2026): - -**Enhanced Assistant Features:** - -- Visual workflow debugging for multi-step operations -- Custom tool integration via MCP protocol -- Session encryption at rest -- Collaborative sessions (shared context) - -**Provider Improvements:** - -- Function calling standardization -- Streaming support for all providers -- Provider health monitoring - -**Infrastructure:** - -- PostgreSQL session backend -- Web UI for session management -- Docker-based assistant deployment - -## Technical Implementation - -For those interested in the architecture: - -**Package Structure:** - -```text -pkg/assistant/ -├── coordinator/ # LLM orchestration with streaming -├── session/ # JSON file session persistence -├── tools/ -│ ├── file/ # read_file, write_file, glob, grep -│ ├── git/ # git operations -│ └── terminal/ # safe command execution -├── output/ # markdown streaming renderer -└── prompt/ # interactive UI components -``` - -**Key Design Decisions:** - -1. **File-Based Sessions** - Simple, portable, human-readable -2. **MCP-Compatible Tools** - Standard tool protocol for extensibility -3. **Streaming First** - Real-time feedback for better UX -4. **Cost Tracking** - Built into every LLM call automatically -5. 
**Security by Default** - Allowlists, confirmations, sanitization - -## Resources - -- **Interactive Guide**: [aixgo.dev/guides/chat-assistant](https://aixgo.dev/guides/chat-assistant) -- **CLI Reference**: [docs/CLI.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/CLI.md) -- **Feature Catalog**: [docs/FEATURES.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/FEATURES.md) -- **API Documentation**: [pkg.go.dev/github.com/aixgo-dev/aixgo](https://pkg.go.dev/github.com/aixgo-dev/aixgo) - -## Get Involved - -We'd love to hear how you use the new interactive assistant: - -- **GitHub Issues**: [github.com/aixgo-dev/aixgo/issues](https://github.com/aixgo-dev/aixgo/issues) -- **Discussions**: [github.com/orgs/aixgo-dev/discussions](https://github.com/orgs/aixgo-dev/discussions) -- **Contributing**: [docs/CONTRIBUTING.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/CONTRIBUTING.md) - -**Feature Requests:** - -What tools would you like to see? Let us know: - -- Additional file operations? -- IDE integrations? -- Custom tool support? -- Team collaboration features? - -## Contributors - -Thank you to everyone who contributed to this release through code, documentation, testing, and feedback. - ---- - -**Aixgo** - Production-grade AI agents in Go. 
- -[Website](https://aixgo.dev) | [GitHub](https://github.com/aixgo-dev/aixgo) | [Documentation](https://pkg.go.dev/github.com/aixgo-dev/aixgo) - -**Upgrade to v0.6.0 today:** - -```bash -go get github.com/aixgo-dev/aixgo@v0.6.0 -``` diff --git a/web/content/blog/v0-7-0-release.md b/web/content/blog/v0-7-0-release.md deleted file mode 100644 index 46ab3e3..0000000 --- a/web/content/blog/v0-7-0-release.md +++ /dev/null @@ -1,233 +0,0 @@ ---- -title: "Aixgo v0.7.0: Amazon Bedrock Integration" -date: 2026-03-28 -description: "Access Claude, Llama, Nova, and more via AWS with enterprise-grade security" -tags: ["release", "bedrock", "aws", "enterprise", "multi-model"] -author: "Aixgo Team" ---- - -We're excited to announce **Aixgo v0.7.0**, bringing Amazon Bedrock support to the framework. -This release enables Go developers to access foundation models from Anthropic, Meta, Amazon, -and Mistral through AWS's managed AI service with enterprise-grade security. - -## What's New in v0.7.0 - -**Quick Links:** - -1. [Amazon Bedrock Provider](#1-amazon-bedrock-provider) - Access 20+ foundation models -1. [Multi-Model Access](#2-multi-model-access) - Claude, Llama, Nova, Titan, and more -1. [Enterprise Security](#3-enterprise-security) - AWS IAM, VPC endpoints, CloudTrail -1. [Getting Started](#4-getting-started) - Quick setup guide - -### 1. Amazon Bedrock Provider - -The centerpiece of v0.7.0 is the new Bedrock provider, offering access to foundation models -from multiple providers through a single, unified AWS API. 
- -**Key Benefits:** - -- **Single API** - Access Claude, Llama, Nova, Mistral, and Titan without managing multiple provider accounts -- **Converse API** - Unified messaging format that works across all models -- **AWS Security** - IAM authentication, VPC endpoints, CloudTrail logging -- **Cost Control** - Consolidated billing through AWS - -**Simple Configuration:** - -```yaml -agents: - - name: analyst - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - prompt: "You are a data analyst..." -``` - -**Go SDK Usage:** - -```go -import "github.com/aixgo-dev/aixgo/pkg/llm/provider" - -p, err := provider.CreateProvider("bedrock", map[string]any{ - "region": "us-east-1", -}) - -resp, err := p.CreateCompletion(ctx, provider.CompletionRequest{ - Model: "anthropic.claude-3-5-sonnet-20240620-v1:0", - Messages: []provider.Message{ - {Role: "user", Content: "Analyze this data..."}, - }, -}) -``` - -### 2. Multi-Model Access - -Amazon Bedrock provides access to foundation models from multiple AI providers: - -| Provider | Models | -| --------- | ------------------------------------------------ | -| Anthropic | Claude 3.5 Sonnet, Claude 3 Haiku, Claude 3 Opus, Claude Opus 4 | -| Amazon | Nova Pro, Nova Lite, Nova Micro, Titan Text Express, Titan Text Lite | -| Meta | Llama 3 70B, Llama 3 8B, Llama 4 series | -| Mistral | Mistral Large | -| Cohere | Command R, Command R+ | -| AI21 | Jamba 1.5 Large, Jamba 1.5 Mini | - -**Automatic Model Detection:** - -Aixgo automatically routes to Bedrock based on model ID patterns: - -```go -// All of these use the Bedrock provider automatically -provider.DetectProvider("anthropic.claude-3-5-sonnet-20240620-v1:0") // -> "bedrock" -provider.DetectProvider("amazon.nova-pro-v1:0") // -> "bedrock" -provider.DetectProvider("meta.llama3-70b-instruct-v1:0") // -> "bedrock" -provider.DetectProvider("bedrock/anthropic.claude-3-haiku") // -> "bedrock" -``` - -### 3.
Enterprise Security - -Amazon Bedrock integrates with AWS's enterprise security features: - -**AWS IAM Authentication:** - -```bash -# Option 1: Environment variables -export AWS_REGION=us-east-1 -export AWS_ACCESS_KEY_ID=AKIA... -export AWS_SECRET_ACCESS_KEY=... - -# Option 2: IAM roles (EC2/ECS/EKS) -# No configuration needed - automatically uses instance role - -# Option 3: Named profiles -export AWS_PROFILE=production -``` - -**VPC Endpoints:** - -Keep traffic within your VPC for enhanced security: - -```text -Your VPC -├── Subnet (private) -│ ├── Aixgo Application -│ └── VPC Endpoint (bedrock-runtime) -└── No internet gateway needed for Bedrock calls -``` - -**CloudTrail Logging:** - -Every API call is logged for audit and compliance: - -- Model invocations -- Access patterns -- Cost attribution - -### 4. Getting Started - -**Prerequisites:** - -1. AWS account with Bedrock access -1. Model access granted in the Bedrock console -1. AWS credentials configured - -**Installation:** - -```bash -go get github.com/aixgo-dev/aixgo@v0.7.0 -``` - -**Quick Start:** - -```yaml -# agents.yaml -supervisor: - name: bedrock-demo - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - -agents: - - name: analyst - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - prompt: "You are a helpful assistant." -``` - -```bash -# Set AWS credentials -export AWS_REGION=us-east-1 - -# Run -aixgo run -config agents.yaml -``` - -## Cost Comparison - -Amazon Bedrock pricing is competitive with direct provider access: - -| Model | Bedrock (per 1M) | Direct API (per 1M) | -| --------------- | ---------------- | ------------------- | -| Claude 3.5 Sonnet | $3 / $15 | $3 / $15 | -| Claude 3 Haiku | $0.25 / $1.25 | $0.25 / $1.25 | -| Llama 3 70B | $2.65 / $3.50 | N/A (open source) | -| Nova Pro | $0.80 / $3.20 | N/A (Bedrock only) | - -Plus benefits of consolidated AWS billing, volume discounts, and enterprise contracts. 
- -## Migration Guide - -### From Anthropic Direct API - -```yaml -# Before (v0.6.0) -agents: - - name: analyst - model: claude-3-5-sonnet-20240620 - provider: anthropic - api_key: ${ANTHROPIC_API_KEY} - -# After (v0.7.0 with Bedrock) -agents: - - name: analyst - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock -``` - -### Environment Variables - -```bash -# Remove (optional - can keep for fallback) -# export ANTHROPIC_API_KEY=... - -# Add -export AWS_REGION=us-east-1 -# Credentials via IAM role, profile, or access keys -``` - -## What's Next - -Our roadmap for v0.8.0 includes: - -- **Bedrock Guardrails** - Content filtering and safety controls -- **Bedrock Agents** - Integration with AWS Bedrock Agents -- **Knowledge Bases** - RAG with Bedrock Knowledge Bases -- **Custom Models** - Support for fine-tuned Bedrock models - -## Resources - -- **AWS Bedrock Guide**: [aixgo.dev/guides/aws-bedrock](/guides/aws-bedrock/) -- **Provider Integration**: [aixgo.dev/guides/provider-integration](/guides/provider-integration/) -- **Feature Catalog**: [docs/FEATURES.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/FEATURES.md) - ---- - -**Upgrade to v0.7.0:** - -```bash -go get github.com/aixgo-dev/aixgo@v0.7.0 -``` - -We're excited to see what you build with Amazon Bedrock and Aixgo. Share your projects and feedback in -[GitHub Discussions](https://github.com/orgs/aixgo-dev/discussions)! diff --git a/web/content/blog/v0.1.2-december-release.md b/web/content/blog/v0.1.2-december-release.md deleted file mode 100644 index d534512..0000000 --- a/web/content/blog/v0.1.2-december-release.md +++ /dev/null @@ -1,149 +0,0 @@ ---- -title: 'Aixgo v0.1.2: Faster Aggregation, Reliable Validation, Production Security' -date: 2025-12-07 -draft: false -description: - 'Two releases in one: deterministic voting strategies (8000× faster, zero LLM cost), automatic validation retry (40-70% fewer failures), and security hardening for production.' 
-tags: ['release', 'aggregation', 'validation', 'security', 'production'] -categories: ['Release'] -author: 'Charles Green' -showAuthor: true ---- - -**Two releases ship today.** Both solve real problems Go developers face when building production AI agents. - -- **v0.1.1**: Automatic validation retry cuts structured output failures 40-70% -- **v0.1.2**: Deterministic voting eliminates LLM costs for consensus decisions - -Here's what changed and why it matters. - ---- - -## Deterministic Aggregation: 8000× Faster, Zero Cost - -Multi-agent systems need to combine outputs. Before v0.1.2, every aggregation required an LLM call—even for simple majority votes. - -**The problem**: You're paying $0.01-0.03 and waiting 500-2000ms just to count which answer got the most votes. - -**The fix**: Four new voting strategies that run in <1ms with zero API cost: - -| Strategy | Use Case | Speed | -| ------------------- | ----------------------------------------------------- | ------ | -| `voting_majority` | Democratic consensus | <0.1ms | -| `voting_unanimous` | Safety-critical decisions (fails if any disagreement) | <0.1ms | -| `voting_weighted` | Expert panels with confidence scores | <0.2ms | -| `voting_confidence` | Trust the most confident agent | <0.1ms | - -```yaml -# Before: LLM call for every aggregation -agents: - - name: aggregator - role: aggregator - model: gpt-4-turbo - aggregator_config: - aggregation_strategy: consensus - -# After: Instant, free, deterministic -agents: - - name: aggregator - role: aggregator - aggregator_config: - aggregation_strategy: voting_majority -``` - -**When to use what:** - -- **Deterministic** (4 strategies): Simple consensus, cost-sensitive pipelines, regulated industries needing audit trails -- **LLM-based** (5 strategies): Resolving conflicts, synthesizing nuanced viewpoints, creating narratives from conflicting inputs - -You now have **9 total strategies**. Pick the right tool for the job. 
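The deterministic strategies boil down to simple counting, which is why they run in microseconds with zero API cost. The sketch below illustrates the idea behind `voting_majority` and `voting_weighted`; the function names are hypothetical, not Aixgo's internal API:

```go
package main

import "fmt"

// majorityVote returns the answer with the most votes.
// Deterministic and O(n): no LLM call, sub-millisecond for any realistic panel.
func majorityVote(answers []string) string {
	counts := map[string]int{}
	best, bestN := "", 0
	for _, a := range answers {
		counts[a]++
		if counts[a] > bestN {
			best, bestN = a, counts[a]
		}
	}
	return best
}

// weightedVote sums per-agent confidence scores instead of raw counts,
// mirroring the voting_weighted strategy described above.
func weightedVote(answers []string, weights []float64) string {
	scores := map[string]float64{}
	best, bestS := "", -1.0
	for i, a := range answers {
		scores[a] += weights[i]
		if scores[a] > bestS {
			best, bestS = a, scores[a]
		}
	}
	return best
}

func main() {
	fmt.Println(majorityVote([]string{"approve", "approve", "reject"}))           // approve
	fmt.Println(weightedVote([]string{"approve", "reject"}, []float64{0.4, 0.9})) // reject
}
```

Note the weighted variant can overturn a raw majority when one agent is much more confident, which is exactly when you would reach for it over `voting_majority`.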
- ---- - -## Validation Retry: 40-70% Fewer Failures - -LLMs fail structured extraction constantly. GPT-4 omits required fields 30-40% of the time. Claude forgets email validation. Gemini returns strings where you need integers. - -**The problem**: You're writing manual retry logic for every structured output call. - -**The fix**: Aixgo automatically retries when validation fails—sending error details back to the LLM. - -```go -type User struct { - Name string `json:"name" validate:"required"` - Email string `json:"email" validate:"required,email"` - Age int `json:"age" validate:"gte=0,lte=150"` -} - -// That's it. Automatic retry on validation failure. -user, err := llm.CreateStructured[User](ctx, client, prompt, nil) -``` - -**What happens behind the scenes:** - -1. LLM returns `{"name": "John", "age": 30}` (missing email) -2. Aixgo detects validation failure, sends error feedback to LLM -3. LLM corrects: `{"name": "John", "email": "john@example.com", "age": 30}` -4. Your code receives valid data - -**Results across production workloads:** - -- Simple schemas (3-5 fields): 30% → 8% failure rate (**73% improvement**) -- Complex schemas (10+ fields): 45% → 18% failure rate (**60% improvement**) - -Zero configuration required. Enabled by default. - ---- - -## Security Hardening - -All critical Aikido security issues fixed: - -- **SSRF protection**: URL validation in SIEM integrations -- **Path traversal prevention**: Blocked directory traversal in file operations -- **Command injection fixes**: Sanitized kubectl commands -- **Kubernetes hardening**: Non-root users, dropped capabilities, read-only filesystems - -```yaml -# Production-ready security context -securityContext: - runAsNonRoot: true - runAsUser: 65532 - capabilities: - drop: [ALL] - readOnlyRootFilesystem: true - allowPrivilegeEscalation: false -``` - ---- - -## Documentation: 49% Smaller - -We cut 1,782 lines of redundant documentation. Same information, half the reading. 
- ---- -## Upgrade - -```bash -go get github.com/aixgo-dev/aixgo@v0.1.2 -``` - -**Breaking changes**: None. All existing configurations work unchanged. - -**New examples:** - -- [Deterministic Aggregation](https://github.com/aixgo-dev/aixgo/tree/main/examples/deterministic-aggregation) -- [Validation Retry](https://github.com/aixgo-dev/aixgo/tree/main/examples/pydantic-style-validation) - ---- - -## What's Next - -- **v0.2.0 (Q1 2026)**: Tool-use patterns and MCP integration -- **v0.3.0 (Q2 2026)**: OpenTelemetry tracing and workflow visualization - -Questions? [GitHub Discussions](https://github.com/aixgo-dev/aixgo/discussions) - ---- - -**Two releases. Real problems solved. Ship it.** diff --git a/web/content/blog/v0.2.0-release.md b/web/content/blog/v0.2.0-release.md deleted file mode 100644 index 4ba3b19..0000000 --- a/web/content/blog/v0.2.0-release.md +++ /dev/null @@ -1,298 +0,0 @@ ---- -title: 'Aixgo v0.2.0: Production-Grade VertexAI, Enhanced Security, and Stability' -date: 2025-12-26 -description: - 'Aixgo v0.2.0 brings major improvements to the VertexAI provider with Google Gen AI SDK migration, production hardening fixes, enhanced security with SSRF protection, and - streamlined CI/CD workflows.' -tags: ['release', 'vertexai', 'security', 'production', 'google-cloud'] -author: 'Aixgo Team' ---- - -**Aixgo v0.2.0** focuses on production readiness, security hardening, and developer experience. This release brings a major upgrade to our VertexAI provider, critical stability -fixes, and enhanced security features. - -## VertexAI Provider: Migration to Google Gen AI SDK - -The biggest change is the complete migration from manual HTTP API calls to the official [Google Generative AI SDK](https://pkg.go.dev/google.golang.org/genai) (`v0.5.0`). - -### Key Benefits - -**Simplified Authentication** - -Previously, authentication required manual OAuth2 token management.
Now, the provider uses Application Default Credentials (ADC), automatically discovering credentials from: - -- Service account key files via `GOOGLE_APPLICATION_CREDENTIALS` -- Google Cloud SDK credentials -- Compute Engine/Cloud Run metadata service - -```go -// Automatic ADC-based authentication -provider, err := provider.NewVertexAIProvider("gemini-1.5-pro", "your-project-id") -if err != nil { - log.Fatal(err) -} -defer provider.Close() // New: Graceful cleanup -``` - -**Better SDK Support** - -- Automatic handling of API versioning -- Built-in retry logic and error handling -- Support for new features as Google releases them -- Reduced maintenance burden - -**Configuration Example** - -```yaml -agents: - - name: analyzer - role: react - model: gemini-1.5-pro - prompt: | - Analyze the provided data and identify trends. - tools: - - name: query_database - description: Query the analytics database -``` - -```bash -# Set your GCP project ID -export GOOGLE_CLOUD_PROJECT=your-project-id - -# Configure ADC -export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json - -# Run your agent -go run main.go -``` - -## Production Hardening - -### Fixed Goroutine Leak in Streaming - -**Solution**: Added context-aware cleanup ensuring goroutines terminate properly when: - -- Client cancels the request -- Stream encounters an error -- Normal completion occurs - -```go -ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second) -defer cancel() - -stream, err := provider.CompleteStream(ctx, request) -if err != nil { - log.Fatal(err) -} - -for chunk := range stream { - fmt.Print(chunk.Content) -} -// Goroutine automatically cleaned up -``` - -### Fixed Streaming Error Race Condition - -Implemented proper channel lifecycle management with sync primitives to prevent race conditions. 
- -### Improved Retry Logic - -The provider now includes robust retry logic: - -- **5 retries** for transient failures (rate limits, temporary errors) -- **Exponential backoff** with jitter (±30%) to prevent thundering herd -- **Max backoff** capped at 32 seconds -- **30-second timeout** for client creation - -### Deterministic Outputs Fixed - -Temperature is now correctly passed to the SDK, enabling truly deterministic outputs. - -```yaml -agents: - - name: classifier - role: classifier - model: gemini-1.5-flash - temperature: 0 # Now works correctly for deterministic outputs -``` - -## Security Enhancements - -### SSRF Protection for Ollama - -**Protected Against**: - -- Private IP address access (127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) -- Metadata service endpoints (169.254.169.254) -- IPv6 private addresses -- DNS rebinding attacks - -```yaml -model_services: - - name: local-llama - provider: huggingface - model: meta-llama/Llama-2-7b - runtime: ollama - config: - address: http://localhost:11434 # Validated for SSRF -``` - -### Debug Logging Control - -Debug logging is now controlled via `AIXGO_DEBUG` environment variable instead of always being enabled. - -```bash -# Enable debug logging -export AIXGO_DEBUG=true -go run main.go - -# Production: Debug logging disabled by default -go run main.go -``` - -## New Features - -### Tool/Function Response Handling - -The VertexAI provider now properly handles tool/function responses in the ReAct agent loop. - -```go -agent := NewReActAgent(AgentDef{ - Name: "assistant", - Model: "gemini-1.5-pro", - Tools: []Tool{ - { - Name: "get_weather", - Description: "Get current weather for a location", - Handler: weatherHandler, - }, - }, -}) - -result, err := agent.Execute(ctx, &Message{ - Content: "What's the weather in San Francisco?", -}) -``` - -### Graceful Provider Shutdown - -All providers now implement a `Close()` method for graceful cleanup. 
- -```go -provider, err := provider.NewVertexAIProvider("gemini-1.5-pro", "project-id") -if err != nil { - log.Fatal(err) -} -defer provider.Close() // Clean up gRPC connections, clients, etc. - -// Use provider... -``` - -## Upgrade Guide - -### From v0.1.2 to v0.2.0 - -#### 1. Update Dependency - -```bash -go get -u github.com/aixgo-dev/aixgo@v0.2.0 -go mod tidy -``` - -#### 2. VertexAI Authentication Changes - -**Before (v0.1.2)**: - -```go -// Manual token management (no longer needed) -provider, err := provider.NewVertexAIProvider(model, projectID, serviceAccountJSON) -``` - -**After (v0.2.0)**: - -```go -// Use Application Default Credentials -provider, err := provider.NewVertexAIProvider(model, projectID) -if err != nil { - log.Fatal(err) -} -defer provider.Close() // Add cleanup -``` - -Set up ADC: - -```bash -# Option 1: Service account key -export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json - -# Option 2: gcloud CLI -gcloud auth application-default login -``` - -#### 3. Add Provider Cleanup - -```go -provider, err := provider.NewOpenAIProvider("gpt-4-turbo") -if err != nil { - log.Fatal(err) -} -defer provider.Close() // Add this - -// Use provider... -``` - -#### 4. Review Debug Logging - -```bash -export AIXGO_DEBUG=true -``` - -#### 5. Test Ollama Configurations - -For production, use public endpoints or configure SSRF allowlists if needed. - -## Breaking Changes - -### VertexAI Provider - -- **Authentication**: The `serviceAccountJSON` parameter was removed. Use Application Default Credentials (ADC) instead. - -### All Providers - -- **Cleanup**: All providers now have a `Close()` method. While not strictly required to call, it's recommended for graceful cleanup. 
- -## Performance Improvements - -- **Reduced Memory Usage**: Goroutine leak fix prevents unbounded memory growth -- **Faster Error Recovery**: Improved retry logic with exponential backoff -- **Client Reuse**: VertexAI provider now reuses HTTP clients - -## What's Next - -Looking ahead to v0.3.0: - -- **More Provider Improvements**: Expanding SDK migrations to other providers -- **Enhanced Observability**: Better tracing for tool calls and multi-step reasoning -- **Performance Benchmarks**: Comprehensive benchmarking suite -- **Advanced Caching**: Response caching for improved latency and cost reduction - -## Get Involved - -- **Report Issues**: [GitHub Issues](https://github.com/aixgo-dev/aixgo/issues) -- **Discussions**: [GitHub Discussions](https://github.com/orgs/aixgo-dev/discussions) -- **Contributing**: [Contributing Guide](https://github.com/aixgo-dev/aixgo/blob/main/docs/CONTRIBUTING.md) - -## Resources - -- **Release Notes**: [v0.2.0 on GitHub](https://github.com/aixgo-dev/aixgo/releases/tag/v0.2.0) -- **Documentation**: [docs/](https://github.com/aixgo-dev/aixgo/tree/main/docs) -- **Examples**: [examples/](https://github.com/aixgo-dev/aixgo/tree/main/examples) -- **API Reference**: [pkg.go.dev](https://pkg.go.dev/github.com/aixgo-dev/aixgo) - ---- - -**Download Aixgo v0.2.0 today and build production-grade AI agents with confidence.** - -```bash -go get github.com/aixgo-dev/aixgo@v0.2.0 -``` diff --git a/web/content/blog/v0.2.2-public-interfaces.md b/web/content/blog/v0.2.2-public-interfaces.md deleted file mode 100644 index fc86a3c..0000000 --- a/web/content/blog/v0.2.2-public-interfaces.md +++ /dev/null @@ -1,202 +0,0 @@ ---- -title: 'Aixgo v0.2.2: Public Interfaces for Library-Style Integration' -date: 2025-12-27 -description: 'Introducing the public agent package: build custom agents and integrate Aixgo into existing Go applications with minimal dependencies.' 
-tags: ['release', 'public-api', 'integration', 'library'] -author: 'Aixgo Team' ---- - -**Aixgo v0.2.2** features the new public `agent` package that enables library-style integration of Aixgo into existing Go applications. This release also includes streaming -reliability improvements for the VertexAI provider. - -## Introducing the Public Agent Package - -The `agent` package is a standalone, minimal-dependency package that exports core interfaces for building custom agents without requiring the full Aixgo framework. - -### Why This Matters - -Until now, using Aixgo meant adopting the full framework. Many teams have existing Go services where they want to add agent-based functionality without a major architectural -change. - -The public agent package provides: - -- **Minimal Dependencies**: Only requires `github.com/google/uuid` -- **Clean Interfaces**: `Agent`, `Message`, and `Runtime` are all you need -- **Zero Framework Lock-in**: Use what you need, add more later -- **Production Ready**: `LocalRuntime` implementation for single-process deployments - -### Quick Example - -```go -package main - -import ( - "context" - "fmt" - "github.com/aixgo-dev/aixgo" - "github.com/aixgo-dev/aixgo/agent" -) - -type AnalyzerAgent struct { - name string - ready bool -} - -func NewAnalyzerAgent(name string) *AnalyzerAgent { - return &AnalyzerAgent{name: name} -} - -func (a *AnalyzerAgent) Name() string { return a.name } -func (a *AnalyzerAgent) Role() string { return "analyzer" } -func (a *AnalyzerAgent) Ready() bool { return a.ready } - -func (a *AnalyzerAgent) Start(ctx context.Context) error { - a.ready = true - <-ctx.Done() - return nil -} - -func (a *AnalyzerAgent) Execute(ctx context.Context, input *agent.Message) (*agent.Message, error) { - var req struct { - Text string `json:"text"` - } - input.UnmarshalPayload(&req) - - return agent.NewMessage("result", map[string]interface{}{ - "length": len(req.Text), - "status": "analyzed", - }), nil -} - -func (a *AnalyzerAgent) 
Stop(ctx context.Context) error { - a.ready = false - return nil -} - -func main() { - ctx := context.Background() - rt := aixgo.NewRuntime() - - rt.Register(NewAnalyzerAgent("analyzer")) - rt.Start(ctx) - - input := agent.NewMessage("analyze", map[string]string{"text": "Hello, Aixgo!"}) - response, _ := rt.Call(ctx, "analyzer", input) - - var result map[string]interface{} - response.UnmarshalPayload(&result) - fmt.Printf("Result: %v\n", result) - - rt.Stop(ctx) -} -``` - -### Core Interfaces - -**Agent Interface** - -```go -type Agent interface { - Name() string - Role() string - Start(ctx context.Context) error - Execute(ctx context.Context, input *Message) (*Message, error) - Stop(ctx context.Context) error - Ready() bool -} -``` - -**Runtime Interface** - -```go -type Runtime interface { - Register(agent Agent) error - Call(ctx context.Context, target string, input *Message) (*Message, error) - CallParallel(ctx context.Context, targets []string, input *Message) (map[string]*Message, map[string]error) - Send(target string, msg *Message) error - // ... and more -} -``` - -**Message Struct** - -```go -msg := agent.NewMessage("request", payload). - WithMetadata("correlation_id", "req-123"). 
- WithMetadata("user_id", "user-456") -``` - -### When to Use Public Interfaces - -| Scenario | Recommended Approach | -| -------------------------------------- | -------------------- | -| New standalone AI application | Full Aixgo framework | -| Adding agents to existing microservice | Public agent package | -| Custom runtime implementation | Public agent package | -| Need built-in ReAct, Classifier agents | Full Aixgo framework | -| Minimal dependency footprint required | Public agent package | - -### Integration Patterns - -The public package excels at integrating into existing services: - -```go -type AgentService struct { - runtime agent.Runtime -} - -func (s *AgentService) HandleRequest(w http.ResponseWriter, r *http.Request) { - input := agent.NewMessage("process", requestBody). - WithMetadata("request_id", r.Header.Get("X-Request-ID")) - - response, err := s.runtime.Call(r.Context(), "processor", input) - // Handle response... -} -``` - -## VertexAI Streaming Improvements - -v0.2.2 continues our focus on production reliability with improvements to the VertexAI provider's streaming implementation. 
- -### What's Fixed - -- **Cancellable Context**: Stream cleanup now properly respects context cancellation -- **Priority Error Checking**: Prevents race conditions in error handling -- **Improved Close()**: Timeout-based channel draining ensures clean shutdown -- **Goroutine Leak Prevention**: Properly terminates goroutines when streams are interrupted - -## Installation - -```bash -# Full framework -go get github.com/aixgo-dev/aixgo@v0.2.2 - -# Just the public interfaces -go get github.com/aixgo-dev/aixgo/agent@v0.2.2 -``` - -## Documentation - -- **[Using Public Interfaces Guide](/guides/using-public-interfaces)**: Complete guide to library-style integration -- **Package README**: Detailed API reference in the repository -- **Integration Examples**: Real-world patterns for embedding agents - -## What's Next - -- **Additional Public Interfaces**: Expanding exportable components based on feedback -- **Runtime Implementations**: Exploring distributed runtime options -- **Enhanced Testing Utilities**: Mocks and test helpers for agent development - -## Get Involved - -- **GitHub Issues**: [Report bugs or request features](https://github.com/aixgo-dev/aixgo/issues) -- **Discussions**: [Share your use cases](https://github.com/orgs/aixgo-dev/discussions) -- **Contributing**: [Contribution guide](https://github.com/aixgo-dev/aixgo/blob/main/docs/CONTRIBUTING.md) - ---- - -**Get started with library-style agent integration today:** - -```bash -go get github.com/aixgo-dev/aixgo/agent@v0.2.2 -``` diff --git a/web/content/blog/v0.2.3-and-v0.2.4-phased-startup.md b/web/content/blog/v0.2.3-and-v0.2.4-phased-startup.md deleted file mode 100644 index 77c7dfc..0000000 --- a/web/content/blog/v0.2.3-and-v0.2.4-phased-startup.md +++ /dev/null @@ -1,244 +0,0 @@ ---- -title: 'Aixgo v0.2.3 & v0.2.4: Phased Agent Startup with Dependency Ordering' -date: 2026-01-02 -description: 'Introducing dependency-aware agent startup that eliminates race conditions in multi-agent systems using 
topological sort and phased initialization.' -tags: ['release', 'runtime', 'startup', 'dependencies', 'orchestration'] -author: 'Aixgo Team' ---- - -We're excited to announce **Aixgo v0.2.3** and **v0.2.4**, featuring **phased agent startup with dependency ordering**. This eliminates race conditions in multi-agent systems and -makes startup behavior predictable and safe. - -## The Problem: Startup Race Conditions - -When all agents start concurrently, orchestrators can't rely on dependencies being ready during their `Start()` method. - -### Before v0.2.3 - -```go -rt.Register(databaseAgent) // Dependency -rt.Register(cacheAgent) // Dependency -rt.Register(orchestratorAgent) // Needs both above - -rt.Start(ctx) // All start concurrently - race condition! -``` - -**Problem**: Orchestrator might call `runtime.Call()` during `Start()`, but dependencies might not be ready yet. - -## The Solution: Phased Startup - -v0.2.3 introduces **phased agent startup** powered by topological sort. Agents are grouped into dependency levels and started in phases, with each phase completing before the next -begins. - -### How It Works - -1. **Declare Dependencies**: Use `depends_on` in agent definitions -2. **Automatic Ordering**: Aixgo builds a dependency graph and computes startup phases -3. **Phase-Based Execution**: Agents start in phases (0, 1, 2, ...) based on dependencies -4. **Concurrent Within Phases**: Agents in the same phase start concurrently -5. **Ready Polling**: Each phase waits for all agents to be `Ready()` before proceeding - -### After v0.2.3 - -```yaml -agents: - - name: database - role: producer - # No dependencies - Phase 0 - - - name: cache - role: producer - depends_on: [database] - # Phase 1 - - - name: api - role: react - depends_on: [database, cache] - # Phase 2 -``` - -**Result**: Guaranteed startup order with no race conditions. - -## Key Features - -### 1. 
Topological Sort with Kahn's Algorithm - -- **Cycle Detection**: Automatically detects and reports circular dependencies -- **Optimal Ordering**: Minimizes startup phases -- **Parallel Execution**: Independent agents start concurrently - -### 2. Configurable Timeout - -```yaml -config: - agent_start_timeout: 45s # Default: 30s -``` - -### 3. All Runtime Support - -Phased startup works across all runtime implementations: - -- **LocalRuntime**: Single-process deployments -- **Runtime**: Lightweight runtime -- **DistributedRuntime**: Multi-node orchestration via gRPC - -### 4. Backward Compatible - -If you don't specify `depends_on`, all agents start concurrently as before. Phased startup is opt-in. - -## Real-World Example - -```yaml -agents: - # Tier 1: Foundational services - - name: config-service - role: producer - - - name: database - role: producer - - # Tier 2: Services depending on Tier 1 - - name: cache - role: producer - depends_on: [database, config-service] - - - name: auth-service - role: react - depends_on: [database] - - # Tier 3: Application services - - name: user-service - role: react - depends_on: [database, cache, auth-service] - - - name: order-service - role: react - depends_on: [database, cache, auth-service] - - # Tier 4: Orchestrators - - name: api-gateway - role: react - depends_on: [user-service, order-service] - -config: - agent_start_timeout: 60s -``` - -**Startup sequence**: - -1. **Phase 0**: `config-service`, `database` (concurrent) -2. **Phase 1**: `cache`, `auth-service` (concurrent, after Phase 0 ready) -3. **Phase 2**: `user-service`, `order-service` (concurrent, after Phase 1 ready) -4. 
**Phase 3**: `api-gateway` (after Phase 2 ready) - -## Version History - -### v0.2.4 (January 3, 2026) - -**Bug Fixes**: - -- Fix test failures exposed by phased startup -- Add mutex to MockAgent to prevent race condition -- Fix errorAgent to use AgentDef name instead of hardcoded value - -### v0.2.3 (January 2, 2026) - -**Major Features**: - -- Add phased agent startup with dependency ordering (topological sort) -- Add `depends_on` field to AgentDef -- Add `agent_start_timeout` config option (30s default) -- Add `internal/graph` package with DependencyGraph and Kahn's algorithm -- Improve `Start()` to block until all agents are `Ready()` - -**Bug Fixes**: - -- Fix unchecked error returns in agent tests -- Replace deprecated `option.WithCredentialsFile` with `WithAuthCredentialsFile` -- Fix MockAgent.Start blocking that caused test timeouts - -## Migration Guide - -### Adding Dependencies to Existing Systems - -1. **Identify Dependencies**: Which agents need others to be ready first? - - ```yaml - # Before - agents: - - name: orchestrator - - name: worker1 - - name: worker2 - ``` - -2. **Add depends_on**: Declare explicit dependencies - - ```yaml - # After - agents: - - name: worker1 - - name: worker2 - - name: orchestrator - depends_on: [worker1, worker2] - ``` - -3. **Test Startup**: Verify agents start in the correct order - - ```bash - # You'll see log messages: - # Starting Phase 0: [worker1, worker2] - # All agents in Phase 0 are ready - # Starting Phase 1: [orchestrator] - # All agents in Phase 1 are ready - ``` - -### No Changes Required - -If your agents don't have startup dependencies, no changes are needed. - -## Performance Considerations - -### Startup Time Impact - -- **Single Phase**: No additional delay -- **Multiple Phases**: ~100-200ms per phase for Ready() polling -- **Typical 3-Phase System**: Adds ~300-600ms to total startup time - -This is negligible compared to agent initialization time (database connections, loading models, etc.). 
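The phase computation is a textbook application of Kahn's algorithm: repeatedly peel off every agent whose dependencies are all satisfied, and each peel becomes one startup phase. A sketch of the idea (hypothetical, not the actual `internal/graph` package):

```go
package main

import "fmt"

// startupPhases groups agents into startup phases using Kahn's algorithm.
// deps maps each agent to the agents it depends_on. Returns an error when
// a dependency cycle prevents a complete ordering.
func startupPhases(deps map[string][]string) ([][]string, error) {
	indeg := map[string]int{}      // unmet-dependency count per agent
	dependents := map[string][]string{} // reverse edges
	for agent, ds := range deps {
		if _, ok := indeg[agent]; !ok {
			indeg[agent] = 0
		}
		for _, d := range ds {
			indeg[agent]++
			dependents[d] = append(dependents[d], agent)
			if _, ok := indeg[d]; !ok {
				indeg[d] = 0
			}
		}
	}
	var phases [][]string
	for remaining := len(indeg); remaining > 0; {
		var phase []string
		for a, n := range indeg {
			if n == 0 {
				phase = append(phase, a) // all dependencies already started
			}
		}
		if len(phase) == 0 {
			return nil, fmt.Errorf("dependency cycle detected")
		}
		for _, a := range phase {
			delete(indeg, a)
			for _, dep := range dependents[a] {
				if _, ok := indeg[dep]; ok {
					indeg[dep]--
				}
			}
			remaining--
		}
		phases = append(phases, phase)
	}
	return phases, nil
}

func main() {
	phases, _ := startupPhases(map[string][]string{
		"cache": {"database"},
		"api":   {"database", "cache"},
	})
	fmt.Println(len(phases)) // 3 phases: [database], [cache], [api]
}
```

Agents within a phase are independent of each other, which is what lets the runtime start them concurrently before polling `Ready()` and moving to the next phase.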
- -### Optimization Tips - -1. **Minimize Dependencies**: Only declare true dependencies -2. **Parallelize Within Tiers**: Agents at the same level start concurrently -3. **Fast Ready() Checks**: Keep `Ready()` implementations lightweight - -## What's Next - -- **Graceful Shutdown Ordering**: Apply dependency ordering to shutdown (reverse order) -- **Dynamic Dependency Resolution**: Allow runtime dependency declaration -- **Dependency Visualization**: Tools to visualize and debug dependency graphs -- **Health-Based Dependencies**: Wait for agents to be healthy, not just ready - -## Get Involved - -- **GitHub Issues**: [Report bugs or request enhancements](https://github.com/aixgo-dev/aixgo/issues) -- **Discussions**: [Share your use cases](https://github.com/orgs/aixgo-dev/discussions) -- **Contributing**: [Contribution guide](https://github.com/aixgo-dev/aixgo/blob/main/docs/CONTRIBUTING.md) - -## Installation - -```bash -# Upgrade to the latest version -go get github.com/aixgo-dev/aixgo@v0.2.4 - -# Or use the public agent package -go get github.com/aixgo-dev/aixgo/agent@v0.2.4 -``` - ---- - -**Eliminate startup race conditions with dependency-aware agent initialization:** - -```bash -go get github.com/aixgo-dev/aixgo@v0.2.4 -``` diff --git a/web/content/blog/v0.2.5-monorepo-consolidation.md b/web/content/blog/v0.2.5-monorepo-consolidation.md deleted file mode 100644 index a5cfa00..0000000 --- a/web/content/blog/v0.2.5-monorepo-consolidation.md +++ /dev/null @@ -1,159 +0,0 @@ ---- -title: 'v0.2.5: Monorepo Consolidation' -date: 2026-01-03 -description: 'Website merged into main repository for unified development experience' -tags: ['release', 'infrastructure', 'developer-experience'] -author: 'Aixgo Team' ---- - -We're excited to announce **Aixgo v0.2.5**, featuring a significant infrastructure improvement: the Aixgo website has been consolidated into the main repository as a monorepo. This -change streamlines development and improves the contributor experience. 
- -## What Changed - -The Aixgo website source code, previously maintained in a separate `aixgo-dev/web` repository, now lives at `web/` in the main [aixgo-dev/aixgo](https://github.com/aixgo-dev/aixgo) -repository. - -**Before**: - -```bash -# Two separate repositories -github.com/aixgo-dev/aixgo # Framework code -github.com/aixgo-dev/web # Website code -``` - -**After**: - -```bash -# Single unified repository -github.com/aixgo-dev/aixgo -├── agents/ # Framework code -├── examples/ -├── docs/ -└── web/ # Website code - ├── content/ - ├── themes/ - └── static/ -``` - -## Why Monorepo? - -The consolidation delivers several key benefits: - -### Single Source of Truth - -Documentation and code now live together, making it easier to keep them in sync. When you update a feature, you can update the corresponding documentation in the same pull request. - -### Simplified Contribution Workflow - -Contributors no longer need to navigate between repositories. Want to improve an example and update the related guide? One repository, one PR. - -```bash -# Before: Two repositories, two PRs -cd aixgo && git checkout -b feature-x -cd ../web && git checkout -b docs-feature-x - -# After: One repository, one PR -cd aixgo && git checkout -b feature-x-with-docs -``` - -### Better Example Integration - -Example READMEs now link directly to comprehensive web guides without cross-repository references. The documentation hierarchy is clearer and easier to navigate. - -### Unified CI/CD - -Build, test, and deployment workflows are consolidated. Website changes trigger automated builds and validation alongside framework tests. 
- -## For Contributors - -### Website Development - -Developing the website locally is straightforward: - -```bash -# Clone the repository -git clone https://github.com/aixgo-dev/aixgo.git -cd aixgo - -# Start the development server -cd web && make dev - -# Or from the root directory -make web-dev -``` - -### Available Make Targets - -The root Makefile now includes website-specific targets: - -```bash -make web-dev # Start Hugo development server -make web-build # Build production site -make web-lint # Run markdownlint on content -make web-clean # Clean generated files -``` - -### Documentation Updates - -When adding or modifying features: - -1. Update code in the main package directories -2. Update documentation in `docs/` or `web/content/` as needed -3. Submit a single PR with both changes - -```bash -# Example workflow -git checkout -b add-new-pattern -# Edit internal/supervisor/patterns/new_pattern.go -# Edit web/content/guides/orchestration/new-pattern.md -git add . -git commit -m "feat: add new orchestration pattern with documentation" -``` - -## Deployment - -The website still deploys to Firebase Hosting via Cloud Build, but now triggers on changes to `web/**` in the main repository. The deployment pipeline remains unchanged: - -- **Trigger**: Push to main branch with changes in `web/` -- **Build**: Cloud Build runs `hugo` to generate static site -- **Deploy**: Automatic deployment to [aixgo.dev](https://aixgo.dev) - -## Migration Notes - -### For External Contributors - -If you had a fork of the old `web` repository: - -1. Fork the main [aixgo-dev/aixgo](https://github.com/aixgo-dev/aixgo) repository -2. Website source is now in the `web/` directory -3. Create PRs against the main repository - -### For Maintainers - -The old `aixgo-dev/web` repository is now archived and read-only. All future website updates happen in the main repository. 
- -## What's Next - -The monorepo consolidation is the foundation for several upcoming improvements: - -- **Enhanced Documentation Tooling**: Automated link validation and documentation coverage checks -- **Integrated Examples**: Live code examples embedded in documentation with automatic testing -- **Versioned Docs**: Documentation versioning tied to framework releases - -## Get Involved - -We'd love your feedback on the new monorepo structure: - -- **GitHub Issues**: [Report issues or suggest improvements](https://github.com/aixgo-dev/aixgo/issues) -- **Discussions**: [Share your thoughts](https://github.com/orgs/aixgo-dev/discussions) -- **Contributing**: [Contribution guide](https://github.com/aixgo-dev/aixgo/blob/main/docs/CONTRIBUTING.md) - ---- - -**Start contributing to both code and documentation today:** - -```bash -git clone https://github.com/aixgo-dev/aixgo.git -cd aixgo && make web-dev -``` diff --git a/web/content/blog/v0.2.6-dependency-updates.md b/web/content/blog/v0.2.6-dependency-updates.md deleted file mode 100644 index 48645ed..0000000 --- a/web/content/blog/v0.2.6-dependency-updates.md +++ /dev/null @@ -1,206 +0,0 @@ ---- -title: 'v0.2.6: Dependency Updates & Release Automation' -date: 2026-01-04 -description: 'Major Google GenAI SDK update, GitHub Actions modernization, and automated binary releases' -tags: ['release', 'infrastructure', 'dependencies', 'automation'] -author: 'Aixgo Team' ---- - -We're pleased to announce **Aixgo v0.2.6**, a maintenance release focused on dependency updates and release automation. This release brings Aixgo up to date with the latest Google GenAI SDK and modernizes our CI/CD infrastructure. - -## What's New - -### 1. 
Google GenAI SDK Major Update - -Updated `google.golang.org/genai` from v0.5.0 to **v1.40.0**, bringing significant improvements to the Vertex AI provider: - -**Breaking API Changes**: - -The GenAI SDK changed several pointer types to direct values for better type safety: - -```go -// Before (v0.5.0) -response.UsageMetadata.PromptTokenCount // *int32 -response.UsageMetadata.CandidatesTokenCount // *int32 - -// After (v1.40.0) -response.UsageMetadata.PromptTokenCount // int32 -response.UsageMetadata.CandidatesTokenCount // int32 -``` - -Aixgo's Vertex AI provider (`internal/llm/provider/vertexai.go`) has been updated to handle these changes seamlessly. - -**New Capabilities**: - -- **Thinking Levels**: Enhanced reasoning capabilities for Gemini models -- **Ephemeral Token Support**: Improved security for short-lived credentials -- **Multi-Speaker Support**: Better context handling for conversational AI - -These features are now available to Aixgo applications using the Vertex AI provider with compatible Gemini models. - -### 2. GitHub Actions Modernization - -Our CI/CD workflows now use the latest Actions versions with Node.js 24 support: - -- **golangci-lint-action**: v4 → v9 (improved linting performance and caching) -- **upload-artifact**: v5 → v6 (better retention policies and cross-workflow access) -- **download-artifact**: v3 → v12 (enhanced artifact management) - -These updates improve build reliability and prepare our infrastructure for future GitHub Actions features. - -### 3. 
Automated Binary Releases - -Added `.goreleaser.yaml` for streamlined release automation using [GoReleaser](https://goreleaser.com/): - -**Supported Platforms**: - -```bash -# Operating Systems -- Linux (ELF binaries) -- macOS (Darwin) -- Windows (.exe binaries) - -# Architectures -- amd64 (x86-64) -- arm64 (Apple Silicon, ARM servers) -``` - -**CLI Tools Included**: - -- `orchestrator` - Multi-agent system orchestration -- `benchmark` - Performance benchmarking suite -- `deploy` - Deployment automation -- `tools` - Utility commands - -**Binary Downloads**: - -Starting with v0.2.6, you can download pre-built binaries from [GitHub Releases](https://github.com/aixgo-dev/aixgo/releases): - -```bash -# Example: Download orchestrator for macOS ARM64 -curl -LO https://github.com/aixgo-dev/aixgo/releases/download/v0.2.6/orchestrator_darwin_arm64 -chmod +x orchestrator_darwin_arm64 -./orchestrator_darwin_arm64 --help -``` - -### 4. Installation Documentation Clarity - -Updated README with clearer installation paths for different use cases: - -**For Library Users** (embed Aixgo in your application): - -```bash -go get github.com/aixgo-dev/aixgo@v0.2.6 -``` - -**For CLI Tool Users** (use Aixgo commands): - -```bash -# Option 1: Install via go install -go install github.com/aixgo-dev/aixgo/cmd/orchestrator@v0.2.6 - -# Option 2: Download pre-built binaries -# Visit: https://github.com/aixgo-dev/aixgo/releases/v0.2.6 -``` - -**For Contributors** (develop Aixgo): - -```bash -git clone https://github.com/aixgo-dev/aixgo.git -cd aixgo -make build -``` - -## Why This Release Matters - -### Vertex AI Provider Improvements - -The GenAI v1.40.0 update brings production-ready features that enhance Aixgo's Vertex AI integration: - -1. **Thinking Levels**: Enable deeper reasoning for complex agent tasks -2. **Better Type Safety**: Direct value types reduce nil pointer errors -3. 
**Future-Proof**: Aligns with Google's long-term AI API design - -If you're using the Vertex AI provider, this update provides immediate access to new Gemini capabilities without code changes. - -### Release Automation Benefits - -GoReleaser integration streamlines our release process: - -- **Faster Releases**: Automated binary builds reduce release time from hours to minutes -- **Consistent Packaging**: All binaries follow the same naming and structure -- **Easy Distribution**: Users can choose between `go install` or pre-built binaries - -### CI/CD Modernization - -GitHub Actions updates improve developer experience: - -- **Faster Builds**: Improved caching and parallel execution -- **Better Artifacts**: Enhanced artifact retention and cross-workflow sharing -- **Node.js 24**: Latest runtime for future Actions features - -## Migration Guide - -### Upgrading to v0.2.6 - -**For Library Users**: - -```bash -# Update your go.mod -go get -u github.com/aixgo-dev/aixgo@v0.2.6 - -# Verify dependencies -go mod tidy -``` - -**For CLI Tool Users**: - -```bash -# Reinstall tools -go install github.com/aixgo-dev/aixgo/cmd/orchestrator@v0.2.6 - -# Or download new binaries from GitHub Releases -``` - -### Vertex AI Provider Changes - -No code changes required. The provider automatically handles the GenAI SDK updates: - -```yaml -# Your existing configuration works as-is -agents: - - name: gemini-analyst - role: react - model: gemini-2.0-flash-001 - prompt: "Analyze the following data..." - provider: vertexai -``` - -### Breaking Changes - -**None**. This is a maintenance release with backward-compatible changes. 
- -## What's Next - -The v0.2.6 release sets the foundation for upcoming improvements: - -- **Enhanced Vertex AI Features**: Leverage thinking levels and multi-speaker support -- **Automated Release Notes**: Generate changelogs from commits -- **Binary Checksums**: Add SHA256 verification for downloaded binaries -- **Homebrew Formula**: macOS package manager integration - -## Get Involved - -We'd love your feedback on the new release automation: - -- **GitHub Issues**: [Report issues or suggest improvements](https://github.com/aixgo-dev/aixgo/issues) -- **Discussions**: [Share your thoughts](https://github.com/orgs/aixgo-dev/discussions) -- **Contributing**: [Contribution guide](https://github.com/aixgo-dev/aixgo/blob/main/docs/CONTRIBUTING.md) - ---- - -**Upgrade to v0.2.6 today:** - -```bash -go get github.com/aixgo-dev/aixgo@v0.2.6 -``` diff --git a/web/content/blog/v0.3.3-session-persistence.md b/web/content/blog/v0.3.3-session-persistence.md deleted file mode 100644 index 8dda513..0000000 --- a/web/content/blog/v0.3.3-session-persistence.md +++ /dev/null @@ -1,401 +0,0 @@ ---- -title: "Aixgo v0.3.3: Session Persistence, Runtime Consolidation, and Security Hardening" -date: 2026-02-08 -description: "AI agents that remember with built-in session persistence, automated binary releases, unified runtime API, distributed runtime parity, and complete security baseline" -tags: ["release", "sessions", "security", "runtime", "goreleaser"] -author: "Aixgo Team" ---- - -We're excited to announce **Aixgo v0.3.3**, a major release bringing enterprise-grade session persistence, a unified runtime API, distributed runtime feature parity, automated binary releases, and comprehensive security hardening. This release transforms how you build stateful AI agents in Go. 
- -## What's New in v0.3.3 - -**Quick Links:** -- [Session Persistence](#1-session-persistence-major-feature) - AI agents with memory -- [Unified Runtime API](#2-unified-runtime-api) - Cleaner constructor pattern -- [Distributed Runtime](#3-distributed-runtime-parity) - Multi-node orchestration with TLS -- [GoReleaser Automation](#4-goreleaser-automation-major-infrastructure-update) - One-click binary installation -- [Security Hardening](#5-security-hardening-all-alerts-resolved) - Complete security baseline - -### 1. Session Persistence (Major Feature) - -AI agents now have memory out of the box. The new `pkg/session/` package provides durable conversation history with automatic persistence, checkpoints, and seamless resumption. - -**Key Features:** - -- **JSONL File Storage** - Lightweight, inspectable session storage with append-only writes -- **Redis Backend** - Distributed session support for multi-node deployments -- **Checkpoint/Restore** - Save conversation state and rollback when needed -- **Runtime Integration** - `CallWithSession()` automatically manages session lifecycle -- **Session-Aware Agents** - ReAct agents gain access to full conversation history - -**Quick Example:** - -```go -// Create session infrastructure -backend, _ := session.NewFileBackend("~/.aixgo/sessions") -mgr := session.NewManager(backend) -defer mgr.Close() - -// Create a session for a user -sess, _ := mgr.GetOrCreate(ctx, "assistant", "user-123") - -// Call agent with session - history automatically persists -result, _ := rt.CallWithSession(ctx, "assistant", msg, sess.ID()) - -// Create checkpoint before risky operation -checkpoint, _ := sess.Checkpoint(ctx) - -// Restore if needed -sess.Restore(ctx, checkpoint.ID) -``` - -**Why This Matters:** - -- **No More Memory Loss** - Agents remember context across restarts -- **Production-Ready** - Built-in persistence without external dependencies -- **Developer Friendly** - Sessions are automatic; opt-out for stateless use cases - 
-**Performance:** - -| Operation | Latency | -|-----------|---------| -| Create session | <1ms | -| Append message | <1ms | -| Get 100 messages | <5ms | -| Checkpoint | <1ms | - -See [SESSIONS.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/SESSIONS.md) for comprehensive documentation. - -### 2. Unified Runtime API - -The runtime API is now unified with `aixgo.NewRuntime()` using the functional options pattern: - -```go -rt := aixgo.NewRuntime( - aixgo.WithSessionManager(sessionMgr), - aixgo.WithMetrics(metricsCollector), - aixgo.WithTimeout(30*time.Second), -) -rt.Register("agent1", agent1) -rt.Start(ctx) -``` - -**Benefits:** - -- **Cleaner API** - Single constructor with optional configuration -- **Flexible** - Functional options pattern for extensibility -- **Future-Proof** - Easy to add new options without breaking changes - -### 3. Distributed Runtime Parity - -The distributed runtime (`DistributedRuntime`) now has feature parity with the local runtime. - -**New Capabilities:** - -1. **TLS/mTLS Support** - -```go -rt, err := runtime.NewDistributedRuntime( - ":50051", - runtime.WithTLS(certFile, keyFile, caFile), // mTLS - runtime.WithExternalTLS(), // Service mesh mode -) -``` - -2. **gRPC Streaming** - -Remote agents now support streaming for long-running operations and real-time responses. - -3. **Session Manager Integration** - -```go -// Distributed runtime with sessions -rt.SetSessionManager(sessionMgr) - -// Sessions work transparently across nodes -result, _ := rt.CallWithSession(ctx, "remote-agent", msg, sessID) -``` - -4. 
**Redis Session Backend** - -For distributed deployments, use Redis for shared session storage: - -```go -backend, err := session.NewRedisBackend( - "localhost:6379", - session.RedisOptions{ - Password: os.Getenv("REDIS_PASSWORD"), - DB: 0, - KeyPrefix: "aixgo:sessions:", - }, -) -``` - -**Use Cases:** - -- **Multi-Node Orchestration** - Scale agents across machines -- **Service Mesh Deployments** - Integrate with Istio, Linkerd -- **Kubernetes** - Cloud-native agent deployments -- **High Availability** - Redundant agent instances with shared state - -### 4. GoReleaser Automation (Major Infrastructure Update) - -We've introduced automated binary releases via GoReleaser and GitHub Actions, making installation significantly easier. - -**Key Features:** - -- **Single Binary Distribution** - Renamed CLI from `orchestrator` to `aixgo` for consistency -- **Multi-Platform Support** - Pre-built binaries for Linux, macOS, and Windows (amd64, arm64) -- **Automatic Releases** - Push a tag matching `v*` to trigger automatic build and publish -- **SBOM Generation** - Security compliance with Software Bill of Materials for all releases -- **Optimized Builds** - Static binaries (`CGO_ENABLED=0`) with stripped symbols for smaller size - -**Installation Options:** - -```bash -# Download pre-built binary (recommended) -curl -L https://github.com/aixgo-dev/aixgo/releases/latest/download/aixgo_0.3.3_Linux_x86_64.tar.gz | tar xz -sudo mv aixgo /usr/local/bin/ - -# Install from source -go install github.com/aixgo-dev/aixgo/cmd/aixgo@v0.3.3 - -# Use as a library -go get github.com/aixgo-dev/aixgo@v0.3.3 -``` - -**CLI Simplification:** - -We've streamlined the CLI tools to focus on the core orchestrator functionality: - -- **Removed**: `benchmark`, `deploy/cloudrun`, `deploy/k8s`, `tools` - These were experimental and rarely used -- **Renamed**: `cmd/orchestrator` → `cmd/aixgo` for better discoverability -- **Single Binary**: One unified CLI for all orchestration needs - -**Developer 
Benefits:** - -- **Faster Onboarding** - Download and run without Go installation -- **Consistent Versioning** - Binary version matches library version -- **Reproducible Builds** - Checksums and SBOMs for verification -- **CI/CD Friendly** - Easy integration into pipelines with pre-built binaries - -### 5. Security Hardening (All Alerts Resolved) - -We've resolved **all remaining code scanning alerts** from GitHub Advanced Security, achieving a clean security baseline. - -**Final Security Fixes:** - -1. **G402: TLS Configuration** - Fixed 5 remaining alerts by replacing `//nolint:gosec` with `#nosec G402` annotations that GitHub CodeQL recognizes. All instances now include explicit warnings about `InsecureSkipVerify` usage. - -```go -// #nosec G402 -- InsecureSkipVerify required for self-signed certs in dev -// WARNING: Only use in development environments -tlsConfig := &tls.Config{InsecureSkipVerify: true} -``` - -**Previously Addressed (29 Total Fixes):** - -1. **G204: Subprocess Injection Prevention** - -```go -// Before: Vulnerable to command injection -exec.Command("sh", "-c", userInput) - -// After: Validated inputs with allowlist -if err := ValidateDeploymentInputs(platform, region); err != nil { - return fmt.Errorf("invalid input: %w", err) -} -``` - -2. **G304: Path Traversal Prevention** - -```go -// Before: Vulnerable to directory traversal -filepath.Join(baseDir, userProvidedPath) - -// After: Validated safe paths -if err := ValidateSafeFilePath(path, baseDir); err != nil { - return fmt.Errorf("unsafe path: %w", err) -} - -// Session backend validates all components -func validatePathComponent(s string) error { - if strings.Contains(s, "..") || strings.ContainsAny(s, "/\\") { - return fmt.Errorf("invalid path component: %q", s) - } - return nil -} -``` - -3. 
**G402: TLS Security** - -```go -// Before: InsecureSkipVerify without warning -tls.Config{InsecureSkipVerify: true} - -// After: Explicit warning and alternatives -// WARNING: InsecureSkipVerify=true disables certificate validation -// Only use in development. For production: -// 1. Use proper certificates -// 2. Use ExternalTLS for service mesh -// 3. Never expose to untrusted networks -``` - -4. **G115: Safe Integer Conversions** - -```go -// Before: Unsafe conversions -int32(userValue) // Can overflow - -// After: Bounds checking -func safeIntToInt32(v int) (int32, error) { - if v < math.MinInt32 || v > math.MaxInt32 { - return 0, fmt.Errorf("value %d out of int32 range", v) - } - return int32(v), nil -} -``` - -5. **G404: Cryptographic Randomness** - -```go -// Before: math/rand for security-critical operations -rand.Intn(100) - -// After: crypto/rand for session IDs and tokens -func generateSessionID() string { - b := make([]byte, 16) - if _, err := crypto_rand.Read(b); err != nil { - panic(err) - } - return hex.EncodeToString(b) -} -``` - -**Security Best Practices:** - -- **Example Secrets** - All documentation uses placeholder patterns (``) -- **Input Validation** - Strict allowlists for deployment commands and file paths -- **File Permissions** - Session storage uses restrictive 0700/0600 permissions -- **TLS Warnings** - Clear documentation of security implications - -See [SECURITY.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/SECURITY.md) for comprehensive security guidelines. - -## Examples - -We've added two complete session examples: - -1. **session-basic** - Session CRUD, checkpoints, and context helpers -2. 
**session-react** - ReAct agent with full conversation history - -Run them: - -```bash -cd examples/session-basic && go run main.go -cd examples/session-react && go run main.go -``` - -## Breaking Changes - -### Session Storage Location - -Default session storage moved to `~/.aixgo/sessions`: - -```yaml -session: - base_dir: ~/.aixgo/sessions # New default (was ./sessions) -``` - -## Installation - -### Pre-Built Binaries (Recommended) - -Download the latest release for your platform: - -```bash -# Linux (x86_64) -curl -L https://github.com/aixgo-dev/aixgo/releases/download/v0.3.3/aixgo_0.3.3_Linux_x86_64.tar.gz | tar xz -sudo mv aixgo /usr/local/bin/ - -# Linux (ARM64) -curl -L https://github.com/aixgo-dev/aixgo/releases/download/v0.3.3/aixgo_0.3.3_Linux_arm64.tar.gz | tar xz -sudo mv aixgo /usr/local/bin/ - -# macOS (Apple Silicon) -curl -L https://github.com/aixgo-dev/aixgo/releases/download/v0.3.3/aixgo_0.3.3_Darwin_arm64.tar.gz | tar xz -sudo mv aixgo /usr/local/bin/ - -# macOS (Intel) -curl -L https://github.com/aixgo-dev/aixgo/releases/download/v0.3.3/aixgo_0.3.3_Darwin_x86_64.tar.gz | tar xz -sudo mv aixgo /usr/local/bin/ - -# Windows -# Download aixgo_0.3.3_Windows_x86_64.zip from GitHub Releases -# Extract and add to PATH -``` - -### As a Go Library - -```bash -go get github.com/aixgo-dev/aixgo@v0.3.3 -``` - -### From Source - -```bash -# Install CLI tool -go install github.com/aixgo-dev/aixgo/cmd/aixgo@v0.3.3 - -# Or build from source -git clone https://github.com/aixgo-dev/aixgo.git -cd aixgo -git checkout v0.3.3 -go build ./cmd/aixgo -``` - -## What's Next - -**v0.3.4 (February 2026)**: - -- Session encryption at rest -- Audit logging for session operations -- PostgreSQL session backend -- Session retention policies - -**v0.3.5 (March 2026)**: - -- Conversation branching and trees -- Session search and filtering -- Memory compaction and summarization - -**v0.4.0 (Q2 2026)**: - -- Semantic search over session history -- Long-term memory with vector 
stores -- Multi-modal session support (images, audio) - -## Resources - -- **Documentation**: [docs/SESSIONS.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/SESSIONS.md) -- **Examples**: [examples/session-basic](https://github.com/aixgo-dev/aixgo/tree/main/examples/session-basic), [examples/session-react](https://github.com/aixgo-dev/aixgo/tree/main/examples/session-react) -- **Feature Catalog**: [docs/FEATURES.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/FEATURES.md) -- **Security Guide**: [docs/SECURITY.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/SECURITY.md) - -## Feedback - -We'd love to hear from you: - -- **GitHub Issues**: [github.com/aixgo-dev/aixgo/issues](https://github.com/aixgo-dev/aixgo/issues) -- **Discussions**: [github.com/orgs/aixgo-dev/discussions](https://github.com/orgs/aixgo-dev/discussions) -- **Twitter**: [@aixgo_dev](https://twitter.com/aixgo_dev) - -## Contributors - -Thank you to everyone who contributed to this release through code, documentation, testing, and feedback. - ---- - -**Aixgo** - Production-grade AI agents in Go. - -[Website](https://aixgo.dev) | [GitHub](https://github.com/aixgo-dev/aixgo) | [Documentation](https://pkg.go.dev/github.com/aixgo-dev/aixgo) diff --git a/web/content/examples/README.md b/web/content/examples/README.md deleted file mode 100644 index a0bf203..0000000 --- a/web/content/examples/README.md +++ /dev/null @@ -1,762 +0,0 @@ -# aixgo Configuration Examples - -Comprehensive collection of YAML configuration examples for aixgo agents, LLM providers, MCP servers, security modes, orchestration patterns, and real-world use cases. 
- -## Table of Contents - -- [Agent Types](#agent-types) (6 examples) -- [LLM Providers](#llm-providers) (6 examples) -- [MCP Integration](#mcp-integration) (3 examples) -- [Security Configurations](#security-configurations) (4 examples) -- [Orchestration Patterns](#orchestration-patterns) (13 examples) -- [Use Cases](#use-cases) (4 examples) - ---- - -## Agent Types - -Agent types define the behavior and capabilities of individual agents in your aixgo deployment. - -### [Producer Agent](agents/producer.yaml) - -Generates messages at regular intervals for downstream processing. - -**Use Cases:** Data streaming, event generation, monitoring, load testing - -**Key Features:** - -- Configurable interval timing -- Fan-out to multiple consumers -- Synthetic data generation -- Timestamp and ID tracking - -### [ReAct Agent](agents/react.yaml) - -Reasoning and Acting agent with tool use capabilities. Implements the ReAct pattern: Thought → Action → Observation. - -**Use Cases:** Question answering, research tasks, data retrieval, API integrations - -**Key Features:** - -- LLM-powered reasoning -- Tool calling / function calling -- JSON schema validation -- Multi-turn conversations -- Supports all major LLM providers - -### [Logger Agent](agents/logger.yaml) - -Simple logging agent that outputs messages to stdout/logs. - -**Use Cases:** Debugging, monitoring, audit trails, development - -**Key Features:** - -- Multi-input aggregation -- No configuration required -- Immediate output (no buffering) -- Message type and payload logging - -### [Classifier Agent](agents/classifier.yaml) - -AI-powered content classification with semantic understanding. 
- -**Use Cases:** Content categorization, intent detection, routing, triage - -**Key Features:** - -- Multi-label classification -- Few-shot learning -- Confidence scoring -- Semantic similarity -- Performance metrics tracking -- Alternative suggestions - -### [Aggregator Agent](agents/aggregator.yaml) - -AI-powered aggregation of multiple agent outputs. - -**Use Cases:** Consensus building, result synthesis, multi-agent coordination - -**Key Features:** - -- Multiple aggregation strategies (consensus, weighted, semantic, hierarchical, RAG) -- Conflict resolution -- Deduplication -- Semantic clustering -- Source attribution -- Performance analytics - -**Strategies:** - -- **Consensus:** Find common ground among inputs -- **Weighted:** Prioritize certain sources -- **Semantic:** Group by similarity -- **Hierarchical:** Multi-level aggregation -- **RAG-based:** Retrieval-augmented generation - -### [Planner Agent](agents/planner.yaml) - -AI-powered Chain-of-Thought planning and reasoning. - -**Use Cases:** Task decomposition, strategic planning, project management, workflow design - -**Key Features:** - -- Multiple planning strategies -- Self-critique and improvement -- Parallel step identification -- Risk assessment -- Backup plans -- Success criteria -- Performance tracking - -**Strategies:** - -- **Chain-of-Thought:** Linear reasoning -- **Tree-of-Thought:** Explore branches -- **ReAct Planning:** Thought-Action-Observation -- **Backward Chaining:** Goal-oriented -- **Hierarchical:** Multi-level decomposition -- **Monte Carlo:** Simulate multiple paths - ---- - -## LLM Providers - -Configuration examples for different LLM providers supported by aixgo. - -### [OpenAI](llm-providers/openai.yaml) - -GPT-3.5, GPT-4, and OpenAI-compatible endpoints. 
- -**Models:** gpt-3.5-turbo, gpt-4, gpt-4-turbo - -**Features:** - -- Function calling -- JSON mode -- Vision (GPT-4V) -- Streaming -- Fine-tuning support (GPT-3.5) - -**Cost:** $0.001-$0.06 per 1K tokens (varies by model) - -**Best For:** Production deployments, complex reasoning, high accuracy - -### [Anthropic Claude](llm-providers/anthropic.yaml) - -Claude 3 family: Haiku, Sonnet, and Opus. - -**Models:** claude-3-haiku, claude-3-sonnet, claude-3-opus - -**Features:** - -- 200K context window -- Vision support -- Tool use -- Constitutional AI -- Extended thinking -- JSON mode - -**Cost:** $0.25-$75 per MTok (varies by model) - -**Best For:** Long documents, complex reasoning, safety-critical applications - -### [Google Gemini](llm-providers/gemini.yaml) - -Gemini Pro and Gemini Pro Vision via Google AI Studio. - -**Models:** gemini-pro, gemini-pro-vision - -**Features:** - -- Free tier available -- Native multimodal -- Function calling -- Grounding (experimental) -- Safety settings - -**Cost:** Free tier, then usage-based - -**Best For:** Development, testing, image understanding - -### [Google Vertex AI](llm-providers/vertexai.yaml) - -Enterprise Gemini and PaLM on Google Cloud Platform. - -**Models:** gemini-pro, text-bison, code-bison - -**Features:** - -- Enterprise SLAs -- VPC-SC security -- CMEK encryption -- Regional deployment -- Audit logging -- Compliance certifications - -**Cost:** $0.00025-$0.0005 per 1K chars - -**Best For:** Enterprise production, compliance requirements, GCP integration - -### [xAI Grok](llm-providers/xai.yaml) - -Grok models from xAI (X.AI). - -**Models:** grok-beta, grok-1 - -**Features:** - -- Real-time information -- X (Twitter) integration -- Witty personality -- OpenAI-compatible API -- Function calling - -**Best For:** Current events, social media analysis, conversational interfaces - -### [HuggingFace](llm-providers/huggingface.yaml) - -Open-source models via HuggingFace Inference API. 
- -**Models:** Llama 2, Mistral, CodeLlama, Falcon, and more - -**Features:** - -- Serverless and dedicated endpoints -- Custom model deployment -- Cost-effective -- Model transparency -- Self-hosting option - -**Cost:** Free tier, then ~$0.06 per 1K tokens or GPU rental - -**Best For:** Cost optimization, transparency, customization, self-hosting - ---- - -## MCP Integration - -Model Context Protocol (MCP) enables agents to use external tools and services. - -### [Local Transport](mcp/local-transport.yaml) - -In-process MCP server communication (same process). - -**Use Cases:** Development, testing, embedded tools - -**Advantages:** - -- No network overhead (fastest) -- Simple configuration -- Secure (no network exposure) -- Easy debugging - -**Best For:** Development, single-process applications, performance-critical tools - -### [gRPC Transport](mcp/grpc-transport.yaml) - -Remote MCP server communication via gRPC. - -**Use Cases:** Production, microservices, distributed systems - -**Features:** - -- TLS/mTLS via cloud infrastructure (Cloud Run, GKE, service mesh) -- HTTP/2 multiplexing -- Streaming support -- Load balancing -- Health checking - -**Best For:** Production deployments, service isolation, distributed architectures - -### [Multiple Servers](mcp/multiple-servers.yaml) - -Connecting to multiple MCP servers simultaneously. - -**Use Cases:** Hybrid architectures, tool aggregation, complex workflows - -**Features:** - -- Mix local and remote servers -- Automatic tool discovery -- Tool routing -- Independent scaling -- Failure isolation - -**Best For:** Production systems requiring diverse tool sources - ---- - -## Security Configurations - -Security modes for different deployment environments. - -### [Disabled (Development)](security/disabled-dev.yaml) - -No authentication or authorization. - -**WARNING:** Development only! Never use in production! 
- -**Use Cases:** Local development, unit testing, prototyping - -**Features:** - -- No authentication -- No access control -- Minimal logging -- Fast iteration - -### [Builtin API Key](security/builtin-api-key.yaml) - -Application-level authentication using API keys. - -**Use Cases:** Service-to-service, API access, automation - -**Features:** - -- Environment or file-based keys -- Per-key identification -- Rate limiting support -- Audit logging -- Key rotation support - -**Best For:** Machine-to-machine authentication, CI/CD pipelines - -### [Delegated (IAP)](security/delegated-iap.yaml) - -Infrastructure-level authentication via Google Cloud IAP. - -**Use Cases:** Internal tools, dashboards, human users - -**Features:** - -- Google account authentication -- MFA support -- Centralized access control -- Cloud Logging integration -- No credential management - -**Best For:** Cloud Run, GKE deployments with human users - -### [Hybrid](security/hybrid.yaml) - -Combines delegated (IAP) and builtin (API key) authentication. - -**Use Cases:** Mixed client types (humans + services) - -**Features:** - -- IAP for human users -- API keys for services -- Unified authorization -- Comprehensive audit -- Flexible client support - -**Best For:** Production systems with diverse client types - ---- - -## Orchestration Patterns - -Multi-agent coordination patterns managed by supervisors. - -### [MapReduce](orchestration/mapreduce.yaml) - -Distribute work across agents, then aggregate results. - -**Pattern:** Map phase → Reduce phase - -**Use Cases:** Data processing, distributed analysis, parallel workloads - -**Characteristics:** - -- Horizontal scaling -- Parallel processing -- Result aggregation - -### [Parallel](orchestration/parallel.yaml) - -Multiple agents work independently on the same input. 
- -**Pattern:** All agents process simultaneously - -**Use Cases:** Multi-perspective analysis, redundancy, speed - -**Characteristics:** - -- Independent execution -- Concurrent processing -- Diverse viewpoints - -### [Sequential](orchestration/sequential.yaml) - -Chain of agents in sequence, each building on previous output. - -**Pattern:** Agent 1 → Agent 2 → Agent 3 → ... - -**Use Cases:** Multi-step workflows, pipeline processing - -**Characteristics:** - -- Ordered execution -- Intermediate results -- Step-by-step refinement - -### [Reflection](orchestration/reflection.yaml) - -Agent critiques and improves its own output. - -**Pattern:** Generate → Critique → Refine - -**Use Cases:** Quality improvement, self-correction - -**Characteristics:** - -- Iterative refinement -- Self-critique -- Quality enhancement - -### [Planning](orchestration/planning.yaml) - -Plan the approach first, then execute the plan. - -**Pattern:** Plan → Execute - -**Use Cases:** Complex tasks, strategic execution - -**Characteristics:** - -- Upfront planning -- Structured execution -- Clear strategy - -### [Classification](orchestration/classification.yaml) - -Route requests based on content classification. - -**Pattern:** Classify → Route → Process - -**Use Cases:** Content routing, triage, specialized handling - -**Characteristics:** - -- Content-aware routing -- Specialized agents -- Efficient distribution - -### [Supervisor](orchestration/supervisor.yaml) - -Hub-and-spoke coordination where supervisor delegates to specialists. - -**Pattern:** Supervisor receives requests, routes to specialists, aggregates responses - -**Use Cases:** Customer service, research tasks, content pipelines - -**Characteristics:** - -- Centralized control -- Simple reasoning -- Easy debugging - -### [Router](orchestration/router.yaml) - -Intelligent cost-optimized routing. 
- -**Pattern:** Classify input → Route to appropriate agent - -**Use Cases:** Cost optimization (25-50% savings), intent routing, model selection - -**Characteristics:** - -- Two-stage (classify → route) -- Low latency -- Cost-effective - -### [Swarm](orchestration/swarm.yaml) - -Decentralized agent handoffs. - -**Pattern:** Agents hand off to other agents dynamically - -**Use Cases:** Customer support handoffs, troubleshooting, adaptive routing - -**Characteristics:** - -- Mesh topology -- Agent-driven routing -- Dynamic delegation - -### [Hierarchical](orchestration/hierarchical.yaml) - -Multi-level delegation. - -**Pattern:** Manager → Sub-managers → Workers - -**Use Cases:** Project management, enterprise workflows, complex decomposition - -**Characteristics:** - -- Tree topology -- Scalable coordination -- Multi-level structure - -### [RAG](orchestration/rag.yaml) - -Retrieval-Augmented Generation. - -**Pattern:** Retrieve relevant docs → Generate grounded response - -**Use Cases:** Documentation Q&A, knowledge management, enterprise search - -**Characteristics:** - -- Vector search -- Reduced hallucinations -- Knowledge grounding - -### [Ensemble](orchestration/ensemble.yaml) - -Multi-model voting. - -**Pattern:** Multiple models vote → Aggregate consensus - -**Use Cases:** High-stakes decisions, medical diagnosis, content moderation - -**Characteristics:** - -- 25-50% error reduction -- Parallel execution -- Consensus-based - -### [Aggregation](orchestration/aggregation.yaml) - -Multi-agent synthesis. - -**Pattern:** Collect outputs → Apply aggregation strategy → Synthesize result - -**Use Cases:** Research synthesis, distributed decisions, multi-perspective analysis - -**Characteristics:** - -- Multiple strategies (consensus, weighted, semantic) -- Conflict resolution -- Source attribution - ---- - -## Use Cases - -Real-world application examples combining agents and patterns. 
- -### [Simple Chatbot](use-cases/simple-chatbot.yaml) - -Basic conversational AI assistant. - -**Components:** - -- ReAct agent with GPT-4 -- Tool for time/date queries -- Friendly, helpful personality - -**Use Cases:** - -- Customer support -- Information retrieval -- General Q&A - -### [Content Classifier](use-cases/content-classifier.yaml) - -Categorize and route incoming content. - -**Components:** - -- Classifier agent -- Multiple categories (spam, support, sales) -- Confidence thresholds - -**Use Cases:** - -- Email triage -- Support ticket routing -- Content moderation - -### [Multi-Expert Consensus](use-cases/multi-expert-consensus.yaml) - -Multiple expert agents reach consensus. - -**Components:** - -- 3 expert agents (different models) -- Aggregator for consensus -- High confidence threshold - -**Use Cases:** - -- Critical decisions -- Multi-perspective analysis -- Quality assurance - -### [Task Planner](use-cases/task-planner.yaml) - -Break down complex tasks into executable steps. - -**Components:** - -- Planner agent with Chain-of-Thought -- Self-critique enabled -- Alternative generation - -**Use Cases:** - -- Project management -- Workflow design -- Strategic planning - ---- - -## Getting Started - -### Running an Example - -```bash -# Set required environment variables -export OPENAI_API_KEY="your-key-here" - -# Run an example configuration -aixgo run examples/agents/react.yaml -``` - -### Combining Examples - -You can combine elements from different examples: - -```yaml -# Use security from security/builtin-api-key.yaml -environment: production -auth_mode: builtin - -# Use agents from agents/react.yaml -agents: - - name: my-agent - role: react - model: gpt-4 - # ... rest of config -``` - -### Environment Variables - -Most examples require environment variables for API keys: - -```bash -# OpenAI -export OPENAI_API_KEY="sk-..." - -# Anthropic -export ANTHROPIC_API_KEY="sk-ant-..." - -# Google -export GOOGLE_API_KEY="AIza..." 
-export VERTEX_PROJECT_ID="my-project" - -# xAI -export XAI_API_KEY="xai-..." - -# HuggingFace -export HUGGINGFACE_API_KEY="hf_..." -``` - -### Security Best Practices - -1. **Never commit API keys** to version control -2. **Use environment variables** or secrets management -3. **Enable authentication** in production (never use disabled mode) -4. **Enable audit logging** for compliance -5. **Deploy with TLS** - Cloud Run, GKE, and service mesh provide TLS by default -6. **Implement rate limiting** per client -7. **Monitor** all authentication attempts -8. **Rotate credentials** regularly - -### Performance Tips - -1. **Use appropriate model sizes** (don't over-provision) -2. **Local MCP** for low-latency tools -3. **Remote MCP** for scalability -4. **Cache** frequent queries -5. **Streaming** for better UX -6. **Monitor** token usage and costs -7. **Batch** where possible -8. **Parallel** execution for independent tasks - ---- - -## File Organization - -```text -examples/ -├── README.md (this file) -├── agents/ (6 examples) -│ ├── producer.yaml -│ ├── react.yaml -│ ├── logger.yaml -│ ├── classifier.yaml -│ ├── aggregator.yaml -│ └── planner.yaml -├── llm-providers/ (6 examples) -│ ├── openai.yaml -│ ├── anthropic.yaml -│ ├── gemini.yaml -│ ├── vertexai.yaml -│ ├── xai.yaml -│ └── huggingface.yaml -├── mcp/ (3 examples) -│ ├── local-transport.yaml -│ ├── grpc-transport.yaml -│ └── multiple-servers.yaml -├── security/ (4 examples) -│ ├── disabled-dev.yaml -│ ├── builtin-api-key.yaml -│ ├── delegated-iap.yaml -│ └── hybrid.yaml -├── orchestration/ (13 examples) -│ ├── mapreduce.yaml -│ ├── parallel.yaml -│ ├── sequential.yaml -│ ├── reflection.yaml -│ ├── planning.yaml -│ ├── classification.yaml -│ ├── supervisor.yaml -│ ├── router.yaml -│ ├── swarm.yaml -│ ├── hierarchical.yaml -│ ├── rag.yaml -│ ├── ensemble.yaml -│ └── aggregation.yaml -└── use-cases/ (4 examples) - ├── simple-chatbot.yaml - ├── content-classifier.yaml - ├── multi-expert-consensus.yaml - └── 
task-planner.yaml -``` - -**Total:** 36 comprehensive, production-ready examples - ---- - -## Additional Resources - -- **Documentation:** [aixgo.dev](https://aixgo.dev) -- **GitHub:** [github.com/aixgo-dev/aixgo](https://github.com/aixgo-dev/aixgo) -- **API Reference:** See `/docs/api` -- **Deployment Guides:** See `/docs/deployment` - ---- - -## Contributing - -Found an issue or want to add an example? Please open an issue or PR on GitHub. - -## License - -All examples are provided under the same license as aixgo. diff --git a/web/content/examples/agents/aggregator.yaml b/web/content/examples/agents/aggregator.yaml deleted file mode 100644 index 5f2aa8f..0000000 --- a/web/content/examples/agents/aggregator.yaml +++ /dev/null @@ -1,231 +0,0 @@ -# Aggregator Agent Configuration -# AI-powered aggregation of multiple agent outputs -# Supports consensus, weighted, semantic, hierarchical, and RAG-based aggregation - -agents: - - name: multi-agent-aggregator - role: aggregator - - # Model for synthesizing multiple inputs - model: gpt-4-turbo - - prompt: | - You are an expert synthesis agent that combines insights from multiple sources. - Your goal is to create comprehensive, accurate aggregations that preserve - important information while resolving contradictions intelligently. 
- - # Aggregator-specific configuration - aggregator_config: - # Aggregation strategy determines how inputs are combined - # Options: consensus, weighted, semantic, hierarchical, rag_based - aggregation_strategy: consensus - - # How to resolve conflicts between different agent outputs - # Options: vote, confidence, llm_arbitration, merge - conflict_resolution: llm_arbitration - - # Method for detecting duplicate or similar content - # Options: exact, fuzzy, semantic - deduplication_method: semantic - - # Whether to generate summary insights - summarization_enabled: true - - # Maximum number of input sources to process - max_input_sources: 10 - - # Timeout for waiting for inputs before aggregating (milliseconds) - timeout_ms: 5000 - - # Threshold for semantic similarity (0.0 - 1.0) - # Higher values mean stricter similarity requirements - semantic_similarity_threshold: 0.85 - - # Weights for different input sources (optional) - # Higher weights give more importance to specific agents - source_weights: - expert-agent: 1.5 - fact-checker: 1.3 - general-agent: 1.0 - - # Consensus threshold (0.0 - 1.0) - # Minimum agreement level required for consensus - consensus_threshold: 0.7 - - # LLM parameters - temperature: 0.5 # Balanced creativity for synthesis - max_tokens: 1500 # More tokens for comprehensive aggregation - - # Multiple input sources to aggregate - inputs: - - source: expert-agent - - source: fact-checker - - source: general-agent - - source: specialist-agent - - outputs: - - target: final-output - - target: aggregation-analytics - -# Example: Alternative semantic clustering aggregation - - name: semantic-aggregator - role: aggregator - model: claude-3-opus - - aggregator_config: - aggregation_strategy: semantic - semantic_similarity_threshold: 0.80 - summarization_enabled: true - timeout_ms: 3000 - temperature: 0.4 - max_tokens: 2000 - - inputs: - - source: agent-1 - - source: agent-2 - - source: agent-3 - - outputs: - - target: semantic-results - -# Example: 
Weighted aggregation for prioritized sources - - name: weighted-aggregator - role: aggregator - model: gpt-4 - - aggregator_config: - aggregation_strategy: weighted - source_weights: - primary-expert: 2.0 # Double weight - secondary-expert: 1.5 - auxiliary-source: 0.8 # Lower weight - timeout_ms: 4000 - temperature: 0.5 - max_tokens: 1200 - - inputs: - - source: primary-expert - - source: secondary-expert - - source: auxiliary-source - - outputs: - - target: weighted-results - -# Supporting agents for the pipeline - - name: expert-agent - role: react - model: gpt-4 - prompt: "You are a domain expert. Provide detailed, accurate analysis." - inputs: - - source: input-stream - outputs: - - target: multi-agent-aggregator - - - name: fact-checker - role: react - model: gpt-4 - prompt: "You are a fact-checker. Verify claims and provide evidence." - inputs: - - source: input-stream - outputs: - - target: multi-agent-aggregator - - - name: general-agent - role: react - model: gpt-3.5-turbo - prompt: "You are a generalist. Provide broad perspective and context." - inputs: - - source: input-stream - outputs: - - target: multi-agent-aggregator - - - name: specialist-agent - role: react - model: gpt-4 - prompt: "You are a technical specialist. Focus on implementation details." 
-
-    inputs:
-      - source: input-stream
-    outputs:
-      - target: multi-agent-aggregator
-
-  - name: input-stream
-    role: producer
-    interval: 10s
-    outputs:
-      - target: expert-agent
-      - target: fact-checker
-      - target: general-agent
-      - target: specialist-agent
-
-  - name: final-output
-    role: logger
-    inputs:
-      - source: multi-agent-aggregator
-
-  - name: aggregation-analytics
-    role: logger
-    inputs:
-      - source: multi-agent-aggregator
-
-# Environment variables required:
-# - OPENAI_API_KEY: For GPT models
-# - ANTHROPIC_API_KEY: For Claude models
-
-# Output format (JSON):
-# {
-#   "aggregated_content": "Synthesized output combining all inputs...",
-#   "sources": ["expert-agent", "fact-checker", "general-agent", "specialist-agent"],
-#   "strategy_used": "consensus",
-#   "conflicts_resolved": [
-#     {
-#       "topic": "implementation approach",
-#       "conflicting_sources": ["expert-agent", "specialist-agent"],
-#       "resolution": "Combined both approaches based on context",
-#       "reasoning": "Expert suggested A for scalability, specialist suggested B for performance..."
-#     }
-#   ],
-#   "consensus_level": 0.85,
-#   "summary_insights": "Key insights from aggregation...",
-#   "tokens_used": 1234,
-#   "processing_time_ms": 3456,
-#   "semantic_clusters": [
-#     {
-#       "cluster_id": "cluster_0",
-#       "members": ["expert-agent", "fact-checker"],
-#       "core_concept": "security best practices",
-#       "avg_similarity": 0.92
-#     }
-#   ]
-# }
-
-# Aggregation Strategies:
-#
-# 1. consensus: Find common ground among all inputs
-#    - Best for: Combining similar perspectives
-#    - Identifies agreements and resolves conflicts
-#
-# 2. weighted: Prioritize certain sources based on configured weights
-#    - Best for: When some agents are more authoritative
-#    - Maintains weighted influence in final output
-#
-# 3. semantic: Group inputs by semantic similarity
-#    - Best for: Diverse inputs covering different aspects
-#    - Creates clusters of related information
-#
-# 4. 
hierarchical: Multi-level aggregation (group then synthesize) -# - Best for: Large numbers of inputs -# - Scales better with many agents -# -# 5. rag_based: Retrieval-augmented generation approach -# - Best for: Factual synthesis with source attribution -# - Maintains traceability to original sources - -# Notes: -# - timeout_ms controls how long to wait for inputs before processing -# - Aggregator buffers inputs until timeout, then processes batch -# - Semantic similarity uses text comparison (Levenshtein distance) -# - For production, consider using embedding-based similarity -# - Conflict resolution with LLM provides transparent reasoning -# - Performance metrics tracked: consensus level, processing time, token usage -# - Use higher temperature (0.5-0.7) for creative synthesis -# - Lower temperature (0.3-0.4) for factual aggregation diff --git a/web/content/examples/agents/classifier.yaml b/web/content/examples/agents/classifier.yaml deleted file mode 100644 index 3ca5a39..0000000 --- a/web/content/examples/agents/classifier.yaml +++ /dev/null @@ -1,133 +0,0 @@ -# Classifier Agent Configuration -# AI-powered content classification with semantic understanding -# Supports multi-label classification, few-shot learning, and confidence scoring - -agents: - - name: content-classifier - role: classifier - - # LLM model for semantic classification - # Lower temperature (0.3) recommended for consistent categorization - model: gpt-4-turbo - - # System prompt for classification context - prompt: | - You are an expert content classifier with deep semantic understanding. - Analyze content carefully and assign the most appropriate category. - Consider context, intent, and subtle nuances in the text. 
- - # Classifier-specific configuration - classifier_config: - # Available categories with descriptions and examples - categories: - - name: technical - description: "Technical documentation, code, engineering topics" - keywords: ["api", "code", "architecture", "implementation", "debugging"] - examples: - - "How to implement OAuth 2.0 authentication" - - "Database indexing strategies for performance" - - - name: business - description: "Business strategy, market analysis, financial topics" - keywords: ["revenue", "market", "strategy", "roi", "stakeholder"] - examples: - - "Q4 revenue projections and market trends" - - "Competitive analysis of SaaS pricing models" - - - name: support - description: "Customer support, troubleshooting, help requests" - keywords: ["help", "issue", "problem", "error", "not working"] - examples: - - "I can't log into my account" - - "The application crashes when I upload files" - - - name: feedback - description: "User feedback, feature requests, suggestions" - keywords: ["suggest", "would be nice", "feature request", "improvement"] - examples: - - "It would be great to have dark mode" - - "The search functionality could be faster" - - - name: general - description: "General inquiries, greetings, uncategorized content" - keywords: ["hello", "hi", "thanks", "information", "about"] - examples: - - "Hello, I'd like to know more about your product" - - "Thank you for the quick response" - - # Use embeddings for semantic similarity (requires embedding-capable model) - use_embeddings: false - - # Confidence threshold for accepting classification (0.0 - 1.0) - # Classifications below this threshold may return alternatives - confidence_threshold: 0.7 - - # Allow multiple labels per input - multi_label: false - - # Few-shot examples to improve classification accuracy - few_shot_examples: - - input: "Our API is returning 500 errors intermittently" - category: technical - reason: "Describes a technical API issue requiring engineering investigation" 
- - - input: "What's the pricing for enterprise customers?" - category: business - reason: "Business inquiry about pricing and commercial terms" - - - input: "The app keeps freezing on my iPhone" - category: support - reason: "User experiencing a technical problem needing support" - - # LLM parameters for classification - temperature: 0.3 # Low temperature for consistent classification - max_tokens: 500 # Sufficient for classification + reasoning - - inputs: - - source: content-stream - - outputs: - - target: classification-router - - target: analytics-logger - -# Supporting pipeline agents - - name: content-stream - role: producer - interval: 5s - outputs: - - target: content-classifier - - - name: classification-router - role: logger - inputs: - - source: content-classifier - - - name: analytics-logger - role: logger - inputs: - - source: content-classifier - -# Environment variables required: -# - OPENAI_API_KEY: For GPT models -# - ANTHROPIC_API_KEY: If using Claude models - -# Output format (JSON): -# { -# "category": "technical", -# "confidence": 0.92, -# "reasoning": "The content discusses API implementation details and authentication protocols", -# "alternatives": [ -# {"category": "support", "confidence": 0.45} -# ], -# "tokens_used": 234, -# "prompt_strategy": "few-shot" -# } - -# Notes: -# - Few-shot examples significantly improve accuracy -# - Confidence threshold filters uncertain classifications -# - Categories should be mutually exclusive unless multi_label is enabled -# - The agent tracks performance metrics (accuracy, token usage) -# - Use structured output for reliable JSON responses -# - Keywords and examples help the LLM understand category boundaries -# - Temperature should be low (0.2-0.4) for consistent results diff --git a/web/content/examples/agents/logger.yaml b/web/content/examples/agents/logger.yaml deleted file mode 100644 index d825761..0000000 --- a/web/content/examples/agents/logger.yaml +++ /dev/null @@ -1,72 +0,0 @@ -# Logger Agent 
Configuration -# Simple logging agent that outputs messages to stdout/logs -# Use case: Debugging, monitoring, audit trails - -agents: - # Example 1: Basic logger with single input - - name: basic-logger - role: logger - # Logger receives from one or more sources - inputs: - - source: data-producer - - # Example 2: Multi-input logger (aggregates from multiple sources) - - name: aggregated-logger - role: logger - inputs: - - source: agent-a - - source: agent-b - - source: agent-c - - # Example 3: Specialized audit logger - - name: audit-logger - role: logger - inputs: - - source: security-events - - source: auth-events - -# Supporting agents for the examples - - name: data-producer - role: producer - interval: 2s - outputs: - - target: basic-logger - - - name: agent-a - role: producer - interval: 1s - outputs: - - target: aggregated-logger - - - name: agent-b - role: producer - interval: 1500ms - outputs: - - target: aggregated-logger - - - name: agent-c - role: producer - interval: 2s - outputs: - - target: aggregated-logger - - - name: security-events - role: producer - interval: 5s - outputs: - - target: audit-logger - - - name: auth-events - role: producer - interval: 3s - outputs: - - target: audit-logger - -# Notes: -# - Logger outputs to stdout with [ALERT] prefix -# - Messages include type and payload -# - Useful for debugging agent pipelines -# - Can listen to multiple sources simultaneously -# - No configuration needed - works out of the box -# - Consider external log aggregation for production use -# - Logs are written immediately (no buffering) diff --git a/web/content/examples/agents/planner.yaml b/web/content/examples/agents/planner.yaml deleted file mode 100644 index 1bf8741..0000000 --- a/web/content/examples/agents/planner.yaml +++ /dev/null @@ -1,245 +0,0 @@ -# Planner Agent Configuration -# AI-powered Chain-of-Thought planning and reasoning -# Supports multiple planning strategies: CoT, Tree-of-Thought, ReAct, Hierarchical - -agents: - - name: 
strategic-planner - role: planner - - # Model for complex reasoning and planning - # Recommend using most capable models (GPT-4, Claude Opus) - model: gpt-4-turbo - - prompt: | - You are an expert strategic planner specializing in problem decomposition - and systematic reasoning. Your plans should be: - - Logically sound and complete - - Practically executable - - Risk-aware with contingencies - - Optimized for efficiency - - # Planner-specific configuration - planner_config: - # Planning strategy determines reasoning approach - # Options: chain_of_thought, tree_of_thought, react_planning, - # monte_carlo, backward_chaining, hierarchical_plan - planning_strategy: chain_of_thought - - # Maximum number of steps in the plan - max_steps: 20 - - # Detail level for each step - # Options: minimal, standard, detailed, comprehensive - step_detail_level: detailed - - # Enable backtracking if a step fails - enable_backtracking: true - - # Enable self-critique and plan improvement - enable_self_critique: true - - # Depth of reasoning chains (1-5) - # Higher values create more thorough analysis - reasoning_depth: 3 - - # Identify steps that can run in parallel - parallelizable_steps: true - - # Include alternative approaches for each step - include_alternatives: true - - # LLM parameters - temperature: 0.7 # Higher for creative problem-solving - max_tokens: 2000 # More tokens for detailed planning - - # Few-shot example plans to guide the planner - example_plans: - - problem: "Deploy a new microservice to production" - steps: - - "Review deployment checklist and dependencies" - - "Run comprehensive test suite in staging" - - "Prepare rollback plan and monitoring alerts" - - "Deploy to production with canary release" - - "Monitor metrics and error rates" - - "Gradually increase traffic to new version" - explanation: "Systematic deployment with risk mitigation and monitoring" - - - problem: "Investigate production performance degradation" - steps: - - "Check current metrics and 
error logs" - - "Identify when degradation started" - - "Review recent deployments and changes" - - "Analyze database query performance" - - "Check external service dependencies" - - "Implement fixes and monitor improvement" - explanation: "Root cause analysis with systematic elimination" - - inputs: - - source: problem-stream - - outputs: - - target: plan-executor - - target: plan-archive - -# Example: Tree-of-Thought planning for complex problems - - name: tot-planner - role: planner - model: claude-3-opus - - planner_config: - planning_strategy: tree_of_thought - max_steps: 15 - reasoning_depth: 4 - enable_self_critique: true - include_alternatives: true - temperature: 0.8 # Higher for exploring alternative paths - max_tokens: 2500 - - inputs: - - source: complex-problems - outputs: - - target: tot-results - -# Example: Hierarchical planning for large-scale problems - - name: hierarchical-planner - role: planner - model: gpt-4 - - planner_config: - planning_strategy: hierarchical_plan - max_steps: 30 - step_detail_level: comprehensive - reasoning_depth: 3 - parallelizable_steps: true - temperature: 0.7 - max_tokens: 3000 - - inputs: - - source: large-scale-problems - outputs: - - target: hierarchical-results - -# Supporting pipeline agents - - name: problem-stream - role: producer - interval: 30s - outputs: - - target: strategic-planner - - - name: plan-executor - role: logger - inputs: - - source: strategic-planner - - - name: plan-archive - role: logger - inputs: - - source: strategic-planner - -# Environment variables required: -# - OPENAI_API_KEY: For GPT models -# - ANTHROPIC_API_KEY: For Claude models - -# Output format (JSON): -# { -# "problem": "Original problem statement", -# "analysis": { -# "problem_type": "optimization", -# "domain": "software engineering", -# "constraints": ["time budget", "resource limits"], -# "available_resources": ["cloud infrastructure", "dev team"], -# "key_challenges": ["scalability", "data consistency"], -# "assumptions": 
["existing auth system", "database available"] -# }, -# "steps": [ -# { -# "step_number": 1, -# "action": "Analyze current architecture bottlenecks", -# "reasoning": "Understanding current limitations guides optimization strategy", -# "prerequisites": [], -# "expected_outcome": "List of performance bottlenecks with metrics", -# "complexity": "medium", -# "can_parallelize": false, -# "confidence": 0.95, -# "alternatives": [ -# { -# "action": "Start with load testing first", -# "reasoning": "Empirical data before analysis", -# "trade_offs": "Takes more time but provides hard data" -# } -# ] -# } -# ], -# "execution_strategy": "parallel_optimized", -# "critical_path": [1, 2, 5, 8], -# "parallel_groups": [[3, 4], [6, 7]], -# "backup_plans": [ -# { -# "trigger_condition": "Performance improvement < 20%", -# "alternative_steps": [...], -# "description": "Fallback to horizontal scaling approach" -# } -# ], -# "success_criteria": [ -# "Response time reduced by 50%", -# "Zero downtime during implementation", -# "All tests passing" -# ], -# "risk_assessment": { -# "overall_risk": "medium", -# "risk_factors": [ -# { -# "factor": "Database migration", -# "severity": "high", -# "likelihood": 0.3, -# "impact": "Service downtime possible" -# } -# ], -# "mitigation_steps": ["Backup before migration", "Use blue-green deployment"] -# }, -# "total_complexity": "high", -# "estimated_duration": "2-3 days", -# "tokens_used": 1847, -# "planning_strategy": "chain_of_thought", -# "self_critique": "Plan is comprehensive but step 7 could be broken into sub-steps..." -# } - -# Planning Strategies Explained: -# -# 1. chain_of_thought: Linear reasoning with explicit steps -# - Best for: Well-defined problems with clear sequences -# - Provides transparent reasoning at each step -# -# 2. tree_of_thought: Explores multiple reasoning branches -# - Best for: Problems with multiple solution paths -# - Evaluates alternatives and selects best branch -# -# 3. 
react_planning: Alternates Thought-Action-Observation -# - Best for: Dynamic problems requiring feedback -# - Adapts plan based on intermediate results -# -# 4. backward_chaining: Starts from goal, works backward -# - Best for: Goal-oriented planning -# - Identifies prerequisites recursively -# -# 5. hierarchical_plan: Multi-level decomposition -# - Best for: Complex, large-scale problems -# - Creates high-level plan, then details each step -# -# 6. monte_carlo: Simulates multiple plan executions -# - Best for: Uncertain environments -# - Selects plan with highest success probability - -# Notes: -# - Chain-of-Thought is most reliable for production use -# - Self-critique improves plan quality but uses more tokens -# - Parallelizable steps can reduce execution time significantly -# - Critical path identifies the longest dependency chain -# - Risk assessment helps anticipate and mitigate failures -# - Backup plans provide fallback strategies -# - Success criteria make plans measurable and testable -# - Higher reasoning_depth creates more thorough analysis -# - Example plans significantly improve output quality -# - The planner tracks metrics and learns from execution history -# - Use temperature 0.7-0.8 for creative problem-solving -# - Use temperature 0.4-0.6 for structured, procedural planning diff --git a/web/content/examples/agents/producer.yaml b/web/content/examples/agents/producer.yaml deleted file mode 100644 index d595c0c..0000000 --- a/web/content/examples/agents/producer.yaml +++ /dev/null @@ -1,48 +0,0 @@ -# Producer Agent Configuration -# Generates messages at regular intervals for downstream processing -# Use case: Data streaming, event generation, monitoring - -agents: - - name: data-stream-producer - # Role defines the agent type - producer generates periodic messages - role: producer - - # Interval controls how often messages are produced - # Supported formats: ms, s, m, h (e.g., "500ms", "2s", "1m") - interval: 1s - - # Outputs define where produced 
messages are sent - # Multiple outputs allow fan-out to different consumers - outputs: - - target: data-processor - - target: data-logger - - # Example consumer that receives from the producer - - name: data-processor - role: react - model: gpt-4 - prompt: | - You are a data analyst. Process incoming data streams and extract insights. - inputs: - - source: data-stream-producer - outputs: - - target: results-logger - - - name: data-logger - role: logger - inputs: - - source: data-stream-producer - - - name: results-logger - role: logger - inputs: - - source: data-processor - -# Environment variables required: -# - OPENAI_API_KEY: For the react agent's LLM calls - -# Notes: -# - Producer generates synthetic data with random values -# - Interval can be adjusted based on throughput requirements -# - Each message includes timestamp and unique ID -# - Use for testing pipelines, load generation, or real-time simulations diff --git a/web/content/examples/agents/react.yaml b/web/content/examples/agents/react.yaml deleted file mode 100644 index 691b1c3..0000000 --- a/web/content/examples/agents/react.yaml +++ /dev/null @@ -1,105 +0,0 @@ -# ReAct Agent Configuration -# Reasoning and Acting agent with tool use capabilities -# Implements the ReAct pattern: Thought -> Action -> Observation - -agents: - - name: research-assistant - # ReAct role enables reasoning with action (tool calling) - role: react - - # Model selection - supports OpenAI, Anthropic, xAI, Gemini, Vertex AI - # The provider is auto-detected from the model name - model: gpt-4-turbo - - # System prompt defines the agent's behavior and capabilities - prompt: | - You are an expert research assistant specializing in data analysis and fact-checking. 
- - Your responsibilities: - - Analyze queries systematically using Chain-of-Thought reasoning - - Use available tools to gather information - - Provide evidence-based, well-reasoned answers - - Cite sources and explain your reasoning process - - Always think step-by-step before taking actions. - - # Tools define callable functions the agent can use - # Each tool requires a JSON Schema for input validation - tools: - - name: search_database - description: "Search the knowledge database for relevant information" - input_schema: - type: object - properties: - query: - type: string - description: "Search query string" - max_results: - type: number - description: "Maximum number of results to return" - minimum: 1 - maximum: 100 - required: [query] - - - name: calculate - description: "Perform mathematical calculations" - input_schema: - type: object - properties: - expression: - type: string - description: "Mathematical expression to evaluate" - required: [expression] - - - name: fetch_web_content - description: "Retrieve content from a web URL" - input_schema: - type: object - properties: - url: - type: string - description: "URL to fetch content from" - pattern: "^https?://" - timeout_ms: - type: number - description: "Request timeout in milliseconds" - default: 5000 - required: [url] - - # Input sources - where the agent receives messages - inputs: - - source: user-requests - - # Output targets - where results are sent - outputs: - - target: response-formatter - - target: audit-log - -# Additional agents for the pipeline - - name: user-requests - role: producer - interval: 10s - outputs: - - target: research-assistant - - - name: response-formatter - role: logger - inputs: - - source: research-assistant - - - name: audit-log - role: logger - inputs: - - source: research-assistant - -# Environment variables required: -# - OPENAI_API_KEY: API key for OpenAI models -# - ANTHROPIC_API_KEY: If using Claude models -# - XAI_API_KEY: If using Grok models - -# Notes: -# - 
Tool calls are automatically validated against input_schema -# - The agent will reason about which tools to use based on the query -# - Supports multi-turn conversations with tool results -# - Temperature and max_tokens can be configured via model parameters -# - Failed tool calls are reported back to the LLM for retry/alternative approaches diff --git a/web/content/examples/classifier-aggregator.md b/web/content/examples/classifier-aggregator.md deleted file mode 100644 index aedf8f2..0000000 --- a/web/content/examples/classifier-aggregator.md +++ /dev/null @@ -1,727 +0,0 @@ ---- -title: 'Classifier and Aggregator Examples' -description: "Production-ready examples showcasing AI-powered content classification and multi-agent aggregation workflows." -breadcrumb: 'Examples' -category: 'Examples' -weight: 4 ---- - -This guide showcases two powerful agent types for intelligent content processing and multi-agent coordination: the Classifier agent and the Aggregator agent. These examples demonstrate real-world use cases with complete configurations and explanations. - -**Working Examples**: - -- [classifier-workflow](https://github.com/aixgo-dev/aixgo/tree/main/examples/classifier-workflow) - Customer support ticket classification with intelligent routing -- [aggregator-workflow](https://github.com/aixgo-dev/aixgo/tree/main/examples/aggregator-workflow) - Multi-expert research synthesis with three aggregation strategies - -## Overview - -### Classifier Agent - -The Classifier agent uses LLM-powered semantic understanding to categorize content with confidence scoring, structured outputs, and few-shot learning capabilities. It goes beyond simple keyword matching to understand context and nuance. 
- -**Key Capabilities:** - -- Semantic content categorization -- Confidence scoring (0-1 scale) -- Few-shot learning without fine-tuning -- Multi-label classification support -- Structured JSON outputs with schema validation -- Alternative category suggestions - -### Aggregator Agent - -The Aggregator agent synthesizes outputs from multiple agents using intelligent strategies including consensus building, weighted synthesis, semantic clustering, hierarchical summarization, and RAG-based aggregation. - -**Key Capabilities:** - -- Five aggregation strategies for different use cases -- Automatic conflict detection and resolution -- Semantic clustering of similar outputs -- Consensus scoring and metrics -- Source attribution and weighting -- Performance tracking and observability - -## Classifier Agent Examples - -### Example 1: Customer Support Ticket Classification - -This example demonstrates an AI-powered support ticket routing system that automatically categorizes and prioritizes incoming customer requests. - -#### Use Case - -A SaaS company receives hundreds of support tickets daily across various categories: - -- Technical issues requiring engineering support -- Billing inquiries for the finance team -- Account access problems for the security team -- Feature requests for product management -- Bug reports for quality assurance -- General inquiries for customer success - -The Classifier agent analyzes each ticket's content and automatically routes it to the appropriate team with priority assignment. 
- -#### Configuration - -```yaml -supervisor: - name: support-coordinator - model: gpt-4-turbo - max_rounds: 10 - -agents: - # Producer simulates incoming tickets - - name: ticket-source - role: producer - interval: 2s - outputs: - - target: incoming-tickets - - # Classifier categorizes tickets - - name: ticket-classifier - role: classifier - model: gpt-4-turbo - inputs: - - source: incoming-tickets - outputs: - - target: classified-tickets - - classifier_config: - # Category definitions with rich metadata - categories: - - name: technical_issue - description: "Issues requiring technical troubleshooting or product support including API errors, performance problems, system failures, and integration issues" - keywords: ["error", "bug", "crash", "not working", "500", "timeout", "api", "integration"] - examples: - - "The API returns 500 errors when I try to create a new user" - - "Our application crashes when processing large files" - - "Dashboard performance is very slow with large datasets" - - - name: billing_inquiry - description: "Questions about payments, invoices, pricing, subscriptions, or refunds" - keywords: ["payment", "invoice", "charge", "refund", "subscription", "billing", "price"] - examples: - - "I was charged twice for this month's subscription" - - "Can I get a refund for the unused portion?" - - "What's the difference between Pro and Enterprise pricing?" 
- - - name: account_access - description: "Login problems, password resets, authentication issues, or security concerns" - keywords: ["login", "password", "access", "authentication", "locked", "2fa", "security"] - examples: - - "I can't log into my account after the password reset" - - "My account is locked and I need access urgently" - - "2FA codes aren't being sent to my phone" - - - name: feature_request - description: "Suggestions for new features, enhancements, or product improvements" - keywords: ["feature", "enhancement", "suggestion", "would be nice", "could you add"] - examples: - - "Would be great to have dark mode support" - - "Can you add export to Excel functionality?" - - "Integration with Slack would be very helpful" - - - name: bug_report - description: "Detailed reports of system defects, unexpected behavior, or errors with reproduction steps" - keywords: ["bug", "defect", "incorrect", "broken", "wrong", "unexpected"] - examples: - - "The date picker shows wrong dates in Safari browser" - - "Export function produces corrupted CSV files" - - "Notifications are sent multiple times for the same event" - - - name: general_inquiry - description: "Other questions about products, services, documentation, or company information" - keywords: ["question", "how to", "information", "hours", "documentation"] - examples: - - "What are your business hours for phone support?" - - "Where can I find the API documentation?" - - "Do you offer enterprise-level SLAs?" - - # Minimum confidence threshold - confidence_threshold: 0.7 - - # Disable multi-label for clear routing - multi_label: false - - # Few-shot examples improve accuracy - few_shot_examples: - - input: "My account credentials aren't working after I reset my password" - category: account_access - reason: "User experiencing authentication issues following password reset" - - - input: "Can I downgrade my subscription and get a partial refund?" 
- category: billing_inquiry - reason: "Question about subscription changes and refund policies" - - - input: "The search function returns no results even though the data exists" - category: bug_report - reason: "Specific system defect with reproduction scenario" - - # LLM parameters - temperature: 0.3 # Low temperature for consistent categorization - max_tokens: 500 # Sufficient for reasoning - - # Logger outputs results - - name: classification-logger - role: logger - inputs: - - source: classified-tickets -``` - -#### Expected Output - -```json -{ - "category": "technical_issue", - "confidence": 0.92, - "reasoning": "User describes a specific API error (500) when performing a standard operation (user creation). This requires technical investigation and troubleshooting by the engineering team.", - "alternatives": [ - { - "category": "bug_report", - "confidence": 0.45 - } - ], - "tokens_used": 234, - "timestamp": "2024-01-15T10:30:00Z" -} -``` - -#### Key Features Demonstrated - -- **Semantic Understanding**: Goes beyond keywords to understand context -- **Confidence Scoring**: Each classification includes a confidence metric -- **Alternative Suggestions**: Provides secondary options for ambiguous cases -- **Few-Shot Learning**: Examples improve accuracy without model fine-tuning -- **Structured Outputs**: JSON schema validation ensures reliable parsing - -**View Complete Example**: [examples/classifier-workflow](https://github.com/aixgo-dev/aixgo/tree/main/examples/classifier-workflow) - -### Example 2: Multi-Label Content Tagging - -This example shows how to use the Classifier agent for assigning multiple tags to content items. 
- -#### Configuration Snippet - -```yaml -classifier_config: - # Enable multi-label classification - multi_label: true - - categories: - - name: technical - description: "Contains technical or engineering content" - - - name: urgent - description: "Requires immediate attention or action" - - - name: customer_facing - description: "Should be visible to end customers" - - - name: executive_summary - description: "Suitable for executive-level summaries" - - # Lower threshold for multi-label scenarios - confidence_threshold: 0.6 - - temperature: 0.4 -``` - -#### Expected Output - -```json -{ - "categories": ["technical", "urgent", "customer_facing"], - "confidence_scores": { - "technical": 0.89, - "urgent": 0.76, - "customer_facing": 0.71, - "executive_summary": 0.42 - }, - "reasoning": "Content contains technical details about a critical production issue that affects customers and requires immediate engineering attention.", - "tokens_used": 312 -} -``` - -## Aggregator Agent Examples - -### Example 1: Multi-Expert Research Synthesis - -This example demonstrates how multiple specialized AI agents can analyze a complex topic from different perspectives, with the Aggregator agent synthesizing their insights into a comprehensive report. - -#### Use Case - -A research team is analyzing "The Impact of Large Language Models on Software Development." They deploy six specialized expert agents: - -- **Technical Expert**: Deep technical implementation analysis -- **Data Scientist**: Empirical metrics and statistical analysis -- **Business Analyst**: ROI and economic impact assessment -- **Security Expert**: Security risks and vulnerability analysis -- **Ethics Expert**: Ethical implications and bias considerations -- **Domain Expert**: Practical implementation challenges - -The Aggregator agent combines their perspectives using three different strategies: consensus, semantic clustering, and weighted synthesis. 
- -#### Configuration - -```yaml -supervisor: - name: research-coordinator - model: gpt-4-turbo - max_rounds: 15 - -agents: - # Research topic producer - - name: research-prompt - role: producer - interval: 10s - outputs: - - target: research-topic - - # Expert agents analyze from different perspectives - - name: technical-expert - role: react - model: gpt-4-turbo - prompt: | - You are a senior technical architect with 15 years of experience. - Analyze topics from a deep technical implementation perspective. - Focus on architecture, scalability, performance, and technical feasibility. - inputs: - - source: research-topic - outputs: - - target: expert-analyses - - - name: data-scientist - role: react - model: gpt-4-turbo - prompt: | - You are a data scientist specializing in empirical analysis. - Provide statistical insights, metrics, and data-driven analysis. - Focus on measurable impacts and quantitative assessment. - inputs: - - source: research-topic - outputs: - - target: expert-analyses - - - name: business-analyst - role: react - model: gpt-4-turbo - prompt: | - You are a business analyst focused on ROI and economic impact. - Analyze business implications, cost-benefit, and market dynamics. - Focus on organizational impact and financial considerations. - inputs: - - source: research-topic - outputs: - - target: expert-analyses - - - name: security-expert - role: react - model: gpt-4-turbo - prompt: | - You are a cybersecurity expert specializing in AI systems. - Analyze security risks, vulnerabilities, and compliance considerations. - Focus on threat modeling and security best practices. - inputs: - - source: research-topic - outputs: - - target: expert-analyses - - - name: ethics-expert - role: react - model: gpt-4-turbo - prompt: | - You are an AI ethics expert. - Analyze ethical implications, bias, fairness, and social impact. - Focus on responsible AI practices and ethical considerations. 
- inputs: - - source: research-topic - outputs: - - target: expert-analyses - - - name: domain-expert - role: react - model: gpt-4-turbo - prompt: | - You are a domain expert with practical implementation experience. - Analyze real-world challenges, adoption barriers, and practical considerations. - Focus on implementation feasibility and lessons learned. - inputs: - - source: research-topic - outputs: - - target: expert-analyses - - # Consensus aggregation - find common ground - - name: consensus-aggregator - role: aggregator - model: gpt-4-turbo - inputs: - - source: expert-analyses - outputs: - - target: consensus-synthesis - aggregator_config: - aggregation_strategy: consensus - consensus_threshold: 0.75 - conflict_resolution: llm_mediated - timeout_ms: 5000 - temperature: 0.5 - max_tokens: 2000 - - # Semantic aggregation - cluster by themes - - name: semantic-aggregator - role: aggregator - model: gpt-4-turbo - inputs: - - source: expert-analyses - outputs: - - target: semantic-synthesis - aggregator_config: - aggregation_strategy: semantic - semantic_similarity_threshold: 0.85 - deduplication_method: semantic - timeout_ms: 5000 - temperature: 0.4 - max_tokens: 2000 - - # Weighted aggregation - prioritize expertise - - name: weighted-aggregator - role: aggregator - model: gpt-4-turbo - inputs: - - source: expert-analyses - outputs: - - target: weighted-synthesis - aggregator_config: - aggregation_strategy: weighted - source_weights: - technical-expert: 0.9 - data-scientist: 0.85 - business-analyst: 0.75 - security-expert: 0.95 - ethics-expert: 0.80 - domain-expert: 0.85 - conflict_resolution: highest_weight_wins - timeout_ms: 5000 - temperature: 0.5 - max_tokens: 2000 - - # Final logger - - name: synthesis-logger - role: logger - inputs: - - source: consensus-synthesis - - source: semantic-synthesis - - source: weighted-synthesis -``` - -#### Expected Output - Consensus Strategy - -```json -{ - "strategy": "consensus", - "consensus_level": 0.87, - 
"aggregated_content": "After analyzing expert inputs, there is strong consensus (87%) on the following key findings:\n\n1. LLMs significantly accelerate routine development tasks (40-60% productivity gain)\n2. Security considerations require new scanning and validation approaches\n3. Code quality shows mixed results, requiring human oversight\n4. Business ROI is positive for organizations above certain scale\n5. Ethical considerations around training data and bias remain critical\n\nKey areas of agreement:\n- Transformative impact on software development workflows\n- Need for new tooling and processes\n- Importance of developer training and adaptation\n\nResolved conflicts:\n- Testing approaches: Hybrid strategy combining automated and manual review\n- Adoption timeline: Phased implementation recommended over wholesale replacement", - - "conflicts_resolved": [ - { - "topic": "testing_methodology", - "conflicting_sources": ["technical-expert", "domain-expert"], - "resolution": "Hybrid approach combining both perspectives", - "reasoning": "Technical expert emphasized automated testing capabilities while domain expert highlighted practical limitations. Resolution integrates both automated AI-assisted testing with mandatory human review for critical paths." - } - ], - - "tokens_used": 1850, - "processing_time_ms": 2340 -} -``` - -#### Expected Output - Semantic Strategy - -```json -{ - "strategy": "semantic", - "semantic_clusters": [ - { - "cluster_id": "cluster_0", - "members": ["technical-expert", "domain-expert"], - "core_concept": "Implementation Challenges", - "avg_similarity": 0.89, - "summary": "Both experts emphasize practical implementation barriers including integration complexity, tooling maturity, and organizational readiness." 
- }, - { - "cluster_id": "cluster_1", - "members": ["security-expert", "ethics-expert"], - "core_concept": "Risk and Governance", - "avg_similarity": 0.82, - "summary": "Shared focus on risk management, compliance frameworks, and responsible AI practices." - }, - { - "cluster_id": "cluster_2", - "members": ["business-analyst", "data-scientist"], - "core_concept": "Measurable Impact", - "avg_similarity": 0.85, - "summary": "Quantitative analysis of productivity gains, cost savings, and empirical performance metrics." - } - ], - - "aggregated_content": "Thematic analysis reveals three primary concern areas:\n\n**Implementation Challenges (Technical + Domain)**\n- Integration with existing development workflows\n- Tooling ecosystem maturity gaps\n- Developer training and skill adaptation\n\n**Risk and Governance (Security + Ethics)**\n- New vulnerability classes from AI-generated code\n- Bias in training data and outputs\n- Compliance and regulatory considerations\n\n**Measurable Impact (Business + Data)**\n- 40-60% productivity gains for routine tasks\n- ROI positive at scale (100+ developers)\n- Variable quality requiring oversight investment", - - "tokens_used": 1650, - "processing_time_ms": 2120 -} -``` - -#### Expected Output - Weighted Strategy - -```json -{ - "strategy": "weighted", - "applied_weights": { - "security-expert": 0.95, - "technical-expert": 0.9, - "domain-expert": 0.85, - "data-scientist": 0.85, - "ethics-expert": 0.80, - "business-analyst": 0.75 - }, - - "aggregated_content": "Weighted analysis prioritizing security and technical expertise yields:\n\n**Critical Priority (Security Expert - Weight 0.95)**\n- New attack vectors from AI-generated code require specialized scanning\n- Supply chain risks from training data dependencies\n- Compliance frameworks lagging behind technology adoption\n\n**High Priority (Technical Expert - Weight 0.9)**\n- Architecture patterns evolving toward AI-first designs\n- Performance characteristics differ from 
traditional code\n- Integration complexity higher than anticipated\n\n**Important Considerations (Other Experts)**\n- Business ROI positive with appropriate scale and oversight\n- Ethical implications require ongoing monitoring\n- Practical adoption challenges in legacy systems\n\nRecommendations (weighted by expertise):\n1. Security-first adoption approach (Security Expert)\n2. Phased rollout with monitoring (Technical + Domain Experts)\n3. Investment in training and tooling (All Experts)", - - "tokens_used": 1720, - "processing_time_ms": 2250 -} -``` - -#### Key Features Demonstrated - -- **Multiple Aggregation Strategies**: Consensus, semantic, and weighted approaches -- **Conflict Resolution**: Automatic detection and LLM-mediated resolution -- **Semantic Clustering**: Grouping similar expert perspectives -- **Weighted Synthesis**: Prioritizing high-authority sources -- **Comprehensive Metrics**: Consensus levels, token usage, processing time - -**View Complete Example**: [examples/aggregator-workflow](https://github.com/aixgo-dev/aixgo/tree/main/examples/aggregator-workflow) - -### Example 2: RAG Pipeline with Multiple Retrievers - -This example shows how to use the Aggregator agent in a retrieval-augmented generation (RAG) system with multiple specialized retrievers. 
- -#### Configuration Snippet - -```yaml -agents: - # Multiple retrieval agents - - name: vector-retriever - role: react - model: gpt-4-turbo - prompt: "You are a vector similarity retriever" - outputs: - - target: retrieval-results - - - name: keyword-retriever - role: react - model: gpt-4-turbo - prompt: "You are a keyword-based retriever" - outputs: - - target: retrieval-results - - - name: graph-retriever - role: react - model: gpt-4-turbo - prompt: "You are a graph traversal retriever" - outputs: - - target: retrieval-results - - # RAG aggregator synthesizes retrieved content - - name: rag-synthesizer - role: aggregator - model: gpt-4-turbo - inputs: - - source: retrieval-results - outputs: - - target: final-answer - aggregator_config: - aggregation_strategy: rag_based - max_input_sources: 10 - timeout_ms: 3000 - temperature: 0.7 - max_tokens: 2000 -``` - -## When to Use Each Strategy - -### Classifier Agent - -**Use Classifier When:** - -- You need to categorize or route content automatically -- Semantic understanding is important (beyond keyword matching) -- You want confidence scores for quality assessment -- Multi-label tagging is required -- Few-shot learning can improve accuracy without fine-tuning - -**Example Scenarios:** - -- Customer support ticket routing -- Content moderation and filtering -- Intent detection in chatbots -- Document classification and organization -- Sentiment analysis with custom categories - -### Aggregator Agent Strategies - -#### Consensus Strategy - -**Use When:** - -- You need to find common ground among diverse opinions -- Conflict resolution and transparency are important -- Building agreement for decision-making -- Identifying universally accepted insights - -**Example Scenarios:** - -- Multi-expert research synthesis -- Fact verification across sources -- Team decision-making processes -- Policy development with stakeholder input - -#### Weighted Strategy - -**Use When:** - -- Some agents have more expertise or authority 
-- Certain perspectives are more critical -- You need to balance expertise with inclusion -- High-stakes decisions requiring domain expertise - -**Example Scenarios:** - -- Expert panel analysis with varying credentials -- Prioritizing technical over non-technical input -- Confidence-based output mixing -- Quality-weighted content aggregation - -#### Semantic Strategy - -**Use When:** - -- Understanding thematic relationships is crucial -- You want to preserve conceptual groupings -- Dealing with complex, multi-faceted topics -- Many agents (5+) producing varied outputs -- Deduplication of similar ideas is needed - -**Example Scenarios:** - -- Large-scale research synthesis -- Theme extraction from diverse sources -- Perspective identification and clustering -- Knowledge map creation - -#### Hierarchical Strategy - -**Use When:** - -- Dealing with very large numbers of agents (10+) -- Multi-level summarization is needed -- Token efficiency is critical -- Recursive aggregation provides better results - -**Example Scenarios:** - -- Enterprise-scale multi-agent systems -- Cascading summarization pipelines -- Cost-optimized large-scale aggregation - -#### RAG-Based Strategy - -**Use When:** - -- You have a knowledge base to reference -- Source attribution is important -- Fact-checking against documentation is needed -- Question-answering with citations - -**Example Scenarios:** - -- Multi-retriever RAG systems -- Document-based Q&A -- Research with source tracking -- Compliance-oriented systems requiring citations - -## Performance Considerations - -### Classifier Agent - -- **Token Usage**: 200-500 tokens per classification - - Add 150-300 tokens for few-shot examples - - Add 100-200 tokens for 10+ categories -- **Latency**: 500ms-2s depending on model -- **Cost Optimization**: Use GPT-4o-mini for high-volume classification - -### Aggregator Agent - -- **Token Usage** (varies by strategy and agent count): - - 2-3 agents: 500-1000 tokens - - 4-6 agents: 1000-1500 
tokens - - 7-10 agents: 1500-2500 tokens - - 10+ agents: Use hierarchical strategy (1000-2000 tokens) -- **Latency**: 1s-5s depending on strategy and agent count -- **Timeout Configuration**: - - Fast agents (1-2s): `timeout_ms: 3000` - - Standard agents (3-5s): `timeout_ms: 5000` - - Complex agents (5-10s): `timeout_ms: 10000` - -## Best Practices - -### Classifier Best Practices - -1. **Category Design** - - Create clear, mutually exclusive categories (unless multi-label) - - Provide detailed descriptions with boundary explanations - - Include 3-5 diverse keywords per category - - Add 2-3 representative examples - -2. **Confidence Tuning** - - 0.5-0.6: Exploratory use - - 0.7-0.8: Production baseline - - 0.85+: High-stakes scenarios - -3. **Token Optimization** - - Use concise category descriptions - - Limit few-shot examples to 3 per category - - Set appropriate max_tokens (500 for classification) - -### Aggregator Best Practices - -1. **Strategy Selection** - - Consensus: Balanced synthesis with conflict transparency - - Weighted: Expert prioritization - - Semantic: Deduplication and theme extraction - - Hierarchical: Scalability (10+ agents) - - RAG-based: Citation preservation - -2. **Timeout Configuration** - - Set based on expected agent response times - - Add buffer for network latency - - Monitor timeout expiry rates - -3. 
**Input Management** - - Latest message from each source is used - - Buffering is automatic and thread-safe - - No manual buffer management needed - -## Next Steps - -- **[Agent Types Guide](/guides/agent-types)** - Comprehensive agent documentation -- **[Multi-Agent Orchestration](/guides/multi-agent-orchestration)** - Coordination patterns -- **[Classifier Example Source](https://github.com/aixgo-dev/aixgo/tree/main/examples/classifier-workflow)** - Complete implementation -- **[Aggregator Example Source](https://github.com/aixgo-dev/aixgo/tree/main/examples/aggregator-workflow)** - Complete implementation -- **[Agent Framework Code](https://github.com/aixgo-dev/aixgo/tree/main/agents)** - Source code reference - -## Additional Resources - -- [Classifier Agent Documentation](https://github.com/aixgo-dev/aixgo/blob/main/agents/README.md#classifier-agent) -- [Aggregator Agent Documentation](https://github.com/aixgo-dev/aixgo/blob/main/agents/README.md#aggregator-agent) -- [Complete Examples Directory](https://github.com/aixgo-dev/aixgo/tree/main/examples) -- [API Reference](https://pkg.go.dev/github.com/aixgo-dev/aixgo) diff --git a/web/content/examples/llm-providers/anthropic.yaml b/web/content/examples/llm-providers/anthropic.yaml deleted file mode 100644 index 480751b..0000000 --- a/web/content/examples/llm-providers/anthropic.yaml +++ /dev/null @@ -1,216 +0,0 @@ -# Anthropic Claude Provider Configuration -# Supports Claude 3 family: Haiku, Sonnet, and Opus - -agents: - - name: claude-opus-agent - role: react - - # Claude model selection - # Available: claude-3-haiku, claude-3-sonnet, claude-3-opus - model: claude-3-opus-20240229 - - prompt: | - You are Claude, an AI assistant created by Anthropic. - You are thoughtful, helpful, and provide well-reasoned responses. - You acknowledge uncertainty when appropriate. 
- - tools: - - name: analyze_code - description: "Analyze code for potential issues and improvements" - input_schema: - type: object - properties: - code: - type: string - description: "Source code to analyze" - language: - type: string - description: "Programming language" - required: [code, language] - - - name: research_topic - description: "Research a technical topic in depth" - input_schema: - type: object - properties: - topic: - type: string - description: "Topic to research" - depth: - type: string - enum: ["overview", "detailed", "comprehensive"] - required: [topic] - - inputs: - - source: complex-queries - outputs: - - target: opus-responses - -# Example: Fast and cost-effective with Claude Haiku - - name: claude-haiku-agent - role: react - model: claude-3-haiku-20240307 - - prompt: | - You are a quick-response assistant focused on efficiency. - Provide accurate, concise answers. - - inputs: - - source: simple-queries - outputs: - - target: haiku-responses - -# Example: Balanced performance with Claude Sonnet - - name: claude-sonnet-agent - role: react - model: claude-3-sonnet-20240229 - - prompt: | - You are a balanced AI assistant providing thorough yet efficient responses. - - inputs: - - source: standard-queries - outputs: - - target: sonnet-responses - -# Example: Long context with extended thinking - - name: claude-extended-agent - role: react - model: claude-3-opus-20240229 - - prompt: | - You are analyzing complex documents with extended context. - Take time to think through problems systematically. - Use the full context window when needed. 
- - tools: - - name: analyze_document - description: "Deep analysis of long documents" - input_schema: - type: object - properties: - document: - type: string - description: "Document text (up to 200K tokens)" - analysis_type: - type: string - enum: ["summary", "critique", "comparison"] - required: [document, analysis_type] - - inputs: - - source: document-stream - outputs: - - target: analysis-results - -# Supporting agents - - name: complex-queries - role: producer - interval: 30s - outputs: - - target: claude-opus-agent - - - name: opus-responses - role: logger - inputs: - - source: claude-opus-agent - -# Environment variables required: -# - ANTHROPIC_API_KEY: Your Anthropic API key (required) -# Get from: https://console.anthropic.com/settings/keys -# -# - ANTHROPIC_BASE_URL: Custom base URL (optional) -# Default: https://api.anthropic.com/v1 - -# Configuration via environment: -# export ANTHROPIC_API_KEY="sk-ant-..." -# export ANTHROPIC_BASE_URL="https://api.anthropic.com/v1" # optional - -# Model Comparison: -# -# Claude 3 Haiku: -# - Cost: $0.25 per MTok input, $1.25 per MTok output -# - Speed: Fastest (~1 second typical) -# - Context: 200K tokens -# - Use for: Simple tasks, high volume, real-time responses -# -# Claude 3 Sonnet: -# - Cost: $3 per MTok input, $15 per MTok output -# - Speed: Fast (~2-3 seconds typical) -# - Context: 200K tokens -# - Use for: Balanced performance and cost, most workloads -# -# Claude 3 Opus: -# - Cost: $15 per MTok input, $75 per MTok output -# - Speed: Moderate (~5-7 seconds typical) -# - Context: 200K tokens -# - Use for: Complex reasoning, highest accuracy, research - -# Key Features: -# - Extended Context: 200K tokens across all models -# - Vision: Image understanding in all models -# - Tool Use: Sophisticated function calling -# - Thinking: Can show reasoning process -# - Constitutional AI: Built-in safety and helpfulness -# - JSON Mode: Reliable structured outputs -# - Streaming: Real-time response generation - -# Rate 
Limits (varies by tier): -# Tier 1 (Default): -# - Haiku: 50 RPM, 25K TPM -# - Sonnet: 50 RPM, 20K TPM -# - Opus: 50 RPM, 10K TPM -# -# Higher tiers available with increased usage - -# Best Practices: -# 1. Use Haiku for speed-critical, simple tasks -# 2. Use Sonnet for most production workloads -# 3. Use Opus for complex reasoning and research -# 4. Leverage 200K context for long documents -# 5. Use thinking prompts for complex reasoning: -# "Think step by step" or "Show your reasoning" -# 6. System prompts go in separate field (not messages) -# 7. Cache long system prompts to reduce costs -# 8. Use streaming for better user experience -# 9. Tool definitions should be clear and specific -# 10. Monitor usage to stay within rate limits - -# Advanced Features: -# -# Extended Thinking: -# - Ask Claude to "think step by step" -# - Request reasoning before answers -# - Useful for math, logic, complex analysis -# -# Constitutional AI: -# - Built-in alignment with human values -# - Can refuse harmful requests -# - More reliable than GPT for safety -# -# Long Context: -# - 200K tokens = ~500 pages of text -# - Maintains quality across full context -# - Excellent for document analysis -# -# Vision: -# - Upload images directly -# - Supports diagrams, charts, screenshots -# - Can describe and analyze visual content - -# Anthropic-Specific Considerations: -# - System prompts are separate from messages -# - Tool use format differs from OpenAI -# - Provider auto-handles format conversion -# - Streaming uses Server-Sent Events (SSE) -# - Extended context maintained better than GPT-4 -# - More conservative than GPT (may decline more often) -# - Better at acknowledging uncertainty - -# Notes: -# - API key automatically loaded from ANTHROPIC_API_KEY -# - Provider auto-detected from model name (claude-*) -# - System prompts converted to Anthropic format -# - Tools formatted as Anthropic tool definitions -# - Supports both sync and streaming -# - Automatic retry with exponential 
backoff -# - JSON schema validation before API calls diff --git a/web/content/examples/llm-providers/gemini.yaml b/web/content/examples/llm-providers/gemini.yaml deleted file mode 100644 index 69c650c..0000000 --- a/web/content/examples/llm-providers/gemini.yaml +++ /dev/null @@ -1,227 +0,0 @@ -# Google Gemini Provider Configuration -# Supports Gemini Pro and Gemini Pro Vision via Google AI Studio - -agents: - - name: gemini-pro-agent - role: react - - # Gemini model selection - # Available: gemini-pro, gemini-pro-vision - model: gemini-pro - - prompt: | - You are Gemini, Google's advanced AI assistant. - Provide accurate, helpful, and comprehensive responses. - - tools: - - name: search_knowledge - description: "Search knowledge base for information" - input_schema: - type: object - properties: - query: - type: string - description: "Search query" - domain: - type: string - description: "Domain to search (tech, science, general)" - required: [query] - - - name: generate_content - description: "Generate creative content" - input_schema: - type: object - properties: - content_type: - type: string - enum: ["article", "summary", "outline"] - topic: - type: string - style: - type: string - enum: ["formal", "casual", "technical"] - required: [content_type, topic] - - inputs: - - source: user-queries - outputs: - - target: gemini-responses - -# Example: Vision capabilities with Gemini Pro Vision - - name: gemini-vision-agent - role: react - model: gemini-pro-vision - - prompt: | - You can analyze images and visual content. - Describe what you see accurately and answer questions about images. 
- - tools: - - name: analyze_image - description: "Analyze image content and answer questions" - input_schema: - type: object - properties: - image_url: - type: string - description: "URL of the image to analyze" - question: - type: string - description: "Question about the image" - required: [image_url] - - inputs: - - source: image-queries - outputs: - - target: vision-responses - -# Example: Multi-turn conversation with context - - name: gemini-conversational-agent - role: react - model: gemini-pro - - prompt: | - You maintain conversation context and provide coherent multi-turn interactions. - Reference previous messages when relevant. - - inputs: - - source: conversation-stream - outputs: - - target: conversation-responses - -# Supporting agents - - name: user-queries - role: producer - interval: 10s - outputs: - - target: gemini-pro-agent - - - name: gemini-responses - role: logger - inputs: - - source: gemini-pro-agent - -# Environment variables required: -# - GOOGLE_API_KEY: Your Google AI Studio API key (required) -# Get from: https://makersuite.google.com/app/apikey -# -# - GEMINI_BASE_URL: Custom base URL (optional) -# Default: https://generativelanguage.googleapis.com/v1 - -# Configuration via environment: -# export GOOGLE_API_KEY="AIza..." 
-# export GEMINI_BASE_URL="https://generativelanguage.googleapis.com/v1" # optional - -# Model Comparison: -# -# Gemini Pro: -# - Cost: Free tier available, then usage-based -# - Speed: Fast (2-3 seconds typical) -# - Context: 32K tokens -# - Use for: Text generation, reasoning, general tasks -# - Features: Function calling, multi-turn chat -# -# Gemini Pro Vision: -# - Cost: Same as Gemini Pro -# - Speed: Moderate (3-5 seconds typical) -# - Context: 16K tokens + image -# - Use for: Image understanding, visual Q&A -# - Features: Image analysis, OCR, scene understanding - -# Key Features: -# - Free Tier: Generous free quota for development -# - Multimodal: Native image understanding (Vision) -# - Function Calling: Tool use capabilities -# - Safety Settings: Configurable content filtering -# - Streaming: Real-time response generation -# - Grounding: Can cite sources (experimental) - -# Rate Limits (Free Tier): -# - 60 requests per minute -# - 1,500 requests per day -# - Paid tier has higher limits - -# Free Tier Quotas: -# - Gemini Pro: 60 RPM, 32K tokens per request -# - Gemini Pro Vision: 60 RPM, includes image input -# - No cost for moderate usage - -# Best Practices: -# 1. Use free tier for development and testing -# 2. Gemini Pro for text-only tasks -# 3. Gemini Pro Vision when images are involved -# 4. Set safety settings appropriately for your use case -# 5. Use streaming for better UX -# 6. Implement retry logic for rate limits -# 7. Cache responses when possible -# 8. Monitor usage to stay within quotas -# 9. Use function calling for structured outputs -# 10. 
Test thoroughly before upgrading to paid tier - -# Safety Settings: -# Gemini has built-in safety filters for: -# - Harassment -# - Hate speech -# - Sexually explicit content -# - Dangerous content -# -# Can be configured per request or use defaults - -# Multimodal Capabilities (Vision): -# - Image formats: PNG, JPEG, WebP -# - Max image size: 4MB -# - Multiple images per request -# - Image + text prompts -# - OCR from images -# - Chart and diagram understanding -# - Scene description - -# Function Calling: -# - Similar to OpenAI's tool use -# - JSON schema for parameters -# - Automatic parameter extraction -# - Multiple function calls per turn -# - Well-suited for agent workflows - -# Comparison with Other Providers: -# -# vs OpenAI: -# + Free tier available -# + Native multimodal (Vision) -# - Smaller context window (32K vs 128K) -# - Less mature ecosystem -# -# vs Anthropic: -# + Free tier -# + Google ecosystem integration -# - Smaller context (32K vs 200K) -# - Less sophisticated reasoning -# -# vs Vertex AI: -# + Simpler API (no GCP setup) -# + Easier for development -# - Lower rate limits -# - No enterprise features - -# Google AI Studio vs Vertex AI: -# This configuration uses Google AI Studio (makersuite.google.com): -# - Simpler setup (just API key) -# - Free tier available -# - Good for development -# -# For production, consider Vertex AI: -# - Enterprise SLAs -# - Higher rate limits -# - More models available -# - GCP integration -# - See vertexai.yaml for Vertex AI configuration - -# Notes: -# - API key automatically loaded from GOOGLE_API_KEY -# - Provider auto-detected from model name (gemini-*) -# - Free tier is generous for development -# - Vision model can handle both text and images -# - Safety filters may block some content -# - Streaming responses improve perceived performance -# - Function calling uses similar format to OpenAI -# - Consider Vertex AI for production deployments diff --git a/web/content/examples/llm-providers/huggingface.yaml
b/web/content/examples/llm-providers/huggingface.yaml deleted file mode 100644 index 2c3034e..0000000 --- a/web/content/examples/llm-providers/huggingface.yaml +++ /dev/null @@ -1,309 +0,0 @@ -# HuggingFace Provider Configuration -# Access to open-source models via HuggingFace Inference API -# Supports serverless and dedicated endpoints - -agents: - - name: huggingface-llama-agent - role: react - - # HuggingFace model selection (organization/model format) - # Available: meta-llama/Llama-2-7b-chat-hf, mistralai/Mistral-7B-Instruct-v0.2, etc. - model: meta-llama/Llama-2-13b-chat-hf - - prompt: | - You are a helpful AI assistant powered by Llama 2. - Provide clear, accurate, and thoughtful responses. - - tools: - - name: search_documents - description: "Search through documentation" - input_schema: - type: object - properties: - query: - type: string - description: "Search query" - category: - type: string - enum: ["technical", "general", "api"] - required: [query] - - inputs: - - source: user-queries - outputs: - - target: llama-responses - -# Example: Mistral for fast, efficient inference - - name: huggingface-mistral-agent - role: react - model: mistralai/Mistral-7B-Instruct-v0.2 - - prompt: | - You are an efficient AI assistant powered by Mistral. - Provide concise, accurate responses optimized for speed. - - inputs: - - source: quick-queries - outputs: - - target: mistral-responses - -# Example: Code generation with CodeLlama - - name: huggingface-codellama-agent - role: react - model: codellama/CodeLlama-13b-Instruct-hf - - prompt: | - You are a code generation specialist powered by CodeLlama. - Generate clean, efficient, well-documented code. 
- - tools: - - name: generate_code - description: "Generate code based on requirements" - input_schema: - type: object - properties: - requirements: - type: string - description: "Code requirements" - language: - type: string - description: "Programming language" - framework: - type: string - description: "Framework (if applicable)" - required: [requirements, language] - - inputs: - - source: code-requests - outputs: - - target: code-results - -# Example: Dedicated inference endpoint for production - - name: huggingface-dedicated-agent - role: react - model: custom-deployment # Your deployed model name - - prompt: | - You are running on a dedicated HuggingFace inference endpoint - for production workloads with guaranteed performance and availability. - - inputs: - - source: production-queries - outputs: - - target: production-responses - -# Supporting agents - - name: user-queries - role: producer - interval: 10s - outputs: - - target: huggingface-llama-agent - - - name: llama-responses - role: logger - inputs: - - source: huggingface-llama-agent - -# Environment variables required: -# - HUGGINGFACE_API_KEY: Your HuggingFace API token (required) -# Get from: https://huggingface.co/settings/tokens -# -# - HUGGINGFACE_ENDPOINT: Custom inference endpoint URL (optional) -# For dedicated endpoints, use the endpoint URL -# Default: HuggingFace serverless inference API - -# Configuration via environment: -# export HUGGINGFACE_API_KEY="hf_..." 
-# export HUGGINGFACE_ENDPOINT="https://your-endpoint.endpoints.huggingface.cloud" # optional - -# Popular Model Options: -# -# Llama 2 Family: -# - meta-llama/Llama-2-7b-chat-hf: Fast, efficient, 7B params -# - meta-llama/Llama-2-13b-chat-hf: Balanced, 13B params -# - meta-llama/Llama-2-70b-chat-hf: Most capable, 70B params -# - Use for: General chat, instruction following -# -# Mistral Family: -# - mistralai/Mistral-7B-Instruct-v0.2: Fast, efficient, high quality -# - mistralai/Mixtral-8x7B-Instruct-v0.1: MoE, very capable -# - Use for: Speed + quality balance, efficiency -# -# Code Models: -# - codellama/CodeLlama-7b-Instruct-hf: Code generation, 7B -# - codellama/CodeLlama-13b-Instruct-hf: Better code quality, 13B -# - codellama/CodeLlama-34b-Instruct-hf: Most capable code model -# - Use for: Code generation, explanation, debugging -# -# Specialized Models: -# - tiiuae/falcon-7b-instruct: Fast, efficient -# - EleutherAI/gpt-j-6B: General purpose, open license -# - bigscience/bloom-7b1: Multilingual -# - google/flan-t5-xxl: Instruction-tuned, versatile - -# Deployment Options: -# -# 1. Serverless Inference (Free/Pay-per-use): -# - Use model name directly (e.g., meta-llama/Llama-2-7b-chat-hf) -# - No dedicated resources -# - Pay only for what you use -# - Shared infrastructure -# - Cold starts possible -# - Rate limits apply -# - Good for: Development, low-volume production -# -# 2. 
Dedicated Inference Endpoints (Production): -# - Deploy model to dedicated GPU -# - Guaranteed resources and uptime -# - No cold starts -# - Custom scaling -# - Higher throughput -# - Set HUGGINGFACE_ENDPOINT to endpoint URL -# - Good for: High-volume production, SLA requirements - -# Pricing: -# -# Serverless: -# - Free tier: Limited requests per month -# - Pay-per-use: ~$0.06 per 1K tokens (varies by model) -# - Rate limited -# -# Dedicated Endpoints: -# - GPU rental: Starting at ~$1/hour for small GPUs -# - ~$4-6/hour for medium GPUs (A10G, T4) -# - ~$10-15/hour for large GPUs (A100) -# - Auto-scaling available -# - Custom pricing for enterprise - -# Rate Limits (Serverless): -# Free tier: -# - Varies by model -# - Typically 1-10 requests per second -# - Daily quotas apply -# -# Pro tier: -# - Higher rate limits -# - ~$9/month subscription -# - Better quotas - -# Best Practices: -# 1. Start with serverless for development -# 2. Use dedicated endpoints for production -# 3. Choose model size based on latency/quality tradeoff -# 4. Implement request queuing for rate limits -# 5. Cache responses when possible -# 6. Monitor token usage and costs -# 7. Use smaller models when sufficient (7B vs 70B) -# 8. Implement fallback for rate limit errors -# 9. Test thoroughly before production deployment -# 10. Consider self-hosting for high volume - -# Model Selection Guide: -# -# For Speed (lowest latency): -# - Mistral-7B-Instruct (best speed/quality) -# - Llama-2-7b-chat (good speed) -# - falcon-7b-instruct (very fast) -# -# For Quality (best responses): -# - Llama-2-70b-chat (highest quality) -# - Mixtral-8x7B (excellent quality, MoE) -# - Llama-2-13b-chat (good balance) -# -# For Code: -# - CodeLlama-34b-Instruct (best code quality) -# - CodeLlama-13b-Instruct (good balance) -# - CodeLlama-7b-Instruct (fast code gen) -# -# For Cost Efficiency: -# - Mistral-7B (best value) -# - Llama-2-7b (good efficiency) -# - Falcon-7b (low cost) - -# Dedicated Endpoint Setup: -# 1. 
Go to https://huggingface.co/inference-endpoints -# 2. Create new endpoint -# 3. Select model (or upload custom) -# 4. Choose GPU type and region -# 5. Configure scaling -# 6. Deploy endpoint -# 7. Get endpoint URL -# 8. Set HUGGINGFACE_ENDPOINT environment variable - -# Custom Models: -# You can deploy your own fine-tuned models: -# 1. Upload model to HuggingFace Hub -# 2. Create inference endpoint -# 3. Use model name or endpoint URL -# 4. Same API as public models - -# Advantages of HuggingFace: -# + Open-source models (transparency) -# + Model flexibility (many options) -# + Cost-effective (especially serverless) -# + Self-hosting possible -# + No vendor lock-in -# + Active community -# + Custom model deployment -# - May require more tuning than commercial APIs -# - Smaller context windows than GPT-4/Claude -# - Less sophisticated out-of-box than commercial models - -# Context Window Sizes: -# - Llama 2: 4K tokens -# - Mistral: 8K tokens (some variants 32K) -# - CodeLlama: 16K tokens -# - Mixtral: 32K tokens -# -# Compare to: -# - GPT-4-turbo: 128K -# - Claude 3: 200K -# - Gemini Pro: 32K - -# Function Calling: -# Open-source models have varying function calling support: -# - Some models support it natively -# - Others require prompt engineering -# - Provider handles conversion where possible -# - Test thoroughly for your use case - -# Performance Optimization: -# 1. Use smaller models when sufficient -# 2. Batch requests when possible -# 3. Cache common queries -# 4. Dedicate endpoints for high volume -# 5. Use quantized models (GPTQ, AWQ) for speed -# 6. Implement request queuing -# 7. Monitor and optimize prompts -# 8. 
Consider self-hosting for cost at scale - -# Self-Hosting vs HuggingFace Inference: -# -# HuggingFace Inference: -# + No infrastructure management -# + Easy to get started -# + Pay-per-use pricing -# - Higher cost at scale -# - Shared resources (serverless) -# - Rate limits -# -# Self-Hosting: -# + Full control -# + Lower cost at high volume -# + No rate limits -# + Data privacy -# - Requires GPU infrastructure -# - DevOps overhead -# - Upfront costs - -# Notes: -# - API key automatically loaded from HUGGINGFACE_API_KEY -# - Provider auto-detected from model name format (org/model) -# - Supports both serverless and dedicated endpoints -# - Open-source models provide transparency and control -# - Model quality varies - test before production use -# - Consider dedicated endpoints for production SLAs -# - Smaller models (7B) can be very cost-effective -# - Custom model deployment supported -# - Self-hosting is an option for high volume -# - Active community and model ecosystem diff --git a/web/content/examples/llm-providers/openai.yaml b/web/content/examples/llm-providers/openai.yaml deleted file mode 100644 index 9205518..0000000 --- a/web/content/examples/llm-providers/openai.yaml +++ /dev/null @@ -1,144 +0,0 @@ -# OpenAI Provider Configuration -# Supports GPT-3.5, GPT-4, and OpenAI-compatible endpoints - -agents: - - name: openai-agent - role: react - - # OpenAI model selection - # Available models: gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4-turbo-preview - model: gpt-4-turbo - - prompt: | - You are a helpful AI assistant powered by OpenAI's GPT-4. - Provide clear, accurate, and helpful responses. 
- - tools: - - name: get_weather - description: "Get current weather for a location" - input_schema: - type: object - properties: - location: - type: string - description: "City name or coordinates" - units: - type: string - enum: ["celsius", "fahrenheit"] - default: "celsius" - required: [location] - - inputs: - - source: user-input - outputs: - - target: response-handler - -# Example: Cost-optimized with GPT-3.5-turbo - - name: cost-efficient-agent - role: react - model: gpt-3.5-turbo # Cheaper, faster for simpler tasks - - prompt: | - You are a quick-response assistant optimized for efficiency. - Provide concise, accurate answers. - - inputs: - - source: simple-queries - outputs: - - target: quick-responses - -# Example: OpenAI-compatible endpoint (e.g., LocalAI, Ollama with OpenAI API) - - name: compatible-endpoint-agent - role: react - model: custom-model # Model name for compatible endpoint - - prompt: | - You are running on an OpenAI-compatible endpoint. - - inputs: - - source: custom-input - outputs: - - target: custom-output - -# Supporting agents - - name: user-input - role: producer - interval: 10s - outputs: - - target: openai-agent - - - name: response-handler - role: logger - inputs: - - source: openai-agent - -# Environment variables required: -# - OPENAI_API_KEY: Your OpenAI API key (required) -# Get from: https://platform.openai.com/api-keys -# -# - OPENAI_BASE_URL: Custom base URL (optional) -# Default: https://api.openai.com/v1 -# Use for OpenAI-compatible endpoints (LocalAI, Ollama, etc.) -# Example: http://localhost:11434/v1 - -# Configuration via environment: -# export OPENAI_API_KEY="sk-..." 
-# export OPENAI_BASE_URL="https://api.openai.com/v1" # optional - -# Model Comparison: -# -# GPT-3.5-turbo: -# - Cost: $0.001 per 1K input tokens, $0.002 per 1K output tokens -# - Speed: Fast (1-2 seconds typical) -# - Use for: Simple tasks, high volume, cost optimization -# -# GPT-4: -# - Cost: $0.03 per 1K input tokens, $0.06 per 1K output tokens -# - Speed: Moderate (3-5 seconds typical) -# - Use for: Complex reasoning, high accuracy requirements -# -# GPT-4-turbo: -# - Cost: $0.01 per 1K input tokens, $0.03 per 1K output tokens -# - Speed: Fast (2-3 seconds typical) -# - Use for: Best balance of cost, speed, and capability -# - Context: 128K tokens - -# Features: -# - Function calling / tool use -# - JSON mode for structured outputs -# - Vision (GPT-4V for image inputs) -# - Streaming responses -# - Fine-tuning support (GPT-3.5) - -# Rate Limits (Tier 1): -# - GPT-3.5-turbo: 3,500 RPM, 90,000 TPM -# - GPT-4: 500 RPM, 10,000 TPM -# - GPT-4-turbo: 500 RPM, 30,000 TPM - -# Best Practices: -# 1. Use GPT-3.5 for simple, high-volume tasks -# 2. Use GPT-4 for complex reasoning and accuracy -# 3. Set appropriate temperature (0.0-2.0): -# - 0.0-0.3: Factual, deterministic -# - 0.7-1.0: Creative, varied -# 4. Implement retry logic for rate limits -# 5. Monitor token usage and costs -# 6. Use streaming for better UX on long responses -# 7. Cache common prompts to reduce API calls - -# OpenAI-Compatible Endpoints: -# The OpenAI provider works with any OpenAI-compatible API: -# - LocalAI: https://localai.io -# - Ollama: https://ollama.ai (serves an OpenAI-compatible API at /v1) -# - vLLM: https://vllm.ai -# - Text Generation WebUI: https://github.com/oobabooga/text-generation-webui -# -# Set OPENAI_BASE_URL to your endpoint and use appropriate model names.
- -# Notes: -# - API key is automatically loaded from OPENAI_API_KEY environment variable -# - Provider is auto-detected from model name (gpt-*) -# - Supports both synchronous and streaming responses -# - Tool calls are automatically formatted for OpenAI's function calling -# - Retries with exponential backoff on rate limit errors -# - Validates tool schemas before sending to API diff --git a/web/content/examples/llm-providers/vertexai.yaml b/web/content/examples/llm-providers/vertexai.yaml deleted file mode 100644 index fcbab65..0000000 --- a/web/content/examples/llm-providers/vertexai.yaml +++ /dev/null @@ -1,292 +0,0 @@ -# Google Vertex AI Provider Configuration -# Enterprise-grade Gemini and PaLM models on Google Cloud Platform -# Requires GCP project and authentication - -agents: - - name: vertexai-gemini-agent - role: react - - # Vertex AI Gemini models - # Available: gemini-pro, gemini-pro-vision, gemini-ultra (limited access) - model: gemini-pro - - prompt: | - You are an enterprise AI assistant running on Google Cloud Vertex AI. - Provide reliable, secure, and compliant responses. 
- - tools: - - name: query_database - description: "Query enterprise database" - input_schema: - type: object - properties: - query: - type: string - description: "SQL query or search term" - database: - type: string - description: "Database identifier" - required: [query, database] - - - name: generate_report - description: "Generate business report" - input_schema: - type: object - properties: - report_type: - type: string - enum: ["financial", "operational", "technical"] - time_period: - type: string - description: "Time period for report (e.g., 'Q1 2024')" - metrics: - type: array - items: - type: string - required: [report_type, time_period] - - inputs: - - source: business-queries - outputs: - - target: enterprise-responses - -# Example: PaLM 2 for text generation - - name: vertexai-palm-agent - role: react - model: text-bison@002 # PaLM 2 text model - - prompt: | - You are a text generation specialist using PaLM 2. - Generate high-quality, coherent text for various purposes. - - inputs: - - source: generation-requests - outputs: - - target: generated-content - -# Example: Code generation with Codey - - name: vertexai-code-agent - role: react - model: code-bison@002 # Codey for code generation - - prompt: | - You are a code generation expert using Vertex AI Codey. - Generate clean, efficient, well-documented code. - - tools: - - name: explain_code - description: "Explain code functionality" - input_schema: - type: object - properties: - code: - type: string - language: - type: string - required: [code, language] - - inputs: - - source: code-requests - outputs: - - target: code-results - -# Example: Regional deployment for compliance - - name: vertexai-eu-agent - role: react - model: gemini-pro - - prompt: | - You are an EU-region compliant AI assistant. - All data processing occurs within EU boundaries. 
- - inputs: - - source: eu-queries - outputs: - - target: eu-responses - -# Supporting agents - - name: business-queries - role: producer - interval: 30s - outputs: - - target: vertexai-gemini-agent - - - name: enterprise-responses - role: logger - inputs: - - source: vertexai-gemini-agent - -# Environment variables required: -# - VERTEX_PROJECT_ID: Your GCP project ID (required) -# - VERTEX_LOCATION: GCP region (optional, default: us-central1) -# - GOOGLE_APPLICATION_CREDENTIALS: Path to service account key file (required) - -# Configuration via environment: -# export VERTEX_PROJECT_ID="my-gcp-project" -# export VERTEX_LOCATION="us-central1" # or europe-west1, asia-northeast1, etc. -# export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json" - -# Available Regions: -# - us-central1 (Iowa) -# - us-east4 (Virginia) -# - us-west1 (Oregon) -# - europe-west1 (Belgium) -# - europe-west4 (Netherlands) -# - asia-northeast1 (Tokyo) -# - asia-southeast1 (Singapore) - -# Model Comparison: -# -# Gemini Pro (Vertex AI): -# - Cost: $0.00025 per 1K chars input, $0.0005 per 1K chars output -# - Context: 32K tokens -# - Features: Function calling, streaming, grounding -# - SLA: 99.9% uptime guarantee -# -# PaLM 2 (text-bison): -# - Cost: $0.00025 per 1K chars input, $0.0005 per 1K chars output -# - Context: 8K tokens -# - Features: Text generation, classification -# - Use for: Stable, production text tasks -# -# Codey (code-bison): -# - Cost: $0.00025 per 1K chars input, $0.0005 per 1K chars output -# - Context: 6K tokens -# - Features: Code generation, completion, explanation -# - Use for: Software development tasks - -# Key Enterprise Features: -# - SLAs: 99.9% uptime guarantee -# - Security: VPC-SC, CMEK, private endpoints -# - Compliance: GDPR, HIPAA, SOC 2, ISO 27001 -# - Audit Logs: Cloud Audit Logs integration -# - IAM: Fine-grained access control -# - Regional Control: Data residency compliance -# - Model Garden: Access to multiple models -# - Private AI: 
No data used for training - -# Authentication Methods: -# -# 1. Service Account (Recommended for production): -# - Create service account in GCP Console -# - Grant "Vertex AI User" role -# - Download JSON key file -# - Set GOOGLE_APPLICATION_CREDENTIALS -# -# 2. Application Default Credentials (Development): -# - Run: gcloud auth application-default login -# - No service account needed -# - Uses your user credentials -# -# 3. Workload Identity (GKE): -# - Bind Kubernetes service account to GCP service account -# - No credential files needed -# - Most secure for GKE deployments - -# Required IAM Roles: -# - roles/aiplatform.user: For using Vertex AI models -# - roles/logging.logWriter: For audit logs (optional) -# - roles/monitoring.metricWriter: For metrics (optional) - -# Rate Limits (Default): -# - Gemini Pro: 300 requests per minute per project -# - PaLM 2: 300 requests per minute per project -# - Can request quota increases via GCP Console - -# Best Practices: -# 1. Use service accounts for production -# 2. Enable audit logging for compliance -# 3. Choose region based on data residency requirements -# 4. Implement circuit breakers for reliability -# 5. Use VPC Service Controls for sensitive data -# 6. Monitor costs via Cloud Billing -# 7. Set up alerts for quota limits -# 8. Use regional endpoints for lower latency -# 9. Implement request retries with exponential backoff -# 10. 
Tag resources for cost allocation - -# Security Considerations: -# -# VPC Service Controls: -# - Restrict API access to VPC perimeter -# - Prevent data exfiltration -# - Compliance requirement for many industries -# -# Customer-Managed Encryption Keys (CMEK): -# - Encrypt data with your own keys -# - Full control over encryption -# - Required for some compliance frameworks -# -# Private Google Access: -# - Access Vertex AI without internet -# - More secure for sensitive workloads -# - Lower latency within GCP -# -# Audit Logging: -# - All API calls logged to Cloud Logging -# - Integrate with SIEM systems -# - Compliance and forensics - -# Cost Optimization: -# 1. Use appropriate model for task (don't over-provision) -# 2. Implement caching for repeated queries -# 3. Set max_tokens to limit output length -# 4. Use batch prediction for large volumes -# 5. Monitor usage via Cloud Billing -# 6. Set budget alerts -# 7. Use committed use discounts for high volume -# 8. Choose cheaper models when appropriate (PaLM vs Gemini) - -# Comparison: Vertex AI vs Google AI Studio: -# -# Vertex AI (This config): -# + Enterprise SLAs and support -# + Higher rate limits -# + VPC-SC, CMEK security -# + Compliance certifications -# + Regional data residency -# + Integrated with GCP -# - Requires GCP setup -# - No free tier -# -# Google AI Studio: -# + Simpler setup (just API key) -# + Free tier available -# + Good for development -# - No SLAs -# - Lower rate limits -# - Consumer-grade - -# Regional Deployment Example: -# -# For GDPR compliance (EU data residency): -# export VERTEX_PROJECT_ID="my-eu-project" -# export VERTEX_LOCATION="europe-west1" -# -# For low latency in Asia: -# export VERTEX_LOCATION="asia-northeast1" -# -# For US compliance: -# export VERTEX_LOCATION="us-central1" - -# Model Garden: -# Vertex AI provides access to multiple model families: -# - Gemini: Latest Google models -# - PaLM 2: Stable text generation -# - Codey: Code-specialized models -# - Imagen: Image 
generation (separate API) -# - Chirp: Speech-to-text -# - Third-party: Llama 2, Claude (via Model Garden) - -# Notes: -# - Requires GCP project with billing enabled -# - Authentication via service account or ADC -# - Provider auto-detected from model name or explicit config -# - Regional endpoints for data residency -# - Enterprise-grade SLAs and support -# - Audit logs automatically integrated -# - VPC-SC compatible for secure deployments -# - CMEK supported for encryption -# - Model versions can be pinned for stability -# - Streaming responses supported -# - Function calling available on Gemini Pro diff --git a/web/content/examples/llm-providers/xai.yaml b/web/content/examples/llm-providers/xai.yaml deleted file mode 100644 index 50b9f30..0000000 --- a/web/content/examples/llm-providers/xai.yaml +++ /dev/null @@ -1,280 +0,0 @@ -# xAI Grok Provider Configuration -# Access to Grok models from xAI (X.AI / Twitter) -# OpenAI-compatible API interface - -agents: - - name: grok-agent - role: react - - # xAI Grok model selection - # Available: grok-beta, grok-1 - model: grok-beta - - prompt: | - You are Grok, xAI's conversational AI assistant. - You have a witty, helpful personality and can access real-time information. - Provide accurate, engaging, and sometimes humorous responses. 
- - tools: - - name: search_web - description: "Search the web for current information" - input_schema: - type: object - properties: - query: - type: string - description: "Search query" - max_results: - type: number - description: "Maximum results to return" - default: 5 - required: [query] - - - name: analyze_trends - description: "Analyze current trends and discussions" - input_schema: - type: object - properties: - topic: - type: string - description: "Topic to analyze" - platform: - type: string - enum: ["twitter", "web", "news"] - default: "web" - required: [topic] - - inputs: - - source: user-queries - outputs: - - target: grok-responses - -# Example: Real-time information assistant - - name: grok-realtime-agent - role: react - model: grok-beta - - prompt: | - You are a real-time information assistant with access to current data. - Provide up-to-date information on current events, trends, and discussions. - Always cite sources and indicate when information is time-sensitive. - - tools: - - name: get_current_events - description: "Get latest news and events" - input_schema: - type: object - properties: - category: - type: string - enum: ["tech", "business", "science", "general"] - region: - type: string - description: "Geographic region" - required: [category] - - inputs: - - source: news-queries - outputs: - - target: realtime-responses - -# Example: Conversational agent with personality - - name: grok-conversational-agent - role: react - model: grok-beta - - prompt: | - You are Grok with a distinctive personality: witty, insightful, and slightly irreverent. - Engage in natural conversation while being helpful and informative. - Don't be afraid to inject humor when appropriate. 
- - inputs: - - source: chat-stream - outputs: - - target: chat-responses - -# Supporting agents - - name: user-queries - role: producer - interval: 15s - outputs: - - target: grok-agent - - - name: grok-responses - role: logger - inputs: - - source: grok-agent - -# Environment variables required: -# - XAI_API_KEY: Your xAI API key (required) -# Get from: https://console.x.ai -# -# - XAI_BASE_URL: Custom base URL (optional) -# Default: https://api.x.ai/v1 - -# Configuration via environment: -# export XAI_API_KEY="xai-..." -# export XAI_BASE_URL="https://api.x.ai/v1" # optional - -# Model Information: -# -# Grok-Beta: -# - Latest model with ongoing improvements -# - Real-time information access -# - Conversational and witty personality -# - Context: Large context window -# - Speed: Fast responses (2-4 seconds typical) -# - Use for: General tasks, real-time info, conversation -# -# Grok-1: -# - Stable version of Grok -# - Consistent performance -# - Same capabilities as beta -# - Use for: Production deployments requiring stability - -# Key Features: -# - Real-time Information: Access to current web data -# - X (Twitter) Integration: Can reference recent posts/trends -# - Conversational: Natural, engaging dialogue style -# - OpenAI Compatible: Uses OpenAI-style API -# - Function Calling: Tool use capabilities -# - Streaming: Real-time response generation -# - Personality: Distinctive witty character - -# Pricing: -# - Contact xAI for current pricing -# - Usage-based pricing model -# - Different tiers available -# - Check console.x.ai for details - -# Rate Limits: -# - Varies by account tier -# - Check your dashboard for current limits -# - Implement retry logic for rate limit errors - -# Best Practices: -# 1. Leverage real-time capabilities for current information -# 2. Use witty personality appropriately for use case -# 3. Implement streaming for better UX -# 4. Cache responses when appropriate -# 5. Monitor rate limits and usage -# 6. 
Use function calling for structured outputs -# 7. Test personality fit for your application -# 8. Implement error handling for API failures -# 9. Consider fallback to other providers -# 10. Document xAI-specific behaviors for your team - -# Real-time Information Access: -# Grok has unique capabilities for current information: -# - Recent web content (within hours/days) -# - X (Twitter) posts and trends -# - News and current events -# - Market data and trends -# - Breaking news -# -# This makes Grok particularly useful for: -# - News aggregation and analysis -# - Trend analysis and reporting -# - Current event Q&A -# - Social media monitoring -# - Real-time fact-checking - -# OpenAI API Compatibility: -# Grok uses OpenAI-compatible endpoints: -# - Same request/response format -# - Function calling works the same -# - Streaming uses SSE -# - Easy migration from/to OpenAI -# - Minimal code changes needed - -# Personality Considerations: -# Grok has a distinctive personality: -# - Witty and sometimes humorous -# - Direct and honest -# - Slightly irreverent -# - Engaging conversational style -# -# This personality: -# + Makes interactions more engaging -# + Can improve user experience -# + Memorable brand association -# - May not suit all professional contexts -# - Requires consideration for enterprise use -# - Test thoroughly for your audience - -# Use Cases: -# -# Excellent for: -# - News and current events -# - Social media analysis -# - Trend monitoring -# - Conversational interfaces -# - Real-time Q&A -# - Market analysis -# - Breaking news alerts -# -# Consider alternatives for: -# - Highly formal/serious contexts -# - Maximum token efficiency -# - Extensive tool ecosystems -# - Long-document analysis (if context limited) -# - Mission-critical accuracy (always verify) - -# Comparison with Other Providers: -# -# vs OpenAI GPT-4: -# + Real-time information access -# + Distinctive personality -# - Smaller ecosystem -# - Less mature platform -# -# vs Anthropic Claude: 
-# + Real-time web access -# + More engaging personality -# - Shorter context window -# - Less sophisticated reasoning -# -# vs Google Gemini: -# + Better real-time information -# + More personality -# = Similar API compatibility - -# Integration Notes: -# - API is OpenAI-compatible -# - Drop-in replacement for OpenAI in many cases -# - Function calling format matches OpenAI -# - Streaming response format compatible -# - Easy to add as alternative provider -# - Minimal configuration changes needed - -# Error Handling: -# Implement robust error handling: -# - Rate limit errors (429) -# - API errors (5xx) -# - Invalid requests (400) -# - Authentication errors (401) -# - Timeout handling -# - Retry with exponential backoff -# - Fallback to alternative providers - -# Monitoring and Observability: -# Track key metrics: -# - Request latency -# - Token usage -# - Error rates -# - Rate limit hits -# - Cost per request -# - User satisfaction -# - Real-time info accuracy - -# Notes: -# - API key automatically loaded from XAI_API_KEY -# - Provider auto-detected from model name (grok-*) -# - OpenAI-compatible API makes integration easy -# - Real-time information is a key differentiator -# - Personality may need testing for your use case -# - Streaming recommended for better UX -# - Function calling supported -# - Monitor for rate limits and costs -# - Consider as complement to other providers -# - Unique for X (Twitter) integration diff --git a/web/content/examples/mcp/grpc-transport.yaml b/web/content/examples/mcp/grpc-transport.yaml deleted file mode 100644 index 48cd0fc..0000000 --- a/web/content/examples/mcp/grpc-transport.yaml +++ /dev/null @@ -1,269 +0,0 @@ -# MCP gRPC Transport Configuration -# Remote MCP server communication via gRPC -# Best for: Production, microservices, distributed systems - -# This example shows MCP servers accessed via gRPC for distributed deployments. -# Supports load balancing and service discovery. 
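The client-side load balancing this transport supports can be sketched as a simple round-robin address picker. This is a hypothetical illustration, not Aixgo's actual implementation — production gRPC clients normally enable the built-in `round_robin` load-balancing policy rather than hand-rolling one, and the `roundRobin` type below is invented for this sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// roundRobin cycles through a fixed list of server addresses.
// Hypothetical sketch only; not goroutine-safe — guard next with a
// mutex or atomic counter before sharing across goroutines.
type roundRobin struct {
	addrs []string
	next  int
}

// newRoundRobin parses a comma-separated address list such as
// "srv1:50051,srv2:50051,srv3:50051".
func newRoundRobin(list string) *roundRobin {
	return &roundRobin{addrs: strings.Split(list, ",")}
}

// Next returns the next address, wrapping around at the end of the list.
func (r *roundRobin) Next() string {
	addr := r.addrs[r.next%len(r.addrs)]
	r.next++
	return addr
}

func main() {
	rr := newRoundRobin("srv1:50051,srv2:50051,srv3:50051")
	for i := 0; i < 4; i++ {
		// Each call picks the next backend; the fourth call wraps to the first.
		fmt.Println(rr.Next())
	}
}
```

Round-robin gives even request distribution but no health awareness; the failover and health-checking described below are what make it usable in production.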
-# -# IMPORTANT: TLS/mTLS Security -# TLS is handled externally by your cloud infrastructure, NOT directly by Aixgo. -# Cloud Run, GKE Ingress, and other serverless platforms provide TLS by default. -# For Kubernetes deployments, use service mesh (Istio, Linkerd) or cert-manager for mTLS. - -agents: - - name: distributed-mcp-agent - role: react - model: gpt-4-turbo - - prompt: | - You are an AI assistant with access to distributed MCP services. - Use these remote tools for data access, computation, and integrations. - - inputs: - - source: user-requests - outputs: - - target: agent-results - - - name: user-requests - role: producer - interval: 20s - outputs: - - target: distributed-mcp-agent - - - name: agent-results - role: logger - inputs: - - source: distributed-mcp-agent - -# MCP Server Configuration (gRPC Transport) -# This configuration would typically be in a separate file or environment config - -# Example: Basic gRPC MCP Server (Insecure - Development Only) -# mcp_servers: -# - name: data-service -# transport: grpc -# address: "localhost:50051" -# tls: false # WARNING: Development only! 
-# -# - name: compute-service -# transport: grpc -# address: "localhost:50052" -# tls: false - -# Example: Production gRPC MCP Server with TLS -# mcp_servers: -# - name: secure-data-service -# transport: grpc -# address: "data-service.example.com:443" -# tls: true -# tls_config: -# ca_file: "/path/to/ca.crt" -# cert_file: "/path/to/client.crt" -# key_file: "/path/to/client.key" -# server_name: "data-service.example.com" -# insecure_skip_verify: false # MUST be false in production -# -# - name: secure-compute-service -# transport: grpc -# address: "compute-service.example.com:443" -# tls: true -# tls_config: -# ca_file: "/path/to/ca.crt" -# cert_file: "/path/to/client.crt" -# key_file: "/path/to/client.key" - -# Example: Kubernetes Service Discovery -# mcp_servers: -# - name: k8s-data-service -# transport: grpc -# address: "mcp-data-service.default.svc.cluster.local:50051" -# tls: true -# tls_config: -# ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt" - -# Environment Variables for Configuration: -# MCP_DATA_SERVICE_ADDRESS=data-service.example.com:443 -# MCP_DATA_SERVICE_TLS=true -# MCP_DATA_SERVICE_CA=/path/to/ca.crt -# MCP_DATA_SERVICE_CERT=/path/to/client.crt -# MCP_DATA_SERVICE_KEY=/path/to/client.key - -# gRPC Transport Features: -# -# Security: -# - TLS 1.2+ encryption -# - Mutual TLS (mTLS) for authentication -# - Certificate validation -# - Secure cipher suites -# -# Performance: -# - HTTP/2 multiplexing -# - Binary protocol (efficient) -# - Streaming support -# - Connection pooling -# -# Reliability: -# - Automatic retries -# - Health checking -# - Load balancing -# - Timeout configuration - -# TLS Configuration (Handled by Infrastructure): -# -# Cloud Run / Serverless: -# - TLS is enabled by default -# - No configuration needed in Aixgo -# - Managed certificates automatically -# -# Kubernetes / GKE: -# - Use Ingress with managed certificates -# - Or service mesh (Istio, Linkerd) for mTLS -# - cert-manager for certificate automation -# -# 
NOTE: Aixgo does not directly implement TLS. -# TLS termination is handled by your cloud infrastructure. - -# Certificate Management: -# -# Development: -# - Self-signed certificates OK -# - Can use insecure_skip_verify (carefully!) -# - Simple generation with openssl -# -# Production: -# - Use proper CA (Let's Encrypt, internal CA) -# - Never use insecure_skip_verify -# - Rotate certificates regularly -# - Use cert-manager in Kubernetes - -# Load Balancing: -# -# Client-side (gRPC): -# - Multiple addresses: "srv1:50051,srv2:50051,srv3:50051" -# - Round-robin by default -# - Automatic failover -# -# Server-side (Infrastructure): -# - Use load balancer (nginx, envoy, traefik) -# - Health check endpoints -# - Session affinity if needed - -# Service Discovery: -# -# DNS-based: -# - Use service DNS names -# - Kubernetes service discovery -# - Consul DNS -# -# API-based: -# - Consul catalog -# - etcd service registry -# - Kubernetes API -# -# Configuration: -# - Environment variables -# - Config files -# - Configuration management (consul-template, etc.) 
- -# Health Checking: -# Implement health checks in MCP servers: -# - gRPC health check protocol -# - Kubernetes liveness/readiness probes -# - Monitoring integration - -# Error Handling: -# gRPC provides rich error codes: -# - UNAVAILABLE: Service down, retry -# - DEADLINE_EXCEEDED: Timeout, may retry -# - UNAUTHENTICATED: Auth failed, don't retry -# - PERMISSION_DENIED: Insufficient permissions -# - RESOURCE_EXHAUSTED: Rate limited, back off - -# Retry Strategy: -# Implement exponential backoff: -# - Initial retry after 100ms -# - Double each retry (200ms, 400ms, 800ms) -# - Max retry delay 30s -# - Max 3-5 retries -# - Only retry idempotent operations - -# Monitoring and Observability: -# Track metrics: -# - Request latency (p50, p95, p99) -# - Error rates by gRPC status code -# - Connection status -# - Tool call success rate -# - Network bytes sent/received - -# Production Deployment Patterns: -# -# Pattern 1: Kubernetes Service -# - Deploy MCP server as Kubernetes service -# - Use ClusterIP for internal access -# - mTLS with cert-manager -# - Service discovery via DNS -# -# Pattern 2: Service Mesh -# - Use Istio or Linkerd -# - Automatic mTLS -# - Advanced traffic management -# - Built-in observability -# -# Pattern 3: API Gateway -# - Expose via API gateway -# - Authentication/authorization -# - Rate limiting -# - API management - -# Security Best Practices: -# 1. Deploy to TLS-enabled platforms (Cloud Run, GKE, etc.) -# 2. Use service mesh (Istio, Linkerd) for mTLS between services -# 3. Never expose gRPC services without TLS termination -# 4. Let infrastructure handle certificate management -# 5. Use cert-manager for Kubernetes deployments -# 6. Implement authentication at application level -# 7. Audit all tool access -# 8. Rate limit requests -# 9. Monitor for anomalies -# 10. Use VPC/private networking where possible - -# Performance Optimization: -# 1. Use connection pooling -# 2. Enable HTTP/2 -# 3. Implement caching where appropriate -# 4. 
Set appropriate timeouts -# 5. Use streaming for large data -# 6. Deploy servers close to consumers -# 7. Monitor and optimize latency -# 8. Use compression for large payloads - -# vs Local Transport: -# -# gRPC Transport: -# + Distributed deployment -# + Service isolation -# + Independent scaling -# + Language-agnostic -# + Network security (TLS) -# - Network latency -# - More complex setup -# - Requires infrastructure -# -# Local Transport: -# + Faster (no network) -# + Simpler setup -# + Easier debugging -# - Single process only -# - No distribution -# - No isolation - -# Notes: -# - gRPC transport enables distributed MCP architecture -# - TLS is handled by cloud infrastructure (Cloud Run, GKE, service mesh) -# - Aixgo does not directly implement TLS - deploy to TLS-enabled platforms -# - Use service discovery for dynamic environments -# - Implement proper error handling and retries -# - Monitor performance and reliability metrics -# - Consider service mesh for advanced mTLS scenarios -# - Load balancing improves availability and performance -# - gRPC's HTTP/2 provides excellent performance diff --git a/web/content/examples/mcp/local-transport.yaml b/web/content/examples/mcp/local-transport.yaml deleted file mode 100644 index cbd620c..0000000 --- a/web/content/examples/mcp/local-transport.yaml +++ /dev/null @@ -1,248 +0,0 @@ -# MCP Local Transport Configuration -# In-process MCP server communication (same process) -# Best for: Development, testing, embedded tools - -# This example shows how to configure MCP servers using local (in-process) transport. -# Local transport is the simplest option - no network configuration needed. - -agents: - - name: mcp-enabled-agent - role: react - model: gpt-4-turbo - - prompt: | - You are an AI assistant with access to MCP tools for file operations, - web search, and data analysis. Use these tools when needed to help users. 
- - # No tools defined here - they come from MCP servers - # Tools are discovered dynamically from connected MCP servers - - inputs: - - source: user-queries - outputs: - - target: agent-responses - -# Supporting agents - - name: user-queries - role: producer - interval: 15s - outputs: - - target: mcp-enabled-agent - - - name: agent-responses - role: logger - inputs: - - source: mcp-enabled-agent - -# MCP Server Configuration (separate from agents config in production) -# In a real deployment, this would be in application initialization code -# or a separate configuration file - -# Example MCP Server Registration (pseudo-config): -# This shows the concept - actual registration happens in Go code -# -# mcp_servers: -# - name: filesystem-tools -# transport: local -# tools: -# - name: read_file -# description: "Read contents of a file" -# schema: -# type: object -# properties: -# path: -# type: string -# description: "File path to read" -# required: [path] -# -# - name: write_file -# description: "Write contents to a file" -# schema: -# type: object -# properties: -# path: -# type: string -# description: "File path to write" -# content: -# type: string -# description: "Content to write" -# required: [path, content] -# -# - name: list_directory -# description: "List files in a directory" -# schema: -# type: object -# properties: -# path: -# type: string -# description: "Directory path" -# required: [path] -# -# - name: web-tools -# transport: local -# tools: -# - name: fetch_url -# description: "Fetch content from a URL" -# schema: -# type: object -# properties: -# url: -# type: string -# description: "URL to fetch" -# timeout_ms: -# type: number -# description: "Request timeout" -# default: 5000 -# required: [url] -# -# - name: search_web -# description: "Search the web" -# schema: -# type: object -# properties: -# query: -# type: string -# description: "Search query" -# max_results: -# type: number -# default: 10 -# required: [query] - -# How Local Transport Works: 
-# -# 1. Server Registration: -# - MCP servers are registered in the same process -# - No network communication overhead -# - Direct function calls -# -# 2. Tool Discovery: -# - Agent queries all registered MCP servers -# - Tools are listed and added to LLM context -# - Tool schemas are validated -# -# 3. Tool Execution: -# - When LLM calls a tool, provider routes to correct MCP server -# - Server executes tool handler -# - Result is returned to LLM -# - LLM continues reasoning with result - -# Advantages of Local Transport: -# + No network overhead (fastest) -# + Simple configuration (no ports, firewalls) -# + Secure (no network exposure) -# + Easy debugging (same process) -# + No serialization overhead -# + Perfect for embedded tools -# -# Limitations: -# - Tools must be in same process -# - No remote tool execution -# - Limited to single machine -# - No service isolation - -# When to Use Local Transport: -# - Development and testing -# - Embedded tool libraries -# - Single-process applications -# - Performance-critical tool calls -# - Prototyping and experimentation -# - Simple deployments - -# When to Use gRPC Transport Instead: -# - Distributed systems -# - Microservices architecture -# - Remote tool execution -# - Service isolation required -# - Multiple machines/containers -# - Production deployments with scaling - -# Example Go Code for Local MCP Server: -# -# ```go -# import "github.com/aixgo-dev/aixgo/pkg/mcp" -# -# // Create MCP server -# server := mcp.NewServer("filesystem-tools") -# -# // Register tools -# server.RegisterTool(mcp.ToolDefinition{ -# Name: "read_file", -# Description: "Read contents of a file", -# Schema: map[string]mcp.SchemaField{ -# "path": { -# Type: "string", -# Description: "File path to read", -# Required: true, -# }, -# }, -# Handler: func(ctx context.Context, params mcp.CallToolParams) (*mcp.CallToolResult, error) { -# path := params.Arguments["path"].(string) -# content, err := os.ReadFile(path) -# if err != nil { -# return 
nil, err -# } -# return &mcp.CallToolResult{ -# Content: []mcp.Content{{ -# Type: "text", -# Text: string(content), -# }}, -# }, nil -# }, -# }) -# -# // Register server for local transport -# mcp.RegisterLocalServer(server) -# -# // In your agent configuration, tools are auto-discovered -# ``` - -# Tool Discovery Flow: -# 1. Agent starts and connects to MCP servers -# 2. Agent calls ListTools() on each server -# 3. Tools are aggregated into a registry -# 4. Tool schemas are provided to LLM -# 5. LLM can call any discovered tool -# 6. Provider routes tool calls to correct server - -# Security Considerations: -# Even with local transport, validate inputs: -# - Sanitize file paths (prevent directory traversal) -# - Validate URLs (prevent SSRF) -# - Limit resource usage (file size, timeout) -# - Audit tool usage -# - Implement rate limiting if needed - -# Error Handling: -# Implement robust error handling in tool handlers: -# - Return descriptive error messages -# - Log errors for debugging -# - Don't expose sensitive information in errors -# - Handle timeouts gracefully -# - Provide fallback behaviors - -# Performance: -# Local transport is the fastest option: -# - No network serialization -# - Direct function calls -# - Minimal overhead -# - Microsecond latency typical -# - Limited only by tool handler performance - -# Testing: -# Local transport simplifies testing: -# - No network mocking needed -# - Direct tool invocation -# - Easy to unit test -# - Fast test execution -# - Simple setup/teardown - -# Notes: -# - Local transport is automatically used when server name matches registered server -# - No special configuration needed beyond server registration -# - Tools are discovered automatically -# - Multiple MCP servers can be registered -# - Each server can provide multiple tools -# - Tool names must be unique across all servers -# - Server names are used for routing but not exposed to LLM -# - Best for development and single-process deployments -# - Consider gRPC 
transport for distributed production systems diff --git a/web/content/examples/mcp/multiple-servers.yaml b/web/content/examples/mcp/multiple-servers.yaml deleted file mode 100644 index 10b5dfa..0000000 --- a/web/content/examples/mcp/multiple-servers.yaml +++ /dev/null @@ -1,319 +0,0 @@ -# MCP Multiple Servers Configuration -# Connecting to multiple MCP servers simultaneously -# Demonstrates tool aggregation from diverse sources - -# This example shows an agent accessing tools from multiple MCP servers, -# combining local and remote servers for a hybrid architecture. - -agents: - - name: multi-tool-agent - role: react - model: gpt-4-turbo - - prompt: | - You are an advanced AI assistant with access to a comprehensive toolkit - from multiple specialized services: - - - File system operations (local) - - Database queries (remote) - - Web APIs (remote) - - Data processing (remote) - - Notification services (remote) - - Use these tools intelligently to accomplish complex tasks that span - multiple domains. Combine tools from different services when needed. - - # No tools defined - all come from MCP servers - # The agent will have access to ALL tools from ALL connected servers - - inputs: - - source: complex-tasks - outputs: - - target: task-results - - - name: complex-tasks - role: producer - interval: 30s - outputs: - - target: multi-tool-agent - - - name: task-results - role: logger - inputs: - - source: multi-tool-agent - -# MCP Servers Configuration -# Multiple servers with different transports and purposes - -# Example Multi-Server Configuration: -# -# mcp_servers: -# # Local filesystem tools (fast, embedded) -# - name: filesystem -# transport: local -# # Tools registered in code: -# # - read_file, write_file, list_directory, delete_file, etc. 
-# -# # Remote database service (gRPC, production) -# - name: database -# transport: grpc -# address: "database-mcp.example.com:443" -# tls: true -# tls_config: -# ca_file: "/etc/certs/ca.crt" -# cert_file: "/etc/certs/client.crt" -# key_file: "/etc/certs/client.key" -# # Tools provided by remote server: -# # - query_sql, execute_sql, get_schema, list_tables, etc. -# -# # Remote web API service (gRPC, internal) -# - name: web-api -# transport: grpc -# address: "web-api-mcp.internal:50051" -# tls: true -# tls_config: -# ca_file: "/etc/certs/ca.crt" -# # Tools provided: -# # - http_get, http_post, fetch_json, search_web, etc. -# -# # Remote data processing service (gRPC, high-performance) -# - name: data-processor -# transport: grpc -# address: "data-processor.internal:50052" -# tls: true -# tls_config: -# ca_file: "/etc/certs/ca.crt" -# # Tools provided: -# # - process_csv, analyze_data, generate_stats, etc. -# -# # Remote notification service (gRPC) -# - name: notifications -# transport: grpc -# address: "notifications.internal:50053" -# tls: true -# tls_config: -# ca_file: "/etc/certs/ca.crt" -# # Tools provided: -# # - send_email, send_slack, send_sms, create_ticket, etc. -# -# # Development analytics server (local, for testing) -# - name: analytics -# transport: local -# # Tools registered in code: -# # - log_event, track_metric, create_report, etc. - -# How Multi-Server Tool Discovery Works: -# -# 1. Agent Initialization: -# - Connects to all configured MCP servers -# - Each server is queried for its tool list -# -# 2. Tool Registry: -# - All tools from all servers are aggregated -# - Tool names must be unique across all servers -# - Tool schemas are validated -# - Registry maps tool names to server names -# -# 3. 
Tool Execution: -# - LLM calls a tool by name -# - Provider looks up which server hosts the tool -# - Request is routed to the correct server -# - Result is returned to LLM - -# Example Tool Combination Scenario: -# -# User Request: "Analyze sales data and email report to team" -# -# Agent Execution: -# 1. Call database.query_sql("SELECT * FROM sales WHERE date > NOW() - INTERVAL 7 DAY") -# 2. Call data-processor.analyze_data(sql_results, "summary") -# 3. Call data-processor.generate_stats(analyzed_data, ["mean", "median", "total"]) -# 4. Call filesystem.write_file("/tmp/report.txt", formatted_stats) -# 5. Call notifications.send_email("team@example.com", "Weekly Sales Report", report_content) -# -# This combines tools from 4 different servers! - -# Tool Naming Conventions: -# Use prefixed names to avoid conflicts: -# - filesystem_read_file -# - database_query_sql -# - web_api_fetch_json -# - data_process_csv -# - notify_send_email - -# Server Organization Strategies: -# -# Strategy 1: By Domain -# - filesystem (all file operations) -# - database (all data queries) -# - web (all web/API operations) -# - notifications (all alert mechanisms) -# -# Strategy 2: By Deployment -# - local-tools (embedded, fast) -# - cloud-services (remote, scalable) -# - external-apis (third-party integrations) -# -# Strategy 3: By Security Level -# - public-tools (low-risk operations) -# - internal-tools (authenticated access) -# - privileged-tools (admin operations, audit required) - -# Performance Considerations: -# -# Local vs Remote: -# - Local tools: microseconds latency -# - Remote tools: milliseconds to seconds -# - Use local for frequent, fast operations -# - Use remote for specialized, intensive operations -# -# Parallelization: -# - Some tools can run in parallel -# - Database + Web API calls can be concurrent -# - Local tools have minimal overhead -# - Coordinate via LLM or supervisor - -# Error Handling Across Servers: -# -# Graceful Degradation: -# - If one server is 
unavailable, others still work -# - LLM can adapt based on available tools -# - Implement fallbacks where possible -# -# Error Propagation: -# - Server errors are returned to LLM -# - LLM can retry, use alternatives, or inform user -# - Log all errors for debugging - -# Security with Multiple Servers: -# -# Authentication: -# - Each server can have different auth requirements -# - mTLS for service-to-service -# - API keys for external services -# - IAM roles in cloud environments -# -# Authorization: -# - Tool-level permissions -# - User context propagation -# - Audit logging across all servers -# -# Network Security: -# - TLS for all remote servers -# - VPC/network isolation -# - Firewall rules -# - Service mesh for advanced policies - -# Monitoring Multi-Server Architecture: -# -# Metrics to Track: -# - Tool call distribution (which servers used most) -# - Latency by server -# - Error rate by server -# - Network failures -# - Authentication failures -# -# Distributed Tracing: -# - Trace requests across multiple servers -# - Identify bottlenecks -# - Visualize call chains -# - Use OpenTelemetry or similar - -# Configuration Management: -# -# Environment-based: -# Development: -# - More local servers (faster, easier debugging) -# - Some remote servers for integration testing -# - Relaxed security (self-signed certs OK) -# -# Production: -# - Mostly remote servers (scalable, isolated) -# - Minimal local servers (only essential) -# - Strong security (mTLS, validated certs) - -# Service Discovery Integration: -# -# Static Configuration: -# - Hardcoded server addresses -# - Simple, reliable -# - Requires config updates for changes -# -# Dynamic Discovery: -# - Consul, etcd, Kubernetes -# - Servers register themselves -# - Agents discover automatically -# - Handles scaling and failures - -# Testing Multi-Server Setups: -# -# Unit Testing: -# - Mock individual servers -# - Test tool routing logic -# - Validate error handling -# -# Integration Testing: -# - Use local 
servers in tests -# - Test actual tool execution -# - Verify multi-tool workflows -# -# E2E Testing: -# - Test against real remote servers -# - Validate security (TLS, auth) -# - Performance testing - -# Best Practices: -# 1. Organize servers by logical domain -# 2. Use consistent naming conventions -# 3. Document which server provides which tools -# 4. Implement health checks for all servers -# 5. Use local servers for low-latency needs -# 6. Use remote servers for scalability -# 7. Secure all remote connections with TLS -# 8. Monitor all servers independently -# 9. Implement graceful degradation -# 10. Test multi-server failure scenarios -# 11. Use distributed tracing for debugging -# 12. Keep server configurations in version control - -# Scaling Considerations: -# -# Horizontal Scaling: -# - Run multiple instances of each server -# - Load balance across instances -# - Use service discovery -# -# Vertical Scaling: -# - Increase server resources -# - Optimize tool implementations -# - Cache frequently used data -# -# Geographic Distribution: -# - Deploy servers close to data sources -# - Minimize cross-region latency -# - Consider data residency requirements - -# Example Complex Workflow: -# -# "Generate weekly analytics report and post to Slack" -# -# 1. database.query_sql("SELECT * FROM events WHERE week = CURRENT_WEEK") -# 2. data-processor.analyze_data(events, "weekly_summary") -# 3. analytics.log_event("report_generated", summary_metadata) -# 4. filesystem.write_file("/tmp/report.json", summary_json) -# 5. web-api.http_post("https://api.slack.com/...", slack_message) -# 6. notifications.send_email("manager@example.com", "Report Posted", "...") -# -# This orchestrates 6 tools across 6 different servers!
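The name-based routing described above — a unified registry mapping each unique tool name to its hosting server — can be sketched in a few lines. The `toolRegistry` type is a hypothetical stand-in for Aixgo's internal registry, not its real API:

```go
package main

import "fmt"

// toolRegistry maps tool names to the server that hosts them.
// Hypothetical sketch of the aggregation/routing flow described above.
type toolRegistry struct {
	byTool map[string]string // tool name -> server name
}

func newToolRegistry() *toolRegistry {
	return &toolRegistry{byTool: make(map[string]string)}
}

// register records a server's tools, rejecting duplicates since tool
// names must be unique across all servers.
func (r *toolRegistry) register(server string, tools ...string) error {
	for _, t := range tools {
		if owner, dup := r.byTool[t]; dup {
			return fmt.Errorf("tool %q already provided by server %q", t, owner)
		}
		r.byTool[t] = server
	}
	return nil
}

// route returns the server hosting the named tool, if any.
func (r *toolRegistry) route(tool string) (string, bool) {
	s, ok := r.byTool[tool]
	return s, ok
}

func main() {
	reg := newToolRegistry()
	reg.register("database", "query_sql", "get_schema")
	reg.register("notifications", "send_email", "send_slack")
	if server, ok := reg.route("send_email"); ok {
		// The LLM only sees the tool name; routing to the server is automatic.
		fmt.Println("send_email ->", server)
	}
}
```

Rejecting duplicate names at registration time surfaces conflicts at startup rather than at tool-call time, which is why the prefixed naming conventions above matter as server counts grow.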
- -# Notes: -# - Multiple MCP servers enable powerful distributed architectures -# - Tools from all servers appear as one unified toolkit to the LLM -# - Mix local and remote servers based on requirements -# - Each server can use different transport (local or gRPC) -# - Tool names must be unique across all servers -# - Routing is automatic based on tool name -# - Failures are isolated to individual servers -# - Provides excellent separation of concerns -# - Scales independently by server -# - Ideal for production microservices architectures diff --git a/web/content/examples/orchestration/classification.yaml b/web/content/examples/orchestration/classification.yaml deleted file mode 100644 index 2fc7f0e..0000000 --- a/web/content/examples/orchestration/classification.yaml +++ /dev/null @@ -1,11 +0,0 @@ -# Classification Pattern: Route based on content type -agents: - - name: classifier - role: classifier - model: gpt-4 - outputs: - - target: router - - name: router - role: aggregator - inputs: - - source: classifier diff --git a/web/content/examples/orchestration/mapreduce.yaml b/web/content/examples/orchestration/mapreduce.yaml deleted file mode 100644 index e0a5f18..0000000 --- a/web/content/examples/orchestration/mapreduce.yaml +++ /dev/null @@ -1,78 +0,0 @@ -# MapReduce Orchestration Pattern -# Distributes work across multiple agents, then aggregates results -# Use case: Parallel processing of large datasets, distributed analysis - -supervisor: - name: mapreduce-supervisor - model: gpt-4-turbo - pattern: mapreduce - max_rounds: 10 - -agents: - # Map phase: Multiple workers process different data partitions - - name: mapper-1 - role: react - model: gpt-4 - prompt: | - You are a mapper agent processing partition 1 of the data. - Analyze your partition and extract key insights. - inputs: - - source: data-stream - outputs: - - target: reducer - - - name: mapper-2 - role: react - model: gpt-4 - prompt: | - You are a mapper agent processing partition 2 of the data. 
- Analyze your partition and extract key insights. - inputs: - - source: data-stream - outputs: - - target: reducer - - - name: mapper-3 - role: react - model: gpt-4 - prompt: | - You are a mapper agent processing partition 3 of the data. - Analyze your partition and extract key insights. - inputs: - - source: data-stream - outputs: - - target: reducer - - # Reduce phase: Aggregate results from all mappers - - name: reducer - role: aggregator - model: gpt-4-turbo - aggregator_config: - aggregation_strategy: hierarchical - timeout_ms: 5000 - summarization_enabled: true - inputs: - - source: mapper-1 - - source: mapper-2 - - source: mapper-3 - outputs: - - target: final-results - - - name: data-stream - role: producer - interval: 20s - outputs: - - target: mapper-1 - - target: mapper-2 - - target: mapper-3 - - - name: final-results - role: logger - inputs: - - source: reducer - -# MapReduce Pattern: -# - Map: Distribute work to multiple agents in parallel -# - Reduce: Aggregate results from all mappers -# - Scales horizontally by adding more mappers -# - Ideal for data-parallel workloads diff --git a/web/content/examples/orchestration/parallel.yaml b/web/content/examples/orchestration/parallel.yaml deleted file mode 100644 index 9f2b330..0000000 --- a/web/content/examples/orchestration/parallel.yaml +++ /dev/null @@ -1,12 +0,0 @@ -# Parallel Orchestration: Multiple agents work independently -supervisor: - pattern: parallel -agents: - - name: parallel-agent-1 - role: react - model: gpt-4 - prompt: "Independent analysis agent 1" - - name: parallel-agent-2 - role: react - model: gpt-4 - prompt: "Independent analysis agent 2" diff --git a/web/content/examples/orchestration/phased-startup.yaml b/web/content/examples/orchestration/phased-startup.yaml deleted file mode 100644 index 387448b..0000000 --- a/web/content/examples/orchestration/phased-startup.yaml +++ /dev/null @@ -1,89 +0,0 @@ -# Phased Agent Startup with Dependencies -# -# This example demonstrates 
dependency-aware agent startup introduced in v0.2.3. -# Agents are started in phases based on their depends_on declarations, eliminating -# race conditions and ensuring proper initialization order. -# -# Startup sequence: -# - Phase 0: config-service, database (concurrent) -# - Phase 1: cache, auth-service (concurrent, after Phase 0) -# - Phase 2: user-service, order-service (concurrent, after Phase 1) -# - Phase 3: api-gateway (after Phase 2) - -supervisor: - name: phased-startup-coordinator - max_rounds: 100 - -config: - # Wait up to 60 seconds for agents to become ready - agent_start_timeout: 60s - -agents: - # ===== PHASE 0: Foundation services (no dependencies) ===== - - - name: config-service - role: producer - description: "Loads configuration from environment and files" - # No depends_on - starts in Phase 0 - - - name: database - role: producer - description: "Connects to PostgreSQL database" - # No depends_on - starts in Phase 0 - - # ===== PHASE 1: Services depending on foundation ===== - - - name: cache - role: producer - description: "Redis cache with database fallback" - depends_on: [database, config-service] - # Depends on Phase 0 agents - starts in Phase 1 - - - name: auth-service - role: react - model: gpt-4-turbo - prompt: "Handle authentication and JWT token verification" - depends_on: [database] - # Depends on Phase 0 agent - starts in Phase 1 - - # ===== PHASE 2: Application services ===== - - - name: user-service - role: react - model: gpt-4-turbo - prompt: "Manage user accounts, profiles, and preferences" - depends_on: [database, cache, auth-service] - # Depends on Phase 0 & 1 agents - starts in Phase 2 - inputs: - - source: api-gateway - outputs: - - target: api-gateway - - - name: order-service - role: react - model: gpt-4-turbo - prompt: "Process and manage customer orders" - depends_on: [database, cache, auth-service] - # Depends on Phase 0 & 1 agents - starts in Phase 2 - inputs: - - source: api-gateway - outputs: - - target: api-gateway 
- - # ===== PHASE 3: API Gateway ===== - - - name: api-gateway - role: react - model: gpt-4-turbo - prompt: | - You are an API gateway that routes requests to appropriate services. - Available services: user-service, order-service. - Analyze the request and delegate to the correct service. - depends_on: [user-service, order-service] - # Depends on Phase 2 agents - starts in Phase 3 - inputs: - - source: user-service - - source: order-service - outputs: - - target: user-service - - target: order-service diff --git a/web/content/examples/orchestration/planning.yaml b/web/content/examples/orchestration/planning.yaml deleted file mode 100644 index 09aa7fa..0000000 --- a/web/content/examples/orchestration/planning.yaml +++ /dev/null @@ -1,12 +0,0 @@ -# Planning Pattern: Plan first, then execute -agents: - - name: planner - role: planner - model: gpt-4 - outputs: - - target: executor - - name: executor - role: react - model: gpt-4 - inputs: - - source: planner diff --git a/web/content/examples/orchestration/reflection.yaml b/web/content/examples/orchestration/reflection.yaml deleted file mode 100644 index 54e18f3..0000000 --- a/web/content/examples/orchestration/reflection.yaml +++ /dev/null @@ -1,14 +0,0 @@ -# Reflection Pattern: Agent critiques and improves own output -agents: - - name: generator - role: react - model: gpt-4 - prompt: "Generate initial response" - outputs: - - target: critic - - name: critic - role: react - model: gpt-4 - prompt: "Critique and suggest improvements" - inputs: - - source: generator diff --git a/web/content/examples/orchestration/sequential.yaml b/web/content/examples/orchestration/sequential.yaml deleted file mode 100644 index fb94eb0..0000000 --- a/web/content/examples/orchestration/sequential.yaml +++ /dev/null @@ -1,14 +0,0 @@ -# Sequential Orchestration: Chain of agents in sequence -supervisor: - pattern: sequential -agents: - - name: step-1 - role: react - model: gpt-4 - outputs: - - target: step-2 - - name: step-2 - role: react - 
model: gpt-4 - inputs: - - source: step-1 diff --git a/web/content/examples/security/builtin-api-key.yaml b/web/content/examples/security/builtin-api-key.yaml deleted file mode 100644 index 9a2a735..0000000 --- a/web/content/examples/security/builtin-api-key.yaml +++ /dev/null @@ -1,79 +0,0 @@ -# Security Configuration: Built-in API Key Authentication -# Application-level authentication using API keys -# Suitable for service-to-service communication and API access - -environment: production -auth_mode: builtin - -builtin_auth: - method: api_key - api_keys: - # Load API keys from environment variables - # Format: AIXGO_API_KEY_= - source: environment - env_prefix: AIXGO_API_KEY_ - -# Alternative: Load from file -# builtin_auth: -# method: api_key -# api_keys: -# source: file -# file_path: /etc/aixgo/api-keys.json - -authorization: - enabled: true - default_deny: true - # policy_file: /etc/aixgo/rbac-policies.json - -audit: - enabled: true - backend: json - log_auth_decisions: true - # siem: - # enabled: true - # endpoint: https://siem.example.com/events - -# Environment Variables Required: -# AIXGO_API_KEY_service1= -# AIXGO_API_KEY_service2= -# AIXGO_API_KEY_admin= - -# API Key Format in Requests: -# Authorization: Bearer - -# Example: API Key Usage -agents: - - name: api-agent - role: react - model: gpt-4 - prompt: "You are an API service with authenticated access." - inputs: - - source: api-requests - outputs: - - target: api-responses - -# API Key Management Best Practices: -# 1. Generate cryptographically random keys (32+ characters) -# 2. Use unique keys per service/user -# 3. Rotate keys regularly (90 days recommended) -# 4. Store keys in secrets management (Vault, AWS Secrets Manager) -# 5. Never commit keys to version control -# 6. Use key prefixes for identification (sk-, pk-, etc.) -# 7. Implement key revocation mechanism -# 8. Monitor key usage and detect anomalies -# 9. Rate limit per API key -# 10. 
Log all authentication attempts - -# Key Rotation Process: -# 1. Generate new keys -# 2. Distribute to clients (support dual-key period) -# 3. Monitor usage of old keys -# 4. Disable old keys after transition period -# 5. Remove old keys from configuration - -# Notes: -# - Suitable for machine-to-machine authentication -# - Simpler than OAuth for service-to-service -# - Works well with CI/CD pipelines -# - Requires secure key storage -# - Less suitable for human users (use delegated auth) diff --git a/web/content/examples/security/delegated-iap.yaml b/web/content/examples/security/delegated-iap.yaml deleted file mode 100644 index 8ee36f6..0000000 --- a/web/content/examples/security/delegated-iap.yaml +++ /dev/null @@ -1,117 +0,0 @@ -# Security Configuration: Delegated Authentication (Google Cloud IAP) -# Infrastructure-level authentication handled by Identity-Aware Proxy -# Users authenticate with Google accounts, IAP verifies and forwards identity - -environment: production -auth_mode: delegated - -delegated_auth: - # IAP sets this header with authenticated user email - identity_header: X-Goog-Authenticated-User-Email - - iap: - enabled: true - verify_jwt: true - # IAP audience - get from GCP Console - # Format: /projects/PROJECT_NUMBER/apps/PROJECT_ID - audience: "/projects/123456789/apps/my-project-id" - - # Map additional IAP headers to principal attributes - header_mapping: - email: X-Goog-Authenticated-User-Email - user_id: X-Goog-Authenticated-User-ID - # Add custom mappings as needed - -authorization: - enabled: true - default_deny: true - policy_file: /etc/aixgo/iap-policies.json - -audit: - enabled: true - backend: json - log_auth_decisions: true - siem: - enabled: true - endpoint: https://logging.googleapis.com/v2/entries:write - -# IAP Configuration (in GCP Console): -# 1. Enable IAP for your Cloud Run service or GKE ingress -# 2. Configure OAuth consent screen -# 3. Add authorized users/groups -# 4. Get IAP audience value -# 5. 
Configure HTTPS load balancer with IAP - -# Example Authorization Policies (iap-policies.json): -# { -# "policies": [ -# { -# "principal": "user:admin@example.com", -# "resource": "*", -# "action": "*", -# "effect": "allow" -# }, -# { -# "principal": "group:developers@example.com", -# "resource": "agents.*", -# "action": ["read", "execute"], -# "effect": "allow" -# } -# ] -# } - -agents: - - name: iap-protected-agent - role: react - model: gpt-4 - prompt: "You are a secure agent accessible only to authenticated users." - inputs: - - source: user-requests - outputs: - - target: user-responses - -# IAP Benefits: -# + Google account authentication (familiar to users) -# + No application code for auth -# + Centralized access control -# + MFA support built-in -# + Integration with Google Workspace -# + Audit logs in Cloud Logging -# + Works with Cloud Run, GKE, Compute Engine - -# IAP Limitations: -# - Google Cloud only -# - Requires HTTPS load balancer -# - All users need Google accounts -# - Limited customization - -# Best Practices: -# 1. Always verify JWT signature -# 2. Validate audience claim -# 3. Use groups for access control (not individual users) -# 4. Enable MFA for sensitive operations -# 5. Monitor IAP access logs -# 6. Rotate service account keys -# 7. Use least-privilege IAM roles -# 8. 
Test failover scenarios - -# Security Considerations: -# - IAP provides authentication (identity) -# - Application must handle authorization (permissions) -# - Validate all user inputs even with IAP -# - Implement additional controls for sensitive operations -# - Monitor for unusual access patterns -# - IAP can be bypassed if service is publicly accessible -# - Always enforce IAP at load balancer level - -# Testing IAP Locally: -# For development, simulate IAP headers: -# curl -H "X-Goog-Authenticated-User-Email: accounts.google.com:test@example.com" \ -# http://localhost:8080/api/endpoint - -# Notes: -# - Perfect for internal tools and dashboards -# - Google Workspace integration -# - No credential management in application -# - Scales automatically with Cloud infrastructure -# - Audit logs automatically integrated with Cloud Logging diff --git a/web/content/examples/security/disabled-dev.yaml b/web/content/examples/security/disabled-dev.yaml deleted file mode 100644 index c663b33..0000000 --- a/web/content/examples/security/disabled-dev.yaml +++ /dev/null @@ -1,77 +0,0 @@ -# Security Configuration: Disabled (Development Only) -# WARNING: Never use this configuration in production! -# For local development and testing only. - -environment: development -auth_mode: disabled - -# No authentication required -# All requests are accepted without identity verification -# Use only on localhost or isolated development environments - -authorization: - enabled: false - default_deny: false - # No access control - all operations allowed - -audit: - enabled: false - backend: memory - log_auth_decisions: false - # Minimal logging to reduce development noise - -# Agent configuration with no security -agents: - - name: dev-agent - role: react - model: gpt-4 - prompt: "You are a development assistant with unrestricted access." 
- inputs: - - source: dev-input - outputs: - - target: dev-output - - - name: dev-input - role: producer - interval: 10s - outputs: - - target: dev-agent - - - name: dev-output - role: logger - inputs: - - source: dev-agent - -# Use Cases: -# - Local development on laptop -# - Unit testing -# - Integration testing (isolated environment) -# - Prototyping and experimentation - -# Security Warnings: -# 1. NEVER expose this configuration to the internet -# 2. NEVER use in production, staging, or any shared environment -# 3. NEVER commit API keys with this config -# 4. ONLY use on localhost or isolated networks - -# How to Use Safely: -# 1. Run only on localhost (127.0.0.1) -# 2. Use firewall to block external access -# 3. Don't commit sensitive data to git -# 4. Switch to proper auth before deployment -# 5. Use environment-specific configs - -# Migration to Production: -# Before deploying: -# 1. Change environment to "production" -# 2. Set auth_mode to "builtin", "delegated", or "hybrid" -# 3. Enable authorization with default_deny: true -# 4. Enable audit logging with appropriate backend -# 5. 
Test authentication and authorization thoroughly - -# Notes: -# - This is the simplest configuration possible -# - Ideal for rapid development iteration -# - No overhead from auth checks -# - Fast startup and testing -# - Remember to switch to secure config before any deployment diff --git a/web/content/examples/security/hybrid.yaml b/web/content/examples/security/hybrid.yaml deleted file mode 100644 index 00e35b1..0000000 --- a/web/content/examples/security/hybrid.yaml +++ /dev/null @@ -1,161 +0,0 @@ -# Security Configuration: Hybrid Authentication -# Combines delegated (IAP) and builtin (API key) authentication -# Supports both human users and service-to-service calls - -environment: production -auth_mode: hybrid - -# Delegated auth for human users via IAP -delegated_auth: - identity_header: X-Goog-Authenticated-User-Email - iap: - enabled: true - verify_jwt: true - audience: "/projects/123456789/apps/my-project-id" - header_mapping: - email: X-Goog-Authenticated-User-Email - user_id: X-Goog-Authenticated-User-ID - -# Builtin auth for service accounts and automation -builtin_auth: - method: api_key - api_keys: - source: environment - env_prefix: AIXGO_API_KEY_ - -authorization: - enabled: true - default_deny: true - policy_file: /etc/aixgo/hybrid-policies.json - -audit: - enabled: true - backend: json - log_auth_decisions: true - siem: - enabled: true - endpoint: https://siem.example.com/events - -# How Hybrid Mode Works: -# 1. Check for IAP headers (X-Goog-Authenticated-User-Email) -# -> If present: Use delegated auth (human user) -# 2. Check for Authorization header with Bearer token -# -> If present: Use builtin auth (service account) -# 3. 
If neither: Reject request (default deny) - -# Example Authorization Policies: -# { -# "policies": [ -# # Human users (via IAP) -# { -# "principal": "user:admin@example.com", -# "resource": "*", -# "action": "*", -# "effect": "allow" -# }, -# { -# "principal": "group:operators@example.com", -# "resource": "agents.*", -# "action": ["read", "execute"], -# "effect": "allow" -# }, -# # Service accounts (via API key) -# { -# "principal": "service:ci-pipeline", -# "resource": "agents.deploy", -# "action": "execute", -# "effect": "allow" -# }, -# { -# "principal": "service:monitoring", -# "resource": "agents.*.metrics", -# "action": "read", -# "effect": "allow" -# } -# ] -# } - -agents: - - name: hybrid-auth-agent - role: react - model: gpt-4 - prompt: | - You are a hybrid-auth agent accessible to both human users - (via IAP) and automated services (via API keys). - inputs: - - source: mixed-requests - outputs: - - target: responses - -# Use Cases for Hybrid Auth: -# -# Human Users (IAP): -# - Web dashboards -# - Admin panels -# - Interactive tools -# - Manual operations -# -# Service Accounts (API Keys): -# - CI/CD pipelines -# - Monitoring systems -# - Scheduled jobs -# - Microservice communication -# - External integrations - -# Environment Setup: -# export AIXGO_API_KEY_ci_pipeline=sk-ci-secure-key-here -# export AIXGO_API_KEY_monitoring=sk-mon-secure-key-here -# export AIXGO_API_KEY_integration=sk-int-secure-key-here - -# Request Examples: -# -# Human user via IAP (automatic): -# - Browser request to https://app.example.com -# - IAP authenticates user -# - Headers: X-Goog-Authenticated-User-Email: accounts.google.com:user@example.com -# -# Service account via API key: -# curl -H "Authorization: Bearer sk-ci-secure-key-here" \ -# https://app.example.com/api/deploy - -# Advantages of Hybrid: -# + Flexibility for different client types -# + Seamless user experience (no API keys for humans) -# + Secure service-to-service communication -# + Unified authorization policy -# 
+ Comprehensive audit trail - -# Best Practices: -# 1. Use IAP for all human interactive access -# 2. Use API keys only for automation and services -# 3. Separate policies for human vs service principals -# 4. More restrictive policies for service accounts -# 5. Rotate API keys regularly -# 6. Monitor both auth types independently -# 7. Different rate limits for each auth type -# 8. Explicit deny rules for sensitive operations -# 9. MFA enforcement for human users -# 10. Audit all service account usage - -# Rate Limiting Strategy: -# - Human users: per-user rate limits (higher) -# - Service accounts: per-key rate limits (lower) -# - Global rate limit as backstop -# - Different limits for different operations - -# Monitoring: -# Track separately: -# - IAP authentication success/failure -# - API key authentication success/failure -# - Authorization decisions by principal type -# - Resource access patterns -# - Anomaly detection per auth type - -# Notes: -# - Most flexible auth mode -# - Supports diverse client types -# - Requires careful policy design -# - More complex to configure than single mode -# - Ideal for production environments with mixed clients -# - Each auth method can fail independently -# - Policies must account for both principal types diff --git a/web/content/examples/use-cases/content-classifier.yaml b/web/content/examples/use-cases/content-classifier.yaml deleted file mode 100644 index d4ea166..0000000 --- a/web/content/examples/use-cases/content-classifier.yaml +++ /dev/null @@ -1,14 +0,0 @@ -# Content Classifier: Categorize incoming content -agents: - - name: content-classifier - role: classifier - model: gpt-4 - classifier_config: - categories: - - name: spam - description: "Spam or unwanted content" - - name: support - description: "Customer support requests" - - name: sales - description: "Sales inquiries" - confidence_threshold: 0.7 diff --git a/web/content/examples/use-cases/conversation-memory.yaml 
b/web/content/examples/use-cases/conversation-memory.yaml deleted file mode 100644 index 5f0fcb3..0000000 --- a/web/content/examples/use-cases/conversation-memory.yaml +++ /dev/null @@ -1,217 +0,0 @@ -# Conversation Memory: Stateful agent with long-term memory -# This example demonstrates using vector databases to store and retrieve -# conversation history, enabling personalized, context-aware interactions. - -# Supervisor coordinates memory operations -supervisor: - name: memory-coordinator - model: gpt-4-turbo - max_rounds: 10 - -# Embedding configuration for memory search -embeddings: - provider: openai - openai: - api_key: ${OPENAI_API_KEY} - model: text-embedding-3-small - -# Vector store for conversation memory -vectorstore: - provider: firestore # Use 'memory' for development - embedding_dimensions: 1536 - default_top_k: 10 # Retrieve more context for conversations - firestore: - project_id: ${GCP_PROJECT} - collection: conversations - # credentials_file: /path/to/service-account.json - -agents: - - name: stateful-assistant - role: react - model: gpt-4-turbo - prompt: | - You are a personal AI assistant with memory of past conversations. - - Your capabilities: - - Remember user preferences, interests, and past interactions - - Provide personalized responses based on conversation history - - Reference previous discussions naturally - - Build rapport over time through consistent context - - Guidelines: - 1. Search conversation memory before responding - 2. Acknowledge and reference relevant past interactions - 3. Adapt your tone and style based on user preferences - 4. Store important information for future reference - 5. Respect user privacy and data boundaries - - Provide helpful, context-aware assistance that improves over time. 
- - # Conversation memory configuration - conversation_memory: - enabled: true - collection: conversations - - # Scope defines how memories are partitioned - scope: - - user # Separate memory per user - - session # Track current session - # - tenant # For multi-tenant applications - - # Memory management - max_turns: 20 # Maximum conversation turns to retain - ttl: 24h # Memory expiration (24 hours) - embedding_model: text-embedding-3-small - - # What to store in memory - store: - - user_messages: true # Store user inputs - - assistant_messages: true # Store assistant responses - - tool_calls: true # Store tool usage - - metadata: true # Store conversation metadata - - # Memory retrieval strategy - retrieval: - method: hybrid # Combine semantic search + recency - top_k: 5 # Most relevant memories - min_score: 0.6 # Relevance threshold - boost_recent: 0.3 # Weight factor for recent memories - - # Summarization for long conversations - summarization: - enabled: true - trigger_after: 15 # Summarize after 15 turns - keep_last_n: 5 # Keep last 5 turns uncompressed - model: gpt-4-turbo # Model for summarization - - # Privacy and data management - privacy: - anonymize_pii: true # Remove personal information - encrypt_at_rest: true # Encrypt stored memories - user_data_retention: 30d # Delete after 30 days - - tools: - - name: search_memory - description: "Search past conversation history for relevant context" - vectorstore: - collection: conversations - generate_embedding: true - filters: - scope: current_user # Only search current user's memories - input_schema: - type: object - properties: - query: - type: string - description: "What to search for in conversation history" - time_range: - type: string - description: "Optional time range (e.g., 'last_week', 'today', 'last_month')" - required: [query] - - - name: store_memory - description: "Explicitly store important information for future reference" - vectorstore: - collection: conversations - generate_embedding: true - 
input_schema: - type: object - properties: - content: - type: string - description: "The information to remember" - importance: - type: number - default: 5 - description: "Importance level (1-10), affects retrieval priority" - tags: - type: array - items: { type: string } - description: "Tags for categorizing the memory (e.g., 'preference', 'fact', 'goal')" - required: [content] - - - name: get_conversation_summary - description: "Retrieve a summary of the current conversation" - input_schema: - type: object - properties: - include_details: - type: boolean - default: false - description: "Include detailed breakdown of topics discussed" - - - name: forget_memory - description: "Delete specific memories at user request" - vectorstore: - collection: conversations - input_schema: - type: object - properties: - memory_id: - type: string - description: "ID of the memory to delete" - time_range: - type: string - description: "Delete all memories in a time range (e.g., 'today', 'this_week')" - - inputs: - - source: user - description: "User messages with conversation context" - metadata: - user_id: required # User identification for memory scoping - session_id: required # Session tracking - outputs: - - target: user - description: "Context-aware, personalized responses" - - # Memory analytics and management - - name: memory-manager - role: logger - description: Monitors memory usage and maintains data quality - tools: - - name: get_memory_stats - description: "Retrieve statistics about stored memories" - vectorstore: - collection: conversations - input_schema: - type: object - properties: - user_id: - type: string - description: "Get stats for specific user" - aggregation: - type: string - enum: [user, session, global] - default: user - - - name: cleanup_expired - description: "Remove expired memories based on TTL" - vectorstore: - collection: conversations - input_schema: - type: object - properties: - dry_run: - type: boolean - default: true - description: "Preview 
deletions without executing" - - - name: export_memories - description: "Export user memories for data portability (GDPR compliance)" - vectorstore: - collection: conversations - input_schema: - type: object - properties: - user_id: - type: string - description: "User ID to export memories for" - format: - type: string - enum: [json, csv] - default: json - required: [user_id] - - inputs: - - source: stateful-assistant - description: "Monitor memory operations" diff --git a/web/content/examples/use-cases/multi-expert-consensus.yaml b/web/content/examples/use-cases/multi-expert-consensus.yaml deleted file mode 100644 index cd6413b..0000000 --- a/web/content/examples/use-cases/multi-expert-consensus.yaml +++ /dev/null @@ -1,30 +0,0 @@ -# Multi-Expert Consensus: Multiple experts reach consensus -agents: - - name: expert-1 - role: react - model: gpt-4 - prompt: "You are a technical expert." - outputs: - - target: consensus-builder - - name: expert-2 - role: react - model: claude-3-opus - prompt: "You are a business expert." - outputs: - - target: consensus-builder - - name: expert-3 - role: react - model: gpt-4 - prompt: "You are a domain specialist." - outputs: - - target: consensus-builder - - name: consensus-builder - role: aggregator - model: gpt-4-turbo - aggregator_config: - aggregation_strategy: consensus - consensus_threshold: 0.8 - inputs: - - source: expert-1 - - source: expert-2 - - source: expert-3 diff --git a/web/content/examples/use-cases/rag-chatbot.yaml b/web/content/examples/use-cases/rag-chatbot.yaml deleted file mode 100644 index 62abfbe..0000000 --- a/web/content/examples/use-cases/rag-chatbot.yaml +++ /dev/null @@ -1,139 +0,0 @@ -# RAG Chatbot: Knowledge-based question answering using vector search -# This example demonstrates using vector databases for retrieval-augmented generation -# to answer questions with factual information from a knowledge base. 
- -# Supervisor coordinates indexing and query operations -supervisor: - name: rag-coordinator - model: gpt-4-turbo - max_rounds: 5 - -# Embedding configuration for generating vector representations -embeddings: - provider: openai - openai: - api_key: ${OPENAI_API_KEY} - model: text-embedding-3-small # 1536 dimensions, cost-efficient - -# Vector store for persistent document storage -vectorstore: - provider: firestore # Use 'memory' for development - embedding_dimensions: 1536 # Must match embedding model - default_top_k: 5 - firestore: - project_id: ${GCP_PROJECT} - collection: knowledge_base - # credentials_file: /path/to/service-account.json # Optional: uses ADC if not set - -agents: - # Agent 1: Document indexer for ingesting knowledge - - name: knowledge-indexer - role: indexer - description: Prepares and indexes documents into the vector database - tools: - - name: chunk_document - description: "Break large documents into semantic chunks for better retrieval" - input_schema: - type: object - properties: - document: - type: string - description: "The full document text to chunk" - chunk_size: - type: number - default: 500 - description: "Target size for each chunk in tokens" - overlap: - type: number - default: 50 - description: "Number of tokens to overlap between chunks" - required: [document] - - - name: index_document - description: "Store a document chunk in the vector database with embeddings" - vectorstore: - collection: knowledge_base - generate_embedding: true # Automatically embed content - input_schema: - type: object - properties: - id: - type: string - description: "Unique document identifier" - content: - type: string - description: "Document content to index" - metadata: - type: object - description: "Additional metadata (source, category, date, etc.)" - properties: - source: { type: string } - category: { type: string } - author: { type: string } - date: { type: string } - required: [id, content] - - # Agent 2: RAG chatbot for answering 
questions - - name: rag-chatbot - role: react - model: gpt-4-turbo - prompt: | - You are a knowledgeable AI assistant with access to a comprehensive knowledge base. - - When answering questions: - 1. ALWAYS search the knowledge base first using the search_knowledge tool - 2. Base your answers on the retrieved information - 3. Cite specific document IDs as sources (e.g., "According to doc-123...") - 4. If the knowledge base lacks relevant information, clearly state this - 5. Provide accurate, well-structured answers - - Do not make up information. Only use what you retrieve from the knowledge base. - tools: - - name: search_knowledge - description: "Search the vector database for relevant documents using semantic similarity" - vectorstore: - collection: knowledge_base - top_k: 5 # Number of most similar documents to retrieve - min_score: 0.7 # Minimum similarity threshold (0.0-1.0) - generate_embedding: true # Automatically embed the query - input_schema: - type: object - properties: - query: - type: string - description: "The search query or question" - filters: - type: object - description: "Optional metadata filters" - properties: - category: { type: string } - source: { type: string } - date_after: { type: string } - required: [query] - - - name: get_document - description: "Retrieve the full content of a specific document by ID" - vectorstore: - collection: knowledge_base - input_schema: - type: object - properties: - document_id: - type: string - description: "The unique ID of the document to retrieve" - required: [document_id] - - inputs: - - source: user - description: "User questions and requests" - outputs: - - target: user - description: "AI-generated answers with citations" - - # Agent 3: Interaction logger for observability - - name: interaction-logger - role: logger - description: Logs all interactions for debugging and analytics - inputs: - - source: rag-chatbot - - source: user diff --git a/web/content/examples/use-cases/semantic-cache.yaml 
b/web/content/examples/use-cases/semantic-cache.yaml deleted file mode 100644 index d8ab82f..0000000 --- a/web/content/examples/use-cases/semantic-cache.yaml +++ /dev/null @@ -1,120 +0,0 @@ -# Semantic Cache: Intelligent response caching using vector similarity -# This example demonstrates using vector databases to cache LLM responses -# and retrieve them for semantically similar queries, reducing costs and latency. - -# Supervisor coordinates caching operations -supervisor: - name: cache-coordinator - model: gpt-4-turbo - max_rounds: 3 - -# Embedding configuration for query similarity matching -embeddings: - provider: openai - openai: - api_key: ${OPENAI_API_KEY} - model: text-embedding-3-small # Fast, cost-efficient embeddings - -# Vector store for semantic cache -vectorstore: - provider: firestore # Use 'memory' for development - embedding_dimensions: 1536 - default_top_k: 1 # Only need the best match for caching - firestore: - project_id: ${GCP_PROJECT} - collection: llm_cache - # credentials_file: /path/to/service-account.json - -agents: - - name: cached-assistant - role: react - model: gpt-4-turbo - prompt: | - You are an efficient AI assistant that provides helpful, accurate answers. - Your responses are cached for faster retrieval on similar queries. - - Provide clear, concise answers to user questions. - Focus on being helpful while maintaining response quality. 
- - # Semantic cache configuration - semantic_cache: - enabled: true - collection: llm_cache # Vector store collection for cache - ttl: 5m # Cache expiration time (5 minutes) - similarity_threshold: 0.95 # High threshold for cache hits (0.0-1.0) - embedding_model: text-embedding-3-small # Must match embeddings config - - # Cache key includes these fields for partitioning - key_fields: - - model # Different models have different caches - - temperature # Different temperatures have different caches - - # Metadata stored with each cache entry - metadata: - version: "1.0" - application: "aixgo-example" - - # Cache invalidation rules - invalidate_on: - - user_feedback: negative # Clear cache on negative feedback - - accuracy_score: < 0.8 # Clear low-quality responses - - # Standard tool configuration (cache applies automatically) - tools: - - name: get_current_time - description: "Get the current date and time" - input_schema: - type: object - properties: {} - - - name: calculate - description: "Perform basic arithmetic calculations" - input_schema: - type: object - properties: - expression: - type: string - description: "Mathematical expression to evaluate (e.g., '2 + 2')" - required: [expression] - - inputs: - - source: user - description: "User queries (checked against cache first)" - outputs: - - target: user - description: "Cached or freshly generated responses" - - # Cache statistics monitor - - name: cache-monitor - role: logger - description: Tracks cache hit rates and performance metrics - tools: - - name: get_cache_stats - description: "Retrieve cache performance statistics" - vectorstore: - collection: llm_cache - input_schema: - type: object - properties: - time_window: - type: string - default: "1h" - description: "Time window for statistics (e.g., '1h', '24h', '7d')" - - - name: clear_cache - description: "Clear expired or invalid cache entries" - vectorstore: - collection: llm_cache - input_schema: - type: object - properties: - older_than: - type: string 
- description: "Clear entries older than this duration (e.g., '1h', '24h')" - score_below: - type: number - description: "Clear entries with similarity score below threshold" - - inputs: - - source: cached-assistant - description: "Monitor cache hits and misses" diff --git a/web/content/examples/use-cases/simple-chatbot.yaml b/web/content/examples/use-cases/simple-chatbot.yaml deleted file mode 100644 index 4fa6e98..0000000 --- a/web/content/examples/use-cases/simple-chatbot.yaml +++ /dev/null @@ -1,18 +0,0 @@ -# Simple Chatbot: Basic conversational AI -agents: - - name: chatbot - role: react - model: gpt-4 - prompt: | - You are a friendly, helpful chatbot assistant. - Provide clear, concise answers to user questions. - tools: - - name: get_time - description: "Get current time" - input_schema: - type: object - properties: {} - inputs: - - source: user-messages - outputs: - - target: chat-responses diff --git a/web/content/examples/use-cases/task-planner.yaml b/web/content/examples/use-cases/task-planner.yaml deleted file mode 100644 index e25d152..0000000 --- a/web/content/examples/use-cases/task-planner.yaml +++ /dev/null @@ -1,14 +0,0 @@ -# Task Planner: Break down complex tasks into steps -agents: - - name: task-planner - role: planner - model: gpt-4-turbo - planner_config: - planning_strategy: chain_of_thought - max_steps: 15 - enable_self_critique: true - include_alternatives: true - inputs: - - source: task-requests - outputs: - - target: execution-monitor diff --git a/web/content/features-patterns.md b/web/content/features-patterns.md deleted file mode 100644 index 196c21e..0000000 --- a/web/content/features-patterns.md +++ /dev/null @@ -1,598 +0,0 @@ ---- -title: "Orchestration Patterns" -description: "Production-proven agent orchestration patterns in Aixgo" -weight: 30 ---- - -# Agent Orchestration Patterns - -Aixgo provides **13 production-proven orchestration patterns** for building AI agent systems. 
Each pattern solves specific problems and is backed by real-world usage from industry-leading frameworks. - -All patterns are **fully implemented and production-ready** in v0.2.0+. - -> 📖 **Complete Technical Documentation:** -> - **[PATTERNS.md on GitHub](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md)** - **Complete pattern catalog** with detailed code examples, configuration templates, use cases, and implementation guides -> - **[FEATURES.md on GitHub](https://github.com/aixgo-dev/aixgo/blob/main/docs/FEATURES.md)** - Authoritative feature reference with all capabilities -> - **[GitHub Repository](https://github.com/aixgo-dev/aixgo)** - Source code and examples -> -> **This page provides a marketing-friendly overview for evaluation and planning.** - -## Pattern Overview - -
- -### ✅ Implemented (v0.2.0+) - -
- -#### Supervisor Pattern -**Centralized orchestration with specialized agents** - -Route tasks to expert agents, aggregate results, and maintain conversation state. Perfect for customer service and multi-agent workflows. - -```yaml -orchestration: - pattern: supervisor - agents: [billing, tech-support, sales] -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#1-supervisor-pattern) - -
- -
- -#### Sequential Pattern -**Ordered pipeline execution** - -Execute agents in sequence where each step's output feeds the next. Ideal for ETL, content pipelines, and multi-stage workflows. - -```yaml -orchestration: - pattern: sequential - agents: [extract, transform, validate, load] -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#2-sequential-pattern) - -
- -
- -#### Parallel Pattern -**Concurrent execution with aggregation** - -Execute multiple agents simultaneously and aggregate results. **3-4× speedup** for independent tasks. - -**Use cases**: Multi-source research, batch processing, A/B testing - -```yaml -orchestration: - pattern: parallel - agents: [competitive-analysis, market-sizing, tech-trends] -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#3-parallel-pattern) - -
- -
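The speedup comes from wall-clock time collapsing to the slowest agent rather than the sum of all agents. A self-contained Go sketch of the fan-out/collect mechanic (illustrative, not Aixgo's API):

```go
package main

import (
	"fmt"
	"sync"
)

// FanOut runs independent tasks concurrently and collects results in input
// order. Total latency is the slowest task, not the sum -- the source of
// the 3-4x speedup for independent research tasks.
func FanOut(tasks []func() string) []string {
	results := make([]string, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func() string) {
			defer wg.Done()
			results[i] = task() // each slot is written by exactly one goroutine
		}(i, task)
	}
	wg.Wait()
	return results
}

func main() {
	out := FanOut([]func() string{
		func() string { return "competitive-analysis: done" },
		func() string { return "market-sizing: done" },
		func() string { return "tech-trends: done" },
	})
	fmt.Println(out)
}
```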
- -#### Router Pattern -**Intelligent routing for cost optimization** - -Route simple queries to cheap models, complex to expensive. **25-50% cost reduction** in production. - -**Use cases**: Cost optimization, intent-based routing, model selection - -```yaml -orchestration: - pattern: router - classifier: intent-classifier - routes: - simple: gpt-3.5-turbo - complex: gpt-4-turbo -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#4-router-pattern) - -
- -
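Once the classifier has labeled a query, the routing decision itself is a cheap lookup. A hedged Go sketch — the model names mirror the YAML above, and defaulting unknown labels to the cheap model is one possible policy, not Aixgo's documented behavior:

```go
package main

import "fmt"

// routes maps a classifier label to a model (names from the YAML above).
var routes = map[string]string{
	"simple":  "gpt-3.5-turbo",
	"complex": "gpt-4-turbo",
}

// Route picks the model for a classified query. Unknown labels fall back
// to the cheap model here, so misclassification never pays GPT-4 prices;
// a safety-first deployment might default the other way.
func Route(label string) string {
	if model, ok := routes[label]; ok {
		return model
	}
	return routes["simple"]
}

func main() {
	fmt.Println(Route("simple"))  // gpt-3.5-turbo
	fmt.Println(Route("complex")) // gpt-4-turbo
}
```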
- -#### Swarm Pattern -**Decentralized agent handoffs** - -Dynamic agent-to-agent handoffs based on conversational context. Popularized by OpenAI Swarm. - -**Use cases**: Customer service handoffs, adaptive routing, collaborative problem-solving - -```yaml -orchestration: - pattern: swarm - agents: [general, billing, technical] -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#5-swarm-pattern) - -
- -
- -#### Hierarchical Pattern -**Multi-level delegation** - -Managers delegate to sub-managers who delegate to workers. Perfect for complex decomposition. - -**Use cases**: Enterprise workflows, project management, organizational hierarchies - -```yaml -orchestration: - pattern: hierarchical - manager: project-manager - teams: - frontend: [ui-engineer, ux-engineer] - backend: [api-engineer, db-engineer] -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#6-hierarchical-pattern) - -
- -
- -#### RAG Pattern -**Retrieval-Augmented Generation** - -Retrieve relevant docs from vector store, then generate grounded answers. **Most common enterprise pattern**. - -**Use cases**: Enterprise chatbots, documentation Q&A, knowledge retrieval - -```yaml -orchestration: - pattern: rag - retriever: vector-store - generator: answer-agent - top_k: 5 -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#7-rag-pattern) - -
- -
- -#### Reflection Pattern -**Iterative refinement with self-critique** - -Generator creates output, critic reviews it, generator refines. **20-50% quality improvement**. - -**Use cases**: Code generation, content creation, complex reasoning - -```yaml -orchestration: - pattern: reflection - generator: code-generator - critic: code-reviewer - max_iterations: 3 -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#8-reflection-pattern) - -
- -
- -#### Ensemble Pattern -**Multi-model voting for accuracy** - -Multiple models vote on outputs to reduce errors. **25-50% error reduction** in high-stakes decisions. - -**Use cases**: Medical diagnosis, financial forecasting, content moderation - -```yaml -orchestration: - pattern: ensemble - models: [gpt-4, claude-3.5, gemini-1.5] - voting: majority -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#9-ensemble-pattern) - -
- -
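Majority voting is simple enough to sketch without any framework. An illustrative Go version (not Aixgo's implementation; ties resolve to whichever answer reached the top count first):

```go
package main

import "fmt"

// MajorityVote returns the answer produced by the most models.
// On a tie, the answer that reached the winning count first is kept.
func MajorityVote(answers []string) string {
	counts := map[string]int{}
	best, bestCount := "", 0
	for _, a := range answers {
		counts[a]++
		if counts[a] > bestCount {
			best, bestCount = a, counts[a]
		}
	}
	return best
}

func main() {
	// One answer per model, e.g. gpt-4, claude, gemini.
	votes := []string{"benign", "malignant", "benign"}
	fmt.Println(MajorityVote(votes)) // benign
}
```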
- -#### Classifier Pattern -**Intent-based routing** - -Classify user intent and route to specialized agents. Perfect for support ticket routing and content categorization. - -**Use cases**: Ticket classification, intent-based routing, content categorization - -```yaml -orchestration: - pattern: classifier - classifier: intent-classifier - routes: - technical: tech-agent - billing: billing-agent -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#10-classifier-pattern) - -
- -
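Downstream of classification, routing typically combines the predicted category with its confidence score, diverting low-confidence results to a human. A hypothetical Go sketch — the agent names and the 0.7 threshold are illustrative:

```go
package main

import "fmt"

// Classification mirrors the structured output a classifier emits.
type Classification struct {
	Category   string
	Confidence float64
}

// RouteTicket sends confident classifications to the matching agent and
// everything below the threshold (or with no matching route) to a human
// review queue.
func RouteTicket(c Classification, routes map[string]string, threshold float64) string {
	if c.Confidence < threshold {
		return "human-review"
	}
	if agent, ok := routes[c.Category]; ok {
		return agent
	}
	return "human-review"
}

func main() {
	routes := map[string]string{"technical": "tech-agent", "billing": "billing-agent"}
	fmt.Println(RouteTicket(Classification{"billing", 0.92}, routes, 0.7)) // billing-agent
	fmt.Println(RouteTicket(Classification{"billing", 0.55}, routes, 0.7)) // human-review
}
```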
- -#### Aggregation Pattern -**Multi-agent synthesis** - -Combine outputs from multiple agents using consensus, weighted, or semantic strategies. - -**Use cases**: Expert synthesis, multi-source analysis, decision fusion - -```yaml -orchestration: - pattern: aggregation - agents: [expert-1, expert-2, expert-3] - strategy: consensus -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#11-aggregation-pattern) - -
- -
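For the weighted strategy, the deterministic core is a weighted tally over agent answers. A hedged Go sketch — agent names and weights are illustrative, and exact ties are broken arbitrarily (map iteration order):

```go
package main

import "fmt"

// WeightedVote tallies each agent's answer by its source weight and
// returns the highest-scoring answer. Exact ties are broken arbitrarily.
func WeightedVote(answers map[string]string, weights map[string]float64) string {
	scores := map[string]float64{}
	for agent, answer := range answers {
		scores[answer] += weights[agent]
	}
	best, bestScore := "", -1.0
	for answer, score := range scores {
		if score > bestScore {
			best, bestScore = answer, score
		}
	}
	return best
}

func main() {
	answers := map[string]string{"expert-1": "approve", "expert-2": "reject", "expert-3": "reject"}
	weights := map[string]float64{"expert-1": 0.9, "expert-2": 0.7, "expert-3": 0.5}
	fmt.Println(WeightedVote(answers, weights)) // reject (1.2 vs 0.9)
}
```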
- -#### Planning Pattern -**Dynamic task decomposition** - -Break complex tasks into subtasks with dynamic replanning based on execution results. - -**Use cases**: Multi-step research, data pipelines, software development - -```yaml -orchestration: - pattern: planning - planner: task-planner - executors: [executor-1, executor-2] -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#12-planning-pattern) - -
- -
- -#### MapReduce Pattern -**Distributed batch processing** - -Process large datasets in parallel chunks and aggregate results. - -**Use cases**: Batch processing, data analysis, document processing - -```yaml -orchestration: - pattern: mapreduce - mapper: chunk-processor - reducer: result-aggregator -``` - -[View Pattern Docs →](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#13-mapreduce-pattern) - -
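The core mechanic — map chunks concurrently, then fold the partial results — can be sketched in a few lines of Go. The generic helper is illustrative, not Aixgo's API, and requires Go 1.18+:

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// MapReduce processes chunks concurrently (map) and folds the partial
// results into one value (reduce).
func MapReduce[T, R any](chunks []T, mapper func(T) R, reducer func(R, R) R, zero R) R {
	partials := make([]R, len(chunks))
	var wg sync.WaitGroup
	for i, c := range chunks {
		wg.Add(1)
		go func(i int, c T) {
			defer wg.Done()
			partials[i] = mapper(c)
		}(i, c)
	}
	wg.Wait()
	acc := zero
	for _, p := range partials {
		acc = reducer(acc, p)
	}
	return acc
}

func main() {
	docs := []string{"alpha beta", "gamma", "delta epsilon zeta"}
	wordCount := func(s string) int { return len(strings.Fields(s)) }
	total := MapReduce(docs, wordCount, func(a, b int) int { return a + b }, 0)
	fmt.Println(total) // 6
}
```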
- -### 🔮 Future Patterns (Roadmap) - -
- -#### Debate Pattern -**Adversarial collaboration for accuracy** - -Agents with different perspectives debate to reach consensus. **20-40% improvement in factual accuracy**. - -**Use cases**: Research synthesis, legal analysis, complex decision-making - -**Phase 5** · v2.1+ (2025 H2) - -
- -
- -#### Plan-and-Execute Pattern -**Strategic planning before execution** - -Planner decomposes task, executors handle sub-tasks, planner adjusts based on results. - -**Use cases**: Multi-step research, data pipelines, software development - -**Phase 2 or v2.1** · TBD - -
- -
- -#### Nested/Composite Pattern -**Encapsulate complex workflows as single agents** - -Complex multi-agent workflows packaged as reusable components. - -**Use cases**: Modular agent development, workflow reuse, testing - -**Phase 6** · v2.2+ (2025 H2) - -
- -
- -## Pattern Comparison - -| Pattern | Complexity | Cost | Latency | Accuracy | Production Maturity | -|---------|-----------|------|---------|----------|---------------------| -| **Supervisor** | Low | 1× | Low | Medium | ⭐⭐⭐⭐⭐ Very High | -| **Sequential** | Low | N× | High | Medium | ⭐⭐⭐⭐⭐ Very High | -| **Parallel** | Medium | N× | Low | Medium | ⭐⭐⭐⭐⭐ Very High | -| **Router** | Low | 0.25-0.5× | Low | High | ⭐⭐⭐⭐⭐ Very High | -| **Swarm** | Medium | Variable | Medium | High | ⭐⭐⭐⭐ High | -| **Hierarchical** | Medium | N× | Medium | High | ⭐⭐⭐⭐ High | -| **RAG** | Medium | 0.3× | Medium | High | ⭐⭐⭐⭐⭐ Very High | -| **Reflection** | Medium | 2-4× | High | Very High | ⭐⭐⭐⭐ High | -| **Ensemble** | Medium | 3-5× | Low | Very High | ⭐⭐⭐⭐ High | -| **Classifier** | Low | 1× | Low | High | ⭐⭐⭐⭐⭐ Very High | -| **Aggregation** | Medium | N× | Medium | High | ⭐⭐⭐⭐ High | -| **Planning** | Medium | Optimized | Medium | High | ⭐⭐⭐⭐ High | -| **MapReduce** | Medium | N× | Low | High | ⭐⭐⭐⭐⭐ Very High | -| **Debate** (Roadmap) | High | 9× | Very High | Very High | 🔮 Future | -| **Nested** (Roadmap) | High | Variable | Variable | Variable | 🔮 Future | - -### Cost Legend -- `1×` = Single agent execution -- `N×` = N agents (sequential or parallel) -- `0.25-0.5×` = Router savings (cheap models for most queries) -- `0.3×` = RAG token reduction vs full KB -- `2-4×`, `3-5×`, `9×` = Multiple iterations/agents - -### Latency Legend -- **Low**: < 1s overhead -- **Medium**: 1-5s overhead -- **High**: 5-15s overhead -- **Very High**: > 15s overhead - -## Pattern Selection Guide - -### Choose by Goal - -**💰 Reduce Costs** -→ **Router** (25-50% savings) or **RAG** (70% token reduction) - -**⚡ Improve Speed** -→ **Parallel** (3-4× speedup) - -**🎯 Improve Accuracy** -→ **Ensemble** (25-50% error reduction) or **Reflection** (20-50% improvement) - -**🔄 Adaptive Routing** -→ **Swarm** (dynamic handoffs) or **Supervisor** (centralized control) - -**📊 Complex Workflows** -→ 
**Hierarchical** (multi-level) or **Sequential** (ordered steps) - -**📚 Knowledge-Intensive** -→ **RAG** (retrieval-augmented) - -### Decision Tree - -```text -Need to reduce costs? → Router or RAG -Need high accuracy? → Ensemble or Reflection -Have independent sub-tasks? → Parallel -Need ordered steps? → Sequential -Need dynamic routing? → Swarm -Need multi-level management? → Hierarchical -Need knowledge base access? → RAG -General orchestration? → Supervisor (default) -``` - -## Real-World Examples - -### Cost Optimization with Router - -```go -// Before: Always using GPT-4 -// Cost: $0.03 per request -// After: Router to GPT-3.5 for 80% of queries -// Cost: $0.006 per request (80% savings) - -router := orchestration.NewRouter( - "cost-optimizer", - runtime, - "complexity-classifier", - map[string]string{ - "simple": "gpt-3.5-turbo-agent", - "complex": "gpt-4-turbo-agent", - }, -) -``` - -**Result**: **25-50% cost reduction** in production deployments. - -### Speed Improvement with Parallel - -```go -// Before: Sequential execution -// Time: 10s (sum of all agents) -// After: Parallel execution -// Time: 3s (max of all agents) - -parallel := orchestration.NewParallel( - "market-research", - runtime, - []string{"competitors", "market-size", "trends", "regulations"}, -) -``` - -**Result**: **3-4× speedup** for independent research tasks. - -### Accuracy Improvement with Ensemble - -```go -// Before: Single model (GPT-4) -// Error rate: 15% -// After: 3-model ensemble -// Error rate: 6% (60% reduction) - -ensemble := orchestration.NewEnsemble( - "medical-diagnosis", - runtime, - []string{"gpt4-diagnostic", "claude-diagnostic", "gemini-diagnostic"}, - orchestration.WithVotingStrategy(orchestration.VotingMajority), -) -``` - -**Result**: **25-50% error reduction** for high-stakes decisions. - -## Roadmap - -All 13 core patterns are **implemented and production-ready** in v0.2.0+. 
- -### Future Patterns (v2.1+, 2025 H2) - -- 🔮 **Debate Pattern** - Adversarial collaboration for accuracy -- 🔮 **Nested/Composite Pattern** - Encapsulate complex workflows as single agents - -## Getting Started - -### Installation - -```bash -go get github.com/aixgo-dev/aixgo -``` - -### Quick Example - -```yaml -# config/agents.yaml -supervisor: - name: coordinator - model: gpt-4-turbo - max_rounds: 10 - -agents: - - name: data-producer - role: producer - interval: 1s - outputs: - - target: analyzer - - - name: analyzer - role: react - model: gpt-4-turbo - prompt: | - You are a data analyst. Analyze incoming data and provide insights. - inputs: - - source: data-producer - outputs: - - target: logger - - - name: logger - role: logger - inputs: - - source: analyzer -``` - -```go -// main.go -package main - -import ( - "github.com/aixgo-dev/aixgo" - _ "github.com/aixgo-dev/aixgo/agents" -) - -func main() { - if err := aixgo.Run("config/agents.yaml"); err != nil { - panic(err) - } -} -``` - -### Learn More - -- [Documentation](https://aixgo.dev) -- [Pattern Catalog](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md) -- [Examples](https://github.com/aixgo-dev/aixgo/tree/main/examples) -- [GitHub Repository](https://github.com/aixgo-dev/aixgo) - -## Support - -- **GitHub Issues**: [github.com/aixgo-dev/aixgo/issues](https://github.com/aixgo-dev/aixgo/issues) -- **Documentation**: [aixgo.dev](https://aixgo.dev) -- **Discord**: Join our community (coming soon) - ---- - - diff --git a/web/content/features.md b/web/content/features.md deleted file mode 100644 index ff96631..0000000 --- a/web/content/features.md +++ /dev/null @@ -1,158 +0,0 @@ ---- -title: 'Aixgo Features' -description: "Explore Aixgo's complete feature set across AI agents, LLM providers, security, observability, and infrastructure. Production-ready features for building scalable AI agent systems." 
---- - -> **Complete Technical Documentation:** -> - **[FEATURES.md on GitHub](https://github.com/aixgo-dev/aixgo/blob/main/docs/FEATURES.md)** - Authoritative feature catalog with code references -> - **[PATTERNS.md on GitHub](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md)** - Deep-dive guides for all 13 orchestration patterns -> - **[Roadmap](https://github.com/orgs/aixgo-dev/projects/1)** - Development roadmap and planned features - ---- - -## At a Glance - -| Category | Count | Status | -|----------|-------|--------| -| Agent Types | 6 specialized types | ✅ All implemented | -| LLM Providers | 6+ cloud + local | ✅ All implemented | -| Orchestration Patterns | 13 patterns | ✅ All implemented | -| Security Modes | 4 auth modes | ✅ All implemented | -| Observability Backends | 6+ backends | ✅ All implemented | - -### Performance - -- **Binary Size**: <20MB -- **Cold Start**: <100ms -- **Infrastructure Savings**: 60-70% vs Python frameworks - ---- - -## Agent Types - -Build specialized agents for different tasks: - -| Agent | Purpose | Key Features | -|-------|---------|--------------| -| **ReAct** | Reasoning + Acting | Tool calling, streaming, structured outputs | -| **Classifier** | Content routing | Confidence scores, multi-label, custom taxonomies | -| **Aggregator** | Multi-agent synthesis | 5 LLM strategies + 4 voting modes | -| **Planner** | Task decomposition | Dependency analysis, progress tracking | -| **Producer** | Message generation | Interval-based, event-driven | -| **Logger** | Audit trails | Structured JSON, multiple targets | - ---- - -## LLM Providers - -Connect to any provider with a unified interface: - -| Provider | Models | Status | -|----------|--------|--------| -| **OpenAI** | GPT-4, GPT-3.5 Turbo | ✅ | -| **Anthropic** | Claude 3.5 Sonnet, Opus, Haiku | ✅ | -| **Google Gemini** | Gemini 1.5 Pro, Flash | ✅ | -| **xAI** | Grok-beta | ✅ | -| **Vertex AI** | Gemini on GCP | ✅ | -| **HuggingFace** | Meta-Llama, Mistral, 100+ 
models | ✅ | -| **Ollama** | phi, llama, mistral, gemma (local) | ✅ | -| **vLLM** | Self-hosted inference | ✅ | - ---- - -## Orchestration Patterns - -All 13 patterns are production-ready: - -| Pattern | Benefit | -|---------|---------| -| **Supervisor** | Simple multi-agent coordination | -| **Sequential** | ETL and content pipelines | -| **Parallel** | 3-4× speedup | -| **Router** | 25-50% cost savings | -| **Swarm** | Adaptive agent handoffs | -| **Hierarchical** | Complex workflows | -| **RAG** | 70% token reduction | -| **Reflection** | 20-50% quality improvement | -| **Ensemble** | 25-50% error reduction | -| **Classifier** | Intent-based routing | -| **Aggregation** | Expert consensus | -| **Planning** | Multi-step workflows | -| **MapReduce** | Large dataset processing | - -[View Pattern Details →](/features-patterns) - ---- - -## Security - -Enterprise-grade security built-in: - -- **4 Auth Modes**: Disabled, Delegated (IAP), Builtin (API keys), Hybrid -- **Input Protection**: SSRF protection, sanitization, prompt injection defense -- **Rate Limiting**: Token bucket, per-user quotas -- **Audit**: SIEM integration (Elasticsearch, Splunk, Datadog) - ---- - -## Observability - -Complete production monitoring: - -- **Tracing**: OpenTelemetry with Langfuse, Jaeger, Honeycomb, Grafana -- **Metrics**: Prometheus export, system and agent metrics -- **Cost Tracking**: Automatic token counting, per-request costs -- **Health Checks**: Kubernetes-ready liveness and readiness probes - ---- - -## Deployment - -Deploy anywhere with Go's simplicity: - -| Target | Details | -|--------|---------| -| **Single Binary** | <20MB, zero dependencies | -| **Docker** | Multi-stage builds, ~50MB standard | -| **Kubernetes** | Full manifests, HPA ready | -| **Cloud Run** | Auto-scaling, IAP integration | - ---- - -## Getting Started - -```bash -go get github.com/aixgo-dev/aixgo -``` - -```yaml -# config/agents.yaml -supervisor: - name: coordinator - model: gpt-4-turbo - -agents: - - 
name: analyzer - role: react - model: gpt-4-turbo - prompt: "You are a data analyst." -``` - -```go -package main - -import "github.com/aixgo-dev/aixgo" - -func main() { - aixgo.Run("config/agents.yaml") -} -``` - ---- - -## Learn More - -- **[Quick Start Guide](/guides/quick-start)** - Get running in 5 minutes -- **[Pattern Catalog](/features-patterns)** - All 13 orchestration patterns -- **[GitHub Repository](https://github.com/aixgo-dev/aixgo)** - Source code and examples -- **[Full Feature Reference](https://github.com/aixgo-dev/aixgo/blob/main/docs/FEATURES.md)** - Complete technical documentation diff --git a/web/content/guides/_index.md b/web/content/guides/_index.md deleted file mode 100644 index 9a4a1c0..0000000 --- a/web/content/guides/_index.md +++ /dev/null @@ -1,7 +0,0 @@ ---- -title: 'Getting Started' -description: 'Learn how to build production-grade AI agents with Aixgo.' -aliases: - - /docs - - /docs/ ---- diff --git a/web/content/guides/agent-types.md b/web/content/guides/agent-types.md deleted file mode 100644 index 8da21f0..0000000 --- a/web/content/guides/agent-types.md +++ /dev/null @@ -1,759 +0,0 @@ ---- -title: 'Agent Types Guide' -description: "Comprehensive guide to all Aixgo agent types including Classifier and Aggregator agents with examples and best practices." -breadcrumb: 'Guides' -category: 'Agents' -weight: 3 ---- - -Aixgo provides specialized agent types for building production-grade multi-agent systems. This guide covers all available agent types, when to use each, and how to configure them for optimal performance. 
- -## Overview - -Aixgo offers six core agent types, each designed for specific roles in your multi-agent architecture: - -- **Producer**: Generate periodic messages for downstream processing -- **ReAct**: LLM-powered reasoning and tool execution -- **Logger**: Message consumption and persistence -- **Classifier**: Intelligent content classification with confidence scoring -- **Aggregator**: Multi-agent output synthesis and consensus building -- **Planner**: Task decomposition and workflow orchestration - -## Producer Agent - -Producer agents generate messages at configured intervals, providing the data input for your agent workflows. - -### When to Use - -- Polling external APIs or data sources -- Generating synthetic test data -- Periodic health checks or monitoring -- Time-based event triggers -- ETL pipeline data ingestion - -### Configuration - -```yaml -agents: - - name: event-generator - role: producer - interval: 500ms - outputs: - - target: processor -``` - -### Best Practices - -- Set appropriate intervals based on your data source refresh rate -- Use exponential backoff for failed polling attempts -- Consider rate limits when polling external APIs -- Implement circuit breakers for unreliable sources - -**Learn more**: [Producer examples](https://github.com/aixgo-dev/aixgo/tree/main/examples/producer-workflow) - -## ReAct Agent - -ReAct (Reasoning + Acting) agents combine LLM reasoning with tool execution capabilities for complex decision-making workflows. - -### When to Use - -- Data analysis requiring intelligent reasoning -- Decision-making workflows with business logic -- Natural language processing tasks -- Complex multi-step operations -- Tool-assisted problem solving - -### Configuration - -```yaml -agents: - - name: analyst - role: react - model: gpt-4-turbo - prompt: 'You are an expert data analyst.' 
- tools: - - name: query_database - description: 'Query the database' - input_schema: - type: object - properties: - query: { type: string } - required: [query] - inputs: - - source: event-generator - outputs: - - target: logger -``` - -### Best Practices - -- Provide clear, specific system prompts -- Define precise tool schemas with validation -- Use appropriate temperature settings (0.2-0.4 for deterministic, 0.7-1.0 for creative) -- Implement timeout handling for long-running operations -- Monitor token usage and optimize prompts - -**Learn more**: [ReAct examples](https://github.com/aixgo-dev/aixgo/tree/main/examples/react-workflow) - -## Logger Agent - -Logger agents consume and persist messages, providing observability and audit capabilities for your workflows. - -### When to Use - -- Audit logging and compliance -- Debugging multi-agent workflows -- Data persistence and archival -- Monitoring and alerting -- Performance metric collection - -### Configuration - -```yaml -agents: - - name: audit-log - role: logger - inputs: - - source: analyst -``` - -### Best Practices - -- Use structured logging formats (JSON) -- Implement log rotation and retention policies -- Set up log aggregation for distributed systems -- Create alerts for error patterns - -**Learn more**: [Logger examples](https://github.com/aixgo-dev/aixgo/tree/main/examples/logger-workflow) - -## Classifier Agent - -Classifier agents use LLM-powered semantic understanding to categorize content with confidence scoring, few-shot learning, and structured outputs. 
- -### When to Use - -- Customer support ticket routing and prioritization -- Content moderation and categorization -- Document classification and tagging -- Intent detection in conversational AI -- Sentiment analysis with custom categories -- Multi-label content tagging - -### Key Features - -- **Structured JSON Outputs**: Schema-validated responses for reliable parsing -- **Confidence Scoring**: Automatic quality assessment (0-1 scale) -- **Few-Shot Learning**: Improve accuracy with example-based training -- **Multi-Label Support**: Assign multiple categories simultaneously -- **Alternative Classifications**: Secondary suggestions for low-confidence results -- **Semantic Understanding**: Context-aware classification beyond keywords - -### Configuration - -```yaml -agents: - - name: ticket-classifier - role: classifier - model: gpt-4-turbo - inputs: - - source: support-tickets - outputs: - - target: classified-tickets - classifier_config: - categories: - - name: technical_issue - description: "Issues requiring technical troubleshooting or product support" - keywords: ["error", "bug", "not working", "crash"] - examples: - - "The app crashes when I click submit" - - "Error code 500 appears on checkout" - - - name: billing_inquiry - description: "Questions about payments, invoices, or pricing" - keywords: ["payment", "invoice", "charge", "refund"] - examples: - - "I was charged twice this month" - - "Can I get a refund?" 
-
-    # Minimum confidence for automatic classification
-    confidence_threshold: 0.7
-
-    # Allow multiple categories per input
-    multi_label: false
-
-    # Few-shot examples for improved accuracy
-    few_shot_examples:
-      - input: "My account won't let me log in"
-        category: technical_issue
-        reason: "Authentication system issue"
-
-    # LLM parameters
-    temperature: 0.3 # Low for consistent classification
-    max_tokens: 500 # Sufficient for reasoning
-```
-
-### Category Definition Best Practices
-
-Each category should include:
-
-- **name**: Unique identifier (use snake_case)
-- **description**: Clear explanation of category boundaries
-- **keywords**: Terms strongly associated with this category
-- **examples**: 2-3 representative samples
-
-### Confidence Threshold Guidelines
-
-- **0.5-0.6**: Exploratory use, may have incorrect classifications
-- **0.7-0.8**: Production baseline, good accuracy/coverage balance
-- **0.85+**: High-stakes scenarios, may reject ambiguous inputs
-
-### Example Output
-
-```json
-{
-  "category": "technical_issue",
-  "confidence": 0.92,
-  "reasoning": "User describes specific product issue requiring technical assistance",
-  "alternatives": [
-    {"category": "billing_inquiry", "confidence": 0.15}
-  ],
-  "tokens_used": 234
-}
-```
-
-**Learn more**:
-- [Classifier agent documentation](https://github.com/aixgo-dev/aixgo/blob/main/agents/README.md)
-- [Classifier workflow example](https://github.com/aixgo-dev/aixgo/blob/main/examples/classifier-workflow/README.md)
-
-## Aggregator Agent
-
-Aggregator agents synthesize outputs from multiple agents using 9 intelligent strategies, from zero-cost deterministic voting to sophisticated LLM-powered consensus building.
- -### When to Use - -- Multi-agent research synthesis -- Combining outputs from specialized expert agents -- Consensus building in distributed AI systems -- Ensemble learning for improved accuracy -- Cross-validation of agent outputs -- RAG systems with multiple retrievers -- Conflict resolution between diverse perspectives -- Production systems requiring deterministic, reproducible results - -### Key Features - -- **9 Aggregation Strategies**: 5 LLM-powered + 4 deterministic voting methods -- **Resilience by Default**: Handles partial failures, missing inputs, timeouts -- **Zero-Cost Options**: Deterministic voting strategies with no LLM calls -- **Conflict Resolution**: Automatic detection and LLM-mediated resolution -- **Semantic Clustering**: Group similar outputs using text similarity -- **Consensus Scoring**: Quantify agreement levels (0-1 scale) -- **Performance Tracking**: Built-in observability and metrics - -### All 9 Aggregation Strategies - -#### LLM-Powered Strategies - -These strategies use LLM reasoning for sophisticated synthesis. They provide high-quality results but incur API costs and latency. - -**1. consensus** - Find common ground among diverse opinions - -- **Use when**: Need balanced synthesis with conflict transparency -- **Cost**: $$ (LLM calls for analysis) -- **Speed**: Slow (2-5s depending on agent count) -- **Reproducibility**: Low (LLM output varies) -- **Best for**: Fact verification, balanced synthesis, transparent disagreement handling - -```yaml -aggregator_config: - aggregation_strategy: consensus - consensus_threshold: 0.7 - conflict_resolution: llm_mediated -``` - -**2. 
weighted** - Prioritize high-authority sources - -- **Use when**: Some agents have more expertise or reliability -- **Cost**: $$ (LLM calls for synthesis) -- **Speed**: Slow (2-5s) -- **Reproducibility**: Low (LLM output varies) -- **Best for**: Expert prioritization, confidence-based mixing, known reliability differences - -```yaml -aggregator_config: - aggregation_strategy: weighted - source_weights: - expert_agent: 1.0 - general_agent_1: 0.6 - general_agent_2: 0.4 -``` - -**3. semantic** - Group similar outputs by theme - -- **Use when**: Many agents with overlapping insights -- **Cost**: $$$ (LLM + embedding calls) -- **Speed**: Slow (3-7s with embedding overhead) -- **Reproducibility**: Medium (embeddings are deterministic, synthesis is not) -- **Best for**: Large agent counts (5+), deduplication, perspective identification - -```yaml -aggregator_config: - aggregation_strategy: semantic - semantic_similarity_threshold: 0.85 - deduplication_method: semantic -``` - -**4. hierarchical** - Multi-level summarization - -- **Use when**: 10+ agents, need efficient aggregation -- **Cost**: $$$$ (multiple LLM calls for hierarchical processing) -- **Speed**: Slowest (5-15s for multi-level aggregation) -- **Reproducibility**: Low (multiple LLM calls compound variance) -- **Best for**: Large agent counts (10+), token efficiency, structured summarization - -```yaml -aggregator_config: - aggregation_strategy: hierarchical - max_input_sources: 20 - summarization_enabled: true -``` - -**5. 
rag_based** - Citation-based aggregation - -- **Use when**: Need source attribution and traceability -- **Cost**: $$ (LLM calls for generation) -- **Speed**: Slow (2-5s) -- **Reproducibility**: Low (LLM output varies) -- **Best for**: Question answering, multi-source research, citation preservation - -```yaml -aggregator_config: - aggregation_strategy: rag_based - max_input_sources: 10 -``` - -#### Deterministic Strategies (v0.1.3+) - -These strategies provide instant, reproducible results with zero LLM costs. Perfect for production systems requiring deterministic behavior. - -**6. voting_majority** - Simple majority vote - -- **Use when**: Democratic decision, equal weight agents -- **Cost**: $0 (no LLM calls) -- **Speed**: Instant (<1ms) -- **Reproducibility**: 100% (same inputs always produce same output) -- **Best for**: Classification tasks, binary decisions, equal-expertise agents - -```yaml -aggregator_config: - aggregation_strategy: voting_majority - # No LLM configuration needed -``` - -**7. voting_unanimous** - Requires all to agree - -- **Use when**: Safety-critical decisions, consensus required -- **Cost**: $0 (no LLM calls) -- **Speed**: Instant (<1ms) -- **Reproducibility**: 100% -- **Best for**: High-stakes decisions, regulatory compliance, safety validation - -```yaml -aggregator_config: - aggregation_strategy: voting_unanimous - # Fails unless all agents agree -``` - -**8. voting_weighted** - Confidence-weighted voting - -- **Use when**: Expert panels with varying confidence levels -- **Cost**: $0 (no LLM calls) -- **Speed**: Instant (<1ms) -- **Reproducibility**: 100% -- **Best for**: Expert systems with confidence scoring, hierarchical decision making - -```yaml -aggregator_config: - aggregation_strategy: voting_weighted - source_weights: - expert_1: 0.9 - expert_2: 0.7 - expert_3: 0.5 -``` - -**9. 
voting_confidence** - Highest confidence wins - -- **Use when**: Defer to most confident expert -- **Cost**: $0 (no LLM calls) -- **Speed**: Instant (<1ms) -- **Reproducibility**: 100% -- **Best for**: Expert selection, competitive agent systems, confidence-based routing - -```yaml -aggregator_config: - aggregation_strategy: voting_confidence - # Selects result from agent with highest confidence score -``` - -### Resilience Features - -The aggregator is **resilient by default**, designed to handle real-world failures gracefully: - -- **Handles missing inputs**: Processes available results even if some agents fail or timeout -- **Timeout-based collection**: Waits specified time for inputs, then proceeds with what's available -- **Minimum response requirements**: Can specify minimum agents needed before aggregating -- **Confidence scoring**: Weights results by confidence levels when available -- **Partial result support**: Aggregates whatever is available, doesn't require all sources - -### When to Use Each Strategy - -**Use deterministic (voting_*)** when: - -- Reproducibility required (testing, debugging, compliance) -- Cost optimization critical ($0 vs $$ per aggregation) -- Speed essential (instant vs seconds) -- Simple aggregation sufficient (voting, selection) -- Auditing requires deterministic behavior - -**Use LLM-powered** when: - -- Semantic synthesis needed (narrative combining diverse viewpoints) -- Conflict resolution requires reasoning -- Narrative output preferred over structured votes -- Complex cross-agent analysis required -- Quality justifies cost and latency - -### Strategy Selection Decision Tree - -```text -Need reproducibility (same input → same output)? -├─ YES → Use deterministic strategies (voting_*) -│ ├─ All must agree? → voting_unanimous -│ ├─ Different expertise levels? → voting_weighted -│ ├─ Defer to most confident? → voting_confidence -│ └─ Equal weight voting? → voting_majority -│ -└─ NO → Use LLM-powered strategies - ├─ 10+ agents? 
→ hierarchical
-    ├─ Need citations? → rag_based
-    ├─ Many similar outputs? → semantic
-    ├─ Different expertise? → weighted
-    └─ Balanced synthesis? → consensus
-```
-
-### Example: Resilient Aggregation
-
-```yaml
-agents:
-  - name: policy-aggregator
-    role: aggregator
-    model: gpt-4-turbo
-    inputs:
-      - source: legal-expert
-      - source: technical-expert
-      - source: business-expert
-      - source: compliance-expert
-      - source: security-expert
-    outputs:
-      - target: final-report
-    aggregator_config:
-      # Deterministic voting for reproducibility
-      aggregation_strategy: voting_majority
-
-      # Resilience settings
-      timeout_ms: 5000      # Wait 5s for inputs
-      max_input_sources: 5  # Expect up to 5 agents
-      min_input_sources: 3  # Require at least 3 responses
-
-      # Even if only 3 of 5 respond in time, aggregation proceeds
-      # If fewer than 3 respond, aggregation fails gracefully
-```
-
-**Behavior:**
-
-- If all 5 experts respond within 5s: Aggregate all inputs
-- If 3-4 respond: Proceed with available inputs (meets minimum)
-- If <3 respond: Fail with clear error message
-- If some agents fail: Continue with successful responses
-
-See the [resilient-aggregation example](https://github.com/aixgo-dev/aixgo/tree/main/examples/resilient-aggregation) for a complete implementation.
- -### Full Configuration Example - -```yaml -agents: - - name: research-synthesizer - role: aggregator - model: gpt-4-turbo - inputs: - - source: expert-1 - - source: expert-2 - - source: expert-3 - outputs: - - target: final-report - aggregator_config: - # Strategy selection - aggregation_strategy: consensus - - # Conflict handling - conflict_resolution: llm_mediated - - # Deduplication - deduplication_method: semantic - - # Enable summarization - summarization_enabled: true - - # Maximum agents to aggregate - max_input_sources: 10 - - # Timeout for collecting inputs (ms) - timeout_ms: 5000 - - # Semantic clustering threshold - semantic_similarity_threshold: 0.85 - - # Source weights (for weighted strategy) - source_weights: - expert-1: 1.0 - expert-2: 0.7 - expert-3: 0.5 - - # Consensus threshold - consensus_threshold: 0.7 - - # LLM parameters - temperature: 0.5 - max_tokens: 1500 -``` - -### Example Output - -```json -{ - "strategy": "consensus", - "consensus_level": 0.87, - "aggregated_content": "After analyzing all expert inputs, the following synthesis emerges...", - "conflicts_resolved": [ - { - "topic": "implementation_approach", - "conflicting_sources": ["expert-1", "expert-2"], - "resolution": "Hybrid approach combining both perspectives", - "reasoning": "Expert 1's architectural concerns addressed by Expert 2's practical constraints" - } - ], - "semantic_clusters": [ - { - "cluster_id": "cluster_0", - "members": ["expert-1", "expert-3"], - "core_concept": "technical_implementation", - "avg_similarity": 0.89 - } - ], - "tokens_used": 1250 -} -``` - -### Best Practices - -#### Strategy Selection - -- **Consensus**: Use when you need balanced synthesis with conflict transparency -- **Weighted**: Use when certain agents have more expertise or authority -- **Semantic**: Use for deduplication and thematic organization (5+ agents) -- **Hierarchical**: Use for scalability with many agents (10+) -- **RAG-based**: Use for question answering with source 
attribution - -#### Timeout Configuration - -Set based on expected agent response times: - -- Fast agents (1-2s): `timeout_ms: 3000` -- Standard agents (3-5s): `timeout_ms: 5000` -- Complex agents (5-10s): `timeout_ms: 10000` - -#### Token Management - -Typical token usage: - -- 2-3 agents: 500-1000 tokens -- 4-6 agents: 1000-1500 tokens -- 7-10 agents: 1500-2500 tokens -- 10+ agents: Use hierarchical strategy - -**Learn more**: -- [Aggregator agent documentation](https://github.com/aixgo-dev/aixgo/tree/main/agents) -- [Aggregator workflow example](https://github.com/aixgo-dev/aixgo/tree/main/examples/aggregator-workflow) - -## Planner Agent - -Planner agents decompose complex tasks into executable steps and orchestrate their execution across multiple agents. - -### When to Use - -- Complex multi-step workflows requiring coordination -- Dynamic task decomposition based on context -- Adaptive workflow execution -- Resource allocation and scheduling -- Goal-oriented planning with dependencies - -### Configuration - -```yaml -agents: - - name: task-planner - role: planner - model: gpt-4-turbo - prompt: 'You are a task planning expert' - inputs: - - source: user-requests - outputs: - - target: execution-queue -``` - -**Learn more**: [Planner examples](https://github.com/aixgo-dev/aixgo/tree/main/examples/planner-workflow) - -## Integration Patterns - -### Parallel Classification + Aggregation - -Combine multiple classifiers with an aggregator for comprehensive analysis: - -```yaml -agents: - # Input producer - - name: content-source - role: producer - outputs: - - target: content - - # Parallel classifiers - - name: sentiment-classifier - role: classifier - inputs: - - source: content - outputs: - - target: classifications - classifier_config: - categories: - - name: positive - description: "Positive sentiment" - - name: negative - description: "Negative sentiment" - - - name: topic-classifier - role: classifier - inputs: - - 
source: content - outputs: - - target: classifications - classifier_config: - categories: - - name: technology - description: "Technology-related content" - - name: business - description: "Business-related content" - - # Aggregator combines classifications - - name: final-classifier - role: aggregator - inputs: - - source: classifications - outputs: - - target: final-output - aggregator_config: - aggregation_strategy: consensus -``` - -### Multi-Expert Research Pipeline - -Deploy specialized experts with weighted aggregation: - -```yaml -agents: - # Expert agents - - name: technical-expert - role: react - model: gpt-4-turbo - prompt: "You are a technical architecture expert" - outputs: - - target: expert-analyses - - - name: security-expert - role: react - model: gpt-4-turbo - prompt: "You are a security expert" - outputs: - - target: expert-analyses - - - name: business-expert - role: react - model: gpt-4-turbo - prompt: "You are a business analyst" - outputs: - - target: expert-analyses - - # Weighted aggregator - - name: research-synthesis - role: aggregator - model: gpt-4-turbo - inputs: - - source: expert-analyses - outputs: - - target: final-report - aggregator_config: - aggregation_strategy: weighted - source_weights: - technical-expert: 0.9 - security-expert: 0.95 - business-expert: 0.7 -``` - -## Performance Considerations - -### Token Usage Optimization - -- **Producer**: No LLM calls, zero token usage -- **ReAct**: 200-2000 tokens per message (depends on complexity) -- **Logger**: No LLM calls, zero token usage -- **Classifier**: 200-500 tokens per classification (add 150-300 for few-shot) -- **Aggregator**: 500-2500 tokens (scales with agent count) -- **Planner**: 300-1000 tokens per planning operation - -### Latency Guidelines - -- **Producer**: <10ms (local generation) -- **ReAct**: 500ms-5s (LLM-dependent) -- **Logger**: <50ms (I/O-dependent) -- **Classifier**: 500ms-2s (LLM-dependent) -- **Aggregator**: 1s-5s (scales with agent count) -- 
**Planner**: 1s-3s (LLM-dependent) - -### Cost Management - -Choose appropriate models for your use case: - -```yaml -# Production traffic - balance cost and quality -model: gpt-4o-mini - -# Critical decisions - maximum accuracy -model: gpt-4-turbo - -# High volume, simple tasks - lowest cost -model: gpt-3.5-turbo -``` - -## Next Steps - -- **[Multi-Agent Orchestration](/guides/multi-agent-orchestration)** - Learn orchestration patterns -- **[Classifier Example](/examples/classifier-workflow)** - Hands-on classifier implementation -- **[Aggregator Example](/examples/aggregator-workflow)** - Multi-agent synthesis walkthrough -- **[Provider Integration](/guides/provider-integration)** - Configure LLM providers -- **[Production Deployment](/guides/production-deployment)** - Deploy agent systems - -## Additional Resources - -- [Agent Framework Source Code](https://github.com/aixgo-dev/aixgo/tree/main/agents) -- [Complete Examples Directory](https://github.com/aixgo-dev/aixgo/tree/main/examples) -- [API Documentation](https://pkg.go.dev/github.com/aixgo-dev/aixgo) diff --git a/web/content/guides/aws-bedrock.md b/web/content/guides/aws-bedrock.md deleted file mode 100644 index a26bcfc..0000000 --- a/web/content/guides/aws-bedrock.md +++ /dev/null @@ -1,921 +0,0 @@ ---- -title: 'AWS Bedrock Integration Guide' -description: 'Integrate Aixgo with Amazon Bedrock for enterprise-grade AI models with single API access, regional deployment, and AWS security.' -breadcrumb: 'Reference' -category: 'Reference' -weight: 12 ---- - -## Introduction - -Amazon Bedrock is AWS's fully managed service providing API access to foundation models from leading AI companies through a unified interface. Bedrock enables you to build and scale generative AI applications with enterprise security, compliance, and operational simplicity. - -### What is Amazon Bedrock? 
- -Bedrock offers access to foundation models from Anthropic (Claude), Amazon (Nova, Titan), Meta (Llama), Mistral AI, and more through a single API. Unlike direct provider integrations, Bedrock provides: - -- **Unified API** - Single integration for multiple model providers -- **AWS Security** - IAM-based access control, VPC endpoints, encryption at rest/in transit -- **Compliance** - HIPAA, GDPR, SOC 2, ISO 27001 certified -- **Regional Deployment** - Deploy in AWS regions globally with data residency compliance -- **Cost Management** - Consolidated billing, cost allocation tags, AWS Cost Explorer integration -- **Guardrails** - Built-in content filtering, PII detection, topic blocking - -### Benefits for Go Applications - -Integrating Bedrock with Aixgo unlocks powerful advantages: - -**No Python Dependencies** - Pure Go implementation eliminates the 1GB+ Python containers and 10-45 second cold starts typical of Python frameworks. Aixgo with Bedrock delivers <100ms cold starts in single binaries under 20MB. - -**Enterprise Security** - Leverage AWS IAM roles, VPC endpoints, and AWS PrivateLink to keep model requests within your VPC without internet traversal. Bedrock never uses your data for model training. - -**Multi-Region Resilience** - Deploy agents across multiple AWS regions with automatic failover. Bedrock is available in 10+ regions globally. - -**Cost Optimization** - Use AWS Cost Explorer to track model usage by project, environment, or agent. Set up billing alarms to prevent runaway costs. 
- -### When to Use Bedrock vs Direct Provider APIs - -**Choose Bedrock when you:** - -- Already operate AWS infrastructure -- Need compliance certifications (HIPAA, GDPR, SOC 2) -- Require data residency in specific AWS regions -- Want consolidated billing across multiple model providers -- Need VPC isolation for sensitive workloads - -**Choose Direct Provider APIs when you:** - -- Operate multi-cloud or cloud-agnostic infrastructure -- Need the absolute latest model versions (Bedrock has ~1-4 week lag) -- Require provider-specific features not yet in Bedrock -- Prefer direct provider support relationships - -## Prerequisites - -### AWS Account Setup - -1. **Create AWS Account** (if not already created): - - Visit https://aws.amazon.com - - Complete account registration - - Set up billing payment method - -1. **Enable Bedrock in your region**: - - Navigate to AWS Bedrock console: https://console.aws.amazon.com/bedrock - - Select your preferred region (e.g., us-east-1, us-west-2, eu-west-1) - - Bedrock automatically provisions in enabled regions - -### IAM Permissions - -Create an IAM policy with the minimum required permissions: - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Sid": "BedrockModelAccess", - "Effect": "Allow", - "Action": [ - "bedrock:InvokeModel", - "bedrock:InvokeModelWithResponseStream", - "bedrock:ListFoundationModels", - "bedrock:GetFoundationModel" - ], - "Resource": [ - "arn:aws:bedrock:*::foundation-model/*" - ] - } - ] -} -``` - -Attach this policy to: - -- An IAM user (for development) -- An IAM role (for production EC2/ECS/EKS deployments) - -### Model Access Requests - -Before using models, request access in the Bedrock console: - -1. Navigate to **Model access** in the Bedrock console -1. Select models you want to use: - - Anthropic Claude models (claude-3-5-sonnet, claude-3-haiku, etc.) 
- - Amazon Nova models (nova-pro, nova-lite, nova-micro) - - Meta Llama models (llama3-70b, llama3-8b) - - Mistral AI models (mistral-large, mistral-7b) - - Amazon Titan models (titan-text-express, titan-text-lite) -1. Click **Request model access** -1. Wait for approval (usually instant for most models) - -### Go and Aixgo Installation - -**Install Go 1.26+**: - -```bash -# Verify Go version -go version # Should show 1.26 or higher - -# If Go is not installed, download from https://go.dev/dl/ -``` - -**Install Aixgo**: - -```bash -go get github.com/aixgo-dev/aixgo -``` - -## Quick Start - -### Environment Setup - -Configure AWS credentials using one of these methods: - -**Method 1: Environment Variables** - -```bash -export AWS_ACCESS_KEY_ID= -export AWS_SECRET_ACCESS_KEY= -export AWS_REGION=us-east-1 -``` - -**Method 2: AWS CLI Configuration** - -```bash -aws configure -# Enter access key, secret key, and region when prompted -``` - -**Method 3: IAM Roles (Recommended for Production)** - -When running on EC2, ECS, or EKS, attach an IAM role with Bedrock permissions. No credentials needed in code. - -### First API Call Example - -Create a simple agent using Claude 3.5 Sonnet via Bedrock: - -**config/bedrock-agent.yaml**: - -```yaml -supervisor: - name: bedrock-coordinator - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-east-1 - -agents: - - name: bedrock-analyst - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-east-1 - prompt: | - You are a data analyst using AWS Bedrock. - Analyze incoming data and provide insights. 
- temperature: 0.7 - max_tokens: 1000 -``` - -**main.go**: - -```go -package main - -import ( - "github.com/aixgo-dev/aixgo" - _ "github.com/aixgo-dev/aixgo/agents" -) - -func main() { - if err := aixgo.Run("config/bedrock-agent.yaml"); err != nil { - panic(err) - } -} -``` - -**Run it**: - -```bash -export AWS_REGION=us-east-1 -go run main.go -``` - -### Model Selection Guidance - -Choose models based on your use case: - -**For complex reasoning and analysis**: `anthropic.claude-3-5-sonnet-20240620-v1:0` - -**For fast, cost-effective tasks**: `anthropic.claude-3-haiku-20240307-v1:0` or `amazon.nova-micro-v1:0` - -**For multimodal (text + images)**: `anthropic.claude-3-5-sonnet-20240620-v1:0` or `amazon.nova-pro-v1:0` - -**For long-context processing**: `anthropic.claude-3-5-sonnet-20240620-v1:0` (200K tokens) - -**For code generation**: `meta.llama3-70b-instruct-v1:0` or `anthropic.claude-3-5-sonnet-20240620-v1:0` - -## Configuration Options - -### YAML Configuration - -**Basic agent configuration**: - -```yaml -agents: - - name: bedrock-researcher - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-east-1 - temperature: 0.5 - max_tokens: 2000 - top_p: 0.9 -``` - -**Multi-model configuration with fallback**: - -```yaml -agents: - - name: resilient-agent - role: react - providers: - # Primary: Claude 3.5 Sonnet (most capable) - - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-east-1 - - # Fallback 1: Amazon Nova Pro (fast, cost-effective) - - model: amazon.nova-pro-v1:0 - provider: bedrock - region: us-east-1 - - # Fallback 2: Claude 3 Haiku (cheapest) - - model: anthropic.claude-3-haiku-20240307-v1:0 - provider: bedrock - region: us-west-2 - - fallback_strategy: cascade -``` - -### Go SDK Programmatic Usage - -**Direct agent creation**: - -```go -import ( - "github.com/aixgo-dev/aixgo" - "github.com/aixgo-dev/aixgo/providers/bedrock" -) - -func main() { - agent := aixgo.NewAgent( - 
aixgo.WithName("bedrock-agent"), - aixgo.WithModel("anthropic.claude-3-5-sonnet-20240620-v1:0"), - aixgo.WithProvider(bedrock.Provider{ - Region: "us-east-1", - }), - aixgo.WithTemperature(0.7), - aixgo.WithMaxTokens(1000), - ) - - // Use agent... -} -``` - -**With custom AWS credentials**: - -```go -import ( - "github.com/aws/aws-sdk-go-v2/config" - "github.com/aws/aws-sdk-go-v2/credentials" -) - -cfg, err := config.LoadDefaultConfig(ctx, - config.WithRegion("us-east-1"), - config.WithCredentialsProvider( - credentials.NewStaticCredentialsProvider( - "access-key-id", - "secret-access-key", - "", - ), - ), -) - -agent := aixgo.NewAgent( - aixgo.WithProvider(bedrock.Provider{ - AWSConfig: cfg, - }), -) -``` - -### Multi-Region Deployment - -Deploy agents across multiple regions for resilience: - -```yaml -agents: - # Primary region: us-east-1 - - name: us-east-agent - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-east-1 - - # Secondary region: us-west-2 - - name: us-west-agent - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-west-2 - - # European region: eu-west-1 - - name: eu-agent - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: eu-west-1 -``` - -## Authentication - -### Environment Variables - -**Standard AWS credentials**: - -```bash -export AWS_ACCESS_KEY_ID= -export AWS_SECRET_ACCESS_KEY= -export AWS_REGION=us-east-1 -``` - -**With session token (for temporary credentials)**: - -```bash -export AWS_ACCESS_KEY_ID= -export AWS_SECRET_ACCESS_KEY= -export AWS_SESSION_TOKEN= -export AWS_REGION=us-east-1 -``` - -### IAM Roles for EC2/ECS/EKS - -**Recommended for production deployments**. Attach an IAM role to your compute instance: - -**EC2 Instance**: - -1. Create IAM role with Bedrock permissions policy -1. Attach role to EC2 instance -1. 
No credentials needed in code - SDK automatically uses instance metadata - -**ECS Task**: - -```json -{ - "taskRoleArn": "arn:aws:iam::123456789012:role/BedrockTaskRole", - "containerDefinitions": [ - { - "name": "aixgo-agent", - "image": "myapp:latest", - "environment": [ - {"name": "AWS_REGION", "value": "us-east-1"} - ] - } - ] -} -``` - -**EKS Pod (IRSA - IAM Roles for Service Accounts)**: - -```yaml -apiVersion: v1 -kind: ServiceAccount -metadata: - name: aixgo-agent - annotations: - eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/BedrockPodRole ---- -apiVersion: apps/v1 -kind: Deployment -metadata: - name: aixgo-agent -spec: - template: - spec: - serviceAccountName: aixgo-agent - containers: - - name: agent - image: myapp:latest - env: - - name: AWS_REGION - value: us-east-1 -``` - -### AWS Profiles - -Use named profiles for development: - -```bash -# ~/.aws/credentials -[default] -aws_access_key_id = -aws_secret_access_key = - -[production] -aws_access_key_id = -aws_secret_access_key = -``` - -```bash -# Use specific profile -export AWS_PROFILE=production -go run main.go -``` - -### Cross-Account Access - -Access Bedrock in another AWS account using role assumption: - -```go -import ( - "github.com/aws/aws-sdk-go-v2/config" - "github.com/aws/aws-sdk-go-v2/credentials/stscreds" - "github.com/aws/aws-sdk-go-v2/service/sts" -) - -cfg, err := config.LoadDefaultConfig(ctx) -stsClient := sts.NewFromConfig(cfg) - -creds := stscreds.NewAssumeRoleProvider(stsClient, - "arn:aws:iam::123456789012:role/CrossAccountBedrockRole") - -cfg, err = config.LoadDefaultConfig(ctx, - config.WithCredentialsProvider(creds), -) -``` - -## Available Models - -### Model ID Reference Table - -| Provider | Model Name | Model ID | Context Length | Features | -|----------|-----------|----------|----------------|----------| -| **Anthropic** | Claude 3.5 Sonnet | `anthropic.claude-3-5-sonnet-20240620-v1:0` | 200K tokens | Tool use, vision, long context | -| **Anthropic** | Claude 
3 Opus | `anthropic.claude-3-opus-20240229-v1:0` | 200K tokens | Most capable, highest quality | -| **Anthropic** | Claude 3 Sonnet | `anthropic.claude-3-sonnet-20240229-v1:0` | 200K tokens | Balanced performance/cost | -| **Anthropic** | Claude 3 Haiku | `anthropic.claude-3-haiku-20240307-v1:0` | 200K tokens | Fastest, lowest cost | -| **Amazon** | Nova Pro | `amazon.nova-pro-v1:0` | 300K tokens | Multimodal, video understanding | -| **Amazon** | Nova Lite | `amazon.nova-lite-v1:0` | 300K tokens | Fast, cost-effective | -| **Amazon** | Nova Micro | `amazon.nova-micro-v1:0` | 128K tokens | Ultra-low latency | -| **Meta** | Llama 3.1 405B | `meta.llama3-1-405b-instruct-v1:0` | 128K tokens | Largest, most capable | -| **Meta** | Llama 3.1 70B | `meta.llama3-1-70b-instruct-v1:0` | 128K tokens | High quality, open source | -| **Meta** | Llama 3.1 8B | `meta.llama3-1-8b-instruct-v1:0` | 128K tokens | Fast, cost-effective | -| **Meta** | Llama 3 70B | `meta.llama3-70b-instruct-v1:0` | 8K tokens | Strong performance | -| **Meta** | Llama 3 8B | `meta.llama3-8b-instruct-v1:0` | 8K tokens | Lightweight | -| **Mistral AI** | Mistral Large 2 | `mistral.mistral-large-2407-v1:0` | 128K tokens | Complex reasoning | -| **Mistral AI** | Mistral Small | `mistral.mistral-small-2402-v1:0` | 32K tokens | Cost-efficient | -| **Mistral AI** | Mixtral 8x7B | `mistral.mixtral-8x7b-instruct-v0:1` | 32K tokens | Mixture of experts | -| **Amazon** | Titan Text Express | `amazon.titan-text-express-v1` | 8K tokens | AWS-native, summarization | -| **Amazon** | Titan Text Lite | `amazon.titan-text-lite-v1` | 4K tokens | Ultra-low cost | - -### Regional Availability - -Model availability varies by region. 
Check current availability: - -```bash -aws bedrock list-foundation-models --region us-east-1 -``` - -**Generally available in**: us-east-1, us-west-2, eu-west-1, eu-central-1, ap-southeast-1, ap-northeast-1 - -## Advanced Features - -### Tool Calling - -Enable agents to call functions using Bedrock's tool use API: - -```yaml -agents: - - name: tool-agent - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-east-1 - prompt: | - You are an assistant with access to tools. - Use the get_weather tool to answer weather questions. - tools: - - name: get_weather - description: Get current weather for a location - parameters: - type: object - properties: - location: - type: string - description: City name - unit: - type: string - enum: [celsius, fahrenheit] - required: [location] -``` - -**Go implementation**: - -```go -type WeatherTool struct{} - -func (t *WeatherTool) Execute(ctx context.Context, args map[string]any) (any, error) { - location := args["location"].(string) - unit := args["unit"].(string) - - // Call weather API - weather, err := getWeather(location) - if err != nil { - return nil, err - } - - return map[string]any{ - "temperature": convertTemp(weather.Temp, unit), - "conditions": weather.Conditions, - "location": location, - }, nil -} -``` - -### Structured Output - -Use JSON schema to enforce response structure: - -```yaml -agents: - - name: structured-agent - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-east-1 - prompt: | - Extract customer information from the text. - Return as JSON with name, email, phone fields. - output_schema: - type: object - properties: - name: - type: string - email: - type: string - format: email - phone: - type: string - required: [name, email] -``` - -Aixgo validates responses against schema with automatic retry on validation failures (40-70% improved reliability). 
- -### Streaming Responses - -Stream model outputs for real-time user experiences: - -```go -agent := aixgo.NewAgent( - aixgo.WithModel("anthropic.claude-3-5-sonnet-20240620-v1:0"), - aixgo.WithProvider(bedrock.Provider{Region: "us-east-1"}), - aixgo.WithStreaming(true), -) - -stream, err := agent.StreamExecute(ctx, input) -for chunk := range stream { - fmt.Print(chunk.Content) -} -``` - -### Guardrails Integration - -Apply AWS Bedrock Guardrails for content filtering: - -```yaml -agents: - - name: safe-agent - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-east-1 - bedrock_config: - guardrail_id: - guardrail_version: "1" -``` - -Guardrails provide: - -- **Content filtering** - Block harmful, toxic, or inappropriate content -- **PII redaction** - Automatically detect and mask sensitive information -- **Topic blocking** - Prevent discussion of specific topics -- **Word filtering** - Block profanity and custom word lists - -## Production Deployment - -### VPC Endpoints - -Use VPC endpoints to keep Bedrock traffic within your VPC: - -1. **Create VPC endpoint**: - -```bash -aws ec2 create-vpc-endpoint \ - --vpc-id vpc-12345678 \ - --service-name com.amazonaws.us-east-1.bedrock-runtime \ - --route-table-ids rtb-12345678 \ - --subnet-ids subnet-12345678 -``` - -1. **Configure security group**: - -Allow HTTPS (port 443) from your application subnets. - -1. **Update endpoint policy** (optional): - -```json -{ - "Statement": [ - { - "Principal": "*", - "Action": [ - "bedrock:InvokeModel", - "bedrock:InvokeModelWithResponseStream" - ], - "Effect": "Allow", - "Resource": "arn:aws:bedrock:*::foundation-model/*" - } - ] -} -``` - -**Benefit**: Traffic never leaves AWS network, improving security and reducing latency. 
- -### CloudTrail Logging - -Enable CloudTrail to audit all Bedrock API calls: - -```bash -aws cloudtrail create-trail \ - --name bedrock-audit-trail \ - --s3-bucket-name my-cloudtrail-bucket - -aws cloudtrail start-logging --name bedrock-audit-trail -``` - -CloudTrail logs capture: - -- Model invocations (InvokeModel, InvokeModelWithResponseStream) -- Request metadata (timestamp, IAM principal, source IP) -- Model IDs and regions used -- Error responses - -Use for compliance auditing, security analysis, and debugging. - -### Cost Management - -**Set up billing alarms**: - -```bash -aws cloudwatch put-metric-alarm \ - --alarm-name bedrock-cost-alert \ - --alarm-description "Alert when Bedrock costs exceed $100/day" \ - --metric-name EstimatedCharges \ - --namespace AWS/Billing \ - --statistic Maximum \ - --period 86400 \ - --threshold 100 \ - --comparison-operator GreaterThanThreshold -``` - -**Track costs by tag**: - -Tag Bedrock invocations using IAM role tags or application tags: - -```go -// Tag IAM role with project/environment -// Costs automatically allocated in Cost Explorer -``` - -**Use AWS Cost Explorer** to analyze: - -- Cost by model (Claude vs Nova vs Llama) -- Cost by region -- Cost by project or environment -- Trend analysis and forecasting - -### Multi-Region Failover - -Implement automatic failover across regions: - -```yaml -agents: - - name: resilient-agent - role: react - providers: - # Primary: us-east-1 - - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-east-1 - timeout: 30s - - # Failover 1: us-west-2 - - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-west-2 - timeout: 30s - - # Failover 2: eu-west-1 - - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: eu-west-1 - timeout: 30s - - fallback_strategy: cascade - retry: - max_attempts: 2 - initial_backoff: 1s -``` - -Aixgo automatically tries each region in order if the primary fails. 
- -## Cost Optimization - -### Model Selection by Use Case - -**Choose the right model for each task to optimize costs**: - -| Use Case | Recommended Model | Why | -|----------|------------------|-----| -| Simple classification | `amazon.nova-micro-v1:0` | Lowest cost, sub-second latency | -| Chatbots | `anthropic.claude-3-haiku-20240307-v1:0` | Fast, cost-effective, natural conversation | -| Data analysis | `anthropic.claude-3-5-sonnet-20240620-v1:0` | Strong reasoning, tool use | -| Content generation | `meta.llama3-1-70b-instruct-v1:0` | High quality, lower cost than Claude | -| Code generation | `anthropic.claude-3-5-sonnet-20240620-v1:0` | Best for complex code | -| Summarization | `amazon.titan-text-express-v1` | Purpose-built, ultra-low cost | -| Complex reasoning | `anthropic.claude-3-opus-20240229-v1:0` | Most capable (use sparingly) | - -**Cost comparison (approximate per 1M tokens)**: - -- Claude 3 Haiku: $0.25 input / $1.25 output -- Amazon Nova Micro: $0.035 input / $0.14 output -- Amazon Titan Express: $0.20 input / $0.60 output -- Llama 3.1 70B: $0.99 input / $0.99 output -- Claude 3.5 Sonnet: $3.00 input / $15.00 output - -### Token Usage Monitoring - -**Enable token tracking in Aixgo**: - -```yaml -observability: - llm_observability: - enabled: true - track_tokens: true - daily_token_limit: 1000000 - cost_alert_threshold: 50 # Alert if daily cost > $50 -``` - -**Track via CloudWatch metrics**: - -Aixgo publishes custom metrics to CloudWatch: - -- `LLM/TokensUsed` - Total tokens per request -- `LLM/InputTokens` - Prompt tokens -- `LLM/OutputTokens` - Completion tokens -- `LLM/Cost` - Estimated cost per request - -**Set up cost dashboards**: - -Create CloudWatch dashboard to monitor: - -- Daily token usage trend -- Cost by agent -- Cost by model -- Anomaly detection - -## Troubleshooting - -### Common Errors - -#### AccessDenied - -**Error Message**: - -```text -An error occurred (AccessDeniedException) when calling the InvokeModel operation: -User: 
arn:aws:iam::123456789012:user/myuser is not authorized to perform: -bedrock:InvokeModel on resource: arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0 -``` - -**Solutions**: - -1. **Verify IAM permissions** - Ensure user/role has `bedrock:InvokeModel` permission -1. **Check resource ARN** - Confirm policy allows access to specific model -1. **Verify model access** - Request model access in Bedrock console if not already granted -1. **Check region** - Ensure IAM policy allows access in the target region - -```bash -# Test IAM permissions -aws bedrock invoke-model \ - --model-id anthropic.claude-3-5-sonnet-20240620-v1:0 \ - --region us-east-1 \ - --body '{"anthropic_version": "bedrock-2023-05-31", "messages": [{"role": "user", "content": "Hi"}], "max_tokens": 100}' \ - --content-type application/json \ - output.txt -``` - -#### ThrottlingException - -**Error Message**: - -```text -An error occurred (ThrottlingException) when calling the InvokeModel operation: -Rate exceeded -``` - -**Solutions**: - -1. **Implement exponential backoff** - Aixgo does this automatically with retry configuration -1. **Request quota increase** - Submit request in AWS Service Quotas console -1. **Distribute load** - Use multiple regions or models -1. **Reduce request rate** - Add rate limiting in your application - -```yaml -agents: - - name: throttle-resilient-agent - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock - region: us-east-1 - retry: - max_attempts: 5 - initial_backoff: 2s - max_backoff: 30s - multiplier: 2 -``` - -#### ModelNotFound - -**Error Message**: - -```text -An error occurred (ResourceNotFoundException) when calling the InvokeModel operation: -Could not resolve the foundation model from the model identifier: anthropic.claude-3-5-sonnet-20240620-v1:0 -``` - -**Solutions**: - -1. **Verify model ID** - Check for typos in model identifier -1. 
**Check regional availability** - Model may not be available in your region -1. **Request model access** - Enable model in Bedrock console Model Access page -1. **Use correct model ID format** - Bedrock model IDs differ from provider APIs - -```bash -# List available models in your region -aws bedrock list-foundation-models --region us-east-1 | grep modelId -``` - -### Debug Logging - -Enable debug logging for detailed request/response information: - -```bash -export AIXGO_DEBUG=true -export AWS_SDK_LOG_LEVEL=debug -go run main.go -``` - -Debug logs include: - -- Full request payloads sent to Bedrock -- Response bodies and headers -- Token counts and costs -- Retry attempts and backoff timing -- AWS SDK debug output - -### AWS Support Resources - -**AWS Support Plans**: - -- **Developer** - Business hours email support ($29/month) -- **Business** - 24/7 support, <1 hour response for urgent issues ($100/month) -- **Enterprise** - Dedicated TAM, <15 minute response for critical issues ($15,000/month) - -**Bedrock Documentation**: https://docs.aws.amazon.com/bedrock/ - -**AWS re:Post Community**: https://repost.aws/tags/TAL_SxuHzRSGusOWwH3XbQw/amazon-bedrock - -**Bedrock Quotas**: https://console.aws.amazon.com/servicequotas/ (search for "Bedrock") - -**Open Support Case**: https://console.aws.amazon.com/support/ - -## Next Steps - -Now that you have Bedrock integration configured, explore advanced Aixgo capabilities: - -- **[Multi-Agent Orchestration](/guides/multi-agent-orchestration)** - Build complex workflows with multiple Bedrock agents -- **[Observability](/guides/observability)** - Monitor Bedrock costs and performance with OpenTelemetry -- **[Production Deployment](/guides/production-deployment)** - Deploy Bedrock agents on ECS/EKS -- **[Vector Databases & RAG](/guides/vector-databases)** - Combine Bedrock with retrieval-augmented generation -- **[Cost Optimization](/guides/cost-optimization)** - Advanced techniques for reducing Bedrock costs -- **[Provider 
Integration](/guides/provider-integration)** - Compare Bedrock with direct provider APIs diff --git a/web/content/guides/chat-assistant.md b/web/content/guides/chat-assistant.md deleted file mode 100644 index 55ce17e..0000000 --- a/web/content/guides/chat-assistant.md +++ /dev/null @@ -1,534 +0,0 @@ ---- -title: "Interactive Coding Assistant" -description: "Use the aixgo chat command to interact with an AI coding assistant that can read and write files, run git operations, and execute terminal commands" -category: "Tools" -weight: 3 ---- - -The `aixgo chat` command provides an interactive coding assistant that combines conversational AI with practical development tools. It runs as a single lightweight binary with no external runtime dependencies. - -## Table of Contents - -- [Prerequisites](#prerequisites) -- [Installation](#installation) -- [API Key Setup](#api-key-setup) -- [Starting a Session](#starting-a-session) -- [CLI Flags](#cli-flags) -- [In-Session Commands](#in-session-commands) -- [Built-in Tools](#built-in-tools) -- [Session Management](#session-management) -- [Model Selection](#model-selection) -- [Security Model](#security-model) -- [Example Workflows](#example-workflows) -- [Troubleshooting](#troubleshooting) - ---- - -## Prerequisites - -- Go 1.26 or later (for building from source) -- At least one LLM provider API key (see [API Key Setup](#api-key-setup)) - ---- - -## Installation - -**Via `go install`:** - -```bash -go install github.com/aixgo-dev/aixgo/cmd/aixgo@latest -``` - -**Via pre-built binary:** - -```bash -curl -L https://github.com/aixgo-dev/aixgo/releases/latest/download/aixgo_Linux_x86_64.tar.gz | tar xz -sudo mv aixgo /usr/local/bin/ -``` - -Verify the installation: - -```bash -aixgo --version -``` - ---- - -## API Key Setup - -The assistant auto-detects which providers are available based on environment variables. Set at least one before starting a session. 
- -```bash -export ANTHROPIC_API_KEY=<your-anthropic-key> # Claude models -export OPENAI_API_KEY=<your-openai-key> # GPT models -export GOOGLE_API_KEY=<your-google-key> # Gemini models -export XAI_API_KEY=<your-xai-key> # Grok models -``` - -To see which models are available with your configured keys, run `aixgo models` (see [Model Selection](#model-selection)). - ---- - -## Starting a Session - -**Start with default model (`claude-sonnet-4-6`):** - -```bash -aixgo chat -``` - -**Start with a specific model:** - -```bash -aixgo chat --model gpt-4o -aixgo chat --model gemini-2.5-flash -``` - -**Resume a previous session:** - -```bash -aixgo chat --session <session-id> -``` - -**Disable streaming output:** - -```bash -aixgo chat --no-stream -``` - -Once started, the assistant displays a welcome prompt and waits for input on a `>` line. - -```text -╭──────────────────────────────────────────────────╮ -│ Aixgo Interactive Assistant │ -╰──────────────────────────────────────────────────╯ - Model: claude-sonnet-4-6 - Type /help for commands, /quit to exit - -> -``` - -To exit at any time, type `/quit` or press `Ctrl+C`. The session is saved automatically. - ---- - -## CLI Flags - -| Flag | Short | Default | Description | -|------|-------|---------|-------------| -| `--model` | `-m` | `claude-sonnet-4-6` (or `$AIXGO_MODEL`) | Model to use for the session | -| `--session` | `-s` | — | Resume an existing session by ID | -| `--no-stream` | — | `false` | Disable streaming and wait for full responses | - -The default model can be overridden with the `AIXGO_MODEL` environment variable: - -```bash -export AIXGO_MODEL=gpt-4o -aixgo chat # starts with gpt-4o -``` - ---- - -## In-Session Commands - -Commands are prefixed with `/` and interpreted directly rather than sent to the model.
- -| Command | Description | -|---------|-------------| -| `/model ` | Switch to a different model without losing conversation context | -| `/cost` | Display the current session cost summary (total cost, message count, active model) | -| `/save` | Write the session to disk immediately | -| `/clear` | Reset conversation history (prompts for confirmation before clearing) | -| `/help` | Print the command reference | -| `/quit` | Save the session and exit | - -**Example: switching models mid-conversation:** - -```text -> /model gpt-4o -Switched to model: gpt-4o - -> /cost - -Session Cost Summary: - Total cost: $0.0045 - Messages: 8 - Model: gpt-4o -``` - -**Notes:** - -- `/model` accepts any model name from the supported list. If the required API key is not set, the switch will fail with an error. -- `/clear` resets both the in-memory history and the coordinator state. The session file on disk is not deleted. -- `/quit` and `/exit` are aliases. - ---- - -## Built-in Tools - -The assistant has access to three categories of tools that the model can invoke automatically in response to natural language requests. You do not call these tools directly. - -### File Operations - -| Tool | Description | -|------|-------------| -| `read_file` | Read the contents of a file | -| `write_file` | Create or overwrite a file | -| `glob` | Find files matching a glob pattern | -| `grep` | Search file contents using a regex pattern | - -**Example prompts that use file tools:** - -```text -> Read the main.go file -> Find all Go files in the pkg directory -> Search for TODO comments across the codebase -> Create a new file at internal/util/strings.go with these helper functions -``` - -File tools validate paths before operating on them to prevent directory traversal. See [Security Model](#security-model) for details. 
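The traversal rule mentioned above can be sketched in a few lines of Go. This is an illustrative sketch only — `validatePath` is a hypothetical name, not the assistant's actual implementation, and the real file tools may apply additional checks:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// validatePath rejects any path containing a ".." component, the
// directory-traversal rule described in the Security Model section.
// Hypothetical sketch — not aixgo's actual implementation.
func validatePath(p string) error {
	for _, part := range strings.Split(filepath.ToSlash(p), "/") {
		if part == ".." {
			return fmt.Errorf("path traversal rejected: %s", p)
		}
	}
	return nil
}

func main() {
	fmt.Println(validatePath("pkg/security/auth.go")) // nil: stays inside the working tree
	fmt.Println(validatePath("../../etc/passwd"))     // error: ".." component rejected
}
```

Checking raw components (rather than cleaning first) means even paths that would resolve back inside the working directory, such as `a/../b`, are rejected — a deliberately conservative choice.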
- -### Git Operations - -| Tool | Description | -|------|-------------| -| `git_status` | Show working tree status | -| `git_diff` | View uncommitted changes | -| `git_commit` | Create a commit with a generated message | -| `git_log` | View recent commit history | - -**Example prompts that use git tools:** - -```text -> What files have I changed? -> Show me the diff for the session package -> Commit the current changes with an appropriate message -> What were the last five commits? -``` - -`git_commit` analyzes the staged changes, generates a commit message, and presents it for confirmation before committing. - -### Terminal Execution - -The assistant can execute shell commands through the `exec` tool. All executions require explicit user confirmation before running. - -**Confirmation prompt:** - -```text -Execute command: go test ./... -Continue? (y/n): -``` - -Respond `y` to proceed or `n` to cancel. The command output is returned to the model so it can interpret results and continue the conversation. - -**Example prompts that use the terminal tool:** - -```text -> Run the test suite -> What Go version is installed? -> Build the project -> Show running processes -``` - -See [Security Model](#security-model) for the full list of allowed commands and blocking rules. - ---- - -## Session Management - -Sessions are stored as JSON files at `~/.aixgo/sessions/`. Every message exchange is automatically saved after completion. Manual saves are available with `/save`. 
- -### Listing Sessions - -```bash -aixgo session list -``` - -Output: - -```text -Saved Sessions: -───────────────────────────────────────────────────────────────── -ID Model Messages Cost Last Updated -───────────────────────────────────────────────────────────────── -a1b2c3d4e5f6 claude-sonnet-4-6 14 $0.0234 10:42 -e5f6g7h8i9j0 gpt-4o 8 $0.0156 Mon 14:30 - -Resume a session with: aixgo chat --session <session-id> -Or: aixgo session resume <session-id> -``` - -### Resuming a Session - -```bash -aixgo session resume a1b2c3d4e5f6 -# equivalent to: -aixgo chat --session a1b2c3d4e5f6 -``` - -Resuming a session restores the full message history to the model's context window. The model retains awareness of prior exchanges. - -### Deleting a Session - -```bash -aixgo session delete e5f6g7h8i9j0 -``` - -This removes the session file from `~/.aixgo/sessions/`. The operation is not reversible. - -### Session File Format - -Each session is stored as a human-readable JSON file: - -```json -{ - "id": "a1b2c3d4", - "model": "claude-sonnet-4-6", - "created": "2026-03-08T10:00:00Z", - "updated": "2026-03-08T11:30:00Z", - "total_cost": 0.0234, - "messages": [ - { - "role": "user", - "content": "Read main.go", - "timestamp": "2026-03-08T10:00:00Z" - }, - { - "role": "assistant", - "content": "Here's the content of main.go...", - "timestamp": "2026-03-08T10:00:05Z", - "model": "claude-sonnet-4-6", - "cost": 0.0012 - } - ] -} -``` - -Session files can be read, copied, or backed up with standard file tools. - ---- - -## Model Selection - -### Dynamic Model Discovery - -Models are **fetched dynamically** from each provider's API, ensuring you always have access to the latest models available to your API key. Run `aixgo models` to view all available models: - -```bash -aixgo models -``` - -Output: - -```text -Fetching available models...
- -Available Models: -════════════════════════════════════════════════════════════════════════════════ -Model Provider Description Input/1M Output/1M -──────────────────────────────────────────────────────────────────────────────── -claude-sonnet-4-6 anthropic Smart, efficient for everyday... $3.00 $15.00 -claude-opus-4-6 anthropic Powerful for complex challenges $15.00 $75.00 -claude-haiku-4-5-20251001 anthropic Fastest for daily tasks $0.25 $1.25 -gpt-4o openai Latest GPT-4 with vision $2.50 $10.00 -gpt-4o-mini openai Smaller, faster GPT-4o $0.15 $0.60 -gemini-2.5-flash gemini Fast Gemini model $0.08 $0.30 - -Total: 12 models from 3 providers -``` - -### Force Refresh - -Model lists are cached for 5 minutes. To force a refresh from provider APIs: - -```bash -aixgo models --refresh -``` - -### Switching Models Mid-Conversation - -Use `/model` inside a session to switch without losing history: - -```text -> /model claude-haiku-4-5-20251001 -Switched to model: claude-haiku-4-5-20251001 -``` - -The conversation history carries over. Subsequent messages are sent to the new model. This is useful for routing straightforward tasks to lower-cost models and complex analysis to higher-capability models. - -### Cost Tracking - -The assistant displays per-message cost after each response when the cost exceeds $0.001: - -```text -[Cost: $0.0023 | Session total: $0.0045] -``` - -Use `/cost` for a full session summary at any time. - -> **Note:** Costs shown during streaming responses are estimated (approximately 4 characters per token). Non-streaming responses use exact token counts from the provider's API. - ---- - -## Security Model - -### Command Allowlist - -The terminal tool (`exec`) only runs commands from an explicit allowlist. Commands not on this list are rejected before any confirmation prompt is shown. 
- -**Allowed command categories:** - -| Category | Commands | -|----------|----------| -| Build tools | `go`, `make`, `npm`, `yarn`, `pnpm`, `cargo`, `gradle`, `mvn`, `pip`, `poetry` | -| Version control | `git` | -| File operations (read) | `ls`, `cat`, `head`, `tail`, `wc`, `find`, `grep`, `which`, `file`, `basename` | -| System info | `pwd`, `whoami`, `uname`, `date`, `env`, `echo`, `printf`, `dirname` | -| Process | `ps` | -| Network (read) | `curl`, `wget`, `ping`, `nslookup`, `host` | -| Utilities | `jq`, `yq`, `sed`, `awk`, `sort`, `uniq`, `diff`, `tree` | -| Containers | `docker` | -| Cloud CLIs | `gcloud`, `aws`, `az`, `kubectl` | - -### Blocked Subcommands - -Certain subcommands are blocked even for allowed base commands: - -| Command | Blocked subcommands | -|---------|---------------------| -| `git` | `push`, `reset`, `rebase`, `force-push` | -| `rm` | `-rf`, `-r`, `--recursive` | -| `docker` | `rm`, `rmi`, `prune`, `system prune` | - -### Shell Operator Restrictions - -The following shell operators are blocked to prevent bypass through chaining: - -- `&&`, `||`, `;` — command chaining -- `` ` ``, `$(` — command substitution -- `<` — input redirection - -Pipe (`|`) is permitted only when the pipe target is one of: `grep`, `head`, `tail`, `wc`, `sort`, `uniq`, `jq`, `awk`, `sed`. - -Output redirection (`>`) is permitted only to `/dev/null` or with `2>&1`. - -### Path Validation - -File tools reject paths containing `..` components to prevent directory traversal outside the working directory. - -### Confirmation Requirement - -The terminal tool always requires explicit user confirmation before executing a command. There is no way to pre-approve commands or disable this prompt. 
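The allowlist, blocked-subcommand, and operator rules above compose roughly as follows. This is a deliberately reduced sketch with hypothetical names (`commandAllowed`, a small subset of the real tables) — not the assistant's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// A reduced subset of the allowlist for illustration.
var allowedBase = map[string]bool{
	"go": true, "git": true, "ls": true, "cat": true, "grep": true,
}

// Subcommands blocked even for allowed base commands.
var blockedSub = map[string][]string{
	"git": {"push", "reset", "rebase"},
}

// commandAllowed applies three checks in order: base-command
// allowlist, shell-operator rejection, and blocked subcommands.
func commandAllowed(cmd string) bool {
	fields := strings.Fields(cmd)
	if len(fields) == 0 || !allowedBase[fields[0]] {
		return false
	}
	// Reject chaining and substitution operators outright.
	for _, op := range []string{"&&", "||", ";", "`", "$("} {
		if strings.Contains(cmd, op) {
			return false
		}
	}
	if len(fields) > 1 {
		for _, sub := range blockedSub[fields[0]] {
			if fields[1] == sub {
				return false
			}
		}
	}
	return true
}

func main() {
	fmt.Println(commandAllowed("git status"))      // true — allowed base, safe subcommand
	fmt.Println(commandAllowed("git push origin")) // false — blocked subcommand
	fmt.Println(commandAllowed("ls ; rm -rf /"))   // false — chaining operator rejected
}
```

Note the order matters: operator checks run against the whole string so an allowed base command cannot smuggle in a second, disallowed one.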
- ---- - -## Example Workflows - -### Code Review - -```text -> Read all Go files in the pkg/security directory -[Reads and summarizes each file] - -> Identify any inputs that are not validated before use -[Analyzes code for validation gaps] - -> Generate a summary of findings in SECURITY_REVIEW.md -[Creates the file] -``` - -### Refactoring - -```text -> Find all functions that return errors without wrapping them with %w -[Uses grep to locate unwrapped errors] - -> Show me the first three examples with their surrounding context -[Reads relevant file sections] - -> Update each one to use fmt.Errorf with %w -[Modifies files and reports changes] - -> Show me the diff before we commit -[Runs git diff] - -> Commit these changes -[Generates message, prompts for confirmation, creates commit] -``` - -### Documentation Generation - -```text -> Read all exported functions in pkg/mcp -[Scans and reads relevant files] - -> Generate API reference documentation in docs/mcp-api.md -[Creates structured documentation file] - -> Run the markdown linter on the new file -[Executes linter command after confirmation] -``` - -### Cost-Aware Multi-Model Workflow - -```text -> /model claude-haiku-4-5-20251001 -Switched to model: claude-haiku-4-5-20251001 - -> Generate boilerplate unit tests for the functions in agents/react.go -[Uses cheaper model for repetitive generation work] - -> /model claude-opus-4-6 -Switched to model: claude-opus-4-6 - -> Review the generated tests for correctness and edge cases -[Uses more capable model for analysis] - -> /cost - -Session Cost Summary: - Total cost: $0.0089 - Messages: 12 - Model: claude-opus-4-6 -``` - ---- - -## Troubleshooting - -**"command not allowed" when running a terminal command** - -The command is not in the allowlist. Check the [Allowed command categories](#command-allowlist) table. If you need an unlisted command, run it directly in your terminal outside the assistant. 
- -**"failed to get provider" on startup** - -The API key for the requested model is not set or is invalid. Run `aixgo models` to check which models are available, then verify the corresponding environment variable is exported correctly. - -**"failed to switch model" when using `/model`** - -The target model requires an API key that is not configured. Set the relevant environment variable and try again. - -**"No models available" when running `aixgo models`** - -No API keys are configured. Export at least one provider's API key: - -```bash -export ANTHROPIC_API_KEY=<your-key> -``` - -**Streaming output stops mid-response** - -This can occur due to network interruption or provider-side rate limiting. The session up to that point is saved. Resume with `aixgo chat --session <session-id>` and repeat the request. Use `--no-stream` to disable streaming if the problem persists. - -**High session costs** - -Long conversation histories are sent with every request, accumulating token usage. Use `/clear` to reset history when starting a new task, or start a fresh session with `aixgo chat`. - ---- - -## Next Steps - -- [Session Persistence](/guides/sessions/) - Detailed session API for programmatic access -- [Provider Integration](/guides/provider-integration/) - Configure additional LLM providers -- [Cost Optimization](/guides/cost-optimization/) - Strategies for reducing LLM spend -- [Quick Start](/guides/quick-start/) - Build multi-agent pipelines with the orchestration framework diff --git a/web/content/guides/core-concepts.md b/web/content/guides/core-concepts.md deleted file mode 100644 index c174a34..0000000 --- a/web/content/guides/core-concepts.md +++ /dev/null @@ -1,450 +0,0 @@ ---- -title: 'Core Concepts' -description: "Understand Aixgo's fundamental building blocks: agent types, supervisor patterns, and message-based communication."
-breadcrumb: 'Getting Started' -category: 'Getting Started' -weight: 2 ---- - -Aixgo implements a message-based multi-agent architecture inspired by successful production patterns. This guide explains the core concepts you need to understand before building complex systems. These patterns reflect our [philosophy](/why-aixgo) of production-first design with Go-native patterns. - -## Agent Types - -Aixgo provides six foundational agent types, each designed for specific roles in your system: - -### Producer Agents - -Producer agents generate periodic messages for downstream processing. They're ideal for data ingestion, event generation, or any scenario where you need to create messages on a -schedule. - -```yaml -agents: - - name: event-generator - role: producer - interval: 500ms - outputs: - - target: processor -``` - -**Use cases:** - -- Polling external APIs -- Generating synthetic data for testing -- Periodic health checks -- Time-based event triggers - -### ReAct Agents - -ReAct (Reasoning + Acting) agents combine LLM-powered reasoning with tool calling capabilities. They receive messages, reason about them using an LLM, and can execute tools to -perform actions. - -```yaml -agents: - - name: analyst - role: react - model: gpt-4-turbo - prompt: 'You are an expert data analyst.' - tools: - - name: query_database - description: 'Query the database' - input_schema: - type: object - properties: - query: { type: string } - required: [query] -``` - -**Use cases:** - -- Data analysis and enrichment -- Decision-making workflows -- Natural language processing -- Complex business logic with LLM reasoning - -### Logger Agents - -Logger agents consume and persist messages for observability, debugging, and audit trails. They're the endpoints of your data flows. 
- -```yaml -agents: - - name: audit-log - role: logger - inputs: - - source: analyst -``` - -**Use cases:** - -- Audit logging -- Debugging workflows -- Data persistence -- Monitoring and alerting - -### Classifier Agents - -Classifier agents categorize content or documents using LLM-powered classification with multiple strategies. They excel at organizing, tagging, and routing data based on -intelligent analysis. - -```yaml -agents: - - name: content-classifier - role: classifier - model: gpt-4-turbo - prompt: 'Classify incoming content into categories' - strategy: multi-label # Options: zero-shot, few-shot, multi-label, single-label - categories: - - technology - - business - - science - - entertainment -``` - -**Classification Strategies:** - -- **ZeroShot** - Classify without training examples -- **FewShot** - Use provided examples for better accuracy -- **MultiLabel** - Assign multiple categories to a single item -- **SingleLabel** - Assign exactly one category per item - -**Use cases:** - -- Content categorization and tagging -- Email routing and triage -- Document organization -- Sentiment analysis and topic detection - -### Aggregator Agents - -Aggregator agents combine outputs from multiple agents using sophisticated strategies. They synthesize information, build consensus, and create unified results from diverse inputs. 
- -```yaml -agents: - - name: result-aggregator - role: aggregator - model: gpt-4-turbo - prompt: 'Combine analysis from multiple sources' - strategy: semantic # Options: consensus, weighted, semantic, hierarchical, rag - inputs: - - source: analyzer-1 - - source: analyzer-2 - - source: analyzer-3 -``` - -**Aggregation Strategies:** - -- **Consensus** - Find agreement across inputs -- **Weighted** - Prioritize certain sources over others -- **Semantic** - Use embeddings to merge similar information -- **Hierarchical** - Organize information by importance -- **RAG** - Retrieval-augmented generation for context-aware synthesis - -**Use cases:** - -- Multi-agent consensus building -- Research synthesis from multiple sources -- Decision-making with diverse inputs -- Knowledge base construction - -### Planner Agents - -Planner agents create sophisticated reasoning chains and execution strategies. They break down complex problems into steps and coordinate multi-stage workflows. - -```yaml -agents: - - name: task-planner - role: planner - model: gpt-4-turbo - prompt: 'Plan the execution strategy' - strategy: chain-of-thought # Options: chain-of-thought, tree-of-thought, react, monte-carlo, backward-chaining, hierarchical -``` - -**Planning Strategies:** - -- **Chain-of-Thought** - Linear step-by-step reasoning -- **Tree-of-Thought** - Explore multiple reasoning paths -- **ReAct** - Reason and act iteratively -- **MonteCarlo** - Simulate multiple scenarios -- **Backward Chaining** - Work backwards from goal to steps -- **Hierarchical** - Break down into sub-problems - -**Use cases:** - -- Complex problem decomposition -- Multi-step workflow planning -- Strategic decision-making -- Research and analysis pipelines - -## Vector Stores & RAG - -Aixgo includes first-class support for vector databases and Retrieval-Augmented Generation (RAG), enabling your agents to access and search large knowledge bases with semantic understanding. - -### What are Vector Stores? 
- -Vector stores are databases optimized for similarity search using high-dimensional embeddings. They enable: - -- **Semantic Search**: Find information by meaning, not just keywords -- **Long-Term Memory**: Give agents persistent memory across sessions -- **Knowledge Grounding**: Reduce hallucinations by grounding responses in real data -- **Context Retrieval**: Provide relevant context for more accurate responses - -### Collection-Based Architecture - -Aixgo's vectorstore uses a Collection-based architecture for logical isolation: - -```go -import "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory" - -store, _ := memory.New() - -// Create isolated collections for different purposes -cache := store.Collection("cache", - vectorstore.WithTTL(5*time.Minute), - vectorstore.WithDeduplication(true), -) - -memory := store.Collection("agent-memory", - vectorstore.WithScope("user", "session"), -) - -docs := store.Collection("knowledge-base", - vectorstore.WithDimensions(1536), -) -``` - -### 10 Powerful Use Cases - -1. **Semantic Caching**: Cache LLM responses by meaning, not exact text -2. **Agent Memory**: Give agents persistent, scoped memory -3. **Conversation History**: Track and search dialogue history -4. **Content Deduplication**: Automatically detect duplicate content -5. **Multi-Modal Search**: Search across text, images, and documents -6. **Temporal Data**: Automatic expiration and time-based queries -7. **Multi-Tenancy**: Isolate data by tenant/user/session -8. **Batch Operations**: Efficient bulk indexing with progress tracking -9. **Streaming Queries**: Handle large result sets efficiently -10. 
**Advanced Filtering**: Complex boolean queries with metadata filters - -### Supported Providers - -- **Memory**: In-memory store for development and testing -- **Firestore**: Production-ready, serverless vector database -- **Qdrant** (coming soon): High-performance vector search -- **pgvector** (coming soon): PostgreSQL extension for vector search - -### Quick Example - -```go -// Index documents -doc := &vectorstore.Document{ - ID: "doc1", - Content: vectorstore.NewTextContent("Aixgo is a Go framework for AI agents"), - Embedding: vectorstore.NewEmbedding(embedding, "text-embedding-3-small"), - Tags: []string{"documentation", "framework"}, -} -result, _ := docs.Upsert(ctx, doc) - -// Semantic search -query := &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, "text-embedding-3-small"), - Limit: 5, - MinScore: 0.7, - Filters: vectorstore.TagFilter("documentation"), -} -results, _ := docs.Query(ctx, query) -``` - -### Learn More - -- **[Vector Databases Guide](/guides/vector-databases)**: Complete guide to building RAG systems -- **[Embeddings Guide](/guides/embeddings)**: Choosing and using embedding models -- **[RAG Example](/examples/rag-agent)**: Production-ready RAG implementation - -## The Supervisor Pattern - -The supervisor is the orchestration layer that manages agent lifecycle and message routing. 
It provides the following capabilities: - -### Lifecycle Management - -- **Dependency-aware startup** - Starts agents in the correct order based on their input/output relationships -- **Graceful shutdown** - Ensures clean termination of all agents -- **Health monitoring** - Tracks agent status and handles failures - -### Message Routing - -The supervisor routes messages between agents based on configured inputs and outputs: - -```yaml -supervisor: - name: coordinator - model: gpt-4-turbo - max_rounds: 10 - -agents: - - name: producer - role: producer - outputs: - - target: analyzer # Messages flow to analyzer - - - name: analyzer - role: react - inputs: - - source: producer # Receives from producer - outputs: - - target: logger # Sends to logger - - - name: logger - role: logger - inputs: - - source: analyzer # Receives from analyzer -``` - -### Execution Constraints - -- **Max rounds** - Limits total iterations to prevent runaway workflows -- **Timeouts** - Prevents agents from running indefinitely -- **Error handling** - Manages agent failures gracefully - -### Observability Hooks - -The supervisor provides distributed tracing integration, allowing you to: - -- Track message flow across agents -- Measure agent performance -- Debug complex workflows -- Monitor system health - -## Communication Abstraction - -One of Aixgo's most powerful features is its runtime abstraction layer that handles message transport automatically. - -### Local Mode (Development) - -Uses Go channels for in-process communication: - -```go -supervisor := aixgo.NewSupervisor("coordinator") -supervisor.AddAgent(producer) -supervisor.AddAgent(analyzer) -supervisor.Run() // Uses Go channels internally -``` - -**Benefits:** - -- Fast development iteration -- Easy debugging -- No infrastructure required -- Perfect for prototyping - -### Distributed Mode (Production) - -Uses gRPC with protobuf for multi-node orchestration: - -```go -// Same code as local mode! 
-supervisor := aixgo.NewSupervisor("coordinator") -supervisor.AddAgent(producer) -supervisor.AddAgent(analyzer) -supervisor.Run() // Uses gRPC when configured -``` - -**Benefits:** - -- Multi-region deployment -- Horizontal scaling -- Fault isolation -- Production-grade reliability - -### Automatic Selection - -The runtime automatically selects the appropriate transport based on configuration. This means: - -1. **Prototype locally** - Develop with Go channels for speed -2. **Deploy to single instance** - Run on Cloud Run/Lambda with same code -3. **Scale to distributed** - Add configuration, no code changes needed - -## Message Flow Example - -Here's how messages flow through a typical Aixgo system: - -```yaml -# Data flows: producer → analyzer → logger - -supervisor: - name: coordinator - max_rounds: 5 - -agents: - - name: data-source - role: producer - interval: 1s - outputs: - - target: processor - - - name: processor - role: react - model: gpt-4-turbo - prompt: 'Process the incoming data' - inputs: - - source: data-source - outputs: - - target: storage - - - name: storage - role: logger - inputs: - - source: processor -``` - -**Execution flow:** - -1. Supervisor starts agents in order: data-source → processor → storage -2. `data-source` generates a message every 1 second -3. Message is routed to `processor` via supervisor -4. `processor` analyzes with LLM and outputs result -5. Message is routed to `storage` -6. `storage` persists the result -7. Process repeats for max_rounds (5 iterations) -8. Supervisor initiates graceful shutdown - -## Key Principles - -### 1. Declarative Configuration - -Agents and workflows are defined in YAML, making them: - -- Version-controlled -- Easy to review -- Platform-independent -- Testable - -### 2. Type Safety - -Go's type system ensures: - -- Configuration errors caught at compile time -- Tool interfaces are type-checked -- Refactoring is safe across large systems - -### 3. 
Observable by Default - -Every agent interaction is: - -- Traceable via OpenTelemetry -- Logged with structured context -- Measurable with metrics - -### 4. Production-Ready from Day One - -The same code runs in: - -- Local development -- Single-instance deployments -- Distributed production systems - -## Next Steps - -Now that you understand the core concepts, you can: - -- **[Multi-Agent Orchestration](/guides/multi-agent-orchestration)** - Build complex multi-agent workflows -- **[Single Binary vs Distributed Mode](/guides/single-vs-distributed)** - Understand scaling patterns -- **[Type Safety & LLM Integration](/guides/type-safety)** - Leverage compile-time guarantees diff --git a/web/content/guides/cost-optimization.md b/web/content/guides/cost-optimization.md deleted file mode 100644 index d01b0e1..0000000 --- a/web/content/guides/cost-optimization.md +++ /dev/null @@ -1,683 +0,0 @@ ---- -title: 'Cost Optimization Strategies' -description: 'Optimize LLM costs using caching, monitoring, intelligent routing, and deterministic aggregation' -breadcrumb: 'Cost Optimization' -category: 'Production' -weight: 25 ---- - -Aixgo provides built-in cost tracking and optimization features. This guide shows how to combine them with standard infrastructure tools (Redis, OpenTelemetry) for 25-80% cost reduction in production systems. - -**Working Example**: See [cost-optimization](https://github.com/aixgo-dev/aixgo/tree/main/examples/cost-optimization) for a complete implementation demonstrating all three optimization strategies. - -## Overview - -LLM API costs can quickly escalate in production. 
Aixgo helps you optimize costs through: - -- **Built-in cost tracking**: Automatic token usage and cost monitoring per agent -- **Application-level caching**: Cache responses using Redis or similar -- **Intelligent routing**: Use Router pattern to send queries to cost-appropriate models -- **Deterministic aggregation**: $0 voting strategies vs expensive LLM aggregation -- **Budget monitoring**: Alert on cost thresholds before overspend -- **Batch processing**: Process multiple items in single API calls - -## Built-In Cost Tracking - -Every LLM call is automatically tracked with detailed metrics. - -### What's Tracked - -- **Token usage**: Input and output tokens per call -- **Cost calculation**: Based on provider pricing tables -- **Per-agent metrics**: Know which agents cost most -- **OpenTelemetry integration**: Export to Prometheus, Datadog, Langfuse -- **Time-series data**: Track costs over time - -### Querying Cost Metrics - -```go -import ( - "github.com/aixgo-dev/aixgo/internal/observability" -) - -func getCostMetrics() { - // Total cost (all agents) - totalCost := observability.GetMetric("llm.cost.total") - fmt.Printf("Total LLM cost: $%.2f\n", totalCost) - - // Total tokens used - tokensUsed := observability.GetMetric("llm.tokens.total") - fmt.Printf("Total tokens: %d\n", tokensUsed) - - // Cost per agent - agentCost := observability.GetMetricByLabel("llm.cost.total", - map[string]string{"agent": "policy-analyzer"}) - fmt.Printf("Policy analyzer cost: $%.2f\n", agentCost) - - // Daily cost - dailyCost := observability.GetMetric("llm.cost.daily") - fmt.Printf("Today's cost: $%.2f\n", dailyCost) -} -``` - -### Export to Monitoring Systems - -```yaml -# OpenTelemetry configuration -observability: - enabled: true - exporter: prometheus - endpoint: http://prometheus:9090 - - # Or use Langfuse for LLM-specific tracking - # exporter: langfuse - # langfuse_public_key: pk-lf-... - # langfuse_secret_key: sk-lf-... -``` - -## Cost Optimization Strategies - -### 1. 
Application-Level Caching (60-80% cost reduction) - -**Why application-level caching?** - -Caching is an infrastructure concern, not a framework concern. Use Redis, Memcached, or HTTP cache headers at the application layer. - -**Benefits:** - -- 60-80% cost reduction for repeated queries -- Framework stays focused on agent orchestration -- You control TTL, invalidation, cache keys -- Standard tools (Redis) with proven reliability - -#### Redis Caching Wrapper - -```go -package cache - -import ( - "context" - "crypto/sha256" - "encoding/json" - "fmt" - "time" - - "github.com/redis/go-redis/v9" - "github.com/aixgo-dev/aixgo/pkg/agent" -) - -type CachedAgent struct { - cache *redis.Client - agent agent.Agent - ttl time.Duration -} - -func NewCachedAgent(rdb *redis.Client, ag agent.Agent, ttl time.Duration) *CachedAgent { - return &CachedAgent{ - cache: rdb, - agent: ag, - ttl: ttl, - } -} - -func (c *CachedAgent) Execute(ctx context.Context, msg *agent.Message) (*agent.Message, error) { - // Generate cache key from message payload - cacheKey := c.generateCacheKey(msg.Payload) - - // Check cache first - if cached, err := c.cache.Get(ctx, cacheKey).Result(); err == nil { - return &agent.Message{Payload: cached}, nil - } - - // Cache miss - execute agent - result, err := c.agent.Execute(ctx, msg) - if err != nil { - return nil, err - } - - // Cache the result - if err := c.cache.Set(ctx, cacheKey, result.Payload, c.ttl).Err(); err != nil { - // Log cache write failure but don't fail the request - fmt.Printf("Cache write failed: %v\n", err) - } - - return result, nil -} - -func (c *CachedAgent) generateCacheKey(payload string) string { - hash := sha256.Sum256([]byte(payload)) - return fmt.Sprintf("agent:%s:%x", c.agent.Name(), hash) -} -``` - -#### Usage Example - -```go -func main() { - // Setup Redis - rdb := redis.NewClient(&redis.Options{ - Addr: "localhost:6379", - }) - - // Create base agent - baseAgent := agent.NewReact("policy-analyzer", model, "Analyze policy") - 
- // Wrap with caching - cachedAgent := cache.NewCachedAgent(rdb, baseAgent, 1*time.Hour) - - // Use cached agent in workflow - // Identical queries return cached results instantly - result, err := cachedAgent.Execute(ctx, msg) -} -``` - -#### Cache Strategies - -**Aggressive caching (24h+ TTL):** - -- Static content analysis -- Historical data processing -- Infrequently changing queries -- **Cost reduction: 70-80%** - -```go -cachedAgent := cache.NewCachedAgent(rdb, agent, 24*time.Hour) -``` - -**Moderate caching (1-6h TTL):** - -- Semi-static queries -- Daily reports -- Moderate freshness requirements -- **Cost reduction: 40-60%** - -```go -cachedAgent := cache.NewCachedAgent(rdb, agent, 3*time.Hour) -``` - -**Conservative caching (<1h TTL):** - -- Real-time data with some tolerance -- Frequently changing content -- Short-term deduplication -- **Cost reduction: 20-40%** - -```go -cachedAgent := cache.NewCachedAgent(rdb, agent, 30*time.Minute) -``` - -### 2. Intelligent Model Selection (25-50% cost reduction) - -Use the **Router pattern** to route queries to cost-appropriate models. - -#### Cost-Based Routing - -```yaml -supervisor: - name: cost-optimizer - -agents: - # Classify query complexity - - name: complexity-classifier - role: classifier - model: gpt-4o-mini # Use cheap model for classification - inputs: - - source: user-query - outputs: - - target: cheap-handler - condition: 'category == simple' - - target: mid-handler - condition: 'category == moderate' - - target: expensive-handler - condition: 'category == complex' - classifier_config: - categories: - - name: simple - description: "Basic queries answerable with simple reasoning" - examples: - - "What is the capital of France?" 
- - "Define machine learning" - - - name: moderate - description: "Queries requiring moderate analysis" - examples: - - "Compare supervised vs unsupervised learning" - - "Explain benefits of microservices" - - - name: complex - description: "Queries requiring deep reasoning and analysis" - examples: - - "Design a distributed consensus algorithm" - - "Analyze trade-offs in system architecture" - - # Cost-optimized handlers - - name: cheap-handler - role: react - model: gpt-4o-mini # $0.15/1M input tokens - prompt: 'Handle simple query efficiently' - inputs: - - source: complexity-classifier - - - name: mid-handler - role: react - model: gpt-4-turbo # $10/1M input tokens - prompt: 'Handle moderate complexity query' - inputs: - - source: complexity-classifier - - - name: expensive-handler - role: react - model: gpt-4 # $30/1M input tokens - prompt: 'Handle complex query with deep analysis' - inputs: - - source: complexity-classifier -``` - -#### Cost Comparison - -**Without routing (all queries → GPT-4):** - -- 1000 queries/day -- Average 500 tokens/query -- Cost: 1000 × 500 × $30/1M = $15/day -- **Monthly cost: $450** - -**With routing (70% simple, 20% moderate, 10% complex):** - -- Simple (700): 700 × 500 × $0.15/1M = $0.05/day -- Moderate (200): 200 × 500 × $10/1M = $1.00/day -- Complex (100): 100 × 500 × $30/1M = $1.50/day -- **Daily cost: $2.55** -- **Monthly cost: $76.50** - -**Savings: 83% ($373.50/month)** - -### 3. Deterministic Aggregation ($0 vs $$) - -Use deterministic voting strategies instead of LLM aggregation when possible. 
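A majority vote over agent outputs requires no model call at all. As an illustrative sketch (not the framework's actual `voting_majority` implementation), the core logic fits in a few lines of Go:

```go
package main

import "fmt"

// majorityVote returns the most frequent answer among agent outputs and
// whether it won a strict majority (more than half the votes). Purely
// deterministic: no LLM call, no cost, reproducible results.
func majorityVote(votes []string) (string, bool) {
	counts := make(map[string]int)
	best := ""
	for _, v := range votes {
		counts[v]++
		if counts[v] > counts[best] {
			best = v
		}
	}
	return best, counts[best]*2 > len(votes)
}

func main() {
	winner, strict := majorityVote([]string{"approve", "approve", "reject"})
	fmt.Println(winner, strict) // approve true
}
```

Note that ties resolve to whichever answer reached the winning count first, so deterministic input ordering matters if reproducibility is a goal.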
- -#### Cost Comparison: LLM vs Voting - -**LLM-powered consensus:** - -```yaml -aggregator_config: - aggregation_strategy: consensus # Uses LLM -``` - -- Cost: $0.02-0.05 per aggregation (depending on agent count) -- Speed: 2-5 seconds -- Quality: High (semantic synthesis) - -**Deterministic voting:** - -```yaml -aggregator_config: - aggregation_strategy: voting_majority # No LLM -``` - -- Cost: $0.00 per aggregation -- Speed: <1ms (instant) -- Quality: Good (for classification/voting tasks) - -#### When to Use Each - -**Use voting_majority when:** - -- Classifying content (sentiment, category, priority) -- Binary decisions (approve/reject) -- Equal-weight agent opinions -- Reproducibility required - -**Use LLM consensus when:** - -- Narrative synthesis needed -- Semantic understanding required -- Conflict resolution complex -- Quality justifies cost - -#### Cost Savings Example - -1000 aggregations/day: - -- **LLM consensus**: 1000 × $0.03 = $30/day ($900/month) -- **Deterministic voting**: 1000 × $0 = $0/day ($0/month) -- **Savings: $900/month** - -### 4. Batch Processing (90%+ cost reduction) - -Process multiple items in one API call instead of many individual calls. - -#### Anti-Pattern (100 separate calls) - -```go -// DON'T DO THIS - expensive! -for _, item := range items { - result, err := llm.CreateStructured[Result](ctx, client, item, nil) - results = append(results, result) -} -``` - -Cost: 100 calls × overhead = expensive - -#### Optimized (1 batched call) - -```go -// DO THIS - efficient! 
-type BatchResults struct { - Results []Result `json:"results" validate:"required,dive"` -} - -prompt := fmt.Sprintf("Analyze these items: %v", items) -batchResult, err := llm.CreateStructured[BatchResults](ctx, client, prompt, nil) -``` - -Cost: 1 call = 90%+ savings - -#### Batch Size Optimization - -```go -const ( - MinBatchSize = 10 // Below this, batching adds overhead - MaxBatchSize = 100 // Above this, context window limits hit -) - -func processBatch(ctx context.Context, items []Item) ([]Result, error) { - if len(items) < MinBatchSize { - // Process individually for small batches - return processIndividually(ctx, items) - } - - // Split into optimal batch sizes - batches := splitIntoBatches(items, MaxBatchSize) - - var results []Result - for _, batch := range batches { - batchResults, err := processSingleBatch(ctx, batch) - if err != nil { - return nil, err - } - results = append(results, batchResults...) - } - - return results, nil -} -``` - -### 5. Budget Monitoring and Alerts - -Track spending and alert on thresholds before overspend. 
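(The `splitIntoBatches` helper referenced in the batching example above is left undefined there; a minimal version could look like the following sketch, where `Item` is a stand-in for your payload type.)

```go
package main

import "fmt"

// Item is a placeholder for whatever payload type you batch.
type Item struct {
	ID int
}

// splitIntoBatches chunks items into slices of at most size elements,
// preserving order. The final batch may be smaller than size.
func splitIntoBatches(items []Item, size int) [][]Item {
	var batches [][]Item
	for size > 0 && len(items) > 0 {
		n := size
		if len(items) < n {
			n = len(items)
		}
		batches = append(batches, items[:n])
		items = items[n:]
	}
	return batches
}

func main() {
	items := make([]Item, 250)
	batches := splitIntoBatches(items, 100)
	fmt.Println(len(batches)) // 3 (100 + 100 + 50)
}
```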
- -#### Budget Monitor Implementation - -```go -package budget - -import ( - "context" - "fmt" - "log" - "time" - - "github.com/aixgo-dev/aixgo/internal/observability" -) - -type BudgetMonitor struct { - dailyLimit float64 - warningLevel float64 // Percentage of limit for warnings (e.g., 0.8 = 80%) - alertFunc func(cost float64, limit float64, level string) -} - -func NewBudgetMonitor(dailyLimit float64, warningLevel float64, alertFunc func(float64, float64, string)) *BudgetMonitor { - return &BudgetMonitor{ - dailyLimit: dailyLimit, - warningLevel: warningLevel, - alertFunc: alertFunc, - } -} - -func (b *BudgetMonitor) CheckBudget(ctx context.Context) error { - cost := observability.GetMetric("llm.cost.daily") - - // Critical: Budget exceeded - if cost > b.dailyLimit { - b.alertFunc(cost, b.dailyLimit, "critical") - return fmt.Errorf("daily budget exceeded: $%.2f > $%.2f", - cost, b.dailyLimit) - } - - // Warning: Approaching limit - if cost > b.dailyLimit*b.warningLevel { - b.alertFunc(cost, b.dailyLimit, "warning") - log.Printf("Warning: %.0f%% of daily budget used ($%.2f of $%.2f)", - (cost/b.dailyLimit)*100, cost, b.dailyLimit) - } - - return nil -} - -func (b *BudgetMonitor) StartMonitoring(ctx context.Context, interval time.Duration) { - ticker := time.NewTicker(interval) - defer ticker.Stop() - - for { - select { - case <-ticker.C: - if err := b.CheckBudget(ctx); err != nil { - log.Printf("Budget check failed: %v", err) - } - case <-ctx.Done(): - return - } - } -} -``` - -#### Usage - -```go -func main() { - // Setup budget monitoring - monitor := budget.NewBudgetMonitor( - 100.0, // $100 daily limit - 0.8, // Alert at 80% usage - func(cost, limit float64, level string) { - // Send alert (Slack, PagerDuty, etc.) - if level == "critical" { - sendAlert(fmt.Sprintf("CRITICAL: Budget exceeded! 
$%.2f > $%.2f", cost, limit)) - } else { - sendAlert(fmt.Sprintf("WARNING: 80%% budget used ($%.2f)", cost)) - } - }, - ) - - // Start background monitoring - ctx := context.Background() - go monitor.StartMonitoring(ctx, 5*time.Minute) - - // Run your application - // Monitor will alert on threshold violations -} -``` - -### 6. Model Selection by Task - -Different models have different cost/quality tradeoffs. Choose appropriately. - -#### Model Cost Comparison (as of 2024) - -| Model | Input Cost | Output Cost | Use Case | -|-------|-----------|-------------|----------| -| gpt-4o-mini | $0.15/1M | $0.60/1M | Simple queries, classification | -| gpt-4-turbo | $10/1M | $30/1M | Moderate complexity | -| gpt-4 | $30/1M | $60/1M | Complex reasoning | -| claude-3-haiku | $0.25/1M | $1.25/1M | Fast, simple tasks | -| claude-3-sonnet | $3/1M | $15/1M | Balanced quality/cost | -| claude-3-opus | $15/1M | $75/1M | Highest quality | - -#### Task-Optimized Model Selection - -```yaml -agents: - # Simple classification - cheapest model - - name: content-classifier - role: classifier - model: gpt-4o-mini - prompt: 'Classify content category' - - # Data extraction - mid-tier model - - name: data-extractor - role: react - model: gpt-4-turbo - prompt: 'Extract structured data from text' - - # Complex analysis - expensive model - - name: deep-analyzer - role: react - model: gpt-4 - prompt: 'Perform deep analytical reasoning' - - # High-volume logging - no LLM needed - - name: logger - role: logger - # No model cost - just writes to storage -``` - -## Cost Optimization Checklist - -Use this checklist for production systems: - -- [ ] **Enable cost tracking**: OpenTelemetry integration configured -- [ ] **Implement caching**: Redis wrapper for frequently accessed data -- [ ] **Use Router pattern**: Route to cheap models when appropriate -- [ ] **Choose voting over LLM**: Use deterministic aggregation when possible -- [ ] **Batch when possible**: Combine multiple items in single calls -- 
[ ] **Monitor budgets**: Alert at 80% threshold -- [ ] **Review expensive agents**: Weekly analysis of high-cost agents -- [ ] **Optimize prompts**: Shorter prompts reduce token costs -- [ ] **Set max_tokens limits**: Prevent unexpectedly long responses -- [ ] **Use appropriate models**: Match model tier to task complexity - -## Real-World Cost Optimization Example - -### Before Optimization - -**System:** Customer support ticket analysis - -**Configuration:** - -- 10,000 tickets/day -- All tickets processed by GPT-4 -- No caching -- LLM aggregation for all summaries -- Average 800 tokens per ticket - -**Daily Cost:** - -- Tickets: 10,000 × 800 tokens × $30/1M = $240/day -- Aggregation: 1,000 summaries × $0.03 = $30/day -- **Total: $270/day ($8,100/month)** - -### After Optimization - -**Optimizations applied:** - -1. **60% cache hit rate** (Redis, 6h TTL) -2. **Router pattern** (70% simple → gpt-4o-mini, 30% complex → gpt-4) -3. **Deterministic voting** for priority classification -4. **Batch processing** for summaries (10 tickets/batch) - -**Daily Cost:** - -- Cached (6,000): $0 -- Simple (2,800): 2,800 × 800 × $0.15/1M = $0.34/day -- Complex (1,200): 1,200 × 800 × $30/1M = $28.80/day -- Priority voting: $0 (deterministic) -- Batch summaries: 100 batches × $0.30 = $30/day -- **Total: $59.14/day ($1,774/month)** - -**Savings: 78% ($6,326/month)** - -### Optimization ROI - -| Optimization | Cost Reduction | Effort | ROI | -|-------------|----------------|--------|-----| -| Redis caching | $162/day | 2 days | High | -| Router pattern | $40/day | 1 day | Very High | -| Voting aggregation | $30/day | 4 hours | Very High | -| Batch processing | $10/day | 4 hours | High | - -## Monitoring Cost Trends - -### Export to Prometheus - -```yaml -# prometheus.yml -scrape_configs: - - job_name: 'aixgo' - scrape_interval: 30s - static_configs: - - targets: ['localhost:9090'] -``` - -### Grafana Dashboard - -Create dashboard with: - -- Daily cost trend -- Cost per agent -- Token 
usage by model -- Cache hit rate -- Cost per query (moving average) - -### Alerts - -```yaml -# alerts.yml -groups: - - name: cost_alerts - rules: - - alert: HighDailyCost - expr: llm_cost_daily > 100 - for: 5m - labels: - severity: warning - annotations: - summary: "Daily LLM cost exceeds $100" - - - alert: CriticalDailyCost - expr: llm_cost_daily > 200 - for: 1m - labels: - severity: critical - annotations: - summary: "CRITICAL: Daily cost exceeds $200" - - - alert: LowCacheHitRate - expr: cache_hit_rate < 0.4 - for: 10m - labels: - severity: warning - annotations: - summary: "Cache hit rate below 40%" -``` - -## Best Practices - -1. **Start with monitoring**: Can't optimize what you don't measure -2. **Cache aggressively**: Most queries are repeated -3. **Route intelligently**: 70%+ of queries are simple -4. **Batch when possible**: Dramatic cost savings for multi-item processing -5. **Use deterministic aggregation**: Free vs expensive for voting tasks -6. **Set budget alerts**: Prevent overspend surprises -7. **Review monthly**: Identify high-cost agents and optimize -8. **Test caching behavior**: Ensure cache invalidation works correctly -9. **Monitor cache hit rates**: >50% hit rate indicates good caching strategy -10. 
**Use appropriate models**: Don't use GPT-4 for classification - -## See Also - -- [Router Pattern](./multi-agent-orchestration/#router-pattern) - Intelligent routing -- [Aggregator Strategies](./agent-types/#aggregator-agent) - Deterministic vs LLM aggregation -- [Observability Guide](./observability/) - Metrics and monitoring -- [Production Deployment](./production-deployment/) - Production best practices diff --git a/web/content/guides/docker-from-scratch.md b/web/content/guides/docker-from-scratch.md deleted file mode 100644 index 63ce291..0000000 --- a/web/content/guides/docker-from-scratch.md +++ /dev/null @@ -1,583 +0,0 @@ ---- -title: 'Building Docker Images from Scratch' -description: "Create minimal <20MB production containers leveraging Aixgo's single binary advantage." -breadcrumb: 'Deployment' -category: 'Deployment' -weight: 7 ---- - -One of Aixgo's most compelling advantages is deployment size. While Python AI frameworks produce 1GB+ containers, Aixgo agents compile to <20MB binaries. This guide shows how to -build minimal production containers. - -## The Container Size Problem - -Python-based AI frameworks create massive containers: - -```dockerfile -# Python AI service - typical Dockerfile -FROM python:3.11 -COPY requirements.txt . 
-RUN pip install -r requirements.txt - -# Result: 1.2GB+ container -# - Python runtime: 900MB -# - Dependencies: 300MB+ -# - Your code: <1MB -``` - -**Problems:** - -- Slow deployments (minutes to pull 1GB) -- High storage costs -- Large attack surface -- Slow cold starts (30-45 seconds) - -## Aixgo's Solution: FROM scratch - -Go compiles to static binaries with zero runtime dependencies: - -```dockerfile -# Aixgo service - minimal Dockerfile -FROM scratch -COPY aixgo-agent / -CMD ["/aixgo-agent"] - -# Result: <20MB total container -# - Go binary: <20MB -# - No runtime needed -# - No dependencies -``` - -**Benefits:** - -- Fast deployments (seconds to pull <20MB) -- Minimal storage costs -- Tiny attack surface -- Instant cold starts (<100ms) - -## Basic Dockerfile: Single Binary - -The simplest production Dockerfile: - -```dockerfile -FROM golang:1.21 AS builder - -WORKDIR /app - -# Copy go mod files -COPY go.mod go.sum ./ -RUN go mod download - -# Copy source code -COPY . . - -# Build static binary -RUN CGO_ENABLED=0 GOOS=linux go build -o agent main.go - -# Runtime stage -FROM scratch - -# Copy binary from builder -COPY --from=builder /app/agent /agent - -# Copy config (optional) -COPY --from=builder /app/config/ /config/ - -CMD ["/agent"] -``` - -**Build and run:** - -```bash -docker build -t aixgo-agent:latest . -docker run aixgo-agent:latest -``` - -**Result:** ~18MB container - -## Optimized Dockerfile: Smallest Possible Image - -Further optimization with build flags: - -```dockerfile -FROM golang:1.21-alpine AS builder - -WORKDIR /app - -# Copy dependencies -COPY go.mod go.sum ./ -RUN go mod download - -# Copy source -COPY . . 
- -# Build with optimizations -RUN CGO_ENABLED=0 GOOS=linux go build \ - -ldflags="-w -s" \ - -trimpath \ - -o agent \ - main.go - -# Runtime: scratch (0MB base) -FROM scratch - -# Copy CA certificates (for HTTPS) -COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ - -# Copy binary -COPY --from=builder /app/agent /agent - -# Copy config -COPY --from=builder /app/config/ /config/ - -CMD ["/agent"] -``` - -**Build flags explained:** - -- `CGO_ENABLED=0` - Disable C dependencies (pure Go) -- `-ldflags="-w -s"` - Strip debug info and symbol table -- `-trimpath` - Remove file system paths from binary -- Result: ~15-20MB container - -## Multi-Stage Build Best Practices - -### Stage 1: Builder (golang:alpine) - -Use Alpine for smaller build stage: - -```dockerfile -FROM golang:1.21-alpine AS builder - -# Install build dependencies (if needed) -RUN apk add --no-cache git - -WORKDIR /app - -# Layer caching: dependencies first -COPY go.mod go.sum ./ -RUN go mod download - -# Then copy source (changes more frequently) -COPY . . 
- -# Build -RUN CGO_ENABLED=0 go build -ldflags="-w -s" -o agent main.go -``` - -### Stage 2: Runtime (scratch) - -Minimal runtime with only essentials: - -```dockerfile -FROM scratch - -# Copy CA certificates for HTTPS API calls -COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ - -# Copy timezone data (if needed) -COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo - -# Copy binary -COPY --from=builder /app/agent /agent - -# Copy config -COPY config/ /config/ - -# Non-root user (security best practice) -USER 65534:65534 - -CMD ["/agent"] -``` - -## Handling External Dependencies - -### CA Certificates (for HTTPS) - -If your agent calls external APIs: - -```dockerfile -FROM scratch - -# Required for HTTPS calls to LLM providers -COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ - -COPY --from=builder /app/agent /agent -CMD ["/agent"] -``` - -### Timezone Data - -If your agent uses timezone-aware date/time: - -```dockerfile -FROM scratch - -# For time.LoadLocation() -COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo -ENV TZ=UTC - -COPY --from=builder /app/agent /agent -CMD ["/agent"] -``` - -### Configuration Files - -Mount configs as volumes or copy at build time: - -```dockerfile -# Option 1: Copy at build time -COPY config/agents.yaml /config/agents.yaml - -# Option 2: Mount as volume (more flexible) -# docker run -v ./config:/config aixgo-agent -``` - -## Size Comparison: Python vs Aixgo - -Real-world example: data analysis agent - -### Python (LangChain) - -```dockerfile -FROM python:3.11 - -COPY requirements.txt . -RUN pip install -r requirements.txt - -COPY . . - -CMD ["python", "main.py"] -``` - -**requirements.txt:** - -```text -langchain==0.1.0 -openai==1.0.0 -pandas==2.1.0 -numpy==1.24.0 -# ... 50+ more dependencies -``` - -**Result:** - -- Base image: 900MB -- Dependencies: 400MB -- Total: **1.3GB** - -### Aixgo - -```dockerfile -FROM golang:1.21-alpine AS builder -WORKDIR /app -COPY . . 
-RUN CGO_ENABLED=0 go build -ldflags="-w -s" -o agent main.go - -FROM scratch -COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ -COPY --from=builder /app/agent /agent -COPY --from=builder /app/config/ /config/ -CMD ["/agent"] -``` - -**Result:** - -- Base image: 0MB (scratch) -- Binary: <20MB -- Total: **<20MB** - -**Improvement: 65x+ smaller** - -## Security Hardening - -### Run as Non-Root User - -```dockerfile -FROM scratch - -# Copy binary -COPY --from=builder /app/agent /agent - -# Copy passwd file for non-root user -COPY --from=builder /etc/passwd /etc/passwd - -# Run as nobody (UID 65534) -USER 65534:65534 - -CMD ["/agent"] -``` - -### Read-Only Filesystem - -```dockerfile -# Dockerfile -FROM scratch -COPY --from=builder /app/agent /agent -USER 65534:65534 -CMD ["/agent"] -``` - -```bash -# Run with read-only root filesystem -docker run --read-only aixgo-agent:latest -``` - -### Minimal Attack Surface - -`FROM scratch` has: - -- No shell -- No package manager -- No utilities -- No OS files - -Only your binary exists. Nothing else to exploit. - -## Cloud-Specific Optimizations - -### Google Cloud Run - -```dockerfile -FROM golang:1.21-alpine AS builder -WORKDIR /app -COPY . . -RUN CGO_ENABLED=0 go build -ldflags="-w -s" -o agent main.go - -FROM scratch -COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ -COPY --from=builder /app/agent /agent -COPY config/ /config/ - -# Cloud Run sets PORT env var -CMD ["/agent"] -``` - -**Deploy:** - -```bash -gcloud run deploy aixgo-agent \ - --image gcr.io/my-project/aixgo-agent \ - --platform managed \ - --memory 512Mi \ - --max-instances 10 -``` - -### AWS Lambda - -For Lambda, use AWS Lambda Go runtime: - -```dockerfile -FROM golang:1.21 AS builder -WORKDIR /app -COPY . . 
-RUN GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -ldflags="-w -s" -o bootstrap main.go - -FROM scratch -COPY --from=builder /app/bootstrap /bootstrap -COPY config/ /config/ -ENTRYPOINT ["/bootstrap"] -``` - -**Package and deploy:** - -```bash -zip function.zip bootstrap config/* -aws lambda create-function \ - --function-name aixgo-agent \ - --runtime provided.al2 \ - --handler bootstrap \ - --zip-file fileb://function.zip -``` - -## Build Optimization Techniques - -### Layer Caching - -Order Dockerfile commands from least to most frequently changed: - -```dockerfile -FROM golang:1.21-alpine AS builder -WORKDIR /app - -# 1. Dependencies (cached unless go.mod changes) -COPY go.mod go.sum ./ -RUN go mod download - -# 2. Source code (changes frequently) -COPY . . -RUN CGO_ENABLED=0 go build -ldflags="-w -s" -o agent main.go -``` - -### Multi-Platform Builds - -Build for multiple architectures: - -```bash -docker buildx create --use -docker buildx build \ - --platform linux/amd64,linux/arm64 \ - -t my-registry/aixgo-agent:latest \ - --push \ - . -``` - -### Build-Time Variables - -Inject version info at build time: - -```dockerfile -ARG VERSION=dev -ARG GIT_COMMIT=unknown - -RUN CGO_ENABLED=0 go build \ - -ldflags="-w -s -X main.Version=${VERSION} -X main.GitCommit=${GIT_COMMIT}" \ - -o agent \ - main.go -``` - -```bash -docker build \ - --build-arg VERSION=1.0.0 \ - --build-arg GIT_COMMIT=$(git rev-parse HEAD) \ - -t aixgo-agent:1.0.0 \ - . -``` - -## Debugging Minimal Containers - -### Problem: No Shell in scratch - -You can't `docker exec` into a `scratch` container (no shell). - -**Solution 1: Debug build with shell** - -```dockerfile -# Production -FROM scratch AS production -COPY --from=builder /app/agent /agent -CMD ["/agent"] - -# Debug (with shell) -FROM alpine:latest AS debug -COPY --from=builder /app/agent /agent -CMD ["/bin/sh"] -``` - -Build debug version: - -```bash -docker build --target debug -t aixgo-agent:debug . 
-docker run -it aixgo-agent:debug /bin/sh -``` - -**Solution 2: Logging** - -Use structured logging to stdout: - -```go -import "log/slog" - -func main() { - logger := slog.New(slog.NewJSONHandler(os.Stdout, nil)) - slog.SetDefault(logger) - - slog.Info("Starting agent", "version", Version) - // ... -} -``` - -View logs: - -```bash -docker logs -f -``` - -## Performance Impact - -Container size affects: - -| Metric | 1.2GB (Python) | <20MB (Aixgo) | Impact | -| -------------------- | ------------------ | ----------------- | --------------- | -| **Pull time** | 2-5 minutes | 5-10 seconds | 24-60x faster | -| **Cold start** | 30-45 seconds | <100ms | 300-450x faster | -| **Storage cost** | $0.10/GB/month | $0.001/GB/month | 100x cheaper | -| **Deploy frequency** | Slow (discouraged) | Fast (encouraged) | Higher velocity | - -## Real-World Example: Minimal Production Dockerfile - -Complete production-ready Dockerfile: - -```dockerfile -# Build stage -FROM golang:1.21-alpine AS builder - -# Install certificates -RUN apk add --no-cache ca-certificates git - -WORKDIR /app - -# Dependencies -COPY go.mod go.sum ./ -RUN go mod download - -# Source -COPY . . - -# Build with all optimizations -RUN CGO_ENABLED=0 GOOS=linux go build \ - -ldflags="-w -s -X main.Version=${VERSION:-dev}" \ - -trimpath \ - -o agent \ - main.go - -# Runtime stage -FROM scratch - -# Metadata -LABEL maintainer="your-team@company.com" -LABEL version="1.0.0" - -# Copy certificates for HTTPS -COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ - -# Copy timezone data -COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo - -# Copy binary -COPY --from=builder /app/agent /agent - -# Copy config -COPY config/ /config/ - -# Non-root user -USER 65534:65534 - -# Health check (if HTTP server) -HEALTHCHECK --interval=30s --timeout=3s \ - CMD ["/agent", "--health-check"] - -CMD ["/agent"] -``` - -**Build:** - -```bash -docker build -t my-registry/aixgo-agent:1.0.0 . 
-docker push my-registry/aixgo-agent:1.0.0 -``` - -## Key Takeaways - -1. **FROM scratch** - Smallest possible base (0MB) -2. **Multi-stage builds** - Keep builder separate from runtime -3. **Static binaries** - `CGO_ENABLED=0` for zero dependencies -4. **Strip symbols** - `-ldflags="-w -s"` reduces size -5. **Layer caching** - Dependencies before source code -6. **60x smaller** - <20MB vs 1.2GB Python containers - -## Next Steps - -- **[Production Deployment](/guides/production-deployment)** - Deploy minimal containers to production -- **[Single Binary vs Distributed](/guides/single-vs-distributed)** - Understand scaling patterns -- **[Observability & Monitoring](/guides/observability)** - Monitor containerized agents diff --git a/web/content/guides/embeddings.md b/web/content/guides/embeddings.md deleted file mode 100644 index 54e14f6..0000000 --- a/web/content/guides/embeddings.md +++ /dev/null @@ -1,1871 +0,0 @@ ---- -title: 'Embeddings in Aixgo' -description: 'Complete guide to choosing and using embedding providers for RAG, semantic search, and similarity matching' -category: 'RAG & Embeddings' -weight: 5 ---- - -# Embeddings in Aixgo - -This guide provides a comprehensive reference for working with embeddings in Aixgo, covering provider selection, configuration, optimization, and production deployment strategies. - -## Overview - -### What Are Embeddings? - -Embeddings are dense numerical representations that capture the semantic meaning of text. Similar texts produce similar vectors, enabling powerful AI applications: - -- **Semantic Search**: Find content by meaning, not keywords -- **RAG Systems**: Power retrieval-augmented generation -- **Similarity Matching**: Compare and cluster documents -- **Duplicate Detection**: Identify similar or redundant content -- **Recommendation Systems**: Suggest related items - -**Example:** -```text -"machine learning" → [0.42, 0.78, 0.11, ...] -"deep learning" → [0.41, 0.79, 0.12, ...] 
(similar vector) -"cooking recipe" → [0.89, 0.12, 0.05, ...] (different vector) -``` - -### Why Embeddings Matter - -Embeddings bridge the gap between human language and machine computation, enabling: - -1. **Context-Aware Search**: Find "automobile" when searching for "car" -2. **Cross-Language Understanding**: Match concepts across languages -3. **Efficient Storage**: Compress semantic information into fixed-size vectors -4. **Fast Similarity Computation**: Use cosine similarity for quick comparisons -5. **Knowledge Retrieval**: Power RAG systems with relevant context - -### Quick Comparison - -| Provider | Cost | Quality | Speed | Best For | -|----------|------|---------|-------|----------| -| **HuggingFace API** | Free (rate limited) | Good-Excellent | Medium | Development, prototyping | -| **HuggingFace TEI** | Free (self-host) | Good-Excellent | Very Fast | High-volume production | -| **OpenAI** | $0.02-0.13/1M tokens | Excellent | Fast | Production quality | - -## Provider Comparison - -### HuggingFace Inference API - -The HuggingFace Inference API provides free access to thousands of embedding models, perfect for development and prototyping. 
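(As background for comparing providers: the "fast similarity computation" mentioned earlier is ordinary cosine similarity over the returned vectors, regardless of which provider produced them. A self-contained sketch, reusing the toy 3-dimensional vectors from the overview — real embeddings have hundreds or thousands of dimensions:)

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between two
// equal-length vectors: close to 1.0 for semantically similar texts,
// closer to 0 for unrelated ones.
func cosineSimilarity(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	ml := []float64{0.42, 0.78, 0.11}   // "machine learning"
	dl := []float64{0.41, 0.79, 0.12}   // "deep learning"
	cook := []float64{0.89, 0.12, 0.05} // "cooking recipe"
	fmt.Printf("ml vs dl: %.3f, ml vs cooking: %.3f\n",
		cosineSimilarity(ml, dl), cosineSimilarity(ml, cook))
}
```

The related pair scores near 1.0 while the unrelated pair scores much lower, which is exactly the ordering a semantic search ranks by.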
- -**Key Features:** -- Free tier with rate limits (varies by model) -- Access to state-of-the-art models -- No infrastructure required -- Automatic model loading and caching - -**Popular Models:** -```go -// Fast and efficient (384 dimensions) -Model: "sentence-transformers/all-MiniLM-L6-v2" -// Excellent quality/speed balance - -// High quality (1024 dimensions) -Model: "BAAI/bge-large-en-v1.5" -// Top performance on benchmarks - -// Multilingual support (1024 dimensions) -Model: "intfloat/multilingual-e5-large" -// Covers roughly 100 languages -``` - -**Configuration Example:** -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/aixgo-dev/aixgo/pkg/embeddings" -) - -func main() { - ctx := context.Background() - - // Configure HuggingFace API - config := embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "sentence-transformers/all-MiniLM-L6-v2", - APIKey: os.Getenv("HUGGINGFACE_API_KEY"), // Optional - WaitForModel: true, // Wait if model needs loading - UseCache: true, // Use HF's server-side cache - }, - } - - // Create embedding service - embSvc, err := embeddings.New(config) - if err != nil { - log.Fatal(err) - } - defer embSvc.Close() - - // Generate embedding - text := "Aixgo is a production-grade AI framework for Go" - embedding, err := embSvc.Embed(ctx, text) - if err != nil { - log.Fatal(err) - } - - fmt.Printf("Dimensions: %d\n", len(embedding)) - fmt.Printf("First 5 values: %v\n", embedding[:5]) -} -``` - -**Rate Limits:** -- Without API key: ~100 requests/hour -- With free API key: ~1000 requests/hour -- Pro accounts: Higher limits available - -**Best For:** -- Development and testing -- Prototyping new features -- Small-scale applications -- Evaluating different models - -### HuggingFace TEI (Self-Hosted) - -Text Embeddings Inference (TEI) is a high-performance embedding server optimized for production workloads. 
- -**Key Features:** -- 10x faster than API calls -- No rate limits -- GPU acceleration support -- Automatic batching and caching -- Production-grade performance - -**Docker Deployment:** -```bash -# CPU deployment -docker run -d \ - --name tei \ - -p 8080:8080 \ - -v $PWD/data:/data \ - ghcr.io/huggingface/text-embeddings-inference:latest \ - --model-id BAAI/bge-large-en-v1.5 \ - --max-batch-size 128 \ - --max-client-batch-size 32 - -# GPU deployment (NVIDIA) -docker run -d \ - --name tei-gpu \ - -p 8080:8080 \ - -v $PWD/data:/data \ - --gpus all \ - ghcr.io/huggingface/text-embeddings-inference:latest \ - --model-id BAAI/bge-large-en-v1.5 \ - --max-batch-size 256 \ - --max-client-batch-size 64 -``` - -**Docker Compose Example:** -```yaml -version: '3.8' -services: - tei: - image: ghcr.io/huggingface/text-embeddings-inference:latest - ports: - - "8080:8080" - volumes: - - ./models:/data - environment: - - MODEL_ID=BAAI/bge-large-en-v1.5 - command: - - --model-id=BAAI/bge-large-en-v1.5 - - --max-batch-size=128 - - --max-client-batch-size=32 - - --max-batch-tokens=16384 - deploy: - resources: - reservations: - devices: - - driver: nvidia - count: 1 - capabilities: [gpu] -``` - -**Configuration Example:** -```go -config := embeddings.Config{ - Provider: "huggingface_tei", - HuggingFaceTEI: &embeddings.HuggingFaceTEIConfig{ - Endpoint: "http://localhost:8080", - Model: "BAAI/bge-large-en-v1.5", - Normalize: true, // L2 normalize vectors - Truncate: true, // Auto-truncate long texts - }, -} - -embSvc, err := embeddings.New(config) -if err != nil { - log.Fatal(err) -} - -// Batch processing for efficiency -texts := []string{ - "First document", - "Second document", - "Third document", -} - -vectors, err := embSvc.EmbedBatch(ctx, texts) -if err != nil { - log.Fatal(err) -} -log.Printf("generated %d embeddings", len(vectors)) -``` - -**Performance Benefits:** -- Latency: 5-10ms per request (vs 50-100ms for API) -- Throughput: 1000+ embeddings/second with GPU -- Batch processing: Automatic optimization -- No network 
overhead to external APIs - -**Best For:** -- High-volume production workloads -- Low-latency requirements -- Cost-sensitive applications -- Data privacy requirements (on-premise) - -### OpenAI - -OpenAI provides state-of-the-art embedding models with excellent quality and flexible pricing. - -**Model Comparison:** - -| Model | Dimensions | Price/1M tokens | Quality | Use Case | -|-------|------------|-----------------|---------|-----------| -| text-embedding-3-small | 1536 | $0.02 | Very Good | General purpose | -| text-embedding-3-large | 3072 | $0.13 | Excellent | High accuracy | -| text-embedding-ada-002 | 1536 | $0.10 | Good | Legacy support | - -**Custom Dimensions:** -OpenAI's new models support dimension reduction: -```go -config := embeddings.Config{ - Provider: "openai", - OpenAI: &embeddings.OpenAIConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: "text-embedding-3-small", - Dimensions: 512, // Reduce from 1536 to 512 - }, -} -``` - -**Configuration Example:** -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/aixgo-dev/aixgo/pkg/embeddings" -) - -func main() { - ctx := context.Background() - - // Configure OpenAI embeddings - config := embeddings.Config{ - Provider: "openai", - OpenAI: &embeddings.OpenAIConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: "text-embedding-3-small", - // Optional: custom dimensions - // Dimensions: 768, - }, - } - - embSvc, err := embeddings.New(config) - if err != nil { - log.Fatal(err) - } - defer embSvc.Close() - - // Single embedding - embedding, err := embSvc.Embed(ctx, "Sample text") - if err != nil { - log.Fatal(err) - } - - // Batch embeddings (more efficient) - texts := []string{ - "First document about AI", - "Second document about machine learning", - "Third document about deep learning", - } - - embeddings, err := embSvc.EmbedBatch(ctx, texts) - if err != nil { - log.Fatal(err) - } - - fmt.Printf("Generated %d embeddings\n", len(embeddings)) -} -``` - -**Cost 
Optimization:** -```go -// Calculate embedding costs -func calculateCost(texts []string, model string) float64 { - totalTokens := 0 - for _, text := range texts { - // Rough estimate: 1 token ≈ 4 characters - totalTokens += len(text) / 4 - } - - costPerMillion := map[string]float64{ - "text-embedding-3-small": 0.02, - "text-embedding-3-large": 0.13, - "text-embedding-ada-002": 0.10, - } - - return float64(totalTokens) * costPerMillion[model] / 1_000_000 -} - -// Example usage -texts := make([]string, 10000) -// ... populate texts -cost := calculateCost(texts, "text-embedding-3-small") -fmt.Printf("Estimated cost: $%.4f\n", cost) -``` - -**Best For:** -- Production applications requiring high quality -- Applications with moderate volume -- When consistency and reliability are critical -- Multi-language support requirements - -## Choosing the Right Provider - -### Decision Matrix - -Use this matrix to select the optimal provider based on your requirements: - -| Requirement | HuggingFace API | HuggingFace TEI | OpenAI | -|-------------|-----------------|-----------------|---------| -| **Budget: Free** | ✅ Best | ✅ Best (self-host) | ❌ | -| **Budget: <$100/month** | ✅ | ✅ | ✅ Good | -| **Budget: >$100/month** | ✅ | ✅ | ✅ Best | -| **Quality: Good** | ✅ | ✅ | ✅ | -| **Quality: Excellent** | ✅ (bge-large) | ✅ (bge-large) | ✅ Best | -| **Latency: <10ms** | ❌ | ✅ Best | ❌ | -| **Latency: <50ms** | ❌ | ✅ | ✅ | -| **Latency: <100ms** | ✅ | ✅ | ✅ | -| **Volume: <1K/day** | ✅ Best | ✅ | ✅ | -| **Volume: 1K-100K/day** | ❌ | ✅ Best | ✅ | -| **Volume: >100K/day** | ❌ | ✅ Best | ✅ | -| **Infrastructure: None** | ✅ Best | ❌ | ✅ Best | -| **Infrastructure: Docker** | ❌ | ✅ | ❌ | -| **Data Privacy** | ❌ | ✅ Best | ❌ | - -### Practical Recommendations - -**For Development:** -```go -// Use HuggingFace API - free and easy -config := embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "sentence-transformers/all-MiniLM-L6-v2", - }, -} -``` 
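For teams that want the matrix as executable policy, its thresholds can be distilled into a tiny helper. This is a sketch of our own (the function name and the cutoffs are lifted from the table above, and the returned strings match the `Provider` values used in the surrounding configs); it is not part of the aixgo API:

```go
package main

import "fmt"

// recommendProvider distills the decision matrix into code: sub-10ms latency
// or heavy daily volume points to self-hosted TEI, light traffic fits the
// free HuggingFace API, and OpenAI covers the managed middle ground.
// Illustrative only; tune the cutoffs to your own workload.
func recommendProvider(dailyVolume, maxLatencyMs int, canSelfHost bool) string {
	switch {
	case maxLatencyMs < 10 || dailyVolume > 1000:
		if canSelfHost {
			return "huggingface_tei"
		}
		return "openai" // managed fallback when Docker is not an option
	case dailyVolume <= 1000:
		return "huggingface" // free serverless API is enough for light traffic
	default:
		return "openai"
	}
}

func main() {
	fmt.Println(recommendProvider(500, 100, false))  // development: huggingface
	fmt.Println(recommendProvider(50000, 20, true))  // production: huggingface_tei
	fmt.Println(recommendProvider(50000, 20, false)) // production, no Docker: openai
}
```

The return value can feed straight into the `Provider` field of an `embeddings.Config`, as the examples below do by hand.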
- -**For Production (Budget-Conscious):** -```go -// Deploy TEI with Docker -config := embeddings.Config{ - Provider: "huggingface_tei", - HuggingFaceTEI: &embeddings.HuggingFaceTEIConfig{ - Endpoint: "http://tei-service:8080", - Model: "BAAI/bge-large-en-v1.5", - }, -} -``` - -**For Production (Quality-First):** -```go -// Use OpenAI for best quality -config := embeddings.Config{ - Provider: "openai", - OpenAI: &embeddings.OpenAIConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: "text-embedding-3-large", - }, -} -``` - -**For Hybrid Approach:** -```go -// Use multiple providers with fallback -type EmbeddingServiceWithFallback struct { - primary embeddings.EmbeddingService - fallback embeddings.EmbeddingService -} - -func (e *EmbeddingServiceWithFallback) Embed(ctx context.Context, text string) ([]float32, error) { - emb, err := e.primary.Embed(ctx, text) - if err != nil { - // Fallback to secondary provider - return e.fallback.Embed(ctx, text) - } - return emb, nil -} -``` - -## Integration with Vectorstore - -Embeddings seamlessly integrate with Aixgo's vectorstore for building RAG systems and semantic search. 
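Matching dimensions matters because similarity is computed directly on the raw vectors. Several snippets later in this guide call a `cosineSimilarity` helper without defining it; a plain-Go version looks like this (our sketch, assuming equal-length inputs, not an aixgo export):

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns dot(a, b) / (|a| * |b|): 1 for parallel vectors,
// 0 for orthogonal ones. Callers must pass vectors of the same length.
func cosineSimilarity(a, b []float32) float64 {
	var dot, normA, normB float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		normA += float64(a[i]) * float64(a[i])
		normB += float64(b[i]) * float64(b[i])
	}
	if normA == 0 || normB == 0 {
		return 0 // degenerate zero vector: define similarity as 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Println(cosineSimilarity([]float32{1, 0}, []float32{1, 0})) // 1
	fmt.Println(cosineSimilarity([]float32{1, 0}, []float32{0, 1})) // 0
}
```

Note that many embedding models (and TEI with `Normalize: true`) emit unit-length vectors, in which case the dot product alone gives the same ranking.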
- -### Matching Dimensions - -Ensure your vectorstore and embeddings use compatible dimensions: - -```go -package main - -import ( - "context" - "fmt" - "log" - - "github.com/aixgo-dev/aixgo/pkg/embeddings" - "github.com/aixgo-dev/aixgo/pkg/vectorstore" - "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory" -) - -func main() { - ctx := context.Background() - - // Setup embeddings - embConfig := embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "sentence-transformers/all-MiniLM-L6-v2", // 384 dims - }, - } - embSvc, err := embeddings.New(embConfig) - if err != nil { - log.Fatal(err) - } - defer embSvc.Close() - - // Get embedding dimensions - dims := embSvc.Dimensions() - fmt.Printf("Embedding dimensions: %d\n", dims) - - // Create vectorstore with matching dimensions - store, err := memory.New( - memory.WithEmbeddingDimensions(dims), - ) - if err != nil { - log.Fatal(err) - } - defer store.Close() - - // Create collection - docs := store.Collection("documents") - - // Index document - text := "Aixgo provides powerful embedding capabilities" - embedding, err := embSvc.Embed(ctx, text) - if err != nil { - log.Fatal(err) - } - - doc := &vectorstore.Document{ - ID: "doc1", - Content: vectorstore.NewTextContent(text), - Embedding: vectorstore.NewEmbedding(embedding, embSvc.Model()), - } - - _, err = docs.Upsert(ctx, doc) - if err != nil { - log.Fatal(err) - } -} -``` - -### Batch Processing for Efficiency - -Process multiple documents efficiently: - -```go -func batchIndexDocuments( - ctx context.Context, - collection vectorstore.Collection, - embSvc embeddings.EmbeddingService, - texts []string, -) error { - const batchSize = 100 - - for i := 0; i < len(texts); i += batchSize { - end := i + batchSize - if end > len(texts) { - end = len(texts) - } - - batch := texts[i:end] - - // Batch embed - embeddings, err := embSvc.EmbedBatch(ctx, batch) - if err != nil { - return fmt.Errorf("batch embed failed: %w", err) - } - - // 
Create documents - docs := make([]*vectorstore.Document, len(batch)) - for j, text := range batch { - docs[j] = &vectorstore.Document{ - ID: fmt.Sprintf("doc-%d", i+j), - Content: vectorstore.NewTextContent(text), - Embedding: vectorstore.NewEmbedding(embeddings[j], embSvc.Model()), - } - } - - // Batch upsert - result, err := collection.UpsertBatch(ctx, docs) - if err != nil { - return fmt.Errorf("batch upsert failed: %w", err) - } - - fmt.Printf("Indexed batch %d-%d: %d succeeded, %d failed\n", - i, end, result.Succeeded, result.Failed) - } - - return nil -} -``` - -### Caching Strategies - -Implement embedding cache to reduce API calls and costs: - -```go -package main - -import ( - "context" - "crypto/md5" - "encoding/hex" - "sync" - "time" - - "github.com/aixgo-dev/aixgo/pkg/embeddings" -) - -type CachedEmbeddingService struct { - service embeddings.EmbeddingService - cache map[string]cacheEntry - mu sync.RWMutex - ttl time.Duration -} - -type cacheEntry struct { - embedding []float32 - timestamp time.Time -} - -func NewCachedEmbeddingService(service embeddings.EmbeddingService, ttl time.Duration) *CachedEmbeddingService { - return &CachedEmbeddingService{ - service: service, - cache: make(map[string]cacheEntry), - ttl: ttl, - } -} - -func (c *CachedEmbeddingService) Embed(ctx context.Context, text string) ([]float32, error) { - // Generate cache key - hash := md5.Sum([]byte(text)) - key := hex.EncodeToString(hash[:]) - - // Check cache - c.mu.RLock() - if entry, ok := c.cache[key]; ok { - if time.Since(entry.timestamp) < c.ttl { - c.mu.RUnlock() - return entry.embedding, nil - } - } - c.mu.RUnlock() - - // Generate embedding - embedding, err := c.service.Embed(ctx, text) - if err != nil { - return nil, err - } - - // Update cache - c.mu.Lock() - c.cache[key] = cacheEntry{ - embedding: embedding, - timestamp: time.Now(), - } - c.mu.Unlock() - - return embedding, nil -} - -func (c *CachedEmbeddingService) EmbedBatch(ctx context.Context, texts []string) 
([][]float32, error) { - results := make([][]float32, len(texts)) - uncached := make([]int, 0) - uncachedTexts := make([]string, 0) - - // Check cache for each text - c.mu.RLock() - for i, text := range texts { - hash := md5.Sum([]byte(text)) - key := hex.EncodeToString(hash[:]) - - if entry, ok := c.cache[key]; ok && time.Since(entry.timestamp) < c.ttl { - results[i] = entry.embedding - } else { - uncached = append(uncached, i) - uncachedTexts = append(uncachedTexts, text) - } - } - c.mu.RUnlock() - - // Generate embeddings for uncached texts - if len(uncachedTexts) > 0 { - embeddings, err := c.service.EmbedBatch(ctx, uncachedTexts) - if err != nil { - return nil, err - } - - // Update results and cache - c.mu.Lock() - for i, idx := range uncached { - results[idx] = embeddings[i] - - hash := md5.Sum([]byte(uncachedTexts[i])) - key := hex.EncodeToString(hash[:]) - c.cache[key] = cacheEntry{ - embedding: embeddings[i], - timestamp: time.Now(), - } - } - c.mu.Unlock() - } - - return results, nil -} - -// Usage -func main() { - // Create base service - config := embeddings.Config{ - Provider: "openai", - OpenAI: &embeddings.OpenAIConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: "text-embedding-3-small", - }, - } - baseSvc, _ := embeddings.New(config) - - // Wrap with cache (1 hour TTL) - cachedSvc := NewCachedEmbeddingService(baseSvc, time.Hour) - - // Use cached service - ctx := context.Background() - embedding, _ := cachedSvc.Embed(ctx, "Sample text") - // First call: generates embedding - - embedding2, _ := cachedSvc.Embed(ctx, "Sample text") - // Second call: returns from cache -} -``` - -### Error Handling - -Implement robust error handling for production: - -```go -func embedWithRetry( - ctx context.Context, - svc embeddings.EmbeddingService, - text string, - maxRetries int, -) ([]float32, error) { - var lastErr error - - for i := 0; i < maxRetries; i++ { - embedding, err := svc.Embed(ctx, text) - if err == nil { - return embedding, nil - } - - lastErr = 
err - - // Check if error is retryable - if !isRetryableError(err) { - return nil, err - } - - // Exponential backoff - backoff := time.Duration(1<<uint(i)) * time.Second - select { - case <-time.After(backoff): - case <-ctx.Done(): - return nil, ctx.Err() - } - } - - return nil, fmt.Errorf("embed failed after %d retries: %w", maxRetries, lastErr) -} -``` - -## Cost Optimization - -**Cache Embeddings (LRU Eviction):** -```go -// Put stores an embedding in an LRU cache (map plus container/list), -// evicting the least recently used entry once capacity is reached. -func (c *lruEmbeddingCache) Put(key string, embedding []float32) { - c.mu.Lock() - defer c.mu.Unlock() - - if len(c.cache) >= c.capacity { - oldest := c.lru.Back() - if oldest != nil { - c.lru.Remove(oldest) - delete(c.cache, oldest.Value.(*cacheItem).key) - } - } - - elem := c.lru.PushFront(&cacheItem{key, embedding}) - c.cache[key] = elem -} -``` - -**Consider Dimension Reduction:** -```go -// OpenAI supports custom dimensions -config := embeddings.Config{ - Provider: "openai", - OpenAI: &embeddings.OpenAIConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: "text-embedding-3-large", - Dimensions: 1024, // Reduce from 3072 to 1024 - // Minimal quality loss, 66% cost reduction - }, -} -``` - -## Model Selection Guide - -### By Use Case - -| Use Case | Recommended Models | Dimensions | Provider | -|----------|-------------------|------------|----------| -| **General Text** | all-MiniLM-L6-v2, text-embedding-3-small | 384, 1536 | HF, OpenAI | -| **High Quality** | bge-large-en-v1.5, text-embedding-3-large | 1024, 3072 | HF, OpenAI | -| **Multilingual** | gte-large, multilingual-e5-large | 1024 | HF | -| **Code Search** | codebert-base, codegen-embeddings | 768 | HF | -| **Long Documents** | jina-embeddings-v2-base-en | 768 | HF | -| **Low Latency** | all-MiniLM-L6-v2 | 384 | HF TEI | -| **Budget** | all-MiniLM-L6-v2 | 384 | HF API | - -### By Dimensions - -**384 Dimensions (Fast, Good Quality):** -```go -// Best for real-time applications -Model: "sentence-transformers/all-MiniLM-L6-v2" -// 22M parameters, 5ms inference -``` - -**768 Dimensions (Balanced):** -```go -// Good balance of speed and quality -Model: "sentence-transformers/all-mpnet-base-v2" -// 110M parameters, 10ms inference -``` - -**1024 Dimensions (High Quality):** -```go -// Excellent quality for most use cases -Model: "BAAI/bge-large-en-v1.5" -// 335M parameters, 15ms inference -``` - -**1536 Dimensions (OpenAI Standard):** -```go -// OpenAI's default dimension -Model: 
"text-embedding-3-small" -// Good quality, managed service -``` - -**3072 Dimensions (Best Quality):** -```go -// Maximum quality available -Model: "text-embedding-3-large" -// Best for critical applications -``` - -### Performance Benchmarks - -```go -// Benchmark different models -func benchmarkModels() { - models := []struct { - name string - dims int - config embeddings.Config - }{ - { - name: "MiniLM", - dims: 384, - config: embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "sentence-transformers/all-MiniLM-L6-v2", - }, - }, - }, - { - name: "BGE-Large", - dims: 1024, - config: embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "BAAI/bge-large-en-v1.5", - }, - }, - }, - } - - text := "Sample text for benchmarking" - - for _, model := range models { - svc, _ := embeddings.New(model.config) - - start := time.Now() - _, err := svc.Embed(context.Background(), text) - elapsed := time.Since(start) - - fmt.Printf("%s (%d dims): %v\n", model.name, model.dims, elapsed) - } -} -``` - -## Common Issues & Troubleshooting - -### Dimension Mismatches - -**Problem:** -```text -Error: dimension mismatch: expected 384, got 1536 -``` - -**Solution:** -```go -// Ensure consistent dimensions across your pipeline -embSvc, _ := embeddings.New(config) -dims := embSvc.Dimensions() - -// Configure vectorstore with matching dimensions -store, _ := memory.New( - memory.WithEmbeddingDimensions(dims), -) - -// For Firestore, create index with correct dimensions -// gcloud firestore indexes composite create \ -// --field-config=field-path=embedding.vector,vector-config='{"dimension":"384","flat":{}}' -``` - -### Rate Limits - -**Problem:** -```text -Error: rate limit exceeded -``` - -**Solutions:** -```go -// 1. 
Add API key for higher limits -config := embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "sentence-transformers/all-MiniLM-L6-v2", - APIKey: os.Getenv("HUGGINGFACE_API_KEY"), - }, -} - -// 2. Implement rate limiting -type RateLimitedEmbedding struct { - service embeddings.EmbeddingService - limiter *rate.Limiter -} - -func (r *RateLimitedEmbedding) Embed(ctx context.Context, text string) ([]float32, error) { - if err := r.limiter.Wait(ctx); err != nil { - return nil, err - } - return r.service.Embed(ctx, text) -} - -// 3. Deploy TEI for unlimited requests -// See TEI deployment section above -``` - -### Model Loading Errors - -**Problem:** -```text -Error: model is currently loading -``` - -**Solution:** -```go -config := embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "BAAI/bge-large-en-v1.5", - WaitForModel: true, // Wait for model to load - UseCache: false, // Force fresh model load - }, -} -``` - -### Embedding Quality Issues - -**Problem:** Poor search results or similarity scores - -**Debugging:** -```go -func debugEmbeddingQuality(svc embeddings.EmbeddingService) { - ctx := context.Background() - - // Test similar texts - similar := []string{ - "machine learning", - "deep learning", - "artificial intelligence", - } - - embeddings := make([][]float32, len(similar)) - for i, text := range similar { - embeddings[i], _ = svc.Embed(ctx, text) - } - - // Check similarities - for i := 0; i < len(similar); i++ { - for j := i + 1; j < len(similar); j++ { - sim := cosineSimilarity(embeddings[i], embeddings[j]) - fmt.Printf("%s <-> %s: %.3f\n", similar[i], similar[j], sim) - } - } - - // Test dissimilar texts - dissimilar := "cooking recipes" - dissimilarEmb, _ := svc.Embed(ctx, dissimilar) - - for i, text := range similar { - sim := cosineSimilarity(embeddings[i], dissimilarEmb) - fmt.Printf("%s <-> %s: %.3f\n", text, dissimilar, sim) - } -} -``` - -**Solutions:** 
-1. Try a larger model (bge-large vs all-MiniLM) -2. Fine-tune on domain-specific data -3. Adjust text preprocessing -4. Use hybrid search (semantic + keyword) - -### Memory Issues with Large Batches - -**Problem:** -```text -Error: out of memory -``` - -**Solution:** -```go -func processLargeDataset( - svc embeddings.EmbeddingService, - texts []string, -) error { - const maxBatchSize = 50 - - for i := 0; i < len(texts); i += maxBatchSize { - end := i + maxBatchSize - if end > len(texts) { - end = len(texts) - } - - batch := texts[i:end] - - // Process batch with timeout - ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second) - embeddings, err := svc.EmbedBatch(ctx, batch) - cancel() - - if err != nil { - // Log error and continue - log.Printf("Batch %d failed: %v", i/maxBatchSize, err) - continue - } - - // Process embeddings - for j, emb := range embeddings { - // Store or process embedding - _ = emb - _ = j - } - - // Optional: garbage collection hint - if i%1000 == 0 { - runtime.GC() - } - } - - return nil -} -``` - -## Production Deployment - -### HuggingFace TEI Setup - -**Full Production Setup:** - -```yaml -# docker-compose.yml -version: '3.8' - -services: - tei: - image: ghcr.io/huggingface/text-embeddings-inference:latest - ports: - - "8080:8080" - volumes: - - ./models:/data - environment: - - HUGGING_FACE_HUB_TOKEN=${HUGGING_FACE_HUB_TOKEN} - command: - - --model-id=BAAI/bge-large-en-v1.5 - - --max-batch-size=256 - - --max-client-batch-size=64 - - --max-batch-tokens=16384 - - --max-concurrent-requests=512 - deploy: - replicas: 3 - resources: - reservations: - devices: - - driver: nvidia - count: 1 - capabilities: [gpu] - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8080/health"] - interval: 30s - timeout: 10s - retries: 3 - start_period: 40s - - nginx: - image: nginx:alpine - ports: - - "80:80" - volumes: - - ./nginx.conf:/etc/nginx/nginx.conf - depends_on: - - tei -``` - -**Nginx Load Balancer Configuration:** 
-```nginx -upstream tei_backend { - least_conn; - server tei_1:8080 max_fails=3 fail_timeout=30s; - server tei_2:8080 max_fails=3 fail_timeout=30s; - server tei_3:8080 max_fails=3 fail_timeout=30s; -} - -server { - listen 80; - - location / { - proxy_pass http://tei_backend; - proxy_set_header Host $host; - proxy_set_header X-Real-IP $remote_addr; - - # Timeouts - proxy_connect_timeout 10s; - proxy_send_timeout 60s; - proxy_read_timeout 60s; - - # Buffering - proxy_buffering off; - - # Health checks - proxy_next_upstream error timeout http_502 http_503; - proxy_next_upstream_tries 3; - } -} -``` - -**Monitoring and Scaling:** -```go -// Health check endpoint -func healthCheckTEI(endpoint string) error { - resp, err := http.Get(endpoint + "/health") - if err != nil { - return err - } - defer resp.Body.Close() - - if resp.StatusCode != http.StatusOK { - return fmt.Errorf("unhealthy: status %d", resp.StatusCode) - } - return nil -} - -// Auto-scaling based on latency -type TEIAutoScaler struct { - targetLatency time.Duration - minInstances int - maxInstances int -} - -func (s *TEIAutoScaler) CheckScaling(avgLatency time.Duration, currentInstances int) int { - if avgLatency > s.targetLatency && currentInstances < s.maxInstances { - return currentInstances + 1 // Scale up - } - if avgLatency < s.targetLatency/2 && currentInstances > s.minInstances { - return currentInstances - 1 // Scale down - } - return currentInstances // No change -} -``` - -### OpenAI Best Practices - -**Rate Limit Handling:** -```go -type OpenAIRateLimiter struct { - service embeddings.EmbeddingService - limiter *rate.Limiter - semaphore chan struct{} -} - -func NewOpenAIRateLimiter(service embeddings.EmbeddingService, rps int, maxConcurrent int) *OpenAIRateLimiter { - return &OpenAIRateLimiter{ - service: service, - limiter: rate.NewLimiter(rate.Limit(rps), rps), - semaphore: make(chan struct{}, maxConcurrent), - } -} - -func (r *OpenAIRateLimiter) Embed(ctx context.Context, text string) 
([]float32, error) { - // Acquire semaphore - r.semaphore <- struct{}{} - defer func() { <-r.semaphore }() - - // Rate limit - if err := r.limiter.Wait(ctx); err != nil { - return nil, err - } - - return r.service.Embed(ctx, text) -} -``` - -**Retry Strategies:** -```go -func embedWithExponentialBackoff( - ctx context.Context, - svc embeddings.EmbeddingService, - text string, -) ([]float32, error) { - maxRetries := 5 - baseDelay := time.Second - - for i := 0; i < maxRetries; i++ { - embedding, err := svc.Embed(ctx, text) - if err == nil { - return embedding, nil - } - - // Check for rate limit error - if strings.Contains(err.Error(), "rate_limit") { - delay := baseDelay * time.Duration(math.Pow(2, float64(i))) - if delay > 30*time.Second { - delay = 30 * time.Second - } - - select { - case <-time.After(delay): - continue - case <-ctx.Done(): - return nil, ctx.Err() - } - } - - // Non-retryable error - return nil, err - } - - return nil, fmt.Errorf("max retries exceeded") -} -``` - -**Cost Monitoring:** -```go -type CostTracker struct { - mu sync.Mutex - tokenCount int64 - model string -} - -func (ct *CostTracker) Track(text string) { - // Estimate tokens (rough: 1 token ≈ 4 characters) - tokens := len(text) / 4 - - ct.mu.Lock() - ct.tokenCount += int64(tokens) - ct.mu.Unlock() -} - -func (ct *CostTracker) GetCost() float64 { - ct.mu.Lock() - defer ct.mu.Unlock() - - costPerMillion := map[string]float64{ - "text-embedding-3-small": 0.02, - "text-embedding-3-large": 0.13, - "text-embedding-ada-002": 0.10, - } - - return float64(ct.tokenCount) * costPerMillion[ct.model] / 1_000_000 -} - -func (ct *CostTracker) Reset() { - ct.mu.Lock() - ct.tokenCount = 0 - ct.mu.Unlock() -} -``` - -**Fallback Providers:** -```go -type FallbackEmbeddingService struct { - providers []embeddings.EmbeddingService -} - -func (f *FallbackEmbeddingService) Embed(ctx context.Context, text string) ([]float32, error) { - var lastErr error - - for i, provider := range f.providers { - embedding, 
err := provider.Embed(ctx, text) - if err == nil { - if i > 0 { - log.Printf("Using fallback provider %d", i) - } - return embedding, nil - } - lastErr = err - } - - return nil, fmt.Errorf("all providers failed: %w", lastErr) -} - -// Usage -fallbackSvc := &FallbackEmbeddingService{ - providers: []embeddings.EmbeddingService{ - openaiSvc, // Primary - teiSvc, // Fallback 1 - huggingfaceSvc, // Fallback 2 - }, -} -``` - -## Complete Examples - -### Building a Semantic Search System - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/aixgo-dev/aixgo/pkg/embeddings" - "github.com/aixgo-dev/aixgo/pkg/vectorstore" - "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory" -) - -type SemanticSearch struct { - embeddings embeddings.EmbeddingService - collection vectorstore.Collection -} - -func NewSemanticSearch() (*SemanticSearch, error) { - // Setup embeddings - embConfig := embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "BAAI/bge-large-en-v1.5", - }, - } - embSvc, err := embeddings.New(embConfig) - if err != nil { - return nil, err - } - - // Setup vector store - store, err := memory.New() - if err != nil { - return nil, err - } - - // Create collection - collection := store.Collection("documents", - vectorstore.WithDeduplication(true), - vectorstore.WithMaxDocuments(100000), - ) - - return &SemanticSearch{ - embeddings: embSvc, - collection: collection, - }, nil -} - -func (ss *SemanticSearch) IndexDocuments(ctx context.Context, documents map[string]string) error { - // Batch process documents - ids := make([]string, 0, len(documents)) - texts := make([]string, 0, len(documents)) - - for id, text := range documents { - ids = append(ids, id) - texts = append(texts, text) - } - - // Generate embeddings - embeddings, err := ss.embeddings.EmbedBatch(ctx, texts) - if err != nil { - return fmt.Errorf("embedding generation failed: %w", err) - } - - // Create document objects - docs := 
make([]*vectorstore.Document, len(texts)) - for i, text := range texts { - docs[i] = &vectorstore.Document{ - ID: ids[i], - Content: vectorstore.NewTextContent(text), - Embedding: vectorstore.NewEmbedding(embeddings[i], ss.embeddings.Model()), - } - } - - // Batch upsert - result, err := ss.collection.UpsertBatch(ctx, docs) - if err != nil { - return fmt.Errorf("upsert failed: %w", err) - } - - fmt.Printf("Indexed %d documents (%d succeeded, %d failed)\n", - len(docs), result.Succeeded, result.Failed) - - return nil -} - -func (ss *SemanticSearch) Search(ctx context.Context, query string, limit int) ([]SearchResult, error) { - // Generate query embedding - queryEmb, err := ss.embeddings.Embed(ctx, query) - if err != nil { - return nil, fmt.Errorf("query embedding failed: %w", err) - } - - // Search - result, err := ss.collection.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, ss.embeddings.Model()), - Limit: limit, - MinScore: 0.5, - }) - if err != nil { - return nil, fmt.Errorf("search failed: %w", err) - } - - // Convert results - results := make([]SearchResult, 0, len(result.Matches)) - for _, match := range result.Matches { - results = append(results, SearchResult{ - ID: match.Document.ID, - Content: match.Document.Content.String(), - Score: match.Score, - }) - } - - return results, nil -} - -type SearchResult struct { - ID string - Content string - Score float64 -} - -func (ss *SemanticSearch) Close() error { - return ss.embeddings.Close() -} - -func main() { - ctx := context.Background() - - // Initialize search system - search, err := NewSemanticSearch() - if err != nil { - log.Fatal(err) - } - defer search.Close() - - // Index sample documents - documents := map[string]string{ - "doc1": "Aixgo is a production-grade AI framework for Go", - "doc2": "Machine learning enables computers to learn from data", - "doc3": "Go is a statically typed, compiled programming language", - "doc4": "Neural networks are inspired by biological 
neural networks", - "doc5": "Docker containers provide consistent deployment environments", - } - - if err := search.IndexDocuments(ctx, documents); err != nil { - log.Fatal(err) - } - - // Perform searches - queries := []string{ - "AI framework for golang", - "deep learning and neural networks", - "container deployment", - } - - for _, query := range queries { - fmt.Printf("\nSearching for: '%s'\n", query) - results, err := search.Search(ctx, query, 3) - if err != nil { - log.Printf("Search failed: %v", err) - continue - } - - for i, result := range results { - fmt.Printf("%d. [%.3f] %s: %s\n", - i+1, result.Score, result.ID, result.Content) - } - } -} -``` - -### Multi-Provider Embedding Pipeline - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - "sync" - - "github.com/aixgo-dev/aixgo/pkg/embeddings" -) - -type MultiProviderPipeline struct { - providers map[string]embeddings.EmbeddingService - mu sync.RWMutex -} - -func NewMultiProviderPipeline() (*MultiProviderPipeline, error) { - providers := make(map[string]embeddings.EmbeddingService) - - // Setup HuggingFace API - hfConfig := embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "sentence-transformers/all-MiniLM-L6-v2", - }, - } - hfSvc, err := embeddings.New(hfConfig) - if err == nil { - providers["huggingface"] = hfSvc - } - - // Setup OpenAI (if API key available) - if apiKey := os.Getenv("OPENAI_API_KEY"); apiKey != "" { - openaiConfig := embeddings.Config{ - Provider: "openai", - OpenAI: &embeddings.OpenAIConfig{ - APIKey: apiKey, - Model: "text-embedding-3-small", - }, - } - openaiSvc, err := embeddings.New(openaiConfig) - if err == nil { - providers["openai"] = openaiSvc - } - } - - // Setup TEI (if available) - if endpoint := os.Getenv("TEI_ENDPOINT"); endpoint != "" { - teiConfig := embeddings.Config{ - Provider: "huggingface_tei", - HuggingFaceTEI: &embeddings.HuggingFaceTEIConfig{ - Endpoint: endpoint, - Model: 
"BAAI/bge-large-en-v1.5", - }, - } - teiSvc, err := embeddings.New(teiConfig) - if err == nil { - providers["tei"] = teiSvc - } - } - - if len(providers) == 0 { - return nil, fmt.Errorf("no embedding providers available") - } - - return &MultiProviderPipeline{ - providers: providers, - }, nil -} - -func (mp *MultiProviderPipeline) CompareProviders(ctx context.Context, text string) { - mp.mu.RLock() - defer mp.mu.RUnlock() - - fmt.Printf("Comparing embeddings for: '%s'\n\n", text) - - embeddings := make(map[string][]float32) - - // Generate embeddings with each provider - for name, svc := range mp.providers { - emb, err := svc.Embed(ctx, text) - if err != nil { - fmt.Printf("%s: Error - %v\n", name, err) - continue - } - embeddings[name] = emb - fmt.Printf("%s: Generated %d dimensions\n", name, len(emb)) - } - - // Compare similarities between providers - fmt.Println("\nCross-provider similarities:") - providers := make([]string, 0, len(embeddings)) - for name := range embeddings { - providers = append(providers, name) - } - - for i := 0; i < len(providers); i++ { - for j := i + 1; j < len(providers); j++ { - emb1 := embeddings[providers[i]] - emb2 := embeddings[providers[j]] - - if len(emb1) != len(emb2) { - fmt.Printf("%s <-> %s: Different dimensions (%d vs %d)\n", - providers[i], providers[j], len(emb1), len(emb2)) - continue - } - - similarity := cosineSimilarity(emb1, emb2) - fmt.Printf("%s <-> %s: %.3f\n", - providers[i], providers[j], similarity) - } - } -} - -func (mp *MultiProviderPipeline) BenchmarkProviders(ctx context.Context, texts []string) { - mp.mu.RLock() - defer mp.mu.RUnlock() - - fmt.Printf("Benchmarking with %d texts\n\n", len(texts)) - - for name, svc := range mp.providers { - start := time.Now() - - // Single embedding - _, err := svc.Embed(ctx, texts[0]) - singleTime := time.Since(start) - - if err != nil { - fmt.Printf("%s: Error - %v\n", name, err) - continue - } - - // Batch embedding - start = time.Now() - _, err = svc.EmbedBatch(ctx, 
texts) - batchTime := time.Since(start) - - if err != nil { - fmt.Printf("%s: Batch error - %v\n", name, err) - continue - } - - fmt.Printf("%s:\n", name) - fmt.Printf(" Single: %v\n", singleTime) - fmt.Printf(" Batch (%d): %v\n", len(texts), batchTime) - fmt.Printf(" Avg per text: %v\n\n", batchTime/time.Duration(len(texts))) - } -} - -func (mp *MultiProviderPipeline) Close() { - mp.mu.Lock() - defer mp.mu.Unlock() - - for _, svc := range mp.providers { - svc.Close() - } -} - -func main() { - ctx := context.Background() - - pipeline, err := NewMultiProviderPipeline() - if err != nil { - log.Fatal(err) - } - defer pipeline.Close() - - // Compare providers - pipeline.CompareProviders(ctx, "Artificial intelligence is transforming technology") - - // Benchmark providers - texts := []string{ - "Machine learning algorithms", - "Natural language processing", - "Computer vision applications", - "Deep learning neural networks", - "Reinforcement learning agents", - } - - fmt.Println("\n" + strings.Repeat("=", 50)) - pipeline.BenchmarkProviders(ctx, texts) -} -``` - -## Next Steps - -- **Vector Databases**: [Build RAG systems with embeddings](./vector-databases.md) -- **Provider Integration**: [Configure LLM and embedding providers](./provider-integration.md) -- **Production Deployment**: [Deploy embeddings at scale](./production-deployment.md) -- **API Reference**: [Embeddings Package Documentation](https://pkg.go.dev/github.com/aixgo-dev/aixgo/pkg/embeddings) - -## Resources - -- [HuggingFace Model Hub](https://huggingface.co/models?pipeline_tag=sentence-similarity) -- [OpenAI Embeddings Guide](https://platform.openai.com/docs/guides/embeddings) -- [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference) -- [Embedding Model Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) -- [MTEB Benchmark](https://github.com/embeddings-benchmark/mteb) diff --git a/web/content/guides/extending-aixgo.md b/web/content/guides/extending-aixgo.md 
deleted file mode 100644 index 44d5ffe..0000000 --- a/web/content/guides/extending-aixgo.md +++ /dev/null @@ -1,1096 +0,0 @@ ---- -title: 'Extending Aixgo' -description: 'Guide to adding custom LLM providers, vector databases, and embedding services' -category: 'Advanced' -weight: 7 ---- - -# Extending Aixgo - -Aixgo is designed to be extensible. This guide shows you how to add custom providers for LLMs, vector databases, and embeddings. - -## Overview - -Aixgo uses a registry pattern that allows you to: - -- Add custom LLM providers -- Implement new vector database backends -- Integrate custom embedding services -- Extend existing providers with additional features - -### Provider Architecture - -```text -┌─────────────────────────────────────────┐ -│ Application Code │ -└─────────────────┬───────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────┐ -│ Provider-Agnostic Interface │ -│ (LLM, VectorStore, EmbeddingService) │ -└─────────────────┬───────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────┐ -│ Provider Registry │ -│ {"openai": Factory, "custom": ...} │ -└─────────────────┬───────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────┐ -│ Concrete Implementations │ -│ OpenAI, Anthropic, Custom, etc. │ -└─────────────────────────────────────────┘ -``` - -## Adding a Custom Vector Store - -Let's implement a Qdrant vector store as an example using Aixgo's Collection-based architecture. 
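Before diving into Qdrant specifics, the registry layer in the diagram above can be sketched in a few lines of plain Go. The names here (`Factory`, `Register`, `New`, and the string-returning factory) are ours for illustration; aixgo's actual registration API may differ:

```go
package main

import (
	"fmt"
	"sync"
)

// Factory builds a provider instance from configuration. Returning a string
// keeps this sketch self-contained; a real registry would return an
// interface such as vectorstore.VectorStore.
type Factory func(cfg map[string]string) (string, error)

var (
	mu        sync.RWMutex
	providers = map[string]Factory{}
)

// Register makes a provider available under a name, as an init() in a
// custom backend package typically would.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	providers[name] = f
}

// New resolves a name to its factory and invokes it.
func New(name string, cfg map[string]string) (string, error) {
	mu.RLock()
	f, ok := providers[name]
	mu.RUnlock()
	if !ok {
		return "", fmt.Errorf("unknown provider %q", name)
	}
	return f(cfg)
}

func main() {
	Register("qdrant", func(cfg map[string]string) (string, error) {
		return "qdrant@" + cfg["host"], nil
	})
	s, err := New("qdrant", map[string]string{"host": "localhost:6333"})
	fmt.Println(s, err) // qdrant@localhost:6333 <nil>
}
```

Application code only ever sees the string key and the shared interface, which is what lets the Qdrant backend below slot in without touching callers.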
-
-### Step 1: Implement the VectorStore Interface
-
-```go
-// pkg/vectorstore/qdrant/qdrant.go
-package qdrant
-
-import (
-    "context"
-    "fmt"
-    "sync"
-
-    qdrantclient "github.com/qdrant/go-client/qdrant"
-    "github.com/aixgo-dev/aixgo/pkg/vectorstore"
-)
-
-// QdrantStore implements the vectorstore.VectorStore interface
-type QdrantStore struct {
-    host        string // Qdrant server address
-    apiKey      string // API key for authentication
-    client      *qdrantclient.Client
-    collections sync.Map // thread-safe collection cache
-}
-
-// Option configures the Qdrant store
-type Option func(*QdrantStore) error
-
-// WithHost sets the Qdrant server address
-func WithHost(host string) Option {
-    return func(q *QdrantStore) error {
-        q.host = host
-        return nil
-    }
-}
-
-// WithAPIKey sets the API key for authentication
-func WithAPIKey(apiKey string) Option {
-    return func(q *QdrantStore) error {
-        q.apiKey = apiKey
-        return nil
-    }
-}
-
-// New creates a new Qdrant vector store
-func New(opts ...Option) (*QdrantStore, error) {
-    store := &QdrantStore{
-        host:   "localhost:6333",
-        apiKey: "",
-    }
-
-    // Apply options
-    for _, opt := range opts {
-        if err := opt(store); err != nil {
-            return nil, err
-        }
-    }
-
-    // Create Qdrant client
-    client, err := qdrantclient.NewClient(&qdrantclient.Config{
-        Host:   store.host,
-        APIKey: store.apiKey,
-    })
-    if err != nil {
-        return nil, fmt.Errorf("failed to create qdrant client: %w", err)
-    }
-
-    store.client = client
-    return store, nil
-}
-
-// Collection returns or creates a collection
-func (q *QdrantStore) Collection(name string, opts ...vectorstore.CollectionOption) vectorstore.Collection {
-    // Check cache
-    if coll, ok := q.collections.Load(name); ok {
-        return coll.(vectorstore.Collection)
-    }
-
-    // Create new collection
-    config := vectorstore.NewCollectionConfig(opts...)
- coll := &QdrantCollection{ - store: q, - name: name, - config: config, - } - - // Ensure Qdrant collection exists - if err := coll.ensureExists(context.Background()); err != nil { - // Log error but return collection anyway - it will fail on use - fmt.Printf("Warning: failed to ensure collection exists: %v\n", err) - } - - // Cache and return - q.collections.Store(name, coll) - return coll -} - -// Close closes the Qdrant client -func (q *QdrantStore) Close() error { - return q.client.Close() -} -``` - -### Step 2: Implement the Collection Interface - -```go -// QdrantCollection implements vectorstore.Collection -type QdrantCollection struct { - store *QdrantStore - name string - config *vectorstore.CollectionConfig -} - -// ensureExists creates the Qdrant collection if it doesn't exist -func (c *QdrantCollection) ensureExists(ctx context.Context) error { - exists, err := c.store.client.CollectionExists(ctx, c.name) - if err != nil { - return fmt.Errorf("failed to check collection: %w", err) - } - - if !exists { - // Create collection with vector configuration - dims := c.config.Dimensions - if dims == 0 { - dims = 384 // Default dimension - } - - err = c.store.client.CreateCollection(ctx, &qdrantclient.CreateCollection{ - CollectionName: c.name, - VectorsConfig: &qdrantclient.VectorsConfig{ - Params: &qdrantclient.VectorParams{ - Size: uint64(dims), - Distance: qdrantclient.Distance_Cosine, - }, - }, - }) - if err != nil { - return fmt.Errorf("failed to create collection: %w", err) - } - } - - return nil -} - -// Upsert inserts or updates a document -func (c *QdrantCollection) Upsert(ctx context.Context, doc *vectorstore.Document) (*vectorstore.UpsertResult, error) { - // Convert to Qdrant point - point := &qdrantclient.PointStruct{ - Id: &qdrantclient.PointId{ - PointIdOptions: &qdrantclient.PointId_Uuid{ - Uuid: doc.ID, - }, - }, - Vectors: &qdrantclient.Vectors{ - VectorsOptions: &qdrantclient.Vectors_Vector{ - Vector: &qdrantclient.Vector{ - Data: 
doc.Embedding.Vector, - }, - }, - }, - Payload: buildPayload(doc), - } - - // Upsert to Qdrant - _, err := c.store.client.Upsert(ctx, &qdrantclient.UpsertPoints{ - CollectionName: c.name, - Points: []*qdrantclient.PointStruct{point}, - }) - if err != nil { - return nil, fmt.Errorf("failed to upsert: %w", err) - } - - return &vectorstore.UpsertResult{Inserted: 1}, nil -} - -// Query performs similarity search -func (c *QdrantCollection) Query(ctx context.Context, query *vectorstore.Query) (*vectorstore.QueryResult, error) { - // Build Qdrant search request - searchParams := &qdrantclient.SearchPoints{ - CollectionName: c.name, - Vector: query.Embedding.Vector, - Limit: uint64(query.Limit), - WithPayload: &qdrantclient.WithPayloadSelector{ - SelectorOptions: &qdrantclient.WithPayloadSelector_Enable{Enable: true}, - }, - } - - if query.MinScore > 0 { - searchParams.ScoreThreshold = &query.MinScore - } - - // Add filters if provided - if query.Filters != nil { - searchParams.Filter = buildQdrantFilter(query.Filters) - } - - // Execute search - results, err := c.store.client.Search(ctx, searchParams) - if err != nil { - return nil, fmt.Errorf("search failed: %w", err) - } - - // Convert results - matches := make([]*vectorstore.Match, 0, len(results)) - for _, hit := range results { - doc := parseQdrantPoint(hit) - matches = append(matches, &vectorstore.Match{ - Document: doc, - Score: hit.Score, - }) - } - - return &vectorstore.QueryResult{Matches: matches}, nil -} - -// Delete removes documents by IDs -func (c *QdrantCollection) Delete(ctx context.Context, ids ...string) error { - if len(ids) == 0 { - return nil - } - - // Convert string IDs to Qdrant point IDs - pointIds := make([]*qdrantclient.PointId, len(ids)) - for i, id := range ids { - pointIds[i] = &qdrantclient.PointId{ - PointIdOptions: &qdrantclient.PointId_Uuid{Uuid: id}, - } - } - - // Delete from Qdrant - _, err := c.store.client.Delete(ctx, &qdrantclient.DeletePoints{ - CollectionName: c.name, - Points: 
&qdrantclient.PointsSelector{ - PointsSelectorOneOf: &qdrantclient.PointsSelector_Points{ - Points: &qdrantclient.PointsIdsList{Ids: pointIds}, - }, - }, - }) - - return err -} - -// Helper functions - -func buildPayload(doc *vectorstore.Document) map[string]interface{} { - payload := make(map[string]interface{}) - - // Store content based on type - switch content := doc.Content.(type) { - case *vectorstore.TextContent: - payload["content_type"] = "text" - payload["content_text"] = content.Text - case *vectorstore.ImageContent: - payload["content_type"] = "image" - payload["content_url"] = content.URL - } - - // Store metadata - for k, v := range doc.Metadata { - payload[k] = v - } - - // Store scope if present - if doc.Scope != nil { - if doc.Scope.Tenant != "" { - payload["scope_tenant"] = doc.Scope.Tenant - } - if doc.Scope.User != "" { - payload["scope_user"] = doc.Scope.User - } - } - - // Store tags - if len(doc.Tags) > 0 { - payload["tags"] = doc.Tags - } - - return payload -} - -func parseQdrantPoint(point *qdrantclient.ScoredPoint) *vectorstore.Document { - doc := &vectorstore.Document{ - ID: point.Id.GetUuid(), - Metadata: make(map[string]any), - } - - // Parse content - if contentType, ok := point.Payload["content_type"].(string); ok { - switch contentType { - case "text": - if text, ok := point.Payload["content_text"].(string); ok { - doc.Content = vectorstore.NewTextContent(text) - } - case "image": - if url, ok := point.Payload["content_url"].(string); ok { - doc.Content = vectorstore.NewImageURL(url) - } - } - } - - // Parse scope - if tenant, ok := point.Payload["scope_tenant"].(string); ok { - if doc.Scope == nil { - doc.Scope = &vectorstore.Scope{} - } - doc.Scope.Tenant = tenant - } - if user, ok := point.Payload["scope_user"].(string); ok { - if doc.Scope == nil { - doc.Scope = &vectorstore.Scope{} - } - doc.Scope.User = user - } - - // Parse tags - if tags, ok := point.Payload["tags"].([]interface{}); ok { - doc.Tags = make([]string, len(tags)) 
- for i, tag := range tags { - doc.Tags[i] = tag.(string) - } - } - - // Parse metadata (exclude system fields) - for k, v := range point.Payload { - if !isSystemField(k) { - doc.Metadata[k] = v - } - } - - return doc -} - -func isSystemField(key string) bool { - systemFields := map[string]bool{ - "content_type": true, - "content_text": true, - "content_url": true, - "scope_tenant": true, - "scope_user": true, - "tags": true, - } - return systemFields[key] -} - -func buildQdrantFilter(filter vectorstore.Filter) *qdrantclient.Filter { - // Implement filter translation from Aixgo's Filter to Qdrant's Filter - // This is a simplified example - production code needs full implementation - return &qdrantclient.Filter{ - // Add filter conditions based on filter type - } -} -``` - -### Step 3: Usage Example - -```go -package main - -import ( - "context" - "fmt" - "os" - - "github.com/aixgo-dev/aixgo/pkg/vectorstore" - "github.com/aixgo-dev/aixgo/pkg/vectorstore/qdrant" -) - -func main() { - ctx := context.Background() - - // Create Qdrant store with options - store, err := qdrant.New( - qdrant.WithHost("localhost:6333"), - qdrant.WithAPIKey(os.Getenv("QDRANT_API_KEY")), - ) - if err != nil { - panic(err) - } - defer store.Close() - - // Create a collection for documents - docs := store.Collection("documents", - vectorstore.WithDimensions(384), - vectorstore.WithDeduplication(true), - ) - - // Upsert a document - doc := &vectorstore.Document{ - ID: "doc1", - Content: vectorstore.NewTextContent("Aixgo is a Go framework for AI agents"), - Embedding: vectorstore.NewEmbedding( - []float32{0.1, 0.2, 0.3}, // ... 
384 dimensions - "all-MiniLM-L6-v2", - ), - Tags: []string{"documentation"}, - Metadata: map[string]any{"source": "readme"}, - } - - result, err := docs.Upsert(ctx, doc) - if err != nil { - panic(err) - } - fmt.Printf("Inserted %d documents\n", result.Inserted) - - // Query the collection - queryResult, err := docs.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding( - []float32{0.11, 0.21, 0.31}, // ... 384 dimensions - "all-MiniLM-L6-v2", - ), - Limit: 5, - MinScore: 0.7, - }) - if err != nil { - panic(err) - } - - // Display results - for _, match := range queryResult.Matches { - fmt.Printf("Score: %.2f - %s\n", - match.Score, - match.Document.Content.String(), - ) - } -} -``` - -## Adding a Custom Embedding Provider - -Let's implement a Cohere embeddings provider. - -### Step 1: Define Configuration - -```go -// pkg/embeddings/embeddings.go -package embeddings - -type CohereConfig struct { - APIKey string `yaml:"api_key" json:"api_key"` - Model string `yaml:"model" json:"model"` -} - -func (cc *CohereConfig) Validate() error { - if cc.APIKey == "" { - return fmt.Errorf("cohere api_key is required") - } - if cc.Model == "" { - cc.Model = "embed-english-v3.0" // Default - } - return nil -} - -// Add to Config -type Config struct { - // ... existing fields ... 
- Cohere *CohereConfig `yaml:"cohere,omitempty" json:"cohere,omitempty"` -} -``` - -### Step 2: Implement the Interface - -```go -// pkg/embeddings/cohere.go -package embeddings - -import ( - "bytes" - "context" - "encoding/json" - "fmt" - "io" - "net/http" - "time" -) - -type CohereEmbeddings struct { - apiKey string - model string - client *http.Client -} - -type cohereRequest struct { - Texts []string `json:"texts"` - Model string `json:"model"` -} - -type cohereResponse struct { - Embeddings [][]float32 `json:"embeddings"` -} - -func init() { - Register("cohere", NewCohere) -} - -func NewCohere(config Config) (EmbeddingService, error) { - if config.Cohere == nil { - return nil, fmt.Errorf("cohere configuration is required") - } - - return &CohereEmbeddings{ - apiKey: config.Cohere.APIKey, - model: config.Cohere.Model, - client: &http.Client{ - Timeout: 30 * time.Second, - }, - }, nil -} - -func (c *CohereEmbeddings) Embed(ctx context.Context, text string) ([]float32, error) { - embeddings, err := c.EmbedBatch(ctx, []string{text}) - if err != nil { - return nil, err - } - if len(embeddings) == 0 { - return nil, fmt.Errorf("no embeddings returned") - } - return embeddings[0], nil -} - -func (c *CohereEmbeddings) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error) { - reqBody := cohereRequest{ - Texts: texts, - Model: c.model, - } - - jsonData, err := json.Marshal(reqBody) - if err != nil { - return nil, fmt.Errorf("failed to marshal request: %w", err) - } - - req, err := http.NewRequestWithContext(ctx, "POST", "https://api.cohere.ai/v1/embed", bytes.NewBuffer(jsonData)) - if err != nil { - return nil, err - } - - req.Header.Set("Content-Type", "application/json") - req.Header.Set("Authorization", "Bearer "+c.apiKey) - - resp, err := c.client.Do(req) - if err != nil { - return nil, err - } - defer resp.Body.Close() - - body, err := io.ReadAll(resp.Body) - if err != nil { - return nil, err - } - - if resp.StatusCode != http.StatusOK { - return nil, 
fmt.Errorf("API error (status %d): %s", resp.StatusCode, string(body)) - } - - var cohereResp cohereResponse - if err := json.Unmarshal(body, &cohereResp); err != nil { - return nil, fmt.Errorf("failed to parse response: %w", err) - } - - return cohereResp.Embeddings, nil -} - -func (c *CohereEmbeddings) Dimensions() int { - // Cohere embed-english-v3.0 returns 1024 dimensions - return 1024 -} - -func (c *CohereEmbeddings) ModelName() string { - return c.model -} - -func (c *CohereEmbeddings) Close() error { - c.client.CloseIdleConnections() - return nil -} -``` - -## Adding a Custom LLM Provider - -Example: Adding Mistral AI support. - -### Implementation - -```go -// pkg/llm/mistral.go -package llm - -import ( - "context" - "fmt" - // ... imports -) - -type MistralLLM struct { - apiKey string - model string - temperature float32 - client *http.Client -} - -func init() { - Register("mistral", NewMistral) -} - -func NewMistral(config Config) (LLM, error) { - if config.Mistral == nil { - return nil, fmt.Errorf("mistral configuration required") - } - - return &MistralLLM{ - apiKey: config.Mistral.APIKey, - model: config.Mistral.Model, - temperature: config.Mistral.Temperature, - client: &http.Client{ - Timeout: 60 * time.Second, - }, - }, nil -} - -func (m *MistralLLM) Complete(ctx context.Context, prompt string) (string, error) { - // Implementation similar to OpenAI - // Use Mistral's API endpoints and format -} - -func (m *MistralLLM) Chat(ctx context.Context, messages []Message) (Message, error) { - // Implement chat completion -} - -func (m *MistralLLM) Close() error { - m.client.CloseIdleConnections() - return nil -} -``` - -## Best Practices - -### 1. 
Configuration Validation - -Always validate configuration early: - -```go -func (c *CustomConfig) Validate() error { - if c.RequiredField == "" { - return fmt.Errorf("required_field must be set") - } - if c.Port < 1 || c.Port > 65535 { - return fmt.Errorf("invalid port: %d", c.Port) - } - // Set defaults - if c.Timeout == 0 { - c.Timeout = 30 * time.Second - } - return nil -} -``` - -### 2. Error Handling - -Provide clear, actionable error messages: - -```go -// Bad -return fmt.Errorf("error: %w", err) - -// Good -return fmt.Errorf("failed to connect to Qdrant at %s:%d: %w", - q.host, q.port, err) -``` - -### 3. Context Support - -Always respect context cancellation: - -```go -func (c *CustomProvider) Search(ctx context.Context, query Query) (Results, error) { - // Check context before expensive operations - select { - case <-ctx.Done(): - return nil, ctx.Err() - default: - } - - // Pass context to HTTP requests - req, err := http.NewRequestWithContext(ctx, "POST", url, body) -} -``` - -### 4. Resource Cleanup - -Implement proper cleanup: - -```go -func (c *CustomProvider) Close() error { - // Close connections - if c.client != nil { - c.client.Close() - } - - // Close channels - if c.done != nil { - close(c.done) - } - - return nil -} -``` - -### 5. 
Testing
-
-Add comprehensive tests:
-
-```go
-func TestCustomProvider(t *testing.T) {
-    // Test configuration validation
-    t.Run("ValidateConfig", func(t *testing.T) {
-        config := CustomConfig{}
-        err := config.Validate()
-        if err == nil {
-            t.Error("expected validation error for empty config")
-        }
-    })
-
-    // Test provider creation
-    t.Run("NewProvider", func(t *testing.T) {
-        config := CustomConfig{
-            Host: "localhost",
-            Port: 1234,
-        }
-        provider, err := New(config)
-        if err != nil {
-            t.Fatalf("failed to create provider: %v", err)
-        }
-        defer provider.Close()
-    })
-
-    // Test operations
-    t.Run("Operations", func(t *testing.T) {
-        // Test upsert, search, delete, get
-    })
-}
-```
-
-## Complete Example: pgvector Provider
-
-Here's a complete implementation for PostgreSQL with pgvector extension:
-
-```go
-// pkg/vectorstore/pgvector/pgvector.go
-package pgvector
-
-import (
-    "context"
-    "database/sql"
-    "encoding/json"
-    "fmt"
-    "strings"
-
-    _ "github.com/lib/pq"
-    "github.com/pgvector/pgvector-go"
-    "github.com/aixgo-dev/aixgo/pkg/vectorstore"
-)
-
-type PgVectorStore struct {
-    db                  *sql.DB
-    table               string
-    embeddingDimensions int
-    defaultTopK         int
-}
-
-func init() {
-    vectorstore.Register("pgvector", New)
-}
-
-func New(config vectorstore.Config) (vectorstore.VectorStore, error) {
-    if config.PgVector == nil {
-        return nil, fmt.Errorf("pgvector configuration required")
-    }
-
-    // Connect to PostgreSQL
-    db, err := sql.Open("postgres", config.PgVector.ConnectionString)
-    if err != nil {
-        return nil, fmt.Errorf("failed to connect: %w", err)
-    }
-
-    // Test connection
-    if err := db.Ping(); err != nil {
-        return nil, fmt.Errorf("failed to ping database: %w", err)
-    }
-
-    // Create table if not exists
-    createTableSQL := fmt.Sprintf(`
-        CREATE TABLE IF NOT EXISTS %s (
-            id TEXT PRIMARY KEY,
-            content TEXT NOT NULL,
-            embedding vector(%d),
-            metadata JSONB,
-            created_at TIMESTAMP DEFAULT NOW(),
-            updated_at TIMESTAMP DEFAULT NOW()
-        )
-    `, config.PgVector.Table, config.EmbeddingDimensions)
- - if _, err := db.Exec(createTableSQL); err != nil { - return nil, fmt.Errorf("failed to create table: %w", err) - } - - // Create vector index - indexSQL := fmt.Sprintf(` - CREATE INDEX IF NOT EXISTS %s_embedding_idx - ON %s USING ivfflat (embedding vector_cosine_ops) - WITH (lists = 100) - `, config.PgVector.Table, config.PgVector.Table) - - if _, err := db.Exec(indexSQL); err != nil { - return nil, fmt.Errorf("failed to create index: %w", err) - } - - return &PgVectorStore{ - db: db, - table: config.PgVector.Table, - embeddingDimensions: config.EmbeddingDimensions, - defaultTopK: config.DefaultTopK, - }, nil -} - -func (p *PgVectorStore) Upsert(ctx context.Context, documents []vectorstore.Document) error { - if len(documents) == 0 { - return nil - } - - tx, err := p.db.BeginTx(ctx, nil) - if err != nil { - return err - } - defer tx.Rollback() - - stmt, err := tx.PrepareContext(ctx, fmt.Sprintf(` - INSERT INTO %s (id, content, embedding, metadata, created_at, updated_at) - VALUES ($1, $2, $3, $4, $5, $6) - ON CONFLICT (id) DO UPDATE SET - content = EXCLUDED.content, - embedding = EXCLUDED.embedding, - metadata = EXCLUDED.metadata, - updated_at = EXCLUDED.updated_at - `, p.table)) - if err != nil { - return err - } - defer stmt.Close() - - for _, doc := range documents { - // Convert metadata to JSONB - metadataJSON, _ := json.Marshal(doc.Metadata) - - // Convert embedding to pgvector format - embedding := pgvector.NewVector(doc.Embedding) - - _, err := stmt.ExecContext(ctx, - doc.ID, - doc.Content, - embedding, - metadataJSON, - doc.CreatedAt, - doc.UpdatedAt, - ) - if err != nil { - return err - } - } - - return tx.Commit() -} - -func (p *PgVectorStore) Search(ctx context.Context, query vectorstore.SearchQuery) ([]vectorstore.SearchResult, error) { - if query.TopK == 0 { - query.TopK = p.defaultTopK - } - - embedding := pgvector.NewVector(query.Embedding) - - querySQL := fmt.Sprintf(` - SELECT id, content, embedding, metadata, - 1 - (embedding <=> $1) as score 
- FROM %s - WHERE 1 - (embedding <=> $1) >= $2 - ORDER BY embedding <=> $1 - LIMIT $3 - `, p.table) - - rows, err := p.db.QueryContext(ctx, querySQL, embedding, query.MinScore, query.TopK) - if err != nil { - return nil, err - } - defer rows.Close() - - var results []vectorstore.SearchResult - for rows.Next() { - var doc vectorstore.Document - var embeddingData pgvector.Vector - var metadataJSON []byte - var score float32 - - err := rows.Scan(&doc.ID, &doc.Content, &embeddingData, &metadataJSON, &score) - if err != nil { - return nil, err - } - - // Parse metadata - json.Unmarshal(metadataJSON, &doc.Metadata) - doc.Embedding = embeddingData.Slice() - - results = append(results, vectorstore.SearchResult{ - Document: doc, - Score: score, - }) - } - - return results, rows.Err() -} - -func (p *PgVectorStore) Delete(ctx context.Context, ids []string) error { - if len(ids) == 0 { - return nil - } - - placeholders := make([]string, len(ids)) - args := make([]interface{}, len(ids)) - for i, id := range ids { - placeholders[i] = fmt.Sprintf("$%d", i+1) - args[i] = id - } - - query := fmt.Sprintf("DELETE FROM %s WHERE id IN (%s)", - p.table, strings.Join(placeholders, ",")) - - _, err := p.db.ExecContext(ctx, query, args...) 
- return err -} - -func (p *PgVectorStore) Get(ctx context.Context, ids []string) ([]vectorstore.Document, error) { - // Similar to Delete, but with SELECT -} - -func (p *PgVectorStore) Close() error { - return p.db.Close() -} -``` - -## Deployment and Distribution - -### Creating a Separate Module - -For external providers, create a separate module: - -```text -github.com/yourorg/aixgo-qdrant/ -├── go.mod -├── go.sum -├── README.md -├── qdrant.go -└── qdrant_test.go -``` - -**go.mod:** - -```go -module github.com/yourorg/aixgo-qdrant - -go 1.21 - -require ( - github.com/aixgo-dev/aixgo v0.2.4 - github.com/qdrant/go-client v1.0.0 -) -``` - -**Usage:** - -```go -import ( - "github.com/aixgo-dev/aixgo/pkg/vectorstore" - _ "github.com/yourorg/aixgo-qdrant" // Register provider -) - -config := vectorstore.Config{ - Provider: "qdrant", - // ... -} -``` - -## Testing Custom Providers - -### Integration Tests - -```go -func TestQdrantIntegration(t *testing.T) { - if testing.Short() { - t.Skip("skipping integration test") - } - - // Start Qdrant with Docker - ctx := context.Background() - container := startQdrantContainer(t) - defer container.Stop(ctx) - - // Create provider - config := vectorstore.Config{ - Provider: "qdrant", - EmbeddingDimensions: 384, - Qdrant: &vectorstore.QdrantConfig{ - Host: "localhost", - Port: 6333, - Collection: "test", - }, - } - - store, err := vectorstore.New(config) - if err != nil { - t.Fatal(err) - } - defer store.Close() - - // Run tests - testUpsert(t, store) - testSearch(t, store) - testDelete(t, store) -} -``` - -## Contributing Back - -To contribute your provider to Aixgo: - -1. Fork the repository -2. Create a feature branch -3. Add your provider implementation -4. Add tests (unit + integration) -5. Update documentation -6. Submit a pull request - -See [CONTRIBUTING.md](https://github.com/aixgo-dev/aixgo/blob/main/CONTRIBUTING.md) for details. 
- -## Resources - -- [Qdrant Go Client](https://github.com/qdrant/go-client) -- [pgvector Go](https://github.com/pgvector/pgvector-go) -- [Cohere API](https://docs.cohere.ai/) -- [Mistral API](https://docs.mistral.ai/) - -## Next Steps - -- Try the [RAG Agent Example](../../examples/rag-agent) -- Read [Vector Databases Guide](./vector-databases.md) -- Explore [Provider Integration](./provider-integration.md) diff --git a/web/content/guides/multi-agent-orchestration.md b/web/content/guides/multi-agent-orchestration.md deleted file mode 100644 index 45de73e..0000000 --- a/web/content/guides/multi-agent-orchestration.md +++ /dev/null @@ -1,836 +0,0 @@ ---- -title: 'Multi-Agent Orchestration' -description: 'Learn how to coordinate multiple agents with supervisor patterns and message routing.' -breadcrumb: 'Core Concepts' -category: 'Core Concepts' -weight: 4 ---- - -Multi-agent systems unlock powerful capabilities: divide complex tasks across specialized agents, enable parallel processing, and create sophisticated workflows. Aixgo provides 13 production-proven orchestration patterns that make building these systems straightforward. - -## Overview - -Aixgo supports **13 production-proven orchestration patterns** for building AI agent systems. Each pattern solves specific coordination challenges and is backed by real-world production usage. 
- -### Pattern Status - -| Pattern | Status | Production Usage | Key Benefit | -|---------|--------|-----------------|-------------| -| Supervisor | ✅ Stable | High | Central coordination & lifecycle management | -| Sequential | ✅ Stable | High | Ordered task execution | -| Parallel | ✅ Stable | High | Concurrent processing | -| Router | ✅ Stable | High | Intelligent task routing | -| Swarm | ✅ Stable | Medium | Autonomous collaboration | -| Hierarchical | ✅ Stable | Medium | Multi-level delegation | -| RAG | ✅ Stable | High | Knowledge-augmented responses | -| Reflection | ✅ Stable | Medium | Self-improvement & critique | -| Ensemble | ✅ Stable | Medium | Multi-model consensus | -| Classifier | ✅ Stable | High | Intelligent categorization | -| Aggregation | ✅ Stable | High | Multi-source synthesis | -| Planning | ✅ Stable | Medium | Strategic task decomposition | -| MapReduce | ✅ Stable | Medium | Distributed data processing | - -All 13 patterns are production-ready and actively used in real-world applications. - -## Pattern Catalog - -For comprehensive pattern details, see [PATTERNS.md](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md). - -### 1. Supervisor Pattern - -The supervisor coordinates agent lifecycle, message routing, and execution constraints. It's the foundational layer for all multi-agent systems. - -**When to use**: Every multi-agent system requiring centralized control, lifecycle management, and message routing. - -**Quick example**: - -```yaml -supervisor: - name: coordinator - max_rounds: 10 - -agents: - - name: worker-1 - outputs: [{ target: worker-2 }] - - name: worker-2 - inputs: [{ source: worker-1 }] -``` - -**See**: [Supervisor Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#1-supervisor-pattern) - -### 2. Sequential Pattern - -Execute agents in order where each step depends on the previous output. Ideal for ETL pipelines and multi-stage workflows. 
- -**When to use**: Ordered execution, stage dependencies, deterministic processing. - -**Quick example**: - -```yaml -agents: - - name: ingest - outputs: [{ target: process }] - - name: process - inputs: [{ source: ingest }] - outputs: [{ target: store }] - - name: store - inputs: [{ source: process }] -``` - -**See**: [Sequential Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#2-sequential-pattern) - -### 3. Parallel Pattern - -Execute multiple agents concurrently for independent tasks. Reduces latency through parallelism. - -**When to use**: Independent tasks, multi-perspective analysis, distributed processing. - -**Quick example**: - -```yaml -agents: - - name: source - outputs: [{ target: analyzer-1 }, { target: analyzer-2 }, { target: analyzer-3 }] - - name: analyzer-1 - inputs: [{ source: source }] - - name: analyzer-2 - inputs: [{ source: source }] - - name: analyzer-3 - inputs: [{ source: source }] -``` - -**See**: [Parallel Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#3-parallel-pattern) - -### 4. Router Pattern - -Intelligently route messages to appropriate agents based on classification or routing logic. Achieves 60-80% cost savings in production. - -**When to use**: Cost optimization, skill-based routing, conditional workflows. - -**Quick example**: - -```yaml -agents: - - name: classifier - role: classifier - categories: [simple, moderate, complex] - outputs: - - target: simple-handler - condition: 'category == simple' - - target: complex-handler - condition: 'category == complex' -``` - -**See**: [Router Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#4-router-pattern) - -### 5. Swarm Pattern - -Autonomous peer-to-peer agent collaboration without central coordination. Agents self-organize and communicate directly. - -**When to use**: Brainstorming, consensus building, emergent problem solving. 
- -**Quick example**: - -```yaml -agents: - - name: agent-1 - inputs: [{ source: agent-2 }, { source: agent-3 }] - outputs: [{ target: agent-2 }, { target: agent-3 }] - - name: agent-2 - inputs: [{ source: agent-1 }, { source: agent-3 }] - outputs: [{ target: agent-1 }, { target: agent-3 }] - - name: agent-3 - inputs: [{ source: agent-1 }, { source: agent-2 }] - outputs: [{ target: agent-1 }, { target: agent-2 }] -``` - -**See**: [Swarm Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#5-swarm-pattern) - -### 6. Hierarchical Pattern - -Tree structure with managers delegating to workers. Enables multi-level task decomposition and scalable coordination. - -**When to use**: Large agent teams (>10), organizational modeling, multi-level workflows. - -**Quick example**: - -```yaml -agents: - - name: executive - outputs: [{ target: manager-1 }, { target: manager-2 }] - - name: manager-1 - inputs: [{ source: executive }] - outputs: [{ target: worker-1a }, { target: worker-1b }] - - name: worker-1a - inputs: [{ source: manager-1 }] -``` - -**See**: [Hierarchical Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#6-hierarchical-pattern) - -### 7. RAG Pattern - -Combines knowledge retrieval with generation to ground responses in factual data. Essential for question answering over documents. - -**When to use**: Knowledge base integration, reducing hallucinations, domain-specific information. - -**Quick example**: - -```yaml -agents: - - name: retriever - role: retriever - vector_store: pinecone - top_k: 5 - outputs: [{ target: generator }] - - name: generator - role: react - inputs: [{ source: retriever }] -``` - -**See**: [RAG Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#7-rag-pattern) - -### 8. Reflection Pattern - -Agents critique and improve their outputs through iterative self-review. Improves quality through self-correction. 
- -**When to use**: High-quality content generation, code review, iterative refinement. - -**Quick example**: - -```yaml -agents: - - name: generator - outputs: [{ target: critic }] - - name: critic - inputs: [{ source: generator }] - outputs: [{ target: refiner }] - - name: refiner - inputs: [{ source: critic }] -``` - -**See**: [Reflection Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#8-reflection-pattern) - -### 9. Ensemble Pattern - -Combine outputs from multiple models to improve accuracy and reduce variance through voting. - -**When to use**: High-stakes decisions, model comparison, uncertainty reduction. - -**Quick example**: - -```yaml -agents: - - name: model-1 - outputs: [{ target: aggregator }] - - name: model-2 - outputs: [{ target: aggregator }] - - name: model-3 - outputs: [{ target: aggregator }] - - name: aggregator - role: aggregator - strategy: majority-vote - inputs: [{ source: model-1 }, { source: model-2 }, { source: model-3 }] -``` - -**See**: [Ensemble Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#9-ensemble-pattern) - -### 10. Classifier Pattern - -Categorize inputs into predefined categories for intelligent routing. Achieves 95%+ accuracy in production. - -**When to use**: Content categorization, intent detection, automated triage. - -**Quick example**: - -```yaml -agents: - - name: classifier - role: classifier - categories: [technical, business, legal, marketing] - outputs: - - target: technical-handler - condition: 'has_category("technical")' -``` - -**See**: [Classifier Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#10-classifier-pattern) - -### 11. Aggregation Pattern - -Combine multiple inputs into coherent synthesis. Supports consensus, weighted, and semantic aggregation strategies. - -**When to use**: Multi-source data fusion, consensus decision making, report generation. 
- -**Quick example**: - -```yaml -agents: - - name: aggregator - role: aggregator - strategy: consensus - min_inputs: 2 - timeout: 30s - inputs: [{ source: source-1 }, { source: source-2 }, { source: source-3 }] -``` - -**See**: [Aggregation Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#11-aggregation-pattern) - -### 12. Planning Pattern - -Decompose complex tasks into executable steps for strategic problem solving. - -**When to use**: Complex task decomposition, multi-step workflows, dynamic workflow generation. - -**Quick example**: - -```yaml -agents: - - name: planner - role: planner - strategy: chain-of-thought - max_steps: 10 - outputs: [{ target: executor }] - - name: executor - inputs: [{ source: planner }] -``` - -**See**: [Planning Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#12-planning-pattern) - -### 13. MapReduce Pattern - -Distribute processing across workers (map) and aggregate results (reduce) for large-scale data processing. - -**When to use**: Batch document processing, large dataset analysis, distributed computation. 
- -**Quick example**: - -```yaml -agents: - - name: splitter - outputs: [{ target: mapper-1 }, { target: mapper-2 }, { target: mapper-3 }] - - name: mapper-1 - inputs: [{ source: splitter }] - outputs: [{ target: reducer }] - - name: reducer - role: aggregator - strategy: concatenate - inputs: [{ source: mapper-1 }, { source: mapper-2 }, { source: mapper-3 }] -``` - -**See**: [MapReduce Pattern Details](https://github.com/aixgo-dev/aixgo/blob/main/docs/PATTERNS.md#13-mapreduce-pattern) - -## Pattern Comparison Matrix - -| Pattern | Complexity | Latency | Cost | Accuracy | Use When | -|---------|-----------|---------|------|----------|----------| -| Sequential | Low | High | Low | Medium | Ordered execution needed | -| Parallel | Low | Low | Medium | Medium | Independent tasks | -| Router | Medium | Low | Low | Medium | Cost optimization critical | -| Supervisor | Low | Medium | Low | Medium | Always (foundation) | -| Swarm | High | High | High | High | Creative problem solving | -| Hierarchical | High | Medium | Medium | Medium | Large agent teams | -| RAG | Medium | Medium | Medium | High | Knowledge grounding needed | -| Reflection | Medium | High | High | High | Quality > speed | -| Ensemble | Medium | High | High | High | Critical decisions | -| Classifier | Low | Low | Low | High | Categorization needed | -| Aggregation | Medium | Medium | Medium | High | Multi-source synthesis | -| Planning | Medium | Medium | Medium | Medium | Complex task decomposition | -| MapReduce | Medium | Low | Medium | Medium | Large-scale processing | - -## Pattern Selection Guide - -### By Use Case - -**Cost Optimization** - -- Router: 60-80% cost savings by routing to appropriate model tiers -- RAG: Reduce hallucinations, fewer retries -- Classifier: Efficient categorization before expensive processing - -**Speed/Performance** - -- Parallel: Concurrent execution reduces latency -- Router: Fast routing to appropriate handlers -- MapReduce: Distributed processing for throughput - 
-**Accuracy/Quality** - -- Ensemble: Multi-model consensus improves accuracy -- Reflection: Iterative refinement enhances quality -- RAG: Grounded responses reduce errors - -**Flexibility** - -- Swarm: Emergent behavior adapts to scenarios -- Supervisor: Central control enables dynamic orchestration -- Hierarchical: Scalable multi-level delegation - -**Complex Workflows** - -- Hierarchical: Multi-level task decomposition -- Sequential: Ordered multi-stage processing -- Planning: Strategic task breakdown - -### Decision Tree - -```text -Start: What's your primary goal? - -├─ Cost Optimization -│ ├─ Route by complexity? → Router Pattern -│ ├─ Ground in knowledge? → RAG Pattern -│ └─ Categorize first? → Classifier Pattern -│ -├─ Speed/Performance -│ ├─ Independent tasks? → Parallel Pattern -│ ├─ Large dataset? → MapReduce Pattern -│ └─ Route efficiently? → Router Pattern -│ -├─ Accuracy/Quality -│ ├─ Multi-model consensus? → Ensemble Pattern -│ ├─ Iterative improvement? → Reflection Pattern -│ └─ Knowledge grounding? → RAG Pattern -│ -├─ Flexibility -│ ├─ Emergent behavior? → Swarm Pattern -│ ├─ Central control? → Supervisor Pattern -│ └─ Multi-level hierarchy? → Hierarchical Pattern -│ -└─ Complex Workflows - ├─ Ordered stages? → Sequential Pattern - ├─ Task decomposition? → Planning Pattern - └─ Large team? → Hierarchical Pattern -``` - -## Combining Patterns - -Patterns can be composed to create sophisticated systems. Common combinations: - -**Router + RAG**: Route queries by complexity, use RAG for complex queries requiring knowledge grounding. - -**Classifier + Parallel**: Classify input, then run parallel analyses on each category. - -**Sequential + Reflection**: Multi-stage pipeline where each stage includes reflection for quality. - -**Hierarchical + MapReduce**: Hierarchical coordination of MapReduce workers for massive scale. - -**Ensemble + Aggregation**: Multiple models with sophisticated aggregation strategies. 
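For example, a Classifier + Parallel composition can be expressed directly in configuration. The sketch below reuses the fields from the quick examples above (`role`, `categories`, `condition`, `strategy`); all agent names are hypothetical:

```yaml
agents:
  # Classify first (Classifier Pattern)
  - name: intake-classifier
    role: classifier
    categories: [technical, business]
    outputs:
      - target: tech-analyzer
        condition: 'has_category("technical")'
      - target: biz-analyzer
        condition: 'has_category("business")'

  # Then analyze each category concurrently (Parallel Pattern)
  - name: tech-analyzer
    inputs: [{ source: intake-classifier }]
    outputs: [{ target: merger }]
  - name: biz-analyzer
    inputs: [{ source: intake-classifier }]
    outputs: [{ target: merger }]

  # Finally merge the parallel results (Aggregation Pattern)
  - name: merger
    role: aggregator
    strategy: consensus
    inputs: [{ source: tech-analyzer }, { source: biz-analyzer }]
```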
- -## Implementation Status - -**Production Ready (13 patterns)** - -All 13 patterns are stable and production-ready: - -- Supervisor, Sequential, Parallel, Router -- Swarm, Hierarchical, RAG, Reflection -- Ensemble, Classifier, Aggregation -- Planning, MapReduce - -**Future Patterns (2 patterns)** - -Under research and development: - -- Tool-Use Pattern: Agents with function calling capabilities -- Memory Pattern: Long-term memory and context retention - -## Phased Agent Startup with Dependencies - -**New in v0.2.3**: Aixgo now provides dependency-aware agent startup that eliminates race conditions and ensures agents initialize in the correct order. - -### The Problem: Startup Race Conditions - -In multi-agent systems, orchestrator agents often need to verify their dependencies are ready during startup. When all agents start concurrently, race conditions can occur where orchestrators try to use dependencies that haven't finished initializing. - -### The Solution: depends_on Field - -Declare explicit startup dependencies using the `depends_on` field: - -```yaml -agents: - # Phase 0: No dependencies - - name: database - role: producer - - # Phase 1: Depends on database - - name: cache - role: producer - depends_on: [database] - - # Phase 2: Depends on database and cache - - name: api - role: react - depends_on: [database, cache] -``` - -### How It Works - -1. **Topological Sort**: Aixgo uses Kahn's algorithm to compute startup order -1. **Phase Grouping**: Agents are grouped into phases (0, 1, 2, ...) based on dependencies -1. **Concurrent Within Phases**: Agents in the same phase start concurrently for performance -1. **Ready Polling**: Each phase waits for all agents to be `Ready()` before proceeding -1. 
**Timeout Protection**: Configurable timeout (30s default) prevents infinite waiting - -### Benefits - -**Eliminates Race Conditions** - -```yaml -# Before v0.2.3: Race condition possible -agents: - - name: orchestrator # Might start before workers - - name: worker1 - - name: worker2 - -# After v0.2.3: Guaranteed order -agents: - - name: worker1 - - name: worker2 - - name: orchestrator - depends_on: [worker1, worker2] # Starts only after workers are ready -``` - -**Parallel Performance** - -Agents without dependencies on each other start concurrently: - -```yaml -agents: - # These start concurrently (Phase 0) - - name: service-a - - name: service-b - - name: service-c - - # This starts after all above are ready (Phase 1) - - name: orchestrator - depends_on: [service-a, service-b, service-c] -``` - -**Clear Error Messages** - -If startup fails, you get precise information about which agent didn't become ready: - -```text -agent startup failed: agent 'database' not ready after 30s timeout -``` - -### Configuration - -#### Startup Timeout - -Control how long to wait for agents to become ready: - -```yaml -config: - agent_start_timeout: 45s # Default: 30s -``` - -#### Complex Dependencies - -Build multi-tier systems with complex dependency graphs: - -```yaml -agents: - # Tier 1: Foundation - - name: config-service - role: producer - - - name: database - role: producer - - # Tier 2: Services depending on Tier 1 - - name: cache - role: producer - depends_on: [database, config-service] - - - name: auth-service - role: react - depends_on: [database] - - # Tier 3: Application layer - - name: user-service - role: react - depends_on: [database, cache, auth-service] - - - name: order-service - role: react - depends_on: [database, cache, auth-service] - - # Tier 4: API Gateway - - name: api-gateway - role: react - depends_on: [user-service, order-service] -``` - -**Startup sequence**: - -1. Phase 0: `config-service`, `database` (concurrent) -1. 
Phase 1: `cache`, `auth-service` (concurrent, after Phase 0) -1. Phase 2: `user-service`, `order-service` (concurrent, after Phase 1) -1. Phase 3: `api-gateway` (after Phase 2) - -### Backward Compatibility - -Phased startup is **opt-in and backward compatible**: - -- **Without `depends_on`**: All agents start concurrently (v0.2.2 behavior) -- **With `depends_on`**: Agents start in phases based on dependencies - -### Best Practices - -1. **Declare True Dependencies**: Only use `depends_on` for agents that must be ready before others start -1. **Minimize Phases**: Group independent agents at the same tier for faster startup -1. **Fast Ready() Checks**: Keep `Ready()` implementations lightweight (simple boolean checks) -1. **Set Appropriate Timeouts**: Increase `agent_start_timeout` if agents need time to initialize - -### Cycle Detection - -Aixgo automatically detects circular dependencies and fails fast with a clear error: - -```yaml -# This configuration will fail validation -agents: - - name: agent-a - depends_on: [agent-b] - - name: agent-b - depends_on: [agent-a] # Circular dependency! -``` - -Error message: - -```text -circular dependency detected: agent-a -> agent-b -> agent-a -``` - -### Runtime Support - -Phased startup works across all runtime implementations: - -- **LocalRuntime**: Single-process deployments -- **Runtime**: Lightweight runtime -- **DistributedRuntime**: Multi-node gRPC-based deployments - -### Message-Based Dependencies (Legacy) - -Before v0.2.3, dependencies were implicit through inputs/outputs. This still works but doesn't guarantee startup order: - -```yaml -# Legacy approach: message routing (doesn't guarantee startup order) -agents: - - name: worker - outputs: [{ target: orchestrator }] - - name: orchestrator - inputs: [{ source: worker }] -``` - -**Recommendation**: Use `depends_on` for startup dependencies, keep `inputs`/`outputs` for message routing. 
- -## Execution Constraints - -### Max Rounds - -Limits the total number of workflow iterations: - -```yaml -supervisor: - max_rounds: 10 # Stop after 10 iterations -``` - -This prevents runaway workflows and ensures predictable resource usage. - -### Agent-Level Timeouts - -Set timeouts per agent for long-running operations: - -```yaml -agents: - - name: slow-processor - role: react - model: gpt-4-turbo - timeout: 30s # Fail if processing takes >30s -``` - -## Error Handling - -The supervisor provides built-in error handling: - -### Automatic Retry - -Configure retry behavior for transient failures: - -```yaml -agents: - - name: api-caller - role: react - retry: - max_attempts: 3 - backoff: exponential - initial_interval: 1s -``` - -### Graceful Degradation - -Agents can fail without crashing the entire system: - -```yaml -supervisor: - failure_mode: continue # or 'stop' to halt on first error -``` - -## Best Practices - -### 1. Keep Workflows Acyclic - -Avoid circular dependencies—messages should flow in one direction: - -**Bad:** - -```yaml -# Creates a cycle: A → B → C → A -agents: - - name: A - outputs: [{ target: B }] - inputs: [{ source: C }] # Circular! -``` - -**Good:** - -```yaml -# Acyclic: A → B → C -agents: - - name: A - outputs: [{ target: B }] - - name: B - inputs: [{ source: A }] - outputs: [{ target: C }] - - name: C - inputs: [{ source: B }] -``` - -### 2. Use Descriptive Names - -Agent names should describe their role: - -```yaml -agents: - - name: customer-data-enricher # Clear - - name: agent-1 # Unclear -``` - -### 3. Set Reasonable max_rounds - -Too high = wasted resources, too low = incomplete workflows: - -```yaml -supervisor: - max_rounds: 10 # Typical: 5-20 rounds -``` - -### 4. 
Monitor Performance - -Use observability to track: - -- Message latency between agents -- Agent processing time -- Bottlenecks in the workflow -- Pattern-specific metrics - -## Real-World Example: Content Moderation Pipeline - -This example combines multiple patterns (Classifier, Parallel, Aggregation, Planning): - -```yaml -supervisor: - name: content-moderation - max_rounds: 100 - -agents: - # Ingest content submissions - - name: content-ingestion - role: producer - interval: 100ms - outputs: - - target: content-classifier - - target: text-analyzer - - target: image-analyzer - - # Classify content type (Classifier Pattern) - - name: content-classifier - role: classifier - model: gpt-4-turbo - prompt: 'Classify content into categories' - strategy: multi-label - categories: - - user-generated - - commercial - - news - - entertainment - inputs: - - source: content-ingestion - outputs: - - target: decision-aggregator - - # Parallel analysis (Parallel Pattern) - - name: text-analyzer - role: react - model: gpt-4-turbo - prompt: 'Analyze text for policy violations' - inputs: - - source: content-ingestion - outputs: - - target: decision-aggregator - - - name: image-analyzer - role: react - model: gpt-4-turbo - prompt: 'Analyze images for inappropriate content' - inputs: - - source: content-ingestion - outputs: - - target: decision-aggregator - - # Aggregate results (Aggregation Pattern) - - name: decision-aggregator - role: aggregator - model: gpt-4-turbo - prompt: 'Combine classification, text, and image analysis using consensus strategy' - strategy: consensus - inputs: - - source: content-classifier - - source: text-analyzer - - source: image-analyzer - outputs: - - target: action-planner - - # Plan actions (Planning Pattern) - - name: action-planner - role: planner - model: gpt-4-turbo - prompt: 'Plan appropriate actions: approve, reject, or flag for human review' - strategy: chain-of-thought - inputs: - - source: decision-aggregator - outputs: - - target: 
action-handler - - # Execute decision - - name: action-handler - role: logger - inputs: - - source: action-planner -``` - -This pipeline: - -1. Ingests content at 10 submissions/second -2. Classifies content type with multi-label strategy (Classifier Pattern) -3. Analyzes text and images in parallel (Parallel Pattern) -4. Aggregates results using consensus strategy (Aggregation Pattern) -5. Plans actions with chain-of-thought reasoning (Planning Pattern) -6. Executes the appropriate action - -## Next Steps - -- **[Production Deployment](/guides/production-deployment)** - Deploy multi-agent systems to production -- **[Observability & Monitoring](/guides/observability)** - Track and debug complex workflows -- **[Type Safety & LLM Integration](/guides/type-safety)** - Build reliable agent interactions -- **[Examples Repository](https://github.com/aixgo-dev/aixgo/tree/main/examples)** - See working examples of all patterns diff --git a/web/content/guides/observability.md b/web/content/guides/observability.md deleted file mode 100644 index 0d7284a..0000000 --- a/web/content/guides/observability.md +++ /dev/null @@ -1,691 +0,0 @@ ---- -title: 'Observability & Monitoring' -description: 'Instrument Aixgo agents with OpenTelemetry, distributed tracing, and structured logging for production visibility.' -breadcrumb: 'Deployment' -category: 'Deployment' -weight: 9 ---- - -Production AI systems need comprehensive observability. This guide shows how to instrument Aixgo agents with OpenTelemetry, distributed tracing, metrics, and structured logging.
- -## Built-in Observability - -Aixgo includes comprehensive observability features out of the box: - -- **OpenTelemetry** - OTLP export for distributed tracing -- **Langfuse Integration** - LLM-specific analytics via OTLP -- **Prometheus Metrics** - Production monitoring and alerting -- **Health Checks** - Liveness and readiness probes - -### Enable with Configuration - -```yaml -# config/agents.yaml -supervisor: - name: coordinator - max_rounds: 10 - -observability: - tracing: true - service_name: 'aixgo-production' - exporter: 'otlp' - endpoint: 'localhost:4317' - sampling_rate: 1.0 # 100% of traces - -agents: - - name: analyzer - role: react - model: gpt-4-turbo -``` - -That's it. Distributed tracing is now enabled for your entire agent system. - -## Distributed Tracing - -### How Tracing Works - -Every message flow is automatically traced: - -```text -Request → Producer → Analyzer → Logger → Response - | | | | | - span1 span2 span3 span4 span5 -``` - -Each span includes: - -- Agent name -- Processing duration -- Input/output messages -- LLM API calls -- Tool executions -- Errors - -### Viewing Traces - -Aixgo exports traces to any OpenTelemetry-compatible backend: - -**Jaeger (Open Source):** - -```yaml -# docker-compose.yml -services: - jaeger: - image: jaegertracing/all-in-one:latest - ports: - - '4317:4317' # OTLP gRPC - - '16686:16686' # Jaeger UI -``` - -```bash -# Start Jaeger -docker-compose up -d - -# View traces at http://localhost:16686 -``` - -**Grafana Cloud:** - -```yaml -observability: - tracing: true - service_name: 'aixgo-prod' - exporter: 'otlp' - endpoint: 'https://otlp-gateway-prod-us-central-0.grafana.net/otlp' - headers: - authorization: 'Bearer ${GRAFANA_API_KEY}' -``` - -**Datadog:** - -```yaml -observability: - tracing: true - service_name: 'aixgo-prod' - exporter: 'datadog' - endpoint: 'http://datadog-agent:8126' -``` - -### Trace Attributes - -Aixgo automatically adds contextual attributes: - -```json -{ - "trace_id": 
"4bf92f3577b34da6a3ce929d0e0e4736", - "span_id": "00f067aa0ba902b7", - "service.name": "aixgo-production", - "agent.name": "analyzer", - "agent.role": "react", - "agent.model": "gpt-4-turbo", - "message.id": "msg-123", - "llm.provider": "openai", - "llm.model": "gpt-4-turbo", - "llm.tokens.input": 150, - "llm.tokens.output": 75, - "llm.latency_ms": 850, - "tool.name": "query_database", - "tool.duration_ms": 120 -} -``` - -## Structured Logging - -### Default JSON Logging - -Aixgo uses structured logging with automatic trace correlation: - -```go -import ( - "log/slog" - "os" - - "github.com/aixgo-dev/aixgo" -) - -func main() { - // Configure JSON logging - logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{ - Level: slog.LevelInfo, - })) - slog.SetDefault(logger) - - slog.Info("Starting Aixgo agent", - "version", "1.0.0", - "env", os.Getenv("ENV"), - ) - - if err := aixgo.Run("config/agents.yaml"); err != nil { - slog.Error("Agent failed", "error", err) - os.Exit(1) - } -} -``` - -**Output:** - -```json -{ - "time": "2025-11-16T10:30:00Z", - "level": "INFO", - "msg": "Starting Aixgo agent", - "version": "1.0.0", - "env": "production", - "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", - "span_id": "00f067aa0ba902b7" -} -``` - -### Contextual Logging - -Log with trace context automatically: - -```go -func processMessage(ctx context.Context, msg string) { - slog.InfoContext(ctx, "Processing message", - "message_length", len(msg), - "agent", "analyzer", - ) - // trace_id and span_id added automatically -} -``` - -## Metrics - -### Built-in Metrics - -Aixgo exports these metrics automatically: - -**Agent Metrics:** - -- `aixgo_agent_messages_total` - Total messages processed per agent -- `aixgo_agent_errors_total` - Total errors per agent -- `aixgo_agent_duration_seconds` - Processing duration histogram - -**LLM Metrics:** - -- `aixgo_llm_requests_total` - Total LLM API calls -- `aixgo_llm_tokens_input` - Input tokens consumed -- `aixgo_llm_tokens_output` - Output tokens generated -
`aixgo_llm_latency_seconds` - LLM API latency histogram - -**Supervisor Metrics:** - -- `aixgo_supervisor_rounds_total` - Total workflow rounds -- `aixgo_supervisor_active_agents` - Number of active agents -- `aixgo_supervisor_message_queue_size` - Pending messages - -### Prometheus Integration - -```yaml -# config/agents.yaml -observability: - tracing: true - metrics: true - metrics_port: 9090 # Prometheus scrape endpoint -``` - -**Prometheus configuration:** - -```yaml -# prometheus.yml -scrape_configs: - - job_name: 'aixgo' - static_configs: - - targets: ['localhost:9090'] -``` - -### Custom Metrics - -Add application-specific metrics: - -```go -import ( - "github.com/prometheus/client_golang/prometheus" - "github.com/prometheus/client_golang/prometheus/promauto" -) - -var ( - dataProcessed = promauto.NewCounter(prometheus.CounterOpts{ - Name: "myapp_data_processed_total", - Help: "Total data records processed", - }) - - processingDuration = promauto.NewHistogram(prometheus.HistogramOpts{ - Name: "myapp_processing_duration_seconds", - Help: "Time spent processing data", - Buckets: prometheus.DefBuckets, - }) -) - -func processData(data string) { - timer := prometheus.NewTimer(processingDuration) - defer timer.ObserveDuration() - - // Process data - // ... - - dataProcessed.Inc() -} -``` - -## Health Checks - -Aixgo exposes liveness and readiness endpoints for container orchestration: - -```yaml -# config/agents.yaml -observability: - health_port: 8080 # Health check endpoint -``` - -**Endpoints:** - -- `/healthz` - Liveness probe (is the process running?) -- `/readyz` - Readiness probe (is the service ready to accept traffic?) 
- -**Kubernetes example:** - -```yaml -# deployment.yaml -livenessProbe: - httpGet: - path: /healthz - port: 8080 - initialDelaySeconds: 5 - periodSeconds: 10 - -readinessProbe: - httpGet: - path: /readyz - port: 8080 - initialDelaySeconds: 5 - periodSeconds: 5 -``` - -## Integration with Observability Platforms - -### Grafana Stack (Loki + Tempo + Mimir) - -Complete observability setup: - -```yaml -# docker-compose.yml -services: - # Logs - loki: - image: grafana/loki:latest - ports: - - '3100:3100' - - # Traces - tempo: - image: grafana/tempo:latest - ports: - - '4317:4317' # OTLP gRPC - - '3200:3200' # Tempo UI - - # Metrics - prometheus: - image: prom/prometheus:latest - ports: - - '9090:9090' - volumes: - - ./prometheus.yml:/etc/prometheus/prometheus.yml - - # Dashboards - grafana: - image: grafana/grafana:latest - ports: - - '3000:3000' - environment: - - GF_AUTH_ANONYMOUS_ENABLED=true -``` - -**Aixgo configuration:** - -```yaml -observability: - tracing: true - metrics: true - service_name: 'aixgo-prod' - exporter: 'otlp' - endpoint: 'tempo:4317' - metrics_port: 9090 -``` - -### Datadog - -```yaml -# docker-compose.yml -services: - datadog-agent: - image: gcr.io/datadoghq/agent:latest - environment: - - DD_API_KEY=${DATADOG_API_KEY} - - DD_SITE=datadoghq.com - - DD_LOGS_ENABLED=true - - DD_APM_ENABLED=true - ports: - - '8126:8126' # APM -``` - -**Aixgo configuration:** - -```yaml -observability: - tracing: true - service_name: 'aixgo-prod' - exporter: 'datadog' - endpoint: 'datadog-agent:8126' -``` - -### Langfuse (LLM-Specific) - -Track LLM calls, costs, and performance: - -```yaml -observability: - tracing: true - llm_observability: - enabled: true - provider: 'langfuse' - endpoint: 'https://cloud.langfuse.com' - public_key: ${LANGFUSE_PUBLIC_KEY} - secret_key: ${LANGFUSE_SECRET_KEY} -``` - -Langfuse dashboard shows: - -- LLM call traces -- Token usage and costs -- Model performance comparisons -- Prompt engineering analytics - -## Debugging Multi-Agent 
Workflows - -### Trace Visualization - -View message flow across agents: - -```text -Request [trace_id: abc123] - ├─ producer (50ms) - │ └─ generate_data - ├─ analyzer (1.2s) - │ ├─ llm_call [gpt-4-turbo] (850ms) - │ └─ tool_call [query_db] (120ms) - └─ logger (30ms) - └─ persist_result - -Total: 1.28s -``` - -### Identifying Bottlenecks - -Sort spans by duration to find slow operations: - -| Agent | Operation | Duration | % of Total | -| -------- | ------------- | -------- | ---------- | -| analyzer | llm_call | 850ms | 66% | -| analyzer | tool_call | 120ms | 9% | -| producer | generate_data | 50ms | 4% | - -**Insight:** LLM call is the bottleneck (66% of time) - -**Actions:** - -- Use faster model (gpt-3.5-turbo instead of gpt-4) -- Optimize prompt length -- Add caching for repeated queries - -## Alerting - -### Prometheus Alerts - -```yaml -# alerts.yml -groups: - - name: aixgo - rules: - # High error rate - - alert: HighErrorRate - expr: rate(aixgo_agent_errors_total[5m]) > 0.1 - for: 5m - annotations: - summary: 'High error rate in {{ $labels.agent }}' - - # Slow LLM calls - - alert: SlowLLMCalls - expr: histogram_quantile(0.95, aixgo_llm_latency_seconds) > 5 - for: 10m - annotations: - summary: '95th percentile LLM latency > 5s' - - # High token usage - - alert: HighTokenUsage - expr: rate(aixgo_llm_tokens_input[1h]) > 100000 - for: 1h - annotations: - summary: 'High token consumption (cost concern)' -``` - -### Grafana Alerts - -Create dashboards with alerts: - -```json -{ - "dashboard": { - "title": "Aixgo Agent Monitoring", - "panels": [ - { - "title": "Agent Throughput", - "targets": [ - { - "expr": "rate(aixgo_agent_messages_total[5m])" - } - ], - "alert": { - "conditions": [ - { - "evaluator": { "type": "lt", "params": [10] }, - "query": { "params": ["A", "5m", "now"] } - } - ] - } - } - ] - } -} -``` - -## Best Practices - -### 1. 
Always Enable Tracing in Production - -```yaml -# config/agents-prod.yaml -observability: - tracing: true - sampling_rate: 0.1 # 10% sampling to reduce overhead -``` - -### 2. Use Structured Logging - -```go -// ❌ Bad: unstructured -log.Println("Error processing message:", err) - -// ✅ Good: structured -slog.Error("Message processing failed", - "error", err, - "agent", "analyzer", - "message_id", msgID, -) -``` - -### 3. Set SLOs and Monitor Them - -Define service level objectives: - -- **Latency:** P95 < 2 seconds -- **Error rate:** < 1% -- **Throughput:** > 100 msg/s - -Monitor with alerts. - -### 4. Correlate Logs with Traces - -Always log with context: - -```go -slog.InfoContext(ctx, "Processing started") -// trace_id and span_id automatically included -``` - -### 5. Track LLM Costs - -Monitor token usage to control costs: - -```go -// Aixgo tracks this automatically -// aixgo_llm_tokens_input -// aixgo_llm_tokens_output - -// Calculate cost -costPerToken := 0.00001 // $0.01 per 1K tokens -totalCost := totalTokens * costPerToken -``` - -## Troubleshooting Common Issues - -### Missing Traces - -**Symptom:** Traces not appearing in backend - -**Check:** - -1. Endpoint configured correctly -2. Network connectivity to OTLP endpoint -3. Firewall rules allow port 4317 (OTLP) -4. 
API keys/credentials set correctly - -**Debug:** - -```yaml -observability: - tracing: true - debug: true # Enable verbose OTLP logging -``` - -### High Overhead - -**Symptom:** Tracing impacting performance - -**Solution:** - -```yaml -observability: - tracing: true - sampling_rate: 0.1 # Sample 10% instead of 100% -``` - -### Incomplete Spans - -**Symptom:** Some spans missing from trace - -**Cause:** Long-running operations timing out - -**Solution:** - -```yaml -observability: - tracing: true - span_timeout: 60s # Increase timeout -``` - -## Example: Complete Observability Setup - -```yaml -# config/agents-prod.yaml -supervisor: - name: coordinator - max_rounds: 10 - -observability: - # Tracing - tracing: true - service_name: 'aixgo-production' - exporter: 'otlp' - endpoint: 'tempo:4317' - sampling_rate: 0.2 - - # Metrics - metrics: true - metrics_port: 9090 - - # LLM-specific - llm_observability: - enabled: true - provider: 'langfuse' - endpoint: 'https://cloud.langfuse.com' - public_key: ${LANGFUSE_PUBLIC_KEY} - secret_key: ${LANGFUSE_SECRET_KEY} - -agents: - - name: analyzer - role: react - model: gpt-4-turbo - prompt: 'Analyze the data' -``` - -**Application code:** - -```go -package main - -import ( - "log/slog" - "os" - "github.com/aixgo-dev/aixgo" - _ "github.com/aixgo-dev/aixgo/agents" -) - -func main() { - // Structured JSON logging - logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{ - Level: slog.LevelInfo, - })) - slog.SetDefault(logger) - - slog.Info("Starting Aixgo agent", - "version", "1.0.0", - "env", "production", - ) - - // Run with observability enabled - if err := aixgo.Run("config/agents-prod.yaml"); err != nil { - slog.Error("Agent failed", "error", err) - os.Exit(1) - } -} -``` - -**Result:** - -- ✅ Distributed traces in Tempo/Grafana -- ✅ Metrics in Prometheus -- ✅ LLM analytics in Langfuse -- ✅ Structured logs with trace correlation - -## Key Takeaways - -1. 
**Built-in observability** - OpenTelemetry included, no manual instrumentation -2. **Distributed tracing** - Track messages across multi-agent workflows -3. **Structured logging** - JSON logs with automatic trace correlation -4. **Metrics export** - Agent, LLM, and supervisor metrics -5. **Platform integration** - Works with Grafana, Datadog, Langfuse, etc. - -Production AI systems are complex. Comprehensive observability is not optional—it's essential. - -## Next Steps - -- **[Production Deployment](/guides/production-deployment)** - Deploy with monitoring -- **[Multi-Agent Orchestration](/guides/multi-agent-orchestration)** - Debug complex workflows -- **[Building Docker Images](/guides/docker-from-scratch)** - Container-based deployments diff --git a/web/content/guides/pattern-composition.md b/web/content/guides/pattern-composition.md deleted file mode 100644 index bbcb552..0000000 --- a/web/content/guides/pattern-composition.md +++ /dev/null @@ -1,835 +0,0 @@ ---- -title: 'Pattern Composition for Multi-Phase Workflows' -description: 'Compose existing orchestration patterns to build complex multi-phase workflows without custom orchestrators' -breadcrumb: 'Pattern Composition' -category: 'Orchestration' -weight: 20 ---- - -Aixgo provides 13 orchestration patterns that can be **composed** to build complex multi-phase workflows without writing custom orchestrators. This guide shows you how to combine existing patterns for sophisticated multi-stage processing. - -## Overview - -Instead of creating custom orchestration logic, compose existing patterns into multi-phase workflows: - -```text -Phase 1 (Parallel) → Validate → Phase 2 (Aggregate) → Validate → Phase 3 (Sequential) -``` - -Each phase uses an existing pattern. Validation gates ensure quality between phases. - -### The Composition Principle - -**Don't create new patterns when you can compose existing ones.** - -Aixgo's 13 patterns are building blocks. 
Most complex workflows are compositions of these patterns, not new patterns entirely. - -### Benefits of Composition - -- **Faster development**: Reuse proven patterns instead of building from scratch -- **Better reliability**: Each pattern is battle-tested in production -- **Easier maintenance**: Standard patterns are easier to understand and debug -- **Natural validation points**: Phase boundaries provide natural validation gates -- **Flexibility**: Reconfigure by swapping phases, not rewriting logic - -## Composing Patterns Programmatically - -### Example: Policy Analysis Workflow - -This three-phase workflow demonstrates pattern composition: - -1. **Phase 1 (Parallel)**: Extract data from multiple sources concurrently -2. **Validation Gate**: Ensure all required data extracted -3. **Phase 2 (Aggregation)**: Combine extracted data into coherent summary -4. **Validation Gate**: Verify aggregated output meets requirements -5. **Phase 3 (Sequential)**: Perform risk assessment on aggregated data - -```go -package main - -import ( - "context" - "fmt" - "time" - - "github.com/aixgo-dev/aixgo/pkg/agent" - "github.com/aixgo-dev/aixgo/pkg/patterns" - "github.com/aixgo-dev/aixgo/pkg/supervisor" -) - -func runPolicyAnalysisWorkflow(ctx context.Context, policyDocument string) error { - // Initialize executor - executor := supervisor.NewExecutor() - - // Phase 1: Parallel data extraction - phase1 := patterns.NewParallelPattern(executor, patterns.ParallelConfig{ - Timeout: 5 * time.Second, - FailFast: false, // Collect all results even if some fail - }) - - phase1Result, err := phase1.Execute(ctx, - []string{"data-agent", "summary-agent", "rights-agent"}, - agent.NewMessage(policyDocument)) - if err != nil { - return fmt.Errorf("phase 1 failed: %w", err) - } - - // Validation Gate 1: Ensure all data extracted - if err := validatePhase1(phase1Result); err != nil { - return fmt.Errorf("phase 1 validation failed: %w", err) - } - - // Phase 2: Aggregation of extracted data - 
aggregator := patterns.NewAggregationPattern(executor, patterns.AggregationConfig{ - Method: patterns.AggregationConsensus, - MinimumResponses: 2, - Timeout: 10 * time.Second, - }) - - phase2Result, err := aggregator.Execute(ctx, - []string{"policy-merger"}, - phase1Result.AggregatedOutput) - if err != nil { - return fmt.Errorf("phase 2 failed: %w", err) - } - - // Validation Gate 2: Verify aggregated summary - if err := validatePhase2(phase2Result); err != nil { - return fmt.Errorf("phase 2 validation failed: %w", err) - } - - // Phase 3: Sequential risk assessment - phase3 := patterns.NewSequentialPattern(executor, patterns.SequentialConfig{}) - - finalResult, err := phase3.Execute(ctx, - []string{"risk-assessor", "compliance-checker"}, - phase2Result.AggregatedOutput) - if err != nil { - return fmt.Errorf("phase 3 failed: %w", err) - } - - // Final validation - if err := validateFinalResult(finalResult); err != nil { - return fmt.Errorf("final validation failed: %w", err) - } - - fmt.Printf("Workflow completed successfully: %v\n", finalResult) - return nil -} - -// Validation functions use the Validatable interface -type Phase1Output struct { - DataPractices []string `json:"data_practices"` - Summary string `json:"summary"` - UserRights []string `json:"user_rights"` -} - -func (p Phase1Output) Validate() error { - if len(p.DataPractices) == 0 { - return fmt.Errorf("data practices required") - } - if p.Summary == "" { - return fmt.Errorf("summary required") - } - if len(p.UserRights) == 0 { - return fmt.Errorf("user rights required") - } - return nil -} - -func validatePhase1(result *patterns.PatternResult) error { - var output Phase1Output - if err := result.Unmarshal(&output); err != nil { - return err - } - return output.Validate() -} -``` - -## Common Composition Patterns - -### 1. Extract-Transform-Load (ETL) - -Classic data pipeline pattern using three sequential phases. 
- -**Structure:** - -```text -Parallel Extraction → Sequential Transform → Sequential Load -``` - -**Use when:** - -- Processing large datasets with multiple sources -- Data transformations are ordered and dependent -- Loading requires specific sequencing - -**YAML Configuration:** - -```yaml -supervisor: - name: etl-pipeline - -agents: - # Extraction Phase (Parallel) - - name: source-1-extractor - role: producer - outputs: - - target: transformer - - - name: source-2-extractor - role: producer - outputs: - - target: transformer - - - name: source-3-extractor - role: producer - outputs: - - target: transformer - - # Transform Phase (Sequential) - - name: transformer - role: react - model: gpt-4-turbo - prompt: 'Transform and normalize data' - inputs: - - source: source-1-extractor - - source: source-2-extractor - - source: source-3-extractor - outputs: - - target: validator - - - name: validator - role: react - model: gpt-4-turbo - prompt: 'Validate transformed data' - inputs: - - source: transformer - outputs: - - target: loader - - # Load Phase (Sequential) - - name: loader - role: logger - inputs: - - source: validator -``` - -### 2. Multi-Expert Analysis - -Combine parallel expert agents with consensus aggregation and sequential reporting. 
- -**Structure:** - -```text -Parallel Experts → Aggregation (Consensus) → Sequential Report -``` - -**Use when:** - -- Need diverse expert perspectives -- Consensus building improves quality -- Final report requires structured formatting - -**YAML Configuration:** - -```yaml -supervisor: - name: expert-analysis - -agents: - # Parallel Expert Phase - - name: technical-expert - role: react - model: gpt-4-turbo - prompt: 'Analyze from technical perspective' - outputs: - - target: consensus-builder - - - name: business-expert - role: react - model: gpt-4-turbo - prompt: 'Analyze from business perspective' - outputs: - - target: consensus-builder - - - name: legal-expert - role: react - model: gpt-4-turbo - prompt: 'Analyze from legal perspective' - outputs: - - target: consensus-builder - - # Aggregation Phase - - name: consensus-builder - role: aggregator - model: gpt-4-turbo - inputs: - - source: technical-expert - - source: business-expert - - source: legal-expert - outputs: - - target: report-generator - aggregator_config: - aggregation_strategy: consensus - consensus_threshold: 0.7 - - # Sequential Reporting Phase - - name: report-generator - role: react - model: gpt-4-turbo - prompt: 'Format final report with executive summary' - inputs: - - source: consensus-builder - outputs: - - target: report-validator - - - name: report-validator - role: react - model: gpt-4-turbo - prompt: 'Validate report completeness and accuracy' - inputs: - - source: report-generator - outputs: - - target: final-output -``` - -### 3. Iterative Refinement - -Use reflection pattern within each phase for quality improvement. 
- -**Structure:** - -```text -Sequential Draft → Reflection (Critic + Generator) → Sequential Publish -``` - -**Use when:** - -- Quality is paramount -- Iterative improvement adds significant value -- Time permits multiple refinement cycles - -**YAML Configuration:** - -```yaml -supervisor: - name: iterative-refinement - max_rounds: 10 - -agents: - # Draft Phase - - name: initial-drafter - role: react - model: gpt-4-turbo - prompt: 'Create initial draft' - outputs: - - target: critic - - # Reflection Phase - - name: critic - role: react - model: gpt-4-turbo - prompt: 'Critique and identify improvements' - inputs: - - source: initial-drafter - - source: refiner # Feedback loop - outputs: - - target: refiner - - - name: refiner - role: react - model: gpt-4-turbo - prompt: 'Improve based on critique' - inputs: - - source: critic - outputs: - - target: critic # Continue reflection loop - - target: publisher # Exit when satisfied - - # Publish Phase - - name: publisher - role: logger - inputs: - - source: refiner -``` - -### 4. Hierarchical Processing - -Multi-level delegation with aggregation at each level. 
- -**Structure:** - -```text -Parallel Teams → Team Aggregation → Hierarchical Summary -``` - -**Use when:** - -- Large number of agents (10+) -- Natural organizational hierarchy -- Multi-level decision making needed - -**YAML Configuration:** - -```yaml -supervisor: - name: hierarchical-processing - -agents: - # Team A (Parallel) - - name: team-a-worker-1 - role: react - model: gpt-4-turbo - outputs: - - target: team-a-lead - - - name: team-a-worker-2 - role: react - model: gpt-4-turbo - outputs: - - target: team-a-lead - - # Team A Aggregation - - name: team-a-lead - role: aggregator - model: gpt-4-turbo - inputs: - - source: team-a-worker-1 - - source: team-a-worker-2 - outputs: - - target: executive - aggregator_config: - aggregation_strategy: voting_majority - - # Team B (Parallel) - - name: team-b-worker-1 - role: react - model: gpt-4-turbo - outputs: - - target: team-b-lead - - - name: team-b-worker-2 - role: react - model: gpt-4-turbo - outputs: - - target: team-b-lead - - # Team B Aggregation - - name: team-b-lead - role: aggregator - model: gpt-4-turbo - inputs: - - source: team-b-worker-1 - - source: team-b-worker-2 - outputs: - - target: executive - aggregator_config: - aggregation_strategy: voting_majority - - # Executive Hierarchical Summary - - name: executive - role: aggregator - model: gpt-4-turbo - inputs: - - source: team-a-lead - - source: team-b-lead - outputs: - - target: final-report - aggregator_config: - aggregation_strategy: hierarchical -``` - -## Validation Between Phases - -Use the `Validatable` interface for type-safe phase validation. 
- -### Phase Output Structs - -Define clear contracts between phases: - -```go -// Phase 1 Output: Parallel extraction -type ExtractionOutput struct { - Sources []SourceData `json:"sources" validate:"required,dive"` - Metadata ExtractionMetadata `json:"metadata" validate:"required"` -} - -func (e ExtractionOutput) Validate() error { - if len(e.Sources) == 0 { - return fmt.Errorf("at least one source required") - } - if e.Metadata.Timestamp.IsZero() { - return fmt.Errorf("metadata timestamp required") - } - return nil -} - -// Phase 2 Output: Aggregation -type AggregationOutput struct { - Summary string `json:"summary" validate:"required,min=100"` - Confidence float64 `json:"confidence" validate:"required,gte=0,lte=1"` -} - -func (a AggregationOutput) Validate() error { - if a.Confidence < 0.7 { - return fmt.Errorf("confidence too low: %.2f < 0.70", a.Confidence) - } - return nil -} - -// Phase 3 Output: Risk assessment -type RiskAssessmentOutput struct { - RiskLevel string `json:"risk_level" validate:"required,oneof=low medium high"` - Factors []string `json:"factors" validate:"required"` - Recommendations []string `json:"recommendations" validate:"required"` -} - -func (r RiskAssessmentOutput) Validate() error { - if len(r.Factors) == 0 { - return fmt.Errorf("risk factors required") - } - if r.RiskLevel == "high" && len(r.Recommendations) == 0 { - return fmt.Errorf("recommendations required for high risk") - } - return nil -} -``` - -### Validation Gates - -Implement validation between each phase: - -```go -func validateExtractionPhase(result *patterns.PatternResult) error { - var output ExtractionOutput - if err := result.Unmarshal(&output); err != nil { - return fmt.Errorf("unmarshal failed: %w", err) - } - return output.Validate() -} - -func validateAggregationPhase(result *patterns.PatternResult) error { - var output AggregationOutput - if err := result.Unmarshal(&output); err != nil { - return fmt.Errorf("unmarshal failed: %w", err) - } - return 
output.Validate() -} - -func validateRiskAssessmentPhase(result *patterns.PatternResult) error { - var output RiskAssessmentOutput - if err := result.Unmarshal(&output); err != nil { - return fmt.Errorf("unmarshal failed: %w", err) - } - return output.Validate() -} -``` - -### Error Handling Between Phases - -Handle phase failures gracefully: - -```go -func executeWorkflow(ctx context.Context, input string) (*WorkflowResult, error) { - // Phase 1 - phase1Result, err := executePhase1(ctx, input) - if err != nil { - return nil, &PhaseError{ - Phase: 1, - Error: err, - RecoveryHint: "Check data source availability", - } - } - - if err := validateExtractionPhase(phase1Result); err != nil { - return nil, &ValidationError{ - Phase: 1, - Error: err, - Data: phase1Result, - } - } - - // Phase 2 - phase2Result, err := executePhase2(ctx, phase1Result) - if err != nil { - return nil, &PhaseError{ - Phase: 2, - Error: err, - RecoveryHint: "Consider relaxing consensus threshold", - } - } - - // Continue for remaining phases... -} - -type PhaseError struct { - Phase int - Error error - RecoveryHint string -} - -func (e *PhaseError) Error() string { - return fmt.Sprintf("phase %d failed: %v (hint: %s)", - e.Phase, e.Error, e.RecoveryHint) -} - -type ValidationError struct { - Phase int - Error error - Data any -} - -func (e *ValidationError) Error() string { - return fmt.Sprintf("phase %d validation failed: %v", - e.Phase, e.Error) -} -``` - -## When to Compose vs. 
Create New Pattern - -### Compose Existing Patterns When - -- Workflow is sequential phases -- Each phase uses existing pattern logic -- Validation/gates are simple checks -- Patterns can be independently configured -- Standard orchestration patterns apply - -**Example:** ETL pipeline, multi-expert analysis, hierarchical review - -### Create New Pattern When - -- Fundamentally new execution model needed -- Custom state management across agents required -- Novel coordination logic not captured by existing patterns -- Pattern will be reused across many projects -- Existing composition would be overly complex - -**Example:** Custom negotiation protocol, domain-specific coordination - -### Decision Matrix - -| Scenario | Compose | New Pattern | -|----------|---------|-------------| -| 3-phase workflow with standard patterns | ✅ | ❌ | -| Multi-level aggregation | ✅ | ❌ | -| Custom agent negotiation protocol | ❌ | ✅ | -| Sequential + Parallel + Aggregation | ✅ | ❌ | -| Novel state machine coordination | ❌ | ✅ | -| ETL with validation gates | ✅ | ❌ | -| Custom distributed consensus algorithm | ❌ | ✅ | - -## Performance Optimization - -### Minimize Phase Transitions - -Each phase transition adds overhead. 
Combine phases when possible: - -**Before (3 phases):** - -```text -Extract → Validate → Transform → Validate → Load -``` - -**After (2 phases):** - -```text -Extract + Validate → Transform + Validate + Load -``` - -### Parallel Phase Execution - -When phases are independent, run them in parallel: - -```go -// Run independent phases concurrently -var wg sync.WaitGroup -var phase1Result, phase2Result *patterns.PatternResult -var phase1Err, phase2Err error - -wg.Add(2) - -go func() { - defer wg.Done() - phase1Result, phase1Err = executePhase1(ctx, input) -}() - -go func() { - defer wg.Done() - phase2Result, phase2Err = executePhase2(ctx, input) -}() - -wg.Wait() - -if phase1Err != nil || phase2Err != nil { - return handleErrors(phase1Err, phase2Err) -} -``` - -### Cache Phase Results - -For repeated workflows, cache phase results: - -```go -type PhaseCache struct { - cache map[string]*patterns.PatternResult - mu sync.RWMutex -} - -func (c *PhaseCache) Get(key string) (*patterns.PatternResult, bool) { - c.mu.RLock() - defer c.mu.RUnlock() - result, ok := c.cache[key] - return result, ok -} - -func (c *PhaseCache) Set(key string, result *patterns.PatternResult) { - c.mu.Lock() - defer c.mu.Unlock() - c.cache[key] = result -} - -// Usage -cacheKey := hashInput(input) -if cached, ok := phaseCache.Get(cacheKey); ok { - return cached, nil -} - -result, err := executePhase1(ctx, input) -if err == nil { - phaseCache.Set(cacheKey, result) -} -``` - -## Real-World Example: Document Processing Pipeline - -Complete example combining multiple patterns for document processing. - -### Requirements - -1. Extract text from multiple document formats (PDF, Word, HTML) -2. Classify document type and sensitivity -3. Parallel analysis: summarization, entity extraction, sentiment -4. Aggregate analysis results -5. Risk assessment and compliance check -6. 
Generate final report - -### Implementation - -```go -func processDocument(ctx context.Context, doc Document) (*Report, error) { - executor := supervisor.NewExecutor() - - // Phase 1: Parallel text extraction - extractors := []string{ - "pdf-extractor", - "word-extractor", - "html-extractor", - } - - phase1 := patterns.NewParallelPattern(executor, patterns.ParallelConfig{ - Timeout: 10 * time.Second, - FailFast: false, - }) - - extractionResult, err := phase1.Execute(ctx, extractors, - agent.NewMessage(doc)) - if err != nil { - return nil, fmt.Errorf("extraction failed: %w", err) - } - - // Phase 2: Classification (Router pattern) - classifier := patterns.NewRouterPattern(executor, patterns.RouterConfig{ - RoutingStrategy: patterns.ClassificationBased, - }) - - classificationResult, err := classifier.Execute(ctx, - []string{"document-classifier"}, - extractionResult.AggregatedOutput) - if err != nil { - return nil, fmt.Errorf("classification failed: %w", err) - } - - // Phase 3: Parallel analysis - analyzers := []string{ - "summarizer", - "entity-extractor", - "sentiment-analyzer", - } - - phase3 := patterns.NewParallelPattern(executor, patterns.ParallelConfig{ - Timeout: 15 * time.Second, - }) - - analysisResult, err := phase3.Execute(ctx, analyzers, - classificationResult.AggregatedOutput) - if err != nil { - return nil, fmt.Errorf("analysis failed: %w", err) - } - - // Phase 4: Aggregation - aggregator := patterns.NewAggregationPattern(executor, patterns.AggregationConfig{ - Method: patterns.AggregationConsensus, - MinimumResponses: 2, - }) - - aggregatedResult, err := aggregator.Execute(ctx, - []string{"analysis-aggregator"}, - analysisResult.AggregatedOutput) - if err != nil { - return nil, fmt.Errorf("aggregation failed: %w", err) - } - - // Phase 5: Sequential risk and compliance - riskCheckers := []string{ - "risk-assessor", - "compliance-checker", - } - - phase5 := patterns.NewSequentialPattern(executor, patterns.SequentialConfig{}) - - riskResult, err := 
phase5.Execute(ctx, riskCheckers, - aggregatedResult.AggregatedOutput) - if err != nil { - return nil, fmt.Errorf("risk assessment failed: %w", err) - } - - // Phase 6: Report generation - reportGenerator := patterns.NewSequentialPattern(executor, patterns.SequentialConfig{}) - - finalResult, err := reportGenerator.Execute(ctx, - []string{"report-generator"}, - riskResult.AggregatedOutput) - if err != nil { - return nil, fmt.Errorf("report generation failed: %w", err) - } - - var report Report - if err := finalResult.Unmarshal(&report); err != nil { - return nil, fmt.Errorf("unmarshal report failed: %w", err) - } - - return &report, nil -} -``` - -## Best Practices - -1. **Define clear phase boundaries**: Each phase should have well-defined inputs and outputs -2. **Validate between phases**: Use `Validatable` interface for type-safe validation -3. **Handle failures gracefully**: Provide recovery hints and context in errors -4. **Cache when appropriate**: Reuse expensive phase results for identical inputs -5. **Monitor phase performance**: Track latency and success rates per phase -6. **Document phase contracts**: Clear documentation of expected inputs/outputs -7. **Use appropriate patterns**: Match pattern to phase requirements (parallel vs sequential) -8. 
**Consider cost**: LLM-powered patterns vs deterministic patterns based on needs - -## See Also - -- [Multi-Agent Orchestration Guide](./multi-agent-orchestration/) - All 13 orchestration patterns -- [Sequential Pattern](./multi-agent-orchestration/#sequential-pattern) - Ordered execution -- [Parallel Pattern](./multi-agent-orchestration/#parallel-pattern) - Concurrent execution -- [Aggregation Pattern](./multi-agent-orchestration/#aggregation-pattern) - Multi-source synthesis -- [Validation with Retry](./validation-with-retry/) - Automatic validation and retry -- [Multi-Phase Workflow Example](../../examples/multi-phase-workflow/) - Complete working example diff --git a/web/content/guides/production-deployment.md b/web/content/guides/production-deployment.md deleted file mode 100644 index 3693f19..0000000 --- a/web/content/guides/production-deployment.md +++ /dev/null @@ -1,557 +0,0 @@ ---- -title: 'Production Deployment' -description: 'Deploy Aixgo agents to production with best practices for scaling and monitoring.' -breadcrumb: 'Deployment' -category: 'Deployment' -weight: 8 ---- - -Aixgo is built for production deployment. This guide covers deployment patterns, best practices, and strategies for running AI agents at scale with enterprise security and full -observability. - -## Deployment Patterns - -### Pattern 1: Single Binary on Cloud Run / Lambda - -The simplest production deployment: compile to a single binary and deploy to serverless platforms. - -**Pros:** - -- Zero infrastructure management -- Auto-scaling included -- Pay-per-request pricing -- <100ms cold start - -**Best for:** - -- API endpoints -- Event-driven workflows -- Low to medium throughput (<1000 req/s) - -#### Example: Cloud Run - -```dockerfile -# Build stage -FROM golang:1.21 AS builder -WORKDIR /app -COPY go.mod go.sum ./ -RUN go mod download -COPY . . 
-RUN CGO_ENABLED=0 GOOS=linux go build -o agent main.go - -# Runtime stage -FROM scratch -COPY --from=builder /app/agent /agent -COPY --from=builder /app/config/ /config/ -CMD ["/agent"] -``` - -```bash -# Build and deploy -docker build -t gcr.io/my-project/aixgo-agent . -docker push gcr.io/my-project/aixgo-agent -gcloud run deploy aixgo-agent \ - --image gcr.io/my-project/aixgo-agent \ - --platform managed \ - --region us-central1 \ - --allow-unauthenticated -``` - -**Configuration:** - -```yaml -# config/agents.yaml -supervisor: - name: coordinator - mode: local # Single process - max_rounds: 10 -``` - -### Pattern 2: Container on Kubernetes - -For higher control and customization, deploy as containers on Kubernetes. - -**Pros:** - -- Full control over scaling -- Persistent connections -- Custom networking -- Multi-region support - -**Best for:** - -- High throughput (>1000 req/s) -- Stateful workflows -- Complex networking requirements - -#### Example: Kubernetes Deployment - -```yaml -# deployment.yaml -apiVersion: apps/v1 -kind: Deployment -metadata: - name: aixgo-agent -spec: - replicas: 3 - selector: - matchLabels: - app: aixgo-agent - template: - metadata: - labels: - app: aixgo-agent - spec: - containers: - - name: agent - image: my-registry/aixgo-agent:latest - ports: - - containerPort: 8080 - env: - - name: CONFIG_PATH - value: '/config/agents.yaml' - - name: LOG_LEVEL - value: 'info' - resources: - requests: - memory: '128Mi' - cpu: '100m' - limits: - memory: '512Mi' - cpu: '500m' - livenessProbe: - httpGet: - path: /health - port: 8080 - initialDelaySeconds: 10 - periodSeconds: 30 - readinessProbe: - httpGet: - path: /ready - port: 8080 - initialDelaySeconds: 5 - periodSeconds: 10 ---- -apiVersion: v1 -kind: Service -metadata: - name: aixgo-agent -spec: - selector: - app: aixgo-agent - ports: - - port: 80 - targetPort: 8080 - type: LoadBalancer -``` - -### Pattern 3: Edge Deployment - -Deploy to edge devices, IoT gateways, or resource-constrained 
environments. - -**Pros:** - -- Minimal footprint (<20MB binary) -- No external dependencies -- Works offline -- Low latency (local processing) - -**Best for:** - -- IoT devices -- Edge computing -- Offline-first applications -- Latency-sensitive workloads - -#### Example: Raspberry Pi Deployment - -```bash -# Cross-compile for ARM -GOOS=linux GOARCH=arm64 go build -o agent-arm64 main.go - -# Deploy to device -scp agent-arm64 pi@192.168.1.100:/home/pi/aixgo/ -scp config/agents.yaml pi@192.168.1.100:/home/pi/aixgo/config/ - -# Run as systemd service -ssh pi@192.168.1.100 'sudo systemctl start aixgo-agent' -``` - -**Systemd service file:** - -```ini -# /etc/systemd/system/aixgo-agent.service -[Unit] -Description=Aixgo Agent -After=network.target - -[Service] -Type=simple -User=pi -WorkingDirectory=/home/pi/aixgo -ExecStart=/home/pi/aixgo/agent-arm64 -Restart=always -RestartSec=10 - -[Install] -WantedBy=multi-user.target -``` - -## Configuration Management - -### Environment-Based Configs - -Use different configs for different environments: - -```go -// main.go -package main - -import ( - "fmt" - "os" - "github.com/aixgo-dev/aixgo" - _ "github.com/aixgo-dev/aixgo/agents" -) - -func main() { - env := os.Getenv("ENV") - if env == "" { - env = "local" - } - - configPath := fmt.Sprintf("config/agents-%s.yaml", env) - if err := aixgo.Run(configPath); err != nil { - panic(err) - } -} -``` - -**Config files:** - -- `config/agents-local.yaml` - Development -- `config/agents-staging.yaml` - Staging -- `config/agents-prod.yaml` - Production - -### Secrets Management - -Never hardcode API keys or secrets. Use environment variables or secret managers. 
- -```yaml -# config/agents-prod.yaml -supervisor: - name: coordinator - -agents: - - name: analyzer - role: react - model: gpt-4-turbo - api_key: ${GROK_API_KEY} # From environment - prompt: 'Analyze the data' -``` - -```bash -# Cloud Run -gcloud run deploy aixgo-agent \ - --set-env-vars GROK_API_KEY=your-secret-key - -# Kubernetes Secret -kubectl create secret generic aixgo-secrets \ - --from-literal=GROK_API_KEY=your-secret-key - -# Reference in deployment -env: -- name: GROK_API_KEY - valueFrom: - secretKeyRef: - name: aixgo-secrets - key: GROK_API_KEY -``` - -## Health Checks & Monitoring - -### Health Endpoints - -Implement health check endpoints for orchestration platforms: - -```go -// health.go -package main - -import ( - "net/http" - "time" -) - -func healthHandler(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(http.StatusOK) - w.Write([]byte("OK")) -} - -func readyHandler(w http.ResponseWriter, r *http.Request) { - // Check if agents are initialized - if !agentsReady() { - w.WriteHeader(http.StatusServiceUnavailable) - return - } - w.WriteHeader(http.StatusOK) - w.Write([]byte("READY")) -} - -func main() { - http.HandleFunc("/health", healthHandler) - http.HandleFunc("/ready", readyHandler) - - go func() { - http.ListenAndServe(":8080", nil) - }() - - // Start agents - aixgo.Run("config/agents.yaml") -} -``` - -### Metrics & Logging - -Use structured logging and metrics for observability: - -```go -import ( - "log/slog" - "os" -) - -func main() { - logger := slog.New(slog.NewJSONHandler(os.Stdout, nil)) - slog.SetDefault(logger) - - slog.Info("Starting Aixgo agent", "env", os.Getenv("ENV")) - - if err := aixgo.Run("config/agents.yaml"); err != nil { - slog.Error("Agent failed", "error", err) - os.Exit(1) - } -} -``` - -## Scaling Strategies - -### Vertical Scaling - -Increase resources for a single instance: - -```yaml -# Kubernetes resources -resources: - requests: - memory: '256Mi' # Increase from 128Mi - cpu: '200m' # Increase from 100m - 
limits: - memory: '1Gi' - cpu: '1000m' -``` - -**When to use:** - -- Single-instance bottleneck -- Memory-intensive LLM operations -- Before horizontal scaling - -### Horizontal Scaling - -Add more instances: - -```yaml -# Kubernetes HPA -apiVersion: autoscaling/v2 -kind: HorizontalPodAutoscaler -metadata: - name: aixgo-agent-hpa -spec: - scaleTargetRef: - apiVersion: apps/v1 - kind: Deployment - name: aixgo-agent - minReplicas: 2 - maxReplicas: 10 - metrics: - - type: Resource - resource: - name: cpu - target: - type: Utilization - averageUtilization: 70 -``` - -**When to use:** - -- High request volume -- Redundancy requirements -- Multi-region deployment - -## Best Practices - -### 1. Use Minimal Base Images - -Prefer `FROM scratch` or `alpine` for smallest attack surface: - -```dockerfile -FROM scratch # 0MB base -# or -FROM alpine:latest # ~5MB base -``` - -### 2. Enable Observability from Day One - -Configure OpenTelemetry before deploying: - -```yaml -# config/agents-prod.yaml -observability: - tracing: true - service_name: 'aixgo-prod' - exporter: 'otlp' - endpoint: 'otel-collector:4317' -``` - -### 3. Set Resource Limits - -Always define resource requests and limits: - -```yaml -resources: - requests: # Minimum guaranteed - memory: '128Mi' - cpu: '100m' - limits: # Maximum allowed - memory: '512Mi' - cpu: '500m' -``` - -### 4. Use Rolling Updates - -Deploy with zero downtime: - -```yaml -# Kubernetes deployment strategy -strategy: - type: RollingUpdate - rollingUpdate: - maxSurge: 1 - maxUnavailable: 0 -``` - -### 5. 
Implement Graceful Shutdown - -Handle termination signals properly: - -```go -import ( - "context" - "os" - "os/signal" - "syscall" - "time" -) - -func main() { - ctx, cancel := context.WithCancel(context.Background()) - - // Handle shutdown signals - sigChan := make(chan os.Signal, 1) - signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM) - - go func() { - <-sigChan - slog.Info("Shutdown signal received") - cancel() - }() - - // Run agents with context - if err := aixgo.RunWithContext(ctx, "config/agents.yaml"); err != nil { - slog.Error("Agent failed", "error", err) - os.Exit(1) - } -} -``` - -### 6. Monitor Cold Start Times - -Track startup performance: - -```go -import "time" - -func main() { - startTime := time.Now() - - if err := aixgo.Run("config/agents.yaml"); err != nil { - panic(err) - } - - slog.Info("Agent started", "duration_ms", time.Since(startTime).Milliseconds()) -} -``` - -Target: <100ms for serverless, <1s for containers - -### 7. Use Configuration Validation - -Validate configs before deployment: - -```bash -# Pre-deployment validation -go run main.go --validate-config config/agents-prod.yaml -``` - -## Performance Benchmarks - -Real-world production metrics: - -| Metric | Python (LangChain) | Aixgo | Improvement | -| --------------- | ------------------ | ------------ | ------------------ | -| Container Size | 1.2GB | <20MB | 60x smaller | -| Cold Start | 45 seconds | <100ms | 450x faster | -| Memory Baseline | 512MB | 50MB | 10x more efficient | -| Throughput | 500-1,000 req/s | 10,000 req/s | 10-20x higher | - -## Troubleshooting - -### High Memory Usage - -**Symptom:** OOM kills, high memory consumption - -**Solutions:** - -- Reduce max_rounds to limit workflow iterations -- Implement message batching -- Increase memory limits -- Use pagination for large datasets - -### Slow Response Times - -**Symptom:** High P99 latency - -**Solutions:** - -- Enable distributed tracing to identify bottlenecks -- Optimize LLM prompts (shorter = faster) 
-- Use faster models (gpt-3.5-turbo vs gpt-4) -- Add caching layer for repeated queries - -### Failed Health Checks - -**Symptom:** Pods restarting, traffic not routing - -**Solutions:** - -- Increase initialDelaySeconds for slow startup -- Check agent initialization logic -- Verify configuration is valid -- Review logs for startup errors - -## Next Steps - -- **[Observability & Monitoring](/guides/observability)** - Set up comprehensive monitoring -- **[Building Docker Images](/guides/docker-from-scratch)** - Optimize container builds -- **[Single Binary vs Distributed](/guides/single-vs-distributed)** - Understand scaling patterns diff --git a/web/content/guides/provider-comparison.md b/web/content/guides/provider-comparison.md deleted file mode 100644 index 8aa6660..0000000 --- a/web/content/guides/provider-comparison.md +++ /dev/null @@ -1,784 +0,0 @@ ---- -title: 'Provider Comparison' -description: 'Compare vectorstore and embedding providers to choose the right stack' -category: 'Reference' -weight: 9 ---- - -Choosing the right combination of vectorstore and embedding provider is critical for performance, cost, and developer experience. This guide helps you make informed decisions based on your requirements. - -## Why Provider Choice Matters - -Your choice of vectorstore and embedding provider affects: - -- **Performance**: Query latency, throughput, and scalability -- **Cost**: Development costs, API fees, and infrastructure expenses -- **Developer Experience**: Setup complexity, debugging tools, and documentation -- **Production Readiness**: Reliability, monitoring, and operational overhead - -The good news: Aixgo's abstraction layer lets you start simple and upgrade later without code changes. 
- -## Vector Store Providers - -### Detailed Comparison - -| Feature | Memory | Firestore | Qdrant | pgvector | -|---------|--------|-----------|--------|----------| -| **Persistence** | No | Yes | Yes | Yes | -| **Scalability** | 10K docs | Unlimited | Very High | High | -| **Setup Complexity** | None | Medium | Medium | Medium | -| **Cost** | Free | Pay-per-use | Self-host/Cloud | Self-host | -| **Collections** | Yes | Yes | Yes | Yes | -| **TTL Support** | Yes | Yes | Yes | No | -| **Multi-tenancy** | Yes | Yes | Yes | Yes | -| **Query Speed** | Very Fast | Fast | Very Fast | Fast | -| **Distributed** | No | Yes | Yes | Yes | -| **Backup/Recovery** | No | Automatic | Manual | Manual | -| **Best For** | Dev/Test | Production | High Performance | PostgreSQL Apps | - -### Memory - -**Ideal for**: Development, testing, prototyping - -```go -import "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory" - -store, err := memory.New() -if err != nil { - log.Fatal(err) -} - -collection := store.Collection("my-data") -``` - -**Pros:** - -- Zero setup required -- Blazing fast (in-memory) -- Perfect for development and testing -- No external dependencies - -**Cons:** - -- Data lost on restart -- Limited to single process -- Not suitable for production -- Memory constraints (typically 10K docs max) - -**When to use:** - -- Local development -- Integration tests -- Proof of concepts -- Learning and experimentation - -### Firestore - -**Ideal for**: Managed production deployments, serverless applications - -```go -import ( - // Alias the aixgo package so it does not collide with the Cloud SDK, - // which shares the package name "firestore". - fsvec "github.com/aixgo-dev/aixgo/pkg/vectorstore/firestore" - - "cloud.google.com/go/firestore" -) - -client, err := firestore.NewClient(ctx, "project-id") -if err != nil { - log.Fatal(err) -} - -store, err := fsvec.New(ctx, client) -if err != nil { - log.Fatal(err) -} - -collection := store.Collection("my-data") -``` - -**Pros:** - -- Fully managed (no ops overhead) -- Automatic scaling -- Built-in backup and recovery -- Global distribution -- Strong consistency guarantees -- Generous free tier -
-**Cons:** - -- Vendor lock-in (GCP only) -- Pay-per-operation pricing -- Query complexity limitations -- Network latency for queries - -**Pricing** (approximate): - -- Free tier: 1 GB storage, 50K reads/day, 20K writes/day -- Beyond free tier: $0.18/GB/month storage, $0.06 per 100K reads - -**When to use:** - -- Production applications on GCP -- Serverless deployments (Cloud Run, Cloud Functions) -- Multi-region applications -- When you want zero operational overhead - -### Qdrant - -**Ideal for**: High-performance production, large-scale deployments - -**Status**: Coming soon - -```go -import "github.com/aixgo-dev/aixgo/pkg/vectorstore/qdrant" - -store, err := qdrant.New(ctx, qdrant.Config{ - URL: "http://localhost:6333", -}) -if err != nil { - log.Fatal(err) -} - -collection := store.Collection("my-data") -``` - -**Pros:** - -- Extremely fast queries -- Rich filtering capabilities -- Excellent documentation -- Active development -- Cloud or self-hosted options -- REST and gRPC APIs - -**Cons:** - -- Requires infrastructure setup -- Self-hosting operational overhead -- Cloud pricing can be expensive at scale - -**Pricing** (approximate): - -- Self-hosted: Free (infrastructure costs only) -- Qdrant Cloud: Starting at $25/month for 1GB - -**When to use:** - -- High-throughput applications -- Complex filtering requirements -- Large datasets (millions of vectors) -- When you need maximum performance -- Self-hosted infrastructure preferred - -### pgvector - -**Ideal for**: PostgreSQL-based applications, existing database infrastructure - -**Status**: Coming soon - -```go -import "github.com/aixgo-dev/aixgo/pkg/vectorstore/pgvector" - -store, err := pgvector.New(ctx, "postgresql://user:pass@localhost/db") -if err != nil { - log.Fatal(err) -} - -collection := store.Collection("my-data") -``` - -**Pros:** - -- Leverage existing PostgreSQL infrastructure -- ACID transactions -- Mature ecosystem -- Familiar SQL interface -- Great for hybrid relational + vector data - 
-**Cons:** - -- Manual scaling required -- Performance depends on PostgreSQL tuning -- No built-in TTL support -- Less optimized for pure vector search - -**When to use:** - -- Already using PostgreSQL -- Need transactional consistency -- Hybrid relational/vector queries -- Want to avoid additional infrastructure - -## Embedding Providers - -### Detailed Comparison - -| Provider | Quality | Speed | Cost | Dimensions | Best For | -|----------|---------|-------|------|------------|----------| -| **HuggingFace API** | Good-Excellent | Medium | Free | 384-1024 | Development | -| **HuggingFace TEI** | Good-Excellent | Very Fast | Self-host | 384-1024 | Production | -| **OpenAI** | Excellent | Fast | $0.02-0.13/1M tokens | 1536-3072 | Production | - -### HuggingFace API - -**Ideal for**: Development, testing, learning - -```go -import "github.com/aixgo-dev/aixgo/pkg/embeddings/huggingface" - -embedder, err := huggingface.New(ctx, huggingface.Config{ - APIKey: os.Getenv("HF_API_KEY"), - Model: "sentence-transformers/all-MiniLM-L6-v2", -}) -if err != nil { - log.Fatal(err) -} - -embedding, err := embedder.EmbedText(ctx, "Hello world") -``` - -**Popular Models:** - -- `sentence-transformers/all-MiniLM-L6-v2` (384 dims, fast, good quality) -- `BAAI/bge-small-en-v1.5` (384 dims, excellent for English) -- `intfloat/multilingual-e5-base` (768 dims, multilingual) - -**Pros:** - -- Free tier available -- Many model choices -- Good quality embeddings -- Easy to get started - -**Cons:** - -- API rate limits -- Network latency -- Potential availability issues -- Slower than self-hosted - -**Pricing:** - -- Free tier with rate limits -- Paid tiers starting at $9/month - -**When to use:** - -- Development and prototyping -- Low-volume applications -- Budget-conscious projects -- Testing different models - -### HuggingFace TEI (Text Embeddings Inference) - -**Ideal for**: Production, high-throughput, cost-sensitive deployments - -```go -import 
"github.com/aixgo-dev/aixgo/pkg/embeddings/huggingface" - -embedder, err := huggingface.New(ctx, huggingface.Config{ - BaseURL: "http://localhost:8080", - Model: "BAAI/bge-small-en-v1.5", -}) -if err != nil { - log.Fatal(err) -} -``` - -**Deployment** (Docker): - -```bash -docker run -p 8080:80 \ - -v $PWD/data:/data \ - ghcr.io/huggingface/text-embeddings-inference:latest \ - --model-id BAAI/bge-small-en-v1.5 -``` - -**Pros:** - -- Very fast (optimized inference) -- No API costs (after infrastructure) -- Batch processing support -- GPU acceleration available -- Full control over deployment - -**Cons:** - -- Requires infrastructure setup -- Operational overhead -- GPU costs for maximum performance -- Manual scaling required - -**Cost Estimate:** - -- CPU instance: ~$50-100/month (Cloud Run, ECS) -- GPU instance: ~$200-500/month (for high throughput) -- Unlimited embeddings after infrastructure cost - -**When to use:** - -- Production applications -- High-volume embedding generation -- Cost optimization at scale -- When you need predictable latency - -### OpenAI - -**Ideal for**: Production applications prioritizing quality over cost - -```go -import "github.com/aixgo-dev/aixgo/pkg/embeddings/openai" - -embedder, err := openai.New(ctx, openai.Config{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: "text-embedding-3-small", -}) -if err != nil { - log.Fatal(err) -} -``` - -**Available Models:** - -- `text-embedding-3-small` (1536 dims, $0.02/1M tokens) -- `text-embedding-3-large` (3072 dims, $0.13/1M tokens) -- `text-embedding-ada-002` (1536 dims, $0.10/1M tokens, legacy) - -**Pros:** - -- Excellent quality -- Reliable infrastructure -- Fast response times -- Simple API -- No operational overhead - -**Cons:** - -- Costs scale with usage -- Vendor lock-in -- API rate limits (tier-based) -- Less control over model - -**Pricing Examples** (text-embedding-3-small at $0.02/1M tokens): - -- 1K documents: ~$0.002 (essentially free) -- 100K documents: ~$0.20 -- 1M documents: 
~$2.00 -- 10M documents: ~$20.00 - -**When to use:** - -- Production applications -- When quality is critical -- Moderate to high volume -- When you want reliable, managed service - -## Recommended Stacks - -### Development Stack - -**Best for**: Local development, prototyping, learning - -```yaml -Embedding: HuggingFace API (free tier) -Vectorstore: Memory -``` - -**Setup time**: 5 minutes - -```go -// Complete working example -import ( - "github.com/aixgo-dev/aixgo/pkg/embeddings/huggingface" - "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory" -) - -// Embedding provider -embedder, _ := huggingface.New(ctx, huggingface.Config{ - APIKey: os.Getenv("HF_API_KEY"), - Model: "sentence-transformers/all-MiniLM-L6-v2", -}) - -// Vector store -store, _ := memory.New() -collection := store.Collection("dev-data") -``` - -**Cost**: Free -**Performance**: Fast (local storage, API network latency) -**Best for**: Getting started quickly - -### Production Stack (Managed) - -**Best for**: Production applications on GCP, serverless deployments - -```yaml -Embedding: OpenAI text-embedding-3-small -Vectorstore: Firestore -``` - -**Setup time**: 30 minutes - -```go -import ( - "github.com/aixgo-dev/aixgo/pkg/embeddings/openai" - "github.com/aixgo-dev/aixgo/pkg/vectorstore/firestore" - "cloud.google.com/go/firestore" -) - -// Embedding provider -embedder, _ := openai.New(ctx, openai.Config{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: "text-embedding-3-small", -}) - -// Vector store -client, _ := firestore.NewClient(ctx, "project-id") -store, _ := firestore.New(ctx, client) -collection := store.Collection("prod-data") -``` - -**Cost**: ~$50-200/month (depending on scale) -**Performance**: Fast, globally distributed -**Best for**: Most production applications - -### Production Stack (Self-Hosted) - -**Best for**: High-volume, cost-sensitive, maximum performance - -```yaml -Embedding: HuggingFace TEI (self-hosted) -Vectorstore: Qdrant (self-hosted or cloud) -``` - -**Setup time**: 
2-4 hours - -**Infrastructure** (Docker Compose): - -```yaml -version: '3.8' -services: - embeddings: - image: ghcr.io/huggingface/text-embeddings-inference:latest - command: --model-id BAAI/bge-small-en-v1.5 - ports: - - "8080:80" - volumes: - - ./data:/data - - qdrant: - image: qdrant/qdrant:latest - ports: - - "6333:6333" - volumes: - - ./qdrant_data:/qdrant/storage -``` - -**Application Code**: - -```go -import ( - "github.com/aixgo-dev/aixgo/pkg/embeddings/huggingface" - "github.com/aixgo-dev/aixgo/pkg/vectorstore/qdrant" -) - -// Embedding provider (self-hosted TEI) -embedder, _ := huggingface.New(ctx, huggingface.Config{ - BaseURL: "http://localhost:8080", - Model: "BAAI/bge-small-en-v1.5", -}) - -// Vector store (self-hosted Qdrant) -store, _ := qdrant.New(ctx, qdrant.Config{ - URL: "http://localhost:6333", -}) -collection := store.Collection("prod-data") -``` - -**Cost**: ~$100-300/month (infrastructure only) -**Performance**: Extremely fast -**Best for**: High-volume applications, cost optimization - -### Budget Stack - -**Best for**: Bootstrapped startups, side projects, MVPs - -```yaml -Embedding: HuggingFace API (free tier) -Vectorstore: Firestore (free tier) -``` - -**Setup time**: 15 minutes - -**Cost**: Free up to limits, then pay-as-you-go -**Performance**: Good for low-moderate volume -**Best for**: Getting to market fast with minimal costs - -### High-Volume Stack - -**Best for**: Large-scale applications, millions of queries/day - -```yaml -Embedding: HuggingFace TEI (GPU-accelerated) -Vectorstore: Qdrant cluster -``` - -**Cost**: ~$500-2000/month -**Performance**: Maximum throughput and minimal latency -**Best for**: Applications with millions of users - -## Cost Calculator - -### Scenario 1: Small Application - -**Volume**: 10K documents, 100K queries/month - -| Stack | Embedding Cost | Storage Cost | Total/Month | -|-------|----------------|--------------|-------------| -| Dev (HF API + Memory) | Free | Free | $0 | -| Budget (HF API + 
Firestore) | Free | Free | $0 | -| Managed (OpenAI + Firestore) | $0.20 | $0.18 | $0.38 | -| Self-Hosted (TEI + Qdrant) | $50 | $50 | $100 | - -**Recommendation**: Budget stack (HF API + Firestore) - -### Scenario 2: Medium Application - -**Volume**: 100K documents, 1M queries/month - -| Stack | Embedding Cost | Storage Cost | Total/Month | -|-------|----------------|--------------|-------------| -| Budget (HF API + Firestore) | $9 | $18 | $27 | -| Managed (OpenAI + Firestore) | $2.00 | $18 | $20 | -| Self-Hosted (TEI + Qdrant) | $100 | $100 | $200 | - -**Recommendation**: Managed stack (OpenAI + Firestore) - -### Scenario 3: Large Application - -**Volume**: 1M documents, 10M queries/month - -| Stack | Embedding Cost | Storage Cost | Total/Month | -|-------|----------------|--------------|-------------| -| Managed (OpenAI + Firestore) | $20 | $180 | $200 | -| Self-Hosted (TEI + Qdrant) | $200 | $200 | $400 | - -**Recommendation**: Self-hosted stack becomes cost-effective at this scale - -### Scenario 4: Enterprise Application - -**Volume**: 10M documents, 100M queries/month - -| Stack | Embedding Cost | Storage Cost | Total/Month | -|-------|----------------|--------------|-------------| -| Managed (OpenAI + Firestore) | $200 | $1,800 | $2,000 | -| Self-Hosted (TEI + Qdrant) | $500 | $1,000 | $1,500 | - -**Recommendation**: Self-hosted with dedicated infrastructure - -## Performance Benchmarks - -### Query Latency (p95) - -| Provider | Single Vector | Batch (10) | Batch (100) | -|----------|---------------|------------|-------------| -| Memory | 0.1ms | 0.5ms | 3ms | -| Firestore | 50ms | 75ms | 200ms | -| Qdrant (local) | 1ms | 5ms | 25ms | -| Qdrant (cloud) | 25ms | 40ms | 100ms | - -**Notes:** - -- Memory is fastest but not persistent -- Firestore latency includes network round-trip -- Qdrant local is nearly as fast as memory -- All tested with 100K documents, 384-dimensional vectors - -### Embedding Generation - -| Provider | Single Text | Batch (10) | Batch 
(100) | -|----------|-------------|------------|-------------| -| HF API | 200ms | 500ms | 2000ms | -| HF TEI (CPU) | 50ms | 200ms | 800ms | -| HF TEI (GPU) | 10ms | 30ms | 100ms | -| OpenAI | 100ms | 300ms | 1500ms | - -**Notes:** - -- HF TEI with GPU is fastest -- Batch processing significantly improves throughput -- OpenAI offers good balance of speed and quality - -### Throughput (queries per second) - -| Provider | Single Thread | 10 Threads | 100 Threads | -|----------|---------------|------------|-------------| -| Memory | 10,000 | 50,000 | 100,000 | -| Firestore | 100 | 500 | 2,000 | -| Qdrant (local) | 5,000 | 25,000 | 50,000 | -| Qdrant (cloud) | 500 | 2,500 | 10,000 | - -**Notes:** - -- Memory throughput limited only by CPU -- Firestore throughput limited by API quotas -- Qdrant scales well with parallelism - -## Migration Paths - -### Memory to Firestore - -**When**: Moving from development to production - -**Process**: - -1. Update configuration (no code changes needed) -2. Re-index your documents -3. Update monitoring/alerting - -```go -// Before (Memory) -store, _ := memory.New() - -// After (Firestore) - same interface! -client, _ := firestore.NewClient(ctx, "project-id") -store, _ := firestore.New(ctx, client) -``` - -**Downtime**: None (parallel indexing possible) -**Difficulty**: Easy -**Time**: 1-2 hours - -### HuggingFace API to TEI - -**When**: Scaling beyond free tier, need better performance - -**Process**: - -1. Deploy TEI container -2. Update base URL in configuration -3. Test embedding compatibility -4. Cutover - -```go -// Before (API) -embedder, _ := huggingface.New(ctx, huggingface.Config{ - APIKey: os.Getenv("HF_API_KEY"), - Model: "BAAI/bge-small-en-v1.5", -}) - -// After (TEI) - same interface! 
-embedder, _ := huggingface.New(ctx, huggingface.Config{ - BaseURL: "http://localhost:8080", - Model: "BAAI/bge-small-en-v1.5", -}) -``` - -**Downtime**: None -**Difficulty**: Medium -**Time**: 2-4 hours - -### Firestore to Qdrant - -**When**: Need better performance, higher volume, cost optimization - -**Process**: - -1. Deploy Qdrant -2. Export documents from Firestore -3. Re-index in Qdrant -4. Parallel run for validation -5. Cutover - -```go -// Before (Firestore) -client, _ := firestore.NewClient(ctx, "project-id") -store, _ := firestore.New(ctx, client) - -// After (Qdrant) - same interface! -store, _ := qdrant.New(ctx, qdrant.Config{ - URL: "http://localhost:6333", -}) -``` - -**Downtime**: None (with parallel indexing) -**Difficulty**: Medium-Hard -**Time**: 1-2 days - -### OpenAI to HuggingFace - -**When**: Cost optimization, need offline capability - -**Challenges**: - -- Different embedding dimensions (requires re-indexing) -- Potential quality differences -- Need to test thoroughly - -**Process**: - -1. Choose comparable HF model -2. Generate test embeddings -3. Evaluate quality on your use case -4. Re-index all documents -5. Cutover - -**Downtime**: Depends on index size -**Difficulty**: Hard -**Time**: 1-2 weeks (including testing) - -**Important**: Different embedding models are NOT compatible. You must re-index all data. - -## Decision Framework - -### Start Here - -Ask yourself these questions: - -1. **What's your deployment environment?** - - GCP → Consider Firestore - - AWS/Azure → Consider self-hosted options - - Multi-cloud → Memory (dev) or self-hosted (prod) - -2. **What's your budget?** - - $0/month → HuggingFace API + Memory/Firestore - - $50-200/month → OpenAI + Firestore - - $200+/month → Self-hosted TEI + Qdrant - -3. **What's your scale?** - - <100K docs → Managed solutions - - 100K-1M docs → Either managed or self-hosted - - >1M docs → Self-hosted recommended - -4. 
**What's your team's expertise?** - - Small team, no ops → Managed solutions - - DevOps capability → Self-hosted for better economics - -5. **What's your performance requirement?** - - <100ms p95 → Self-hosted required - - <500ms p95 → Managed solutions work - - >500ms p95 → Any solution works - -### Quick Decision Tree - -```text -Are you in production? -├─ No → HuggingFace API + Memory -└─ Yes - └─ On GCP? - ├─ Yes → OpenAI + Firestore - └─ No - └─ High volume (>1M queries/day)? - ├─ Yes → HuggingFace TEI + Qdrant - └─ No → OpenAI + Firestore -``` - -## Next Steps - -1. **Start with the Development Stack**: Get familiar with the APIs -2. **Benchmark your use case**: Real performance depends on your data -3. **Plan for migration**: Design with future scaling in mind -4. **Monitor costs**: Set up billing alerts early - -## Additional Resources - -- **[Vector Databases Guide](/guides/vector-databases)**: Deep dive into vector search -- **[Production Deployment](/guides/production-deployment)**: Best practices for production -- **[Observability](/guides/observability)**: Monitoring your vector infrastructure -- **[Provider Integration](/guides/provider-integration)**: Integrating custom providers diff --git a/web/content/guides/provider-integration.md b/web/content/guides/provider-integration.md deleted file mode 100644 index 179f95f..0000000 --- a/web/content/guides/provider-integration.md +++ /dev/null @@ -1,1389 +0,0 @@ ---- -title: 'Provider Integration Guide' -description: 'Integrate Aixgo with OpenAI, Anthropic, Google Vertex AI, Amazon Bedrock, HuggingFace, Ollama (local), and vector databases.' 
-breadcrumb: 'Reference' -category: 'Reference' -weight: 11 ---- - -## Provider Status - -### LLM Providers - -| Provider | Status | Notes | -| ------------------ | --------------------------------------- | --------------------------------------------------------- | -| OpenAI | {{< status-badge status="available" >}} | Chat, streaming SSE, function calling, JSON mode | -| Anthropic (Claude) | {{< status-badge status="available" >}} | Messages API, streaming SSE, tool use | -| Google Gemini | {{< status-badge status="available" >}} | GenerateContent API, streaming SSE, function calling | -| xAI (Grok) | {{< status-badge status="available" >}} | Chat, streaming SSE, function calling (OpenAI-compatible) | -| Vertex AI | {{< status-badge status="available" >}} | Google Cloud AI Platform, streaming SSE, function calling | -| Amazon Bedrock | {{< status-badge status="available" >}} | Claude, Llama, Nova, Titan via AWS, Converse API | -| HuggingFace | {{< status-badge status="available" >}} | Free Inference API, cloud backends | -| Ollama | {{< status-badge status="available" >}} | Local models, zero costs, enterprise SSRF protection, hybrid fallback | - -### Vector Databases - -| Provider | Status | Notes | -| --------- | --------------------------------------- | ------------------------------------- | -| Firestore | {{< status-badge status="available" >}} | Google Cloud serverless vector search | -| In-Memory | {{< status-badge status="available" >}} | Development and testing | -| Qdrant | {{< status-badge status="planned" >}} | Planned for v0.2 | -| pgvector | {{< status-badge status="planned" >}} | Planned for v0.2 | - -### Embedding Providers - -| Provider | Status | Notes | -| --------------- | --------------------------------------- | ---------------------------------------------- | -| OpenAI | {{< status-badge status="available" >}} | text-embedding-3-small, text-embedding-3-large | -| HuggingFace API | {{< status-badge status="available" >}} | Free inference API, 
100+ models | -| HuggingFace TEI | {{< status-badge status="available" >}} | Self-hosted high-performance server | - ---- - -## LLM Providers - -### OpenAI (GPT-4, GPT-3.5) - -**Supported models:** - -- `gpt-4` - Most capable, higher cost -- `gpt-4-turbo` - Faster, lower cost than GPT-4 -- `gpt-3.5-turbo` - Fast, cost-effective - -**Configuration:** - -```yaml -# config/agents.yaml -agents: - - name: analyzer - role: react - model: gpt-4-turbo - provider: openai - api_key: ${OPENAI_API_KEY} - temperature: 0.7 - max_tokens: 1000 -``` - -**Environment variables:** - -```bash -export OPENAI_API_KEY=sk-... -``` - -**Go code:** - -```go -import "github.com/aixgo-dev/aixgo/providers/openai" - -agent := aixgo.NewAgent( - aixgo.WithName("analyzer"), - aixgo.WithModel("gpt-4-turbo"), - aixgo.WithProvider(openai.Provider{ - APIKey: os.Getenv("OPENAI_API_KEY"), - }), -) -``` - -**Features:** - -- ✅ Chat completions -- ✅ Function calling (tools) -- ✅ Streaming SSE responses -- ✅ JSON mode -- ✅ Token usage tracking - -**Pricing (as of 2025):** - -- GPT-4 Turbo: $0.01 per 1K input tokens, $0.03 per 1K output tokens -- GPT-3.5 Turbo: $0.0005 per 1K input tokens, $0.0015 per 1K output tokens - -### Anthropic (Claude) - -**Supported models:** - -- `claude-3-opus` - Most capable -- `claude-3-sonnet` - Balanced performance/cost -- `claude-3-haiku` - Fastest, lowest cost - -**Configuration:** - -```yaml -agents: - - name: analyst - role: react - model: claude-3-sonnet - provider: anthropic - api_key: ${ANTHROPIC_API_KEY} - temperature: 0.5 - max_tokens: 2000 -``` - -**Environment variables:** - -```bash -export ANTHROPIC_API_KEY=sk-ant-... 
-``` - -**Go code:** - -```go -import "github.com/aixgo-dev/aixgo/providers/anthropic" - -agent := aixgo.NewAgent( - aixgo.WithName("analyst"), - aixgo.WithModel("claude-3-sonnet"), - aixgo.WithProvider(anthropic.Provider{ - APIKey: os.Getenv("ANTHROPIC_API_KEY"), - }), -) -``` - -**Features:** - -- ✅ Long context window (200K tokens supported by API) -- ✅ Tool use -- ✅ Streaming SSE responses -- 🚧 Vision support (Planned) - -**Pricing:** - -- Claude 3 Opus: $0.015 per 1K input tokens, $0.075 per 1K output tokens -- Claude 3 Sonnet: $0.003 per 1K input tokens, $0.015 per 1K output tokens -- Claude 3 Haiku: $0.00025 per 1K input tokens, $0.00125 per 1K output tokens - -### Google Vertex AI (Gemini) - -**Supported models:** - -- `gemini-1.5-pro` - Most capable -- `gemini-1.5-flash` - Fast, cost-effective - -**Configuration:** - -```yaml -agents: - - name: processor - role: react - model: gemini-1.5-flash - provider: vertexai - project_id: ${GCP_PROJECT_ID} - location: us-central1 - temperature: 0.8 -``` - -**Authentication:** - -```bash -# Service account key -export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json - -# Or use gcloud default credentials -gcloud auth application-default login -``` - -**Go code:** - -```go -import "github.com/aixgo-dev/aixgo/providers/vertexai" - -agent := aixgo.NewAgent( - aixgo.WithName("processor"), - aixgo.WithModel("gemini-1.5-flash"), - aixgo.WithProvider(vertexai.Provider{ - ProjectID: os.Getenv("GCP_PROJECT_ID"), - Location: "us-central1", - }), -) -``` - -**Features:** - -- ✅ Long context (2M tokens for Gemini 1.5) -- ✅ Multimodal (text, images, video) -- ✅ Function calling -- ✅ Grounding with Google Search - -**Pricing:** - -- Gemini 1.5 Pro: $0.00125 per 1K input chars, $0.005 per 1K output chars -- Gemini 1.5 Flash: $0.000125 per 1K input chars, $0.000375 per 1K output chars - -### HuggingFace Inference API - -**Supported backends:** - -- HuggingFace Inference API (cloud) -- Ollama (local) -- vLLM (self-hosted) 
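The three backends share a chat-style HTTP surface but differ in where requests go and whether they are authenticated: the cloud Inference API needs a bearer token, while local Ollama and vLLM servers typically do not. A minimal sketch of that dispatch — the `Backend` type, `endpointFor` helper, and the vLLM port are illustrative assumptions, not part of the aixgo API:

```go
package main

import "fmt"

// Backend identifies where HuggingFace-compatible inference runs.
type Backend int

const (
	CloudAPI Backend = iota // api-inference.huggingface.co, requires an API key
	Ollama                  // local Ollama server, unauthenticated
	VLLM                    // self-hosted vLLM, unauthenticated by default
)

// endpointFor returns the base URL and Authorization header value for a
// backend. apiKey is only used for the cloud API.
func endpointFor(b Backend, apiKey string) (baseURL, authHeader string) {
	switch b {
	case CloudAPI:
		return "https://api-inference.huggingface.co", "Bearer " + apiKey
	case Ollama:
		return "http://localhost:11434", "" // Ollama's default port
	case VLLM:
		return "http://localhost:8000", "" // common vLLM default (assumption)
	}
	return "", ""
}

func main() {
	url, auth := endpointFor(CloudAPI, "hf_example")
	fmt.Println(url, auth)
	url, _ = endpointFor(Ollama, "")
	fmt.Println(url)
}
```

Because only the endpoint and auth header change, the same provider code can serve all three backends, which is what makes the API-to-self-hosted migration below a configuration change rather than a rewrite.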
- -**Supported models:** - -- Any model on HuggingFace with Inference API enabled -- Popular: `meta-llama/Llama-2-70b-chat-hf`, `mistralai/Mixtral-8x7B-Instruct-v0.1` - -**Configuration:** - -```yaml -agents: - - name: classifier - role: react - model: meta-llama/Llama-2-70b-chat-hf - provider: huggingface - api_key: ${HUGGINGFACE_API_KEY} - endpoint: https://api-inference.huggingface.co -``` - -**Environment variables:** - -```bash -export HUGGINGFACE_API_KEY=hf_... -``` - -**Go code:** - -```go -import "github.com/aixgo-dev/aixgo/providers/huggingface" - -agent := aixgo.NewAgent( - aixgo.WithName("classifier"), - aixgo.WithModel("meta-llama/Llama-2-70b-chat-hf"), - aixgo.WithProvider(huggingface.Provider{ - APIKey: os.Getenv("HUGGINGFACE_API_KEY"), - Endpoint: "https://api-inference.huggingface.co", - }), -) -``` - -**Features:** - -- ✅ Open-source models -- ✅ Self-hosted option (Ollama, vLLM) -- ✅ Cloud backends -- ✅ Streaming support -- ✅ Custom fine-tuned models -- ⚠️ Tool calling support (model-dependent) - -**Pricing:** - -- Pay-per-request or dedicated endpoints -- Varies by model size and usage - ---- - -### Ollama (Local Models) - -**Run production AI models on your own infrastructure with zero API costs and complete data privacy.** - -Ollama enables you to run state-of-the-art open-source models locally or on-premises. Aixgo provides enterprise-grade Ollama integration with hardened security, automatic fallback to cloud APIs, and production-ready deployment templates. - -**Supported models:** - -Any model from the [Ollama library](https://ollama.com/library): - -- `phi3.5:3.8b-mini-instruct-q4_K_M` - Fast, efficient (3.8B parameters) -- `gemma2:2b-instruct-q4_0` - Google's lightweight model (2B) -- `llama3.1:8b` - Meta's Llama 3.1 (8B) -- `mistral:7b` - Mistral 7B -- `codellama:7b` - Code-focused model -- Custom quantized models (int4, int8, fp16) - -**Quick Start:** - -1. 
Install Ollama: - -```bash -# macOS -brew install ollama - -# Linux -curl -fsSL https://ollama.com/install.sh | sh - -# Windows - download from https://ollama.com/download -``` - -2. Pull a model: - -```bash -ollama pull phi3.5:3.8b-mini-instruct-q4_K_M -``` - -3. Configure Aixgo: - -```yaml -# config/agents.yaml -model_services: - - name: phi-local - provider: huggingface - model: phi3.5:3.8b-mini-instruct-q4_K_M - runtime: ollama - transport: local - config: - address: http://localhost:11434 # Default, can be omitted - quantization: int4 - -agents: - - name: local-assistant - role: react - model: phi-local - prompt: | - You are a helpful AI assistant running locally. - temperature: 0.7 - max_tokens: 1000 -``` - -**Go SDK:** - -```go -import ( - "context" - "github.com/aixgo-dev/aixgo/internal/llm/inference" -) - -// Create Ollama service -ollama := inference.NewOllamaService("http://localhost:11434") - -// Check availability -if !ollama.Available() { - log.Fatal("Ollama not running") -} - -// List available models -models, err := ollama.ListModels(context.Background()) -if err != nil { - log.Fatal(err) -} -for _, model := range models { - fmt.Printf("Model: %s (Size: %d bytes)\n", model.Name, model.Size) -} - -// Generate text -req := inference.GenerateRequest{ - Model: "phi3.5:3.8b-mini-instruct-q4_K_M", - Prompt: "Explain quantum computing in simple terms.", - MaxTokens: 500, - Temperature: 0.7, - Stop: []string{"\n\n"}, -} -resp, err := ollama.Generate(context.Background(), req) -if err != nil { - log.Fatal(err) -} -fmt.Printf("Response: %s\n", resp.Text) -fmt.Printf("Tokens used: %d prompt + %d completion = %d total\n", - resp.Usage.PromptTokens, resp.Usage.CompletionTokens, resp.Usage.TotalTokens) - -// Chat completions -chatResp, err := ollama.Chat(context.Background(), "phi3.5:3.8b-mini-instruct-q4_K_M", []inference.ChatMessage{ - {Role: "user", Content: "What is Aixgo?"}, -}) -if err != nil { - log.Fatal(err) -} -fmt.Printf("Chat response: %v\n", chatResp) -``` - -**Configuration Options:** - -```yaml -model_services: - - name: 
my-ollama-service - provider: huggingface - model: llama3.1:8b - runtime: ollama - transport: local - config: - # Ollama server address (optional, defaults to http://localhost:11434) - address: http://localhost:11434 - - # Quantization level (optional) - quantization: int4 # Options: int4, int8, fp16 - - # Request timeout (optional, defaults to 5 minutes) - timeout: 300s -``` - -**Hybrid Inference with Automatic Fallback:** - -Aixgo can automatically fall back to cloud APIs if Ollama is unavailable: - -```yaml -agents: - - name: resilient-agent - role: react - providers: - # Try local Ollama first - - model: phi-local - provider: huggingface - runtime: ollama - - # Fallback to cloud if local unavailable - - model: gpt-4-turbo - provider: openai - api_key: ${OPENAI_API_KEY} - - - model: claude-3-haiku - provider: anthropic - api_key: ${ANTHROPIC_API_KEY} - - fallback_strategy: cascade # Try each in order - prompt: | - You are a resilient assistant with automatic failover. -``` - -**Security Features:** - -Aixgo's Ollama integration includes enterprise-grade security: - -- **SSRF Protection**: Strict host allowlist (localhost, 127.0.0.1, ::1, ollama) -- **No Redirects**: Prevents redirect-based SSRF attacks -- **IP Validation**: Blocks private ranges, link-local, multicast, cloud metadata endpoints -- **DNS Rebinding Protection**: Per-connection hostname validation -- **40+ Security Test Cases**: Comprehensive security validation - -**Production Deployment:** - -**Docker Compose:** - -```yaml -# docker-compose.yaml -version: '3.8' -services: - ollama: - image: ollama/ollama:0.5.4 - ports: - - "11434:11434" - volumes: - - ollama-data:/root/.ollama - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"] - interval: 10s - timeout: 5s - retries: 3 - - aixgo: - build: . 
- depends_on: - ollama: - condition: service_healthy - environment: - - OLLAMA_HOST=http://ollama:11434 - ports: - - "8080:8080" - -volumes: - ollama-data: -``` - -**Kubernetes:** - -Aixgo provides production-ready Kubernetes manifests at `deploy/k8s/base/ollama-deployment.yaml`: - -```yaml -apiVersion: apps/v1 -kind: Deployment -metadata: - name: ollama - namespace: aixgo -spec: - replicas: 1 - template: - spec: - # Security: Non-root user, seccomp, capabilities dropped - securityContext: - runAsNonRoot: true - runAsUser: 1000 - seccompProfile: - type: RuntimeDefault - - containers: - - name: ollama - image: ollama/ollama:0.5.4 - ports: - - containerPort: 11434 - - resources: - requests: - cpu: 2 - memory: 4Gi - limits: - cpu: 4 - memory: 8Gi - - # Health checks - livenessProbe: - httpGet: - path: /api/tags - port: 11434 - initialDelaySeconds: 30 - periodSeconds: 10 - - readinessProbe: - httpGet: - path: /api/tags - port: 11434 - initialDelaySeconds: 10 - periodSeconds: 5 - - volumeMounts: - - name: ollama-data - mountPath: /.ollama - - volumes: - - name: ollama-data - persistentVolumeClaim: - claimName: ollama-pvc -``` - -Deploy with: - -```bash -kubectl apply -f deploy/k8s/base/ollama-deployment.yaml -``` - -**Custom Docker Image:** - -Build a security-hardened Ollama image with pre-loaded models: - -```dockerfile -# docker/ollama.Dockerfile -FROM ollama/ollama:0.5.4 - -# Pre-pull models at build time (optional) -ARG MODELS="phi3.5:3.8b-mini-instruct-q4_K_M gemma2:2b-instruct-q4_0" -RUN ollama serve & sleep 5 && \ - for model in $MODELS; do ollama pull $model; done && \ - pkill ollama - -# Run as non-root user -USER 1000 - -EXPOSE 11434 -CMD ["serve"] -``` - -Build and run: - -```bash -docker build -f docker/ollama.Dockerfile -t my-ollama:latest . 
-docker run -d -p 11434:11434 -v ollama-data:/root/.ollama my-ollama:latest -``` - -**API Endpoints:** - -Aixgo supports these Ollama API endpoints: - -| Endpoint | Method | Purpose | Supported | -|----------|--------|---------|-----------| -| `/api/generate` | POST | Text generation | ✅ Yes | -| `/api/chat` | POST | Chat completions | ✅ Yes | -| `/api/tags` | GET | List models / health check | ✅ Yes | -| `/` | GET | Health check | ✅ Yes | - -**Environment Variables:** - -```bash -# Ollama server address (optional, defaults to http://localhost:11434) -export OLLAMA_HOST=http://localhost:11434 - -# For custom deployments -export OLLAMA_HOST=http://ollama-service.aixgo.svc.cluster.local:11434 -``` - -**Features:** - -- ✅ Zero API costs -- ✅ Complete data privacy -- ✅ Any Ollama-compatible model -- ✅ Text generation and chat completions -- ✅ Token usage tracking -- ✅ Model listing and health checks -- ✅ Hybrid inference with cloud fallback -- ✅ Enterprise security (SSRF protection) -- ✅ Production Kubernetes manifests -- ✅ Docker and Docker Compose support -- ✅ Non-root container execution -- ❌ Streaming (not yet supported) -- ❌ Function calling (model-dependent) - -**Performance:** - -| Model | Size | Speed (tokens/sec) | Memory | Best For | -|-------|------|-------------------|--------|----------| -| **phi3.5:3.8b-q4** | 2.2GB | ~50-100 | 4GB | General purpose, fast responses | -| **gemma2:2b-q4** | 1.6GB | ~80-150 | 3GB | Lightweight, edge deployment | -| **llama3.1:8b** | 4.7GB | ~30-60 | 8GB | Higher quality, reasoning | -| **mistral:7b** | 4.1GB | ~40-80 | 6GB | Balanced performance/quality | - -**Model Selection Guide:** - -- **Development/Testing**: `gemma2:2b-q4_0` - Fastest, smallest -- **Production (CPU)**: `phi3.5:3.8b-mini-instruct-q4_K_M` - Best quality/speed balance -- **Production (GPU)**: `llama3.1:8b` or `mistral:7b` - Higher quality -- **Code Generation**: `codellama:7b` - Specialized for code -- **Edge Devices**: `gemma2:2b-q4_0` - Smallest memory 
footprint - -**Troubleshooting:** - -**Ollama not available:** - -```bash -# Check if Ollama is running -curl http://localhost:11434/api/tags - -# Start Ollama -ollama serve - -# Check server logs (Linux systemd installs) -journalctl -u ollama -``` - -**Model not found:** - -```bash -# List available models -ollama list - -# Pull missing model -ollama pull phi3.5:3.8b-mini-instruct-q4_K_M -``` - -**Connection refused in Kubernetes:** - -```bash -# Check service -kubectl get svc ollama-service -n aixgo - -# Check pod -kubectl get pods -n aixgo -l app=ollama - -# View logs -kubectl logs -n aixgo -l app=ollama - -# Port forward for testing -kubectl port-forward -n aixgo svc/ollama-service 11434:11434 -``` - -**High memory usage:** - -- Use quantized models (q4_K_M, q4_0) -- Reduce `num_ctx` parameter -- Limit concurrent requests -- Use smaller models (2B-7B vs 13B+) - -**Learn More:** - -- [Ollama Documentation](https://github.com/ollama/ollama) -- [Available Models](https://ollama.com/library) -- [Aixgo Security Documentation](https://github.com/aixgo-dev/aixgo/blob/main/docs/SECURITY_STATUS.md) -- [Kubernetes Deployment Guide](https://github.com/aixgo-dev/aixgo/tree/main/deploy/k8s) - -### xAI (Grok) - -**Supported models:** - -- `grok-beta` - Latest Grok model - -**Configuration:** - -```yaml -agents: - - name: researcher - role: react - model: grok-beta - provider: xai - api_key: ${XAI_API_KEY} -``` - -**Environment variables:** - -```bash -export XAI_API_KEY=xai-... -``` - -**Features:** - -- ✅ Real-time web access -- ✅ Tool calling -- ✅ Long context window - -### Amazon Bedrock - -Amazon Bedrock provides access to foundation models from multiple providers through a single AWS API. 
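Bedrock addresses every model through a fully qualified, versioned ID such as `anthropic.claude-3-5-sonnet-20240620-v1:0`, so a configuration layer often maps short aliases onto those IDs. A sketch of such a resolver — the alias names and `resolveModel` helper are illustrative, not aixgo API; the IDs are the ones listed in this guide:

```go
package main

import (
	"fmt"
	"strings"
)

// bedrockAliases maps short, human-friendly names to Bedrock model IDs.
// The alias spellings here are an illustrative convention.
var bedrockAliases = map[string]string{
	"claude-3.5-sonnet": "anthropic.claude-3-5-sonnet-20240620-v1:0",
	"claude-3-haiku":    "anthropic.claude-3-haiku-20240307-v1:0",
	"nova-pro":          "amazon.nova-pro-v1:0",
	"llama3-70b":        "meta.llama3-70b-instruct-v1:0",
}

// resolveModel maps an alias to a Bedrock model ID. Names that are
// already qualified (vendor prefix before a '.') pass through unchanged.
func resolveModel(name string) (string, error) {
	if id, ok := bedrockAliases[name]; ok {
		return id, nil
	}
	if strings.Contains(name, ".") {
		return name, nil // already a fully qualified Bedrock ID
	}
	return "", fmt.Errorf("unknown model alias %q", name)
}

func main() {
	id, _ := resolveModel("claude-3-haiku")
	fmt.Println(id)
}
```

Checking the alias table before the qualified-ID heuristic matters: an alias like `claude-3.5-sonnet` contains a dot but should still resolve through the table.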
- -**Supported models:** - -| Provider | Models | -| --------- | --------------------------------------------------------------- | -| Anthropic | `anthropic.claude-3-5-sonnet-20240620-v1:0`, `anthropic.claude-3-haiku-20240307-v1:0`, `anthropic.claude-3-opus-20240229-v1:0` | -| Amazon | `amazon.nova-pro-v1:0`, `amazon.nova-lite-v1:0`, `amazon.nova-micro-v1:0` | -| Meta | `meta.llama3-70b-instruct-v1:0`, `meta.llama3-8b-instruct-v1:0` | -| Mistral | `mistral.mistral-large-2407-v1:0` | -| Amazon | `amazon.titan-text-express-v1`, `amazon.titan-text-lite-v1` | - -**Configuration:** - -```yaml -agents: - - name: analyst - role: react - model: anthropic.claude-3-5-sonnet-20240620-v1:0 - provider: bedrock -``` - -**Environment variables:** - -```bash -# AWS Region (required) -export AWS_REGION=us-east-1 - -# Option 1: Access keys -export AWS_ACCESS_KEY_ID=AKIA... -export AWS_SECRET_ACCESS_KEY=... - -# Option 2: Named profile -export AWS_PROFILE=my-profile - -# Option 3: IAM role (for EC2/ECS/EKS) - no env vars needed -``` - -**Go code:** - -```go -import "github.com/aixgo-dev/aixgo/pkg/llm/provider" - -p, err := provider.CreateProvider("bedrock", map[string]any{ - "region": "us-east-1", -}) -if err != nil { - log.Fatal(err) -} - -resp, err := p.CreateCompletion(ctx, provider.CompletionRequest{ - Model: "anthropic.claude-3-5-sonnet-20240620-v1:0", - Messages: []provider.Message{ - {Role: "user", Content: "Explain Go interfaces"}, - }, -}) -``` - -**Features:** - -- ✅ Multi-model access via single API -- ✅ Unified Converse API for all models -- ✅ Streaming responses (ConverseStream) -- ✅ Tool calling -- ✅ Structured output -- ✅ AWS IAM authentication -- ✅ VPC endpoints for private connectivity -- ✅ Guardrails integration - -**Pricing (per 1M tokens):** - -| Model | Input | Output | -| ------------------------ | -------- | -------- | -| Claude 3.5 Sonnet | $3.00 | $15.00 | -| Claude 3 Haiku | $0.25 | $1.25 | -| Amazon Nova Pro | $0.80 | $3.20 | -| Amazon Nova Lite | $0.06 | 
$0.24 | -| Llama 3 70B | $2.65 | $3.50 | - -**See also:** [AWS Bedrock Guide](/guides/aws-bedrock/) for comprehensive setup and production deployment. - -## Provider Comparison - -| Provider | Best For | Context Length | Tool Support | Cost | -| ----------------- | --------------------------------- | -------------- | ------------ | --------- | -| **OpenAI** | General purpose, function calling | 128K tokens | ✅ Excellent | $$$ | -| **Anthropic** | Long documents, safety | 200K tokens | ✅ Excellent | $$$$ | -| **Google Vertex** | Multimodal, grounding | 2M tokens | ✅ Good | $$ | -| **Amazon Bedrock**| Multi-model access, enterprise | 200K tokens | ✅ Excellent | $$ - $$$$ | -| **HuggingFace** | Open source, custom models | Varies | ⚠️ Limited | $ | -| **xAI** | Real-time info, research | 128K tokens | ✅ Good | $$$ | -| **Ollama** | Local inference, data privacy | Varies (4K-32K)| ⚠️ Limited | Free | - -## Multi-Provider Strategy - -### Fallback Configuration - -Use multiple providers with automatic fallback: - -```yaml -agents: - - name: resilient-analyzer - role: react - providers: - - model: gpt-4-turbo - provider: openai - api_key: ${OPENAI_API_KEY} - - model: claude-3-sonnet - provider: anthropic - api_key: ${ANTHROPIC_API_KEY} - - model: gemini-1.5-flash - provider: vertexai - project_id: ${GCP_PROJECT_ID} - fallback_strategy: cascade # Try each in order -``` - -If OpenAI fails, automatically try Anthropic, then Google. 
- -### Cost Optimization - -Route based on complexity: - -```yaml -# Simple tasks: cheap model -- name: simple-classifier - role: react - model: gpt-3.5-turbo - provider: openai - -# Complex reasoning: capable model -- name: complex-analyzer - role: react - model: gpt-4-turbo - provider: openai -``` - -### Region-Specific Routing - -```yaml -# US region: Vertex AI (low latency) -- name: us-agent - role: react - model: gemini-1.5-flash - provider: vertexai - location: us-central1 - -# EU region: OpenAI EU endpoint -- name: eu-agent - role: react - model: gpt-4-turbo - provider: openai - endpoint: https://api.openai.com/v1 # or EU-specific endpoint -``` - -## Vector Databases & Embeddings - -### Overview - -Aixgo provides integrated support for vector databases and embeddings, enabling Retrieval-Augmented Generation (RAG) systems. The architecture separates embedding generation from -vector storage for maximum flexibility. - -**Architecture:** - -```text -Documents → Embeddings Service → Vector Database → Semantic Search -``` - -### Embedding Providers - -#### OpenAI Embeddings - -**Best for:** Production deployments, highest quality - -**Configuration:** - -```yaml -embeddings: - provider: openai - openai: - api_key: ${OPENAI_API_KEY} - model: text-embedding-3-small # or text-embedding-3-large -``` - -**Go code:** - -```go -import "github.com/aixgo-dev/aixgo/pkg/embeddings" - -config := embeddings.Config{ - Provider: "openai", - OpenAI: &embeddings.OpenAIConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: "text-embedding-3-small", - }, -} - -embSvc, err := embeddings.New(config) -if err != nil { - log.Fatal(err) -} -defer embSvc.Close() - -// Generate embedding -embedding, err := embSvc.Embed(ctx, "Your text here") -``` - -**Models:** - -- `text-embedding-3-small`: 1536 dimensions, $0.02 per 1M tokens -- `text-embedding-3-large`: 3072 dimensions, $0.13 per 1M tokens -- `text-embedding-ada-002`: 1536 dimensions (legacy) - -#### HuggingFace Inference API - -**Best 
for:** Development, cost-sensitive deployments - -**Configuration:** - -```yaml -embeddings: - provider: huggingface - huggingface: - model: sentence-transformers/all-MiniLM-L6-v2 - api_key: ${HUGGINGFACE_API_KEY} # Optional - wait_for_model: true - use_cache: true -``` - -**Popular models:** - -- `sentence-transformers/all-MiniLM-L6-v2`: 384 dims, fast -- `BAAI/bge-large-en-v1.5`: 1024 dims, excellent quality -- `thenlper/gte-large`: 1024 dims, multilingual - -**Pricing:** FREE (Inference API) with rate limits - -#### HuggingFace TEI (Self-Hosted) - -**Best for:** High-throughput production workloads - -**Docker setup:** - -```bash -docker run -d \ - --name tei \ - -p 8080:8080 \ - --gpus all \ - ghcr.io/huggingface/text-embeddings-inference:latest \ - --model-id BAAI/bge-large-en-v1.5 -``` - -**Configuration:** - -```yaml -embeddings: - provider: huggingface_tei - huggingface_tei: - endpoint: http://localhost:8080 - model: BAAI/bge-large-en-v1.5 - normalize: true -``` - -### Vector Store Providers - -#### Firestore Vector Search - -**Best for:** Serverless production deployments on GCP - -**Setup:** - -```bash -# Enable Firestore -gcloud services enable firestore.googleapis.com - -# Create vector index -gcloud firestore indexes composite create \ - --collection-group=embeddings \ - --query-scope=COLLECTION \ - --field-config=field-path=embedding,vector-config='{"dimension":"384","flat":{}}' -``` - -**Configuration:** - -```yaml -vectorstore: - provider: firestore - embedding_dimensions: 384 - firestore: - project_id: ${GCP_PROJECT_ID} - collection: embeddings - credentials_file: /path/to/key.json # Optional -``` - -**Go code:** - -```go -import "github.com/aixgo-dev/aixgo/pkg/vectorstore" - -config := vectorstore.Config{ - Provider: "firestore", - EmbeddingDimensions: 384, - Firestore: &vectorstore.FirestoreConfig{ - ProjectID: os.Getenv("GCP_PROJECT_ID"), - Collection: "embeddings", - }, -} - -store, err := vectorstore.New(config) -if err != nil { - 
log.Fatal(err)
-}
-defer store.Close()
-
-// Upsert documents
-doc := vectorstore.Document{
-    ID:        "doc-1",
-    Content:   "Your document content",
-    Embedding: embedding,
-    Metadata: map[string]interface{}{
-        "category": "documentation",
-    },
-}
-store.Upsert(ctx, []vectorstore.Document{doc})
-
-// Search
-results, err := store.Search(ctx, vectorstore.SearchQuery{
-    Embedding: queryEmbedding,
-    TopK:      5,
-    MinScore:  0.7,
-})
-```
-
-**Features:**
-
-- ✅ Serverless, auto-scaling
-- ✅ Persistent storage
-- ✅ Real-time updates
-- ✅ ACID transactions
-
-**Pricing:** ~$0.06 per 100K reads + storage
-
-#### In-Memory Vector Store
-
-**Best for:** Development, testing, prototyping
-
-**Configuration:**
-
-```yaml
-vectorstore:
-  provider: memory
-  embedding_dimensions: 384
-  memory:
-    max_documents: 10000
-```
-
-**Features:**
-
-- ✅ Zero setup
-- ✅ Fast for small datasets
-- ❌ Data lost on restart
-- ❌ Limited capacity
-
-#### Qdrant (Planned - v0.2)
-
-High-performance dedicated vector database:
-
-```yaml
-# Coming soon
-vectorstore:
-  provider: qdrant
-  embedding_dimensions: 384
-  qdrant:
-    host: localhost
-    port: 6333
-    collection: knowledge_base
-```
-
-#### pgvector (Planned - v0.2)
-
-PostgreSQL extension for vector search:
-
-```yaml
-# Coming soon
-vectorstore:
-  provider: pgvector
-  embedding_dimensions: 384
-  pgvector:
-    connection_string: postgresql://user:pass@localhost/db
-    table: embeddings
-```
-
-### Complete RAG Example
-
-```go
-package main
-
-import (
-    "context"
-    "fmt"
-
-    "github.com/aixgo-dev/aixgo/pkg/embeddings"
-    "github.com/aixgo-dev/aixgo/pkg/vectorstore"
-)
-
-func main() {
-    ctx := context.Background()
-
-    // Setup embeddings
-    embConfig := embeddings.Config{
-        Provider: "huggingface",
-        HuggingFace: &embeddings.HuggingFaceConfig{
-            Model: "sentence-transformers/all-MiniLM-L6-v2",
-        },
-    }
-    embSvc, _ := embeddings.New(embConfig)
-    defer embSvc.Close()
-
-    // Setup vector store
-    storeConfig := vectorstore.Config{
-        Provider:            "firestore",
EmbeddingDimensions: embSvc.Dimensions(), - Firestore: &vectorstore.FirestoreConfig{ - ProjectID: "my-project", - Collection: "knowledge_base", - }, - } - store, _ := vectorstore.New(storeConfig) - defer store.Close() - - // Index documents - docs := []string{ - "Aixgo is a production-grade AI framework", - "RAG combines retrieval with generation", - } - - for i, content := range docs { - emb, _ := embSvc.Embed(ctx, content) - doc := vectorstore.Document{ - ID: fmt.Sprintf("doc-%d", i), - Content: content, - Embedding: emb, - } - store.Upsert(ctx, []vectorstore.Document{doc}) - } - - // Search - query := "What is Aixgo?" - queryEmb, _ := embSvc.Embed(ctx, query) - results, _ := store.Search(ctx, vectorstore.SearchQuery{ - Embedding: queryEmb, - TopK: 3, - }) - - for _, result := range results { - fmt.Printf("Score: %.2f - %s\n", result.Score, result.Document.Content) - } -} -``` - -### Provider Comparison: Embeddings - -| Provider | Cost | Quality | Speed | Best For | -| ------------------- | -------------------- | -------------- | --------- | ----------- | -| **OpenAI** | $0.02-0.13/1M tokens | Excellent | Fast | Production | -| **HuggingFace API** | Free | Good-Excellent | Medium | Development | -| **HuggingFace TEI** | Free (self-host) | Good-Excellent | Very Fast | High-volume | - -### Provider Comparison: Vector Stores - -| Provider | Persistence | Scalability | Setup | Cost | -| ---------------------- | ----------- | ----------- | ------ | --------- | -| **Memory** | No | Low | None | Free | -| **Firestore** | Yes | Unlimited | Medium | $$ | -| **Qdrant** (planned) | Yes | Very High | Medium | Self-host | -| **pgvector** (planned) | Yes | High | Medium | Self-host | - -### Learn More - -- **[Vector Databases Guide](/guides/vector-databases)** - Complete RAG implementation guide -- **[Extending Aixgo](/guides/extending-aixgo)** - Add custom vector store providers -- **[RAG Agent Example](https://github.com/aixgo-dev/aixgo/tree/main/examples/rag-agent)** - Full 
working example
-
-## API Key Management
-
-### Environment Variables
-
-```bash
-# .env file
-OPENAI_API_KEY=sk-...
-ANTHROPIC_API_KEY=sk-ant-...
-GCP_PROJECT_ID=my-project
-HUGGINGFACE_API_KEY=hf_...
-```
-
-Load with:
-
-```bash
-export $(cat .env | xargs)
-```
-
-### Kubernetes Secrets
-
-```bash
-kubectl create secret generic llm-keys \
-  --from-literal=OPENAI_API_KEY=sk-... \
-  --from-literal=ANTHROPIC_API_KEY=sk-ant-...
-```
-
-**Reference in deployment:**
-
-```yaml
-env:
-  - name: OPENAI_API_KEY
-    valueFrom:
-      secretKeyRef:
-        name: llm-keys
-        key: OPENAI_API_KEY
-```
-
-### Cloud Secret Managers
-
-**Google Secret Manager:**
-
-```go
-import (
-    "context"
-
-    secretmanager "cloud.google.com/go/secretmanager/apiv1"
-    "cloud.google.com/go/secretmanager/apiv1/secretmanagerpb"
-)
-
-func getAPIKey(ctx context.Context, secretName string) (string, error) {
-    client, err := secretmanager.NewClient(ctx)
-    if err != nil {
-        return "", err
-    }
-    defer client.Close()
-
-    result, err := client.AccessSecretVersion(ctx, &secretmanagerpb.AccessSecretVersionRequest{
-        Name: secretName,
-    })
-    if err != nil {
-        return "", err
-    }
-    return string(result.Payload.Data), nil
-}
-```
-
-## Rate Limiting & Retries
-
-### Provider Rate Limits
-
-| Provider  | Tier        | Requests/Min | Tokens/Min |
-| --------- | ----------- | ------------ | ---------- |
-| OpenAI    | Free        | 3            | 40,000     |
-| OpenAI    | Paid Tier 1 | 500          | 90,000     |
-| Anthropic | Free        | 5            | 25,000     |
-| Anthropic | Paid        | 50           | 100,000    |
-| Vertex AI | Default     | 60           | 60,000     |
-
-### Retry Configuration
-
-```yaml
-agents:
-  - name: resilient-agent
-    role: react
-    model: gpt-4-turbo
-    provider: openai
-    retry:
-      max_attempts: 3
-      initial_backoff: 1s
-      max_backoff: 10s
-      multiplier: 2
-      retry_on:
-        - rate_limit
-        - timeout
-        - server_error
-```
-
-## Monitoring Provider Performance
-
-### Track Latency by Provider
-
-```go
-import "github.com/prometheus/client_golang/prometheus"
-
-var providerLatency = prometheus.NewHistogramVec(
-    prometheus.HistogramOpts{
-        Name: "llm_provider_latency_seconds",
-        Help: "LLM API call latency by provider",
-    },
-    []string{"provider", "model"},
-)
-
-// Aixgo tracks this automatically
-```
-
-### Cost Tracking - -```yaml -observability: - cost_tracking: true - cost_alert_threshold: 100 # Alert if daily cost > $100 -``` - -## Best Practices - -### 1. Use Environment-Specific Keys - -```yaml -# Development -OPENAI_API_KEY=sk-dev-... - -# Production -OPENAI_API_KEY=sk-prod-... -``` - -### 2. Implement Fallback Providers - -Always have a backup provider to avoid single point of failure. - -### 3. Monitor Token Usage - -Track and alert on unexpected token consumption: - -```yaml -observability: - llm_observability: - enabled: true - track_tokens: true - daily_token_limit: 1000000 -``` - -### 4. Choose Models Strategically - -- **Simple tasks:** gpt-3.5-turbo, gemini-flash, claude-haiku -- **Complex reasoning:** gpt-4-turbo, claude-3-opus -- **Long documents:** claude-3-opus (200K), gemini-pro (2M) -- **Cost-sensitive:** gemini-flash, gpt-3.5-turbo - -### 5. Use Caching - -Cache LLM responses for repeated queries: - -```go -import "github.com/aixgo-dev/aixgo/cache" - -agent := aixgo.NewAgent( - aixgo.WithName("cached-analyzer"), - aixgo.WithCache(cache.NewRedisCache("localhost:6379")), - aixgo.WithCacheTTL(1 * time.Hour), -) -``` - -## Troubleshooting - -### Authentication Errors - -**Error:** `401 Unauthorized` - -**Solution:** - -- Verify API key is correct -- Check key has not expired -- Ensure environment variable is loaded - -### Rate Limit Exceeded - -**Error:** `429 Too Many Requests` - -**Solution:** - -- Implement exponential backoff -- Reduce request rate -- Upgrade to higher tier -- Add multiple API keys for rotation - -### Timeout Errors - -**Error:** `Request timeout` - -**Solution:** - -```yaml -agents: - - name: patient-agent - role: react - model: gpt-4-turbo - timeout: 60s # Increase timeout -``` - -## Next Steps - -- **[Type Safety & LLM Integration](/guides/type-safety)** - Type-safe provider usage -- **[Observability & Monitoring](/guides/observability)** - Monitor provider performance -- **[Production 
Deployment](/guides/production-deployment)** - Deploy with secrets management diff --git a/web/content/guides/quick-start.md b/web/content/guides/quick-start.md deleted file mode 100644 index 010654c..0000000 --- a/web/content/guides/quick-start.md +++ /dev/null @@ -1,138 +0,0 @@ ---- -title: 'Quick Start Guide' -description: 'Get started with Aixgo in under 5 minutes. Build your first multi-agent system.' -breadcrumb: 'Getting Started' -category: 'Getting Started' -weight: 1 ---- - -Get running in under 5 minutes. Create a simple data analysis pipeline with three agents: a producer that generates data, an analyzer that processes it with an LLM, and a logger -that persists the results. - -## 1. Install Aixgo - -```bash -go get github.com/aixgo-dev/aixgo -``` - -## 2. Set Up Your API Key - -Before running agents, configure your LLM provider API key: - -```bash -# For OpenAI (used in this example) -export OPENAI_API_KEY=your-openai-key-here - -# OR for xAI/Grok -export XAI_API_KEY=your-xai-key-here - -# OR for Anthropic -export ANTHROPIC_API_KEY=your-anthropic-key-here -``` - -Get your key from: - -- **OpenAI Platform**: https://platform.openai.com/ -- **xAI Console**: https://console.x.ai/ -- **Anthropic Console**: https://console.anthropic.com/ - -## 3. Create `config/agents.yaml` - -This configuration file sets up a simple automated pipeline with three connected agents. The first agent generates sample data every second, the second agent uses AI to analyze that data, and the third agent logs the results. Think of it like an assembly line where each station performs a specific task. - -```yaml -supervisor: - name: coordinator - model: gpt-4-turbo - max_rounds: 10 - -agents: - - name: data-producer - role: producer - interval: 1s - outputs: - - target: analyzer - - - name: analyzer - role: react - model: gpt-4-turbo - prompt: | - You are a data analyst. Analyze incoming data and provide insights. 
- inputs: - - source: data-producer - outputs: - - target: logger - - - name: logger - role: logger - inputs: - - source: analyzer -``` - -## 4. Create `main.go` - -This is the entry point for your application. It loads your agent configuration and starts the system. The `agents` import registers all built-in agent types so they're available for use. - -```go -package main - -import ( - "github.com/aixgo-dev/aixgo" - _ "github.com/aixgo-dev/aixgo/agents" -) - -func main() { - if err := aixgo.Run("config/agents.yaml"); err != nil { - panic(err) - } -} -``` - -## 5. Run it - -```bash -go run main.go -``` - -That's it! You now have a running multi-agent system with producer, analyzer, and logger agents orchestrated by a supervisor. The entire deployment is a single binary. - -## What Just Happened? - -This example demonstrates Aixgo's core concepts: - -- **Producer Agent** (`data-producer`) - Generates periodic messages every second -- **ReAct Agent** (`analyzer`) - Uses an LLM (GPT-4 Turbo) to analyze incoming data -- **Logger Agent** (`logger`) - Persists the analysis results -- **Supervisor** (`coordinator`) - Orchestrates the agents and manages message routing - -The supervisor automatically: - -- Starts agents in dependency order -- Routes messages from data-producer → analyzer → logger -- Enforces the max_rounds limit (10 iterations) -- Handles graceful shutdown - -## Next Steps - -Now that you have your first agent running, explore Aixgo's powerful features: - -### Build Production Systems - -- **[Vector Databases & RAG](/guides/vector-databases)** - Add semantic search and retrieval-augmented generation to eliminate hallucinations -- **[Multi-Agent Orchestration](/guides/multi-agent-orchestration)** - Build complex workflows with multiple specialized agents -- **[Production Deployment](/guides/production-deployment)** - Deploy your agents to production with monitoring and scaling - -### Advanced Features - -- **[Provider 
Integration](/guides/provider-integration)** - Connect to OpenAI, Anthropic, Google, and more -- **[Observability](/guides/observability)** - Monitor your agents with OpenTelemetry and distributed tracing -- **[Type Safety](/guides/type-safety)** - Leverage Go's type system for compile-time error detection - -### Core Concepts - -- **[Core Concepts](/guides/core-concepts)** - Learn about agent types and supervisor patterns -- **[Extending Aixgo](/guides/extending-aixgo)** - Add custom LLM providers, vector databases, and embeddings - -### Examples - -Browse our example configurations for common use cases like chatbots, data processing, and RAG systems diff --git a/web/content/guides/sessions.md b/web/content/guides/sessions.md deleted file mode 100644 index 11a6ce7..0000000 --- a/web/content/guides/sessions.md +++ /dev/null @@ -1,189 +0,0 @@ ---- -title: "Session Persistence" -description: "Build AI agents that remember with built-in session persistence, checkpoints, and multiple storage backends" -category: "Tools" -weight: 8 ---- - -AI agents that remember. Aixgo sessions provide durable conversation history with automatic persistence, checkpoints, and seamless resumption across restarts. - -## Quick Start - -```go -package main - -import ( - "context" - "github.com/aixgo-dev/aixgo/pkg/session" -) - -func main() { - ctx := context.Background() - - // Create file-based storage - backend, _ := session.NewFileBackend("~/.aixgo/sessions") - mgr := session.NewManager(backend) - defer mgr.Close() - - // Create or resume a session - sess, _ := mgr.GetOrCreate(ctx, "assistant", "user-123") - - // Append a message (role: "user", "assistant", or "system") - sess.AppendEntry(ctx, session.EntryTypeMessage, map[string]any{ - "role": "user", - "content": "Hello!", - }) - - // Get full history - entries, _ := sess.GetEntries(ctx) -} -``` - -## Why Sessions? 
- -| Problem | Solution | -|---------|----------| -| Context lost on restart | Persistent JSONL/Redis storage | -| No rollback capability | Checkpoint and restore | -| Complex state management | Automatic session lifecycle | -| Multi-node deployments | Redis backend for shared state | - -## Storage Backends - -### File Backend (JSONL) - -Best for single-node deployments and development: - -```go -backend, err := session.NewFileBackend("~/.aixgo/sessions") -``` - -**Features:** -- Append-only JSONL format (human-readable) -- Automatic directory creation -- Restrictive permissions (0700/0600) -- File locking for concurrent access - -### Redis Backend - -Best for distributed deployments: - -```go -backend, err := session.NewRedisBackend( - "localhost:6379", - session.RedisOptions{ - Password: os.Getenv("REDIS_PASSWORD"), - DB: 0, - KeyPrefix: "aixgo:sessions:", - }, -) -``` - -**Features:** -- Shared state across nodes -- Connection pooling -- Configurable key prefix -- TTL support for session expiration - -## Checkpoints - -Save and restore conversation state: - -```go -// Create checkpoint before risky operation -checkpoint, err := sess.Checkpoint(ctx) -if err != nil { - log.Fatal(err) -} - -// ... perform operation ... 
- -// Restore if needed -if needRollback { - err = sess.Restore(ctx, checkpoint.ID) -} -``` - -Checkpoints include: -- Full message history reference -- Timestamp and checksum -- Custom metadata - -## Runtime Integration - -Use `CallWithSession()` for automatic session management: - -```go -rt := aixgo.NewSimpleRuntime() -rt.SetSessionManager(sessionMgr) - -// Messages automatically persisted -result, err := rt.CallWithSession(ctx, "assistant", msg, sessionID) -``` - -## Context Helpers - -Pass sessions through context: - -```go -// Add session to context -ctx = session.ContextWithSession(ctx, sess) - -// Retrieve in downstream code -sess, ok := session.SessionFromContext(ctx) -if ok { - messages, _ := sess.GetMessages(ctx) -} -``` - -## Configuration - -YAML configuration for session-enabled agents: - -```yaml -session: - enabled: true - store: file # or redis - base_dir: ~/.aixgo/sessions - -agents: - - name: assistant - role: react - model: gpt-4-turbo - prompt: "You are a helpful assistant with memory." -``` - -## Performance - -| Operation | Latency | -|-----------|---------| -| Create session | <1ms | -| Append message | <1ms | -| Get 100 messages | <5ms | -| Checkpoint | <1ms | - -## Examples - -Two complete examples are included: - -1. **session-basic** - CRUD operations, checkpoints, context helpers -2. **session-react** - ReAct agent with full conversation history - -```bash -cd examples/session-basic && go run main.go -cd examples/session-react && go run main.go -``` - -## Best Practices - -1. **Use GetOrCreate()** - Handles both new and existing sessions -2. **Checkpoint before risky operations** - Enable rollback capability -3. **Use Redis for multi-node** - File backend is single-node only -4. **Clean up old sessions** - Implement retention policies -5. 
**Pass sessions via context** - Cleaner than parameter threading - -## Next Steps - -- [Multi-Agent Orchestration](/guides/multi-agent-orchestration/) - Coordinate multiple agents -- [Production Deployment](/guides/production-deployment/) - Deploy with sessions -- [Observability](/guides/observability/) - Monitor session operations diff --git a/web/content/guides/single-vs-distributed.md b/web/content/guides/single-vs-distributed.md deleted file mode 100644 index 8f50372..0000000 --- a/web/content/guides/single-vs-distributed.md +++ /dev/null @@ -1,288 +0,0 @@ ---- -title: 'Single Binary vs Distributed Mode' -description: 'Understand how Aixgo scales from local development to distributed production with zero code changes.' -breadcrumb: 'Core Concepts' -category: 'Core Concepts' -weight: 5 ---- - -One of Aixgo's planned features is seamless scaling: write your code once, and run it anywhere. Currently in alpha, local mode is available. Distributed mode coming in v0.2. - -## The Problem with Traditional Scaling - -Most frameworks require you to choose your architecture upfront: - -- **Local development** - Simple, fast, but doesn't match production -- **Distributed production** - Complex, requires queues, service orchestration, infrastructure -- **The gap** - Refactoring, rewriting, architectural changes to move from local to distributed - -This creates a painful transition: prototype locally with one architecture, then rewrite for production with another. - -## Aixgo's Solution: Transport Abstraction - -Aixgo abstracts message transport into a runtime layer. Your agent code stays the same; the runtime selects the appropriate transport based on configuration. 
- -```go -// This code works locally AND distributed -supervisor := aixgo.NewSupervisor("coordinator") -supervisor.AddAgent(producer) -supervisor.AddAgent(analyzer) -supervisor.Run() // Runtime picks: channels or gRPC -``` - -## Local Mode: Go Channels - -### How It Works - -In local mode, Aixgo uses Go channels for inter-agent communication. All agents run in the same process, communicating through in-memory channels. - -```go -// Internal implementation (simplified) -type LocalRuntime struct { - channels map[string]chan Message -} - -func (r *LocalRuntime) Send(target string, msg Message) { - r.channels[target] <- msg -} -``` - -### Benefits - -- **Fast iteration** - No infrastructure setup required -- **Easy debugging** - Single process, standard Go debugging tools -- **Low latency** - In-memory communication, microsecond message passing -- **Perfect for development** - Rapid prototyping and testing - -### When to Use Local Mode - -✅ **Local development** - Prototyping and testing on your machine - -✅ **Single-instance deployments** - Cloud Run, Lambda, simple containers - -✅ **Small workloads** - Low throughput, single-region requirements - -✅ **Edge devices** - Resource-constrained environments where one process is sufficient - -## Distributed Mode: gRPC - -### How It Works - -In distributed mode, Aixgo uses gRPC with Protocol Buffers for inter-agent communication. Agents can run on different nodes, regions, or even cloud providers. - -```go -// Same agent code! 
Runtime handles gRPC automatically -supervisor := aixgo.NewSupervisor("coordinator") -supervisor.AddAgent(producer) // May run on node A -supervisor.AddAgent(analyzer) // May run on node B -supervisor.Run() // Runtime uses gRPC -``` - -### Benefits - -- **Horizontal scaling** - Run agents on multiple nodes -- **Fault isolation** - Agent failures don't crash the entire system -- **Multi-region support** - Deploy across geographic regions -- **Resource optimization** - Run compute-heavy agents on dedicated hardware - -### When to Use Distributed Mode - -✅ **High throughput** - Processing thousands of messages per second - -✅ **Multi-region** - Geographic distribution for latency or compliance - -✅ **Resource isolation** - Separate agents with different compute requirements - -✅ **Fault tolerance** - Critical systems requiring redundancy - -## The Same Code, Different Configuration - -Here's the key insight: **your agent code never changes**. Only configuration differs. - -### Local Configuration - -```yaml -# config/agents.yaml -supervisor: - name: coordinator - mode: local # Uses Go channels - -agents: - - name: producer - role: producer - outputs: - - target: analyzer - - - name: analyzer - role: react - inputs: - - source: producer -``` - -### Distributed Configuration - -```yaml -# config/agents.yaml -supervisor: - name: coordinator - mode: distributed # Uses gRPC - -agents: - - name: producer - role: producer - endpoint: 'producer-service:50051' # gRPC endpoint - outputs: - - target: analyzer - - - name: analyzer - role: react - endpoint: 'analyzer-service:50051' # gRPC endpoint - inputs: - - source: producer -``` - -## Scaling Path: Local → Single Instance → Distributed - -### Step 1: Develop Locally - -```bash -# Development on your laptop -go run main.go -``` - -Configuration: `mode: local` - -### Step 2: Deploy Single Instance - -```dockerfile -# Dockerfile -FROM golang:1.21 AS builder -WORKDIR /app -COPY . . 
-# CGO_ENABLED=0 yields a static binary that can run in the empty scratch image
-RUN CGO_ENABLED=0 go build -o agent main.go
-
-FROM scratch
-COPY --from=builder /app/agent /agent
-COPY config/ /config/
-CMD ["/agent"]
-```
-
-Deploy to Cloud Run, Lambda, or any container platform. Still using `mode: local` - all agents in one process.
-
-### Step 3: Scale to Distributed
-
-When you need more capacity:
-
-1. **Split agents into services** - Deploy each agent as a separate service
-2. **Update configuration** - Change `mode: distributed`, add endpoints
-3. **Deploy** - Same binary, different config
-
-```yaml
-# Kubernetes deployment example
-apiVersion: apps/v1
-kind: Deployment
-metadata:
-  name: producer
-spec:
-  template:
-    spec:
-      containers:
-        - name: producer
-          image: my-agent:latest
-          args: ['--agent=producer']
----
-apiVersion: apps/v1
-kind: Deployment
-metadata:
-  name: analyzer
-spec:
-  template:
-    spec:
-      containers:
-        - name: analyzer
-          image: my-agent:latest # Same image!
-          args: ['--agent=analyzer']
-```
-
-## Performance Comparison
-
-Choose the right deployment mode based on your performance and scale requirements.
- -| | Local Mode | Distributed Mode | -| -------------- | ----------------------------- | --------------------------- | -| **Latency** | Microseconds (in-memory) | Milliseconds (network) | -| **Throughput** | 10,000+ msg/s (single core) | 100,000+ msg/s (multi-node) | -| **Deployment** | Single binary | Multiple services | -| **Scaling** | Vertical (bigger instance) | Horizontal (more nodes) | -| **Cost** | $10-50/month (small instance) | $100-500/month (cluster) | -| **Complexity** | Simple | Requires orchestration | - -## Best Practices - -### Start Local - -Always begin with local mode: - -- Fast development iteration -- Easy debugging -- Validate logic before adding infrastructure - -### Measure Before Distributing - -Only move to distributed mode when you have evidence you need it: - -- Throughput exceeding single-instance capacity -- Geographic distribution requirements -- Specific resource isolation needs - -Don't prematurely optimize—many production workloads run fine in local mode on a single Cloud Run instance. - -### Use Environment-Based Configuration - -```go -// main.go -func main() { - env := os.Getenv("ENV") - configPath := fmt.Sprintf("config/agents-%s.yaml", env) - - if err := aixgo.Run(configPath); err != nil { - panic(err) - } -} -``` - -```bash -# Development -ENV=local go run main.go - -# Production (single instance) -ENV=prod go run main.go - -# Production (distributed) -ENV=prod-distributed go run main.go -``` - -### Monitor Transport Performance - -Use OpenTelemetry to track: - -- Message latency (local vs network) -- Throughput per agent -- Resource utilization - -This data informs when to scale. - -## Current Status & Roadmap - -**Local Mode: ✅ Available (v0.1)** Ready for production use with Go channels. - -**Distributed Mode: 🚧 Coming Soon (v0.2)** gRPC transport and distributed orchestration scheduled for Q1 2026. - -While distributed mode is in development, you can build production systems today using local mode. 
Most use cases—especially single-instance Cloud Run/Lambda deployments—work -perfectly with in-process channels. - -## Next Steps - -- **[Multi-Agent Orchestration](/guides/multi-agent-orchestration)** - Build complex workflows -- **[Production Deployment](/guides/production-deployment)** - Deploy to production environments -- **[Observability & Monitoring](/guides/observability)** - Track performance and debug issues diff --git a/web/content/guides/type-safety.md b/web/content/guides/type-safety.md deleted file mode 100644 index 803fea0..0000000 --- a/web/content/guides/type-safety.md +++ /dev/null @@ -1,477 +0,0 @@ ---- -title: 'Type Safety & LLM Integration' -description: "Leverage Go's type system for compile-time guarantees in LLM interactions and tool definitions." -breadcrumb: 'Core Concepts' -category: 'Core Concepts' -weight: 6 ---- - -One of Aixgo's most powerful advantages over Python frameworks is compile-time type safety. This guide shows how Go's type system catches errors before deployment and ensures -reliable LLM interactions. 
- -## The Runtime Error Problem - -Python frameworks discover type errors in production: - -```python -# Python - Runtime error (discovered in production) -agent = Agent( - name="analyzer", - model=123, # Should be string, but Python allows it - temperature="high" # Should be float, but Python allows it -) - -# Error only appears when code runs in production -# TypeError: expected str, got int -``` - -This leads to: - -- Production incidents -- Difficult debugging -- Customer-facing errors -- Wasted time and resources - -## Aixgo's Solution: Compile-Time Safety - -Go catches these errors before you deploy: - -```go -// Go - Compile error (caught before deployment) -agent := aixgo.NewAgent( - aixgo.WithName("analyzer"), - aixgo.WithModel(123), // ❌ Compile error: expected string, got int - aixgo.WithTemperature("high"), // ❌ Compile error: expected float64, got string -) - -// Code won't compile until fixed -``` - -Your IDE tells you what's broken before your customers do. - -## Type-Safe Agent Configuration - -### Strong Typing for Agent Options - -```go -package main - -import "github.com/aixgo-dev/aixgo" - -func main() { - // All types are checked at compile time - agent := aixgo.NewAgent( - aixgo.WithName("data-analyzer"), // string - aixgo.WithRole(aixgo.RoleReAct), // enum - aixgo.WithModel("gpt-4-turbo"), // string - aixgo.WithTemperature(0.7), // float64 - aixgo.WithMaxTokens(1000), // int - aixgo.WithTimeout(30 * time.Second), // time.Duration - ) - - // Won't compile if types are wrong - // aixgo.WithTemperature("0.7") // ❌ Compile error - // aixgo.WithMaxTokens(1000.5) // ❌ Compile error -} -``` - -### Enum-Based Role Safety - -```go -// Roles are compile-time constants -const ( - RoleProducer aixgo.AgentRole = "producer" - RoleReAct aixgo.AgentRole = "react" - RoleLogger aixgo.AgentRole = "logger" - RoleClassifier aixgo.AgentRole = "classifier" - RoleAggregator aixgo.AgentRole = "aggregator" - RolePlanner aixgo.AgentRole = "planner" -) - -// Type-safe 
role assignment -agent := aixgo.NewAgent( - aixgo.WithRole(aixgo.RoleReAct), // ✅ Valid - aixgo.WithRole(aixgo.RoleClassifier), // ✅ Valid - // aixgo.WithRole(aixgo.RoleAnalyser), // ❌ Compile error: undefined (typo) -) -``` - -## Type-Safe Tool Definitions - -ReAct agents use tools to interact with external systems. Aixgo enforces type safety for tool schemas. - -### Defining Tools with Struct Types - -```go -package main - -import ( - "github.com/aixgo-dev/aixgo" - "github.com/aixgo-dev/aixgo/tools" -) - -// Define tool input as a struct -type DatabaseQueryInput struct { - Query string `json:"query" required:"true"` - Limit int `json:"limit" required:"false"` - Filters []string `json:"filters" required:"false"` -} - -// Define tool output as a struct -type DatabaseQueryOutput struct { - Results []map[string]interface{} `json:"results"` - Count int `json:"count"` -} - -// Implement the tool with type-safe inputs/outputs -func queryDatabase(input DatabaseQueryInput) (DatabaseQueryOutput, error) { - // Implementation here - // Type system guarantees input structure - query := input.Query // Guaranteed to be string - limit := input.Limit // Guaranteed to be int - - results, err := db.Query(query, limit) - if err != nil { - return DatabaseQueryOutput{}, err - } - - return DatabaseQueryOutput{ - Results: results, - Count: len(results), - }, nil -} - -func main() { - // Register tool with type-safe schema - tool := tools.NewTool( - "query_database", - "Query the database with filters", - queryDatabase, // Type-checked at compile time - ) - - agent := aixgo.NewAgent( - aixgo.WithName("analyst"), - aixgo.WithRole(aixgo.RoleReAct), - aixgo.WithTools(tool), - ) -} -``` - -### Automatic Schema Generation - -Aixgo generates JSON schemas from Go structs automatically: - -```go -type SearchInput struct { - Query string `json:"query" required:"true" description:"Search query"` - MaxResults int `json:"max_results" required:"false" description:"Maximum results to return"` - Categories []string `json:"categories" required:"false" 
description:"Filter by categories"` -} - -// Aixgo generates this JSON schema automatically: -// { -// "type": "object", -// "properties": { -// "query": { "type": "string", "description": "Search query" }, -// "max_results": { "type": "integer", "description": "Maximum results to return" }, -// "categories": { -// "type": "array", -// "items": { "type": "string" }, -// "description": "Filter by categories" -// } -// }, -// "required": ["query"] -// } -``` - -No manual schema writing. No drift between code and schema. - -## Compile-Time Error Detection - -### Configuration Errors - -```go -// ❌ Won't compile - wrong type -agent := aixgo.NewAgent( - aixgo.WithMaxTokens("1000"), // Expected int, got string -) - -// ✅ Compiles - correct type -agent := aixgo.NewAgent( - aixgo.WithMaxTokens(1000), -) -``` - -### Missing Required Fields - -```go -type ToolInput struct { - Query string `json:"query" required:"true"` -} - -func searchTool(input ToolInput) (string, error) { - // Input.Query is guaranteed to exist - // No need for nil checks - return search(input.Query), nil -} -``` - -### Type Mismatches in Message Passing - -```go -// Define message types -type DataMessage struct { - Content string - Score float64 -} - -// Agent expects specific message type -func processMessage(msg DataMessage) { - // msg.Content guaranteed to be string - // msg.Score guaranteed to be float64 -} - -// Type system prevents wrong message types -// processMessage("wrong type") // ❌ Won't compile -``` - -## Refactoring with Confidence - -### Safe Across Large Codebases - -```go -// Change a tool input structure -type QueryInput struct { - Query string - Limit int - Offset int // New field added -} - -// Compiler finds ALL places that need updating: -// - Tool implementations using QueryInput -// - Tests that create QueryInput -// - Documentation examples (if in Go) - -// No hidden runtime errors in production -``` - -### Interface Changes are Tracked - -```go -// Change an interface -type 
Analyzer interface { - Analyze(data string) (Result, error) - // GetConfidence() float64 // New method added -} - -// Compiler identifies all implementations that need the new method -// Can't deploy until all implementations are updated -``` - -## LLM Output Validation - -### Structured Output Parsing - -```go -// Define expected LLM output structure -type AnalysisResult struct { - Sentiment string `json:"sentiment" validate:"oneof=positive negative neutral"` - Confidence float64 `json:"confidence" validate:"gte=0,lte=1"` - Keywords []string `json:"keywords" validate:"min=1"` -} - -// Parse and validate LLM response -func parseAnalysis(llmOutput string) (*AnalysisResult, error) { - var result AnalysisResult - if err := json.Unmarshal([]byte(llmOutput), &result); err != nil { - return nil, fmt.Errorf("invalid JSON: %w", err) - } - - // Validate struct - if err := validator.Validate(result); err != nil { - return nil, fmt.Errorf("validation failed: %w", err) - } - - return &result, nil -} -``` - -### Retry on Type Errors - -```go -// Automatic retry if LLM returns invalid type -agent := aixgo.NewAgent( - aixgo.WithName("classifier"), - aixgo.WithRetryOnValidationError(true), - aixgo.WithMaxRetries(3), -) - -// If LLM returns wrong type: -// 1. Aixgo detects type mismatch -// 2. Automatically retries with corrected prompt -// 3. 
Returns error only after max retries -``` - -## Python vs Go: Type Safety Comparison - -| Scenario | Python | Aixgo (Go) | -| ---------------------- | -------------------------------------- | --------------------------------------- | -| **Configuration typo** | Runtime error in production | Compile error, caught before deployment | -| **Tool input type** | Validated at runtime (if you remember) | Validated at compile time (automatic) | -| **Refactoring** | Hope you find all usages | Compiler finds all usages | -| **LLM output parsing** | Manual validation, easy to miss | Struct-based, validated automatically | -| **Schema drift** | Code and schema can diverge | Schema generated from code | - -## Best Practices - -### 1. Use Structs for Complex Inputs - -```go -// ❌ Avoid: untyped maps -func processTool(input map[string]interface{}) { - query := input["query"].(string) // Runtime panic if wrong type -} - -// ✅ Prefer: typed structs -type ToolInput struct { - Query string `json:"query"` -} - -func processTool(input ToolInput) { - query := input.Query // Compile-time guaranteed -} -``` - -### 2. Define Custom Types for Enums - -```go -type Sentiment string - -const ( - SentimentPositive Sentiment = "positive" - SentimentNegative Sentiment = "negative" - SentimentNeutral Sentiment = "neutral" -) - -type AnalysisOutput struct { - Sentiment Sentiment `json:"sentiment"` -} - -// Type-safe sentiment assignment -output := AnalysisOutput{ - Sentiment: SentimentPositive, // ✅ Type-checked - // Sentiment: SentimentPostive, // ❌ Won't compile: undefined (typo) -} -``` - -Note that a raw string literal like `"postive"` would still compile here, because untyped string constants convert to `Sentiment` implicitly; the protection comes from consistently using the exported constants. - -### 3. 
Validate at Boundaries - -```go -import ( - "fmt" - - "github.com/go-playground/validator/v10" -) - -type UserInput struct { - Email string `json:"email" validate:"required,email"` - Age int `json:"age" validate:"gte=0,lte=150"` -} - -func handleInput(input UserInput) error { - validate := validator.New() - if err := validate.Struct(input); err != nil { - return fmt.Errorf("invalid input: %w", err) - } - // Proceed with validated input - return nil -} -``` - -### 4. Use Pointer Types for Optional Fields - -```go -type AgentConfig struct { - Name string `json:"name" required:"true"` - Model string `json:"model" required:"true"` - Temperature *float64 `json:"temperature" required:"false"` // Optional -} - -// Can distinguish between "not provided" and "zero value" -if config.Temperature != nil { - // Temperature was explicitly set -} -``` - -## Real-World Example: Type-Safe Research Agent - -```go -package main - -import ( - "github.com/aixgo-dev/aixgo" - "github.com/aixgo-dev/aixgo/tools" -) - -// Define research query input -type ResearchInput struct { - Topic string `json:"topic" required:"true" description:"Research topic"` - MaxSources int `json:"max_sources" required:"false" description:"Maximum sources to search"` - Languages []string `json:"languages" required:"false" description:"Preferred languages"` -} - -// Define research output -type ResearchOutput struct { - Summary string `json:"summary"` - Sources []string `json:"sources"` - Tags []string `json:"tags"` -} - -// Type-safe research tool -func conductResearch(input ResearchInput) (ResearchOutput, error) { - // Input types guaranteed at compile time - topic := input.Topic - maxSources := input.MaxSources - if maxSources == 0 { - maxSources = 10 // Default - } - - // Implement research logic - results := performSearch(topic, maxSources) - - return ResearchOutput{ - Summary: summarize(results), - Sources: extractSources(results), - Tags: extractTags(results), - }, nil -} - -func main() { - // Register tool with type safety - 
researchTool := tools.NewTool( - "conduct_research", - "Research a topic and return summary with sources", - conductResearch, - ) - - // Create agent with type-safe configuration - agent := aixgo.NewAgent( - aixgo.WithName("research-assistant"), - aixgo.WithRole(aixgo.RoleReAct), - aixgo.WithModel("gpt-4-turbo"), - aixgo.WithTemperature(0.3), - aixgo.WithTools(researchTool), - ) - - // All types checked at compile time - // No runtime surprises in production -} -``` - -## Key Takeaways - -1. **Compile-time safety** - Errors caught before deployment, not in production -2. **Automatic schema generation** - No manual JSON schema writing -3. **Refactoring confidence** - Compiler tracks all changes across codebase -4. **Type-safe tools** - LLM tool inputs/outputs validated automatically -5. **IDE support** - Autocomplete, type hints, and error detection - -Type safety is not just a nice-to-have—it's a production necessity for reliable AI systems. - -## Next Steps - -- **[Multi-Agent Orchestration](/guides/multi-agent-orchestration)** - Build complex type-safe workflows -- **[Production Deployment](/guides/production-deployment)** - Deploy with confidence -- **[Provider Integration](/guides/provider-integration)** - Integrate LLM providers with type safety diff --git a/web/content/guides/using-public-interfaces.md b/web/content/guides/using-public-interfaces.md deleted file mode 100644 index 215f8cd..0000000 --- a/web/content/guides/using-public-interfaces.md +++ /dev/null @@ -1,506 +0,0 @@ ---- -title: 'Using Public Interfaces' -description: 'Build custom agents and integrate Aixgo into existing Go applications with the public agent package' -category: 'Advanced' -weight: 8 ---- - -# Using Public Interfaces - -Aixgo v0.2.2 introduces the `agent` package—a standalone, minimal-dependency package that exports core interfaces for building custom agents. This enables library-style integration into existing Go applications without requiring the full Aixgo framework. 
- -## Overview - -The public agent package provides: - -- **Agent Interface**: Define custom agent behavior -- **Message Struct**: Standard communication format between agents -- **Runtime Interface**: Coordinate multiple agents with synchronous and asynchronous communication -- **LocalRuntime**: Production-ready single-process runtime implementation - -### When to Use Public Interfaces - -| Use Case | Approach | -|----------|----------| -| Building standalone AI applications | Full Aixgo framework | -| Integrating agents into existing services | Public agent package | -| Custom runtime implementations | Public agent package | -| Lightweight agent prototypes | Public agent package | -| Multi-provider orchestration with built-in patterns | Full Aixgo framework | - -## Installation - -Add the agent package to your project: - -```bash -go get github.com/aixgo-dev/aixgo/agent -``` - -The package has minimal dependencies (only `github.com/google/uuid`), making it suitable for projects where dependency management is a concern. 
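Because the package's message type is plain data (a JSON payload plus ID, type, and timestamp metadata), that small dependency surface is easy to reason about. The sketch below mimics the shape with the standard library alone, with `crypto/rand` standing in for `google/uuid`; `msg` and `newMsg` are illustrative names for this sketch, not the package's API:

```go
package main

import (
	"crypto/rand"
	"encoding/json"
	"fmt"
	"time"
)

// msg is a stdlib-only stand-in mirroring the documented message shape:
// a JSON-serialized payload plus routing metadata.
type msg struct {
	ID        string
	Type      string
	Payload   string
	Timestamp string
	Metadata  map[string]interface{}
}

// newMsg serializes the payload and stamps an ID and creation time.
func newMsg(msgType string, payload interface{}) (*msg, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	id := make([]byte, 16) // 128 random bits, hex-encoded below
	if _, err := rand.Read(id); err != nil {
		return nil, err
	}
	return &msg{
		ID:        fmt.Sprintf("%x", id),
		Type:      msgType,
		Payload:   string(body),
		Timestamp: time.Now().UTC().Format(time.RFC3339),
		Metadata:  map[string]interface{}{},
	}, nil
}

func main() {
	m, err := newMsg("analyze_request", map[string]string{"text": "Hello"})
	if err != nil {
		panic(err)
	}
	fmt.Println(m.Type, m.Payload)
}
```

A stand-in like this is handy for unit tests and prototypes before wiring in the real package.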
- -## Core Interfaces - -### Agent Interface - -All agents must implement the `Agent` interface: - -```go -type Agent interface { - // Name returns the unique identifier for this agent instance - Name() string - - // Role returns the agent's role type (e.g., "analyzer", "processor") - Role() string - - // Start initializes the agent and prepares it to receive messages - // Blocks until context is cancelled or a fatal error occurs - Start(ctx context.Context) error - - // Execute processes an input message and returns a response synchronously - Execute(ctx context.Context, input *Message) (*Message, error) - - // Stop gracefully shuts down the agent - Stop(ctx context.Context) error - - // Ready returns true if the agent is ready to process messages - Ready() bool -} -``` - -### Message Struct - -Messages are the standard unit of communication: - -```go -type Message struct { - ID string // Unique identifier (auto-generated) - Type string // Message type for routing - Payload string // JSON-serialized data - Timestamp string // ISO 8601 creation time - Metadata map[string]interface{} // Optional key-value pairs -} -``` - -### Runtime Interface - -The Runtime coordinates agent communication: - -```go -type Runtime interface { - // Registration - Register(agent Agent) error - Unregister(name string) error - Get(name string) (Agent, error) - List() []string - - // Synchronous communication - Call(ctx context.Context, target string, input *Message) (*Message, error) - CallParallel(ctx context.Context, targets []string, input *Message) (map[string]*Message, map[string]error) - - // Asynchronous communication - Send(target string, msg *Message) error - Recv(source string) (<-chan *Message, error) - Broadcast(msg *Message) error - - // Lifecycle - Start(ctx context.Context) error - Stop(ctx context.Context) error -} -``` - -## Creating a Custom Agent - -Here's a complete example of a custom agent: - -```go -package main - -import ( - "context" - 
"strings" - - "github.com/aixgo-dev/aixgo/agent" -) - -type AnalyzerAgent struct { - name string - ready bool -} - -func NewAnalyzerAgent(name string) *AnalyzerAgent { - return &AnalyzerAgent{name: name} -} - -func (a *AnalyzerAgent) Name() string { return a.name } -func (a *AnalyzerAgent) Role() string { return "analyzer" } -func (a *AnalyzerAgent) Ready() bool { return a.ready } - -func (a *AnalyzerAgent) Start(ctx context.Context) error { - a.ready = true - <-ctx.Done() // Block until context cancelled - return nil -} - -func (a *AnalyzerAgent) Execute(ctx context.Context, input *agent.Message) (*agent.Message, error) { - // Unmarshal the input - var request struct { - Text string `json:"text"` - } - if err := input.UnmarshalPayload(&request); err != nil { - return nil, err - } - - // Process the request - result := struct { - WordCount int `json:"word_count"` - Status string `json:"status"` - }{ - WordCount: len(strings.Fields(request.Text)), // count words, not bytes - Status: "analyzed", - } - - return agent.NewMessage("analysis_result", result), nil -} - -func (a *AnalyzerAgent) Stop(ctx context.Context) error { - a.ready = false - return nil -} -``` - -## Using the LocalRuntime - -The `LocalRuntime` provides single-process agent coordination: - -```go -package main - -import ( - "context" - "fmt" - "github.com/aixgo-dev/aixgo" - "github.com/aixgo-dev/aixgo/agent" -) - -func main() { - ctx := context.Background() - - // Create runtime - rt := aixgo.NewRuntime() - - // Register agents - rt.Register(NewAnalyzerAgent("analyzer-1")) - rt.Register(NewAnalyzerAgent("analyzer-2")) - - // Start runtime (launches all agents) - go rt.Start(ctx) - - // Create a request message - input := agent.NewMessage("analyze_request", map[string]string{ - "text": "Hello, Aixgo!", - }) - - // Synchronous call to single agent - response, err := rt.Call(ctx, "analyzer-1", input) - if err != nil { - panic(err) - } - - var result map[string]interface{} - response.UnmarshalPayload(&result) - fmt.Printf("Result: %v\n", result) - - // Parallel
call to multiple agents - results, errors := rt.CallParallel(ctx, []string{"analyzer-1", "analyzer-2"}, input) - for name := range results { - fmt.Printf("Agent %s responded\n", name) - } - for name, err := range errors { - fmt.Printf("Agent %s failed: %v\n", name, err) - } - - // Clean up - rt.Stop(ctx) -} -``` - -## Message Patterns - -### Creating Messages with Metadata - -```go -// Create a message with structured payload -msg := agent.NewMessage("request", map[string]interface{}{ - "action": "analyze", - "data": "sample text", -}).WithMetadata("priority", "high"). - WithMetadata("user_id", "user-123"). - WithMetadata("correlation_id", "req-456") - -// Access metadata -priority := msg.GetMetadataString("priority", "normal") -``` - -### Asynchronous Communication - -```go -// Send a message without waiting for response -rt.Send("analyzer-1", msg) - -// Receive messages from an agent -recvCh, _ := rt.Recv("analyzer-1") -go func() { - for msg := range recvCh { - fmt.Printf("Received: %s\n", msg.Type) - } -}() - -// Broadcast to all registered agents -rt.Broadcast(agent.NewMessage("shutdown", nil)) -``` - -## Integration Patterns - -### Embedding in an HTTP Service - -```go -package main - -import ( - "context" - "encoding/json" - "net/http" - "github.com/aixgo-dev/aixgo" - "github.com/aixgo-dev/aixgo/agent" -) - -type AgentService struct { - runtime agent.Runtime -} - -func NewAgentService() *AgentService { - rt := aixgo.NewRuntime() - rt.Register(NewAnalyzerAgent("analyzer")) - - ctx := context.Background() - go rt.Start(ctx) - - return &AgentService{runtime: rt} -} - -func (s *AgentService) HandleAnalyze(w http.ResponseWriter, r *http.Request) { - var req struct { - Text string `json:"text"` - } - if err := json.NewDecoder(r.Body).Decode(&req); err != nil { - http.Error(w, "invalid request body", http.StatusBadRequest) - return - } - - input := agent.NewMessage("analyze", req).
- WithMetadata("request_id", r.Header.Get("X-Request-ID")) - - response, err := s.runtime.Call(r.Context(), "analyzer", input) - if err != nil { - http.Error(w, err.Error(), http.StatusInternalServerError) - return - } - - w.Header().Set("Content-Type", "application/json") - w.Write(response.MarshalPayload()) -} -``` - -### Wrapping Existing Code - -Migrate existing logic incrementally by wrapping it in an agent: - -```go -// Your existing service -type LegacyProcessor struct { - // existing fields -} - -func (p *LegacyProcessor) Process(data string) (string, error) { - // existing logic - return "processed: " + data, nil -} - -// Wrap it in an agent -type LegacyWrapper struct { - processor *LegacyProcessor - name string - ready bool -} - -func (w *LegacyWrapper) Name() string { return w.name } -func (w *LegacyWrapper) Role() string { return "legacy-processor" } -func (w *LegacyWrapper) Ready() bool { return w.ready } - -func (w *LegacyWrapper) Start(ctx context.Context) error { - w.ready = true - <-ctx.Done() - return nil -} - -func (w *LegacyWrapper) Execute(ctx context.Context, input *agent.Message) (*agent.Message, error) { - var req struct { - Data string `json:"data"` - } - input.UnmarshalPayload(&req) - - result, err := w.processor.Process(req.Data) - if err != nil { - return nil, err - } - - return agent.NewMessage("result", map[string]string{"output": result}), nil -} - -func (w *LegacyWrapper) Stop(ctx context.Context) error { - w.ready = false - return nil -} -``` - -## Building an Orchestrator - -Create complex workflows by composing agents: - -```go -type WorkflowOrchestrator struct { - runtime agent.Runtime - name string - ready bool -} - -func (o *WorkflowOrchestrator) Execute(ctx context.Context, input *agent.Message) (*agent.Message, error) { - // Step 1: Validate input - validated, err := o.runtime.Call(ctx, "validator", input) - if err != nil { - return nil, fmt.Errorf("validation failed: %w", err) - } - - // Step 2: Process in parallel - targets 
:= []string{"processor-1", "processor-2", "processor-3"} - results, errors := o.runtime.CallParallel(ctx, targets, validated) - - // Check for errors - for name, err := range errors { - if err != nil { - return nil, fmt.Errorf("processor %s failed: %w", name, err) - } - } - - // Step 3: Aggregate results - aggregateInput := agent.NewMessage("aggregate", results) - return o.runtime.Call(ctx, "aggregator", aggregateInput) -} -``` - -## Testing Agents - -```go -package myagent_test - -import ( - "context" - "testing" - "github.com/aixgo-dev/aixgo" - "github.com/aixgo-dev/aixgo/agent" -) - -func TestAnalyzerAgent(t *testing.T) { - // Create agent - analyzer := NewAnalyzerAgent("test-analyzer") - analyzer.ready = true - - // Create input - input := agent.NewMessage("analyze", map[string]string{ - "text": "Hello, World!", - }) - - // Execute - ctx := context.Background() - response, err := analyzer.Execute(ctx, input) - if err != nil { - t.Fatalf("Execute failed: %v", err) - } - - // Verify response - var result struct { - WordCount int `json:"word_count"` - } - if err := response.UnmarshalPayload(&result); err != nil { - t.Fatalf("Unmarshal failed: %v", err) - } - - if result.WordCount == 0 { - t.Error("Expected non-zero word count") - } -} - -func TestWithRuntime(t *testing.T) { - ctx := context.Background() - rt := aixgo.NewRuntime() - - // Register test agents - rt.Register(NewAnalyzerAgent("analyzer")) - - // Start runtime - go rt.Start(ctx) - defer rt.Stop(ctx) - - // Test call - input := agent.NewMessage("test", map[string]string{"text": "test"}) - _, err := rt.Call(ctx, "analyzer", input) - if err != nil { - t.Fatalf("Call failed: %v", err) - } -} -``` - -## Comparison: Framework vs. Library - -| Feature | Full Framework | Public Package | -|---------|----------------|----------------| -| Built-in agent types (ReAct, Classifier, etc.) 
| Yes | No | -| LLM provider abstraction | Yes | No | -| MCP integration | Yes | No | -| Orchestration patterns | Yes | Build your own | -| Observability (OpenTelemetry, Langfuse) | Yes | Add manually | -| Dependencies | Full framework | Only uuid | -| Use case | Standalone AI apps | Integration into existing services | - -## Migrating to Full Framework - -When you're ready for advanced features, migration is straightforward: - -```go -// Using public interfaces -import "github.com/aixgo-dev/aixgo/agent" - -// Add full framework capabilities -import ( - "github.com/aixgo-dev/aixgo/agent" - "github.com/aixgo-dev/aixgo/pkg/llm" - "github.com/aixgo-dev/aixgo/pkg/agents" -) - -// Your custom agents continue to work -rt := aixgo.NewRuntime() -rt.Register(yourCustomAgent) - -// Add framework agents alongside -reactAgent := agents.NewReActAgent(config) -rt.Register(reactAgent) -``` - -## Best Practices - -1. **Keep agents focused**: Each agent should have a single responsibility -1. **Use metadata for tracing**: Add correlation IDs and request context to messages -1. **Handle context cancellation**: Always respect `ctx.Done()` in long-running operations -1. **Test in isolation**: Test agents independently before integrating with the runtime -1. 
**Graceful shutdown**: Always call `runtime.Stop()` to clean up resources - -## Next Steps - -- [Core Concepts](/guides/core-concepts) - Understand agent fundamentals -- [Multi-Agent Orchestration](/guides/multi-agent-orchestration) - Advanced coordination patterns -- [Extending Aixgo](/guides/extending-aixgo) - Add custom LLM providers and vector stores diff --git a/web/content/guides/validation-with-retry.md b/web/content/guides/validation-with-retry.md deleted file mode 100644 index 9efc714..0000000 --- a/web/content/guides/validation-with-retry.md +++ /dev/null @@ -1,695 +0,0 @@ ---- -title: 'Validation with Automatic Retry' -description: 'Pydantic AI-style automatic validation retry for 40-70% improved structured output reliability' -breadcrumb: 'Validation Retry' -category: 'LLM Integration' -weight: 15 ---- - -Aixgo provides **Pydantic AI-style automatic validation retry**, a powerful feature that dramatically improves the reliability of structured data extraction from LLMs. - -**Working Example**: See [pydantic-style-validation](https://github.com/aixgo-dev/aixgo/tree/main/examples/pydantic-style-validation) for a complete implementation with mock and real LLM provider examples. - -## Overview - -### The Problem - -LLMs are powerful but imperfect. When extracting structured data, they often: -- Omit required fields -- Return incorrect data types -- Violate validation constraints -- Produce malformed output - -Traditional approaches fail immediately on validation errors, requiring manual retry logic and increasing development complexity. - -### The Solution - -Aixgo's validation retry feature automatically: -1. **Detects** validation failures -2. **Constructs** retry prompts with validation errors -3. **Requests** corrections from the LLM -4. **Validates** the corrected output -5. **Returns** valid data or a clear error after max retries - -This is **enabled by default** with `MaxRetries=3`, providing Pydantic AI-style reliability out-of-the-box. 
- -### Benefits - -- **40-70% improvement** in structured output reliability -- **Zero configuration** required (works automatically) -- **Type-safe** using Go generics -- **Opt-out support** for performance-critical scenarios -- **Works with all agents** and providers - -## Quick Start - -### Basic Usage - -```go -package main - -import ( - "context" - "fmt" - "log" - - "github.com/aixgo-dev/aixgo/internal/llm" - "github.com/aixgo-dev/aixgo/internal/llm/provider" -) - -type User struct { - Name string `json:"name" validate:"required"` - Email string `json:"email" validate:"required,email"` - Age int `json:"age" validate:"gte=0,lte=150"` -} - -func main() { - ctx := context.Background() - - // Get provider - prov, err := provider.Get("openai") - if err != nil { - log.Fatalf("Failed to get provider: %v", err) - } - - // Create client - validation retry is AUTOMATIC! - client := llm.NewClient(prov, llm.ClientConfig{ - DefaultModel: "gpt-4", - // MaxRetries defaults to 3 - no configuration needed - }) - - // Extract data - automatic retry on validation failure - user, err := llm.CreateStructured[User]( - ctx, - client, - "Extract user: John Smith is 30", - nil, - ) - - if err != nil { - log.Fatalf("Failed after retries: %v", err) - } - - fmt.Printf("Success: %+v\n", user) -} -``` - -### What Happens Behind the Scenes - -When you call `CreateStructured`, Aixgo automatically handles validation failures: - -**Attempt 1**: LLM returns incomplete data -```json -{"name": "John Smith", "age": 30} -``` -Validation fails: missing required field `email` - -**Automatic Retry**: Aixgo sends validation feedback to the LLM -```text -Your previous response did not pass validation: - -Field validation for 'Email' failed on the 'required' tag - -Please correct the issues and provide a valid response that matches all requirements. 
-``` - -**Attempt 2**: LLM corrects the issue -```json -{"name": "John Smith", "email": "john.smith@example.com", "age": 30} -``` -Validation succeeds - result returned to your application - -## How It Works - -### Architecture - -```text -┌─────────────────────────────────────────────────────────────┐ -│ Your Code: llm.CreateStructured[T](...) │ -└────────────┬────────────────────────────────────────────────┘ - │ - v -┌─────────────────────────────────────────────────────────────┐ -│ LLM Client Layer (internal/llm/client.go) │ -│ - Manages retry loop (up to MaxRetries attempts) │ -│ - Constructs feedback messages │ -└────────────┬────────────────────────────────────────────────┘ - │ - v -┌─────────────────────────────────────────────────────────────┐ -│ Provider Layer (internal/llm/provider/) │ -│ - Calls LLM API │ -│ - Returns structured response │ -└────────────┬────────────────────────────────────────────────┘ - │ - v -┌─────────────────────────────────────────────────────────────┐ -│ Validator Layer (internal/llm/validator/) │ -│ - Validates struct tags │ -│ - Returns validation errors if any │ -└─────────────────────────────────────────────────────────────┘ -``` - -### Retry Loop Logic - -```go -for attempt := 0; attempt < maxRetries; attempt++ { - // 1. Call LLM - response := provider.CreateStructured(ctx, messages) - - // 2. Validate response - result, validationErr := validator.Validate[T](response.Data) - - // 3. Success! - if validationErr == nil { - return result, nil - } - - // 4. Last attempt failed - return error - if attempt == maxRetries-1 { - return nil, fmt.Errorf("validation failed after %d attempts: %w", - maxRetries, validationErr) - } - - // 5. 
Construct retry prompt with validation errors - feedback := formatValidationFeedback(validationErr) - messages = append(messages, - Message{Role: "assistant", Content: response.Content}, - Message{Role: "user", Content: feedback}, - ) -} -``` - -## Configuration - -### ClientConfig Options - -```go -type ClientConfig struct { - DefaultModel string - - // MaxRetries for validation failures (default: 3) - // Set to 1 to disable retry - MaxRetries int - - // DisableValidationRetry disables automatic retry - // When true, validation errors fail immediately - DisableValidationRetry bool - - // StrictValidation enables strict type checking - // No type coercion (e.g., "42" won't become int 42) - StrictValidation bool -} -``` - -### Default Behavior - -```go -// Default: MaxRetries=3, retry enabled -client := llm.NewClient(provider, llm.ClientConfig{ - DefaultModel: "gpt-4", -}) -// ✅ Automatic retry with up to 3 attempts -``` - -### Disable Retry (Opt-Out) - -#### Option 1: Use DisableValidationRetry Flag - -```go -client := llm.NewClient(provider, llm.ClientConfig{ - DefaultModel: "gpt-4", - DisableValidationRetry: true, // Fail immediately on validation error -}) -``` - -#### Option 2: Set MaxRetries to 1 - -```go -client := llm.NewClient(provider, llm.ClientConfig{ - DefaultModel: "gpt-4", - MaxRetries: 1, // Single attempt, no retry -}) -``` - -### Custom Retry Count - -```go -client := llm.NewClient(provider, llm.ClientConfig{ - DefaultModel: "gpt-4", - MaxRetries: 5, // Allow up to 5 attempts for complex schemas -}) -``` - -## Use Cases - -### Use Case 1: User Data Extraction - -```go -type User struct { - Name string `json:"name" validate:"required,min=1,max=100"` - Email string `json:"email" validate:"required,email"` - Phone string `json:"phone" validate:"omitempty,e164"` // Optional, but must be valid E.164 if present - Age int `json:"age" validate:"required,gte=0,lte=150"` - Country string `json:"country" validate:"required,iso3166_1_alpha2"` // ISO country 
code -} - -// LLM might initially miss fields or use invalid formats -// Auto-retry ensures all required fields are present and valid -user, err := llm.CreateStructured[User](ctx, client, prompt, nil) -``` - -### Use Case 2: API Response Parsing - -```go -type APIResponse struct { - Status string `json:"status" validate:"required,oneof=success error pending"` - Message string `json:"message" validate:"required,min=1"` - Code int `json:"code" validate:"required,gte=100,lte=599"` // HTTP status codes - Data any `json:"data"` - Metadata Metadata `json:"metadata" validate:"required"` -} - -type Metadata struct { - RequestID string `json:"request_id" validate:"required,uuid"` - Timestamp int64 `json:"timestamp" validate:"required,gt=0"` -} - -// Complex nested validation with auto-retry -// If the LLM omits metadata or uses invalid values, it will be retried -response, err := llm.CreateStructured[APIResponse](ctx, client, prompt, nil) -``` - -## Validating Array Length - -Go's validator tags don't support `minItems` for slices. Use the `Validatable` interface for custom array validation. - -### The Problem - -LLMs frequently return empty arrays when they shouldn't: - -- "Extract data collection methods" → `{"data_collection": []}` -- "List product features" → `{"features": []}` -- "Find security risks" → `{"risks": []}` - -This is a common failure mode that degrades data quality and requires explicit handling. 
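This failure is cheap to detect mechanically, before any retry logic gets involved. A minimal stdlib-only sketch of the check (the `ExtractionResult` struct and `checkNonEmpty` helper are hypothetical names for illustration, not framework APIs):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExtractionResult mirrors the shape an LLM is asked to return.
// Hypothetical struct, for illustration only.
type ExtractionResult struct {
	DataCollection []string `json:"data_collection"`
}

// checkNonEmpty rejects an otherwise well-formed response whose
// array came back empty -- the exact failure described above.
func checkNonEmpty(raw []byte) error {
	var r ExtractionResult
	if err := json.Unmarshal(raw, &r); err != nil {
		return fmt.Errorf("invalid JSON: %w", err)
	}
	if len(r.DataCollection) == 0 {
		return fmt.Errorf("data_collection cannot be empty")
	}
	return nil
}

func main() {
	fmt.Println(checkNonEmpty([]byte(`{"data_collection": []}`)))           // error
	fmt.Println(checkNonEmpty([]byte(`{"data_collection": ["interviews"]}`))) // <nil>
}
```

The next section shows how Aixgo packages exactly this kind of check behind the `Validatable` interface so the retry loop can act on the error.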
- -### The Solution - -Implement the `Validatable` interface with custom validation: - -```go -type DataCollection struct { - Items []string `json:"items" validate:"required"` -} - -func (d DataCollection) Validate() error { - if len(d.Items) == 0 { - return fmt.Errorf("items array cannot be empty - at least one item required") - } - return nil -} - -// Use with automatic retry -result, err := llm.CreateStructured[DataCollection](ctx, client, prompt, nil) -// Framework automatically retries if validation fails -``` - -### Automatic Retry Feedback - -When validation fails, the LLM receives detailed feedback: - -```text -Your previous response did not pass validation: - -items array cannot be empty - at least one item required - -Please re-read the document and extract all relevant items. -If truly not found, use: ["Not specified"] -``` - -The retry mechanism feeds this error message back to the LLM, prompting it to correct the issue. This typically resolves 60-80% of empty array problems automatically. - -### Reusable Pattern - -Create a generic helper for application code (not provided by framework): - -```go -type NonEmptySlice[T any] []T - -func (s NonEmptySlice[T]) Validate() error { - if len(s) == 0 { - return fmt.Errorf("slice cannot be empty") - } - return nil -} - -// Usage -type Response struct { - Items NonEmptySlice[Item] `json:"items"` -} -``` - -## Comprehensive Validation Tags Reference - -Aixgo uses the [go-playground/validator](https://github.com/go-playground/validator) library, which supports extensive validation tags. 
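To make the reference below concrete, here is a toy, stdlib-only illustration of how tag-driven validation works in principle: a single `required` rule read from struct tags via reflection. This sketches the mechanism only; it is not go-playground/validator's implementation, which has a full rule registry behind the same idea.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// validateRequired walks a struct's fields and fails on any field
// whose `validate` tag contains "required" but whose value is the
// zero value. v must be a struct (passed by value).
func validateRequired(v any) error {
	rv := reflect.ValueOf(v)
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		field := rt.Field(i)
		tag := field.Tag.Get("validate")
		for _, rule := range strings.Split(tag, ",") {
			if rule == "required" && rv.Field(i).IsZero() {
				return fmt.Errorf("field %q failed on the %q rule", field.Name, "required")
			}
		}
	}
	return nil
}

type User struct {
	Name  string `validate:"required"`
	Email string `validate:"omitempty"`
}

func main() {
	fmt.Println(validateRequired(User{}))            // error: Name is required
	fmt.Println(validateRequired(User{Name: "Ada"})) // <nil>
}
```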
- -### Required and Optional Fields - -```go -type User struct { - Name string `json:"name" validate:"required"` // Must be present - Email string `json:"email" validate:"omitempty"` // Optional field -} -``` - -### Numeric Constraints - -```go -type Product struct { - Price float64 `json:"price" validate:"gte=0"` // Greater than or equal - Quantity int `json:"quantity" validate:"gt=0,lte=100"` // Greater than 0, less than or equal to 100 - Rating float64 `json:"rating" validate:"min=1,max=5"` // Between 1 and 5 - Age int `json:"age" validate:"gte=0,lte=150"` // 0 to 150 -} -``` - -### String Constraints - -```go -type User struct { - Username string `json:"username" validate:"required,min=3,max=20"` // Length between 3-20 - Bio string `json:"bio" validate:"max=500"` // Max 500 characters - Code string `json:"code" validate:"len=6"` // Exactly 6 characters -} -``` - -### Enumeration (oneof) - -```go -type Order struct { - Status string `json:"status" validate:"required,oneof=pending approved rejected"` - Type string `json:"type" validate:"oneof=standard express overnight"` -} -``` - -### Format Validation - -```go -type Contact struct { - Email string `json:"email" validate:"required,email"` // RFC 5322 email - URL string `json:"url" validate:"omitempty,url"` // Valid URL - UUID string `json:"uuid" validate:"required,uuid"` // Valid UUID - Phone string `json:"phone" validate:"omitempty,e164"` // E.164 phone format - Country string `json:"country" validate:"iso3166_1_alpha2"` // ISO country code -} -``` - -### Nested Validation (dive) - -```go -type Company struct { - Employees []Employee `json:"employees" validate:"required,dive"` -} - -type Employee struct { - Name string `json:"name" validate:"required,min=1"` - Email string `json:"email" validate:"required,email"` -} - -// The "dive" tag validates each element in the slice -``` - -### Combining Tags - -```go -type User struct { - // Multiple constraints combined - Email string `json:"email" 
validate:"required,email,min=5,max=100"` - - // Optional but must be valid if present - Website string `json:"website" validate:"omitempty,url"` - - // Complex numeric constraints - Age int `json:"age" validate:"required,gte=18,lte=120"` -} -``` - -### When to Use Struct Tags vs Validatable Interface - -**Use struct tags when:** - -- Validation is simple and supported by standard tags -- Field-level constraints are sufficient -- No cross-field validation needed -- No complex custom logic required - -**Use Validatable interface when:** - -- Array length validation needed (`minItems`, `maxItems`) -- Cross-field validation required (e.g., `end_date > start_date`) -- Complex business logic -- Conditional validation based on other fields -- Custom error messages with context - -```go -// Example: When you need both -type Order struct { - Items []Item `json:"items" validate:"required,dive"` // Struct tag for nested validation - StartDate time.Time `json:"start_date" validate:"required"` - EndDate time.Time `json:"end_date" validate:"required"` -} - -// Validatable for cross-field logic -func (o Order) Validate() error { - if len(o.Items) == 0 { - return fmt.Errorf("items array cannot be empty") - } - if o.EndDate.Before(o.StartDate) { - return fmt.Errorf("end_date must be after start_date") - } - return nil -} -``` - -## Cross-Field Validation - -When validation depends on multiple fields, implement the `Validatable` interface. 
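In principle the dispatch is a one-line type assertion: after struct-tag checks pass, the decoded value is checked for the interface and its cross-field rules run. The sketch below assumes the interface shape shown earlier in this guide (a single `Validate() error` method); `runCustomValidation` and `DateRange` are hypothetical names, not framework source.

```go
package main

import "fmt"

// Validatable matches the interface shape used throughout this guide.
type Validatable interface {
	Validate() error
}

// runCustomValidation sketches the step that runs after tag
// validation: if the value implements Validatable, its cross-field
// rules are applied, and any error would trigger a retry.
func runCustomValidation(v any) error {
	if cv, ok := v.(Validatable); ok {
		return cv.Validate()
	}
	return nil // no custom rules defined for this type
}

// DateRange is a hypothetical type with one cross-field rule.
type DateRange struct {
	Start, End int // e.g. Unix timestamps
}

func (r DateRange) Validate() error {
	if r.End < r.Start {
		return fmt.Errorf("end must not be before start")
	}
	return nil
}

func main() {
	fmt.Println(runCustomValidation(DateRange{Start: 10, End: 5}))  // error
	fmt.Println(runCustomValidation(DateRange{Start: 10, End: 20})) // <nil>
}
```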
- -### Date Range Validation - -```go -type Event struct { - StartDate time.Time `json:"start_date" validate:"required"` - EndDate time.Time `json:"end_date" validate:"required"` -} - -func (e Event) Validate() error { - if e.EndDate.Before(e.StartDate) { - return fmt.Errorf("end_date must be after start_date") - } - return nil -} -``` - -### Conditional Required Fields - -```go -type Payment struct { - Method string `json:"method" validate:"required,oneof=credit_card bank_transfer"` - CardNumber string `json:"card_number" validate:"omitempty"` - BankAccount string `json:"bank_account" validate:"omitempty"` -} - -func (p Payment) Validate() error { - if p.Method == "credit_card" && p.CardNumber == "" { - return fmt.Errorf("card_number required when method is credit_card") - } - if p.Method == "bank_transfer" && p.BankAccount == "" { - return fmt.Errorf("bank_account required when method is bank_transfer") - } - return nil -} -``` - -### Mutually Exclusive Fields - -```go -type Search struct { - Keyword string `json:"keyword" validate:"omitempty"` - TagID string `json:"tag_id" validate:"omitempty"` -} - -func (s Search) Validate() error { - hasKeyword := s.Keyword != "" - hasTagID := s.TagID != "" - - if !hasKeyword && !hasTagID { - return fmt.Errorf("either keyword or tag_id must be provided") - } - if hasKeyword && hasTagID { - return fmt.Errorf("keyword and tag_id are mutually exclusive") - } - return nil -} -``` - -### Sum Validation - -```go -type Budget struct { - Total float64 `json:"total" validate:"required,gt=0"` - Categories []float64 `json:"categories" validate:"required,dive,gte=0"` -} - -func (b Budget) Validate() error { - sum := 0.0 - for _, amount := range b.Categories { - sum += amount - } - - if math.Abs(sum-b.Total) > 0.01 { - return fmt.Errorf("category sum (%.2f) must equal total (%.2f)", sum, b.Total) - } - return nil -} -``` - -## Best Practices - -### 1. 
Use Descriptive Validation Tags - -**Good:** -```go -type User struct { - Email string `json:"email" validate:"required,email"` - Age int `json:"age" validate:"required,gte=0,lte=150"` -} -``` - -**Better:** -```go -// Also provide clear field documentation -type User struct { - // Email must be a valid email address (required) - Email string `json:"email" validate:"required,email"` - - // Age must be between 0 and 150 (required) - Age int `json:"age" validate:"required,gte=0,lte=150"` -} -``` - -### 2. Provide Explicit System Prompts - -```go -result, err := llm.CreateStructured[User](ctx, client, userPrompt, &llm.CreateOptions{ - SystemPrompt: `You are a data extraction assistant. - -Extract user information and return it as JSON with these REQUIRED fields: -- name: full name (string, 1-100 characters) -- email: valid email address (string, RFC 5322 format) -- age: age in years (integer, 0-150) -- city: city of residence (string, 1-100 characters) - -All fields are REQUIRED. If information is missing, make reasonable assumptions or ask for clarification.`, -}) -``` - -### 3. Set Reasonable MaxRetries - -```go -// Simple schema: 3 retries (default) -client := llm.NewClient(provider, llm.ClientConfig{ - DefaultModel: "gpt-4", - MaxRetries: 3, // Good for most cases -}) - -// Complex nested schema: more retries -client := llm.NewClient(provider, llm.ClientConfig{ - DefaultModel: "gpt-4", - MaxRetries: 5, // More attempts for complex validation -}) - -// Performance-critical: disable retry -client := llm.NewClient(provider, llm.ClientConfig{ - DefaultModel: "gpt-4", - DisableValidationRetry: true, // Speed over reliability -}) -``` - -## Troubleshooting - -### Validation Still Fails After Retries - -**Problem**: Error message shows "validation failed after 3 attempts" - -**Solutions**: - -1. 
**Check validation tags are achievable** - ```go - // Bad: Too restrictive - Email string `validate:"required,email,endswith=@company.com"` - - // Good: Reasonable - Email string `validate:"required,email"` - ``` - -2. **Improve system prompt clarity** - ```go - // Bad: Vague - SystemPrompt: "Extract user data" - - // Good: Explicit - SystemPrompt: `Extract user data as JSON with: - - name: string (required) - - email: valid email (required) - - age: number 0-150 (required)` - ``` - -3. **Increase MaxRetries** - ```go - MaxRetries: 7, // More attempts for complex schemas - ``` - -4. **Use better models** - ```go - DefaultModel: "gpt-4", // Better than gpt-3.5-turbo - ``` - -### Performance Issues - -**Problem**: Requests are too slow - -**Solutions**: - -1. **Reduce MaxRetries** - ```go - MaxRetries: 2, // Faster but less reliable - ``` - -2. **Disable retry for non-critical data** - ```go - DisableValidationRetry: true, // Speed over reliability - ``` - -3. **Use faster models** - ```go - DefaultModel: "gpt-3.5-turbo", // Faster but less accurate - ``` - -4. 
**Optimize prompts to reduce failures** - - Provide examples in system prompt - - Use few-shot prompting - - Simplify schema complexity - -## Related Documentation - -- [Validation Tags Reference](https://pkg.go.dev/github.com/go-playground/validator/v10) -- [Pydantic AI Inspiration](https://ai.pydantic.dev/) -- [Example: Pydantic-Style Validation](https://github.com/aixgo-dev/aixgo/tree/main/examples/pydantic-style-validation/) - -## See Also - -- [LLM Provider Integration](/guides/provider-integration) -- [Type Safety](/guides/type-safety) -- [Core Concepts](/guides/core-concepts) diff --git a/web/content/guides/vector-databases.md b/web/content/guides/vector-databases.md deleted file mode 100644 index e3aa825..0000000 --- a/web/content/guides/vector-databases.md +++ /dev/null @@ -1,1439 +0,0 @@ ---- -title: 'Vector Databases in Aixgo' -description: 'Complete guide to building RAG systems with vector databases and embeddings' -category: 'RAG & Embeddings' -weight: 4 ---- - -# Vector Databases in Aixgo - -This guide covers everything you need to build production-ready Retrieval-Augmented Generation (RAG) systems using Aixgo's vector database and embeddings integration. - -## Overview - -Vector databases enable semantic search by storing high-dimensional embeddings alongside your data. Combined with LLMs, they power RAG systems that reduce hallucinations and -provide domain-specific knowledge. - -### What You'll Learn - -- Vector database fundamentals and the Collection-based architecture -- Choosing the right embedding provider -- Implementing semantic search and RAG systems -- 10 powerful use cases: caching, agent memory, conversations, and more -- Production deployment strategies with Firestore -- Performance optimization and best practices -- Migrating from the old API -- Troubleshooting common issues - -## Understanding Vector Databases - -### What are Embeddings? - -Embeddings are numerical representations of text that capture semantic meaning. 
Similar texts produce similar vectors, enabling: - -- **Semantic Search**: Find by meaning, not keywords -- **Similarity Matching**: Compare documents for relevance -- **Clustering**: Group related content -- **Recommendations**: Suggest similar items - -**Example:** - -```text -"dog" → [0.2, 0.8, 0.1, ...] -"puppy" → [0.21, 0.79, 0.11, ...] (similar vector) -"car" → [0.9, 0.1, 0.05, ...] (different vector) -``` - -### How RAG Works - -```text -┌──────────────────────────────────────────────────┐ -│ 1. INDEXING (One-time) │ -│ │ -│ Documents → Embeddings → Vector Database │ -└──────────────────────────────────────────────────┘ - -┌──────────────────────────────────────────────────┐ -│ 2. RETRIEVAL (Per query) │ -│ │ -│ Query → Embedding → Similarity Search │ -│ → Top K Documents │ -└──────────────────────────────────────────────────┘ - -┌──────────────────────────────────────────────────┐ -│ 3. GENERATION (Per query) │ -│ │ -│ Retrieved Context + Query → LLM → Response │ -└──────────────────────────────────────────────────┘ -``` - -## Quick Start - -### Minimal Example - -```go -package main - -import ( - "context" - "fmt" - "log" - - "github.com/aixgo-dev/aixgo/pkg/embeddings" - "github.com/aixgo-dev/aixgo/pkg/vectorstore" - "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory" -) - -func main() { - ctx := context.Background() - - // 1. Setup embeddings (free HuggingFace) - embConfig := embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "sentence-transformers/all-MiniLM-L6-v2", - }, - } - embSvc, _ := embeddings.New(embConfig) - defer embSvc.Close() - - // 2. Setup vector store (in-memory for testing) - store, _ := memory.New() - defer store.Close() - - // 3. Create a collection for documents - docs := store.Collection("documents") - - // 4. 
Index documents - contents := []string{ - "Aixgo is a production-grade AI framework for Go", - "Go is a programming language created at Google", - "Vector databases enable semantic search", - } - - for i, content := range contents { - emb, _ := embSvc.Embed(ctx, content) - doc := &vectorstore.Document{ - ID: fmt.Sprintf("doc-%d", i), - Content: vectorstore.NewTextContent(content), - Embedding: vectorstore.NewEmbedding(emb, "all-MiniLM-L6-v2"), - } - docs.Upsert(ctx, doc) - } - - // 5. Search - query := "What is Aixgo?" - queryEmb, _ := embSvc.Embed(ctx, query) - - result, _ := docs.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, "all-MiniLM-L6-v2"), - Limit: 3, - }) - - // 6. Display results - for _, match := range result.Matches { - fmt.Printf("Score: %.2f - %s\n", match.Score, match.Document.Content.String()) - } -} -``` - -## Collection-Based Architecture - -Aixgo's vector store uses a **Collection-based architecture** that provides logical isolation for different use cases. Each collection can have its own configuration for TTL, -deduplication, scoping, and capacity limits. - -### Core Concepts - -**Collections**: Logical containers for related documents with shared configuration. - -```go -// Create collections with different purposes -cache := store.Collection("cache", - vectorstore.WithTTL(5*time.Minute), - vectorstore.WithDeduplication(true), -) - -memory := store.Collection("agent-memory", - vectorstore.WithScope("user", "session"), - vectorstore.WithMaxDocuments(1000), -) - -docs := store.Collection("documents", - vectorstore.WithMaxDocuments(100000), -) -``` - -**Documents**: Enhanced with multi-modal content, scoping, temporal data, and tags. 
- -```go -doc := &vectorstore.Document{ - ID: "doc1", - Content: vectorstore.NewTextContent("text content"), - Embedding: vectorstore.NewEmbedding(emb, "model-name"), - Scope: vectorstore.NewScope("tenant1", "user123", "session456"), - Tags: []string{"important", "verified"}, - Temporal: vectorstore.NewTemporalWithTTL(24*time.Hour), - Metadata: map[string]any{"source": "api"}, -} -``` - -**Queries**: Advanced filtering with scope, tags, temporal conditions, and sorting. - -```go -result, _ := docs.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(emb, "model"), - Limit: 10, - MinScore: 0.7, - Filters: vectorstore.And( - vectorstore.UserFilter("user123"), - vectorstore.TagFilter("important"), - vectorstore.CreatedAfter(time.Now().Add(-7*24*time.Hour)), - ), - SortBy: []vectorstore.SortBy{ - vectorstore.SortByScore(), - vectorstore.SortByCreatedAt(true), - }, -}) -``` - -## 10 Powerful Use Cases - -### 1. Semantic Caching - -Cache LLM responses to reduce costs and latency. 
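Under the hood, a semantic cache hit is a nearest-neighbor lookup with a high similarity floor rather than an exact key match. The store computes similarity for you; this stdlib-only sketch (hypothetical `cosine` and `cacheLookup` helpers) just shows the idea behind a high `MinScore` threshold.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// cacheLookup returns the index of the best cached embedding scoring
// at or above minScore, or -1 for a cache miss.
func cacheLookup(query []float64, cached [][]float64, minScore float64) int {
	best, bestScore := -1, minScore
	for i, emb := range cached {
		if s := cosine(query, emb); s >= bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

func main() {
	cached := [][]float64{
		{0.9, 0.1, 0.0}, // e.g. "What is the capital of France?"
		{0.0, 0.2, 0.9}, // unrelated entry
	}
	query := []float64{0.88, 0.12, 0.01} // a close paraphrase
	fmt.Println(cacheLookup(query, cached, 0.98)) // prints 0 (hit on the first entry)
}
```

A strict floor like 0.98 trades hit rate for safety: only near-identical questions reuse a cached answer.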
- -```go -// Create cache collection with TTL -cache := store.Collection("llm-cache", - vectorstore.WithTTL(5*time.Minute), - vectorstore.WithDeduplication(true), - vectorstore.WithMaxDocuments(10000), -) - -// Cache a query result -cacheDoc := &vectorstore.Document{ - ID: "query-hash-123", - Content: vectorstore.NewTextContent("What is the capital of France?"), - Embedding: vectorstore.NewEmbedding(queryEmb, "text-embedding-3-small"), - Temporal: vectorstore.NewTemporalWithTTL(5 * time.Minute), - Tags: []string{"qa", "geography"}, - Metadata: map[string]any{ - "answer": "Paris", - "model": "gpt-4", - }, -} -cache.Upsert(ctx, cacheDoc) - -// Lookup cached result by similarity -query := &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, "text-embedding-3-small"), - Limit: 1, - MinScore: 0.98, // High threshold for cache hits -} - -result, _ := cache.Query(ctx, query) -if result.HasMatches() { - answer := result.TopMatch().Document.Metadata["answer"] - fmt.Printf("Cache hit! Answer: %s\n", answer) -} -``` - -### 2. Agent Memory with Scope Isolation - -Store agent memories with multi-tenant isolation. 
- -```go -// Create memory collection with scope requirements -memory := store.Collection("agent-memory", - vectorstore.WithScope("tenant", "user", "session"), - vectorstore.WithMaxDocuments(1000), -) - -// Store a memory scoped to tenant, user, and session -memoryDoc := &vectorstore.Document{ - ID: "memory-1", - Content: vectorstore.NewTextContent("User prefers dark mode"), - Embedding: vectorstore.NewEmbedding(emb, "text-embedding-3-small"), - Scope: vectorstore.NewScope("tenant1", "user123", "session456"), - Tags: []string{"preference", "ui"}, -} -memory.Upsert(ctx, memoryDoc) - -// Retrieve memories for specific user -result, _ := memory.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, "text-embedding-3-small"), - Filters: vectorstore.And( - vectorstore.TenantFilter("tenant1"), - vectorstore.UserFilter("user123"), - vectorstore.TagFilter("preference"), - ), - Limit: 5, -}) -``` - -### 3. Conversation History - -Track conversation context across sessions. 
- -```go -conversations := store.Collection("conversations", - vectorstore.WithScope("user", "thread"), - vectorstore.WithMaxDocuments(100000), -) - -// Store a conversation turn -turn := &vectorstore.Document{ - ID: "turn-1", - Content: vectorstore.NewTextContent("User: What's the weather?\nAssistant: It's sunny today."), - Embedding: vectorstore.NewEmbedding(emb, "text-embedding-3-small"), - Scope: &vectorstore.Scope{ - User: "user123", - Thread: "thread-abc", - }, - Temporal: &vectorstore.Temporal{ - CreatedAt: time.Now(), - EventTime: &[]time.Time{time.Now()}[0], - }, - Tags: []string{"weather", "conversation"}, - Metadata: map[string]any{"turn_number": 1}, -} -conversations.Upsert(ctx, turn) - -// Retrieve conversation history chronologically -result, _ := conversations.Query(ctx, &vectorstore.Query{ - Filters: vectorstore.And( - vectorstore.UserFilter("user123"), - vectorstore.Eq("thread", "thread-abc"), - ), - SortBy: []vectorstore.SortBy{ - vectorstore.SortByCreatedAt(false), // Ascending - }, - Limit: 20, -}) -``` - -### 4. Content Deduplication - -Prevent duplicate content in your knowledge base. - -```go -docs := store.Collection("documents", - vectorstore.WithDeduplicationThreshold(0.95), -) - -// Insert multiple documents - duplicates are automatically detected -documents := []*vectorstore.Document{ - { - ID: "doc1", - Content: vectorstore.NewTextContent("The quick brown fox"), - Embedding: vectorstore.NewEmbedding(emb1, "model"), - }, - { - ID: "doc2", - Content: vectorstore.NewTextContent("The quick brown fox"), // Duplicate - Embedding: vectorstore.NewEmbedding(emb1, "model"), - }, -} - -result, _ := docs.UpsertBatch(ctx, documents) -fmt.Printf("Inserted: %d, Deduplicated: %d\n", result.Inserted, result.Deduplicated) -``` - -### 5. Multi-Modal Support - -Store and search images, URLs, and text. 
- -```go -media := store.Collection("media") - -// Store an image with CLIP embeddings -imageDoc := &vectorstore.Document{ - ID: "img1", - Content: vectorstore.NewImageURL("https://example.com/photo.jpg"), - Embedding: vectorstore.NewEmbedding(clipEmb, "clip-vit-base-patch32"), - Tags: []string{"photo", "landscape"}, -} -media.Upsert(ctx, imageDoc) - -// Query with image or text embedding -result, _ := media.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, "clip-vit-base-patch32"), - Filters: vectorstore.TagFilter("photo"), - Limit: 10, -}) -``` - -### 6. Temporal Data with Expiration - -Manage time-based data with automatic cleanup. - -```go -events := store.Collection("events", - vectorstore.WithTTL(7*24*time.Hour), // 7 days -) - -// Store event with TTL -event := &vectorstore.Document{ - ID: "event-1", - Content: vectorstore.NewTextContent("System maintenance scheduled"), - Embedding: vectorstore.NewEmbedding(emb, "model"), - Temporal: vectorstore.NewTemporalWithTTL(24 * time.Hour), - Tags: []string{"maintenance", "scheduled"}, -} -events.Upsert(ctx, event) - -// Query for recent, non-expired events -result, _ := events.Query(ctx, &vectorstore.Query{ - Filters: vectorstore.And( - vectorstore.CreatedAfter(time.Now().Add(-7*24*time.Hour)), - vectorstore.NotExpired(), - vectorstore.TagFilter("scheduled"), - ), - SortBy: []vectorstore.SortBy{ - vectorstore.SortByCreatedAt(true), // Most recent first - }, -}) -``` - -### 7. Multi-Tenancy via Scope - -Isolate data across tenants, users, and sessions. 
- -```go -// Index with tenant isolation -doc := &vectorstore.Document{ - ID: "doc1", - Content: vectorstore.NewTextContent("Sensitive company data"), - Embedding: vectorstore.NewEmbedding(emb, "model"), - Scope: vectorstore.NewScope("acme-corp", "user456", ""), -} -docs.Upsert(ctx, doc) - -// Search within tenant boundary -result, _ := docs.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, "model"), - Filters: vectorstore.And( - vectorstore.TenantFilter("acme-corp"), - vectorstore.UserFilter("user456"), - ), -}) -``` - -### 8. Batch Operations - -Efficiently process large datasets with progress tracking. - -```go -// Create many documents -documents := make([]*vectorstore.Document, 1000) -for i := range documents { - documents[i] = &vectorstore.Document{ - ID: fmt.Sprintf("doc-%d", i), - Content: vectorstore.NewTextContent(fmt.Sprintf("Document %d", i)), - Embedding: vectorstore.NewEmbedding(embs[i], "model"), - } -} - -// Batch insert with progress tracking -result, _ := docs.UpsertBatch(ctx, documents, - vectorstore.WithBatchSize(100), - vectorstore.WithParallelism(4), - vectorstore.WithProgressCallback(func(processed, total int) { - pct := float64(processed) / float64(total) * 100 - fmt.Printf("Progress: %.1f%%\n", pct) - }), -) - -fmt.Printf("Inserted: %d, Failed: %d\n", result.Inserted, result.Failed) -``` - -### 9. Streaming Queries - -Process large result sets efficiently. - -```go -query := &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(emb, "model"), - Limit: 1000, // Large result set -} - -// Stream results to avoid loading all at once -iter, _ := docs.QueryStream(ctx, query) -defer iter.Close() - -count := 0 -for iter.Next() { - match := iter.Match() - if match.Score >= 0.8 { - count++ - // Process high-score match - } -} - -if err := iter.Err(); err != nil { - log.Printf("Stream error: %v", err) -} -``` - -### 10. Advanced Filtering - -Complex queries with boolean logic. 
- -```go -// Find recent, high-rated electronics in stock -result, _ := docs.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(emb, "model"), - Filters: vectorstore.And( - vectorstore.TagFilter("electronics"), - vectorstore.Gte("rating", 4.5), - vectorstore.Eq("in_stock", true), - vectorstore.CreatedAfter(time.Now().Add(-30*24*time.Hour)), - vectorstore.Or( - vectorstore.Contains("category", "phone"), - vectorstore.Contains("category", "laptop"), - ), - ), - SortBy: []vectorstore.SortBy{ - vectorstore.SortByScore(), - vectorstore.SortByField("rating", true), - }, - Limit: 20, -}) -``` - -## Choosing Components - -### Embedding Providers - -#### Embedding Provider Decision Matrix - -| Provider | Cost | Setup | Quality | Speed | Best For | -| ------------------- | ---------------- | ------- | -------------- | --------- | ----------- | -| **HuggingFace API** | Free | None | Good-Excellent | Medium | Development | -| **HuggingFace TEI** | Free (self-host) | Docker | Good-Excellent | Very Fast | Production | -| **OpenAI** | $0.02-0.13/1M | API Key | Excellent | Fast | Production | - -#### HuggingFace Models - -**Popular choices:** - -```go -// Fast, good quality (384 dimensions) -Model: "sentence-transformers/all-MiniLM-L6-v2" - -// Best quality (1024 dimensions) -Model: "BAAI/bge-large-en-v1.5" - -// Multilingual (1024 dimensions) -Model: "thenlper/gte-large" -``` - -**Configuration:** - -```go -config := embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "sentence-transformers/all-MiniLM-L6-v2", - APIKey: os.Getenv("HUGGINGFACE_API_KEY"), // Optional - WaitForModel: true, - UseCache: true, - }, -} -``` - -#### OpenAI Models - -**Available models:** - -```go -// Recommended: Balance of cost and quality -Model: "text-embedding-3-small" // 1536 dims, $0.02/1M tokens - -// Best quality -Model: "text-embedding-3-large" // 3072 dims, $0.13/1M tokens - -// Legacy (still supported) -Model: 
"text-embedding-ada-002" // 1536 dims, $0.10/1M tokens -``` - -**Configuration:** - -```go -config := embeddings.Config{ - Provider: "openai", - OpenAI: &embeddings.OpenAIConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: "text-embedding-3-small", - }, -} -``` - -### Vector Store Providers - -#### Decision Matrix - -| Provider | Persistence | Scalability | Setup | Best For | -| --------------------- | ----------- | -------------- | ------ | ------------------------ | -| **Memory** | No | Low (10K docs) | None | Development, testing | -| **Firestore** | Yes | Unlimited | Medium | Production, serverless | -| **Qdrant** (future) | Yes | Very High | Medium | High-performance search | -| **pgvector** (future) | Yes | High | Medium | Existing PostgreSQL apps | - -#### Memory Store - -**Configuration:** - -```go -import "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory" - -store, _ := memory.New() -defer store.Close() - -// Create collections with different configurations -cache := store.Collection("cache", vectorstore.WithTTL(5*time.Minute)) -docs := store.Collection("docs", vectorstore.WithMaxDocuments(10000)) -``` - -**Use Cases:** - -- Unit tests -- Local development -- Prototyping -- Small datasets - -#### Firestore - -**Configuration:** - -```go -import "github.com/aixgo-dev/aixgo/pkg/vectorstore/firestore" - -store, _ := firestore.New(ctx, - firestore.WithProjectID("my-gcp-project"), - firestore.WithCredentialsFile("/path/to/service-account.json"), -) -defer store.Close() - -// Create collections -docs := store.Collection("documents") -``` - -**Use Cases:** - -- Production deployments -- Serverless architectures -- Firebase-based apps -- Auto-scaling workloads - -## Production Setup - -### Firestore Configuration - -#### 1. 
Create GCP Project - -```bash -# Create project -gcloud projects create my-rag-project -gcloud config set project my-rag-project - -# Enable billing (required) -gcloud beta billing projects link my-rag-project \ - --billing-account=BILLING_ACCOUNT_ID -``` - -#### 2. Enable Firestore - -```bash -# Enable Firestore API -gcloud services enable firestore.googleapis.com - -# Create database -gcloud firestore databases create \ - --location=us-central1 \ - --type=firestore-native -``` - -#### 3. Create Vector Index - -**Critical:** Firestore requires vector indexes for similarity search. - -```bash -# For 384-dimensional embeddings (all-MiniLM-L6-v2) -gcloud firestore indexes composite create \ - --collection-group=documents \ - --query-scope=COLLECTION \ - --field-config=field-path=embedding.vector,vector-config='{"dimension":"384","flat":{}}' \ - --project=my-rag-project - -# For 1536-dimensional embeddings (OpenAI) -gcloud firestore indexes composite create \ - --collection-group=documents \ - --query-scope=COLLECTION \ - --field-config=field-path=embedding.vector,vector-config='{"dimension":"1536","flat":{}}' \ - --project=my-rag-project -``` - -**Check index status:** - -```bash -gcloud firestore indexes composite list --format=table -``` - -#### 4. 
Setup Authentication - -```bash -# Create service account -gcloud iam service-accounts create aixgo-rag \ - --display-name="Aixgo RAG Service" - -# Grant Firestore permissions -gcloud projects add-iam-policy-binding my-rag-project \ - --member="serviceAccount:aixgo-rag@my-rag-project.iam.gserviceaccount.com" \ - --role="roles/datastore.user" - -# Create and download key -gcloud iam service-accounts keys create key.json \ - --iam-account=aixgo-rag@my-rag-project.iam.gserviceaccount.com - -# Set environment variable -export GOOGLE_APPLICATION_CREDENTIALS=$(pwd)/key.json -``` - -### HuggingFace TEI (Self-Hosted) - -For high-throughput production deployments: - -#### Docker Deployment - -```bash -# Run TEI server -docker run -d \ - --name tei \ - -p 8080:8080 \ - -v $PWD/data:/data \ - --gpus all \ - ghcr.io/huggingface/text-embeddings-inference:latest \ - --model-id BAAI/bge-large-en-v1.5 \ - --revision main \ - --max-batch-size 128 -``` - -#### Configuration - -```go -config := embeddings.Config{ - Provider: "huggingface_tei", - HuggingFaceTEI: &embeddings.HuggingFaceTEIConfig{ - Endpoint: "http://localhost:8080", - Model: "BAAI/bge-large-en-v1.5", - Normalize: true, - }, -} -``` - -## Building RAG Systems - -### Complete RAG Implementation - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - "strings" - - "github.com/aixgo-dev/aixgo/pkg/embeddings" - "github.com/aixgo-dev/aixgo/pkg/llm" - "github.com/aixgo-dev/aixgo/pkg/vectorstore" - "github.com/aixgo-dev/aixgo/pkg/vectorstore/firestore" -) - -type RAGSystem struct { - embeddings embeddings.EmbeddingService - vectorDB vectorstore.VectorStore - collection vectorstore.Collection - llm llm.LLM -} - -func NewRAGSystem() (*RAGSystem, error) { - ctx := context.Background() - - // Setup embeddings - embConfig := embeddings.Config{ - Provider: "huggingface", - HuggingFace: &embeddings.HuggingFaceConfig{ - Model: "BAAI/bge-large-en-v1.5", - }, - } - embSvc, err := embeddings.New(embConfig) - if err != nil 
{ - return nil, err - } - - // Setup vector store - store, err := firestore.New(ctx, - firestore.WithProjectID(os.Getenv("GCP_PROJECT_ID")), - ) - if err != nil { - return nil, err - } - - // Create knowledge base collection - knowledgeBase := store.Collection("knowledge_base", - vectorstore.WithDeduplication(true), - vectorstore.WithMaxDocuments(100000), - ) - - // Setup LLM - llmInstance, err := llm.New(llm.Config{ - Provider: "openai", - OpenAI: &llm.OpenAIConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: "gpt-4-turbo-preview", - }, - }) - if err != nil { - return nil, err - } - - return &RAGSystem{ - embeddings: embSvc, - vectorDB: store, - collection: knowledgeBase, - llm: llmInstance, - }, nil -} - -// IndexDocument adds a document to the knowledge base -func (r *RAGSystem) IndexDocument(ctx context.Context, id, content string, metadata map[string]any) error { - // Generate embedding - embedding, err := r.embeddings.Embed(ctx, content) - if err != nil { - return fmt.Errorf("failed to generate embedding: %w", err) - } - - // Create document - doc := &vectorstore.Document{ - ID: id, - Content: vectorstore.NewTextContent(content), - Embedding: vectorstore.NewEmbedding(embedding, "BAAI/bge-large-en-v1.5"), - Metadata: metadata, - Tags: []string{"knowledge-base"}, - } - - // Store in vector DB - _, err = r.collection.Upsert(ctx, doc) - return err -} - -// Query performs RAG: retrieve + generate -func (r *RAGSystem) Query(ctx context.Context, question string, topK int) (string, error) { - // 1. Generate query embedding - queryEmb, err := r.embeddings.Embed(ctx, question) - if err != nil { - return "", fmt.Errorf("failed to embed query: %w", err) - } - - // 2. 
Retrieve relevant documents - result, err := r.collection.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, "BAAI/bge-large-en-v1.5"), - Limit: topK, - MinScore: 0.7, - }) - if err != nil { - return "", fmt.Errorf("search failed: %w", err) - } - - if !result.HasMatches() { - return "I don't have enough information to answer that question.", nil - } - - // 3. Build context from retrieved documents - var contextParts []string - for i, match := range result.Matches { - contextParts = append(contextParts, - fmt.Sprintf("[Document %d] (score: %.2f)\n%s", - i+1, match.Score, match.Document.Content.String())) - } - context := strings.Join(contextParts, "\n\n") - - // 4. Generate response with LLM - prompt := fmt.Sprintf(`Based on the following context, please answer the question. - -Context: -%s - -Question: %s - -Answer:`, context, question) - - response, err := r.llm.Complete(ctx, prompt) - if err != nil { - return "", fmt.Errorf("LLM generation failed: %w", err) - } - - return response, nil -} - -func (r *RAGSystem) Close() error { - r.embeddings.Close() - r.vectorDB.Close() - return nil -} -``` - -### Usage Example - -```go -func main() { - ctx := context.Background() - - // Initialize RAG system - rag, err := NewRAGSystem() - if err != nil { - log.Fatal(err) - } - defer rag.Close() - - // Index documents - documents := map[string]string{ - "doc-1": "Aixgo is a production-grade AI agent framework written in Go.", - "doc-2": "Aixgo supports multiple LLM providers including OpenAI, Anthropic, and Gemini.", - "doc-3": "Vector databases in Aixgo enable semantic search and RAG systems.", - } - - for id, content := range documents { - err := rag.IndexDocument(ctx, id, content, map[string]any{ - "source": "documentation", - }) - if err != nil { - log.Printf("Failed to index %s: %v", id, err) - } - } - - // Query the system - answer, err := rag.Query(ctx, "What LLM providers does Aixgo support?", 3) - if err != nil { - log.Fatal(err) - } - - 
fmt.Println("Answer:", answer) -} -``` - -## Best Practices - -### 1. Document Chunking - -Break large documents into optimal chunks: - -```go -func chunkDocument(text string, chunkSize int, overlap int) []string { - var chunks []string - words := strings.Fields(text) - - for i := 0; i < len(words); i += chunkSize - overlap { - end := i + chunkSize - if end > len(words) { - end = len(words) - } - - chunk := strings.Join(words[i:end], " ") - chunks = append(chunks, chunk) - - if end == len(words) { - break - } - } - - return chunks -} - -// Usage -chunks := chunkDocument(largeDocument, 500, 50) // 500 words, 50 overlap -for i, chunk := range chunks { - emb, _ := embSvc.Embed(ctx, chunk) - doc := &vectorstore.Document{ - ID: fmt.Sprintf("doc-%s-chunk-%d", docID, i), - Content: vectorstore.NewTextContent(chunk), - Embedding: vectorstore.NewEmbedding(emb, "model"), - Metadata: map[string]any{ - "chunk_index": i, - "parent_doc": docID, - }, - } - docs.Upsert(ctx, doc) -} -``` - -**Recommended sizes:** - -- 200-500 words per chunk -- 10-20% overlap between chunks -- Preserve sentence boundaries - -### 2. Metadata and Tags Strategy - -Use metadata and tags for hybrid search: - -```go -doc := &vectorstore.Document{ - ID: "doc1", - Content: vectorstore.NewTextContent("Installation guide for Aixgo..."), - Embedding: vectorstore.NewEmbedding(emb, "model"), - Tags: []string{"getting-started", "setup", "installation"}, - Metadata: map[string]any{ - "doc_type": "user_guide", - "section": "installation", - "version": "2.0", - "language": "en", - "created_at": time.Now().Format(time.RFC3339), - "author": "docs-team", - }, -} - -// Filter during search -result, _ := docs.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, "model"), - Filters: vectorstore.And( - vectorstore.TagFilter("installation"), - vectorstore.Eq("doc_type", "user_guide"), - vectorstore.Eq("version", "2.0"), - ), - Limit: 10, -}) -``` - -### 3. 
Batch Processing
-
-Index efficiently with batch operations:
-
-```go
-func indexDocuments(collection vectorstore.Collection, embSvc embeddings.EmbeddingService, docs []string) error {
-    ctx := context.Background()
-    const batchSize = 100
-
-    for i := 0; i < len(docs); i += batchSize {
-        end := i + batchSize
-        if end > len(docs) {
-            end = len(docs)
-        }
-
-        batch := docs[i:end]
-
-        // Batch embed (named embs to avoid shadowing the embeddings package)
-        embs, err := embSvc.EmbedBatch(ctx, batch)
-        if err != nil {
-            return err
-        }
-
-        // Create documents
-        var vsDocs []*vectorstore.Document
-        for j, content := range batch {
-            vsDocs = append(vsDocs, &vectorstore.Document{
-                ID:        fmt.Sprintf("doc-%d", i+j),
-                Content:   vectorstore.NewTextContent(content),
-                Embedding: vectorstore.NewEmbedding(embs[j], "model"),
-            })
-        }
-
-        // Batch upsert
-        _, err = collection.UpsertBatch(ctx, vsDocs)
-        if err != nil {
-            return err
-        }
-    }
-
-    return nil
-}
-```
-
-### 4. Error Handling and Retries
-
-Implement robust error handling:
-
-```go
-func queryWithRetry(collection vectorstore.Collection, query *vectorstore.Query, maxRetries int) (*vectorstore.QueryResult, error) {
-    var lastErr error
-
-    for i := 0; i < maxRetries; i++ {
-        ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
-        result, err := collection.Query(ctx, query)
-        cancel() // release the timeout now; defer inside a loop would leak until return
-
-        if err == nil {
-            return result, nil
-        }
-
-        lastErr = err
-        if !isRetryable(err) {
-            break
-        }
-
-        // Exponential backoff: 1s, 2s, 4s, ...
-        time.Sleep(time.Duration(1<<i) * time.Second)
-    }
-
-    return nil, lastErr
-}
-```
-
-2. Use batch operations (`EmbedBatch`)
-3. Deploy TEI locally for unlimited requests
-
-#### 4. Low Search Quality
-
-**Debugging steps:**
-
-```go
-// 1. Check similarity scores
-for _, match := range result.Matches {
-    fmt.Printf("Score: %.3f - %s\n", match.Score, match.Document.Content.String())
-}
-
-// 2. Lower MinScore threshold
-query.MinScore = 0.5 // or 0.0 to see all results
-
-// 3. Increase limit
-query.Limit = 20 // see more candidates
-
-// 4.
Try different embedding model -// Larger models (1024 dims) often perform better -``` - -**Improvement strategies:** - -1. Use better embedding model (bge-large vs all-MiniLM) -2. Optimize chunk size (test different sizes) -3. Add reranking step with cross-encoder -4. Implement hybrid search (semantic + keyword) - -#### 5. Collection Not Found - -```text -Error: collection does not exist -``` - -**Solution:** - -```go -// Collections are created on first use -docs := store.Collection("documents") // Creates if doesn't exist - -// Or check existing collections -collections, _ := store.ListCollections(ctx) -fmt.Println("Available collections:", collections) -``` - -## Advanced Topics - -### Hybrid Search - -Combine vector similarity with filters: - -```go -// Semantic search with metadata filters -result, _ := docs.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, "model"), - Filters: vectorstore.And( - vectorstore.TagFilter("documentation"), - vectorstore.Eq("category", "api-reference"), - vectorstore.Gte("version", "2.0"), - ), - Limit: 20, -}) - -// Post-process with keyword matching if needed -var filtered []*vectorstore.Match -for _, match := range result.Matches { - if strings.Contains(strings.ToLower(match.Document.Content.String()), "aixgo") { - filtered = append(filtered, match) - } -} -``` - -### Pagination - -Handle large result sets with pagination: - -```go -pageSize := 20 -page := 0 - -for { - result, _ := docs.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(emb, "model"), - Limit: pageSize, - Offset: page * pageSize, - }) - - fmt.Printf("Page %d: %d results\n", page+1, result.Count()) - - // Process results - for _, match := range result.Matches { - // Process match - } - - // Check for more pages - if !result.HasMore() { - break - } - - page++ -} -``` - -### Versioning - -Handle document versions: - -```go -doc := &vectorstore.Document{ - ID: "user-guide-v2", - Content: 
vectorstore.NewTextContent("Updated installation guide..."), - Embedding: vectorstore.NewEmbedding(emb, "model"), - Tags: []string{"latest"}, - Metadata: map[string]any{ - "doc_id": "user-guide", - "version": "2.0", - "latest": true, - }, -} - -// Search latest versions only -result, _ := docs.Query(ctx, &vectorstore.Query{ - Embedding: vectorstore.NewEmbedding(queryEmb, "model"), - Filters: vectorstore.TagFilter("latest"), -}) -``` - -## Next Steps - -- **Try the Example**: [RAG Agent Example](../../examples/rag-agent) -- **API Reference**: [Vector Store Package](https://pkg.go.dev/github.com/aixgo-dev/aixgo/pkg/vectorstore) -- **Extend Aixgo**: [Adding Custom Providers](./extending-aixgo.md) -- **Production**: [Deployment Guide](./production-deployment.md) - -## Resources - -- [Firestore Vector Search](https://firebase.google.com/docs/firestore/vector-search) -- [HuggingFace Embeddings](https://huggingface.co/models?pipeline_tag=sentence-similarity) -- [OpenAI Embeddings](https://platform.openai.com/docs/guides/embeddings) -- [RAG Best Practices](https://www.pinecone.io/learn/retrieval-augmented-generation/) diff --git a/web/content/philosophy-condensed.md b/web/content/philosophy-condensed.md deleted file mode 100644 index 40c4229..0000000 --- a/web/content/philosophy-condensed.md +++ /dev/null @@ -1,301 +0,0 @@ ---- -title: 'The Aixgo Philosophy' -description: 'Why we built Aixgo and what we believe about production AI. Our design principles, values, and commitment to production-grade tooling.' ---- - -## Production AI Deserves Production Tooling - -**We're not trying to out-prototype Python. We're trying to out-ship it.** - -For too long, production AI teams have been forced to choose between the velocity of Python frameworks and the reliability of production-grade infrastructure. Aixgo eliminates that choice. - -AI agents should ship with the same performance, security, and simplicity as the rest of your production systems. 
-
----
-
-## The Problem We're Solving
-
-### The Python Production Penalty
-
-Python excels at research and prototyping. But production reveals fundamental limitations:
-
-**Bloated Deployments:**
-
-- 1GB+ containers vs. <20MB binaries
-- 200+ dependencies vs. zero runtime dependencies
-- Minutes to build vs. seconds
-
-**Runtime Surprises:**
-
-- Type errors caught in production, not at compile time
-- "Works on my machine" dependency conflicts
-- AttributeError exceptions that should never ship
-
-**Scaling Complexity:**
-
-- GIL prevents true parallelism
-- Slow cold starts (30-45s) kill serverless economics
-- Heavy memory footprint (512MB+ baseline)
-
-**Security Vulnerabilities:**
-
-- Massive dependency trees with transitive CVEs
-- No compile-time type safety
-- Difficult to audit and secure
-
-### Why Go?
-
-Go developers shouldn't abandon their stack's strengths—speed, security, simplicity—just to build AI agents.
-
-**The Production Trade:**
-
-- 5MB binary → 1.5GB container
-- Type safety → Runtime errors
-- Instant startup → 45-second cold starts
-- 10 dependencies → 200+ packages
-
-**Aixgo exists because production AI deserves better.**
-
----
-
-## Design Principles
-
-### 1. A Single Binary is Better Than a Thousand Dependencies
-
-**Go's compilation model:**
-
-- No runtime dependencies
-- No Python interpreter
-- No virtual environments
-- No Docker required (though containers work great)
-
-**Real-world impact:**
-
-```text
-Python AI Service:           Aixgo Service:
-- Base: python:3.11 (~1GB)   - Binary: <20MB
-- Deps: pip (~200MB)         - Base: scratch
-- Code: 50MB                 - Total: <20MB
-Total: ~1.25GB               (60x+ smaller)
-```
-
-**Deployment unlocked:**
-
-- Edge devices with limited storage
-- Serverless with sub-100ms cold starts
-- Multi-region rollouts in seconds
-
-### 2.
Type Safety at Compile Time Beats Hope at Runtime - -```go -// This won't compile - caught before deployment -agent := aixgo.NewAgent( - aixgo.WithName("analyzer"), - aixgo.WithModel(123), // Type error: expected string -) -``` - -**Production benefits:** - -- IDE tells you what breaks when APIs change -- Refactoring confidence across teams -- No runtime type errors in production -- Less time debugging, more time shipping - -### 3. Go-Native Patterns, Not Python Ports - -**We embrace Go's strengths:** - -- **Goroutines** for concurrent agent execution -- **Channels** for message passing (local mode) -- **gRPC** for distributed communication -- **Context** for cancellation and timeouts -- **Interfaces** for extensibility - -**Why this matters:** - -- Code reads like idiomatic Go -- Standard tooling works (go test, pprof, race detector) -- Go developers feel at home immediately - -**Same code, different transport:** - -```go -// Works locally AND distributed -supervisor := aixgo.NewSupervisor("coordinator") -supervisor.AddAgent(producer) -supervisor.AddAgent(analyzer) -supervisor.Run() // Runtime picks: channels or gRPC -``` - -### 4. Observability is Not Optional - -**Built-in OpenTelemetry integration:** - -- Distributed tracing across multi-agent workflows -- Structured logging with trace correlation -- Metrics export (Prometheus, StatsD) -- Works with Grafana, Datadog, Langfuse, New Relic - -**Configuration, not instrumentation:** - -```yaml -observability: - tracing: true - service_name: 'agent-system' - exporter: 'otlp' -``` - -**Debug before users report issues:** - -- Trace requests across agent boundaries -- Correlate logs with distributed traces -- Monitor performance in real-time - -### 5. Clear Configuration is Better Than Clever Code - -**Why YAML over Python DSLs:** - -- Declarative and reviewable in PRs -- Deployable without code changes -- Validatable at load time -- Shareable across teams - -Less flexible, but production-proven. - -### 6. 
Simplicity Scales; Complexity Fails - -**API stability over rapid iteration:** - -- Fewer primitives, production-hardened -- Stricter contracts, better long-term maintainability -- Opinionated patterns, proven at scale - -**v1.0 Compatibility Guarantee:** - -- Semantic versioning -- No breaking changes without major bumps -- Clear migration guides - -See [v1.0 Compatibility](/v1-compatibility) for details. - ---- - -## When to Choose Aixgo - -### Choose Aixgo When - -**Deploying to production, not experimenting:** - -- Predictable performance and resource usage matter -- Container size and cold starts impact serverless/edge -- You're building systems that scale and stay running - -**Your team uses Go:** - -- Backend services already in Go -- You value compile-time error detection -- Avoiding Python dependency management - -**Performance is non-negotiable:** - -- Sub-100ms cold starts -- High-throughput pipelines (10,000+ req/s) -- Resource-constrained environments - -**You need production guarantees:** - -- Type safety -- Single binary deployments -- Minimal attack surface - -### Choose Python When - -**Doing exploratory research:** - -- Rapid prototyping with frequent pivots -- Experimenting with cutting-edge models -- Throwaway notebooks and scripts - -**You need Python's ML ecosystem:** - -- Training with PyTorch, TensorFlow, JAX -- Data analysis with pandas, numpy -- Jupyter workflows - -**Your team is Python-native:** - -- No Go experience and no interest in learning -- Existing Python infrastructure - ---- - -## What We're NOT - -**Not a replacement for Python in AI research:** - -- We're not competing with PyTorch or TensorFlow -- We're focused on production agent orchestration - -**Not a general-purpose AI toolkit:** - -- Specialized for multi-agent systems -- Not a model training framework - -**Not trying to do everything:** - -- We'd rather do fewer things excellently -- Production deployment focus, not prototyping velocity - ---- - -## The Long View - 
-### Our Commitment - -**Open development:** - -- Public roadmap on GitHub -- Transparent decision-making -- Community-driven feature prioritization - -**MIT License:** - -- Use in commercial products -- No vendor lock-in -- Fork-friendly - -**Long-term vision:** - -- Enterprise-grade orchestration -- Production-proven patterns -- World-class developer experience - -**Where we're not going:** - -- Chasing research trends -- Sacrificing stability for novelty - ---- - -## Join the Movement - -If you're tired of wrestling with Python in production, if you believe AI agents should ship with the same simplicity as your Go services, or if you want to see what production-grade AI looks like—join us. - -**Get Started:** - -- [Quick Start Guide](/guides/quick-start) - Running in 5 minutes -- [Core Concepts](/guides/core-concepts) - Architecture deep dive -- [Aixgo Proverbs](/proverbs) - 15 production principles - -**Get Involved:** - -- [GitHub](https://github.com/aixgo-dev/aixgo) - Star, contribute, open issues -- [Discussions](https://github.com/aixgo-dev/aixgo/discussions) - Share ideas -- [Roadmap](https://github.com/aixgo-dev/aixgo/projects) - See what's next - ---- - -**Where Python prototypes go to die in production, Go agents ship and scale.** - -*This is our philosophy. Welcome to Aixgo.* diff --git a/web/content/proverbs.md b/web/content/proverbs.md deleted file mode 100644 index 027fe74..0000000 --- a/web/content/proverbs.md +++ /dev/null @@ -1,92 +0,0 @@ ---- -title: 'The Aixgo Proverbs' -description: '15 production-tested principles for building AI agents that ship and scale. Inspired by Go Proverbs and The Zen of Go.' ---- - -## The Aixgo Proverbs - -_Inspired by [Go Proverbs](https://go-proverbs.github.io/) and [The Zen of Go](https://dave.cheney.net/2020/02/23/the-zen-of-go)_ - ---- - -### 1. Production AI deserves production tooling - -Research frameworks are built for notebooks. Production frameworks are built for uptime. 
Choose tools that match your deployment target, not your prototype. - -### 2. Don't prototype in Python and hope it ships; ship in Go and know it works - -Prototyping velocity matters, but shipping velocity matters more. Type safety, compile-time errors, and single binaries eliminate the "hope it works in prod" phase. - -### 3. A single binary is better than a thousand dependencies - -Every dependency is a liability—security patches, version conflicts, build complexity. A single <20MB binary has zero runtime dependencies and infinite deployment flexibility. - -### 4. Type safety at compile time beats hope at runtime - -`AttributeError` in production is a failure of tooling. If your IDE can't tell you what breaks, your users will. The compiler is your first line of defense. - -### 5. Ship megabytes, not gigabytes - -1.5GB containers kill serverless economics. <20MB binaries enable edge deployment, multi-region rollouts, and sub-100ms cold starts. Size is a feature. - -### 6. Observability is not optional - -If you can't trace it, you can't debug it. Built-in OpenTelemetry means every agent is observable from day one. Configuration, not instrumentation. - -### 7. Clear configuration is better than clever code - -YAML files are reviewable, deployable without rebuilding, and shareable across teams. Python DSLs are flexible until you need to audit what's actually running. - -### 8. Go-native patterns, not Python ports - -Goroutines, channels, contexts, and interfaces are Go's strengths. Don't port Python concepts—embrace the language you're shipping in. - -### 9. Channels orchestrate agents; gRPC orchestrates systems - -Local mode uses channels for zero-latency message passing. Distributed mode uses gRPC for cross-service orchestration. Same code, different transport. - -### 10. Fast cold starts enable real serverless - -45-second Python cold starts mean you're paying for idle compute or pre-warming instances. 
Sub-100ms Go cold starts mean serverless actually scales to zero. - -### 11. The best deployment is one you never debug - -When your binary runs the same everywhere—dev, staging, prod—"works on my machine" disappears. Static compilation eliminates environment drift. - -### 12. Errors are values; handle them before production does - -Go's explicit error handling forces you to think about failure cases at write time, not runtime. Every `if err != nil` is a production incident prevented. - -### 13. If it doesn't compile, it doesn't ship - -Compilation failures are cheaper than production failures. Type errors, interface mismatches, and API breaks surface before deployment, not after. - -### 14. Simplicity scales; complexity fails - -Clever abstractions break under load. Clear, explicit code survives oncall at 3 AM. Optimize for the engineer debugging in production. - -### 15. Where prototypes go to die, Go agents ship and thrive - -Python excels at research. Go excels at production. Pick the tool that matches the environment your code will run in, not the environment it's written in. - ---- - -## Further Reading - -Want to dive deeper? Read the full [Why Aixgo](/why-aixgo) for production principles, design trade-offs, and when to choose Aixgo vs. Python frameworks. - -**Get Started:** - -- [Quick Start Guide](/guides/quick-start) - Running in 5 minutes -- [Core Concepts](/guides/core-concepts) - Architecture deep dive -- [Features](/features) - What's available today - -**Get Involved:** - -- [GitHub](https://github.com/aixgo-dev/aixgo) - Star, contribute, open issues -- [Discussions](https://github.com/aixgo-dev/aixgo/discussions) - Share ideas, ask questions -- [Roadmap](https://github.com/aixgo-dev/aixgo/projects) - See what's next - ---- - -_Production AI deserves production tooling. 
Welcome to Aixgo._ diff --git a/web/content/v1-compatibility.md b/web/content/v1-compatibility.md deleted file mode 100644 index d666fe2..0000000 --- a/web/content/v1-compatibility.md +++ /dev/null @@ -1,331 +0,0 @@ ---- -title: 'Aixgo v1 Compatibility Promise' -description: 'Our commitment to API stability, upgrade paths, and long-term maintenance for Aixgo v1.0 and beyond.' ---- - -# Aixgo v1 Compatibility Promise - -## Introduction - -Aixgo v1.0 will establish a foundation for long-term stability. Code written against the v1 specification—whether YAML workflow configurations, Go SDK usage, or protocol -integrations—will continue to work correctly with all future v1.x releases. - -This document defines what "compatibility" means, what is covered by our guarantee, and how Aixgo will evolve while honoring this commitment. This stability commitment reflects our -[core philosophy](/why-aixgo) that production AI deserves production-grade tooling. - -## The Compatibility Promise - -When Aixgo reaches v1.0, we commit that: - -**YAML workflow configurations** written for v1.0 will execute correctly on all v1.x releases without modification. Your declarative agent definitions, supervisor patterns, and -orchestration workflows will remain stable. - -**Public Go SDK APIs** exposed in the `github.com/aixgo-dev/aixgo` module will maintain source-level compatibility. Code that compiles against v1.0 will continue to compile against -all v1.x versions. - -**gRPC and MCP protocol wire formats** will remain backward-compatible. Services built on v1.0 will interoperate with all v1.x releases, enabling incremental upgrades in -distributed deployments. - -While we expect the vast majority of programs will maintain this compatibility, it is impossible to guarantee that no future change will break any program. This document catalogs -exceptions to our compatibility promise. 
- -## What Is Covered - -### YAML Workflow Configurations - -All documented YAML schema elements for workflow definitions are stable: - -- Agent type definitions (`role: react`, `role: producer`, etc.) -- Supervisor configuration (`max_rounds`, `model`, etc.) -- Input/output routing (`inputs`, `outputs`, `source`, `target`) -- Model provider specifications (`model`, `temperature`, `max_tokens`) -- Tool definitions (`tools`, `input_schema`, `description`) -- Orchestration patterns (sequential, parallel, reflection, etc.) - -**Example of guaranteed stability:** - -```yaml -supervisor: - name: coordinator - model: gpt-4-turbo - max_rounds: 10 - -agents: - - name: analyzer - role: react - model: gpt-4-turbo - prompt: 'Analyze incoming data' - tools: - - name: query_database - description: 'Query the database' - input_schema: - type: object - properties: - query: { type: string } -``` - -This configuration will execute identically across all v1.x releases. - -### Go SDK Public APIs - -All exported types, functions, and methods in the `github.com/aixgo-dev/aixgo` module are covered: - -- Core types (`Agent`, `Supervisor`, `Message`, `Tool`) -- Configuration builders (`NewSupervisor`, `NewAgent`, `WithModel`) -- Lifecycle methods (`Run`, `Stop`, `AddAgent`) -- Tool registration interfaces (`RegisterTool`, `ToolHandler`) -- Context and observability hooks - -**Example of guaranteed API:** - -```go -import "github.com/aixgo-dev/aixgo" - -supervisor := aixgo.NewSupervisor("coordinator") -supervisor.AddAgent(analyzer) -supervisor.Run(context.Background()) // Signature stable -``` - -### Protocol Wire Formats - -The serialized message formats for gRPC and MCP are guaranteed: - -- gRPC service definitions (`.proto` files) -- MCP transport protocol messages -- Message envelope structures -- Authentication and authorization headers - -Clients and servers built on v1.0 can communicate with v1.x counterparts. 
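-
-In practice, wire-format stability follows standard protobuf evolution rules: existing field numbers and types stay frozen, numbers of deleted fields are reserved, and new capabilities arrive only as new optional fields that older peers skip as unknown. A hypothetical sketch (the message and field names below are illustrative, not Aixgo's actual schema):
-
-```protobuf
-// Illustrative only; not Aixgo's actual schema.
-message AgentEnvelope {
-  // Present since v1.0: field numbers and types are frozen for all of v1.x.
-  string agent_name = 1;
-  bytes  payload    = 2;
-
-  // Added in a later v1.x minor release; older clients and servers
-  // ignore unknown fields, so the wire format stays compatible.
-  optional string trace_id = 3;
-
-  // Numbers of removed fields are reserved so they are never reused.
-  reserved 4;
-}
-```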
- -## What Is Excluded - -The following are explicitly **not** covered by the compatibility guarantee: - -### 1. Security Fixes - -If a security vulnerability is discovered, we will fix it even if doing so breaks compatibility. Security always takes precedence over backward compatibility. - -**Rationale:** Protecting users from exploits is more important than API stability. We will document breaking security changes in release notes with migration guidance. - -### 2. Experimental Features - -Features marked as **alpha** or **beta** in documentation may change or be removed: - -- Functions with `// Experimental: ...` comments -- YAML configuration fields documented as "alpha" or "beta" -- Features in packages suffixed with `/alpha` or `/beta` - -**How to identify experimental features:** - -```go -// Experimental: MultiModalInput may change in future releases -type MultiModalInput struct { ... } -``` - -```yaml -# Beta: Vision capabilities are under active development -agents: - - name: image-analyzer - role: vision # Beta feature -``` - -**Guidance:** Avoid experimental features in production code until they graduate to stable status. - -### 3. Internal Packages - -Packages under `internal/` directories are not covered. These are implementation details subject to change without notice. - -**Example:** - -- `github.com/aixgo-dev/aixgo/internal/executor` - Not stable -- `github.com/aixgo-dev/aixgo/internal/transport` - Not stable - -### 4. Bugs and Unspecified Behavior - -Fixing bugs may break programs that depend on incorrect behavior. Similarly, behaviors not explicitly documented are subject to change. - -**Example:** If a YAML parser incorrectly accepts malformed input, fixing this bug may break configurations that relied on the bug. - -### 5. Performance Characteristics - -We do not guarantee execution speed, memory usage, or resource consumption. Optimizations may change performance profiles between releases. 
- -**Rationale:** Optimizations improve user experience but should not affect functional correctness. Your tests should validate behavior, not performance, unless you explicitly need -performance contracts. - -### 6. Tooling and CLI - -The `aixgo` CLI tool, code generators, and development utilities may change: - -- Command-line flags and arguments -- Output formats (except stable machine-readable formats like JSON) -- Error messages and logging - -**Guidance:** For automation, use the Go SDK or gRPC APIs rather than parsing CLI output. - -### 7. External Dependencies - -Compatibility with third-party LLM providers, vector databases, or external services is not guaranteed. If OpenAI changes their API, Aixgo may update its integration accordingly. - -**Rationale:** We cannot control external service evolution. We will minimize disruption but cannot guarantee zero-impact migrations. - -### 8. Unkeyed YAML Struct Literals - -Adding new fields to YAML configurations may break workflows that rely on field ordering in unkeyed structs. - -**Not recommended:** - -```yaml -agents: - - analyzer # Unkeyed positional field - - react # Order-dependent - - gpt-4-turbo -``` - -**Recommended:** - -```yaml -agents: - - name: analyzer - role: react - model: gpt-4-turbo -``` - -## How Aixgo Will Evolve - -### Adding Features - -New capabilities will be introduced through: - -1. **New agent types** - Additional roles beyond the v1.0 set -2. **Optional configuration fields** - New YAML fields with sensible defaults -3. **New SDK functions** - Additional exported functions and types -4. **Protocol extensions** - New gRPC methods or MCP message types - -**Backward compatibility:** Existing code will not need modification. New features are opt-in. - -### Deprecation Policy - -When features need to be replaced: - -1. **Deprecation notice** - Feature marked deprecated with migration guidance -2. **Grace period** - Minimum 12 months of continued support -3. 
**Removal** - Only in major version bump (v2.0) - -**Example:** - -```go -// Deprecated: Use NewSupervisorWithOptions instead. -// This function will be removed in v2.0. -func NewSupervisor(name string) *Supervisor { ... } -``` - -### Semantic Versioning - -Aixgo follows strict semantic versioning: - -- **v1.x.y** - Patch releases (bug fixes, security patches) -- **v1.x.0** - Minor releases (new features, backward-compatible) -- **v2.0.0** - Major releases (breaking changes) - -## Writing Future-Proof Code - -Follow these guidelines to ensure your code remains compatible: - -### 1. Use Keyed YAML Fields - -Always specify field names explicitly: - -```yaml -# Good -agents: - - name: analyzer - role: react - model: gpt-4-turbo - -# Bad - relies on field ordering -agents: - - analyzer - - react - - gpt-4-turbo -``` - -### 2. Avoid Experimental Features - -Check documentation for "alpha," "beta," or "experimental" warnings: - -```go -// Good - stable API -supervisor := aixgo.NewSupervisor("coordinator") - -// Bad - experimental API -supervisor := aixgo.NewExperimentalSupervisor("coordinator") // May change -``` - -### 3. Depend on Public APIs Only - -Import from `github.com/aixgo-dev/aixgo`, not `internal/` packages: - -```go -// Good -import "github.com/aixgo-dev/aixgo" - -// Bad - internal package, not stable -import "github.com/aixgo-dev/aixgo/internal/executor" -``` - -### 4. Handle Errors Gracefully - -Don't assume specific error messages or types unless documented: - -```go -// Good - handle any error -if err := supervisor.Run(ctx); err != nil { - return fmt.Errorf("supervisor failed: %w", err) -} - -// Bad - parsing error messages is fragile -if err != nil && strings.Contains(err.Error(), "timeout") { - // Error message format not guaranteed -} -``` - -### 5. 
Pin to Minor Versions - -Use Go modules to control upgrade cadence: - -```go -// go.mod -require github.com/aixgo-dev/aixgo v1.2.3 // Specific version -``` - -For production, pin to a specific minor version and test before upgrading. - -## Migration Path from Alpha - -When v1.0 is released, we will provide: - -1. **Migration guide** - Detailed changelog of breaking changes from v0.x -2. **Automated migration tools** - CLI tools to update YAML configs and Go code -3. **Deprecation warnings** - Advance notice of features removed in v1.0 -4. **Extended support** - v0.x will receive critical security patches for 6 months after v1.0 release - -## Reporting Compatibility Issues - -If you encounter a compatibility regression: - -1. **File an issue** on GitHub with reproduction steps -2. **Include version numbers** for both working and broken versions -3. **Provide minimal examples** demonstrating the regression - -We treat compatibility issues as high-priority bugs. Regressions will be fixed in patch releases. - -## Conclusion - -The v1.0 compatibility promise is our commitment to building a stable foundation for production AI agent systems. By clearly defining what is covered and what is excluded, we aim -to balance innovation with reliability. - -Your YAML workflows, Go integrations, and distributed deployments built on v1.0 will continue to work as Aixgo evolves. We take this responsibility seriously and will honor this -commitment throughout the v1.x lifecycle. - -For questions about compatibility or to discuss migration strategies, join our [Discord community](https://discord.gg/aixgo) or open a discussion on GitHub. diff --git a/web/content/why-aixgo.md b/web/content/why-aixgo.md deleted file mode 100644 index d4582e7..0000000 --- a/web/content/why-aixgo.md +++ /dev/null @@ -1,182 +0,0 @@ ---- -title: 'Why Aixgo' -description: 'Why we built Aixgo and what we believe about production AI. Our design principles, values, and commitment to production-grade tooling.' 
---- - -## Production AI Deserves Production Tooling - -**We're not trying to out-prototype Python. We're trying to out-ship it.** - -For too long, production AI teams have been forced to choose between: - -- The velocity of Python frameworks (LangChain, CrewAI, AutoGen) -- The reliability of production-grade infrastructure (Go, Rust, Java) - -Aixgo exists to eliminate that choice. We believe AI agents should ship with the same performance, security, and simplicity as the rest of your production systems. - ---- - -
- -## The Production Reality - -| What Matters | Python Frameworks | Aixgo | Impact | -|--------------|------------------|-------|---------| -| **Container Size** | 1.2GB+ | <20MB | 60x smaller | -| **Cold Start** | 30-45 seconds | <100ms | 450x faster | -| **Dependencies** | 200+ packages | ~10 packages | 95% fewer | -| **Type Safety** | Runtime discovery | Compile-time | Zero production surprises | -| **Memory Baseline** | 512MB+ | 50MB | 10x more efficient | - -
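The "Type Safety" row in the table above is the easiest claim to make concrete. A minimal sketch of what compiler-enforced contracts look like for a tool-style handler, using Go generics; the `Tool` type here is purely illustrative, not Aixgo's actual API:

```go
package main

import "fmt"

// Tool is a hypothetical typed tool registration: the handler's
// argument type is fixed at compile time, so a call with the wrong
// shape simply does not compile -- no runtime schema surprise.
type Tool[A any] struct {
	Name string
	Run  func(A) (string, error)
}

// WeatherArgs is the typed argument contract for the example tool.
type WeatherArgs struct {
	City string
}

func main() {
	w := Tool[WeatherArgs]{
		Name: "weather",
		Run: func(a WeatherArgs) (string, error) {
			return "sunny in " + a.City, nil
		},
	}
	// w.Run(42) or w.Run(struct{ Town string }{"Oslo"}) would be
	// rejected by the compiler, not discovered in production.
	out, _ := w.Run(WeatherArgs{City: "Oslo"})
	fmt.Println(out) // → sunny in Oslo
}
```

The same idea is what the "Typed Tool Registration" feature listed later describes: the contract lives in the type system rather than in a runtime validator.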
- --- - -## The Problem We're Solving - -Python excels at AI research and prototyping. But production reveals fundamental limitations: - -- **Bloated deployments**: 1GB+ containers, 200+ dependencies, massive security surface -- **Runtime surprises**: Type errors caught in production, not at compile time -- **GIL bottleneck**: No true parallelism for multi-agent systems -- **Slow cold starts**: 30-45 second startup kills serverless economics -- **Scaling complexity**: Manual orchestration, heavy memory footprint - -Go developers shouldn't have to abandon their stack's strengths—speed, security, simplicity, and scalability—just to build AI agents. Yet reaching for the dominant Python frameworks has meant trading a 5MB binary for a 1.5GB container, type safety for runtime errors, and instant startup for 45-second cold starts. - -**Aixgo exists because production AI teams deserve better.** - ---- - -## Our Design Principles - -### 1. Production-First, Not Research-First - -API stability over rapid iteration. Performance-driven decisions. Security and observability built-in from day one. We prioritize production-hardened primitives over research experiments—fewer features, but battle-tested at scale. - -**Key trade-off**: YAML configuration instead of Python DSLs for declarative, reviewable, deployable workflows. - -### 2. Single Binary Simplicity - -Deploy AI agents in <20MB binaries with zero runtime dependencies. No Python interpreter, no virtual environments, no Docker required (though it works great with containers). - -**Real impact**: <20MB total deployment vs 1.2GB Python containers. Deploy to edge devices, serverless, IoT, anywhere. - -### 3. Type Safety as a Feature - -Catch errors at compile time, not in production. Go's type system enforces contracts between agents, tools, and workflows—your IDE tells you what's broken before your customers do. - -**Team velocity**: Less time debugging production runtime errors, more time building features. Refactor with confidence. - -### 4. 
Go-Native Patterns - -We don't port Python concepts to Go. We embrace Go's strengths: channels for local message passing, gRPC for distributed systems, goroutines for concurrency, context for cancellation. - -**The abstraction that matters**: Same code works locally (channels) and distributed (gRPC). Runtime picks transport automatically. - -### 5. Observable by Default - -Every agent interaction is traceable via OpenTelemetry, logged with structured context, and measurable with metrics. No instrumentation code required—just configuration. - -**Built-in integrations**: Prometheus, Grafana, Datadog, Langfuse, New Relic. Works out of the box. - -### 6. Open and Permissive - -MIT licensed. Use in commercial products without restrictions. No vendor lock-in, no surprise license changes. Your investment is protected. - ---- - -## When to Choose Aixgo - -
- -### Choose Aixgo When: - -**Deploying to production, not experimenting** -- Predictable performance and resource usage matter -- Container size and cold start times are critical (serverless, edge) -- Multi-region or distributed deployments needed - -**Your team already uses Go** -- Backend services in Go, want AI agents in same stack -- Value type safety and compile-time error detection -- Want to avoid Python dependency management overhead - -**Performance is non-negotiable** -- Sub-100ms cold starts for serverless -- High-throughput pipelines (10,000+ req/s) -- Resource-constrained environments (IoT, edge) - -**You need production-grade guarantees** -- Compile-time type safety -- Single binary deployments -- Minimal dependency surface -- Enterprise security requirements - -### Choose Python Frameworks When: - -**Doing exploratory research** -- Rapid prototyping with frequent pivots -- Research workflows that don't need production deployment -- Throwaway scripts and notebooks - -**You need Python's ML ecosystem** -- Training models with PyTorch, TensorFlow, JAX -- Data analysis with pandas, numpy, scikit-learn -- Integration with Jupyter notebooks - -**Your team is Python-native** -- No Go experience and no interest in learning -- Existing Python infrastructure -- Python-first organizational culture - -
- ---- - -## Our Commitment - -### Stability First - -When v1.0 releases, we guarantee API stability with semantic versioning, long-term support, and clear upgrade paths. No breaking changes without major version bumps. - -See our [v1.0 Compatibility Promise](/v1-compatibility) for details. - -### Open Development - -Public roadmap, open issue tracking, community-driven feature prioritization, transparent decision-making. Your feedback shapes Aixgo. - -### Long-Term Vision - -Enterprise-grade multi-agent orchestration with production-proven patterns. We're building for what ships, not what trends. - ---- - -## Join the Movement - -Aixgo is just getting started. We're building this in the open, learning from the community, and evolving based on real-world production usage. - -If you're tired of wrestling with Python in production, if you believe AI agents should ship with the same simplicity as the rest of your Go services, or if you just want to see what production-grade AI looks like in pure Go—join us. - -
- -**Get Started:** - -- [Quick Start Guide](/guides/quick-start) - Get running in 5 minutes -- [Core Concepts](/guides/core-concepts) - Understand Aixgo's architecture -- [Features](/features) - Explore what's available today -- [Proverbs](/proverbs) - 15 production principles - -**Get Involved:** - -- [GitHub](https://github.com/aixgo-dev/aixgo) - Star the repo, open issues, contribute -- [Discussions](https://github.com/aixgo-dev/aixgo/discussions) - Ask questions, share ideas -- [Roadmap](https://github.com/orgs/aixgo-dev/projects/1) - See what's coming next - -
- ---- - -**Where Python prototypes go to die in production, Go agents ship and scale.** - -This is our philosophy. Welcome to Aixgo. diff --git a/web/data/features.yaml b/web/data/features.yaml deleted file mode 100644 index c0fcea7..0000000 --- a/web/data/features.yaml +++ /dev/null @@ -1,457 +0,0 @@ -# Aixgo Feature Releases -# Status indicators: complete (✅), in_progress (🚧), roadmap (❌) - -categories: - - name: "Core AI Capabilities" - subcategories: - - title: "LLM Providers" - features: - - name: "OpenAI" - status: "complete" - tooltip: - content: "Full support for GPT-4, GPT-3.5, and other OpenAI models with streaming, function calling, and vision capabilities." - - name: "Anthropic (Claude)" - status: "complete" - tooltip: - content: "Native integration with Claude 3 family (Opus, Sonnet, Haiku) featuring extended context windows and advanced reasoning." - - name: "Google Gemini" - status: "complete" - tooltip: - content: "Access to Gemini Pro and Ultra models with multi-modal capabilities and large context support." - - name: "xAI (Grok)" - status: "complete" - tooltip: - content: "Integration with Grok models for real-time information and alternative AI perspectives." - - name: "Vertex AI" - status: "complete" - tooltip: - content: "Enterprise-grade access to Google's AI models through Vertex AI platform with enhanced security and compliance." - - name: "HuggingFace (free)" - status: "complete" - tooltip: - content: "Free Inference API with simulated streaming due to API limitations." - - name: "Ollama (local)" - status: "complete" - tooltip: - content: "Run AI models locally with zero API costs. Enterprise-grade SSRF protection, hybrid cloud fallback, production K8s manifests. Support for phi, llama, mistral, gemma, and 100+ models." - - name: "HuggingFace (TGI)" - status: "roadmap" - tooltip: - content: "Native streaming support for paid Text Generation Inference endpoints." 
- - - title: "Agent System" - features: - - name: "ReAct Agent" - status: "complete" - tooltip: - content: "Reasoning and Acting agent that iteratively plans, executes tools, and observes results to solve complex tasks." - - name: "Supervisor Orchestration" - status: "complete" - tooltip: - content: "Coordinates multiple specialized agents, managing task delegation and result aggregation for complex workflows." - - name: "Classifier Agent" - status: "complete" - tooltip: - content: "Routes queries to appropriate specialized agents based on intent classification and context analysis." - - name: "Aggregator Agent" - status: "complete" - tooltip: - content: "Combines outputs from multiple agents into coherent responses, handling result synthesis and deduplication. Includes deterministic voting strategies (majority, unanimous, weighted, confidence) with zero LLM cost." - - name: "Planner Agent" - status: "complete" - tooltip: - content: "Breaks down complex tasks into executable steps with 6 advanced planning strategies: Chain-of-Thought (systematic decomposition), Tree-of-Thought (multi-branch exploration with scoring), ReAct (reasoning-action cycles), Monte Carlo Tree Search (UCB1-based path optimization), Backward Chaining (goal-to-steps decomposition), and Hierarchical Planning (multi-level task breakdown)." - - name: "Producer Agent" - status: "complete" - tooltip: - content: "Generates content and artifacts based on task requirements, supporting various output formats and templates." - - name: "Logger Agent" - status: "complete" - tooltip: - content: "Captures and structures agent interactions, decisions, and outputs for debugging and audit trails." - - - title: "Orchestration Patterns" - features: - - name: "Parallel Pattern" - status: "complete" - tooltip: - content: "Execute multiple independent tasks concurrently, reducing latency by leveraging parallel processing." 
- - name: "Sequential Pattern" - status: "complete" - tooltip: - content: "Chain tasks in order where each step depends on previous results, ensuring proper execution flow." - - name: "Reflection Pattern" - status: "complete" - tooltip: - content: "Self-critique and iterative improvement with quality scoring (20-50% improvement). Supports single critic, self-reflection, and multi-critic aggregation with automatic score extraction from JSON, pattern matching, or sentiment analysis." - - name: "MapReduce Pattern" - status: "complete" - tooltip: - content: "Distribute work across multiple agents and combine results, ideal for processing large datasets." - - name: "Planning Pattern" - status: "complete" - tooltip: - content: "Generate and execute multi-step plans dynamically, adapting to intermediate results and conditions." - - name: "Classification Pattern" - status: "complete" - tooltip: - content: "Route requests to specialized handlers based on classification, enabling intent-based processing." - - name: "Supervisor Pattern" - status: "complete" - tooltip: - content: "Hub-and-spoke coordination where supervisor delegates to specialists." - - name: "Router Pattern" - status: "complete" - tooltip: - content: "Intelligent routing to appropriate agent with 25-50% cost reduction." - - name: "Swarm Pattern" - status: "complete" - tooltip: - content: "Decentralized coordination with dynamic agent handoffs." - - name: "Hierarchical Pattern" - status: "complete" - tooltip: - content: "Multi-level delegation with manager-worker structure." - - name: "RAG Pattern" - status: "complete" - tooltip: - content: "Retrieval-Augmented Generation with 4 variants: Standard (retrieve→generate), Conversational (with history tracking), Multi-Query (query expansion with reciprocal rank fusion), and Hybrid (semantic + keyword retrieval merged with RRF scoring)." 
- - name: "Ensemble Pattern" - status: "complete" - tooltip: - content: "Multi-model voting and consensus for 25-50% error reduction." - - name: "Aggregation Pattern" - status: "complete" - tooltip: - content: "Multi-agent synthesis using consensus, weighted, or semantic strategies." - - - name: "Tools & MCP" - subcategories: - - title: "Tools & MCP" - features: - - name: "Function Calling" - status: "complete" - tooltip: - content: "Enable LLMs to invoke external functions and APIs, extending capabilities beyond text generation." - - name: "Tool Registration" - status: "complete" - tooltip: - content: "Dynamic registration and discovery of tools, allowing agents to utilize custom functions at runtime." - - name: "Local Transport" - status: "complete" - tooltip: - content: "In-process tool communication for low-latency function calls without network overhead." - - name: "gRPC Transport" - status: "complete" - tooltip: - content: "Distributed tool execution via gRPC, enabling remote service integration and microservices architecture." - - name: "Service Discovery" - status: "complete" - tooltip: - content: "Automatic discovery and registration of available tools and services across the infrastructure." - - name: "Typed Tool Registration" - status: "complete" - tooltip: - content: "Type-safe tool registration using Go generics with automatic JSON schema generation from struct types via reflection. Validates arguments at compile-time and runtime for robust tool execution." - - - name: "Data Infrastructure" - subcategories: - - title: "Vector Databases" - features: - - name: "Firestore" - status: "complete" - tooltip: - content: "Google Cloud Firestore integration for vector storage with built-in scaling and real-time synchronization." - - name: "In-Memory Store" - status: "complete" - tooltip: - content: "High-performance in-memory vector storage for development, testing, and low-latency use cases." 
- - name: "Qdrant" - status: "roadmap" - tooltip: - content: "High-performance vector search engine with advanced filtering and hybrid search capabilities." - roadmap: "Full integration with metadata filtering and batch operations." - - name: "pgvector" - status: "roadmap" - tooltip: - content: "PostgreSQL extension for vector similarity search, combining relational and vector data." - roadmap: "Complete CRUD operations and index optimization support." - - - title: "Session Persistence" - features: - - name: "Session Manager" - status: "complete" - tooltip: - content: "Full session lifecycle management with Create, GetOrCreate, List, and Delete operations. Thread-safe for concurrent access." - - name: "JSONL File Storage" - status: "complete" - tooltip: - content: "Lightweight append-only JSONL storage at ~/.aixgo/sessions/ with automatic directory creation and 0700/0600 permissions." - - name: "Redis Backend" - status: "complete" - tooltip: - content: "Distributed session storage for multi-node deployments with configurable key prefix and connection pooling." - - name: "Checkpoint/Restore" - status: "complete" - tooltip: - content: "Save conversation state snapshots and rollback when needed. Includes checksum validation for data integrity." - - name: "CallWithSession()" - status: "complete" - tooltip: - content: "Runtime integration that automatically appends messages to sessions and provides session context to agents." - - name: "Context Helpers" - status: "complete" - tooltip: - content: "SessionFromContext() and ContextWithSession() for seamless context.Context-based session passing between components." - - - title: "Memory & Context" - features: - - name: "Conversation History" - status: "complete" - tooltip: - content: "Persistent storage and retrieval of conversation threads, maintaining context across sessions." 
- - name: "RAG Systems" - status: "complete" - tooltip: - content: "Retrieval Augmented Generation for grounding LLM responses in your knowledge base and documents." - - name: "Semantic Search" - status: "complete" - tooltip: - content: "Vector-based similarity search for finding relevant information based on meaning rather than keywords." - - name: "Long-term Memory" - status: "roadmap" - tooltip: - content: "Cross-session knowledge retention and personalization based on historical interactions." - roadmap: "Automatic memory consolidation and fact extraction from conversations." - - - title: "Embeddings" - features: - - name: "OpenAI" - status: "complete" - tooltip: - content: "OpenAI's text-embedding models (ada-002, text-embedding-3) for high-quality vector representations." - - name: "HuggingFace API" - status: "complete" - tooltip: - content: "Access to HuggingFace's hosted embedding models via their Inference API." - - name: "HuggingFace TEI" - status: "complete" - tooltip: - content: "Text Embeddings Inference server integration for self-hosted, optimized embedding generation." - - - name: "Security & Observability" - subcategories: - - title: "Security" - features: - - name: "Auth Framework" - status: "complete" - tooltip: - content: "Pluggable authentication system supporting multiple providers and custom auth strategies." - - name: "RBAC Authorization" - status: "complete" - tooltip: - content: "Role-Based Access Control for fine-grained permissions on agents, tools, and data resources." - - name: "Rate Limiting" - status: "complete" - tooltip: - content: "Configurable request throttling to prevent abuse and control API costs per user or endpoint." - - name: "Injection Protection" - status: "complete" - tooltip: - content: "Detection and mitigation of prompt injection attacks to prevent unauthorized LLM behavior manipulation." 
- - name: "SSRF Protection" - status: "complete" - tooltip: - content: "Enterprise-grade SSRF (Server-Side Request Forgery) protection with URL validation, private IP blocking (RFC1918), cloud metadata service blocking (169.254.169.254), DNS rebinding prevention, and configurable host allowlists. Protects local inference services like Ollama." - - name: "TLS/mTLS Support" - status: "complete" - tooltip: - content: "Transport Layer Security handled by cloud infrastructure (Cloud Run, GKE, etc.) for encrypted service-to-service communication." - - name: "Audit Logging" - status: "complete" - tooltip: - content: "Comprehensive logging of security events, access attempts, and data operations for compliance." - - name: "JWT Verification" - status: "complete" - tooltip: - content: "JSON Web Token validation with RS256/HS256 signature verification, automatic JWKS fetching with caching, expiration checking, issuer/audience validation, and RSA public key verification for stateless authentication." - - name: "File-Based API Keys" - status: "complete" - tooltip: - content: "Load API keys from secure files supporting two formats: line-based (user_id=api_key) and JSON. Includes automatic permission validation (rejects world-readable files) and comment/empty-line handling." - - name: "Input Validation" - status: "complete" - tooltip: - content: "Schema-based validation of inputs to prevent injection attacks and ensure data integrity." - - name: "Error Sanitization" - status: "complete" - tooltip: - content: "Automatically masks sensitive information (file paths, IPs) in error messages and logs." - - name: "SIEM Integration" - status: "complete" - tooltip: - content: "Export audit logs to Elasticsearch, Splunk HEC, Webhook, or custom backends for security monitoring." - - name: "Type-Safe Validation" - status: "complete" - tooltip: - content: "Compile-time type checking with Go's type system plus Pydantic AI-style automatic validation retry (40-70% reliability improvement)." 
- - - title: "Observability" - features: - - name: "OpenTelemetry" - status: "complete" - tooltip: - content: "Industry-standard distributed tracing, metrics, and logging for comprehensive system observability." - - name: "Langfuse Integration" - status: "complete" - tooltip: - content: "LLM-specific observability platform integration for tracking prompts, completions, costs, and quality metrics." - - name: "Prometheus Metrics" - status: "complete" - tooltip: - content: "Expose operational metrics in Prometheus format for monitoring, alerting, and performance analysis." - - name: "Health Checks" - status: "complete" - tooltip: - content: "Liveness and readiness endpoints for orchestration platforms and load balancer integration." - - name: "Distributed Tracing" - status: "complete" - tooltip: - content: "End-to-end request tracing across services to diagnose latency issues and understand system behavior." - - name: "Cost Tracking" - status: "complete" - tooltip: - content: "Automatic token counting with per-request, per-agent, per-user cost tracking. Calculate costs per provider with accurate pricing." - - - title: "Multi-Modal" - features: - - name: "Vision/Images" - status: "roadmap" - tooltip: - content: "Image understanding and analysis capabilities for visual question answering and OCR tasks." - roadmap: "Integration with vision-enabled models like GPT-4 Vision and Claude 3." - - name: "Audio Processing" - status: "roadmap" - tooltip: - content: "Speech-to-text transcription and audio analysis for voice-driven applications." - roadmap: "Support for Whisper and other audio AI models." - - name: "Document Parsing" - status: "roadmap" - tooltip: - content: "Extract structured data from PDFs, images, and complex document formats." - roadmap: "Integration with document AI services and layout analysis models." 
- - - name: "Infrastructure & Operations" - subcategories: - - title: "CLI & Developer Tools" - features: - - name: "Interactive Chat Assistant" - status: "in_progress" - tooltip: - content: "Alpha: Multi-model coding assistant with basic file operations (read/write/glob/grep), git integration (status/diff/commit/log), and terminal execution. Early development with core features working—more capabilities coming soon." - roadmap: "Planned: MCP tool integration, code refactoring tools, test generation, project scaffolding, and IDE integration." - - name: "Session Management" - status: "complete" - tooltip: - content: "Persistent chat sessions stored in ~/.aixgo/sessions/ with JSON format. List, resume, and delete sessions via CLI. Automatic save after each interaction with cost tracking and conversation history." - - name: "Model Information" - status: "complete" - tooltip: - content: "View all available LLM models with pricing information via 'aixgo models' command. Shows input/output token costs for 25+ models across 7+ providers." - - name: "Cobra CLI Framework" - status: "complete" - tooltip: - content: "Modern CLI with subcommands (run, chat, session, models) replacing flag-based interface. Better help text, command discovery, and user experience." - - - title: "Configuration" - features: - - name: "Phased Agent Startup" - status: "complete" - tooltip: - content: "Dependency-aware agent startup using topological sort. Declare dependencies with depends_on field, automatic phase-based initialization eliminates race conditions, concurrent startup within phases for performance. Includes configurable timeout (30s default) and supports all runtimes (Local, Simple, Distributed)." - - name: "Public Agent Package" - status: "complete" - tooltip: - content: "Standalone package exporting Agent, Message, and Runtime interfaces for building custom agents without the full framework. 
Minimal dependencies, 85%+ test coverage, and comprehensive documentation for library-style integration." - - name: "YAML Workflows" - status: "complete" - tooltip: - content: "Declarative workflow definitions using YAML for version-controlled, code-free agent orchestration." - - name: "Go SDK" - status: "complete" - tooltip: - content: "Comprehensive Go library for programmatic agent creation, customization, and integration." - - name: "29+ Example Configs" - status: "complete" - tooltip: - content: "Production-ready reference implementations covering common patterns and use cases." - - name: "Complete Use Cases" - status: "complete" - tooltip: - content: "End-to-end examples demonstrating real-world applications from setup to deployment." - - name: "Single Binary (<20MB)" - status: "complete" - tooltip: - content: "Compile to a single <20MB binary with zero runtime dependencies. Deploy anywhere without package managers." - - name: "Instant Startup (<100ms)" - status: "complete" - tooltip: - content: "Near-instant cold start enabling true serverless viability and real-time response." - - name: "60-70% Cost Savings" - status: "complete" - tooltip: - content: "Dramatically lower compute costs vs Python frameworks through Go's efficiency and small footprint." - - - title: "Deployment" - features: - - name: "Docker" - status: "complete" - tooltip: - content: "Containerized deployment with optimized images for consistent runtime environments." - - name: "Docker Compose" - status: "complete" - tooltip: - content: "Multi-container orchestration for local development and simple production deployments." - - name: "Cloud Run" - status: "complete" - tooltip: - content: "Serverless deployment on Google Cloud Run with automatic scaling and zero-ops infrastructure." - - name: "Kubernetes Manifests" - status: "complete" - tooltip: - content: "Production-ready Kubernetes configurations including deployments, services, and ingress rules." 
- - name: "Kubernetes Operator" - status: "roadmap" - tooltip: - content: "Custom controller for automated agent lifecycle management on Kubernetes." - roadmap: "CRDs for declarative agent provisioning and GitOps workflows." - - name: "Terraform IaC" - status: "roadmap" - tooltip: - content: "Infrastructure as Code modules for automated cloud resource provisioning." - roadmap: "Modules for GCP, AWS, and Azure deployments with best practices." - - - title: "Production Reliability" - features: - - name: "Circuit Breakers" - status: "complete" - tooltip: - content: "Automatic failure detection and circuit breaking to prevent cascade failures during service outages." - - name: "Retry with Backoff" - status: "complete" - tooltip: - content: "Exponential backoff retry logic for handling transient failures and rate limit errors gracefully." - - name: "State Persistence" - status: "complete" - tooltip: - content: "Durable storage of agent state and conversation context for resuming workflows after interruptions." - - name: "Crash Recovery" - status: "roadmap" - tooltip: - content: "Automatic detection and recovery from process crashes with workflow continuation." - roadmap: "Checkpoint-based recovery and state reconstruction mechanisms." - - name: "Multi-Region" - status: "roadmap" - tooltip: - content: "Deploy agents across multiple geographic regions for low latency and high availability." - roadmap: "Cross-region state replication and request routing strategies." 
diff --git a/web/data/milestones.yaml b/web/data/milestones.yaml deleted file mode 100644 index 04ef9c2..0000000 --- a/web/data/milestones.yaml +++ /dev/null @@ -1,65 +0,0 @@ -# Aixgo Development Milestones -# High-level version summaries for homepage display - -milestones: - - version: "Current Release" - status: "complete" - date: "March 2026" - highlights: - - "Interactive coding assistant via 'aixgo chat' command" - - "Multi-model support with mid-conversation switching" - - "File operations (read/write/glob/grep) and git integration" - - "Terminal command execution with safety prompts" - - "Session management (list/resume/delete) with cost tracking" - - "Cobra CLI framework with modern subcommands" - - - version: "v0.5.0 Stable" - status: "complete" - date: "February 2026" - highlights: - - "Public Provider API - LLM providers accessible via pkg/llm/provider" - - "Guided ReAct Workflows - Step-by-step execution with verification" - - "Cost Calculator API - Pricing data for 25+ models via pkg/llm/cost" - - "40-70% improved reliability with guided workflows" - - "30-50% reduced LLM calls through parallel tool execution" - - - version: "v0.4.0 Stable" - status: "complete" - date: "February 2026" - highlights: - - "Go 1.26 with modernized codebase" - - "6 advanced planner strategies (MCTS, Tree-of-Thought, Backward Chaining)" - - "4 RAG variants (Conversational, Multi-Query, Hybrid)" - - "JWT verification with JWKS caching and RS256 signatures" - - "Reflection pattern with multi-critic aggregation" - - "Typed MCP tool registration with schema generation" - - - version: "v0.3.3 Beta" - status: "complete" - date: "February 2026" - highlights: - - "8+ LLM providers (OpenAI, Anthropic, Gemini, xAI, Vertex AI, HuggingFace, Ollama)" - - "13 orchestration patterns (Supervisor, Parallel, RAG, Reflection, MapReduce, etc.)" - - "Session persistence with JSONL and Redis backends" - - "Full observability (OpenTelemetry, Langfuse, Prometheus)" - - "Enterprise security (Auth, 
RBAC, TLS/mTLS, audit logging)" - - - version: "Next Release" - status: "planned" - date: "Q2 2026" - highlights: - - "Visual workflow debugging dashboard" - - "Session encryption at rest" - - "PostgreSQL session backend" - - "Additional vector databases (Qdrant, pgvector)" - - "Enhanced assistant tools and workflows" - - - version: "Production Release" - status: "planned" - date: "Q3 2026" - highlights: - - "API stability guarantees and semantic versioning" - - "Kubernetes operator for automated lifecycle management" - - "Multi-modal support (Vision, Audio, Documents)" - - "Long-term memory and personalization" - - "Infrastructure as Code (Terraform modules)" diff --git a/web/data/version.yaml b/web/data/version.yaml deleted file mode 100644 index 4691857..0000000 --- a/web/data/version.yaml +++ /dev/null @@ -1,11 +0,0 @@ -# Aixgo Version Information -# Simplified: Just "Stable" without version number to avoid manual updates - -current: "Stable" -stage: "Stable" -tagline: "The AI Agent Framework That Ships in <20MB" -subheadline: "Build, deploy, and scale AI agents in Go. No containers. No cold starts. No Python." - -# CTA button text -cta_text: "Get Started" -cta_url: "/guides/quick-start" diff --git a/web/layouts/404.html b/web/layouts/404.html deleted file mode 100644 index 3e38646..0000000 --- a/web/layouts/404.html +++ /dev/null @@ -1,24 +0,0 @@ -{{ define "main" }} - -{{ end }} diff --git a/web/layouts/_default/_markup/render-codeblock.html b/web/layouts/_default/_markup/render-codeblock.html deleted file mode 100644 index ae7b76e..0000000 --- a/web/layouts/_default/_markup/render-codeblock.html +++ /dev/null @@ -1,9 +0,0 @@ -
- -
{{ .Inner }}
-
diff --git a/web/layouts/_default/baseof.html b/web/layouts/_default/baseof.html deleted file mode 100644 index ac7e50f..0000000 --- a/web/layouts/_default/baseof.html +++ /dev/null @@ -1,28 +0,0 @@ - - - - {{ partial "head.html" . }} - - - {{ partial "header.html" . }} - -
- {{ block "main" . }}{{ end }} -
- - {{ partial "footer.html" . }} - - - - - - - - - diff --git a/web/layouts/_default/list.html b/web/layouts/_default/list.html deleted file mode 100644 index 75d1ba7..0000000 --- a/web/layouts/_default/list.html +++ /dev/null @@ -1,93 +0,0 @@ -{{ define "main" }} - -
-
-

{{ .Title }}

- - {{ if .Description }} -

{{ .Description }}

- {{ end }} - -
- {{ if eq .Section "guides" }} - - {{ range .Pages.GroupByParam "category" }} -
-

{{ .Key | default "Other" }}

-
- {{ range .Pages }} - -
- - - - -
-
-

{{ .Title }}

-

{{ .Description }}

-
-
- {{ end }} -
-
- {{ end }} - - {{ else if eq .Section "blog" }} - - - - {{ else }} - -
- {{ range .Pages }} -
-

{{ .Title }}

- {{ if .Description }} -

{{ .Description }}

- {{ end }} -
- {{ end }} -
- {{ end }} -
-
-
- -{{ end }} diff --git a/web/layouts/_default/single.html b/web/layouts/_default/single.html deleted file mode 100644 index b80cef8..0000000 --- a/web/layouts/_default/single.html +++ /dev/null @@ -1,64 +0,0 @@ -{{ define "main" }} - -
-
- - {{ if or (eq .Section "guides") (eq .Section "blog") }} - ← Back to {{ if eq .Section "guides" }}Guides{{ else }}Blog{{ end }} - {{ end }} - - - {{ if eq .Section "blog" }} - - {{ end }} - - - {{ if eq .Section "guides" }} - {{ if .Params.breadcrumb }} - - {{ end }} - {{ end }} - -

{{ .Title }}

- - {{ if .Description }} -

{{ .Description }}

- {{ end }} - -
- {{ .Content }} -
- - - {{ if and (eq .Section "blog") (.Params.tags) }} -
- {{ range .Params.tags }} - {{ . }} - {{ end }} -
- {{ end }} -
-
- -{{ end }} diff --git a/web/layouts/index.html b/web/layouts/index.html deleted file mode 100644 index f0bb317..0000000 --- a/web/layouts/index.html +++ /dev/null @@ -1,360 +0,0 @@ -{{ define "main" }} - - -
-
-

- {{ .Site.Data.version.tagline }} -

-

- {{ .Site.Data.version.subheadline }} -

- {{ .Site.Data.version.cta_text }} → -
-
- - -
-
-

- From Prototype to Planet-Scale in 60 Seconds -

- -
-
-

1. Install

-
- -
go get github.com/aixgo-dev/aixgo
-
-
- -
-

2. Create config/agents.yaml

-
- -
supervisor:
-  name: coordinator
-  model: gpt-4o-mini  # OpenAI - fast orchestration
-  max_rounds: 10
-
-agents:
-  - name: data-producer
-    role: producer
-    interval: 1s
-    outputs:
-      - target: analyzer
-
-  - name: analyzer
-    role: react
-    model: claude-3-5-haiku  # Anthropic - strong reasoning
-    prompt: |
-      You are a data analyst. Analyze incoming data and provide insights.
-    inputs:
-      - source: data-producer
-    outputs:
-      - target: logger
-
-  - name: logger
-    role: logger
-    inputs:
-      - source: analyzer
-
-
- -
-

3. Create main.go

-
- -
package main
-
-import (
-    "github.com/aixgo-dev/aixgo"
-    _ "github.com/aixgo-dev/aixgo/agents"
-)
-
-func main() {
-    if err := aixgo.Run("config/agents.yaml"); err != nil {
-        panic(err)
-    }
-}
-
-
- -
-

4. Deploy anywhere

-
- -
# Local development
-go run main.go
-
-# Production - single <20MB binary
-go build -o agent
-./agent
-
-# Edge, Lambda, Cloud Run, Kubernetes - one binary, zero configuration
-
-
-
- - -
-
- - -
-
-

- Why the Industry is Moving to Go -

-

- Python dominated AI because it was easy to prototype. Go will dominate production because it's built to ship. -

- -
-
-

Container Size

-
-
-

Python frameworks

-

1.2GB with dependencies

-
-
-
-

Aixgo

-

<20MB single binary

-
-
-
-

Impact:

-

Deploy to edge devices, serverless, anywhere

-
-
-
- -
-

Startup Performance

-
-
-

Python frameworks

-

30-45s cold start

-
-
-
-

Aixgo

-

<100ms instant startup

-
-
-
-

Impact:

-

True serverless viability, real-time response

-
-
-
- -
-

Runtime Safety

-
-
-

Python frameworks

-

Runtime - Discover errors in production

-
-
-
-

Aixgo

-

Compile-time - Compiler catches errors before deploy

-
-
-
-

Impact:

-

Ship with confidence, sleep at night

-
-
-
- -
-

LLM Data Validation

-
-
-

Python frameworks

-

Runtime only - Type changes found in production

-
-
-
-

Aixgo

-

Compile-time - Type changes caught before deploy, auto-retry

-
-
-
-

Impact:

-

Refactor with confidence, LLM errors auto-recover

-
-
-
-
-
-
- - - -
-
-

- The Go Advantage -

- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
What Matters in Production | Python Frameworks | Aixgo
Deploy Anywhere | 1GB+ containers, complex deps | <20MB binary, zero deps
Cold Start Speed | 10-45 seconds | <100ms
Type Safety | Runtime discovery | Compile-time guarantees
Concurrency | GIL bottleneck | Native parallelism
Scaling Pattern | Rewrite for distribution | Same code, local → distributed
Operational Cost | High compute overhead | 60-70% infrastructure savings
-
-
-
- - -
-
-

- Built for What's Next -

-

- Aixgo isn't just another framework. It's the foundation for the next generation of AI systems. -

- -
-
-
-
-

13 Orchestration Patterns

-
-

All patterns production-ready: Supervisor, Sequential, Parallel, Router, Swarm, RAG, Reflection, Ensemble, and more. Go channels locally, gRPC for distributed.

-
- -
-
-
-

8+ LLM Providers

-
-

OpenAI, Anthropic, Google Gemini, xAI, Vertex AI, HuggingFace, Ollama, and vLLM. Switch providers with a config change. Auto-detection by model name.

-
- -
-
-
-

Full Observability Stack

-
-

OpenTelemetry tracing, Langfuse integration, Prometheus metrics, automatic cost tracking. Kubernetes health probes included.

-
- -
-
-
-

Enterprise Security

-
-

4 auth modes, RBAC, SSRF protection, prompt injection defense, SIEM integration. Rate limiting and audit logging built-in.

-
- -
-
-
-

Type-Safe Validation

-
-

Pydantic AI-style validation with automatic retry. 40-70% improvement in structured output reliability. Compile-time type checking.

-
- -
-
-
-

60-70% Lower Infra Costs

-
-

<20MB binaries, <100ms cold starts, ~50MB memory. Router pattern saves 25-50% on LLM costs. Local inference with Ollama for zero API costs.

-
-
-
-
- - -
-
-

- What We're Building -

- - {{ partial "milestone-cards.html" . }} -
-
- - -
-
-

- Try {{ .Site.Data.version.current }} -

-

- The AI infrastructure built on Python is reaching its limits. Aixgo is what comes next. Join us in building it. -

- {{ .Site.Data.version.cta_text }} → -
-
- -{{ end }} diff --git a/web/layouts/index.json b/web/layouts/index.json deleted file mode 100644 index 291cdfd..0000000 --- a/web/layouts/index.json +++ /dev/null @@ -1,12 +0,0 @@ -{{- $.Scratch.Add "index" slice -}} -{{- range .Site.RegularPages -}} - {{- $.Scratch.Add "index" (dict "content" .Plain "date" (.Date.Format "02 January 2006") "externalUrl" .Params.externalUrl "permalink" .Permalink "section" .Section "summary" .Summary "title" .Title "type" .Type) -}} -{{- end -}} -{{- range .Site.Taxonomies -}} - {{- range . -}} - {{- range .Pages -}} - {{- $.Scratch.Add "index" (dict "content" "" "date" "" "externalUrl" .Params.externalUrl "permalink" .Permalink "section" .Section "summary" "" "title" .Title "type" .Type) -}} - {{- end -}} - {{- end -}} -{{- end -}} -{{- $.Scratch.Get "index" | jsonify -}} diff --git a/web/layouts/partials/alpha-notice.html b/web/layouts/partials/alpha-notice.html deleted file mode 100644 index 4f4dbc4..0000000 --- a/web/layouts/partials/alpha-notice.html +++ /dev/null @@ -1,48 +0,0 @@ -{{/* - Alpha Notice Partial - Displays a styled notice block for alpha-quality features - - Parameters: - - badges: Array of badge objects, each with a "status" field (alpha/planned/available) - - title: The bold title text (rendered in tags, properly escaped) - - message: The description text (properly escaped, no HTML needed) - - link: URL for the status page link (defaults to /#alpha-status) - - linkText: Text for the status page link (defaults to "Check status →") - - detailed: Boolean, if true shows a more detailed warning format -*/}} - -{{- $badges := .badges -}} -{{- $title := .title | default "" -}} -{{- $message := .message | default "" -}} -{{- $link := .link | default "/#alpha-status" -}} -{{- $linkText := .linkText | default "Check status →" -}} -{{- $detailed := .detailed | default false -}} - - diff --git a/web/layouts/partials/footer.html b/web/layouts/partials/footer.html deleted file mode 100644 index 52b2204..0000000 --- 
a/web/layouts/partials/footer.html +++ /dev/null @@ -1,48 +0,0 @@ - diff --git a/web/layouts/partials/head.html b/web/layouts/partials/head.html deleted file mode 100644 index b0e14e2..0000000 --- a/web/layouts/partials/head.html +++ /dev/null @@ -1,24 +0,0 @@ - - -{{ if .IsHome }}{{ .Site.Title }}{{ else }}{{ .Title }} | {{ .Site.Title }}{{ end }} - - - - - - - -{{ partial "seo.html" . }} - - - - - - - -{{ with .OutputFormats.Get "RSS" }} - -{{ end }} - - -{{ partial "posthog.html" . }} diff --git a/web/layouts/partials/header.html b/web/layouts/partials/header.html deleted file mode 100644 index ddf07f9..0000000 --- a/web/layouts/partials/header.html +++ /dev/null @@ -1,45 +0,0 @@ -
- -
diff --git a/web/layouts/partials/milestone-cards.html b/web/layouts/partials/milestone-cards.html deleted file mode 100644 index 0bdabc5..0000000 --- a/web/layouts/partials/milestone-cards.html +++ /dev/null @@ -1,44 +0,0 @@ -{{- /* - Milestone Cards Partial - Renders development milestone cards from data/milestones.yaml - - Usage: {{ partial "milestone-cards.html" . }} - - Status mapping: complete (✅), in_progress (🚧), planned (📋) -*/ -}} - -{{- $statusLabels := dict "complete" "Complete" "in_progress" "In Progress" "planned" "Planned" -}} -{{- $statusIcons := dict "complete" "✅" "in_progress" "🚧" "planned" "📋" -}} - -{{- with site.Data.milestones -}} -
- {{- range $milestone := .milestones -}} - {{- $icon := "📋" -}} - {{- if eq $milestone.status "complete" -}} - {{- $icon = "✅" -}} - {{- else if eq $milestone.status "in_progress" -}} - {{- $icon = "🚧" -}} - {{- end -}} -
-
- {{ $milestone.version }} - {{- if eq $milestone.status "complete" }} - ✅{{ end -}} -
- {{- if $milestone.date }} -
{{ $milestone.date }}
- {{- end }} -
- {{- range $milestone.highlights -}}
-   • {{ $icon }} {{ . }}
- {{- end -}}
-
- {{- end -}} -
- - -{{- end -}} diff --git a/web/layouts/partials/posthog.html b/web/layouts/partials/posthog.html deleted file mode 100644 index 2c7363d..0000000 --- a/web/layouts/partials/posthog.html +++ /dev/null @@ -1,16 +0,0 @@ -{{ if hugo.IsProduction }} -{{ $posthogKey := getenv "HUGO_POSTHOG_KEY" }} -{{ $posthogHost := or (getenv "HUGO_POSTHOG_HOST") "https://us.i.posthog.com" }} -{{ with $posthogKey }} - - -{{ end }} -{{ end }} diff --git a/web/layouts/partials/seo.html b/web/layouts/partials/seo.html deleted file mode 100644 index 51388ff..0000000 --- a/web/layouts/partials/seo.html +++ /dev/null @@ -1,97 +0,0 @@ -{{/* Canonical, OpenGraph, Twitter, JSON-LD. Included from head.html. */}} - -{{- $title := cond .IsHome .Site.Title (printf "%s | %s" .Title .Site.Title) -}} -{{- $description := cond (ne .Description "") .Description .Site.Params.description -}} -{{- $ogImage := absURL "aixgo-logo.png" -}} - - - - - - - - - - - -{{- if .IsPage }} - {{- with .Date }} - - {{- end }} - {{- with .Lastmod }} - - {{- end }} - {{- with .Params.author }} - - {{- end }} -{{- end }} - - - - - - - - -{{- if .IsHome }} - -{{- end }} - -{{- if and .IsPage (eq .Section "blog") }} - -{{- end }} diff --git a/web/layouts/robots.txt b/web/layouts/robots.txt deleted file mode 100644 index 6b4c001..0000000 --- a/web/layouts/robots.txt +++ /dev/null @@ -1,4 +0,0 @@ -User-agent: * -Allow: / - -Sitemap: {{ "sitemap.xml" | absURL }} diff --git a/web/layouts/shortcodes/alpha-notice.html b/web/layouts/shortcodes/alpha-notice.html deleted file mode 100644 index a24bf0d..0000000 --- a/web/layouts/shortcodes/alpha-notice.html +++ /dev/null @@ -1,42 +0,0 @@ -{{/* - Alpha Notice Shortcode - Wrapper for the alpha-notice partial - - Usage: - {{< alpha-notice - status="alpha" - title="Alpha Software" - message="This guide uses alpha-quality APIs." 
- linkText="Check current status →" - >}} - - Or for detailed format: - {{< alpha-notice - status="alpha" - title="ALPHA SOFTWARE WARNING" - message="This feature is under active development..." - detailed=true - >}} - - Multiple statuses (comma-separated): - {{< alpha-notice - status="alpha,planned" - title="Alpha Feature" - message="Some features are planned for future releases." - >}} -*/}} - -{{- $statusParam := .Get "status" | default "alpha" -}} -{{- $statuses := split $statusParam "," -}} -{{- $badges := slice -}} -{{- range $statuses -}} - {{- $badges = $badges | append (dict "status" (trim . " ")) -}} -{{- end -}} - -{{ partial "alpha-notice.html" (dict - "badges" $badges - "title" (.Get "title") - "message" (.Get "message") - "link" (.Get "link") - "linkText" (.Get "linkText") - "detailed" (.Get "detailed") -) }} diff --git a/web/layouts/shortcodes/button.html b/web/layouts/shortcodes/button.html deleted file mode 100644 index ff461fb..0000000 --- a/web/layouts/shortcodes/button.html +++ /dev/null @@ -1,3 +0,0 @@ - - {{ .Inner }} - diff --git a/web/layouts/shortcodes/feature-card.html b/web/layouts/shortcodes/feature-card.html deleted file mode 100644 index 96b261d..0000000 --- a/web/layouts/shortcodes/feature-card.html +++ /dev/null @@ -1,13 +0,0 @@ -
-
{{ .Get "icon" }}
-

{{ .Get "title" }}

-

{{ .Get "description" }}

- {{ if .Get "benefit" }} -
- - - - {{ .Get "benefit" }} -
- {{ end }} -
diff --git a/web/layouts/shortcodes/feature-grid.html b/web/layouts/shortcodes/feature-grid.html deleted file mode 100644 index 39a2025..0000000 --- a/web/layouts/shortcodes/feature-grid.html +++ /dev/null @@ -1,3 +0,0 @@ -
- {{ .Inner }} -
diff --git a/web/layouts/shortcodes/feature-releases.html b/web/layouts/shortcodes/feature-releases.html deleted file mode 100644 index be6a7f8..0000000 --- a/web/layouts/shortcodes/feature-releases.html +++ /dev/null @@ -1,61 +0,0 @@ -{{- /* - Feature Releases Shortcode - Renders feature tables from data/features.yaml - - Usage: {{< feature-releases >}} - - Supports: - - Status indicators: complete (✅), in_progress (🚧), roadmap (❌) - - Tooltips for additional context - - Maintains existing CSS classes for styling -*/ -}} - -{{- $statusIcons := dict "complete" "✅" "in_progress" "🚧" "roadmap" "❌" -}} - -{{- with site.Data.features -}} - {{- range .categories -}} -

{{ .name }}

- - {{- /* Find max feature count in this category */ -}} - {{- $maxCount := 0 -}} - {{- range .subcategories -}} - {{- $count := len .features -}} - {{- if gt $count $maxCount -}} - {{- $maxCount = $count -}} - {{- end -}} - {{- end -}} - -
- {{- range .subcategories -}} -
-

{{ .title }}

- - {{- range .features -}} - {{- $icon := index $statusIcons .status -}} -
- {{- $icon }} {{ .name -}} - {{- if .tooltip }} ⓘ - - {{- if .tooltip.title -}} - {{ .tooltip.title }}: {{ .tooltip.content }} - {{- else -}} - {{ .tooltip.content }} - {{- end -}} - {{- if .tooltip.roadmap }} Roadmap: {{ .tooltip.roadmap }}{{ end -}} - - - {{- end -}} -
- {{- end -}} - - {{- /* Add empty cells to match max count */ -}} - {{- $currentCount := len .features -}} - {{- $emptyCells := sub $maxCount $currentCount -}} - {{- range seq $emptyCells -}} -
 
- {{- end -}} -
- {{- end -}} -
- {{- end -}} -{{- end -}} diff --git a/web/layouts/shortcodes/roadmap-timeline.html b/web/layouts/shortcodes/roadmap-timeline.html deleted file mode 100644 index 34bf09b..0000000 --- a/web/layouts/shortcodes/roadmap-timeline.html +++ /dev/null @@ -1,34 +0,0 @@ -
-v0.1 - Alpha Release (November 2024)
-  • Local mode with Go channels
-  • Core agent types (Producer, ReAct, Logger)
-  • YAML configuration
-  • Basic observability foundation
-  • OpenAI, Anthropic, xAI integration
-
-v0.2 - Beta Release (Q1 2026)
-  • Distributed mode with gRPC transport
-  • Vector database integrations (Firestore, Qdrant, pgvector)
-  • Enhanced observability (Langfuse integration)
-  • Additional agent types (Classifier, Aggregator)
-  • Advanced error handling patterns
-
-v1.0 - Production Release (Q4 2025)
-  • API stability guarantees
-  • Production hardening (rate limiting, circuit breakers)
-  • Complete documentation and examples
-  • Performance benchmarks and optimization
-  • Enterprise support options
diff --git a/web/layouts/shortcodes/status-badge.html b/web/layouts/shortcodes/status-badge.html deleted file mode 100644 index c1b99bb..0000000 --- a/web/layouts/shortcodes/status-badge.html +++ /dev/null @@ -1,12 +0,0 @@ -{{ $status := .Get "status" }} -{{ $text := "UNKNOWN" }} - -{{ if eq $status "alpha" }} - {{ $text = "v0.1" }} -{{ else if eq $status "planned" }} - {{ $text = "PLANNED" }} -{{ else if eq $status "available" }} - {{ $text = "AVAILABLE" }} -{{ end }} - -{{ $text }} diff --git a/web/package-lock.json b/web/package-lock.json deleted file mode 100644 index 07437c6..0000000 --- a/web/package-lock.json +++ /dev/null @@ -1,1161 +0,0 @@ -{ - "name": "web", - "lockfileVersion": 3, - "requires": true, - "packages": { - "": { - "devDependencies": { - "markdownlint-cli": "^0.48.0" - } - }, - "node_modules/@types/debug": { - "version": "4.1.12", - "resolved": "https://registry.npmjs.org/@types/debug/-/debug-4.1.12.tgz", - "integrity": "sha512-vIChWdVG3LG1SMxEvI/AK+FWJthlrqlTu7fbrlywTkkaONwk/UAGaULXRlf8vkzFBLVm0zkMdCquhL5aOjhXPQ==", - "dev": true, - "license": "MIT", - "dependencies": { - "@types/ms": "*" - } - }, - "node_modules/@types/katex": { - "version": "0.16.7", - "resolved": "https://registry.npmjs.org/@types/katex/-/katex-0.16.7.tgz", - "integrity": "sha512-HMwFiRujE5PjrgwHQ25+bsLJgowjGjm5Z8FVSf0N6PwgJrwxH0QxzHYDcKsTfV3wva0vzrpqMTJS2jXPr5BMEQ==", - "dev": true, - "license": "MIT" - }, - "node_modules/@types/ms": { - "version": "2.1.0", - "resolved": "https://registry.npmjs.org/@types/ms/-/ms-2.1.0.tgz", - "integrity": "sha512-GsCCIZDE/p3i96vtEqx+7dBUGXrc7zeSK3wwPHIaRThS+9OhWIXRqzs4d6k1SVU8g91DrNRWxWUGhp5KXQb2VA==", - "dev": true, - "license": "MIT" - }, - "node_modules/@types/unist": { - "version": "2.0.11", - "resolved": "https://registry.npmjs.org/@types/unist/-/unist-2.0.11.tgz", - "integrity": "sha512-CmBKiL6NNo/OqgmMn95Fk9Whlp2mtvIv+KNpQKN2F4SjvrEesubTRWGYSg+BnWZOnlCaSTU1sMpsBOzgbYhnsA==", - "dev": true, - "license": "MIT" - }, - 
"node_modules/ansi-regex": { - "version": "6.2.2", - "resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-6.2.2.tgz", - "integrity": "sha512-Bq3SmSpyFHaWjPk8If9yc6svM8c56dB5BAtW4Qbw5jHTwwXXcTLoRMkpDJp6VL0XzlWaCHTXrkFURMYmD0sLqg==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=12" - }, - "funding": { - "url": "https://github.com/chalk/ansi-regex?sponsor=1" - } - }, - "node_modules/argparse": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/argparse/-/argparse-2.0.1.tgz", - "integrity": "sha512-8+9WqebbFzpX9OR+Wa6O29asIogeRMzcGtAINdpMHHyAg10f05aSFVBbcEqGf/PXw1EjAZ+q2/bEBg3DvurK3Q==", - "dev": true, - "license": "Python-2.0" - }, - "node_modules/balanced-match": { - "version": "4.0.4", - "resolved": "https://registry.npmjs.org/balanced-match/-/balanced-match-4.0.4.tgz", - "integrity": "sha512-BLrgEcRTwX2o6gGxGOCNyMvGSp35YofuYzw9h1IMTRmKqttAZZVU67bdb9Pr2vUHA8+j3i2tJfjO6C6+4myGTA==", - "dev": true, - "license": "MIT", - "engines": { - "node": "18 || 20 || >=22" - } - }, - "node_modules/brace-expansion": { - "version": "5.0.4", - "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-5.0.4.tgz", - "integrity": "sha512-h+DEnpVvxmfVefa4jFbCf5HdH5YMDXRsmKflpf1pILZWRFlTbJpxeU55nJl4Smt5HQaGzg1o6RHFPJaOqnmBDg==", - "dev": true, - "license": "MIT", - "dependencies": { - "balanced-match": "^4.0.2" - }, - "engines": { - "node": "18 || 20 || >=22" - } - }, - "node_modules/character-entities": { - "version": "2.0.2", - "resolved": "https://registry.npmjs.org/character-entities/-/character-entities-2.0.2.tgz", - "integrity": "sha512-shx7oQ0Awen/BRIdkjkvz54PnEEI/EjwXDSIZp86/KKdbafHh1Df/RYGBhn4hbe2+uKC9FnT5UCEdyPz3ai9hQ==", - "dev": true, - "license": "MIT", - "funding": { - "type": "github", - "url": "https://github.com/sponsors/wooorm" - } - }, - "node_modules/character-entities-legacy": { - "version": "3.0.0", - "resolved": "https://registry.npmjs.org/character-entities-legacy/-/character-entities-legacy-3.0.0.tgz", - 
"integrity": "sha512-RpPp0asT/6ufRm//AJVwpViZbGM/MkjQFxJccQRHmISF/22NBtsHqAWmL+/pmkPWoIUJdWyeVleTl1wydHATVQ==", - "dev": true, - "license": "MIT", - "funding": { - "type": "github", - "url": "https://github.com/sponsors/wooorm" - } - }, - "node_modules/character-reference-invalid": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/character-reference-invalid/-/character-reference-invalid-2.0.1.tgz", - "integrity": "sha512-iBZ4F4wRbyORVsu0jPV7gXkOsGYjGHPmAyv+HiHG8gi5PtC9KI2j1+v8/tlibRvjoWX027ypmG/n0HtO5t7unw==", - "dev": true, - "license": "MIT", - "funding": { - "type": "github", - "url": "https://github.com/sponsors/wooorm" - } - }, - "node_modules/commander": { - "version": "14.0.3", - "resolved": "https://registry.npmjs.org/commander/-/commander-14.0.3.tgz", - "integrity": "sha512-H+y0Jo/T1RZ9qPP4Eh1pkcQcLRglraJaSLoyOtHxu6AapkjWVCy2Sit1QQ4x3Dng8qDlSsZEet7g5Pq06MvTgw==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=20" - } - }, - "node_modules/debug": { - "version": "4.4.3", - "resolved": "https://registry.npmjs.org/debug/-/debug-4.4.3.tgz", - "integrity": "sha512-RGwwWnwQvkVfavKVt22FGLw+xYSdzARwm0ru6DhTVA3umU5hZc28V3kO4stgYryrTlLpuvgI9GiijltAjNbcqA==", - "dev": true, - "license": "MIT", - "dependencies": { - "ms": "^2.1.3" - }, - "engines": { - "node": ">=6.0" - }, - "peerDependenciesMeta": { - "supports-color": { - "optional": true - } - } - }, - "node_modules/decode-named-character-reference": { - "version": "1.2.0", - "resolved": "https://registry.npmjs.org/decode-named-character-reference/-/decode-named-character-reference-1.2.0.tgz", - "integrity": "sha512-c6fcElNV6ShtZXmsgNgFFV5tVX2PaV4g+MOAkb8eXHvn6sryJBrZa9r0zV6+dtTyoCKxtDy5tyQ5ZwQuidtd+Q==", - "dev": true, - "license": "MIT", - "dependencies": { - "character-entities": "^2.0.0" - }, - "funding": { - "type": "github", - "url": "https://github.com/sponsors/wooorm" - } - }, - "node_modules/deep-extend": { - "version": "0.6.0", - "resolved": 
"https://registry.npmjs.org/deep-extend/-/deep-extend-0.6.0.tgz", - "integrity": "sha512-LOHxIOaPYdHlJRtCQfDIVZtfw/ufM8+rVj649RIHzcm/vGwQRXFt6OPqIFWsm2XEMrNIEtWR64sY1LEKD2vAOA==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=4.0.0" - } - }, - "node_modules/dequal": { - "version": "2.0.3", - "resolved": "https://registry.npmjs.org/dequal/-/dequal-2.0.3.tgz", - "integrity": "sha512-0je+qPKHEMohvfRTCEo3CrPG6cAzAYgmzKyxRiYSSDkS6eGJdyVJm7WaYA5ECaAD9wLB2T4EEeymA5aFVcYXCA==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=6" - } - }, - "node_modules/devlop": { - "version": "1.1.0", - "resolved": "https://registry.npmjs.org/devlop/-/devlop-1.1.0.tgz", - "integrity": "sha512-RWmIqhcFf1lRYBvNmr7qTNuyCt/7/ns2jbpp1+PalgE/rDQcBT0fioSMUpJ93irlUhC5hrg4cYqe6U+0ImW0rA==", - "dev": true, - "license": "MIT", - "dependencies": { - "dequal": "^2.0.0" - }, - "funding": { - "type": "github", - "url": "https://github.com/sponsors/wooorm" - } - }, - "node_modules/entities": { - "version": "4.5.0", - "resolved": "https://registry.npmjs.org/entities/-/entities-4.5.0.tgz", - "integrity": "sha512-V0hjH4dGPh9Ao5p0MoRY6BVqtwCjhz6vI5LT8AJ55H+4g9/4vbHx1I54fS0XuclLhDHArPQCiMjDxjaL8fPxhw==", - "dev": true, - "license": "BSD-2-Clause", - "engines": { - "node": ">=0.12" - }, - "funding": { - "url": "https://github.com/fb55/entities?sponsor=1" - } - }, - "node_modules/fdir": { - "version": "6.5.0", - "resolved": "https://registry.npmjs.org/fdir/-/fdir-6.5.0.tgz", - "integrity": "sha512-tIbYtZbucOs0BRGqPJkshJUYdL+SDH7dVM8gjy+ERp3WAUjLEFJE+02kanyHtwjWOnwrKYBiwAmM0p4kLJAnXg==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=12.0.0" - }, - "peerDependencies": { - "picomatch": "^3 || ^4" - }, - "peerDependenciesMeta": { - "picomatch": { - "optional": true - } - } - }, - "node_modules/get-east-asian-width": { - "version": "1.4.0", - "resolved": "https://registry.npmjs.org/get-east-asian-width/-/get-east-asian-width-1.4.0.tgz", - "integrity": 
"sha512-QZjmEOC+IT1uk6Rx0sX22V6uHWVwbdbxf1faPqJ1QhLdGgsRGCZoyaQBm/piRdJy/D2um6hM1UP7ZEeQ4EkP+Q==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=18" - }, - "funding": { - "url": "https://github.com/sponsors/sindresorhus" - } - }, - "node_modules/ignore": { - "version": "7.0.5", - "resolved": "https://registry.npmjs.org/ignore/-/ignore-7.0.5.tgz", - "integrity": "sha512-Hs59xBNfUIunMFgWAbGX5cq6893IbWg4KnrjbYwX3tx0ztorVgTDA6B2sxf8ejHJ4wz8BqGUMYlnzNBer5NvGg==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">= 4" - } - }, - "node_modules/ini": { - "version": "4.1.3", - "resolved": "https://registry.npmjs.org/ini/-/ini-4.1.3.tgz", - "integrity": "sha512-X7rqawQBvfdjS10YU1y1YVreA3SsLrW9dX2CewP2EbBJM4ypVNLDkO5y04gejPwKIY9lR+7r9gn3rFPt/kmWFg==", - "dev": true, - "license": "ISC", - "engines": { - "node": "^14.17.0 || ^16.13.0 || >=18.0.0" - } - }, - "node_modules/is-alphabetical": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/is-alphabetical/-/is-alphabetical-2.0.1.tgz", - "integrity": "sha512-FWyyY60MeTNyeSRpkM2Iry0G9hpr7/9kD40mD/cGQEuilcZYS4okz8SN2Q6rLCJ8gbCt6fN+rC+6tMGS99LaxQ==", - "dev": true, - "license": "MIT", - "funding": { - "type": "github", - "url": "https://github.com/sponsors/wooorm" - } - }, - "node_modules/is-alphanumerical": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/is-alphanumerical/-/is-alphanumerical-2.0.1.tgz", - "integrity": "sha512-hmbYhX/9MUMF5uh7tOXyK/n0ZvWpad5caBA17GsC6vyuCqaWliRG5K1qS9inmUhEMaOBIW7/whAnSwveW/LtZw==", - "dev": true, - "license": "MIT", - "dependencies": { - "is-alphabetical": "^2.0.0", - "is-decimal": "^2.0.0" - }, - "funding": { - "type": "github", - "url": "https://github.com/sponsors/wooorm" - } - }, - "node_modules/is-decimal": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/is-decimal/-/is-decimal-2.0.1.tgz", - "integrity": "sha512-AAB9hiomQs5DXWcRB1rqsxGUstbRroFOPPVAomNk/3XHR5JyEZChOyTWe2oayKnsSsr/kcGqF+z6yuH6HHpN0A==", - "dev": 
true, - "license": "MIT", - "funding": { - "type": "github", - "url": "https://github.com/sponsors/wooorm" - } - }, - "node_modules/is-hexadecimal": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/is-hexadecimal/-/is-hexadecimal-2.0.1.tgz", - "integrity": "sha512-DgZQp241c8oO6cA1SbTEWiXeoxV42vlcJxgH+B3hi1AiqqKruZR3ZGF8In3fj4+/y/7rHvlOZLZtgJ/4ttYGZg==", - "dev": true, - "license": "MIT", - "funding": { - "type": "github", - "url": "https://github.com/sponsors/wooorm" - } - }, - "node_modules/js-yaml": { - "version": "4.1.1", - "resolved": "https://registry.npmjs.org/js-yaml/-/js-yaml-4.1.1.tgz", - "integrity": "sha512-qQKT4zQxXl8lLwBtHMWwaTcGfFOZviOJet3Oy/xmGk2gZH677CJM9EvtfdSkgWcATZhj/55JZ0rmy3myCT5lsA==", - "dev": true, - "license": "MIT", - "dependencies": { - "argparse": "^2.0.1" - }, - "bin": { - "js-yaml": "bin/js-yaml.js" - } - }, - "node_modules/jsonc-parser": { - "version": "3.3.1", - "resolved": "https://registry.npmjs.org/jsonc-parser/-/jsonc-parser-3.3.1.tgz", - "integrity": "sha512-HUgH65KyejrUFPvHFPbqOY0rsFip3Bo5wb4ngvdi1EpCYWUQDC5V+Y7mZws+DLkr4M//zQJoanu1SP+87Dv1oQ==", - "dev": true, - "license": "MIT" - }, - "node_modules/jsonpointer": { - "version": "5.0.1", - "resolved": "https://registry.npmjs.org/jsonpointer/-/jsonpointer-5.0.1.tgz", - "integrity": "sha512-p/nXbhSEcu3pZRdkW1OfJhpsVtW1gd4Wa1fnQc9YLiTfAjn0312eMKimbdIQzuZl9aa9xUGaRlP9T/CJE/ditQ==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=0.10.0" - } - }, - "node_modules/katex": { - "version": "0.16.27", - "resolved": "https://registry.npmjs.org/katex/-/katex-0.16.27.tgz", - "integrity": "sha512-aeQoDkuRWSqQN6nSvVCEFvfXdqo1OQiCmmW1kc9xSdjutPv7BGO7pqY9sQRJpMOGrEdfDgF2TfRXe5eUAD2Waw==", - "dev": true, - "funding": [ - "https://opencollective.com/katex", - "https://github.com/sponsors/katex" - ], - "license": "MIT", - "dependencies": { - "commander": "^8.3.0" - }, - "bin": { - "katex": "cli.js" - } - }, - "node_modules/katex/node_modules/commander": { - 
"version": "8.3.0", - "resolved": "https://registry.npmjs.org/commander/-/commander-8.3.0.tgz", - "integrity": "sha512-OkTL9umf+He2DZkUq8f8J9of7yL6RJKI24dVITBmNfZBmri9zYZQrKkuXiKhyfPSu8tUhnVBB1iKXevvnlR4Ww==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">= 12" - } - }, - "node_modules/linkify-it": { - "version": "5.0.0", - "resolved": "https://registry.npmjs.org/linkify-it/-/linkify-it-5.0.0.tgz", - "integrity": "sha512-5aHCbzQRADcdP+ATqnDuhhJ/MRIqDkZX5pyjFHRRysS8vZ5AbqGEoFIb6pYHPZ+L/OC2Lc+xT8uHVVR5CAK/wQ==", - "dev": true, - "license": "MIT", - "dependencies": { - "uc.micro": "^2.0.0" - } - }, - "node_modules/markdown-it": { - "version": "14.1.1", - "resolved": "https://registry.npmjs.org/markdown-it/-/markdown-it-14.1.1.tgz", - "integrity": "sha512-BuU2qnTti9YKgK5N+IeMubp14ZUKUUw7yeJbkjtosvHiP0AZ5c8IAgEMk79D0eC8F23r4Ac/q8cAIFdm2FtyoA==", - "dev": true, - "license": "MIT", - "dependencies": { - "argparse": "^2.0.1", - "entities": "^4.4.0", - "linkify-it": "^5.0.0", - "mdurl": "^2.0.0", - "punycode.js": "^2.3.1", - "uc.micro": "^2.1.0" - }, - "bin": { - "markdown-it": "bin/markdown-it.mjs" - } - }, - "node_modules/markdownlint": { - "version": "0.40.0", - "resolved": "https://registry.npmjs.org/markdownlint/-/markdownlint-0.40.0.tgz", - "integrity": "sha512-UKybllYNheWac61Ia7T6fzuQNDZimFIpCg2w6hHjgV1Qu0w1TV0LlSgryUGzM0bkKQCBhy2FDhEELB73Kb0kAg==", - "dev": true, - "license": "MIT", - "dependencies": { - "micromark": "4.0.2", - "micromark-core-commonmark": "2.0.3", - "micromark-extension-directive": "4.0.0", - "micromark-extension-gfm-autolink-literal": "2.1.0", - "micromark-extension-gfm-footnote": "2.1.0", - "micromark-extension-gfm-table": "2.1.1", - "micromark-extension-math": "3.1.0", - "micromark-util-types": "2.0.2", - "string-width": "8.1.0" - }, - "engines": { - "node": ">=20" - }, - "funding": { - "url": "https://github.com/sponsors/DavidAnson" - } - }, - "node_modules/markdownlint-cli": { - "version": "0.48.0", - "resolved": 
"https://registry.npmjs.org/markdownlint-cli/-/markdownlint-cli-0.48.0.tgz", - "integrity": "sha512-NkZQNu2E0Q5qLEEHwWj674eYISTLD4jMHkBzDobujXd1kv+yCxi8jOaD/rZoQNW1FBBMMGQpuW5So8B51N/e0A==", - "dev": true, - "license": "MIT", - "dependencies": { - "commander": "~14.0.3", - "deep-extend": "~0.6.0", - "ignore": "~7.0.5", - "js-yaml": "~4.1.1", - "jsonc-parser": "~3.3.1", - "jsonpointer": "~5.0.1", - "markdown-it": "~14.1.1", - "markdownlint": "~0.40.0", - "minimatch": "~10.2.4", - "run-con": "~1.3.2", - "smol-toml": "~1.6.0", - "tinyglobby": "~0.2.15" - }, - "bin": { - "markdownlint": "markdownlint.js" - }, - "engines": { - "node": ">=20" - } - }, - "node_modules/mdurl": { - "version": "2.0.0", - "resolved": "https://registry.npmjs.org/mdurl/-/mdurl-2.0.0.tgz", - "integrity": "sha512-Lf+9+2r+Tdp5wXDXC4PcIBjTDtq4UKjCPMQhKIuzpJNW0b96kVqSwW0bT7FhRSfmAiFYgP+SCRvdrDozfh0U5w==", - "dev": true, - "license": "MIT" - }, - "node_modules/micromark": { - "version": "4.0.2", - "resolved": "https://registry.npmjs.org/micromark/-/micromark-4.0.2.tgz", - "integrity": "sha512-zpe98Q6kvavpCr1NPVSCMebCKfD7CA2NqZ+rykeNhONIJBpc1tFKt9hucLGwha3jNTNI8lHpctWJWoimVF4PfA==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "@types/debug": "^4.0.0", - "debug": "^4.0.0", - "decode-named-character-reference": "^1.0.0", - "devlop": "^1.0.0", - "micromark-core-commonmark": "^2.0.0", - "micromark-factory-space": "^2.0.0", - "micromark-util-character": "^2.0.0", - "micromark-util-chunked": "^2.0.0", - "micromark-util-combine-extensions": "^2.0.0", - "micromark-util-decode-numeric-character-reference": "^2.0.0", - "micromark-util-encode": "^2.0.0", - "micromark-util-normalize-identifier": "^2.0.0", - "micromark-util-resolve-all": "^2.0.0", - "micromark-util-sanitize-uri": "^2.0.0", - 
"micromark-util-subtokenize": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-core-commonmark": { - "version": "2.0.3", - "resolved": "https://registry.npmjs.org/micromark-core-commonmark/-/micromark-core-commonmark-2.0.3.tgz", - "integrity": "sha512-RDBrHEMSxVFLg6xvnXmb1Ayr2WzLAWjeSATAoxwKYJV94TeNavgoIdA0a9ytzDSVzBy2YKFK+emCPOEibLeCrg==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "decode-named-character-reference": "^1.0.0", - "devlop": "^1.0.0", - "micromark-factory-destination": "^2.0.0", - "micromark-factory-label": "^2.0.0", - "micromark-factory-space": "^2.0.0", - "micromark-factory-title": "^2.0.0", - "micromark-factory-whitespace": "^2.0.0", - "micromark-util-character": "^2.0.0", - "micromark-util-chunked": "^2.0.0", - "micromark-util-classify-character": "^2.0.0", - "micromark-util-html-tag-name": "^2.0.0", - "micromark-util-normalize-identifier": "^2.0.0", - "micromark-util-resolve-all": "^2.0.0", - "micromark-util-subtokenize": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-extension-directive": { - "version": "4.0.0", - "resolved": "https://registry.npmjs.org/micromark-extension-directive/-/micromark-extension-directive-4.0.0.tgz", - "integrity": "sha512-/C2nqVmXXmiseSSuCdItCMho7ybwwop6RrrRPk0KbOHW21JKoCldC+8rFOaundDoRBUWBnJJcxeA/Kvi34WQXg==", - "dev": true, - "license": "MIT", - "dependencies": { - "devlop": "^1.0.0", - "micromark-factory-space": "^2.0.0", - "micromark-factory-whitespace": "^2.0.0", - "micromark-util-character": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0", - "parse-entities": "^4.0.0" - }, - "funding": { - "type": "opencollective", - "url": 
"https://opencollective.com/unified" - } - }, - "node_modules/micromark-extension-gfm-autolink-literal": { - "version": "2.1.0", - "resolved": "https://registry.npmjs.org/micromark-extension-gfm-autolink-literal/-/micromark-extension-gfm-autolink-literal-2.1.0.tgz", - "integrity": "sha512-oOg7knzhicgQ3t4QCjCWgTmfNhvQbDDnJeVu9v81r7NltNCVmhPy1fJRX27pISafdjL+SVc4d3l48Gb6pbRypw==", - "dev": true, - "license": "MIT", - "dependencies": { - "micromark-util-character": "^2.0.0", - "micromark-util-sanitize-uri": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - }, - "funding": { - "type": "opencollective", - "url": "https://opencollective.com/unified" - } - }, - "node_modules/micromark-extension-gfm-footnote": { - "version": "2.1.0", - "resolved": "https://registry.npmjs.org/micromark-extension-gfm-footnote/-/micromark-extension-gfm-footnote-2.1.0.tgz", - "integrity": "sha512-/yPhxI1ntnDNsiHtzLKYnE3vf9JZ6cAisqVDauhp4CEHxlb4uoOTxOCJ+9s51bIB8U1N1FJ1RXOKTIlD5B/gqw==", - "dev": true, - "license": "MIT", - "dependencies": { - "devlop": "^1.0.0", - "micromark-core-commonmark": "^2.0.0", - "micromark-factory-space": "^2.0.0", - "micromark-util-character": "^2.0.0", - "micromark-util-normalize-identifier": "^2.0.0", - "micromark-util-sanitize-uri": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - }, - "funding": { - "type": "opencollective", - "url": "https://opencollective.com/unified" - } - }, - "node_modules/micromark-extension-gfm-table": { - "version": "2.1.1", - "resolved": "https://registry.npmjs.org/micromark-extension-gfm-table/-/micromark-extension-gfm-table-2.1.1.tgz", - "integrity": "sha512-t2OU/dXXioARrC6yWfJ4hqB7rct14e8f7m0cbI5hUmDyyIlwv5vEtooptH8INkbLzOatzKuVbQmAYcbWoyz6Dg==", - "dev": true, - "license": "MIT", - "dependencies": { - "devlop": "^1.0.0", - "micromark-factory-space": "^2.0.0", - "micromark-util-character": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": 
"^2.0.0" - }, - "funding": { - "type": "opencollective", - "url": "https://opencollective.com/unified" - } - }, - "node_modules/micromark-extension-math": { - "version": "3.1.0", - "resolved": "https://registry.npmjs.org/micromark-extension-math/-/micromark-extension-math-3.1.0.tgz", - "integrity": "sha512-lvEqd+fHjATVs+2v/8kg9i5Q0AP2k85H0WUOwpIVvUML8BapsMvh1XAogmQjOCsLpoKRCVQqEkQBB3NhVBcsOg==", - "dev": true, - "license": "MIT", - "dependencies": { - "@types/katex": "^0.16.0", - "devlop": "^1.0.0", - "katex": "^0.16.0", - "micromark-factory-space": "^2.0.0", - "micromark-util-character": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - }, - "funding": { - "type": "opencollective", - "url": "https://opencollective.com/unified" - } - }, - "node_modules/micromark-factory-destination": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-factory-destination/-/micromark-factory-destination-2.0.1.tgz", - "integrity": "sha512-Xe6rDdJlkmbFRExpTOmRj9N3MaWmbAgdpSrBQvCFqhezUn4AHqJHbaEnfbVYYiexVSs//tqOdY/DxhjdCiJnIA==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-util-character": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-factory-label": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-factory-label/-/micromark-factory-label-2.0.1.tgz", - "integrity": "sha512-VFMekyQExqIW7xIChcXn4ok29YE3rnuyveW3wZQWWqF4Nv9Wk5rgJ99KzPvHjkmPXF93FXIbBp6YdW3t71/7Vg==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "devlop": "^1.0.0", - 
"micromark-util-character": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-factory-space": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-factory-space/-/micromark-factory-space-2.0.1.tgz", - "integrity": "sha512-zRkxjtBxxLd2Sc0d+fbnEunsTj46SWXgXciZmHq0kDYGnck/ZSGj9/wULTV95uoeYiK5hRXP2mJ98Uo4cq/LQg==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-util-character": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-factory-title": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-factory-title/-/micromark-factory-title-2.0.1.tgz", - "integrity": "sha512-5bZ+3CjhAd9eChYTHsjy6TGxpOFSKgKKJPJxr293jTbfry2KDoWkhBb6TcPVB4NmzaPhMs1Frm9AZH7OD4Cjzw==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-factory-space": "^2.0.0", - "micromark-util-character": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-factory-whitespace": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-factory-whitespace/-/micromark-factory-whitespace-2.0.1.tgz", - "integrity": "sha512-Ob0nuZ3PKt/n0hORHyvoD9uZhr+Za8sFoP+OnMcnWK5lngSzALgQYKMr9RJVOWLqQYuyn6ulqGWSXdwf6F80lQ==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-factory-space": 
"^2.0.0", - "micromark-util-character": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-util-character": { - "version": "2.1.1", - "resolved": "https://registry.npmjs.org/micromark-util-character/-/micromark-util-character-2.1.1.tgz", - "integrity": "sha512-wv8tdUTJ3thSFFFJKtpYKOYiGP2+v96Hvk4Tu8KpCAsTMs6yi+nVmGh1syvSCsaxz45J6Jbw+9DD6g97+NV67Q==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-util-chunked": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-util-chunked/-/micromark-util-chunked-2.0.1.tgz", - "integrity": "sha512-QUNFEOPELfmvv+4xiNg2sRYeS/P84pTW0TCgP5zc9FpXetHY0ab7SxKyAQCNCc1eK0459uoLI1y5oO5Vc1dbhA==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-util-symbol": "^2.0.0" - } - }, - "node_modules/micromark-util-classify-character": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-util-classify-character/-/micromark-util-classify-character-2.0.1.tgz", - "integrity": "sha512-K0kHzM6afW/MbeWYWLjoHQv1sgg2Q9EccHEDzSkxiP/EaagNzCm7T/WMKZ3rjMbvIpvBiZgwR3dKMygtA4mG1Q==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-util-character": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - 
"node_modules/micromark-util-combine-extensions": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-util-combine-extensions/-/micromark-util-combine-extensions-2.0.1.tgz", - "integrity": "sha512-OnAnH8Ujmy59JcyZw8JSbK9cGpdVY44NKgSM7E9Eh7DiLS2E9RNQf0dONaGDzEG9yjEl5hcqeIsj4hfRkLH/Bg==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-util-chunked": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-util-decode-numeric-character-reference": { - "version": "2.0.2", - "resolved": "https://registry.npmjs.org/micromark-util-decode-numeric-character-reference/-/micromark-util-decode-numeric-character-reference-2.0.2.tgz", - "integrity": "sha512-ccUbYk6CwVdkmCQMyr64dXz42EfHGkPQlBj5p7YVGzq8I7CtjXZJrubAYezf7Rp+bjPseiROqe7G6foFd+lEuw==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-util-symbol": "^2.0.0" - } - }, - "node_modules/micromark-util-encode": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-util-encode/-/micromark-util-encode-2.0.1.tgz", - "integrity": "sha512-c3cVx2y4KqUnwopcO9b/SCdo2O67LwJJ/UyqGfbigahfegL9myoEFoDYZgkT7f36T0bLrM9hZTAaAyH+PCAXjw==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT" - }, - "node_modules/micromark-util-html-tag-name": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-util-html-tag-name/-/micromark-util-html-tag-name-2.0.1.tgz", - "integrity": 
"sha512-2cNEiYDhCWKI+Gs9T0Tiysk136SnR13hhO8yW6BGNyhOC4qYFnwF1nKfD3HFAIXA5c45RrIG1ub11GiXeYd1xA==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT" - }, - "node_modules/micromark-util-normalize-identifier": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-util-normalize-identifier/-/micromark-util-normalize-identifier-2.0.1.tgz", - "integrity": "sha512-sxPqmo70LyARJs0w2UclACPUUEqltCkJ6PhKdMIDuJ3gSf/Q+/GIe3WKl0Ijb/GyH9lOpUkRAO2wp0GVkLvS9Q==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-util-symbol": "^2.0.0" - } - }, - "node_modules/micromark-util-resolve-all": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-util-resolve-all/-/micromark-util-resolve-all-2.0.1.tgz", - "integrity": "sha512-VdQyxFWFT2/FGJgwQnJYbe1jjQoNTS4RjglmSjTUlpUMa95Htx9NHeYW4rGDJzbjvCsl9eLjMQwGeElsqmzcHg==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-util-sanitize-uri": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-util-sanitize-uri/-/micromark-util-sanitize-uri-2.0.1.tgz", - "integrity": "sha512-9N9IomZ/YuGGZZmQec1MbgxtlgougxTodVwDzzEouPKo3qFWvymFHWcnDi2vzV1ff6kas9ucW+o3yzJK9YB1AQ==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": 
"https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "micromark-util-character": "^2.0.0", - "micromark-util-encode": "^2.0.0", - "micromark-util-symbol": "^2.0.0" - } - }, - "node_modules/micromark-util-subtokenize": { - "version": "2.1.0", - "resolved": "https://registry.npmjs.org/micromark-util-subtokenize/-/micromark-util-subtokenize-2.1.0.tgz", - "integrity": "sha512-XQLu552iSctvnEcgXw6+Sx75GflAPNED1qx7eBJ+wydBb2KCbRZe+NwvIEEMM83uml1+2WSXpBAcp9IUCgCYWA==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT", - "dependencies": { - "devlop": "^1.0.0", - "micromark-util-chunked": "^2.0.0", - "micromark-util-symbol": "^2.0.0", - "micromark-util-types": "^2.0.0" - } - }, - "node_modules/micromark-util-symbol": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/micromark-util-symbol/-/micromark-util-symbol-2.0.1.tgz", - "integrity": "sha512-vs5t8Apaud9N28kgCrRUdEed4UJ+wWNvicHLPxCa9ENlYuAY31M0ETy5y1vA33YoNPDFTghEbnh6efaE8h4x0Q==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT" - }, - "node_modules/micromark-util-types": { - "version": "2.0.2", - "resolved": "https://registry.npmjs.org/micromark-util-types/-/micromark-util-types-2.0.2.tgz", - "integrity": "sha512-Yw0ECSpJoViF1qTU4DC6NwtC4aWGt1EkzaQB8KPPyCRR8z9TWeV0HbEFGTO+ZY1wB22zmxnJqhPyTpOVCpeHTA==", - "dev": true, - "funding": [ - { - "type": "GitHub Sponsors", - "url": "https://github.com/sponsors/unifiedjs" - }, - { - "type": "OpenCollective", - "url": "https://opencollective.com/unified" - } - ], - "license": "MIT" - }, - "node_modules/minimatch": { - "version": "10.2.4", - "resolved": 
"https://registry.npmjs.org/minimatch/-/minimatch-10.2.4.tgz", - "integrity": "sha512-oRjTw/97aTBN0RHbYCdtF1MQfvusSIBQM0IZEgzl6426+8jSC0nF1a/GmnVLpfB9yyr6g6FTqWqiZVbxrtaCIg==", - "dev": true, - "license": "BlueOak-1.0.0", - "dependencies": { - "brace-expansion": "^5.0.2" - }, - "engines": { - "node": "18 || 20 || >=22" - }, - "funding": { - "url": "https://github.com/sponsors/isaacs" - } - }, - "node_modules/minimist": { - "version": "1.2.8", - "resolved": "https://registry.npmjs.org/minimist/-/minimist-1.2.8.tgz", - "integrity": "sha512-2yyAR8qBkN3YuheJanUpWC5U3bb5osDywNB8RzDVlDwDHbocAJveqqj1u8+SVD7jkWT4yvsHCpWqqWqAxb0zCA==", - "dev": true, - "license": "MIT", - "funding": { - "url": "https://github.com/sponsors/ljharb" - } - }, - "node_modules/ms": { - "version": "2.1.3", - "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz", - "integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==", - "dev": true, - "license": "MIT" - }, - "node_modules/parse-entities": { - "version": "4.0.2", - "resolved": "https://registry.npmjs.org/parse-entities/-/parse-entities-4.0.2.tgz", - "integrity": "sha512-GG2AQYWoLgL877gQIKeRPGO1xF9+eG1ujIb5soS5gPvLQ1y2o8FL90w2QWNdf9I361Mpp7726c+lj3U0qK1uGw==", - "dev": true, - "license": "MIT", - "dependencies": { - "@types/unist": "^2.0.0", - "character-entities-legacy": "^3.0.0", - "character-reference-invalid": "^2.0.0", - "decode-named-character-reference": "^1.0.0", - "is-alphanumerical": "^2.0.0", - "is-decimal": "^2.0.0", - "is-hexadecimal": "^2.0.0" - }, - "funding": { - "type": "github", - "url": "https://github.com/sponsors/wooorm" - } - }, - "node_modules/picomatch": { - "version": "4.0.4", - "resolved": "https://registry.npmjs.org/picomatch/-/picomatch-4.0.4.tgz", - "integrity": "sha512-QP88BAKvMam/3NxH6vj2o21R6MjxZUAd6nlwAS/pnGvN9IVLocLHxGYIzFhg6fUQ+5th6P4dv4eW9jX3DSIj7A==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=12" - }, - "funding": { - "url": 
"https://github.com/sponsors/jonschlinkert" - } - }, - "node_modules/punycode.js": { - "version": "2.3.1", - "resolved": "https://registry.npmjs.org/punycode.js/-/punycode.js-2.3.1.tgz", - "integrity": "sha512-uxFIHU0YlHYhDQtV4R9J6a52SLx28BCjT+4ieh7IGbgwVJWO+km431c4yRlREUAsAmt/uMjQUyQHNEPf0M39CA==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=6" - } - }, - "node_modules/run-con": { - "version": "1.3.2", - "resolved": "https://registry.npmjs.org/run-con/-/run-con-1.3.2.tgz", - "integrity": "sha512-CcfE+mYiTcKEzg0IqS08+efdnH0oJ3zV0wSUFBNrMHMuxCtXvBCLzCJHatwuXDcu/RlhjTziTo/a1ruQik6/Yg==", - "dev": true, - "license": "(BSD-2-Clause OR MIT OR Apache-2.0)", - "dependencies": { - "deep-extend": "^0.6.0", - "ini": "~4.1.0", - "minimist": "^1.2.8", - "strip-json-comments": "~3.1.1" - }, - "bin": { - "run-con": "cli.js" - } - }, - "node_modules/smol-toml": { - "version": "1.6.0", - "resolved": "https://registry.npmjs.org/smol-toml/-/smol-toml-1.6.0.tgz", - "integrity": "sha512-4zemZi0HvTnYwLfrpk/CF9LOd9Lt87kAt50GnqhMpyF9U3poDAP2+iukq2bZsO/ufegbYehBkqINbsWxj4l4cw==", - "dev": true, - "license": "BSD-3-Clause", - "engines": { - "node": ">= 18" - }, - "funding": { - "url": "https://github.com/sponsors/cyyynthia" - } - }, - "node_modules/string-width": { - "version": "8.1.0", - "resolved": "https://registry.npmjs.org/string-width/-/string-width-8.1.0.tgz", - "integrity": "sha512-Kxl3KJGb/gxkaUMOjRsQ8IrXiGW75O4E3RPjFIINOVH8AMl2SQ/yWdTzWwF3FevIX9LcMAjJW+GRwAlAbTSXdg==", - "dev": true, - "license": "MIT", - "dependencies": { - "get-east-asian-width": "^1.3.0", - "strip-ansi": "^7.1.0" - }, - "engines": { - "node": ">=20" - }, - "funding": { - "url": "https://github.com/sponsors/sindresorhus" - } - }, - "node_modules/strip-ansi": { - "version": "7.1.2", - "resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-7.1.2.tgz", - "integrity": "sha512-gmBGslpoQJtgnMAvOVqGZpEz9dyoKTCzy2nfz/n8aIFhN/jCE/rCmcxabB6jOOHV+0WNnylOxaxBQPSvcWklhA==", - "dev": true, - 
"license": "MIT", - "dependencies": { - "ansi-regex": "^6.0.1" - }, - "engines": { - "node": ">=12" - }, - "funding": { - "url": "https://github.com/chalk/strip-ansi?sponsor=1" - } - }, - "node_modules/strip-json-comments": { - "version": "3.1.1", - "resolved": "https://registry.npmjs.org/strip-json-comments/-/strip-json-comments-3.1.1.tgz", - "integrity": "sha512-6fPc+R4ihwqP6N/aIv2f1gMH8lOVtWQHoqC4yK6oSDVVocumAsfCqjkXnqiYMhmMwS/mEHLp7Vehlt3ql6lEig==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=8" - }, - "funding": { - "url": "https://github.com/sponsors/sindresorhus" - } - }, - "node_modules/tinyglobby": { - "version": "0.2.15", - "resolved": "https://registry.npmjs.org/tinyglobby/-/tinyglobby-0.2.15.tgz", - "integrity": "sha512-j2Zq4NyQYG5XMST4cbs02Ak8iJUdxRM0XI5QyxXuZOzKOINmWurp3smXu3y5wDcJrptwpSjgXHzIQxR0omXljQ==", - "dev": true, - "license": "MIT", - "dependencies": { - "fdir": "^6.5.0", - "picomatch": "^4.0.3" - }, - "engines": { - "node": ">=12.0.0" - }, - "funding": { - "url": "https://github.com/sponsors/SuperchupuDev" - } - }, - "node_modules/uc.micro": { - "version": "2.1.0", - "resolved": "https://registry.npmjs.org/uc.micro/-/uc.micro-2.1.0.tgz", - "integrity": "sha512-ARDJmphmdvUk6Glw7y9DQ2bFkKBHwQHLi2lsaH6PPmz/Ka9sFOBsBluozhDltWmnv9u/cF6Rt87znRTPV+yp/A==", - "dev": true, - "license": "MIT" - } - } -} diff --git a/web/package.json b/web/package.json deleted file mode 100644 index a61eee2..0000000 --- a/web/package.json +++ /dev/null @@ -1,5 +0,0 @@ -{ - "devDependencies": { - "markdownlint-cli": "^0.48.0" - } -} diff --git a/web/static/CNAME b/web/static/CNAME deleted file mode 100644 index 37a420d..0000000 --- a/web/static/CNAME +++ /dev/null @@ -1 +0,0 @@ -aixgo.dev diff --git a/web/static/_headers b/web/static/_headers deleted file mode 100644 index 2025224..0000000 --- a/web/static/_headers +++ /dev/null @@ -1,20 +0,0 @@ -# Cloudflare Pages cache policy -# -# Hugo doesn't fingerprint CSS/JS filenames by default, so these get a 
-# short browser cache with revalidation. If we add resources.Fingerprint -# later, fingerprinted assets can move to a long-cache+immutable rule. -# -# Reference: https://developers.cloudflare.com/pages/configuration/headers/ - -/css/* - Cache-Control: public, max-age=3600, must-revalidate - -/js/* - Cache-Control: public, max-age=3600, must-revalidate - -# Brand images rarely change -- safe to long-cache -/aixgo-logo.png - Cache-Control: public, max-age=31536000, immutable - -/favicon.svg - Cache-Control: public, max-age=31536000, immutable diff --git a/web/static/aixgo-logo.png b/web/static/aixgo-logo.png deleted file mode 100644 index 5dfdb2049af520710f10e214b3fd3b17ff2144d9..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 14528 zcmXwA1yGw^(?v`0K(ONOQmnWIch^$fOL2F1cXw}#Q`}uj30~Z#xD|&#yx;sYXlAn6 zd-v|%v*+v+siGu}jzWk60|SFDD0|W02{W}Zr4*C=RFV+v}FJuQ9UFeIbpto;W zn5-NE=!>wPg~5=syUSVv1rgFm>^$PsWHaFua>o+}OL%Xx@Mvli$-ls5f96JNwEb&&B_Y2WI8QsiF9KhU?;q|wMf0ofjFSgCAZB>T|a4 zKk@oz``b{9!rzezEuWo9Ze=}YytxbmjdZ7vk(>#WhVZ?HJW7-OuKqj-ONP-nhZ*-2 z!VZ2bHddzA*a6e@0n^wgdU#RjKl;x`Mg2H>I&&3bvppU;is$0%|i>V#ub z&qXnM_l<{T+Y~lD+i2OjmY~`>^@M``VBb?u6o> zFlhRB=1W$Igf=)3O!gggqFJ(e8$O>8hi8+C6pj$MgZpmJTRmE(bZTz7z~?N_34{I% zvkmKJ*BzZt`Rmg=ioZXRga%!Ce3EhKAQOI}2y?a=%`ql$Sa~?SH^p9~RG2$ss;dZ2!3SJ^nC!gFTH z zMJDDD@1AD|%m{kfLFFpx8H`!fnxotdPzxu4`|;_Y9f@Z%b;3Ad%->@yQ5E!@RJtY$ zol~AcPG?PKlONFN3)%$AlZBT$T9k%9fqA~rxip*tQ6{w0-^T|&`YBPA#Z57XFup8M zz^s(X0K{fr`hP{|W799;La^jSQhvOnbts3}UjjH7?WO5S{FN@;HY_Yw-b>Mw^Kzi| z!%(2^Y*lR0RGmu=eag5;F;1N3;Kd(YOjE(*<%avQH=HtpsAk-y6kvsok7BCqdZyB` z@>P|x{QCNFzHj4)Ww%|TDx=rD-RsrH3q+m|5UHOjFr#9&kWw8SL444Z7N0zS)s$dN z@D8SRqZd{ZJi|{pUSJv-jJYd<5EMriW;dE|CEJ(jlXd24)L@2>4Xe_d+#qwcAVdRn zVow%eQCOQ4U;mapU&)ll?1=?|vyG2oZ*mFM9{GR<5BKV-a6@B!i~Rn5jWF@Qgm*$A z74+7J-Z{q4P0o&ZuP1J`Zr^Ti^0b%lq*!!Lt2RPNt)W3j|do` z^{Ie7N^@!tFeHZMi}uU%_>H&fp?7uhA^%$0cIAAmK=_5sidD{fM)D}+bG_Y)=e3mO z$^EwN-vTpc=ou>ZTO7P|P&C0PKpix8Wzc=N7ZD 
z8h7ZS8fE$h(>Pv%3>s|ntAohfk#(?}5sKmZX6oH}-cDq(7A3Y2U+*}TcddDVD?{jK z3+MNm6=O~-1TGgqw z@1+f}C>AYzb2}Vkg!#l_Ng(xgrwC~ohk@i0xhaPqw?nCTnZ?#J&aJwd%n7iCoKL#t z?MMg;qEiJX?b5k=jp`bG7F>d0oE`ZRwF^eNO}P5-uy#v3fYX)Q56xXd{&Q5yXBJbn z<#Qe3c~H|4PMg)EBvGrIVS6q?oAc&si&z~wwxXqM=Z=>4A?@72NvC0rHP9;1RW>AO z>!fqtudeQ1HBnNadx^dIVpWa4l*;z46f^-#4@Mm`^lGf$w-J86SpFuPNz!=>M(F^nmlLt)w)d=> zPME`*Q6`VR=L(er*xVOh6=2Tnfi1tgu6`4_(jsos?$}i9t+>wP`DEysg1^od43=V^+N#UCPd4Ak0t)WNJxDKggkcmDM$%H|h|MaIE8zR9j z=>2dWc0svxe+mLpzW7yu>2TmDc)C}=&*KQn_fTPnTJzsy`KwPE zzDofAfWj5G5gz#KG<~Eqq47*DwBkG@RueqUWK)ie7z^TP@Fe;cLKVM_H;o;^nmHpe zDFsB^MzAJnCvl!ZgCySqVi{Ug`^R9u9eX+XGG_Y3aMC!ags5m}iDcE;DDE(RMEs4t zS`jm|9QtnY&wD!_nOmb_(bwa^3 zPL4%X{p^2mcjn=j%_L-5Bc`)=oi~$PA$d8gcW9<4)^p|$`sTN+xPj6Z+pZ~cS$IP8gq&f<}@ZdzpNv3*hMrAl%q}RoQpFm z17NZz%z3&u9kS8-d1cYIQ_=B-GE2TjrDf{!;#q4JJ0+VfkUQrGKpO7N+i{K zRIw(`^sH-1W(6B4kF4^o9yO4(S`;FNTy3F_nAv>~G9MENV^8HBVj_YjYK|6zUQuQ! zs-`iXAK^Le7Wj!THO8w)4GkIf>X$v6cClAg#-i+%=*eVdd=9>)Ngz6vdb1#1UXa3` ziE1xLj9!FsM?GQ@-LQ3P5lx0JZ^LyZEeFW){8Ya{yG&7x)|WIG>Da2sIGqs;(wE=K zLNakz65t_8JNtn=tpXNw)pEyR&5V{o9RE;77j!WegBOuUT+R}e=2}6O7p}qV$ANfv zN-WCx!x?GPQP1+` zPju04HqR{8hqljE+?g|{#Gtucl{{q7I=vT{?Fu@%$BtO)trDofN_Z*vktpIUN3o$P zWa@zbgtQAxp26N?oFh*BH&s@~=9EMI$SS(1)bv%>qXLLR*=QJkHyT+Jd=GocX=}n| zn~eev6+#eV-4QFmgs`B_Geio?L2(a|c@9@VmSmk_$4MRp14?*avM(qvP< zJED~M;6G(>1Si49?#0{}2i4PxRS|WFV(`JE!jZ4l-i22s6@8Iz1n=OdC+E={Nmlp{ z4sTVp*=413vgcvvcUbgA9`o+P69z)T6HmWqo@~E{?9|{VbucN)ojYCR=5AYJP-c!> zjF)~cYY$^bvUZ>Q3gV?`X{SZ622O&jtuocbTOYXusW0H}$dF;UvAS=L?}N`L>D=m^esBUA!eWUICvVs$yYtd7a$B!xl)?dW4~{N-t(*~>I{eZjnXR~F8drba?kYp1K`=U1 z5mjLBf#&**BX;v~NsnNMHg5z^rGO)5jsL{7dFn*GMV2M5EWI!`M~tWO+UL)uH6{lm zyMYsU?_uqJ3I#fAbNv+;gTbY465zRWXg$^b!^{|h-ESu#&}CpwnSQcQY+irPK7(1a zteDbit4M2qMsn<(IE;17PoqImhk6_{?0%%RH+hEj$!pRihtbP*r@4>Ko{ri^dG8Yc zn*aTWo$+E-mS_y2AFlc#sMUTM3NLid`{(_de}@wWI>m>Dcqzp5Yhu@)f8mb?CisI0 z{+zC?+3Q_|uKpSks{grvFW?Bs8_~vmb*fXwz5E+tk|#HHI`WHAfU2R$)RDrGZ{m5T zpi<+GF%H3nU` 
z*k0E%Xo!vrhMUV1V>R8tJQ%!*>h+Uf6W29EC_+IY%JgR(h~p0RXbpc&jB12t{bU%A41`(Z)idZI!AkcnsNilyeTyB{sg5 zW{gz0rY`C`Qk|Y%~RWE`pTc=J;qVkD{L8jUs z@yLbH2Kt8Z5@1YHwsD4)43PZX_kjj0v{nTo;j)&MG{SID+-a)Q@y1$AAUULdpdrsH zv-(5f9~uCFefDF9?GM#%MD0>pCvx9!JUT`E^dLw*+(hG~>MBc^HqtrxNQR7_7$Qf` zk(k@a5V?{;5QWrN{Mqddo@XFySnpgEal=9cCFPX>5$Ac`1u|18Y@oXr$%DyQ-?Rku z$T^&+V=W_pbJx_#P*Z`gNh4IXuq&^F{MfX=Ik1QS>Rle8#^UDJN@rX)e~AqrCVU5YTWTT&{@3BKstUW-ty5(&r9eK4bjB zF6fHdoJUAn+XNJ`TxF-{G^4;;zVG$T26H?kE17ui(d<6duD0qAFxI^V;;I)%NOt|( ztatpjz&y32{Oe=}PN+T&2V4B8z0=-1rc z7h~!uRGALvD>rJv?Yh4e3GNuKb{21t*U{K@zA}+xDT!PinR4ua_g%x;8~Rq7Cq>V3 zB7ogtHaVVeJ2Cws24p2k`0Eq%b_Zi4AZ|<-6y&;JpurbfD`lMyvlQ|3V}M)tGNVAUHs59(GQfHB=5%i3q06DFiyyduKmTD8VLFK zDygxulgl$^yOHqq=NE~vfBQ_e@lub;Rhe7sES5)r!O~S3ud(R1Dw0C76=2O4m@lT`rtbJ#4 zCr(U(c}r$ALbP)U<&>#E4^exP2xT*yyh@VeV)(z)z=YI(9I{01&J=;qf0xG#`~p3; z2jARq6pg>u(`F@vGEgyT$&bsnAVSo$Fn~e=LEEPTMXW+|33=ZJu7W_LKaL~}(fuiXI1|iQe&7T#Y1}f~|X{XDwP{ zv=6P16AE)sI+EwB3A7ZBsB@s1vT8I28mU4+w4zbS7IUa!>|Nb;MdnMP+w*sB$S8T5WFY+UVSu*AZ8(xC`n*Bi=3$D8m?N{oFK!a zC(7ns#er<%<+0U2ggSaXzx%7+cPqTUsQ-3pN2|6Hku<16c%kcqy|pf|uW*KMG{9GNM|y5J88 zU{`QQdTEgXB&eRrV;@I0S<9VKZ7h|HsX}n|mz9pe8gA%+BuWL_dyatH|5V5Z#-%Pr z?7sKt*e}M;ODv~zt}WIH!$R`qN;ch4^4{Dlu~7!XQ4}oX7X$ndY-bgMQ8hhF(y(0J zS>blawhM9}*(-`4tb|isdr{*i_f2mdLM$zorIz*)u2zmXooqPLV~LI8Ox{Zo1Ge%o zt;##8Um2R{$bkgzC40WaX>ZV^GecQ^t^|jM=We&S)}mtaBS2cEzZ)Bqa4B*USWb$OEpwLRjJup~~YdnB3wbJUgB%Vs9c3mR#zt{s!qu=rVQxVf!W*qeT;il6S zuPc_7c_zWGBnWGh5cWSG;ZbqFV&MjJlHWc#Iw<6W$hSLxHIJ@Xb*TM>@_?m1SprC_ z{w#}8MudJ5B^I(?3yJ^&?m8OsB@Kqj{BC~SWR-z~irtOfb3(}wkOJn2WGePJ!gP2& zEuQZr>Y}e(jD2SQ{)*A*c<1V}&q8MZY9)J2hKy<(a?fK;=qD@%@nlJQZ#W}+S{cxC znnqMp@*w59x_X((;%Y$#3<(iV&j#VoZyG|;BKw&jxRS3_CszgD69^+aYZhq}=6Oyb z&1}X5kiZ@E(6(2-B-~!EgkxklWTjDynBFYHC+ec0C~*kI&Ht{v`5sUC!| z$o!oVY`M`1GAwdhKPUb^G(MB}ZMSSgWCB+yoH1LX7l>o1T*!s0MYz2#3J@ne;B6R~ za_w$o>)%?hFKwI+vm>6)6G;@ptOOxP%g2jkG>gNTAaXp+#(wdk!vH*X+TVqo%6$Fw zG1q;JHA-`28`+j z85qIEzQV@C+;Kcbg;yScW-)d3PE^=D0Z5))yr|@JRTUJb%H5XDjnmatai&lgfM+Gw 
zOvxK*nU0NJ5S2!8_ij18pWTNoIFZ4l0dW}0|N9g@JiHnS>@oxfGFBRLk5?UoQAR$L zo8<(BufPhQT6QFPnm0Bs6~TH;;3E5ShVt!2$;q*A`_A7WBlmAYX_e@NkzFW}6(W78 zg)nn4^&yqPEvpH(!m4sL^6j2t<{S!%W1W!?DciV8TE;mx!f7en4$l~iiExztsM#ZW zJDsI7B>mA}7)0#JY?>oMWU%QKvfZ+DB{ys-Izp!+~=F&B&#BpKxgK#78e94|MB9(})HK$#Y54y>-wi%%~0lfhx@Q?d@5eE%^g z9^#PnJbn68-YS)XQjRzj*%iaqCTR75z{O5XK@5g)OS6Gw{{VmsINTCNuYIn-q0FG) z(SZ7!L>gyKV6weYB{Rds07v5$9y;l57x&=r#U#G|2;v{)u<|+DHj*D=RaD=pOYMY+ zs)X`$xi*zW=5tNq_!u}S4rH>lQT?16q_L zW{o68XSy-IPv$t2VFCWTFch&xRM}c-vx0u3ySY#DBGCy`h#w|H-jSaGaB{gGhW8#K zi~3OBhhec$K|q0Mc%gS!pS-(2^L@!Qq-ilfc=aCYPMWSaM4FM7xg?b_jfD!5AIt~a zhMl6ppw=?)(O-dX``ElUU-sfv5HSiIPC?$e927Mvgn&=s4=g?Zl-~EvwD&_nxm1Z+ z8l`j#?pZ=bY*HF$P{48gq=uTl@QkQ>pgUTlKq!96O(m>Z*@j<>;*bC}Czdn`63?V4 zs>;{e8Qx?DZ#l)@A(3_Qb|>UOj;{l1b?ox}5h=Ye8rK87pqurvDu0-Q$o!BRfNi&< zpNmM$CnFSLHU%GXR24G`#4UrbBsq8zJa7e&SAMaxF~n(WsPbX?D8W+SrsQv!jvHb)zMe=pVH+q6U@U~-$Y0#C<;DJ~0rv`@2%8nf1 zZ~8K<^eXISK(!K(c;JfbL9eyHg#gr-_n8OCwYJrc!hO{*;pN**2#u`ej8Mt)^5-Is z4=Zh(^RiivB$)ik)Y+FV$6gU!J#l&8=I?4JCQhw;cQzk560ZS6K_GH%55V}?ElG$w z;)pNa-&E&6e+79vW$146snNtCpSkhYM3aP(4dolt5NuFb(?)(ttmQ5<6wN@4u;I2K zhV$uU_M=}n~?)vQ8>wXb@7L}5CB<=A{89$ldG%VOICj35OKZp7J<&08e z2?<(vJ2BJi*&opR`mmQUFVQt;iw>%Dueqt8w-UJShY`3jQnCSXa1u-S4-)b#QFRdc z*;P_L%elj}aY|>H-fBXO?}YZb*eY`ViwiQX*EI28twv&(i==phL1ifer;nF{JRd(h z2>UG-r)|x>Vk2NNnhmJ^X+>FybtB0@Xp<|VPxx&VZ47M_B!VOJxFY#+S+Hig;z4bx zsNxI(TC%;+`X@AKlV3RWkL0D=MB2ad2Cm#`@X-x%Z zcSfS$fFasYp$@^ej$x1ZNt2Tp3f6rRQEHXMWHiNAudd zv`!>#icGY^{W#IyAV>sNo?`&O5_5-Q;9r~xAl$bHRLKUm^6^U!;!|v{mv&%C2Yj&?Act zK2L%abLI{bBOxp+Ik4DOQeNyN7aaJ+TF7att@PI01wb|upMkGsrW5E3B>^Kr<$AQo z8@IbA9}qDuZTwwR?fwok{sac89!=&3Iv_o)RK~(LAq+gz+#+dTcM(fNN*+Ix3@ zY!gU#^&cJNIOJ0QpW%WF3&|;!>s~0F3myZ(!>UCQ&77D-=S!HN_x05GUQq|w4BvdFr)N`ekU=(Xzz;RoQ#`U&4-IZ;K zZyuFvIu;*5eI&w5`h1PGzBAvjzdWlVB`GgszIawaO&p95JZ~$JcXrv?JlnR%YUQap zkAs#OSQ%lx=C@P4U@NE*Jm2FIO=+mx z*Z{=7zF8HvmFplAh}sD}_`_;>5-_nZW<tP@aq7uZac?DzoJ_;Lxh@(*>I6N>lu3+aY0O3zn-aco+yB;T78WsopSaB_ 
zyh8R$TlsaDa;c2|{+-;_JWrtvh&`EQ2#5`(xee-WrgdKR0l9FFenAR@$c#w%20(Wb=gKHVsCSHw6?uwA)8Ol^Lgh# z0IpkB$+23fi=lfSnsAJKoRIRt!DhZl(CJSbxP;ih_r~$(vafZm2C_+WBvUUCBo~6v zKyK^8(3;R&N2(Q(YStMs#;ju>^aG`XyS3QeCP6X!DM-?X_o)R)O1`X=Nz&YLh2xcE z)t7S|QXinzas@vrYj!0@4|z4|+#$!?*3O`@o>BCcEcMl|#q&a0gVVD%Qs_aF3N=VK zh3U6xhcb91rY+vLD&f3MFjUD&IoMO6L(BO+ns%t0Zs^LhRnT$N>7Xf?&YI^5dwIl` zkpIu^7_;4Px%ojd27fvnCVqofGiYsLA4i$0s(FW+HZmA{O4P~ExVED8JDkRerK|{s zgW5^OG;Px>NTp6=kQ0pg&Z3b#15q)FV5dnWm9Pwlk*J1=F!AdR1bkUI6BHQN5i0uV zXttVUlWcNcx38{pR5JYVGVi$8c=xaQ_4di`?v?CdK@5w|X*2VwBtimOF{@Eb^oO|0 zDD|H_z5`b2ikm`u+OCc0{10eD5CR;l-}T-_SvGtEPK+fGcyYyRxQGcv;5uz;HWlZF z7NLw#Ojek$I|Mb$uF&rtsw%{^IWcHP#OYB!isS{m-CXaZzX@2-P2vguW{T;r>qPPZ(>?c};&<#gFW1ZCgPz~BVB{iq z+6icFAk-cHVv-zLw;%Q5qLNi2yTmD1#FRDMvPk5iE3%bWV2k<#K-1azh(TiXNTZ>c7#8b2F}sPKKqoQFHixXs_Q6pL1Q3R z@}mY0)zd4zg--iAR5e#aZxLzv{3=zXyjPEm#G(*B;j!k-BW%drA6vLzvV9P>V3{V9 z6GXdD;dHh70Kcju(1fdaS** z%N0t1-P)n9Qe~5q6%l^ALUM4LV@A7`dII`Veh20GB#S?Ikm?4obQDfsdR4rW&O8Wv zQLPfG1gPHSST7iJ+&juF)?+>qJN-KuZy@Rs?OMBoYZFfUfO1?5aLx)x67?hy9L-_LLldLUyb+Q^o zKtAT4x9cfq>L_%MnfQT6XZv@St*UIunLCwFw@U6iDDD_Mfa~;f#^SHizu%9ZBS86< zl?SlpD5u>DPSxDVQ(e#u`GA+m}OiuEU`{fZP$e+uiHDW5CKMmf{?&oS6}n zj}yLdVVP$KO6Zy)Zp_lFe8roy{04l8GBu*bK$q!5n@rHVw29QV_7LBu{ zsoGU+sC8?AG`#$Un)I%(*3^)a9F@l7tD;c~J4U6SiAUL@+US>jSAHs)oqCC?ps^F( z0y@^E+;2-jfo}XVGp=Y`I@ivu+*!WSHfIyb5ZZ!US7o0j8MfAjTC-}=%Wy@o6cv`< zPp^~iFwz?nJ7rB3Y8tMVJB+1*=(Lg0Zvp!Ga#EmHZE01P9gf|xwsr*77~J}A1{I~1 zQC(jp>6Cj=(>C>1XqhM_6z3vypl4LfY+De@vGX&qL@*vg+ZXJB(HsXbVsdt(cwN1k zUjYZf-J=J|C_{h?@?~A$8Gwiz>r6FNr@Tb=?#v`Z5b^~Q1%R%VvRsEl8^~+{T=NKd zkw=|2zR#<*p8MYQC|*8IbfWM9_mq2q0+YtCsCdjKNuCCZPd5CBZTjXIwezB#1w!I9 z>={gQlW2k1tx#Q?9{YO)kj@o~1ZbQQ6HdTb*Q_b32yw6Cs4yaa%h=HII^YR1oGc_N z4Xm?8SNKWmuecIom`Mx%LWRC3;~yayL&ciQ(;PQBje&p|nMfw6U@eKjdhchABR3W& z@`EZ73K5wQ?6*_D_B`Tq(P71`|CG8m!q{5PLgN9_a{OSLTA{*2`U@8wxoq(*92eYSY;Nns#+Bo^`*;uiYr3W(on z=%#%?({^^4Hh~-1!xl$0Uf)B%HL^V>p{yd-NC&W&10zxdNS2{~8}y!7eOZVmN_5#F zf4hxkVqOJ7hT{=`i^tPXv+A4?4zBnwCF~UDaoy%|ha^Y^S32$vC(DnT=$4txm}BEA 
zSD$x{EI-g23dM%bON6D4EPKBVLe0q@3fy98N^@-Qmu^Z|oPR!^$*CQ=1y40D7igRy zXCOyW*1+xyL$rWEYY3*$YCg1TrJIneS9bso?YT#xin{kTtk`nJhl|+riPdVVh`$&I z58p5kt#NEBSwdvHygcr)yE1+gTS2&2j;={BGoW)g&OalaC!ryUcq*Kx>n}(=K)&u5 z{-$rBP&rp21eIjz3{V6(AMC|;tJG}iG){#3ZtsNjgoQ~y?2d6xea`*#J5L0#@b0*} zCMyEdz+$!aYgNYa&Hdqq7h2dg3l)*<)4$6>si-g=FcSO@UjvkOpwCc9oRU!9fbOV+ zzJkNu?r$U_t6W@YDHJ;ZG%Wd&yT=caLyu;7Rs@K5!l&HoWSLCl6 z_V*}TmZ1Bxlc~92pa|rdci_Kg3veckU{3iKP5yU68@E+LTKfMRc`G5mNQSW5aDB!y z+Z|IMjg5Xph%sQQiF?AE3!!x(>)QJEQ(T*;@Tk)hOG~4|imOnajgupFd~9k*cbM_D ztJROhSj3L-je`l?3EcZwIPC=e2$d6Leam~hWK+Wy*!y~y_HlQ16S`=HfKC92I$XPbZJ1{8=ivm=znHCca!SIg~)u_CRPrxo3%O(p6# zgTZ_;?T&RWzHjYvkO~B3#T0~ijGKi~A_U8p9{~MTDXJKl)}!n$gstyob({n(Wn&{Jd{LpW=(iNq5JtRI-Br19)D5RX+2gPbMHB zBFAY`MY(o%yJLAGYVpBhsV6C$w8*?kAha`5g`vopl3=CKP`KW~fl1~o+x@rNafU_E zvGveM{--H)_mLU&@oNWEax@AF5*@{6DHmQ)VToS6gGCbm?m4Ob4%&yl(9nUTVgL~k-MbL=bO z%Gp?(Q~4ChMr;Dh7v+sm?-0p+yp8{~6?w`980aaB>@5==RUFuy0%svEG#Jb3J1({g zBo9$Lnr5K&>tyINJ_o2kg}&uffn+FMc=(tO1lom)d@jij^(37`vgv~KzbPspr*(B| zhwvI6vGX0S`rO7u8U$vF48A9a)CCAA;9Sox)X*+up+R~mql@lkvybsa4^A87Pb*O* zBF!xLvU41171gdh($W?>9-%{g3^OdDY_Tm3=iRgW55bwvnYkYwmZ;^NYrQLn7K&s| z<4R{U7_<#+Z7HES$KS!`yMrJhuoNst`!YjkbeVX&imW`%G}{LwfLg8FMdhZfFk9k7 zJDH1T@lysB1>N#8=GnL5kL3iEV$n&Mi0YA0E~3*2xh0XMBDF1|N;8t38XNPW63$E~ z{xnEmdYa2K*q4#}@Es$+=fH!K}afJneUo^>UDrm+_kc`%|&_pn1*aCQ}GYv6kCIc^|3;!T*7_I)kn)AlF?ekLe zYViHtQA2PnZkdD`!J4vC{kqF(@#t@4%;=^fQWv_U;$z#8+ZERbylaB6L|k-N7cS+~ zf$?`v1wZDA3ke#O{qBlahqom2LpO77`W`()WgC}sfiMpfXp(^%GJa^Ub!5)4rwFK|II*z852kiSoJ)2r>D5+%lv)%Fj|Isd6-ZMLM%2C zREI@hZ5*X$CV6gsWjh2A$&Rh?`B0_k^;rc$`$q5mow+ hr, -.prose > hr { - margin: var(--space-24) 0; /* 96px between major sections */ - border: none; - height: 1px; - opacity: 0.1; -} - -.hero-section-wrapper { - margin-top: var(--space-16); /* 64px */ - margin-bottom: var(--space-24); /* 96px */ - padding: var(--space-12) var(--space-6); -} - -.hero-subheadline { - font-size: var(--text-xl); - line-height: var(--leading-relaxed); - opacity: 0.85; - margin-bottom: 
var(--space-12); - max-width: 42rem; -} - -/* Increase spacing around sections */ -article section, -.prose section { - margin: var(--space-16) 0; -} - -/* CTA Button - Make it unmissable */ -.cta-massive { - font-size: var(--text-xl) !important; - padding: 1.25rem 3rem !important; - font-weight: 700 !important; - min-width: 320px; - box-shadow: 0 8px 24px rgba(0, 0, 0, 0.3); - transition: all 0.3s ease; -} - -.cta-massive:hover { - transform: translateY(-2px); - box-shadow: 0 12px 32px rgba(0, 0, 0, 0.4); -} - -@media (max-width: 768px) { - .cta-massive { - font-size: var(--text-lg) !important; - padding: 1rem 2rem !important; - min-width: 100%; - } -} - -/* Hero CTAs container */ -.hero-ctas { - display: flex; - gap: 1.5rem; - margin-top: var(--space-12); - margin-bottom: var(--space-8); - flex-wrap: wrap; -} - -/* ========================================================================== - Professional Table Styling with Visible Borders - ========================================================================== */ - -article table, -.prose table { - width: 100%; - margin: 3rem auto; - border-collapse: separate; - border-spacing: 0; - background: rgba(255, 255, 255, 0.02); - border: 2px solid rgba(255, 255, 255, 0.2); - border-radius: 12px; - overflow: hidden; - box-shadow: 0 4px 20px rgba(0, 0, 0, 0.3); - font-size: 1.0625rem; - max-width: 1200px; -} - -/* Table header with visible border */ -article thead, -.prose thead { - background: rgba(255, 255, 255, 0.08); - border-bottom: 2px solid rgba(255, 255, 255, 0.25); -} - -article thead th, -.prose thead th { - padding: 1.25rem 1.5rem; - text-align: left; - font-weight: 700; - font-size: 1.125rem; - color: rgba(255, 255, 255, 0.95); - border-right: 1px solid rgba(255, 255, 255, 0.15); -} - -article thead th:last-child, -.prose thead th:last-child { - border-right: none; -} - -/* Table body with visible borders between rows */ -article tbody tr, -.prose tbody tr { - border-bottom: 1px solid rgba(255, 255, 255, 
0.12); -} - -article tbody tr:last-child, -.prose tbody tr:last-child { - border-bottom: none; -} - -article tbody tr:hover, -.prose tbody tr:hover { - background: rgba(255, 255, 255, 0.05); -} - -/* Table cells with visible vertical borders */ -article tbody td, -.prose tbody td { - padding: 1.25rem 1.5rem; - color: rgba(255, 255, 255, 0.85); - line-height: 1.6; - border-right: 1px solid rgba(255, 255, 255, 0.12); -} - -article tbody td:last-child, -.prose tbody td:last-child { - border-right: none; -} - -/* First column styling with stronger border */ -article tbody td:first-child, -.prose tbody td:first-child { - font-weight: 600; - color: rgba(255, 255, 255, 0.95); - background: rgba(255, 255, 255, 0.03); - border-right: 2px solid rgba(255, 255, 255, 0.2); -} - -/* Strong emphasis in tables */ -article table strong, -.prose table strong { - font-weight: 700; - color: rgba(255, 255, 255, 0.98); -} - -/* Mobile responsive */ -@media (max-width: 768px) { - article table, - .prose table { - font-size: 0.9375rem; - margin: 2rem auto; - border: 1px solid rgba(255, 255, 255, 0.2); - } - - article thead th, - .prose thead th { - padding: 1rem; - font-size: 1rem; - } - - article tbody td, - .prose tbody td { - padding: 1rem; - } -} - -/* Code block improvements */ -pre { - margin: var(--space-8) 0 !important; - border-radius: 8px !important; -} - -code { - font-size: 0.95rem !important; -} - -/* Lists improvements */ -ul, ol { - margin: var(--space-6) 0; - padding-left: 1.5rem; -} - -li { - margin: 0.5rem 0; - line-height: var(--leading-relaxed); -} - -/* Strong emphasis */ -strong { - font-weight: 700; -} - -/* Hero Tagline - Half size of H1, directly under brand name */ -.hero-tagline { - font-size: var(--text-4xl); /* 48px - half of 96px H1 */ - font-weight: 700; - line-height: var(--leading-tight); - letter-spacing: var(--tracking-tight); - margin-top: 0; - margin-bottom: var(--space-8); - opacity: 0.9; -} - -@media (max-width: 1024px) { - .hero-tagline { - 
font-size: var(--text-3xl); /* 36px on tablet */ - } -} - -@media (max-width: 768px) { - .hero-tagline { - font-size: var(--text-2xl); /* 30px on mobile */ - } -} - -/* Fix page layout to look like a proper hero */ -article.max-w-prose { - max-width: 100% !important; -} - -.prose { - max-width: 100% !important; -} - -/* Center the hero section */ -.hero-section-wrapper { - max-width: 1280px; - margin-left: auto; - margin-right: auto; - text-align: center; - padding-left: 2rem; - padding-right: 2rem; -} - -/* Center all main content sections */ -article > * { - max-width: 1280px; - margin-left: auto; - margin-right: auto; - padding-left: 2rem; - padding-right: 2rem; -} - -/* Center headings */ -article h2 { - text-align: center; -} - -/* Center CTAs */ -.hero-ctas { - justify-content: center; -} - -/* Comparison grid should be centered */ -.comparison-grid { - margin-left: auto; - margin-right: auto; -} - -/* Tables should be centered with max width */ -article table { - margin-left: auto; - margin-right: auto; - max-width: 1024px; -} - -/* Code blocks should have max width */ -article pre { - max-width: 1024px; - margin-left: auto; - margin-right: auto; -} - -/* Hide BlowFish's auto-generated hero title to prevent duplicate "Aixgo" */ -.hero-title { - display: none !important; -} - -/* Also try hiding by article title class */ -article.hero .article-title { - display: none !important; -} - -/* And header title in hero layout */ -.hero header h1:first-child { - display: none !important; -} -/* Hide BlowFish's auto-generated hero H1 (lines 70-80 in hero.html template) */ -article.prose > div > div > div > div > div > h1.mb-2.text-4xl { - display: none !important; -} - -/* Also hide the author headline H2 if present */ -article.prose > div > div > div > div > div > h2.mt-0.mb-0.text-xl { - display: none !important; -} - -/* ========================================================================== - Code Section Improvements - Inline Labels - 
========================================================================== */ - -.code-section { - margin: 2rem 0; - text-align: left; - max-width: 1024px; - margin-left: auto; - margin-right: auto; - padding: 0 2rem; -} - -.code-label { - display: block; - font-weight: 700; - font-size: 1rem; - margin-bottom: 0.75rem; - opacity: 0.9; - background: rgba(255, 255, 255, 0.05); - padding: 0.5rem 1rem; - border-radius: 6px; - border-left: 3px solid rgba(255, 255, 255, 0.3); - text-align: left; - width: fit-content; -} - -.code-section pre { - margin-top: 0.75rem !important; -} - -/* ========================================================================== - Comparison Table Enhancement - ========================================================================== */ - -.comparison-table-section { - max-width: 1200px; - margin: 4rem auto; - padding: 0 2rem; -} - -.comparison-intro { - text-align: center; - font-size: 1.25rem; - line-height: 1.7; - max-width: 900px; - margin: 0 auto 3rem; - opacity: 0.9; -} - -.comparison-cards { - display: grid; - grid-template-columns: repeat(auto-fit, minmax(320px, 1fr)); - gap: 2rem; - margin: 3rem 0; -} - -.comparison-card { - background: rgba(255, 255, 255, 0.04); - border: 2px solid rgba(255, 255, 255, 0.1); - border-radius: 16px; - padding: 2rem; - transition: all 0.3s ease; -} - -.comparison-card:hover { - transform: translateY(-8px); - border-color: rgba(255, 255, 255, 0.3); - box-shadow: 0 12px 40px rgba(0, 0, 0, 0.3); -} - -.comparison-card h3 { - font-size: 1.5rem; - font-weight: 700; - margin: 0 0 1.5rem 0; - text-align: center; -} - -.comparison-card .comparison-item { - margin: 1rem 0; - padding: 0.75rem; - border-radius: 8px; - background: rgba(0, 0, 0, 0.2); -} - -.comparison-card .comparison-label { - display: block; - font-weight: 600; - font-size: 0.85rem; - text-transform: uppercase; - letter-spacing: 0.05em; - opacity: 0.7; - margin-bottom: 0.5rem; -} - -.comparison-card .comparison-value { - display: block; - 
font-size: 1.125rem; - font-weight: 600; - margin-bottom: 0.5rem; -} - -.comparison-card .comparison-impact { - display: block; - font-size: 0.9rem; - opacity: 0.85; - line-height: 1.5; -} - -/* ========================================================================== - Features List - Left Aligned - ========================================================================== */ - -.features-section { - max-width: 900px; - margin: 4rem auto; - padding: 0 2rem; -} - -.features-intro { - font-size: 1.25rem; - line-height: 1.7; - margin-bottom: 3rem; - opacity: 0.9; -} - -.feature-item { - margin: 2.5rem 0; - padding-left: 0; -} - -.feature-item h3 { - font-size: 1.5rem; - font-weight: 700; - margin: 0 0 1rem 0; - text-align: left; -} - -.feature-item p { - font-size: 1.125rem; - line-height: 1.7; - opacity: 0.85; - margin: 0; -} - -/* ========================================================================== - Roadmap Layout - Clean & Professional - ========================================================================== */ - -.timeline-section { - max-width: 900px; - margin: 4rem auto; - padding: 0 2rem; -} - -.timeline { - margin-top: 3rem; -} - -.timeline-item { - margin-bottom: 3rem; -} - -.timeline-period { - font-weight: 700; - font-size: 1.25rem; - margin-bottom: 1rem; - opacity: 0.7; - text-transform: uppercase; - letter-spacing: 0.05em; -} - -.timeline-content ul { - margin: 0; - padding-left: 1.5rem; - list-style: disc; -} - -.timeline-content li { - margin: 0.5rem 0; - font-size: 1.125rem; - line-height: 1.7; - opacity: 0.85; -} - -/* ========================================================================== - Section Title Improvements - ========================================================================== */ - -.section-title-reduced { - font-size: 2.5rem !important; - font-weight: 700 !important; - margin: 3rem 0 2rem 0 !important; -} - -.section-subtitle { - font-size: 1.125rem; - opacity: 0.7; - font-weight: 400; - margin-top: -1rem; - 
margin-bottom: 2rem; -} - -/* ========================================================================== - Mobile Responsive for New Components - ========================================================================== */ - -@media (max-width: 768px) { - .comparison-cards { - grid-template-columns: 1fr; - gap: 1.5rem; - } - - .timeline-item { - margin-bottom: 2rem; - } - - .timeline-period { - font-size: 1.125rem; - } - - .timeline-content li { - font-size: 1rem; - } - - .section-title-reduced { - font-size: 2rem !important; - } -} - diff --git a/web/static/css/main.css b/web/static/css/main.css deleted file mode 100644 index 7266c68..0000000 --- a/web/static/css/main.css +++ /dev/null @@ -1,1903 +0,0 @@ -/* Reset and Base Styles */ -* { - margin: 0; - padding: 0; - box-sizing: border-box; -} - -:root { - /* Colors */ - --color-bg: #0a0f1c; - --color-bg-light: #141b2d; - --color-bg-card: #1a2233; - --color-border: #2a3544; - --color-cyan: #22d3ee; - --color-cyan-dark: #06b6d4; - --color-text: #ffffff; - --color-text-muted: #94a3b8; - --color-text-dim: #64748b; - - /* Typography */ - --font-sans: 'Inter', -apple-system, BlinkMacSystemFont, sans-serif; - --font-mono: 'JetBrains Mono', 'Consolas', 'Monaco', monospace; - - /* Font sizes - balanced modular scale */ - --text-xs: 0.75rem; /* 12px */ - --text-sm: 0.875rem; /* 14px */ - --text-base: 1rem; /* 16px */ - --text-lg: 1.125rem; /* 18px */ - --text-xl: 1.25rem; /* 20px */ - --text-2xl: 1.5rem; /* 24px */ - --text-3xl: 1.875rem; /* 30px */ - --text-4xl: 2.25rem; /* 36px */ - --text-5xl: 2.75rem; /* 44px */ - --text-6xl: 3.5rem; /* 56px */ - - /* Font weights */ - --weight-normal: 400; - --weight-medium: 500; - --weight-semibold: 600; - --weight-bold: 700; - --weight-extrabold: 800; - --weight-black: 900; - - /* Line heights */ - --leading-tight: 1.15; - --leading-normal: 1.5; - --leading-relaxed: 1.75; - - /* Letter spacing */ - --tracking-tight: -0.035em; - --tracking-normal: 0; - --tracking-wide: 0.025em; - - 
/* Spacing system (8px base unit) */ - --space-1: 0.5rem; - --space-2: 1rem; - --space-3: 1.5rem; - --space-4: 2rem; - --space-5: 2.5rem; - --space-6: 3rem; - --space-8: 4rem; - --space-10: 5rem; - --space-12: 6rem; - --space-16: 8rem; - --space-20: 10rem; - - /* Container widths */ - --container-xs: 640px; - --container-sm: 768px; - --container-md: 1024px; - --container-lg: 1280px; - --container-xl: 1400px; -} - -html { - overflow-x: hidden; - width: 100%; -} - -body { - font-family: var(--font-sans); - background-color: var(--color-bg); - color: var(--color-text); - line-height: 1.6; - font-size: 16px; - overflow-x: hidden; - position: relative; - width: 100%; -} - -/* Container */ -.container { - max-width: 1200px; - margin: 0 auto; - padding: 0 2rem; -} - -/* Wide container for feature tables - supports 4 columns */ -.container-wide { - max-width: 1400px; - margin: 0 auto; - padding: 0 2rem; -} - -.container-narrow { - max-width: 800px; - margin: 0 auto; - padding: 0 2rem; -} - -/* Header */ -header { - position: sticky; - top: 0; - background-color: rgba(10, 15, 28, 0.95); - backdrop-filter: blur(10px); - border-bottom: 1px solid rgba(42, 53, 68, 0.3); - z-index: 100; - padding: 1rem 0; -} - -.nav-wrapper { - display: flex; - justify-content: space-between; - align-items: center; -} - -.logo a { - font-size: 1.5rem; - font-weight: 700; - color: var(--color-cyan); - text-decoration: none; - transition: color 0.2s; -} - -.logo a:hover { - color: var(--color-cyan-dark); -} - -.nav-menu { - display: flex; - list-style: none; - gap: 2rem; - align-items: center; -} - -.nav-menu a { - color: var(--color-text); - text-decoration: none; - transition: color 0.2s; - font-weight: 500; -} - -.nav-menu a:hover { - color: var(--color-cyan); -} - -.external-link-icon { - display: inline-block; - width: 14px; - height: 14px; - margin-left: 4px; - vertical-align: middle; - opacity: 0.7; - transition: opacity 0.2s; -} - -.nav-menu a:hover .external-link-icon { - opacity: 1; -} - 
-/* Navigation GitHub Icon (desktop: icon-only ghost; mobile: labeled full-width) */ -.nav-github-icon { - display: inline-flex; - align-items: center; - justify-content: center; - width: 40px; - height: 40px; - border: 1px solid transparent; - border-radius: 6px; - transition: all 0.2s; - color: var(--color-text); -} - -.nav-github-icon:hover { - border-color: var(--color-cyan); - background-color: rgba(34, 211, 238, 0.05); - color: var(--color-cyan); -} - -.nav-github-icon svg { - flex-shrink: 0; -} - -.nav-github-icon .nav-label { - display: none; -} - -/* Get Started CTA Button */ -.nav-cta { - margin-left: 0.5rem; -} - -.btn-get-started { - display: inline-flex; - align-items: center; - gap: 0.5rem; - padding: 0.625rem 1.25rem; - background: linear-gradient(135deg, var(--color-cyan) 0%, #06b6d4 100%); - color: var(--color-bg) !important; - border-radius: 6px; - font-weight: 600 !important; - transition: all 0.2s; - box-shadow: 0 2px 8px rgba(34, 211, 238, 0.2); -} - -.btn-get-started:hover { - transform: translateY(-1px); - box-shadow: 0 4px 12px rgba(34, 211, 238, 0.3); - color: var(--color-bg) !important; -} - -/* Footer */ -footer { - margin-top: 8rem; - padding: 4rem 0 2rem; - border-top: 1px solid rgba(42, 53, 68, 0.3); - background: rgba(0, 0, 0, 0.1); -} - -.footer-grid { - display: grid; - grid-template-columns: 1.5fr 1fr 1fr 1fr; - gap: 3rem; - max-width: 1200px; - margin: 0 auto; -} - -/* Brand Column */ -.footer-brand { - display: flex; - flex-direction: column; - gap: 0.75rem; -} - -.footer-logo { - font-size: 1.5rem; - font-weight: 700; - color: var(--color-cyan); - margin-bottom: 0.25rem; -} - -.footer-tagline { - font-size: 0.95rem; - color: var(--color-text-muted); - margin: 0; - line-height: 1.5; -} - -.footer-copyright { - font-size: 0.875rem; - color: var(--color-text-dim); - margin: 0.5rem 0 0.25rem 0; -} - -.footer-license { - font-size: 0.875rem; - color: var(--color-text-muted); - text-decoration: none; - transition: color 0.2s; - width: 
fit-content; -} - -.footer-license:hover { - color: var(--color-cyan); -} - -/* Footer Columns */ -.footer-column { - display: flex; - flex-direction: column; - gap: 1rem; -} - -.footer-heading { - font-size: 0.875rem; - font-weight: 600; - color: var(--color-text); - text-transform: uppercase; - letter-spacing: 0.05em; - margin: 0; -} - -.footer-links { - list-style: none; - padding: 0; - margin: 0; - display: flex; - flex-direction: column; - gap: 0.625rem; -} - -.footer-links a { - color: var(--color-text-muted); - text-decoration: none; - font-size: 0.9375rem; - transition: color 0.2s; - display: inline-block; -} - -.footer-links a:hover { - color: var(--color-cyan); -} - -/* Typography */ -h1, h2, h3, h4, h5, h6 { - line-height: 1.2; - font-weight: 700; -} - -.highlight { - color: var(--color-cyan); -} - -/* Buttons */ -.btn { - display: inline-block; - padding: 0.7125rem 1.425rem; - border-radius: 0.375rem; - text-decoration: none; - font-weight: 600; - transition: all 0.2s; - border: none; - cursor: pointer; -} - -.btn-primary { - background-color: transparent; - color: var(--color-text); - border: 1px solid var(--color-border); -} - -.btn-primary:hover { - background-color: var(--color-cyan); - color: var(--color-bg); - border-color: var(--color-cyan); - transform: translateY(-2px); - box-shadow: 0 4px 12px rgba(34, 211, 238, 0.3); -} - -.btn-secondary { - background: linear-gradient(135deg, var(--color-cyan) 0%, #06b6d4 100%); - color: var(--color-bg); - border: none; - box-shadow: 0 2px 8px rgba(34, 211, 238, 0.2); -} - -.btn-secondary:hover { - transform: translateY(-2px); - box-shadow: 0 4px 12px rgba(34, 211, 238, 0.3); - color: var(--color-bg); -} - -.btn-cta { - background-color: var(--color-cyan); - color: var(--color-bg); - border: 1px solid var(--color-cyan); - box-shadow: 0 0 40px rgba(34, 211, 238, 0.4); -} - -.btn-cta:hover { - background-color: rgba(34, 211, 238, 0.9); - border-color: var(--color-cyan); - transform: translateY(-4px) 
scale(1.05); - box-shadow: 0 0 50px rgba(34, 211, 238, 0.6), 0 8px 24px rgba(34, 211, 238, 0.5); -} - -/* Hero Section */ -.hero { - padding: 14rem 0 14rem; - text-align: center; - background: radial-gradient(ellipse 40% 80% at 50% 50%, rgba(34, 211, 238, 0.15), transparent 70%); - position: relative; -} - -.hero-title { - font-size: var(--text-6xl); /* 56px - largest, homepage hero only */ - margin-bottom: 1.5rem; - font-weight: 800; - line-height: 1.1; -} - -.hero-subtitle { - font-size: 1.25rem; - color: var(--color-text-muted); - margin-bottom: 2.5rem; - max-width: 700px; - margin-left: auto; - margin-right: auto; - line-height: 1.6; -} - -/* Code Section */ -.code-section { - padding: 6rem 0; -} - -.section-title { - font-size: var(--text-4xl); /* 36px - section headers */ - text-align: center; - margin-bottom: 1.5rem; - font-weight: 700; -} - -.section-subtitle { - text-align: center; - color: var(--color-text-muted); - font-size: 1.1rem; - max-width: 700px; - margin: 0 auto 4rem; - line-height: 1.6; -} - -.code-steps { - display: grid; - gap: 2rem; - margin-bottom: 2rem; -} - -.code-step { - background-color: rgba(0, 0, 0, 0.3); - border: 1px solid var(--color-border); - border-radius: 0.75rem; - padding: 1.25rem; -} - -.code-step h3 { - color: var(--color-cyan); - margin-bottom: 0.75rem; - font-size: 1.1rem; -} - -.code-block-wrapper { - position: relative; -} - -.copy-button { - position: absolute; - top: 0.5rem; - right: 0.5rem; - background-color: transparent; - border: none; - border-radius: 0.25rem; - padding: 0.375rem; - color: var(--color-text-dim); - cursor: pointer; - display: flex; - align-items: center; - justify-content: center; - transition: all 0.2s; - z-index: 10; -} - -.copy-button:hover { - background-color: rgba(255, 255, 255, 0.1); - color: var(--color-text); -} - -.copy-button.copied { - color: #10b981; -} - -.code-step pre { - background-color: rgba(26, 34, 51, 0.6); - border: 1px solid rgba(42, 53, 68, 0.3); - border-radius: 0.5rem; - 
padding: 0.75rem; - padding-right: 3rem; - overflow-x: auto; - margin: 0; -} - -.code-step code { - font-family: var(--font-mono); - font-size: 0.9rem; - color: var(--color-text); - line-height: 1.6; -} - -.code-footer { - text-align: center; - color: var(--color-text-muted); - font-size: 1.1rem; - margin-top: 3rem; -} - -.code-cta { - text-align: center; - margin-top: 2.5rem; -} - -/* Why Go Section */ -.why-go-section { - padding: 6rem 0; -} - -.features-grid { - display: grid; - grid-template-columns: repeat(2, 1fr); - gap: 1.5rem; - max-width: 1000px; - margin: 0 auto; -} - -.feature-card { - background-color: transparent; - border: 1px solid var(--color-border); - border-radius: 0.75rem; - padding: 1rem; - transition: all 0.3s; -} - -.feature-card:hover { - border-color: var(--color-cyan); - transform: translateY(-4px); -} - -.feature-card h3 { - color: var(--color-cyan); - margin-bottom: 0.875rem; - font-size: 1.125rem; - font-weight: 700; -} - -.feature-comparison { - display: flex; - flex-direction: column; - gap: 0.625rem; -} - -.comparison-item { - padding: 0.125rem 0; -} - -.comparison-label { - color: var(--color-text-muted); - font-size: 0.9rem; - margin-bottom: 0.25rem; -} - -.comparison-value { - color: var(--color-text); - font-size: 0.9rem; - line-height: 1.4; -} - -.comparison-value.highlight { - color: var(--color-cyan); - font-weight: 500; -} - -.comparison-impact { - color: var(--color-text-muted); - font-size: 0.9rem; - line-height: 1.4; -} - -.comparison-divider { - height: 1px; - background-color: var(--color-border); - opacity: 0.5; - margin: 0.25rem 0; -} - -/* Advantage Section */ -.advantage-section { - padding: 6rem 0; -} - -.advantage-table-wrapper { - max-width: 1000px; - margin: 0 auto; - overflow-x: auto; -} - -.advantage-table { - width: 100%; - border-collapse: collapse; - border: 1px solid var(--color-border); - border-radius: 0.75rem; - overflow: hidden; -} - -.advantage-table thead { - background-color: rgba(34, 211, 238, 
0.05); -} - -.advantage-table th { - padding: 1rem 1.5rem; - text-align: left; - font-size: 1rem; - font-weight: 600; - color: var(--color-text); - border-bottom: 1px solid var(--color-border); -} - -.advantage-table th:first-child { - color: var(--color-text); - font-weight: 700; -} - -.advantage-table th:last-child { - color: var(--color-cyan); - font-weight: 700; -} - -.advantage-table td { - padding: 1rem 1.5rem; - color: var(--color-text-muted); - font-size: 0.9rem; - border-bottom: 1px solid var(--color-border); -} - -.advantage-table td:first-child { - color: var(--color-text); - font-weight: 600; -} - -.advantage-table td:last-child { - color: var(--color-cyan); - font-weight: 600; -} - -.advantage-table tbody tr:last-child td { - border-bottom: none; -} - -.advantage-table tbody tr:hover { - background-color: rgba(34, 211, 238, 0.03); -} - -/* Built For Section */ -.built-for-section { - padding: 6rem 0; -} - -.section-subtitle { - text-align: center; - color: var(--color-text-muted); - font-size: 1.1rem; - max-width: 700px; - margin: 0 auto 4rem; -} - -.features-grid-2 { - display: grid; - grid-template-columns: repeat(auto-fit, minmax(280px, 1fr)); - gap: 1.5rem; -} - -.feature-card-2 { - background-color: transparent; - border: 1px solid var(--color-border); - border-radius: 0.75rem; - padding: 1rem; - transition: all 0.3s; -} - -.feature-card-2:hover { - border-color: rgba(34, 211, 238, 0.3); -} - -.feature-card-2:hover .feature-icon { - background-color: rgba(32, 40, 58, 1); -} - -.feature-header { - display: flex; - align-items: flex-start; - gap: 1rem; - margin-bottom: 0.75rem; -} - -.feature-icon { - display: flex; - align-items: center; - justify-content: center; - flex-shrink: 0; - padding: 0.5rem; - border-radius: 0.5rem; - background-color: rgba(29, 40, 58, 0.8); - color: var(--color-cyan); - transition: background-color 0.3s; -} - -.feature-icon svg { - width: 20px; - height: 20px; - stroke: var(--color-cyan); -} - -.feature-card-2 h3 { - 
color: var(--color-text); - margin: 0; - font-size: 1.1rem; - font-weight: 600; - flex: 1; -} - -.feature-card-2 p { - color: var(--color-text-muted); - line-height: 1.6; - font-size: 0.9rem; - margin: 0; -} - -/* Roadmap Section */ -.roadmap-section { - padding: 6rem 0; -} - -.roadmap-grid { - display: grid; - grid-template-columns: repeat(auto-fit, minmax(280px, 1fr)); - gap: 2rem; -} - -.roadmap-card { - background-color: rgba(17, 24, 39, 0.5); - border: 1px solid var(--color-border); - border-radius: 0.75rem; - padding: 1.5rem; -} - -.version-badge { - display: inline-block; - background-color: rgba(34, 211, 238, 0.1); - color: var(--color-cyan); - padding: 0.25rem 0.75rem; - border-radius: 0.375rem; - font-size: 0.9rem; - font-weight: 600; - margin-bottom: 1rem; -} - -.roadmap-card ul { - list-style: none; - margin: 0; -} - -.roadmap-card li { - color: var(--color-text-muted); - padding: 0.5rem 0; - padding-left: 1.5rem; - position: relative; - font-size: 0.9rem; -} - -.roadmap-card li::before { - content: "•"; - position: absolute; - left: 0; - color: var(--color-cyan); - font-weight: bold; -} - -.roadmap-cta { - text-align: center; - margin-top: 2.5rem; -} - -/* CTA Section */ -.cta-section { - padding: 6rem 0; - text-align: center; - background: radial-gradient(ellipse at center, rgba(34, 211, 238, 0.1), transparent 70%); -} - -.cta-title { - font-size: var(--text-4xl); /* 36px - matches section titles */ - margin-bottom: 1.5rem; - font-weight: 700; -} - -.cta-subtitle { - font-size: 1.25rem; - color: var(--color-text-muted); - margin-bottom: 2.5rem; - max-width: 600px; - margin-left: auto; - margin-right: auto; -} - -/* List Pages (Guides & Blog) */ -.list-page { - padding: 4rem 0; -} - -.page-title { - font-size: var(--text-4xl); /* 36px - list page titles */ - margin-bottom: 1rem; - font-weight: 700; -} - -.page-title.cyan { - color: var(--color-cyan); -} - -.page-description { - font-size: 1.1rem; - color: var(--color-text-muted); - margin-bottom: 3rem; 
-} - -/* Guides */ -.guide-category { - margin-bottom: 4rem; -} - -.guide-category h2 { - font-size: 1.5rem; - margin-bottom: 1.5rem; - color: var(--color-text); -} - -.guide-list { - display: grid; - gap: 1.5rem; -} - -.guide-card { - display: flex; - gap: 1.5rem; - padding: 1.5rem; - background-color: var(--color-bg-card); - border: 1px solid var(--color-border); - border-radius: 0.75rem; - text-decoration: none; - transition: all 0.3s; -} - -.guide-card:hover { - border-color: var(--color-cyan); - transform: translateX(8px); -} - -.guide-icon { - color: var(--color-cyan); - flex-shrink: 0; -} - -.guide-content h3 { - color: var(--color-text); - margin-bottom: 0.5rem; - font-size: 1.2rem; -} - -.guide-content p { - color: var(--color-text-muted); - line-height: 1.6; -} - -/* Blog */ -.blog-list { - display: grid; - gap: 2rem; -} - -.blog-card { - padding: 2rem; - background-color: var(--color-bg-card); - border: 1px solid var(--color-border); - border-radius: 0.75rem; - text-decoration: none; - display: block; - transition: all 0.3s; -} - -.blog-card:hover { - border-color: var(--color-cyan); - transform: translateY(-4px); -} - -.blog-meta { - display: flex; - gap: 1.5rem; - margin-bottom: 1rem; - font-size: 0.9rem; - color: var(--color-text-muted); -} - -.blog-date, -.blog-author { - display: flex; - align-items: center; - gap: 0.5rem; -} - -.blog-card h2 { - color: var(--color-text); - margin-bottom: 0.75rem; - font-size: 1.5rem; -} - -.blog-card p { - color: var(--color-text-muted); - line-height: 1.6; - margin-bottom: 1rem; -} - -.tags-inline { - display: flex; - flex-wrap: wrap; - gap: 0.5rem; -} - -.tag { - display: inline-block; - background-color: rgba(34, 211, 238, 0.1); - color: var(--color-cyan); - padding: 0.25rem 0.75rem; - border-radius: 0.375rem; - font-size: 0.9rem; - font-weight: 500; -} - -/* Single Page */ -.single-page { - padding: 4rem 0; -} - -.back-link { - display: inline-flex; - align-items: center; - color: var(--color-text-muted); - 
text-decoration: none; - margin-bottom: 2rem; - transition: color 0.2s; -} - -.back-link:hover { - color: var(--color-cyan); -} - -.post-meta { - display: flex; - gap: 1.5rem; - margin-bottom: 2rem; - font-size: 0.9rem; - color: var(--color-text-muted); -} - -.post-date, -.post-author { - display: flex; - align-items: center; - gap: 0.5rem; -} - -.breadcrumb { - margin-bottom: 1rem; -} - -.breadcrumb-link { - color: var(--color-cyan); - text-decoration: none; - font-size: 0.9rem; -} - -.breadcrumb-link:hover { - text-decoration: underline; -} - -.single-page h1 { - font-size: 2.5rem; - margin-bottom: 1.5rem; - font-weight: 700; -} - -.lead { - font-size: 1.2rem; - color: var(--color-text-muted); - margin-bottom: 2rem; - line-height: 1.6; -} - -.content { - color: var(--color-text-muted); - line-height: 1.8; -} - -.content h2 { - color: var(--color-text); - margin-top: 2.5rem; - margin-bottom: 1rem; - font-size: 1.75rem; -} - -.content h3 { - color: var(--color-text); - margin-top: 2rem; - margin-bottom: 0.75rem; - font-size: 1.4rem; -} - -.content p { - margin-bottom: 1.25rem; -} - -.content ul, -.content ol { - margin-bottom: 1.25rem; - padding-left: 1.5rem; -} - -.content li { - margin-bottom: 0.5rem; -} - -.content a { - color: var(--color-cyan); - text-decoration: none; -} - -.content a:hover { - text-decoration: underline; -} - -.content code { - font-family: var(--font-mono); - background-color: var(--color-bg-card); - padding: 0.2rem 0.4rem; - border-radius: 0.25rem; - font-size: 0.9em; -} - -.content pre { - background-color: var(--color-bg-card); - border: 1px solid var(--color-border); - border-radius: 0.5rem; - padding: 1.5rem; - overflow-x: auto; - margin-bottom: 1.25rem; -} - -.content pre code { - background: none; - padding: 0; - font-size: 0.9rem; - line-height: 1.6; -} - -.tags { - display: flex; - flex-wrap: wrap; - gap: 0.5rem; - margin-top: 3rem; - padding-top: 2rem; - border-top: 1px solid var(--color-border); -} - -/* Table Styling - 
Professional design with visible borders */
-.content table {
- width: 100%;
- margin: 3rem auto;
- border-collapse: separate;
- border-spacing: 0;
- background: rgba(255, 255, 255, 0.02);
- border: 2px solid rgba(255, 255, 255, 0.2);
- border-radius: 12px;
- overflow: visible; /* Changed from hidden to allow tooltips to display */
- box-shadow: 0 4px 20px rgba(0, 0, 0, 0.3);
- font-size: 1rem;
-}
-
-.content table thead {
- background: rgba(34, 211, 238, 0.1);
- border-bottom: 2px solid rgba(34, 211, 238, 0.3);
-}
-
-.content table thead th {
- padding: 0.875rem 1.25rem;
- text-align: left;
- font-weight: 700;
- font-size: 1.05rem;
- color: rgba(255, 255, 255, 0.95);
- border-right: 1px solid rgba(255, 255, 255, 0.15);
-}
-
-.content table thead th:last-child {
- border-right: none;
-}
-
-.content table tbody tr {
- border-bottom: 1px solid rgba(255, 255, 255, 0.12);
- transition: background-color 0.2s;
-}
-
-.content table tbody tr:last-child {
- border-bottom: none;
-}
-
-.content table tbody tr:hover {
- background: rgba(34, 211, 238, 0.05);
-}
-
-.content table tbody td {
- padding: 0.875rem 1.25rem;
- color: var(--color-text-muted);
- line-height: 1.5;
- border-right: 1px solid rgba(255, 255, 255, 0.12);
- position: relative; /* Allow absolute positioning of tooltips */
- overflow: visible; /* Allow tooltips to overflow cell boundaries */
-}
-
-.content table tbody td:last-child {
- border-right: none;
-}
-
-.content table tbody td:first-child {
- font-weight: 600;
- color: var(--color-text);
- background: rgba(255, 255, 255, 0.03);
- border-right: 2px solid rgba(255, 255, 255, 0.2);
-}
-
-.content table strong {
- font-weight: 700;
- color: var(--color-text);
-}
-
-/* ==========================================================================
- Hamburger Menu Base Styles
- Hidden on desktop, shown on mobile via media query
- ========================================================================== */
-
-/* Hide hamburger elements on desktop */
-.nav-toggle {
- display: none;
-}
-
-.nav-hamburger {
- display: none;
- flex-direction: column;
- cursor: pointer;
- gap: 5px;
- z-index: 101;
- padding: 10px;
-}
-
-.nav-hamburger span {
- width: 25px;
- height: 3px;
- background-color: var(--color-text);
- transition: all 0.3s ease;
- border-radius: 3px;
-}
-
-/* ==========================================================================
- Feature Grid - Features Page
- Desktop: Table-like layout, Mobile: Card-based layout
- ========================================================================== */
-
-/* Wide container for feature grid to support 4 columns */
-.feature-grid {
- max-width: 1400px;
- margin: 0 auto;
-}
-
-.feature-grid h3 {
- font-size: 2rem;
- margin-bottom: 2rem;
- margin-top: 4rem;
- color: var(--color-cyan);
- font-weight: 700;
-}
-
-.feature-grid h3:first-of-type {
- margin-top: 0;
-}
-
-/* Desktop: Table-like layout */
-@media (min-width: 1024px) {
- .feature-table {
- display: grid;
- grid-template-columns: repeat(auto-fit, minmax(220px, 1fr));
- gap: 0;
- margin-bottom: 4rem;
- border: 2px solid rgba(255, 255, 255, 0.2);
- border-radius: 12px;
- overflow: visible;
- background: rgba(26, 34, 51, 0.3);
- }
-
- .feature-column {
- background: transparent;
- border-right: 1px solid rgba(255, 255, 255, 0.1);
- border-radius: 0;
- padding: 0;
- transition: background-color 0.2s ease;
- overflow: visible;
- }
-
- .feature-column:last-child {
- border-right: none;
- }
-
- .feature-column:hover {
- background: rgba(34, 211, 238, 0.03);
- }
-
- .feature-column h4 {
- font-size: 1rem;
- font-weight: 600;
- color: var(--color-cyan);
- margin: 0;
- padding: 1.25rem 1rem;
- border-bottom: 2px solid rgba(34, 211, 238, 0.3);
- background: rgba(34, 211, 238, 0.05);
- text-align: center;
- }
-
- .feature-item {
- position: relative;
- padding: 0.875rem 1rem;
- margin: 0;
- background: transparent;
- border: none;
- border-bottom: 1px solid rgba(255, 255, 255, 0.05);
- border-radius: 0;
- font-size: 0.9rem;
- color: var(--color-text);
- transition: background-color 0.2s ease;
- overflow: visible;
- }
-
- .feature-item-empty {
- opacity: 0.3;
- }
-
- .feature-item:hover {
- background: rgba(34, 211, 238, 0.08);
- }
-}
-
-/* Tablet & Mobile: Card-based layout */
-@media (max-width: 1023px) {
- /* Hide empty placeholder cells on mobile */
- .feature-item-empty {
- display: none !important;
- }
-
- .feature-table {
- display: grid;
- grid-template-columns: 1fr;
- gap: 1.5rem;
- margin-bottom: 3rem;
- overflow: visible;
- }
-
- .feature-column {
- background: rgba(26, 34, 51, 0.4);
- border: 1px solid var(--color-border);
- border-radius: 12px;
- padding: 1.5rem;
- transition: border-color 0.2s ease;
- overflow: visible;
- }
-
- .feature-column:hover {
- border-color: rgba(34, 211, 238, 0.3);
- }
-
- .feature-column h4 {
- font-size: 1.125rem;
- font-weight: 600;
- color: var(--color-cyan);
- margin: 0 0 1.25rem 0;
- padding-bottom: 0.75rem;
- border-bottom: 2px solid rgba(34, 211, 238, 0.2);
- }
-
- .feature-item {
- position: relative;
- padding: 0.625rem 0.875rem;
- margin-bottom: 0.5rem;
- background: rgba(0, 0, 0, 0.2);
- border: 1px solid transparent;
- border-radius: 6px;
- font-size: 0.9375rem;
- color: var(--color-text);
- transition: all 0.2s ease;
- overflow: visible;
- }
-
- .feature-item:last-child {
- margin-bottom: 0;
- }
-
- .feature-item:hover {
- background: rgba(34, 211, 238, 0.05);
- border-color: rgba(34, 211, 238, 0.2);
- }
-}
-
-/* Responsive - Mobile */
-@media (max-width: 768px) {
- .hero-title {
- font-size: var(--text-4xl); /* 36px on mobile */
- }
-
- .section-title {
- font-size: var(--text-3xl); /* 30px on mobile */
- }
-
- .cta-title {
- font-size: var(--text-3xl); /* 30px on mobile */
- }
-
- .page-title {
- font-size: var(--text-3xl); /* 30px on mobile */
- }
-
- /* Show hamburger on mobile */
- .nav-hamburger {
- display: flex;
- }
-
- /* Hide and transform nav menu into slide-out drawer */
- .nav-menu {
- position: fixed;
- top: 0;
- right: -100%;
- height: 100vh;
- width: 70%;
- max-width: 300px;
- background-color: var(--color-bg);
- border-left: 1px solid var(--color-border);
- flex-direction: column;
- gap: 0;
- padding: 4rem 1.5rem 2rem;
- transition: right 0.3s ease;
- box-shadow: -5px 0 15px rgba(0, 0, 0, 0.3);
- z-index: 100;
- overflow-y: auto; /* Allow scrolling if content is too tall */
- }
-
- .nav-menu li {
- width: 100%;
- border-bottom: 1px solid var(--color-border);
- }
-
- .nav-menu a {
- display: block;
- padding: 0.875rem 0;
- font-size: 1rem;
- }
-
- /* Mobile GitHub: show label, full-width like other items */
- .nav-github-icon {
- display: flex !important;
- justify-content: center;
- align-items: center;
- width: 100% !important;
- height: auto !important;
- padding: 0.875rem 1rem !important;
- margin: 0 !important;
- box-sizing: border-box;
- border: 1px solid var(--color-border);
- gap: 0.5rem;
- }
-
- .nav-github-icon .nav-label {
- display: inline;
- }
-
- /* Mobile Get Started CTA */
- .nav-cta {
- margin-left: 0 !important;
- margin-top: 0.5rem !important;
- margin-bottom: 0.5rem;
- border-bottom: none !important;
- padding: 0 !important;
- }
-
- .btn-get-started {
- display: flex !important;
- justify-content: center;
- align-items: center;
- width: 100% !important;
- padding: 1rem 1.25rem !important;
- font-size: 1rem;
- margin: 0 !important;
- box-sizing: border-box;
- }
-
- /* Show menu when checkbox is checked */
- .nav-toggle:checked ~ .nav-menu {
- right: 0;
- }
-
- /* Hamburger animation when menu is open */
- .nav-toggle:checked ~ .nav-hamburger span:nth-child(1) {
- transform: rotate(45deg) translate(7px, 7px);
- }
-
- .nav-toggle:checked ~ .nav-hamburger span:nth-child(2) {
- opacity: 0;
- }
-
- .nav-toggle:checked ~ .nav-hamburger span:nth-child(3) {
- transform: rotate(-45deg) translate(7px, -7px);
- }
-
- /* Overlay backdrop when menu is open */
- .nav-toggle:checked ~ .nav-menu::before {
- content: '';
- position: fixed;
- top: 0;
- left: 0;
- right: 30%;
- bottom: 0;
- background-color: rgba(0, 0, 0, 0.5);
- z-index: -1;
- }
-
- /* Transform Go Advantage table into cards */
- .advantage-table-wrapper {
- overflow: visible; /* No horizontal scroll needed */
- }
-
- .advantage-table {
- border: none;
- border-radius: 0;
- }
-
- /* Hide table header on mobile (accessibility: keep in DOM) */
- .advantage-table thead {
- position: absolute;
- width: 1px;
- height: 1px;
- padding: 0;
- margin: -1px;
- overflow: hidden;
- clip: rect(0, 0, 0, 0);
- white-space: nowrap;
- border: 0;
- }
-
- /* Transform each row into a card */
- .advantage-table tbody tr {
- display: block;
- margin-bottom: 1.5rem;
- border: 1px solid var(--color-border);
- border-radius: 0.75rem;
- background-color: rgba(0, 0, 0, 0.2);
- padding: 1rem;
- }
-
- .advantage-table tbody tr:hover {
- background-color: rgba(34, 211, 238, 0.08);
- }
-
- /* Stack cells vertically within each card */
- .advantage-table td {
- display: block;
- border: none;
- padding: 0.5rem 0;
- text-align: left;
- }
-
- /* Feature name (first cell) - large and cyan */
- .advantage-table td:first-child {
- color: var(--color-cyan);
- font-size: 1.1rem;
- font-weight: 700;
- margin-bottom: 0.75rem;
- padding-bottom: 0.75rem;
- border-bottom: 1px solid var(--color-border);
- background: none;
- }
-
- /* Add context labels using ::before pseudo-elements */
- .advantage-table td:nth-child(2)::before {
- content: 'Python Frameworks: ';
- display: block;
- color: var(--color-text-muted);
- font-size: 0.8rem;
- font-weight: 500;
- margin-bottom: 0.25rem;
- text-transform: uppercase;
- letter-spacing: 0.5px;
- }
-
- .advantage-table td:nth-child(3)::before {
- content: 'Aixgo: ';
- display: block;
- color: var(--color-cyan);
- font-size: 0.8rem;
- font-weight: 600;
- margin-bottom: 0.25rem;
- text-transform: uppercase;
- letter-spacing: 0.5px;
- }
-
- /* Python Frameworks value - muted */
- .advantage-table td:nth-child(2) {
- color: var(--color-text-muted);
- font-size: 0.9rem;
- margin-bottom: 0.5rem;
- }
-
- /* Aixgo value - emphasized with cyan */
- .advantage-table td:nth-child(3) {
- color: var(--color-cyan);
- font-weight: 600;
- font-size: 0.95rem;
- }
-
- /* Optimize code blocks for mobile - enable wrapping for accessibility */
- .code-step pre,
- .content pre {
- padding: 0.75rem;
- padding-right: 2.5rem; /* Space for copy button */
- font-size: 0.85rem; /* Slightly smaller for mobile */
- max-width: 100%; /* Prevent code blocks from expanding page width */
- box-sizing: border-box;
- overflow-x: visible; /* Allow wrapping instead of scroll */
- white-space: pre-wrap; /* Wrap long lines while preserving formatting */
- overflow-wrap: break-word; /* Break long words at boundaries */
- word-break: break-word; /* Fallback for URLs/paths */
- }
-
- /* Ensure code step containers don't overflow */
- .code-step {
- max-width: 100%;
- overflow-x: visible;
- }
-
- .code-step code,
- .content pre code {
- font-size: 0.8rem;
- line-height: 1.5;
- white-space: pre-wrap; /* Enable wrapping in code elements */
- overflow-wrap: break-word; /* Break long words */
- }
-
- /* Make copy button smaller on mobile */
- .copy-button {
- padding: 0.25rem;
- top: 0.35rem;
- right: 0.35rem;
- }
-
- .copy-button svg {
- width: 14px;
- height: 14px;
- }
-
- /* Additional mobile improvements */
-
- /* Hero section - reduce vertical padding */
- .hero {
- padding: 8rem 0 6rem;
- }
-
- .hero-subtitle {
- font-size: 1.1rem;
- padding: 0 1rem;
- }
-
- /* Full-width buttons on mobile */
- .btn {
- width: 100%;
- max-width: 300px;
- }
-
- /* Features grid - single column */
- .features-grid {
- grid-template-columns: 1fr;
- gap: 1.5rem;
- }
-
- .feature-card {
- padding: 1.25rem;
- }
-
- /* Container padding - reduce for more mobile space */
- .container,
- .container-narrow {
- padding-left: 1.25rem;
- padding-right: 1.25rem;
- }
-
- /* Section padding - reduce vertical spacing */
- .code-section,
- .why-go-section,
- .advantage-section,
- .built-for-section,
- .roadmap-section,
- .cta-section {
- padding: 4rem 0;
- }
-
- /* Mobile Footer */
- .footer-grid {
- grid-template-columns: 1fr;
- gap: 2.5rem;
- text-align: center;
- }
-
- .footer-brand {
- align-items: center;
- padding-bottom: 1.5rem;
- border-bottom: 1px solid rgba(42, 53, 68, 0.3);
- }
-
- .footer-column {
- align-items: center;
- }
-
- .footer-heading {
- font-size: 0.8125rem;
- }
-
- .footer-links {
- align-items: center;
- }
-
- .footer-links a {
- font-size: 0.875rem;
- padding: 0.5rem 0;
- min-height: 44px;
- }
-
- .content table {
- font-size: 0.9rem;
- margin: 2rem auto;
- }
-
- .content table thead th,
- .content table tbody td {
- padding: 1rem;
- }
-
- /* Feature Grid - Mobile responsive adjustments */
- .feature-grid h3 {
- font-size: 1.5rem;
- margin-bottom: 1.5rem;
- margin-top: 2.5rem;
- }
-
- .feature-grid h3:first-of-type {
- margin-top: 0;
- }
-}
-
-/* ==========================================================================
- Custom Tooltip for Feature Tables
- Provides instant-response, readable tooltips for feature explanations
- ========================================================================== */
-
-.info-tooltip {
- position: relative;
- display: inline-block;
- cursor: help;
- opacity: 0.6;
- transition: opacity 0.2s ease;
-}
-
-.info-tooltip:hover {
- opacity: 1;
-}
-
-.info-tooltip .tooltip-text {
- visibility: hidden;
- opacity: 0;
- /* Larger, readable text */
- font-size: 0.9375rem; /* 15px */
- line-height: 1.5;
- font-weight: 400;
- /* Better contrast */
- background-color: rgba(20, 27, 45, 0.98);
- color: rgba(255, 255, 255, 0.95);
- /* Generous spacing */
- padding: 12px 16px;
- border-radius: 8px;
- /* Positioning - appears above the icon */
- position: absolute;
- bottom: calc(100% + 8px); /* 8px gap above icon */
- left: 50%;
- transform: translateX(-50%) scale(0.95);
- /* Width control - increased max-width and added min-width */
- width: max-content;
- min-width: 200px;
- max-width: 400px; /* Increased from 280px to accommodate longer text */
- text-align: left;
- white-space: normal;
- word-wrap: break-word;
- /* Styling */
- border: 1px solid rgba(255, 255, 255, 0.2);
- box-shadow: 0 8px 24px rgba(0, 0, 0, 0.5);
- /* Instant response - minimal delay */
- transition: opacity 0.15s ease, transform 0.15s ease, visibility 0.15s;
- transition-delay: 0s;
- z-index: 10000; /* Very high z-index to ensure it appears above everything */
- pointer-events: none;
-}
-
-/* Tooltip arrow */
-.info-tooltip .tooltip-text::after {
- content: '';
- position: absolute;
- top: 100%;
- left: 50%;
- margin-left: -6px;
- border-width: 6px;
- border-style: solid;
- border-color: rgba(20, 27, 45, 0.98) transparent transparent transparent;
-}
-
-/* Show on hover with instant response */
-.info-tooltip:hover .tooltip-text {
- visibility: visible;
- opacity: 1;
- transform: translateX(-50%) scale(1);
- transition-delay: 0.05s; /* Very short 50ms delay for instant feel */
-}
-
-/* Mobile touch support */
-@media (hover: none) {
- .info-tooltip .tooltip-text {
- display: none;
- }
-
- .info-tooltip:active .tooltip-text {
- display: block;
- visibility: visible;
- opacity: 1;
- transform: translateX(-50%) scale(1);
- }
-}
-
-/* Responsive tooltip on small screens */
-@media (max-width: 768px) {
- .info-tooltip .tooltip-text {
- min-width: 180px;
- max-width: 300px; /* Increased from 220px for better mobile readability */
- font-size: 0.875rem; /* 14px on mobile */
- padding: 10px 14px;
- }
-}
-
-/* ============================================
- Philosophy Page Enhancements
- ============================================ */
-
-/* Production Reality Metrics Table */
-.philosophy-metrics {
- margin: 3rem 0;
- padding: 2rem;
- background: linear-gradient(135deg, rgba(34, 211, 238, 0.05) 0%, rgba(6, 182, 212, 0.05) 100%);
- border: 1px solid var(--color-border);
- border-radius: 12px;
-}
-
-.philosophy-metrics h2 {
- text-align: center;
- margin-bottom: 1.5rem;
- color: var(--color-cyan);
-}
-
-.philosophy-metrics table {
- width: 100%;
- margin: 0;
-}
-
-.philosophy-metrics table th {
- background: var(--color-bg-card);
- padding: 1rem;
- font-weight: var(--weight-semibold);
-}
-
-.philosophy-metrics table td {
- padding: 0.875rem 1rem;
-}
-
-.philosophy-metrics table td:nth-child(3) {
- color: var(--color-cyan);
- font-weight: var(--weight-bold);
-}
-
-.philosophy-metrics table td:nth-child(4) {
- color: var(--color-text-muted);
- font-style: italic;
-}
-
-/* Decision Section */
-.decision-section {
- margin: 3rem 0;
-}
-
-.decision-section h3 {
- margin-top: 2.5rem;
- margin-bottom: 1rem;
- padding-bottom: 0.5rem;
- border-bottom: 2px solid var(--color-cyan);
-}
-
-.decision-section > div {
- margin: 2rem 0;
-}
-
-.decision-section strong {
- display: block;
- color: var(--color-cyan);
- margin-top: 1.5rem;
- margin-bottom: 0.5rem;
- font-size: 1.125rem;
-}
-
-.decision-section ul {
- list-style: none;
- padding-left: 0;
-}
-
-.decision-section ul li {
- padding: 0.5rem 0 0.5rem 1.75rem;
- position: relative;
-}
-
-.decision-section ul li::before {
- content: '→';
- position: absolute;
- left: 0.5rem;
- color: var(--color-cyan);
- font-weight: bold;
-}
-
-/* Philosophy CTA Grid */
-.philosophy-cta-grid {
- display: grid;
- grid-template-columns: 1fr 1fr;
- gap: 3rem;
- margin: 2.5rem 0;
- padding: 2.5rem;
- background: var(--color-bg-card);
- border-radius: 12px;
- border: 1px solid var(--color-border);
-}
-
-.philosophy-cta-grid strong {
- display: block;
- font-size: 1.25rem;
- color: var(--color-cyan);
- margin-bottom: 1rem;
-}
-
-.philosophy-cta-grid ul {
- list-style: none;
- padding-left: 0;
-}
-
-.philosophy-cta-grid ul li {
- padding: 0.5rem 0;
-}
-
-.philosophy-cta-grid ul li a {
- color: var(--color-text);
- text-decoration: none;
- transition: color 0.2s;
-}
-
-.philosophy-cta-grid ul li a:hover {
- color: var(--color-cyan);
-}
-
-/* Mobile Responsive */
-@media (max-width: 768px) {
- .philosophy-metrics {
- padding: 1.5rem;
- margin: 2rem 0;
- }
-
- .philosophy-metrics table {
- font-size: 0.875rem;
- }
-
- .philosophy-metrics table th,
- .philosophy-metrics table td {
- padding: 0.75rem 0.5rem;
- }
-
- .philosophy-cta-grid {
- grid-template-columns: 1fr;
- gap: 2rem;
- padding: 2rem 1.5rem;
- }
-
- .decision-section strong {
- font-size: 1rem;
- }
-}
diff --git a/web/static/favicon.svg b/web/static/favicon.svg
deleted file mode 100644
index efd8579..0000000
--- a/web/static/favicon.svg
+++ /dev/null
@@ -1,4 +0,0 @@
-
-
-
-
diff --git a/web/static/js/main.js b/web/static/js/main.js
deleted file mode 100644
index d83e7ee..0000000
--- a/web/static/js/main.js
+++ /dev/null
@@ -1,32 +0,0 @@
-// Copy code functionality
-document.addEventListener('DOMContentLoaded', function() {
- const copyButtons = document.querySelectorAll('.copy-button');
-
- copyButtons.forEach(button => {
- button.addEventListener('click', async function() {
- const codeBlock = this.nextElementSibling.querySelector('code');
- const code = codeBlock.textContent;
-
- try {
- await navigator.clipboard.writeText(code);
-
- // Show success state
- this.classList.add('copied');
- const originalHTML = this.innerHTML;
- this.innerHTML = `
-
-
- `;
-
- // Reset after 2 seconds
- setTimeout(() => {
- this.classList.remove('copied');
- this.innerHTML = originalHTML;
- }, 2000);
- } catch (err) {
- console.error('Failed to copy code:', err);
- }
- });
- });
-});
diff --git a/web/static/llms.txt b/web/static/llms.txt
deleted file mode 100644
index 69508f7..0000000
--- a/web/static/llms.txt
+++ /dev/null
@@ -1,47 +0,0 @@
-# Aixgo
-
-> Aixgo is a production-grade AI agent framework for Go. It enables secure, scalable multi-agent systems with no Python dependencies, single-binary deployment under 20MB, sub-100ms cold starts, and true parallel concurrency (no GIL).
-
-Aixgo targets backend Go engineers, DevOps teams, and data engineers who need to ship AI agents in production without the operational overhead of Python frameworks like LangChain or LlamaIndex. The framework provides 13 orchestration patterns (Supervisor, Sequential, Parallel, Router, Swarm, Hierarchical, RAG, Reflection, Ensemble, Classifier, Aggregation, Planning, MapReduce), 6 agent types (ReAct, Classifier, Aggregator, Planner, Producer, Logger), 8+ LLM providers (OpenAI, Anthropic, Gemini, xAI, Vertex AI, Amazon Bedrock, HuggingFace, plus inference services), Pydantic AI-style validation retry for structured output, and Model Context Protocol (MCP) tool calling.
-
-## Documentation
-
-- [Quick Start](https://aixgo.dev/guides/quick-start): Install, configure, and run a first agent in under five minutes
-- [Core Concepts](https://aixgo.dev/guides/core-concepts): Agents, runtime, supervisor, message flow, and configuration
-- [Agent Types](https://aixgo.dev/guides/agent-types): ReAct, Classifier, Aggregator, Planner, Producer, Logger and when to use each
-- [Multi-Agent Orchestration](https://aixgo.dev/guides/multi-agent-orchestration): Building agent systems with the supervisor and orchestration patterns
-- [Pattern Composition](https://aixgo.dev/guides/pattern-composition): Combining the 13 orchestration patterns for complex workflows
-- [Validation with Retry](https://aixgo.dev/guides/validation-with-retry): Pydantic-style structured output validation with automatic retry for 40-70% reliability gains
-- [Sessions](https://aixgo.dev/guides/sessions): Conversation state and session persistence across agent invocations
-- [Type Safety](https://aixgo.dev/guides/type-safety): Compile-time guarantees, generics usage, and avoiding runtime errors
-- [Provider Integration](https://aixgo.dev/guides/provider-integration): Adding and configuring LLM providers
-- [Provider Comparison](https://aixgo.dev/guides/provider-comparison): Trade-offs between OpenAI, Anthropic, Gemini, xAI, Bedrock, and Vertex AI
-- [AWS Bedrock](https://aixgo.dev/guides/aws-bedrock): Using Anthropic, Amazon, Meta, Mistral, Cohere, and AI21 models via Bedrock
-- [Embeddings](https://aixgo.dev/guides/embeddings): Embedding generation with OpenAI and HuggingFace
-- [Vector Databases](https://aixgo.dev/guides/vector-databases): Firestore and in-memory vector stores for RAG
-- [Observability](https://aixgo.dev/guides/observability): OpenTelemetry tracing, metrics, and cost tracking
-- [Production Deployment](https://aixgo.dev/guides/production-deployment): Docker, Kubernetes, Cloud Run patterns
-- [Single vs Distributed](https://aixgo.dev/guides/single-vs-distributed): Choosing between local channel runtime and gRPC distributed runtime
-- [Cost Optimization](https://aixgo.dev/guides/cost-optimization): Router pattern, model tiering, caching, and budget controls
-- [Extending Aixgo](https://aixgo.dev/guides/extending-aixgo): Custom agents, providers, and orchestration patterns
-- [Using Public Interfaces](https://aixgo.dev/guides/using-public-interfaces): Stable Go API surface and module boundaries
-- [Docker from Scratch](https://aixgo.dev/guides/docker-from-scratch): Minimal containers for production aixgo deployments
-- [Chat Assistant Tutorial](https://aixgo.dev/guides/chat-assistant): End-to-end build of a multi-agent chat assistant
-
-## Why Aixgo
-
-- [Why Aixgo](https://aixgo.dev/why-aixgo): Positioning vs Python frameworks, design tradeoffs, target use cases
-- [Features](https://aixgo.dev/features): Complete feature catalog with status indicators
-- [Proverbs](https://aixgo.dev/proverbs): Design principles and idioms for the framework
-
-## Source and reference
-
-- [GitHub Repository](https://github.com/aixgo-dev/aixgo): Source, issues, releases, contributing guide
-- [Go Package Reference](https://pkg.go.dev/github.com/aixgo-dev/aixgo): Generated Go API documentation
-- [Releases](https://github.com/aixgo-dev/aixgo/releases): Version history and changelogs
-- [Discussions](https://github.com/orgs/aixgo-dev/discussions): Community Q&A and design conversations
-
-## Optional
-
-- [Blog](https://aixgo.dev/blog): Release announcements and engineering posts
-- [Roadmap](https://github.com/orgs/aixgo-dev/projects/1): Public roadmap and milestone tracking