The first Claude Code plugin that generates AI images — from your terminal, cost-optimized, on-brand.
img-pilot is the third plugin in the Sakaax pilot ecosystem:
ux-pilot → UX discovery + brief
brand-pilot → brand tokens (CSS + Tailwind + palette)
img-pilot → AI-generated visual assets (this plugin)
If either sister plugin has been run, img-pilot reads their outputs automatically: palette, fonts, tone, style, product type, validated design tokens. Zero reconfiguration. You keep the same design identity across every plugin in the pipeline because they all share the same art direction and read the same briefs.
Every other AI image tool lives in a browser tab. You context-switch, you prompt, you download, you drag into your project, you repeat for every variant. img-pilot lives inside Claude Code — the terminal you're already in — and makes Claude the art director that writes the prompt, chooses the provider, and derives every asset size from a single generation.
You pay for one API call. You get eight assets (logo + favicon set + OG + Twitter card + Discord embed + GitHub banner + iOS icon + Android icon).
Every dev project needs visual assets. Logo. Favicon. OG image. GitHub banner. Most devs either:
- Skip them entirely — default favicon, no OG image, cardboard-box GitHub page
- Spend hours in Figma or Canva for basic assets, inventing a design language on the fly
- Pay a designer $200+ for a logo they end up using mostly at 16×16
img-pilot collapses that into a single confirmed terminal command. On-brand, in minutes, for the cost of one API call.
| | Manual approach | img-pilot |
|---|---|---|
| Prompt quality | User guesses at specifics | Claude builds from full UX + brand context |
| API calls per result | One per asset (5+ calls for a full set) | One source → 8+ derived assets via sharp |
| Cost for a full set | $0.20 – $0.40 | $0.03 – $0.08 |
| Consistency | Each asset designed in isolation | All derived from the same source, guaranteed |
| Favicon set | Forgotten or default | Auto-derived, all sizes, webmanifest included |
| Review | Assets used immediately | Gallery HTML in img-pilot/, review before use |
| Provider lock-in | Hardcoded to one | 9 providers, switch in one config line |
| API key safety | Hope you remembered .gitignore | Auto-gitignore + chmod 600 + pre-commit hook |
# Step 1 — Add the marketplace entry
/plugin marketplace add Sakaax/img-pilot
# Step 2 — Install the plugin
/plugin install img-pilot@img-pilot

Then configure a provider:

/img-pilot config

Fill in an API key for at least one of the nine supported providers. Config is stored globally at ~/.config/img-pilot/config.toml (reusable across all your projects) with an optional project-local override at <project>/img-pilot/config.toml.
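The global-plus-override layout can be pictured as a simple merge where project-local values win. The sketch below is only an illustration: the config shape (`default_provider`, `providers`, `api_key` nesting) and the `mergeConfig` helper are assumptions, not the plugin's actual loader. Only `max_api_calls_per_session` is a name taken from this README.

```typescript
// Hypothetical config shape. Only max_api_calls_per_session appears in this README.
interface ProviderConfig {
  api_key?: string;
}

interface ImgPilotConfig {
  default_provider?: string;
  max_api_calls_per_session?: number;
  providers?: Record<string, ProviderConfig>;
}

// Assumed semantics: project-local values override global ones, and providers
// are merged per provider name so a project can override a single key.
function mergeConfig(globalCfg: ImgPilotConfig, projectCfg: ImgPilotConfig): ImgPilotConfig {
  const providers: Record<string, ProviderConfig> = { ...(globalCfg.providers ?? {}) };
  for (const [name, cfg] of Object.entries(projectCfg.providers ?? {})) {
    providers[name] = { ...(providers[name] ?? {}), ...cfg };
  }
  return { ...globalCfg, ...projectCfg, providers };
}
```

With this approach a project can pin its own default provider while still reusing the API keys stored in the global file.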
/img-pilot # Guided flow — infers what you need, proposes the cheapest path
/img-pilot logo # Generate logo (icon / text / combo)
/img-pilot favicon # Favicon + full icon set (16/32/180/192/512 + site.webmanifest)
/img-pilot social # OG image, Twitter card, Discord embed
/img-pilot banner # GitHub banner (1280×640)
/img-pilot icons # iOS/Android app icons (with proper rounded corners)
/img-pilot config # Edit config (providers, defaults, limits)

Every command runs standalone or as part of the full flow via /img-pilot.
img-pilot reads, in priority order:
| Source | What it extracts |
|---|---|
| brand-pilot/tokens.css + tailwind.config.snippet.js + .palette.cache.json | Validated brand tokens (exact hex, radius scale, shadow scale, fonts) |
| brand-pilot/brand-kit.md | Human-readable brand description |
| ux-pilot/ux-brief.md | Palette, fonts, tone, style, product type, page structure |
| img-pilot/brief.md | Your own fallback brief (from discovery) |
If none of these exist, img-pilot runs a quick 6-question discovery and saves img-pilot/brief.md for future runs.
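A minimal sketch of that lookup, assuming the reader simply checks each source in the listed order and falls back to discovery when nothing is found. The `resolveBriefSources` helper and the injected `exists` callback are hypothetical, and the first table row is represented by `tokens.css` alone for brevity.

```typescript
// Sources in the priority order listed in the table above.
const BRIEF_SOURCES = [
  "brand-pilot/tokens.css",
  "brand-pilot/brand-kit.md",
  "ux-pilot/ux-brief.md",
  "img-pilot/brief.md",
] as const;

type BriefSource = (typeof BRIEF_SOURCES)[number];

// Returns the sources that exist, highest priority first, or "discovery"
// when none is present. `exists` is injected so the logic stays testable.
function resolveBriefSources(exists: (path: string) => boolean): BriefSource[] | "discovery" {
  const found = BRIEF_SOURCES.filter((p) => exists(p));
  return found.length > 0 ? found : "discovery";
}
```

In a real run, `exists` would typically wrap a filesystem check relative to the project root.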
Production-quality, 200–300 word prompts. Exact hex values (not "orange"), specific style words from the brief, explicit constraints (must work at 16×16, transparent background, flat design), and a curated anti-slop list (no gradients on logo marks, no stock-photo aesthetics, no generic tech clichés like gears or circuits).
The full prompt is shown to you in chat before anything is dispatched.
Every request goes through a decision tree:
- Already exists in img-pilot/? The skill asks "regenerate?"
- Can we skip the API entirely? SVG-pure (letter favicon, geometric logo)
- Derivable from an existing asset? Resize / compose / round corners (zero cost)
- Derivable from another asset in this run's plan? One API call + N sharp derivations
- API call required: build prompt, pick provider, emit plan step with cost estimate
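The decision tree above can be read as a chain of early returns, cheapest outcome first. This is a sketch under assumptions: the `AssetRequest` flags and the `StepKind` names are made up for illustration and are not the optimizer's real types.

```typescript
// Hypothetical step kinds, one per branch of the decision tree.
type StepKind = "ask-regenerate" | "svg-pure" | "derive-existing" | "derive-in-plan" | "api-call";

// Hypothetical facts the optimizer would need to know about one requested asset.
interface AssetRequest {
  existsOnDisk: boolean;
  svgPure: boolean;
  derivableFromExisting: boolean;
  derivableFromPlanned: boolean;
}

// Checks run in the same order as the list above; the first match wins.
function classify(req: AssetRequest): StepKind {
  if (req.existsOnDisk) return "ask-regenerate";
  if (req.svgPure) return "svg-pure";
  if (req.derivableFromExisting) return "derive-existing";
  if (req.derivableFromPlanned) return "derive-in-plan";
  return "api-call";
}
```

Ordering the checks this way means a paid API call is only reached when every zero-cost option has been ruled out.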
You see the complete plan before any charge: assets to produce, provider chosen (with the reason), each step's kind, total cost, prompt preview. You approve. You pay only then.
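To make the "complete plan" concrete, here is one way a plan and its total could be modelled. The `PlanStep` fields and the `summarizePlan` helper are assumptions for illustration; the actual dry-run JSON may be shaped differently.

```typescript
// Hypothetical plan step; field names are illustrative only.
interface PlanStep {
  asset: string;
  kind: "api-call" | "derive" | "svg-pure";
  provider?: string;
  estimatedCostUsd: number;
}

interface PlanSummary {
  apiCalls: number;
  totalCostUsd: number;
}

// Counts paid steps and sums the estimates, rounded to 4 decimals to avoid
// floating-point noise in the displayed total.
function summarizePlan(steps: PlanStep[]): PlanSummary {
  const apiCalls = steps.filter((s) => s.kind === "api-call").length;
  const total = steps.reduce((sum, s) => sum + s.estimatedCostUsd, 0);
  return { apiCalls, totalCostUsd: Math.round(total * 10000) / 10000 };
}
```

A plan with one generated logo and several derived assets would therefore report a single API call and the cost of that call only.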
Every run updates img-pilot/gallery.html — a persistent dark-themed HTML gallery of every asset generated in this project. Each card shows the image, the prompt used, the provider, the cost, dimensions, and timestamp. Latest run on top, scrollable history.
Run with --serve to browse it on http://localhost:4090.
Nine providers in v0.1, one thin adapter each (~80–120 lines per file). Plug your own key. No cloud SDK dependencies. Switch providers per run with --provider <name>.
| Provider | Model | Best for | ~$/image |
|---|---|---|---|
| OpenAI | GPT Image 1.5 | General quality baseline | $0.04 |
| Black Forest Labs | FLUX 2 Pro | Photorealism, sharp edges | $0.04 |
| Google | Imagen 4 (Vertex AI) | Best value, strong text rendering | $0.04 |
| Stability AI | Stable Diffusion 3.5 | Dev-friendly, self-hostable path | $0.03 |
| Ideogram | Ideogram v3 | Text-in-image (logos, wordmarks) | $0.08 |
| Leonardo AI | Leonardo | Custom models, brand fine-tuning | $0.035 |
| Replicate | Multi-model router | Access to any open-source model | variable |
| Recraft | Recraft v3 | Native SVG output (icons) | $0.04 |
| fal.ai | Multi-model, ultra-fast | Low-latency, async webhooks | variable |
Midjourney is on the roadmap and intentionally not included in v0.1 — no official REST API (Discord-only), and we'd rather ship 9 rock-solid adapters than 10 fragile ones.
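As a rough picture of what a "thin adapter" could look like, the sketch below defines a small adapter contract and a lookup by name, as `--provider <name>` would need. Everything here is hypothetical: the `ProviderAdapter` interface, the `pickProvider` helper, and the placeholder endpoint `https://api.example.com/v1/images` do not correspond to any real provider API or to the plugin's source.

```typescript
// Hypothetical adapter contract; the real interface in src/providers/ may differ.
interface GenerateRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

interface ProviderAdapter {
  name: string;
  costPerImageUsd: number | "variable";
  buildRequest(prompt: string, apiKey: string): GenerateRequest;
}

// Placeholder endpoint and payload, not any real provider's API.
const exampleAdapter: ProviderAdapter = {
  name: "example",
  costPerImageUsd: 0.04,
  buildRequest(prompt, apiKey) {
    return {
      url: "https://api.example.com/v1/images",
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
      body: JSON.stringify({ prompt, size: "1024x1024" }),
    };
  },
};

// Resolves the adapter requested on the command line, failing loudly on typos.
function pickProvider(registry: ProviderAdapter[], name: string): ProviderAdapter {
  const adapter = registry.find((a) => a.name === name);
  if (!adapter) {
    const known = registry.map((a) => a.name).join(", ");
    throw new Error(`Unknown provider "${name}". Available: ${known}`);
  }
  return adapter;
}
```

Because the adapter only builds a plain request object, it can be handed to native `fetch` without any provider SDK.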
Everything lands in img-pilot/ at your project root (auto-added to .gitignore):
img-pilot/
├── config.toml # Provider API keys (chmod 600, gitignored)
├── brief.md # Image brief (auto from ux-pilot or discovery)
├── gallery.html # Persistent audit log of every asset
├── logo/
│ ├── logo-icon.png # 1024×1024 source
│ ├── logo-text.png
│ └── logo-combo.png
├── favicon/
│ ├── favicon.svg
│ ├── favicon-16.png · favicon-32.png · apple-touch-icon.png
│ ├── icon-192.png · icon-512.png
│ └── site.webmanifest
├── social/
│ ├── og-image.png # 1200×630
│ ├── twitter-card.png # 1200×628
│ └── discord-embed.png # 1280×720
├── banner/
│ └── github-banner.png # 1280×640
└── icons/
├── ios-180.png
├── android-192.png
└── android-512.png
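For reference, a site.webmanifest that points at the two PNG icons in the favicon/ folder could be generated along these lines. The `buildWebManifest` helper and its parameters are assumptions; the manifest members themselves (`name`, `short_name`, `icons`, `theme_color`, `background_color`, `display`) are standard Web App Manifest fields.

```typescript
interface ManifestIcon {
  src: string;
  sizes: string;
  type: string;
}

// Builds the manifest JSON for the icon-192.png / icon-512.png pair shown in the tree above.
// Hypothetical helper: using themeColor for background_color too is an arbitrary choice.
function buildWebManifest(name: string, themeColor: string): string {
  const icons: ManifestIcon[] = [192, 512].map((s) => ({
    src: `icon-${s}.png`,
    sizes: `${s}x${s}`,
    type: "image/png",
  }));
  const manifest = {
    name,
    short_name: name,
    icons,
    theme_color: themeColor,
    background_color: themeColor,
    display: "standalone",
  };
  return JSON.stringify(manifest, null, 2);
}
```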
The CLI emits a --dry-run JSON plan before anything costs money. Claude Code presents the plan in chat — assets, provider, full prompt preview, total cost — and waits for your explicit approval. You approve, you pay. You cancel, you don't. There is no path in the code where an API call happens without this confirmation round-trip.
Default max_api_calls_per_session = 5 (configurable). Even if something goes sideways in the skill layer, the CLI refuses to exceed the limit and throws SessionLimitExceededError.
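A hard limit like this can be enforced with a small counter that is consulted before every dispatch. The sketch below is an assumption about how such a guard might look; only the error name `SessionLimitExceededError` and the default of 5 come from this README, while the `SessionGuard` class is hypothetical.

```typescript
class SessionLimitExceededError extends Error {
  constructor(limit: number) {
    super(`API call limit of ${limit} per session reached`);
    this.name = "SessionLimitExceededError";
  }
}

// Hypothetical guard: counts dispatched calls and refuses to go past the limit.
class SessionGuard {
  private calls = 0;
  private readonly maxCalls: number;

  constructor(maxCalls: number = 5) {
    this.maxCalls = maxCalls;
  }

  // Call immediately before dispatching a provider request.
  registerCall(): void {
    if (this.calls >= this.maxCalls) {
      throw new SessionLimitExceededError(this.maxCalls);
    }
    this.calls += 1;
  }
}
```

Keeping the check in the CLI layer, rather than in the skill, means the limit holds even if the layer above misbehaves.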
| Layer | What it does |
|---|---|
| Auto-gitignore | img-pilot/ is appended to .gitignore before any config write. If the gitignore check fails, the write is aborted. |
| chmod 600 | config.toml is set to owner read/write only immediately after write (POSIX; Windows uses icacls). |
| Pre-commit hook | A hook is installed in .git/hooks/pre-commit that scans staged files for sk-..., AIza..., key-..., api_key = "..." patterns and blocks commits containing them. Append-safe (preserves any existing hook). |
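The pattern scan in the pre-commit layer can be approximated with a handful of regular expressions. The expressions below are guesses that cover the four pattern families named in the table; the shipped hook (hooks/pre-commit.sh) may use different ones, and `findSecrets` is a hypothetical helper.

```typescript
// Hypothetical regexes for the sk-..., AIza..., key-... and api_key = "..." families.
const SECRET_PATTERNS: RegExp[] = [
  /sk-[A-Za-z0-9_-]{16,}/,
  /AIza[A-Za-z0-9_-]{20,}/,
  /key-[A-Za-z0-9]{16,}/,
  /api_key\s*=\s*"[^"]+"/,
];

// Returns every line of the staged content that matches at least one pattern.
// An empty result means the commit may proceed.
function findSecrets(stagedText: string): string[] {
  return stagedText
    .split("\n")
    .filter((line) => SECRET_PATTERNS.some((re) => re.test(line)));
}
```

The minimum lengths are there to reduce false positives on short strings such as `sk-1`; tightening or loosening them is a trade-off between noise and coverage.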
API keys are never printed in full anywhere — terminal output, logs, and error messages all mask them as sk-...xxxx (first 3 + last 4 chars).
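The masking rule (first 3 + last 4 characters) is small enough to show in full. This `maskKey` function is a sketch rather than the plugin's code, and the handling of very short keys is an added assumption so that a short value is never mostly revealed.

```typescript
// Keeps the first 3 and last 4 characters and hides everything in between.
function maskKey(key: string): string {
  // Assumption: keys of 7 characters or fewer are hidden entirely,
  // since prefix + suffix would otherwise expose the whole value.
  if (key.length <= 7) return "***";
  return `${key.slice(0, 3)}...${key.slice(-4)}`;
}
```

For example, `maskKey("sk-abcdefgh1234")` yields `sk-...1234`.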
Zero telemetry. Zero feature flags. Zero remote rule fetching. If you run tcpdump during a generation, the only outbound traffic is the provider call you explicitly approved. Period.
| Component | Choice |
|---|---|
| Language | TypeScript 5.8 |
| Runtime | Bun 1.1+ (Node.js fallback supported) |
| Image processing | sharp (libvips native binding) — no ImageMagick required |
| HTTP | Native fetch — no SDK lock-in |
| Config | TOML via @iarna/toml |
| Tests | 120+ via Bun test runner; all provider calls mocked in CI |
| License | MIT |
Dependencies are intentionally minimal. sharp covers 100% of the image operations needed in v0.1 (resize, crop, compose, round corners via SVG mask, format convert, text overlay). No shell-spawning. No subprocess management.
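The "round corners via SVG mask" technique works by drawing a rounded rectangle and keeping only the source pixels that fall inside it. The sketch below builds just the mask markup, which keeps it free of native dependencies; `roundedMaskSvg` is a hypothetical helper, not the plugin's processor code.

```typescript
// Builds an SVG mask: an opaque rounded square on a transparent canvas.
// The radius is clamped so it can never exceed half the side length.
function roundedMaskSvg(size: number, radius: number): string {
  const r = Math.min(radius, size / 2);
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${size}" height="${size}">` +
    `<rect x="0" y="0" width="${size}" height="${size}" rx="${r}" ry="${r}" fill="#fff"/>` +
    `</svg>`
  );
}
```

With sharp, a mask like this is typically applied by resizing the source to the same dimensions and compositing the SVG buffer with the `dest-in` blend mode, for example `sharp(src).resize(size, size).composite([{ input: Buffer.from(svg), blend: "dest-in" }]).png().toBuffer()`.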
img-pilot/
├── .claude-plugin/{plugin,marketplace}.json
├── .claude/skills/img-pilot/SKILL.md
├── skills/img-pilot.md
├── skill.json
├── src/
│ ├── index.ts # CLI dispatcher
│ ├── types.ts
│ ├── config/ # Hybrid global+project TOML loader
│ ├── security/ # Gitignore, chmod, pre-commit hook, key masking
│ ├── brief/ # ux-pilot + brand-pilot + own parsers, sanitize, reader
│ ├── discovery/ # 6-question fallback flow
│ ├── optimizer/ # Derivation map, existing-scan, plan-builder
│ ├── prompt/ # Templates + anti-slop builder
│ ├── providers/ # 9 adapters (OpenAI, BFL, Google, Stability, Ideogram, Leonardo, Replicate, Recraft, fal)
│ ├── processor/ # sharp-based resize/crop/compose/round-corners/convert/add-text
│ ├── svg/ # Zero-cost generators (letter favicon, geometric logo, text logo, webmanifest)
│ └── gallery/ # HTML gallery + --serve mode
├── templates/ # config.toml, gallery.html, 5 prompt templates
├── hooks/pre-commit.sh
├── tests/ # 120+ tests, all API calls mocked
└── .github/workflows/ci.yml
v0.2 and beyond:
- Midjourney when they ship an official REST API
- Self-hosted inference — local Stable Diffusion via Ollama/ComfyUI
- Batch mode — N variants of the same prompt in one call
- Fine-tuning / style reference upload (LoRA, IP-adapter)
- CI wrapper — regenerate assets in GitHub Actions on brief change
- Provider benchmarking — run the same prompt across 3 providers and compare side-by-side
- Live pricing — fetch cost estimates from provider APIs at dry-run time
Claude Code caches plugins aggressively. To force a clean reinstall:
- Quit Claude Code completely
- Delete the cache: rm -rf ~/.claude/plugins/cache/img-pilot
- Relaunch Claude Code
- Reinstall:
  /plugin marketplace add Sakaax/img-pilot
  /plugin marketplace update img-pilot
  /plugin install img-pilot@img-pilot
  /reload-plugins
- Google Imagen requires a Vertex AI service-account JSON (the full JSON blob, not a path) pasted into api_key. You also need the Imagen API enabled on your GCP project. See Google Cloud quickstart.
- Midjourney is not supported in v0.1. Use any of the 9 supported providers in the meantime; FLUX 2 Pro via Black Forest Labs or Replicate gets you comparable quality.
Run /img-pilot config to set up a provider. Alternatively create ~/.config/img-pilot/config.toml manually — the template is shipped with the plugin and will be written on first config invocation.
sharp ships a native binary (~15 MB) that downloads on bun install. If it fails:
- Make sure you're on Node 18+ / Bun 1.1+
- Check your network (corporate proxy? VPN?)
- Retry: bun install --force
- Last resort: rm -rf node_modules bun.lock && bun install
| Resource | Link |
|---|---|
| Landing page | img-pilot.sakaax.com |
| GitHub | github.com/Sakaax/img-pilot |
| Pilot ecosystem | ux-pilot · brand-pilot |
| @sakaaxx |
Built by @Sakaax. Part of the pilot ecosystem — each plugin stands alone, but they compose into a complete design pipeline from first sketch to production assets. If you run all three in order, you get a full product design done in your terminal.
MIT — free forever. No accounts, no subscriptions, no usage caps beyond what your chosen image-generation provider imposes.
