vLLM Image MCP Server

An MCP (Model Context Protocol) server that provides AI-powered image generation tools via any vLLM-Omni compatible endpoint. The server is model-aware, GPU-aware, and supports both single and batch image generation with automatic parameter tuning based on the loaded model.

Features

10 MCP tools for image generation, batch processing, progress monitoring, and system status
Model-aware defaults — auto-detects the loaded model and applies optimal parameters
5 built-in model profiles: FLUX.2-klein 4B/9B, Qwen-Image variants, Z-Image-Turbo
Multi-format output — save as PNG (lossless), JPG (quality=95), or WebP (smallest, ideal for web)
Up to 4K resolution — FLUX Klein models support resolutions up to 3840x2160
Flexible output paths — the AI model chooses where to save images per-call (e.g. directly into your project's assets/ folder)
Batch generation with automatic concurrency control based on estimated VRAM
Batch progress monitoring — poll running batches to track completion and prevent timeouts
Resolution validation — snaps to valid multiples, clamps to megapixel limits
Aspect ratio presets — use shortcuts like 16:9, 16:9_2k, 16:9_4k instead of raw pixels
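The automatic concurrency control listed above could, in its simplest form, divide the free VRAM by the memory each generation is expected to need. The sketch below illustrates that idea only; the function name, its parameters, and the 10% headroom are assumptions and not taken from the server's code. The cap of 20 mirrors the batch_generate limit in the Tools table.

```python
def estimate_max_concurrent(free_vram_gb, per_job_gb, hard_cap=20, headroom=0.10):
    """Estimate how many generations fit in free VRAM (hypothetical helper).

    free_vram_gb: VRAM currently free on the GPU, in GB.
    per_job_gb:   assumed VRAM cost of one in-flight generation, in GB.
    hard_cap:     upper bound on concurrency regardless of VRAM.
    headroom:     fraction of free VRAM left unused as a safety margin.
    """
    if per_job_gb <= 0:
        raise ValueError("per_job_gb must be positive")
    usable_gb = free_vram_gb * (1.0 - headroom)
    slots = int(usable_gb // per_job_gb)
    # Always allow at least one job so a batch can make progress.
    return max(1, min(slots, hard_cap))
```

With 80 GB free and 13 GB per job this yields 5 concurrent generations; with too little VRAM it falls back to 1 rather than 0.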

Tools

Tool	Description
`generate_image`	Generate a single image from a text prompt
`batch_generate`	Generate multiple images concurrently (max 20)
`get_model_info`	Get current model info and recommended parameters
`list_presets`	List aspect ratio presets for the current model
`estimate_generation`	Estimate time and resource usage before generating
`gpu_status`	Check GPU/VRAM availability for capacity planning
`server_health`	Check vLLM server connectivity and status
`cancel_batch`	Cancel a running batch generation
`check_batch_progress`	Check progress of a running batch generation
`list_active_batches`	List all currently running batch jobs

Dynamic Output Directory

Both generate_image and batch_generate require an output_dir parameter. The AI model decides where to save images based on your project context — no hardcoded paths needed:

generate_image(prompt="A hero banner", output_dir="./src/assets/images")
batch_generate(prompts=["cat", "dog"], output_dir="./public/img")

This means images land exactly where your project needs them.
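As a rough illustration of what handling a per-call output_dir involves, the sketch below resolves the directory, creates it if needed, and derives a file name from the prompt. The helper name and the naming scheme (prompt slug plus timestamp) are assumptions for illustration; the server's actual file naming is not documented here.

```python
import re
import time
from pathlib import Path


def build_output_path(output_dir, prompt, fmt="png", timestamp=None):
    """Return a file path inside output_dir for a new image (hypothetical helper)."""
    directory = Path(output_dir).expanduser().resolve()
    # Create nested project folders such as ./src/assets/images on demand.
    directory.mkdir(parents=True, exist_ok=True)
    # Reduce the prompt to a short, filesystem-safe slug.
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")[:40] or "image"
    stamp = int(time.time()) if timestamp is None else timestamp
    return directory / f"{slug}-{stamp}.{fmt}"
```

Relative paths are resolved against the process working directory, so an MCP client that launches the server from the project root gets paths relative to that root.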

Output Formats

Both generate_image and batch_generate support a format parameter:

Format	Use Case	Quality	File Size (vs. PNG)
`png` (default)	Lossless quality, transparency support	Lossless	Largest (baseline)
`jpg`	General use, photographs	quality=95	~70% smaller
`webp`	Web projects, optimized delivery	quality=90	~80% smaller

generate_image(prompt="A hero banner", output_dir="./assets", format="webp")
batch_generate(prompts=["cat", "dog"], output_dir="./public", format="jpg")

The AI model chooses the format based on context (e.g. webp for web projects, png for graphic design).
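The format table can be read as a small mapping from the format parameter to encoder settings. The sketch below assumes the images are written with Pillow, whose Image.save accepts format and quality keyword arguments; whether the server actually uses Pillow is an assumption, and save_options is a hypothetical helper.

```python
# Encoder settings per output format, taken from the table above.
SAVE_OPTIONS = {
    "png": {"format": "PNG"},
    "jpg": {"format": "JPEG", "quality": 95},
    "webp": {"format": "WEBP", "quality": 90},
}


def save_options(fmt="png"):
    """Normalize a format name and return keyword arguments for an image save call."""
    key = fmt.lower().lstrip(".")
    if key == "jpeg":
        key = "jpg"
    if key not in SAVE_OPTIONS:
        raise ValueError(f"unsupported format: {fmt!r} (expected png, jpg, or webp)")
    # Return a copy so callers cannot mutate the shared table.
    return dict(SAVE_OPTIONS[key])
```

With Pillow this would be used as image.save(path, **save_options("webp")). JPEG has no alpha channel, so an RGBA image would need converting to RGB before saving as jpg.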

Batch Progress Monitoring

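For long-running batch generations, poll check_batch_progress roughly every 50 seconds with the batch_id returned by batch_generate. Regular polling tracks completion and keeps the client from timing out: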

# Start batch
batch_generate(prompts=[...], output_dir="./output")  # returns batch_id

# Poll progress
check_batch_progress(batch_id="batch_abc123def456")

# Discover running batches
list_active_batches()
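The polling pattern above can be sketched as a loop that calls the progress tool until the batch reaches a final state. The response shape is an assumption: this sketch expects a dict with a "status" key whose final values are "completed", "failed", or "cancelled", which may not match what check_batch_progress really returns. The check_progress callable stands in for the actual tool call.

```python
import time


def wait_for_batch(check_progress, batch_id, interval=50.0, max_wait=3600.0,
                   sleep=time.sleep):
    """Poll check_progress(batch_id) until the batch finishes (illustrative sketch)."""
    final_states = {"completed", "failed", "cancelled"}  # assumed status values
    waited = 0.0
    while True:
        progress = check_progress(batch_id)
        if progress.get("status") in final_states:
            return progress
        if waited >= max_wait:
            raise TimeoutError(f"batch {batch_id} still running after {waited:.0f}s")
        # Sleep between polls; the README suggests roughly 50 seconds.
        sleep(interval)
        waited += interval
```

Passing sleep as a parameter makes the loop easy to test without real delays.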

Prerequisites

Python 3.11+
A running vLLM-Omni server with an image generation model loaded
pip or uv for package installation

Installation

From source

git clone https://github.com/sumitchatterjee13/vllm-image-mcp-server.git
cd vllm-image-mcp-server
pip install -e .

Verify installation

vllm-image-mcp --help

Or run directly:

python -m vllm_image_mcp.server --help

Usage

Standalone

# Default: connects to http://localhost:6655
vllm-image-mcp

# Custom vLLM server URL
vllm-image-mcp --vllm-url http://192.168.1.100:6655

# With custom timeout
vllm-image-mcp --vllm-url http://localhost:6655 --timeout 600

CLI Arguments

Argument	Default	Description
`--vllm-url`	`http://localhost:6655`	Base URL of the vLLM-Omni server
`--max-concurrent`	auto	Max concurrent generations for batch
`--timeout`	`300`	Request timeout in seconds

Note: There is no --output-dir flag. The output path is provided by the AI model on every generate_image / batch_generate call, so images are saved wherever the project needs them.
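For readers who want to see the documented flags in code form, here is a minimal argparse sketch that reproduces the table above. It is a reconstruction from the documented names and defaults, not the project's actual parser; in particular, representing the "auto" default of --max-concurrent as None and parsing --timeout as an integer are assumptions.

```python
import argparse


def build_parser():
    """Build a parser matching the documented CLI flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="vllm-image-mcp")
    parser.add_argument("--vllm-url", default="http://localhost:6655",
                        help="Base URL of the vLLM-Omni server")
    # None stands for "auto": derive concurrency at runtime.
    parser.add_argument("--max-concurrent", type=int, default=None,
                        help="Max concurrent generations for batch (omit for auto)")
    parser.add_argument("--timeout", type=int, default=300,
                        help="Request timeout in seconds")
    return parser
```

Calling build_parser().parse_args([]) yields the defaults from the table, and there is deliberately no --output-dir option.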

MCP Client Configuration

Claude Code

Option A — CLI command:

claude mcp add vllm-image -- python -m vllm_image_mcp.server --vllm-url http://localhost:6655

Option B — Project config (.mcp.json in project root):

{
  "mcpServers": {
    "vllm-image": {
      "command": "python",
      "args": [
        "-m", "vllm_image_mcp.server",
        "--vllm-url", "http://localhost:6655"
      ]
    }
  }
}

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "vllm-image": {
      "command": "python",
      "args": [
        "-m", "vllm_image_mcp.server",
        "--vllm-url", "http://localhost:6655"
      ]
    }
  }
}

Cursor

Create .cursor/mcp.json in your project root (or ~/.cursor/mcp.json for global):

{
  "mcpServers": {
    "vllm-image": {
      "command": "python",
      "args": [
        "-m", "vllm_image_mcp.server",
        "--vllm-url", "http://localhost:6655"
      ]
    }
  }
}

Kilo Code

Create .kilocode/mcp.json in your project root:

{
  "mcpServers": {
    "vllm-image": {
      "command": "python",
      "args": [
        "-m", "vllm_image_mcp.server",
        "--vllm-url", "http://localhost:6655"
      ],
      "alwaysAllow": [],
      "disabled": false
    }
  }
}

Windows Note

On native Windows (not WSL/Git Bash), if you get "Connection closed" errors, wrap the command with cmd:

{
  "mcpServers": {
    "vllm-image": {
      "command": "cmd",
      "args": [
        "/c", "python", "-m", "vllm_image_mcp.server",
        "--vllm-url", "http://localhost:6655"
      ]
    }
  }
}

Remote vLLM Server

To connect to a vLLM server on another machine, change the --vllm-url argument:

"args": [
  "-m", "vllm_image_mcp.server",
  "--vllm-url", "http://192.168.1.100:6655"
]
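Because the client configurations above differ only in the server URL and the optional cmd wrapper, they can be generated rather than hand-edited. The helper below is a hypothetical convenience script, not part of the project; it builds the same mcpServers structure shown in the JSON examples.

```python
import json


def build_mcp_config(vllm_url="http://localhost:6655", wrap_with_cmd=False):
    """Return an mcpServers config dict like the examples above."""
    args = ["-m", "vllm_image_mcp.server", "--vllm-url", vllm_url]
    if wrap_with_cmd:
        # Native Windows workaround from the Windows Note: launch via cmd /c.
        command, args = "cmd", ["/c", "python", *args]
    else:
        command = "python"
    return {"mcpServers": {"vllm-image": {"command": command, "args": args}}}


if __name__ == "__main__":
    # Print a config that points at a remote vLLM server.
    print(json.dumps(build_mcp_config("http://192.168.1.100:6655"), indent=2))
```

The printed JSON can be pasted into .mcp.json, claude_desktop_config.json, or .cursor/mcp.json. Kilo Code's extra alwaysAllow and disabled keys would need to be added by hand.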

Supported Models

The server includes built-in profiles with optimal defaults for these models:

Model	Type	Steps	Guidance	Neg. Prompt	Est. VRAM	Max Resolution
`black-forest-labs/FLUX.2-klein-4B`	Distilled	4	1.0	No	13 GB	4K (9.0 MP)
`black-forest-labs/FLUX.2-klein-9B`	Distilled	4	1.0	No	29 GB	4K (9.0 MP)
`Qwen/Qwen-Image-2512`	Standard	28	5.0	Yes	40 GB	2K (4.0 MP)
`Qwen/Qwen-Image`	Standard	50	5.0	Yes	40 GB	2K (4.0 MP)
`Tongyi-MAI/Z-Image-Turbo`	Distilled	8	1.0	No	16 GB	2K (4.0 MP)

Unknown models automatically use fallback defaults (steps=28, guidance=5.0, 1024x1024).
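The profile table and its fallback rule amount to a dictionary lookup with a default. The sketch below encodes the values from the table; the data layout, the key names, and the exact-match lookup on the model ID are assumptions about how such a registry might be written, not the server's real structure.

```python
# Values copied from the Supported Models table; key names are illustrative.
PROFILES = {
    "black-forest-labs/FLUX.2-klein-4B": {
        "steps": 4, "guidance": 1.0, "negative_prompt": False,
        "est_vram_gb": 13, "max_megapixels": 9.0},
    "black-forest-labs/FLUX.2-klein-9B": {
        "steps": 4, "guidance": 1.0, "negative_prompt": False,
        "est_vram_gb": 29, "max_megapixels": 9.0},
    "Qwen/Qwen-Image-2512": {
        "steps": 28, "guidance": 5.0, "negative_prompt": True,
        "est_vram_gb": 40, "max_megapixels": 4.0},
    "Qwen/Qwen-Image": {
        "steps": 50, "guidance": 5.0, "negative_prompt": True,
        "est_vram_gb": 40, "max_megapixels": 4.0},
    "Tongyi-MAI/Z-Image-Turbo": {
        "steps": 8, "guidance": 1.0, "negative_prompt": False,
        "est_vram_gb": 16, "max_megapixels": 4.0},
}

# Fallback defaults stated in the README for unknown models.
FALLBACK = {"steps": 28, "guidance": 5.0, "width": 1024, "height": 1024}


def get_profile(model_id):
    """Return the profile for model_id, or the fallback defaults if unknown."""
    return dict(PROFILES.get(model_id, FALLBACK))
```

A real implementation might match on substrings or normalize case; exact matching keeps the sketch simple.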

FLUX Klein Resolution Presets

FLUX Klein models support resolutions from standard to 4K:

Preset	Resolution	Megapixels	Use Case
`1:1`	1024x1024	1.0 MP	Default, fast
`16:9`	1024x576	0.6 MP	Widescreen, fast
`9:16`	576x1024	0.6 MP	Portrait, fast
`1:1_2k`	2048x2048	4.2 MP	High detail square
`16:9_2k`	2560x1440	3.7 MP	2K widescreen
`4:3_2k`	2048x1536	3.1 MP	2K standard
`16:9_4k`	3840x2160	8.3 MP	4K, maximum detail
`9:16_4k`	2160x3840	8.3 MP	4K portrait

4K resolutions take significantly longer and use more VRAM than the standard presets. Use them only when the user needs high-resolution output (wallpapers, print-quality images, posters).
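The presets above, together with the resolution validation described under Features (snapping to valid multiples and clamping to megapixel limits), can be sketched as follows. The snap multiple of 16 and the 'WIDTHxHEIGHT' string form are assumptions for illustration; the preset values come from the table, and megapixels are counted as millions of pixels to match it.

```python
import math

# Preset sizes from the FLUX Klein table above.
FLUX_KLEIN_PRESETS = {
    "1:1": (1024, 1024), "16:9": (1024, 576), "9:16": (576, 1024),
    "1:1_2k": (2048, 2048), "16:9_2k": (2560, 1440), "4:3_2k": (2048, 1536),
    "16:9_4k": (3840, 2160), "9:16_4k": (2160, 3840),
}


def snap_resolution(width, height, multiple=16, max_megapixels=9.0):
    """Snap both sides to a multiple, then shrink to fit the megapixel budget."""
    def snap(value):
        return max(multiple, round(value / multiple) * multiple)

    w, h = snap(width), snap(height)
    max_pixels = int(max_megapixels * 1_000_000)
    if w * h > max_pixels:
        # Scale both sides equally, rounding down so the budget is never exceeded.
        scale = math.sqrt(max_pixels / (w * h))
        w = max(multiple, int(w * scale) // multiple * multiple)
        h = max(multiple, int(h * scale) // multiple * multiple)
    return w, h


def resolve_size(size, max_megapixels=9.0):
    """Accept a preset name such as '16:9_2k' or a raw 'WIDTHxHEIGHT' string."""
    if size in FLUX_KLEIN_PRESETS:
        width, height = FLUX_KLEIN_PRESETS[size]
    else:
        width, height = (int(part) for part in size.lower().split("x"))
    return snap_resolution(width, height, max_megapixels=max_megapixels)
```

Under these assumptions, a 4K preset passes through unchanged at the 9.0 MP limit, while the same request against a 4.0 MP model is scaled down proportionally.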

Prompt Writing Guide

The server includes model-specific prompt guidance that is returned by the get_model_info tool. The AI model should call get_model_info before its first generation to learn the loaded model's prompting rules. Below is a summary.

General Rules (All Models)

Write prompts as natural language prose, never comma-separated tags
Put the subject first — the first 10-20 words carry the most weight
Always specify lighting — it has the single greatest impact on quality
Ideal length: 30-80 words
Do NOT use quality tags like 8k, masterpiece, best quality, ultra HD — they waste tokens
Do NOT describe sequential actions — images are a single moment
Do NOT mix conflicting styles (e.g. "photorealistic oil painting")
For photorealism, reference real cameras: Shot on Sony A7IV, 85mm lens at f/2.0
For text in images, use quotation marks: The sign reads "OPEN"
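Several of the rules above are mechanical enough to check automatically. The sketch below is a hypothetical linter, not something the server provides: it flags prompts outside the 30-80 word range, the quality tags named above, and prompts that look like comma-separated tag lists. The tag-list heuristic (five or more chunks averaging under three words each) is an arbitrary assumption.

```python
import re

# Quality tags the guide says to avoid.
QUALITY_TAGS = ("8k", "masterpiece", "best quality", "ultra hd")


def lint_prompt(prompt):
    """Return a list of warnings for a prompt; an empty list means no issues found."""
    warnings = []
    words = prompt.split()
    if not 30 <= len(words) <= 80:
        warnings.append(f"length is {len(words)} words; 30-80 is recommended")
    lowered = prompt.lower()
    for tag in QUALITY_TAGS:
        if re.search(rf"\b{re.escape(tag)}\b", lowered):
            warnings.append(f"quality tag '{tag}' adds no visual information")
    # Heuristic: many short comma-separated chunks suggest a tag list, not prose.
    chunks = [chunk for chunk in prompt.split(",") if chunk.strip()]
    if len(chunks) >= 5 and len(words) / len(chunks) < 3:
        warnings.append("looks like comma-separated tags; write prose instead")
    return warnings
```

Checks such as "subject first" or "lighting specified" need semantic judgment and are left to the model writing the prompt.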

FLUX.2 Klein (Distilled Models)

Negative prompts are ignored at CFG 1.0 — do not send them
Be extra explicit and descriptive — no auto-enhancement available
Every word must contribute visual information; filler text hurts quality
Use emphasis phrases: "prominently featuring", "the focal point is"

Good prompt:

Professional studio product shot on polished concrete. A minimalist ceramic
coffee mug with matte black finish, steam rising from hot coffee, centered
in frame. Ultra-realistic commercial photography. Three-point softbox
lighting, diffused highlights, no harsh shadows. Shot with 85mm lens at f/5.6.

Bad prompt:

coffee mug, black, steam, studio, professional, 8k, masterpiece, best quality,
ultra detailed, sharp focus

Qwen-Image / Z-Image (Standard Models)

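Use negative prompts where the model accepts them; they improve results significantly. Per the Supported Models table, the Qwen-Image variants accept negative prompts, while Z-Image-Turbo is a distilled model that does not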
Good default negative: blurry, low quality, distorted, watermark, oversaturated, artificial
For portraits add: extra fingers, deformed hands, unnatural proportions
The model interprets prompts very literally — be precise
For text-heavy images, raise guidance_scale to 7.0
Wrap desired text in double quotes and specify font style
Use "photograph" instead of "photorealistic" or "3d render"

Good prompt:

A photograph of a futuristic sports car parked under neon city lights.
Reflections shimmer on wet asphalt streets. Dramatic side lighting with deep
shadows and vibrant highlights. "Night Racer" in metallic chrome text on the hood.

Negative prompt:

blurry, low quality, distorted, watermark, oversaturated
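The negative-prompt and guidance rules for the standard models can be combined into a small parameter builder. The sketch below uses the default negative prompt, the portrait additions, and the guidance values from this guide (5.0 by default per the Supported Models table, 7.0 for text-heavy images). The function name and the negative_prompt key are assumptions; only guidance_scale is named in the text above.

```python
DEFAULT_NEGATIVE = "blurry, low quality, distorted, watermark, oversaturated, artificial"
PORTRAIT_NEGATIVE = "extra fingers, deformed hands, unnatural proportions"


def standard_model_params(prompt, portrait=False, text_heavy=False):
    """Assemble generation parameters for a standard (non-distilled) model."""
    negative = DEFAULT_NEGATIVE
    if portrait:
        # Portraits benefit from extra anatomy-related negatives.
        negative = f"{negative}, {PORTRAIT_NEGATIVE}"
    return {
        "prompt": prompt,
        "negative_prompt": negative,
        # Raise guidance for text-heavy images, otherwise use the profile default.
        "guidance_scale": 7.0 if text_heavy else 5.0,
    }
```

For distilled models, which ignore negative prompts, the negative_prompt entry would simply be omitted.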

Running Tests

pip install pytest pytest-asyncio
pytest tests/ -v

License

MIT
