# vLLM Image MCP Server

An MCP (Model Context Protocol) server that provides AI-powered image generation tools via any vLLM-Omni-compatible endpoint. The server is model-aware and GPU-aware, and it supports both single and batch image generation with automatic parameter tuning based on the loaded model.

## Features

- 10 MCP tools for image generation, batch processing, progress monitoring, and system status
- Model-aware defaults: auto-detects the loaded model and applies optimal parameters
- 5 built-in model profiles: FLUX.2-klein 4B/9B, Qwen-Image variants, Z-Image-Turbo
- Multi-format output: save as PNG (lossless), JPG (quality=95), or WebP (smallest, ideal for web)
- Up to 4K resolution: FLUX Klein models support resolutions up to 3840x2160
- Flexible output paths: the AI model chooses where to save images per call (e.g. directly into your project's `assets/` folder)
- Batch generation with automatic concurrency control based on estimated VRAM
- Batch progress monitoring: poll running batches to track completion and prevent timeouts
- Resolution validation: snaps requested sizes to valid multiples and clamps them to the model's megapixel limit (see the sketch after this list)
- Aspect ratio presets: use shortcuts like `16:9`, `16:9_2k`, or `16:9_4k` instead of raw pixel dimensions
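To make the resolution-validation bullet concrete, here is a minimal sketch of how snapping and clamping might work. The step multiple (16) and the megapixel cap are illustrative assumptions, not the server's actual constants:

```python
# Hypothetical validation: clamp to a megapixel budget, then snap each
# dimension to the nearest valid multiple. Both constants are assumptions.
def validate_resolution(width: int, height: int,
                        multiple: int = 16,
                        max_megapixels: float = 9.0) -> tuple[int, int]:
    mp = (width * height) / 1_000_000
    if mp > max_megapixels:
        # Scale both dimensions down proportionally to fit the budget.
        scale = (max_megapixels / mp) ** 0.5
        width, height = int(width * scale), int(height * scale)
    snap = lambda v: max(multiple, round(v / multiple) * multiple)
    return snap(width), snap(height)

print(validate_resolution(3841, 2161))  # -> (3840, 2160)
```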

## Tools

| Tool | Description |
| --- | --- |
| `generate_image` | Generate a single image from a text prompt |
| `batch_generate` | Generate multiple images concurrently (max 20) |
| `get_model_info` | Get current model info and recommended parameters |
| `list_presets` | List aspect ratio presets for the current model |
| `estimate_generation` | Estimate time and resource usage before generating |
| `gpu_status` | Check GPU/VRAM availability for capacity planning |
| `server_health` | Check vLLM server connectivity and status |
| `cancel_batch` | Cancel a running batch generation |
| `check_batch_progress` | Check progress of a running batch generation |
| `list_active_batches` | List all currently running batch jobs |
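The planning tools combine naturally into a pre-flight check before expensive work. A minimal sketch; `estimate_generation`'s exact arguments are an assumption here, so consult the tool schema the server exposes:

```python
# Hypothetical pre-flight sequence before a large batch.
server_health()                      # is the vLLM server reachable?
gpu_status()                         # how much VRAM is free?
estimate_generation(prompt="A cat")  # rough time/VRAM cost before committing
```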

## Dynamic Output Directory

Both `generate_image` and `batch_generate` require an `output_dir` parameter. The AI model decides where to save images based on your project context, so no hardcoded paths are needed:

```python
generate_image(prompt="A hero banner", output_dir="./src/assets/images")
batch_generate(prompts=["cat", "dog"], output_dir="./public/img")
```

This means images land exactly where your project needs them.

## Output Formats

Both `generate_image` and `batch_generate` support a `format` parameter:

| Format | Use Case | Quality | File Size |
| --- | --- | --- | --- |
| `png` (default) | Lossless quality, transparency support | Lossless | Largest |
| `jpg` | General use, photographs | 95% quality | ~70% smaller |
| `webp` | Web projects, optimized delivery | 90% quality | ~80% smaller |

```python
generate_image(prompt="A hero banner", output_dir="./assets", format="webp")
batch_generate(prompts=["cat", "dog"], output_dir="./public", format="jpg")
```

The AI model chooses the format based on context (e.g. `webp` for web projects, `png` for graphic design).

## Batch Progress Monitoring

For long-running batch generations, use `check_batch_progress` to poll every ~50 seconds:

```python
# Start batch
batch_generate(prompts=[...], output_dir="./output")  # returns batch_id

# Poll progress
check_batch_progress(batch_id="batch_abc123def456")

# Discover running batches
list_active_batches()
```
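From a script (rather than an MCP client), the polling pattern might look like the following sketch. The response fields (`status`, `completed`, `total`) are assumptions about the tools' return shape, not a documented schema:

```python
import time

# Hypothetical polling loop; field names are illustrative assumptions.
result = batch_generate(prompts=["cat", "dog", "fox"], output_dir="./output")
batch_id = result["batch_id"]

while True:
    progress = check_batch_progress(batch_id=batch_id)
    print(f'{progress["completed"]}/{progress["total"]} images done')
    if progress["status"] in ("completed", "failed", "cancelled"):
        break
    time.sleep(50)  # poll roughly every 50 seconds, per the note above
```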

## Prerequisites

- Python 3.11+
- A running vLLM-Omni server with an image generation model loaded
- `pip` or `uv` for package installation

## Installation

### From source

```bash
git clone https://github.com/sumitchatterjee13/vllm-image-mcp-server.git
cd vllm-image-mcp-server
pip install -e .
```

### Verify installation

```bash
vllm-image-mcp --help
```

Or run directly:

```bash
python -m vllm_image_mcp.server --help
```

## Usage

### Standalone

```bash
# Default: connects to http://localhost:6655
vllm-image-mcp

# Custom vLLM server URL
vllm-image-mcp --vllm-url http://192.168.1.100:6655

# With custom timeout
vllm-image-mcp --vllm-url http://localhost:6655 --timeout 600
```

### CLI Arguments

| Argument | Default | Description |
| --- | --- | --- |
| `--vllm-url` | `http://localhost:6655` | Base URL of the vLLM-Omni server |
| `--max-concurrent` | auto | Max concurrent generations for batch jobs |
| `--timeout` | `300` | Request timeout in seconds |

**Note:** There is no `--output-dir` flag. The output path is provided by the AI model on every `generate_image` / `batch_generate` call, so images are saved wherever the project needs them.


## MCP Client Configuration

### Claude Code

**Option A:** CLI command:

```bash
claude mcp add vllm-image -- python -m vllm_image_mcp.server --vllm-url http://localhost:6655
```

**Option B:** Project config (`.mcp.json` in project root):

```json
{
  "mcpServers": {
    "vllm-image": {
      "command": "python",
      "args": [
        "-m", "vllm_image_mcp.server",
        "--vllm-url", "http://localhost:6655"
      ]
    }
  }
}
```

### Claude Desktop

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "vllm-image": {
      "command": "python",
      "args": [
        "-m", "vllm_image_mcp.server",
        "--vllm-url", "http://localhost:6655"
      ]
    }
  }
}
```

### Cursor

Create `.cursor/mcp.json` in your project root (or `~/.cursor/mcp.json` for global):

```json
{
  "mcpServers": {
    "vllm-image": {
      "command": "python",
      "args": [
        "-m", "vllm_image_mcp.server",
        "--vllm-url", "http://localhost:6655"
      ]
    }
  }
}
```

### Kilo Code

Create `.kilocode/mcp.json` in your project root:

```json
{
  "mcpServers": {
    "vllm-image": {
      "command": "python",
      "args": [
        "-m", "vllm_image_mcp.server",
        "--vllm-url", "http://localhost:6655"
      ],
      "alwaysAllow": [],
      "disabled": false
    }
  }
}
```

### Windows Note

On native Windows (not WSL/Git Bash), if you get "Connection closed" errors, wrap the command with `cmd`:

```json
{
  "mcpServers": {
    "vllm-image": {
      "command": "cmd",
      "args": [
        "/c", "python", "-m", "vllm_image_mcp.server",
        "--vllm-url", "http://localhost:6655"
      ]
    }
  }
}
```

### Remote vLLM Server

To connect to a vLLM server on another machine, change the `--vllm-url` argument:

```json
"args": [
  "-m", "vllm_image_mcp.server",
  "--vllm-url", "http://192.168.1.100:6655"
]
```

## Supported Models

The server includes built-in profiles with optimal defaults for these models:

| Model | Type | Steps | Guidance | Neg. Prompt | Est. VRAM | Max Resolution |
| --- | --- | --- | --- | --- | --- | --- |
| `black-forest-labs/FLUX.2-klein-4B` | Distilled | 4 | 1.0 | No | 13 GB | 4K (9.0 MP) |
| `black-forest-labs/FLUX.2-klein-9B` | Distilled | 4 | 1.0 | No | 29 GB | 4K (9.0 MP) |
| `Qwen/Qwen-Image-2512` | Standard | 28 | 5.0 | Yes | 40 GB | 2K (4.0 MP) |
| `Qwen/Qwen-Image` | Standard | 50 | 5.0 | Yes | 40 GB | 2K (4.0 MP) |
| `Tongyi-MAI/Z-Image-Turbo` | Distilled | 8 | 1.0 | No | 16 GB | 2K (4.0 MP) |

Unknown models automatically use fallback defaults (`steps=28`, `guidance=5.0`, 1024x1024).
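As a sketch of how profile selection with fallback might work (the dictionary mirrors the table above, but the structure and field names are illustrative assumptions, not the server's internals):

```python
# Hypothetical profile table mirroring the README's model table.
PROFILES = {
    "black-forest-labs/FLUX.2-klein-4B": {"steps": 4, "guidance": 1.0, "neg_prompt": False},
    "black-forest-labs/FLUX.2-klein-9B": {"steps": 4, "guidance": 1.0, "neg_prompt": False},
    "Qwen/Qwen-Image-2512":              {"steps": 28, "guidance": 5.0, "neg_prompt": True},
    "Qwen/Qwen-Image":                   {"steps": 50, "guidance": 5.0, "neg_prompt": True},
    "Tongyi-MAI/Z-Image-Turbo":          {"steps": 8, "guidance": 1.0, "neg_prompt": False},
}
FALLBACK = {"steps": 28, "guidance": 5.0, "neg_prompt": True}  # plus 1024x1024

def profile_for(model_id: str) -> dict:
    # Unknown models fall back to the generic defaults.
    return PROFILES.get(model_id, FALLBACK)
```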

## FLUX Klein Resolution Presets

FLUX Klein models support resolutions from standard to 4K:

| Preset | Resolution | Megapixels | Use Case |
| --- | --- | --- | --- |
| `1:1` | 1024x1024 | 1.0 MP | Default, fast |
| `16:9` | 1024x576 | 0.6 MP | Widescreen, fast |
| `9:16` | 576x1024 | 0.6 MP | Portrait, fast |
| `1:1_2k` | 2048x2048 | 4.2 MP | High-detail square |
| `16:9_2k` | 2560x1440 | 3.7 MP | 2K widescreen |
| `4:3_2k` | 2048x1536 | 3.1 MP | 2K standard |
| `16:9_4k` | 3840x2160 | 8.3 MP | 4K, maximum detail |
| `9:16_4k` | 2160x3840 | 8.3 MP | 4K portrait |

4K resolutions take significantly longer and use more VRAM. Use only when the user needs high-resolution output (wallpapers, print-quality, posters).
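For example, a preset could be passed in place of raw pixel dimensions. A minimal sketch, assuming a hypothetical `aspect_ratio` parameter; check `list_presets` and the tool schema for the actual argument name:

```python
# The "aspect_ratio" parameter name is an assumption for illustration;
# list_presets() shows which preset strings the loaded model accepts.
list_presets()
generate_image(prompt="A snowy mountain range at dawn, golden backlight",
               output_dir="./wallpapers",
               aspect_ratio="16:9_4k")  # 3840x2160, 8.3 MP
```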


## Prompt Writing Guide

The server includes model-specific prompt guidance that is returned by the `get_model_info` tool. The AI model should call `get_model_info` before its first generation to learn the loaded model's prompting rules. Below is a summary.
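As a sketch, the intended first-call workflow looks like the following; the exact shape of `get_model_info`'s return value is an assumption here:

```python
# Hypothetical workflow: fetch the loaded model's rules once, then generate.
info = get_model_info()  # recommended parameters and prompting rules
generate_image(prompt="A lighthouse at dusk, warm side lighting on wet rock",
               output_dir="./assets")
```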

### General Rules (All Models)

- Write prompts as natural language prose, never comma-separated tags
- Put the subject first: the first 10-20 words carry the most weight
- Always specify lighting: it has the single greatest impact on quality
- Ideal length: 30-80 words
- Do NOT use quality tags like `8k`, `masterpiece`, `best quality`, `ultra HD`; they waste tokens
- Do NOT describe sequential actions; images are a single moment
- Do NOT mix conflicting styles (e.g. "photorealistic oil painting")
- For photorealism, reference real cameras: `Shot on Sony A7IV, 85mm lens at f/2.0`
- For text in images, use quotation marks: `The sign reads "OPEN"`

### FLUX.2 Klein (Distilled Models)

- Negative prompts are ignored at CFG 1.0; do not send them
- Be extra explicit and descriptive; no auto-enhancement is available
- Every word must contribute visual information; filler text hurts quality
- Use emphasis phrases: "prominently featuring", "the focal point is"

**Good prompt:**

```text
Professional studio product shot on polished concrete. A minimalist ceramic
coffee mug with matte black finish, steam rising from hot coffee, centered
in frame. Ultra-realistic commercial photography. Three-point softbox
lighting, diffused highlights, no harsh shadows. Shot with 85mm lens at f/5.6.
```

**Bad prompt:**

```text
coffee mug, black, steam, studio, professional, 8k, masterpiece, best quality,
ultra detailed, sharp focus
```

### Qwen-Image / Z-Image (Standard Models)

- Use negative prompts; they improve results significantly
- Good default negative: `blurry, low quality, distorted, watermark, oversaturated, artificial`
- For portraits, add: `extra fingers, deformed hands, unnatural proportions`
- The model interprets prompts very literally; be precise
- For text-heavy images, raise `guidance_scale` to 7.0
- Wrap desired text in double quotes and specify a font style
- Use "photograph" instead of "photorealistic" or "3d render"

**Good prompt:**

```text
A futuristic sports car parked under neon city lights, photorealistic style.
Reflections shimmer on wet asphalt streets. Dramatic side lighting with deep
shadows and vibrant highlights. "Night Racer" in metallic chrome text on the hood.
```

**Negative prompt:**

```text
blurry, low quality, distorted, watermark, oversaturated
```

## Running Tests

```bash
pip install pytest pytest-asyncio
pytest tests/ -v
```

## License

MIT
