Complete guide for deploying Live VLM WebUI using Docker on all supported platforms.
The easiest way to run Live VLM WebUI in Docker:
```bash
./scripts/start_container.sh
```

What the script does:
- ✅ Auto-detects your platform (PC, Jetson Orin, Jetson Thor, Mac)
- ✅ Pulls the appropriate pre-built image from GitHub Container Registry
- ✅ Configures GPU access automatically
- ✅ Sets up correct runtime and permissions
- ✅ Starts the container with optimal settings
Supported platforms:
- x86_64 PC
- NVIDIA DGX Spark
- NVIDIA Jetson AGX Orin
- NVIDIA Jetson Orin Nano
- NVIDIA Jetson AGX Thor
⚠️ Mac (cannot reach a local inference server such as Ollama, due to Docker Desktop's networking limitations on macOS)
Example output:
```text
🚀 Starting Live VLM WebUI Docker Container
Platform detected: x86_64 PC
GPU: NVIDIA RTX 4090 detected
Pulling image: ghcr.io/nvidia-ai-iot/live-vlm-webui:latest
Starting container...
✅ Container started successfully!
Access at: https://localhost:8090
```
Docker is strongly recommended for Jetson platforms:
- ✅ Works immediately - No platform-specific Python/pip setup
- ✅ Isolated environment - No system package conflicts
- ✅ Full GPU monitoring - jtop included and configured
- ✅ Production-ready - Tested and optimized
- ✅ No Python version conflicts - Self-contained environment
- ✅ Easy updates - `docker pull` to get the latest version
Alternative (pip): Possible but requires more manual setup. See main README for pip installation.
For advanced users who want fine-grained control over Docker configuration.
For x86_64 PC (host networking):

```bash
docker run -d \
  --name live-vlm-webui \
  --network host \
  --gpus all \
  ghcr.io/nvidia-ai-iot/live-vlm-webui:latest
# Access at: https://localhost:8090
```

With port mapping instead of host networking:

```bash
docker run -d \
  --name live-vlm-webui \
  -p 8090:8090 \
  ghcr.io/nvidia-ai-iot/live-vlm-webui:latest
# Access at: https://localhost:8090
```

For Jetson Orin:

```bash
docker run -d \
  --name live-vlm-webui \
  --network host \
  --runtime nvidia \
  --privileged \
  -v /run/jtop.sock:/run/jtop.sock:ro \
  ghcr.io/nvidia-ai-iot/live-vlm-webui:latest-jetson-orin
# Access at: https://localhost:8090
```

Note: `--privileged` and the jtop socket mount are required for GPU monitoring.

For Jetson Thor:

```bash
docker run -d \
  --name live-vlm-webui \
  --network host \
  --gpus all \
  --privileged \
  -v /run/jtop.sock:/run/jtop.sock:ro \
  ghcr.io/nvidia-ai-iot/live-vlm-webui:latest-jetson-thor
# Access at: https://localhost:8090
```

Note: Thor uses `--gpus all` (SBSA-compliant) instead of `--runtime nvidia`.

For Mac:

```bash
docker run -d \
  --name live-vlm-webui \
  -p 8090:8090 \
  ghcr.io/nvidia-ai-iot/live-vlm-webui:latest-mac
# Access at: https://localhost:8090
```

Note: No GPU support on Mac in Docker; CPU monitoring only.
Stop the container:

```bash
./scripts/stop_container.sh
# OR manually:
docker stop live-vlm-webui
```

Restart:

```bash
docker restart live-vlm-webui
```

View logs:

```bash
docker logs live-vlm-webui
# Follow logs in real-time:
docker logs -f live-vlm-webui
```

Remove the container:

```bash
docker rm -f live-vlm-webui
```

Update to the latest version:

```bash
# Stop and remove old container
docker stop live-vlm-webui
docker rm live-vlm-webui

# Pull latest image
docker pull ghcr.io/nvidia-ai-iot/live-vlm-webui:latest

# Start new container
./scripts/start_container.sh
```

For x86_64 PC:
```bash
docker build -f docker/Dockerfile -t live-vlm-webui:x86 .
```

For Jetson Orin:

```bash
docker build -f docker/Dockerfile.jetson-orin -t live-vlm-webui:jetson-orin .
```

For Jetson Thor:

```bash
docker build -f docker/Dockerfile.jetson-thor -t live-vlm-webui:jetson-thor .
```

For Mac:

```bash
docker build -f docker/Dockerfile.mac -t live-vlm-webui:mac .
```

Build for multiple platforms at once:

```bash
./scripts/build_multiarch.sh
```

This builds images for both amd64 and arm64.
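The multi-arch script presumably drives `docker buildx`; a rough sketch of an equivalent manual invocation (the builder name, tag, and registry below are illustrative, not taken from the script):

```bash
# Create (once) and select a multi-platform builder
docker buildx create --name multiarch-builder --use

# Build for amd64 and arm64 in one pass; --push publishes the
# multi-arch manifest (multi-platform builds cannot use --load)
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -f docker/Dockerfile \
  -t your-registry/live-vlm-webui:latest \
  --push .
```

Note that `--push` requires write access to the target registry; drop it to verify the build only.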
Use `--network host` when connecting to services on the same host:

```bash
docker run -d \
  --name live-vlm-webui \
  --network host \
  --gpus all \
  live-vlm-webui:x86
```

Benefits:

- ✅ Container can access `localhost:11434` (Ollama)
- ✅ Container can access `localhost:8000` (vLLM, NIM)
- ✅ No port mapping needed
Use `-p` port mapping when connecting to remote services:

```bash
docker run -d \
  --name live-vlm-webui \
  -p 8090:8090 \
  --gpus all \
  -e VLM_API_BASE=http://your-vlm-server:8000/v1 \
  -e VLM_MODEL=llama-3.2-11b-vision-instruct \
  live-vlm-webui:x86
```

Configure the application using environment variables:
| Variable | Default | Description |
|---|---|---|
| `VLM_API_BASE` | Auto-detected | VLM API endpoint URL |
| `VLM_MODEL` | Auto-detected | Model name to use |
| `VLM_PROMPT` | "Describe..." | Default prompt |
| `VLM_API_KEY` | - | API key (for cloud services) |
| `PORT` | 8090 | Server port |
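For instance, the variables from the table can be combined in a single `docker run`; the values below are illustrative only:

```bash
# Run on a different port with a custom default prompt
docker run -d \
  --name live-vlm-webui \
  --network host \
  --gpus all \
  -e PORT=8091 \
  -e VLM_PROMPT="What objects do you see?" \
  ghcr.io/nvidia-ai-iot/live-vlm-webui:latest
# Access at: https://localhost:8091
```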
For production deployment with your own SSL certificates:
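If you only need a certificate pair for local testing, a self-signed one can be generated with openssl before running the command below (the subject and validity period here are illustrative):

```bash
# Generate a self-signed cert/key pair valid for one year
openssl req -x509 -newkey rsa:4096 -nodes \
  -keyout key.pem -out cert.pem \
  -days 365 -subj "/CN=localhost"
```

Browsers will warn about self-signed certificates; use CA-issued ones for real production deployments.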
```bash
docker run -d \
  --name live-vlm-webui \
  -p 8090:8090 \
  -v /path/to/your/cert.pem:/app/cert.pem:ro \
  -v /path/to/your/key.pem:/app/key.pem:ro \
  live-vlm-webui:x86
```

All images are available on GitHub Container Registry:
| Image Tag | Platform | Base Image | Size | GPU Support |
|---|---|---|---|---|
| `latest` | x86_64 PC | Ubuntu 22.04 + CUDA 12.4 | ~1.5GB | NVIDIA GPU |
| `latest-jetson-orin` | Jetson Orin | L4T r36.2 | ~1.2GB | Jetson GPU |
| `latest-jetson-thor` | Jetson Thor | Ubuntu 24.04 + CUDA 13.0 | ~1.3GB | Jetson GPU |
| `latest-mac` | Mac | Ubuntu 22.04 | ~800MB | CPU only |
Pull a specific image:

```bash
docker pull ghcr.io/nvidia-ai-iot/live-vlm-webui:latest-jetson-orin
```

Base Image: `nvidia/cuda:12.4.1-runtime-ubuntu22.04`
- Includes NVIDIA CUDA runtime libraries for GPU monitoring via NVML
- Enables `pynvml` to query GPU name, utilization, VRAM, temperature, and power
- Compatible with NVIDIA drivers 545+ (GeForce, Quadro, Tesla, etc.)
- Image size: ~1.5GB (compressed)
Base Image: `nvcr.io/nvidia/l4t-base:r36.2.0` (L4T r36.2.0, JetPack 6.0)
- Optimized for the Jetson Orin platform (AGX Orin, Orin Nano, Orin NX)
- Uses `jtop` (jetson-stats from PyPI) for GPU monitoring
- Supports JetPack 6.x
- Image size: ~1.2GB (compressed)
Base Image: `nvcr.io/nvidia/cuda:13.0.0-runtime-ubuntu24.04`
- Jetson Thor is SBSA-compliant: it uses standard NGC CUDA containers (no L4T-specific images needed!)
- This is a major architectural change from previous Jetsons (Orin, Xavier)
- Uses `jtop` (jetson-stats from GitHub) for the latest Thor GPU monitoring support
- Ubuntu 24.04 base (aligned with JetPack 7.x)
- Reference: Jetson Thor CUDA Setup Guide
Why separate Dockerfiles?
- Jetson Orin: Requires L4T-specific base images (`l4t-base:r36.x`)
- Jetson Thor: SBSA-compliant, uses standard CUDA containers
- Monitoring: Both use `jtop` for GPU stats (NVML support is limited on Jetson)
- jetson-stats source: Orin uses PyPI (stable), Thor uses GitHub (bleeding-edge support)
Check logs:

```bash
docker logs live-vlm-webui
```

Common issues:
- Port 8090 already in use: Use `-p 8091:8090` to map to a different port
- GPU not accessible: Ensure the NVIDIA Docker runtime is installed
- Permission denied: Try adding `sudo` or check Docker group membership
For PC/Workstation:
- Ensure the `--gpus all` flag is used
- Verify the NVIDIA Docker runtime:

```bash
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
```

For Jetson:
- Ensure jtop is running on the host: `sudo systemctl status jtop`
- Verify the socket mount: `-v /run/jtop.sock:/run/jtop.sock:ro`
- Check that the `--privileged` flag is set
Solution: Use `--network host` instead of `-p 8090:8090`.
Reason: Bridge network mode isolates container networking. With `--network host`, the container can access `localhost:11434` (Ollama), `localhost:8000` (vLLM), etc.
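To confirm that a host-networked container can actually reach a local backend, one quick check (assuming Ollama's default endpoint, and that `curl` is available inside the image) is:

```bash
# Should print Ollama's model list as JSON if the backend is reachable
docker exec live-vlm-webui curl -s http://localhost:11434/api/tags
```

An empty response or connection error here points to a networking problem rather than an application one.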
Tip: Pre-pull images on slow connections:

```bash
docker pull ghcr.io/nvidia-ai-iot/live-vlm-webui:latest
```

Map to a different port:

```bash
docker run -d -p 9090:8090 --name live-vlm-webui \
  ghcr.io/nvidia-ai-iot/live-vlm-webui:latest
# Access at: https://localhost:9090
```

- Main Documentation: README.md
- Troubleshooting Guide: docs/troubleshooting.md
- VLM Backend Setup: See README for Ollama, vLLM, NVIDIA API setup
- Docker Compose: See `docker-compose.yml` in the repository root
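Assuming the repository's `docker-compose.yml` defines the service and you have a recent Docker with the Compose plugin, the Compose workflow is roughly:

```bash
# From the repository root
docker compose up -d                          # start (pulls the image if needed)
docker compose logs -f                        # follow logs
docker compose pull && docker compose up -d   # update to the latest image
docker compose down                           # stop and remove
```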
Issues or questions?
- GitHub Issues: https://github.com/nvidia-ai-iot/live-vlm-webui/issues
- Check troubleshooting guide first
- Include platform info and logs when reporting issues