fix: repair production Docker Compose (#400) #420
Open
Five bugs prevented `docker compose up` (production mode) from working:
1. **Manager healthcheck wrong port** — compose set `PORT: 8000` but
the healthcheck URL used `:8080`. The agent service depended on this
healthcheck, so startup would hang indefinitely.
2. **`DATABASEAUTO_MIGRATE` typo** — missing double-underscore separator.
The env var was never read (auto-migration is always enabled in
`main.py`). Removed the dead variable to avoid confusion.
3. **Web port mapping** — nginx inside the web container listens on 8080
(see `nginx.conf`). The compose file mapped `3000:3000`, so the host
got a "connection refused" on port 3000. Fixed to `3000:8080`.
4. **`CHOKIDAR_USEPOLLING=true` in prod** — a Vite/Webpack hot-reload
env var that serves no purpose in the production nginx image. Removed.
5. **`VITE_API_URL` vs `API_URL`** — `docker-entrypoint.sh` injects the
API base URL via `${API_URL:-}` into `config.js` at container start.
The compose file set `VITE_API_URL`, which the entrypoint never reads.
Renamed to `API_URL`.
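
Fix 5 comes down to how the `${API_URL:-}` expansion behaves. The sketch below is a hypothetical stand-in for `docker-entrypoint.sh`, not the real script: `render_config`, the `window.__CONFIG__` layout, and the `http://localhost:8000` value are assumptions for illustration. Only the `${API_URL:-}` expansion is taken from the description above.

```shell
#!/bin/sh
# Hypothetical stand-in for the substitution done by docker-entrypoint.sh.
# The real config.js layout may differ; only ${API_URL:-} comes from the PR.
render_config() {
  printf 'window.__CONFIG__ = { apiUrl: "%s" };\n' "${API_URL:-}"
}

# Old compose value: only VITE_API_URL is set, so the expansion is empty.
( unset API_URL; VITE_API_URL="http://localhost:8000"; export VITE_API_URL; render_config )
# prints: window.__CONFIG__ = { apiUrl: "" };

# Fixed compose value: API_URL is set and ends up in the rendered file.
( API_URL="http://localhost:8000"; export API_URL; render_config )
# prints: window.__CONFIG__ = { apiUrl: "http://localhost:8000" };
```
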
Also fixed in `docker-compose.build.yml`:
6. **`IDUN_MANAGER_HOST` with path** — the engine's `with_config_from_api`
already appends `/api/v1/agents/config` to the base host. The previous
value included the full path, resulting in a doubled URL.
Trimmed to the base URL.
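
The doubled URL is easy to reproduce in isolation. This is a shell stand-in for the behaviour described for `with_config_from_api`, not the engine's actual code; `config_url` is a hypothetical helper and the `http://manager:8000` host is an assumed value.

```shell
#!/bin/sh
# Stand-in for the described behaviour: the fixed path is appended to
# whatever IDUN_MANAGER_HOST contains. Not the engine's real implementation.
config_url() {
  printf '%s/api/v1/agents/config\n' "$1"
}

# Previous value (host name and port are assumed for illustration):
config_url "http://manager:8000/api/v1/agents/config"
# prints: http://manager:8000/api/v1/agents/config/api/v1/agents/config

# Trimmed value:
config_url "http://manager:8000"
# prints: http://manager:8000/api/v1/agents/config
```
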
Additionally:
- Re-enabled the `idun_network` Docker network (was commented out) so
all services communicate over an explicit bridge rather than the
implicit default network.
- Added CI workflow `.github/workflows/smoke-test-compose.yml` that
starts the production stack on every PR touching compose/Dockerfile
paths and asserts the Manager health endpoint and web UI respond.
- Updated `docs/deployment/overview.mdx` and `docs/quickstart.mdx` to
document both the production (pre-built images) and development (build
from source) compose variants, replacing the outdated dev-only example.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The smoke test was trying to pull `freezaa9/idun-ai:0.5.1`, an image that doesn't exist on Docker Hub until a release is published. This caused the CI job to fail immediately with "manifest unknown".

Fix: build the manager and web images directly from their Dockerfiles in the smoke test, then tag them to match the names expected by `docker-compose.yml`. This lets CI test the exact same compose configuration end-users run without depending on published images.

Also added a separate `validate-compose` job that validates the YAML syntax of both `docker-compose.yml` and `docker-compose.build.yml` using `docker compose config --quiet`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
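
The build-then-tag approach can be sketched roughly as follows. This is not the workflow's actual script: the web image name and both build contexts are hypothetical placeholders, and the `run` wrapper is an added convenience that only prints commands unless `DRY_RUN=0` is set. Only the manager image name and the two compose file names come from the commit message.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (requires Docker).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Manager image name is the one the smoke test failed to pull.
# The web image name and both build contexts are hypothetical placeholders.
MANAGER_IMAGE="${MANAGER_IMAGE:-freezaa9/idun-ai:0.5.1}"
WEB_IMAGE="${WEB_IMAGE:-example/idun-web:0.5.1}"
MANAGER_CONTEXT="${MANAGER_CONTEXT:-.}"
WEB_CONTEXT="${WEB_CONTEXT:-./web}"

# Build locally under the tags the compose file expects, so that
# `docker compose up` finds the images without pulling.
build_local_images() {
  run docker build -t "$MANAGER_IMAGE" "$MANAGER_CONTEXT"
  run docker build -t "$WEB_IMAGE" "$WEB_CONTEXT"
}

# Equivalent of the validate-compose job: parse both files, print nothing on success.
validate_compose_files() {
  run docker compose -f docker-compose.yml config --quiet
  run docker compose -f docker-compose.build.yml config --quiet
}

build_local_images
validate_compose_files
```

Because Compose only pulls an image when no local image with that tag exists, building under the expected tag first is enough to keep the production compose file unchanged in CI.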
Context

Closes #400. Running `docker compose up` (the production stack using pre-built images) was broken in several ways, making it impossible to start the platform without editing the compose file manually.

What was broken

Six bugs were identified by cross-referencing `docker-compose.yml` with the Manager Dockerfile, the web `nginx.conf`, `docker-entrypoint.sh`, and the engine source code:

| # | File | Bug | Impact |
|---|------|-----|--------|
| 1 | `docker-compose.yml` | Manager healthcheck URL used `:8080` but compose set `PORT: 8000` | The agent service depends on this healthcheck, so startup hung indefinitely |
| 2 | `docker-compose.yml` | `DATABASEAUTO_MIGRATE: true` (missing `__` separators) | The variable was never read; auto-migration is always enabled in `main.py` |
| 3 | `docker-compose.yml` | Web port mapped `3000:3000` but nginx listens on `8080` | `connection refused` on host port 3000 |
| 4 | `docker-compose.yml` | `CHOKIDAR_USEPOLLING=true` in web service | Vite/Webpack hot-reload variable that serves no purpose in the production nginx image |
| 5 | `docker-compose.yml` | `VITE_API_URL` set instead of `API_URL` | `docker-entrypoint.sh` reads `${API_URL:-}`, not `VITE_API_URL` → web UI always pointed at wrong/empty API URL |
| 6 | `docker-compose.build.yml` | `IDUN_MANAGER_HOST` had full `/api/v1/agents/config` path | `with_config_from_api` appends this path, resulting in a doubled URL |

Changes

- `docker-compose.yml`: fixes 1 to 5 above, plus re-enables the explicit `idun_network` (was fully commented out) so services communicate over an explicit bridge.
- `docker-compose.build.yml`: fix 6, trimming `IDUN_MANAGER_HOST` to the base URL only.
- `.github/workflows/smoke-test-compose.yml` (new): CI smoke test that:
  - runs on every PR touching `docker-compose.yml`, Dockerfiles, or nginx config
  - starts `db + manager + web` using the production compose
  - asserts `GET /api/v1/healthz` returns `{"status": ...}`
- `docs/deployment/overview.mdx`: rewrites the "Managed (full stack)" tab to document the production `docker compose up` path (pre-built images). Adds a "Development" tab for the `docker-compose.dev.yml` build-from-source path. Updates the embedded YAML example to match the real file.
- `docs/quickstart.mdx`: adds a nested tabs block in the "Start the platform" step showing both the prod and dev compose commands.

How to test manually
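
A minimal sketch of one way to check the stack by hand, under stated assumptions: the Manager is published on host port 8000 (inferred from `PORT: 8000`), the web UI on host port 3000 (from the `3000:8080` mapping), and `wait_for` is a hypothetical helper rather than anything shipped in the repository.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it answers without an HTTP error
# (curl -f treats status codes of 400 and above as failures).
wait_for() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_INTERVAL:-2}"
  done
  echo "timed out waiting for $url" >&2
  return 1
}

# Typical manual run (host ports are assumptions, see above):
#   docker compose up -d
#   wait_for http://localhost:8000/api/v1/healthz
#   wait_for http://localhost:3000/
#   docker compose down
```

Polling instead of a single request avoids false failures while the database and Manager are still starting.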
Tests
- `uv run pytest services/idun_agent_manager/tests`
- `docker compose config --quiet`

🤖 Generated with Claude Code