feat(ci): build nightly distribution images from source #6068
Draft
cdoern wants to merge 1 commit into
Add a nightly workflow that builds distribution container images from the current main source tree (INSTALL_MODE=editable), boot-smoke-tests each image on a native per-arch runner, and publishes a multi-arch manifest to DockerHub. Building from source removes the test.pypi propagation wait that made the previous nightly docker builds flaky.

Images are published as :nightly, :<date>, and :<short-sha>. The :latest tag is deliberately left to the release pipeline so it continues to mean "latest stable release"; pull :nightly to track main.

On pull requests touching build-relevant paths the workflow builds and boot-smoke-tests the starter distro on amd64 without pushing, so startup regressions are caught before merge.

Remove the schedule trigger from the pypi.yml docker job, since nightly images are now built from source here. Release and workflow_dispatch image builds in pypi.yml are unchanged, and the nightly test.pypi package publish is untouched.

Add scripts/smoke-test-distro.sh, which boots a built image, waits for /v1/health, and verifies /v1/models responds.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
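
For readers who want the smoke-test flow from the commit message in one place (wait for /v1/health, then check /v1/models), here is a rough Python sketch of that logic. It is not the contents of scripts/smoke-test-distro.sh. The function names, the retry count and delay, and the assumption that a healthy response body contains the string "OK" are all illustrative. Starting the container and dumping its logs are left out.

```python
import json
import time
import urllib.request
from typing import Callable, Tuple

Fetch = Callable[[str], Tuple[int, str]]


def http_get(url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Return (status, body) for a GET; raises OSError if the server is unreachable."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


def wait_for_health(base_url: str, fetch: Fetch = http_get, attempts: int = 30,
                    delay: float = 2.0, sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll /v1/health until it answers 200 with "OK" in the body (assumed shape)."""
    for _ in range(attempts):
        try:
            status, body = fetch(f"{base_url}/v1/health")
            if status == 200 and "OK" in body:
                return True
        except OSError:
            pass  # server is not accepting connections yet; retry
        sleep(delay)
    return False


def models_endpoint_ok(base_url: str, fetch: Fetch = http_get) -> bool:
    """Check that /v1/models answers 200 with a body that parses as JSON."""
    try:
        status, body = fetch(f"{base_url}/v1/models")
        json.loads(body)
    except (OSError, ValueError):
        return False
    return status == 200


def smoke_test(base_url: str, fetch: Fetch = http_get, **wait_kwargs) -> bool:
    """Health first, then models; False means the caller should dump container logs."""
    return wait_for_health(base_url, fetch, **wait_kwargs) and models_endpoint_ok(base_url, fetch)
```

Calling `smoke_test("http://localhost:<port>")` against a locally started container would exercise both checks. The port depends on how the image is run and is not specified here. The `fetch` and `sleep` parameters exist only so the logic can be exercised without a running server.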
What
Adds a maintained nightly distribution-image build so users can pull a trustworthy "latest main" container, and retires the flaky test.pypi-based nightly docker build.
nightly-distro.yml (new)
- Builds distribution images from the current main source tree (INSTALL_MODE=editable) every night at 02:00 UTC. Building from source removes the test.pypi propagation wait that was the main flake source in the old nightly.
- Builds on native per-arch runners (ubuntu-24.04 + ubuntu-24.04-arm), with no QEMU at runtime, so each image is boot-smoke-tested on its real architecture before it's trusted.
- Publishes a multi-arch manifest to DockerHub with docker buildx imagetools create.
- Tags: :nightly, :<YYYYMMDD>, :<short-sha>. :latest is intentionally not touched; it stays owned by the release pipeline and means "latest stable release". Pull :nightly to track main.
- On pull requests touching build-relevant paths, builds and boot-smoke-tests the starter distro on amd64 only, with no push, so a change that breaks server startup is caught before merge instead of at the next nightly. (providers-build.yml builds venv-only on PRs, so nothing booted a distro container per-PR before this.)

pypi.yml (changed)
- Removes the schedule trigger from the publish-docker-images job. Nightly images are now built from source by nightly-distro.yml. Release and workflow_dispatch image builds are unchanged, and the nightly test.pypi package publish is untouched.

scripts/smoke-test-distro.sh (new)
- Boots a built image, polls /v1/health for OK, asserts /v1/models returns valid JSON, and dumps container logs on failure. Reusable locally and in CI.

Test plan
Static checks:
- actionlint on both workflows: clean
- shellcheck scripts/smoke-test-distro.sh: clean
- pull_request / schedule / workflow_dispatch:
  - [starter/amd64], push=false
  - [starter/amd64] [starter/arm64] [postgres-demo/amd64] [postgres-demo/arm64], push=true

End-to-end, locally (native arm64):
Output:
(The 401 provider model-refresh errors from the dummy keys are expected and non-fatal; the server still starts and serves.)

Notes / follow-ups
- Requires ubuntu-24.04-arm GitHub-hosted runners enabled for the org.
- Uses the existing DOCKERHUB_USERNAME / DOCKERHUB_TOKEN secrets; nothing new to provision.
- Distros: starter, postgres-demo; expand later as needed.

🤖 Generated with Claude Code
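
As a reading aid, the build matrix and tag scheme described in this PR can be modeled in a few lines of Python. This is a sketch only, not the workflow's actual logic. The function names are hypothetical, the 7-character short SHA is an assumption, and treating workflow_dispatch the same as schedule (full matrix, push=true) is inferred from the test plan rather than stated outright.

```python
from datetime import datetime, timezone
from typing import Dict, List

# Distros and architectures named in this PR; extend the lists to add more.
DISTROS = ["starter", "postgres-demo"]
ARCHES = ["amd64", "arm64"]


def build_plan(event: str) -> Dict[str, object]:
    """Map a GitHub event name to the (distro, arch) jobs to run and whether to push."""
    if event == "pull_request":
        # PRs only build and boot the starter distro on amd64, without publishing.
        return {"matrix": [("starter", "amd64")], "push": False}
    if event in ("schedule", "workflow_dispatch"):
        # Assumption: manual dispatch behaves like the nightly schedule.
        return {"matrix": [(d, a) for d in DISTROS for a in ARCHES], "push": True}
    raise ValueError(f"unhandled event: {event}")


def nightly_tags(now: datetime, commit_sha: str, short_len: int = 7) -> List[str]:
    """Return the tags for one nightly build; :latest is never included."""
    date_tag = now.astimezone(timezone.utc).strftime("%Y%m%d")
    return ["nightly", date_tag, commit_sha[:short_len]]
```

Keeping the distro and arch lists as data means adding a distro later is a one-line change, which matches the "expand later as needed" note above.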