Update: Tracked down the source of the pod-ephemeral pressure and resolved the evictions.

While BuildKit’s rootless state was mounted at /app-root/.local/share/buildkit on a PVC, I noticed the writable layer was still growing under /var/lib. A quick check showed a rootful BuildKit daemon writing to /var/lib/buildkit:

du -xhd1 /var/lib 2>/dev/null | sort -h

Fix applied: I mounted a volume at /var/lib/buildkit (in addition to the existing PVC at /app-root/.local/share/buildkit). After that change, I’ve had zero evictions across 10 successful builds.

If helpful for others, two actionable follow-ups that would make this more robust by default:

Docs clarity: Call out that a rootful BuildKit daemon (e.g., via buildctl-daemonless.sh or the default socket) will use /var/lib/buildkit, which counts toward pod-ephemeral storage. Suggest mounting a volume there as a safety net, or ensuring only the rootless daemon is used.

Chart option: Consider exposing an optional volume mount for /var/lib/buildkit behind a values.yaml flag to prevent accidental rootful writes to the writable layer.

I’d really appreciate any suggestions or hints if I’ve misconfigured something or missed best practices. I'm happy to adjust my setup based on your guidance. Thanks!
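For anyone who wants to repeat the check, here is a minimal sketch that reports which filesystem backs each BuildKit state directory. The helper names `mount_of` and `check_paths` are made up for this example, and the only assumption is that `df` and `awk` are available in the agent image. A path whose mount point resolves to `/` lives on the container's root filesystem, i.e. the writable layer.

```shell
#!/bin/sh
# Print the mount point of the filesystem that holds a path.
# If the path does not exist yet, fall back to its nearest existing parent.
mount_of() {
  p=$1
  while [ ! -e "$p" ] && [ "$p" != "/" ]; do
    p=$(dirname "$p")
  done
  # POSIX df output: the last column of the second line is the mount point.
  df -P "$p" | awk 'NR==2 {print $NF}'
}

# Report whether each directory sits on its own mount or falls through
# to the container root filesystem (the writable layer).
check_paths() {
  for dir in "$@"; do
    mp=$(mount_of "$dir")
    if [ "$mp" = "/" ]; then
      echo "$dir: on root filesystem (container writable layer)"
    else
      echo "$dir: on separate mount $mp"
    fi
  done
}

# The two paths discussed in this thread.
check_paths /var/lib/buildkit /app-root/.local/share/buildkit
```

Note that a separate mount is not automatically exempt: a disk-backed emptyDir volume still counts toward the pod's ephemeral storage, so the volume mounted at /var/lib/buildkit probably needs to be PVC-backed to actually help.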
This is intentional. The security model is based on agents that are used once to ensure isolation and idempotency, see the doc. Once you have pending jobs in your DevOps Server, new pods (managed by KEDA) will be created to consume the queue.

Since I am not using KEDA, no new pods will be scaled. That's a problem. I read in the docs that this is the default, but what about setups that do not use KEDA to scale with queue length?

In that scenario, do you find any written file in /app-root/.local/share/buildkit? If yes, what is the content of that file?

In that case, why not just change the PVC path to /var/lib/buildkit? It will depend on whether BuildKit is writing to /app-root/.local/share/buildkit.

Good idea, thanks.
Hi! I’m evaluating the Blue Agent Helm chart for building container images on Kubernetes and I’ve hit two issues.
Environment
Chart version: 12.0.1
Image: ghcr.io/clemlesne/blue-agent: (bookworm flavor)
Kubernetes: client v1.29.1, server v1.27.10+rke2r1
StorageClass: csi-sc-cinder
KEDA: not activated
1) Azure Pipelines agent does not reconnect after first successful job
After the first job completes, the agent does not reconnect to the DevOps server. The default start.sh seems to exit after the initial run, so I mounted a custom script via ConfigMap to keep it running. With that change, the container stays alive, but the agent still does not reconnect.
Custom start.sh:
`
`
If this exit-after-one-job behavior is intentional, what might be preventing the agent from reconnecting in my setup?
2) Pod evictions due to ephemeral storage, despite BuildKit PVC
I’m seeing evictions with:
ephemeral local storage usage exceeds the total limit of containers
Bumping the ephemeral limit helps temporarily but isn’t ideal since there is not much space on the nodes (50Gi). I tried moving BuildKit’s state to a volume per docs. (Side note: the docs link currently appears broken: link.)
`
`
Pipeline step (rootless BuildKit):
`
`
I can confirm the PVC is mounted at /app-root/.local/share/buildkit (separate filesystem), but the pod still accrues significant ephemeral usage and eventually gets evicted.
Questions / Request for guidance
Mount paths: For the bookworm image running BuildKit rootless with HOME=/app-root, is /app-root/.local/share/buildkit the correct path for BuildKit state? (I believe yes.)
Additional mounts: Do I also need to mount a volume at /tmp (or symlink /tmp → /app-root/.local/tmp) to prevent other tools from writing large intermediates to the container writable layer? The chart mounts a PVC-backed tmpdir at /app-root/.local/tmp, but some tools ignore $TMPDIR and use /tmp.
Workspace: Should AZP_WORK=/app-root/azp-work be on a PVC by default with the chart’s pipelines.cache settings, or should I add an explicit mount there as well?
start.sh: Is there anything in the default start.sh that’s required for BuildKit that I might have removed by supplying my own script?
Recommended GC: Any recommended buildkitd.toml GC settings for rootless usage (e.g., maxUsedSpace, reservedSpace) that you expect to work well with this chart?
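On the /tmp question above, a rough way to see whether a given tool honors $TMPDIR is to run it with TMPDIR pointed at an empty probe directory and check whether anything lands there. This is only a heuristic sketch: the helper name `honors_tmpdir` is made up for this example, and a tool that cleans up its temporary files before exiting will look the same as one that ignores $TMPDIR.

```shell
#!/bin/sh
# Run a command with TMPDIR set to a fresh probe directory, then report
# whether the command left any files behind in that directory.
honors_tmpdir() {
  probe=$(mktemp -d) || return 2
  TMPDIR=$probe "$@" >/dev/null 2>&1
  if [ -n "$(ls -A "$probe")" ]; then
    echo "wrote to TMPDIR"
  else
    echo "nothing in TMPDIR"
  fi
  rm -rf "$probe"
}

# Example: a command that explicitly writes under $TMPDIR.
honors_tmpdir sh -c 'touch "$TMPDIR/example"'
```

If a tool reports "nothing in TMPDIR" while /tmp grows during the same run, that tool is a candidate for a dedicated /tmp mount.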
Happy to provide rendered manifests or logs if that helps. Thanks for any pointers, and for the chart!