Builds stuck in "preparing build cache for export" stage #6008

@razzmatazz

Contributing guidelines and issue reporting guide

Well-formed report checklist

  • I have found a bug that the documentation does not mention anything about my problem
  • I have found a bug that there are no open or closed issues that are related to my problem
  • I have provided version/information about my environment and done my best to provide a reproducer

Description of bug

I am seeing BuildKit fail to leave the "preparing build cache for export" stage when building a large multi-stage image. The build does occasionally complete (after roughly 1-2 hours), but most of the time it appears to be stuck in the checkLoops/removeLoops functions, with the CPU pegged at 100%.

The docker buildx build invocation uses the registry cache: two --cache-from flags (type=registry; same repository, two different tags, in case that makes a difference) and a single --cache-to flag (type=registry).
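For readers who want to picture the cache setup, here is a minimal sketch of an invocation with that shape. The registry host, repository name, and tags are placeholders, not the real ones, and any flags beyond the cache options are omitted. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: CACHE_REPO and both tag names are hypothetical placeholders.
set -eu

CACHE_REPO="registry.example.com/team/app-cache"

# Two --cache-from sources (same repository, different tags), one --cache-to target.
set -- buildx build \
  --cache-from "type=registry,ref=${CACHE_REPO}:main" \
  --cache-from "type=registry,ref=${CACHE_REPO}:feature" \
  --cache-to "type=registry,ref=${CACHE_REPO}:feature" \
  .

if [ "${RUN:-0}" = "1" ]; then
  # Actually run the build (requires docker with the buildx plugin).
  docker "$@"
else
  # Dry run: show the command that would be executed.
  echo "docker $*"
fi
```

The hang described above happens after the build steps finish, while the cache for the --cache-to target is being prepared for export.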

pprof output:

(pprof) top50 -cum
Showing nodes accounting for 118.43s, 98.06% of 120.77s total
Dropped 139 nodes (cum <= 0.60s)
      flat  flat%   sum%        cum   cum%
         0     0%     0%    119.92s 99.30%  github.com/moby/buildkit/cache/remotecache/v1.(*CacheChains).Marshal
         0     0%     0%    119.92s 99.30%  github.com/moby/buildkit/cache/remotecache/v1.(*CacheChains).normalize
     9.64s  7.98%  7.98%    119.92s 99.30%  github.com/moby/buildkit/cache/remotecache/v1.(*normalizeState).checkLoops
         0     0%  7.98%    119.92s 99.30%  github.com/moby/buildkit/cache/remotecache/v1.(*normalizeState).removeLoops
         0     0%  7.98%    119.90s 99.28%  github.com/moby/buildkit/cache/remotecache.(*contentCacheExporter).Finalize
         0     0%  7.98%    119.70s 99.11%  github.com/moby/buildkit/solver/llbsolver.runCacheExporters.func1.1
         0     0%  7.98%    119.20s 98.70%  github.com/moby/buildkit/solver/llbsolver.inBuilderContext.func1
         0     0%  7.98%    117.89s 97.62%  github.com/moby/buildkit/solver.(*Job).InContext
         0     0%  7.98%    116.29s 96.29%  github.com/moby/buildkit/solver/llbsolver.inBuilderContext
         0     0%  7.98%    114.14s 94.51%  github.com/moby/buildkit/solver/llbsolver.runCacheExporters.func1
         0     0%  7.98%    111.42s 92.26%  golang.org/x/sync/errgroup.(*Group).Go.func1
    22.34s 18.50% 26.48%     33.73s 27.93%  runtime.mapiternext
     6.20s  5.13% 31.61%     32.42s 26.84%  runtime.mapiterinit
    14.93s 12.36% 43.98%     28.40s 23.52%  runtime.mapaccess2_faststr
    17.66s 14.62% 58.60%     17.66s 14.62%  aeshashbody
     0.61s  0.51% 59.10%     14.30s 11.84%  github.com/moby/buildkit/cache/remotecache/v1.(*normalizeState).checkLoops.func1
     7.89s  6.53% 65.64%     13.69s 11.34%  runtime.mapdelete_faststr
     5.69s  4.71% 70.35%     11.06s  9.16%  runtime.mapassign_faststr
     9.67s  8.01% 78.36%      9.67s  8.01%  runtime.add (inline)
     5.69s  4.71% 83.07%      9.03s  7.48%  runtime.mapaccess2_fast64
     2.53s  2.09% 85.16%      6.27s  5.19%  runtime.(*bmap).overflow (inline)
     2.39s  1.98% 87.14%      6.04s  5.00%  runtime.rand
     3.51s  2.91% 90.05%      3.51s  2.91%  runtime.isEmpty (inline)
     0.22s  0.18% 90.23%      3.35s  2.77%  internal/chacha8rand.(*State).Refill
     3.13s  2.59% 92.82%      3.13s  2.59%  internal/chacha8rand.block
     1.94s  1.61% 94.43%      1.94s  1.61%  runtime.memhash64
     1.43s  1.18% 95.61%      1.43s  1.18%  runtime.strhash
     0.88s  0.73% 96.34%      0.88s  0.73%  runtime.tophash (inline)
     0.71s  0.59% 96.93%      0.71s  0.59%  internal/abi.(*Type).Pointers (inline)
     0.68s  0.56% 97.49%      0.68s  0.56%  runtime.duffzero
     0.06s  0.05% 97.54%      0.66s  0.55%  runtime.bucketMask (inline)
     0.63s  0.52% 98.06%      0.63s  0.52%  runtime.bucketShift (inline)
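
A profile like the one above can be collected through the buildkitd debug address (0.0.0.0:6060 in the buildkitd.toml shown under "Version information"). The sketch below assumes the builder uses host networking, so the port is reachable on 127.0.0.1, and that the Go toolchain is installed on the machine running it. The 120-second sample length is an assumption chosen to roughly match the total shown above. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: the address and sample length are assumptions, adjust as needed.
set -eu

DEBUG_ADDR="127.0.0.1:6060"
SAMPLE_SECONDS="120"
PROFILE_URL="http://${DEBUG_ADDR}/debug/pprof/profile?seconds=${SAMPLE_SECONDS}"

if [ "${RUN:-0}" = "1" ]; then
  # Fetch a CPU profile and print the top 50 nodes sorted by cumulative time.
  go tool pprof -top -cum -nodecount=50 "$PROFILE_URL"
else
  # Dry run: show the command that would be executed.
  echo "go tool pprof -top -cum -nodecount=50 '$PROFILE_URL'"
fi
```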

Reproduction

This may be difficult to reproduce, and I cannot share the Dockerfile because it is private, but the problem does appear from time to time, and I believe #2009 is related.

Version information

Running buildkitd with the docker-container driver, v0.21.1 (the current moby/buildkit:buildx-stable-1).

~$ docker buildx version && docker buildx inspect
github.com/docker/buildx v0.24.0 d0e5e86
Name:          gha-runner-vm-builder
Driver:        docker-container
Last Activity: 2025-06-03 14:52:14 +0000 UTC

Nodes:
Name:                  gha-runner-vm-builder0
Endpoint:              unix:///var/run/docker.sock
Driver Options:        network="host"
Status:                running
BuildKit daemon flags: --allow-insecure-entitlement=network.host
BuildKit version:      v0.21.1
Platforms:             linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386
Labels:
 org.mobyproject.buildkit.worker.executor:         oci
 org.mobyproject.buildkit.worker.hostname:         mayhem-gha-runner
 org.mobyproject.buildkit.worker.network:          host
 org.mobyproject.buildkit.worker.oci.process-mode: sandbox
 org.mobyproject.buildkit.worker.selinux.enabled:  false
 org.mobyproject.buildkit.worker.snapshotter:      overlayfs
File#buildkitd.toml:
 > debug = true
 >
 > [grpc]
 >   debugAddress = "0.0.0.0:6060"
 >
 > [worker]
 >
 >   [worker.oci]
 >     gc = false
 >

and

~$ docker version && docker info
Client: Docker Engine - Community
 Version:           28.2.2
 API version:       1.50
 Go version:        go1.24.3
 Git commit:        e6534b4
 Built:             Fri May 30 12:07:27 2025
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          28.2.2
  API version:      1.50 (minimum version 1.24)
  Go version:       go1.24.3
  Git commit:       45873be
  Built:            Fri May 30 12:07:27 2025
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.27
  GitCommit:        05044ec0a9a75232cad458027ca83437aae3f4da
 runc:
  Version:          1.2.5
  GitCommit:        v1.2.5-0-g59923ef
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
Client: Docker Engine - Community
 Version:    28.2.2
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.24.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.36.2
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 108
 Server Version: 28.2.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05044ec0a9a75232cad458027ca83437aae3f4da
 runc version: v1.2.5-0-g59923ef
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
 Kernel Version: 6.8.0-60-generic
 Operating System: Ubuntu 24.04.2 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 48
 Total Memory: 47.03GiB
 Name: clint-vm-1
 ID: f696c190-9a0a-4598-b3a1-98f47405a8f0
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false
