iso: Fix minikube stop with vfkit and krunkit drivers #21089

Merged
merged 4 commits into from
Jul 20, 2025

Contributor

@nirs nirs commented Jul 18, 2025

With the current minikube kernel, stopping a vfkit or krunkit VM never completes:

% minikube stop                
✋  Stopping node "minikube"  ...

❌  Exiting due to GUEST_STOP_TIMEOUT: Unable to stop VM: Temporary Error: stop: Maximum number of retries (60) exceeded

The vfkit and krunkit drivers stop the VM by sending a POST request that changes the VM state, and vfkit or krunkit then triggers a graceful shutdown in the guest. Something is missing in our current kernel, so the guest ignores the shutdown request and the vfkit (or krunkit) child process never terminates.

minikube should handle this by falling back to SIGTERM after a reasonable timeout, and finally to SIGKILL. This belongs in the code that runs the driver processes, shared by all drivers, but we don't have such code.
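
A minimal sketch of the escalation described above, assuming the caller knows the pid of the vfkit or krunkit child process. The function name `stop_with_fallback` and the timeout are hypothetical and not part of minikube. The graceful request (the POST that changes the VM state) is assumed to have been sent and to have timed out before this runs.

```shell
# Hypothetical helper, not minikube code: stop a process by pid,
# escalating from SIGTERM to SIGKILL if it does not exit in time.
stop_with_fallback() {
    pid=$1
    timeout=${2:-10}   # seconds to wait after SIGTERM (assumed default)

    # Ask the process to terminate; if the signal cannot be sent,
    # the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    # Poll once per second until the process exits or the timeout expires.
    i=0
    while [ "$i" -lt "$timeout" ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done

    # Last resort: the process ignored SIGTERM.
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

Polling with `kill -0` keeps the sketch independent of whether the process is a child of the calling shell.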

The issue is fixed by creating a default kernel config for arm64 and adding the missing configs to our kernel config.

This was done in 2 steps:

  1. The first commit creates a default kernel config without unneeded parts (such as specific Arm platform support, multimedia support, and sound card support), and adds the missing configs to our kernel config. With this kernel we can start and stop successfully with qemu, vfkit, and krunkit using --no-kubernetes.

  2. The second commit adds back the configs removed by the first commit. With this we can start clusters normally. This commit will help us understand which configs we need to add to a default kernel config when we upgrade the kernel again.

Building the iso is currently broken by the go.work file. This change also disables the workspace to unbreak the iso build.

Testing

Tested with #20826 and kernel built locally.

Testing basic life cycle: start, stop, start, delete

  • vfkit
  • krunkit
  • qemu

Integration tests:

nirs added 2 commits July 18, 2025 21:15
Create default arm64 config and disable stuff that we cannot use in
a VM.

This change was generated by:

1. Create default arm64 config

       cd out/buildroot/output-aarch64/build/linux-6.6.95
       make ARCH=arm64 defconfig
       make ARCH=arm64 menuconfig
       (exit saving changes)

2. Disable features that we don't need in the minikube VM:

       - Platform support
         - all platforms
       - Device drivers
         - Multimedia support
         - Sound support

3. Update our linux defconfig

       cd out/buildroot/output-aarch64
       make linux-update-defconfig

4. Normalize the config

       make linux-menuconfig-aarch64
       (exit saving changes)

With this config qemu, vfkit, and krunkit boot with --no-kubernetes, and
graceful shutdown works in vfkit and krunkit (using --restful-uri).

We cannot start kubernetes yet since some features are not available in
the default architecture config.

This restores the configs removed by updating from the default
architecture config. These configs are required for kubernetes support.

After adding the removed configs, run `make linux-menuconfig-aarch64` to
normalize the config and remove multimedia and sound card support again.
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 18, 2025
@k8s-ci-robot k8s-ci-robot requested review from medyagh and prezha July 18, 2025 22:40
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 18, 2025
@k8s-ci-robot
Contributor

Hi @nirs. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 18, 2025
@nirs
Contributor Author

nirs commented Jul 18, 2025

ok-to-build-iso

@afbjorklund
Collaborator

afbjorklund commented Jul 19, 2025

> Something is missing in our current kernel, making the shutdown request to be ignored,

This sounds like it would be related to ACPI, which is what is shutting the machines down.

@nirs
Contributor Author

nirs commented Jul 19, 2025

> > Something is missing in our current kernel, making the shutdown request to be ignored,
>
> This sounds like it would be related to ACPI, which is what is shutting the machines down.

I agree, but when I tried to add only the default CONFIG_ACPI_* configs to our image it did not fix the issue. So there must be something else missing. Copying all the default configs is ugly but it fixes the issue.
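
One way to narrow down which options matter is to list the options that the default config sets but ours does not. The sketch below is illustrative only: the function name `missing_configs` and the file names in the usage comment are hypothetical, and it compares plain `CONFIG_X=value` lines without understanding Kconfig dependencies. The kernel tree's `scripts/diffconfig` is a more complete tool for the same job.

```shell
# Print CONFIG_* assignments present in the first file but not in the second.
# Hypothetical helper for illustration; not part of the minikube build.
missing_configs() {
    default_config=$1
    our_config=$2
    # Keep only "CONFIG_FOO=value" lines from the default config, then drop
    # every line that appears verbatim in our config.
    grep -E '^CONFIG_[A-Za-z0-9_]+=' "$default_config" \
        | grep -F -x -v -f "$our_config" \
        | sort
}

# Example (file names are assumptions):
# missing_configs default.config our.config | grep '^CONFIG_ACPI'
```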

The interesting point is that shutting down qemu does work with the current image. Maybe qemu terminates the guest in another way, or maybe it does not do a graceful shutdown.

@nirs
Contributor Author

nirs commented Jul 19, 2025

iso build logs: https://storage.googleapis.com/minikube-builds/logs/21089/d297c3e/iso_build.txt

The podman build failed. The iso I built was created from the previous build directory, so only the kernel was rebuilt.

2025-07-19 02:26:53 (6.52 MB/s) - ‘/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/.v3.4.7.tar.gz.MnXCzf/output’ saved [10923427]

v3.4.7.tar.gz: OK (sha256: 4af6606dd072fe946960680611ba65201be435b43edbfc5cc635b2a01a899e6e)
>>> podman v3.4.7 Extracting
gzip -d -c /home/jenkins/workspace/iso-pr-build/out/buildroot/dl/podman/v3.4.7.tar.gz | /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/tar --strip-components=1 -C /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7   -xf -
# "build flag -mod=vendor only valid when using modules"
sed -e 's|-mod=vendor ||' -i /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/Makefile
>>> podman v3.4.7 Patching
>>> podman v3.4.7 Configuring
mkdir -p /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/_output && mv /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/vendor /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/_output/src
mkdir -p /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/_output/src/github.com/containers
ln -sf /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7 /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/_output/src/github.com/containers/podman
ln -sf /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7 /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/_output/src/github.com/containers/podman/v2
>>> podman v3.4.7 Building
mkdir -p /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/bin
CGO_ENABLED=1 GOPATH="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/_output" PATH=/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/_output/bin:"/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin:/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/sbin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/usr/local/go/bin:/home/jenkins/go/bin" GOARCH=arm64 GOPROXY="https://proxy.golang.org,direct" GOSUMDB='sum.golang.org' GOOS=linux CIRRUS_TAG=v3.4.7 /usr/bin/make -j5 GIT_DIR=. PATH="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin:/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/sbin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/usr/local/go/bin:/home/jenkins/go/bin" AR="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-gcc-ar" AS="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-as" LD="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-ld" NM="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-gcc-nm" CC="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-gcc" GCC="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-gcc" CPP="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-cpp" CXX="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-g++" FC="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-gfortran" F77="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-gfortran" 
RANLIB="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-gcc-ranlib" READELF="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-readelf" STRIP="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-strip" OBJCOPY="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-objcopy" OBJDUMP="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-objdump" AR_FOR_BUILD="/usr/bin/ar" AS_FOR_BUILD="/usr/bin/as" CC_FOR_BUILD="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/ccache /usr/bin/gcc" GCC_FOR_BUILD="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/ccache /usr/bin/gcc" CXX_FOR_BUILD="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/ccache /usr/bin/g++" LD_FOR_BUILD="/usr/bin/ld" CPPFLAGS_FOR_BUILD="-I/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/include" CFLAGS_FOR_BUILD="-O2 -I/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/include" CXXFLAGS_FOR_BUILD="-O2 -I/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/include" LDFLAGS_FOR_BUILD="-L/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/lib -Wl,-rpath,/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/lib" FCFLAGS_FOR_BUILD="" DEFAULT_ASSEMBLER="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-as" DEFAULT_LINKER="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/aarch64-minikube-linux-gnu-ld" CPPFLAGS="-D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64" CFLAGS="-D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64  -O2 -g0 -D_FORTIFY_SOURCE=1" CXXFLAGS="-D_LARGEFILE_SOURCE 
-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64  -O2 -g0 -D_FORTIFY_SOURCE=1" LDFLAGS="" FCFLAGS=" -O2 -g0" FFLAGS=" -O2 -g0" PKG_CONFIG="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/bin/pkg-config" STAGING_DIR="/home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/host/aarch64-minikube-linux-gnu/sysroot" INTLTOOL_PERL=/usr/bin/perl BUILDFLAGS="-buildvcs=false" BUILDTAGS="exclude_graphdriver_btrfs btrfs_noversion exclude_graphdriver_devicemapper seccomp systemd" -C /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7 GIT_COMMIT=74d67f5d43bcd322a4fb11a7b58eced866f9d0b9 PREFIX=/usr podman
go: downloading go1.24.0 (linux/amd64)
touch .gopathok
CGO_ENABLED=1 \
	go build \
	-buildvcs=false \
	-ldflags '-X github.com/containers/podman/v3/libpod/define.gitCommit=74d67f5d43bcd322a4fb11a7b58eced866f9d0b9 -X github.com/containers/podman/v3/libpod/define.buildInfo=1752892027 -X github.com/containers/podman/v3/libpod/config._installPrefix=/usr -X github.com/containers/podman/v3/libpod/config._etcDir=/usr/etc ' \
	-tags "exclude_graphdriver_btrfs btrfs_noversion exclude_graphdriver_devicemapper seccomp systemd" \
	-o bin/podman ./cmd/podman
main module (k8s.io/minikube) does not contain package k8s.io/minikube/out/buildroot/output-aarch64/build/podman-v3.4.7/cmd/podman
make[3]: *** [Makefile:305: bin/podman] Error 1
make[2]: *** [package/pkg-generic.mk:273: /home/jenkins/workspace/iso-pr-build/out/buildroot/output-aarch64/build/podman-v3.4.7/.stamp_built] Error 2
make[1]: *** [Makefile:83: _all] Error 2
make[1]: Leaving directory '/home/jenkins/workspace/iso-pr-build/out/buildroot'
make: *** [Makefile:306: minikube-iso-aarch64] Error 2
rm deploy/iso/minikube-iso/board/minikube/aarch64/rootfs-overlay/usr/bin/auto-pause
Build step 'Execute shell' marked build as failure

@nirs
Contributor Author

nirs commented Jul 19, 2025

Reproduced the error locally by:

$ sudo rm -rf out/buildroot/output-aarch64/build/podman-v3.4.7

$ make minikube-iso-aarch64
...
CGO_ENABLED=1 \
	go build \
	-buildvcs=false \
	-ldflags '-X github.com/containers/podman/v3/libpod/define.gitCommit=74d67f5d43bcd322a4fb11a7b58eced866f9d0b9 -X github.com/containers/podman/v3/libpod/define.buildInfo=1752937642 -X github.com/containers/podman/v3/libpod/config._installPrefix=/usr -X github.com/containers/podman/v3/libpod/config._etcDir=/usr/etc ' \
	-tags "exclude_graphdriver_btrfs btrfs_noversion exclude_graphdriver_devicemapper seccomp systemd" \
	-o bin/podman ./cmd/podman
main module (k8s.io/minikube) does not contain package k8s.io/minikube/out/buildroot/output-aarch64/build/podman-v3.4.7/cmd/podman
make[3]: *** [Makefile:305: bin/podman] Error 1
make[2]: *** [package/pkg-generic.mk:273: /home/nsoffer/minikube/out/buildroot/output-aarch64/build/podman-v3.4.7/.stamp_built] Error 2
make[1]: *** [Makefile:83: _all] Error 2
make[1]: Leaving directory '/home/nsoffer/minikube/out/buildroot'
make: *** [Makefile:306: minikube-iso-aarch64] Error 2
rm deploy/iso/minikube-iso/board/minikube/aarch64/rootfs-overlay/usr/bin/auto-pause

Adding go.work seems to break the podman build. The workspace is needed
only for running the update commands, so let's disable it when building
the iso.

We may need a much bigger change to ensure that the workspace is used
only when running the update go commands, or remove it. This change
fixes only the iso build.
@nirs
Contributor Author

nirs commented Jul 19, 2025

ok-to-build-iso

@nirs
Contributor Author

nirs commented Jul 19, 2025

@minikube-bot
Collaborator

Hi @nirs, we have updated your PR with the reference to newly built ISO. Pull the changes locally if you want to test with them or update your PR further.

@nirs
Contributor Author

nirs commented Jul 19, 2025

Timing minikube start with the new kernel

Tested with #20826 rebased on top of this PR.

no kubernetes

% hyperfine -w 3 -r 10 -C "out/minikube delete" "out/minikube start --driver krunkit --no-kubernetes" "out/minikube start --driver vfkit --network vmnet-shared --no-kubernetes" "out/minikube start --driver qemu --no-kubernetes"
Benchmark 1: out/minikube start --driver krunkit --no-kubernetes
  Time (mean ± σ):      7.618 s ±  0.506 s    [User: 0.426 s, System: 0.235 s]
  Range (min … max):    6.965 s …  8.541 s    10 runs
 
Benchmark 2: out/minikube start --driver vfkit --network vmnet-shared --no-kubernetes
  Time (mean ± σ):      7.569 s ±  1.027 s    [User: 0.400 s, System: 0.250 s]
  Range (min … max):    5.670 s …  9.429 s    10 runs
 
Benchmark 3: out/minikube start --driver qemu --no-kubernetes
  Time (mean ± σ):     15.803 s ±  0.199 s    [User: 0.432 s, System: 0.246 s]
  Range (min … max):   15.479 s … 16.095 s    10 runs
 
Summary
  out/minikube start --driver vfkit --network vmnet-shared --no-kubernetes ran
    1.01 ± 0.15 times faster than out/minikube start --driver krunkit --no-kubernetes
    2.09 ± 0.28 times faster than out/minikube start --driver qemu --no-kubernetes

with kubernetes

% hyperfine -w 2 -r 5 -C "out/minikube delete" "out/minikube start --driver krunkit --container-runtime containerd" "out/minikube start --driver vfkit --network vmnet-shared --container-runtime containerd" "out/minikube start --driver qemu --container-runtime containerd"
Benchmark 1: out/minikube start --driver krunkit --container-runtime containerd
  Time (mean ± σ):     21.564 s ±  0.836 s    [User: 1.209 s, System: 0.769 s]
  Range (min … max):   20.814 s … 22.660 s    5 runs
 
Benchmark 2: out/minikube start --driver vfkit --network vmnet-shared --container-runtime containerd
  Time (mean ± σ):     18.552 s ±  0.992 s    [User: 0.952 s, System: 0.794 s]
  Range (min … max):   17.011 s … 19.756 s    5 runs
 
Benchmark 3: out/minikube start --driver qemu --container-runtime containerd
  Time (mean ± σ):     27.123 s ±  0.442 s    [User: 1.023 s, System: 0.918 s]
  Range (min … max):   26.708 s … 27.851 s    5 runs
 
Summary
  out/minikube start --driver vfkit --network vmnet-shared --container-runtime containerd ran
    1.16 ± 0.08 times faster than out/minikube start --driver krunkit --container-runtime containerd
    1.46 ± 0.08 times faster than out/minikube start --driver qemu --container-runtime containerd
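
The ratios in the hyperfine summaries are the slower mean divided by the faster mean. A small sketch that reproduces two of them from the means reported above; the `speedup` helper is hypothetical, and it ignores the ± uncertainty that hyperfine also reports.

```shell
# speedup SLOW_MEAN FAST_MEAN: print how many times faster the second run is.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

speedup 15.803 7.569    # qemu vs vfkit, --no-kubernetes: prints 2.09
speedup 27.123 18.552   # qemu vs vfkit, with kubernetes: prints 1.46
```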

@medyagh
Member

medyagh commented Jul 19, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 19, 2025
@minikube-pr-bot

kvm2 driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 21089) |
+----------------+----------+---------------------+
| minikube start | 49.9s    | 50.5s               |
| enable ingress | 17.4s    | 15.1s               |
+----------------+----------+---------------------+

Times for minikube start: 51.5s 48.6s 50.7s 48.8s 50.0s
Times for minikube (PR 21089) start: 47.8s 50.9s 47.4s 54.9s 51.6s

Times for minikube (PR 21089) ingress: 15.5s 15.0s 15.0s 15.0s 15.0s
Times for minikube ingress: 14.5s 28.5s 14.5s 14.5s 15.0s

docker driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 21089) |
+----------------+----------+---------------------+
| minikube start | 23.5s    | 24.2s               |
| enable ingress | 13.1s    | 13.0s               |
+----------------+----------+---------------------+

Times for minikube start: 22.0s 23.0s 23.1s 26.3s 23.0s
Times for minikube (PR 21089) start: 24.5s 22.8s 24.0s 27.0s 23.0s

Times for minikube (PR 21089) ingress: 13.3s 13.3s 12.3s 12.3s 13.8s
Times for minikube ingress: 12.8s 13.3s 12.3s 13.8s 13.3s

docker driver with containerd runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 21089) |
+----------------+----------+---------------------+
| minikube start | 22.8s    | 23.9s               |
| enable ingress | 26.3s    | 26.3s               |
+----------------+----------+---------------------+

Times for minikube start: 25.0s 22.8s 21.3s 21.8s 23.3s
Times for minikube (PR 21089) start: 25.6s 23.6s 22.9s 22.5s 25.1s

Times for minikube (PR 21089) ingress: 22.8s 22.8s 23.3s 39.8s 22.8s
Times for minikube ingress: 22.8s 40.3s 22.3s 23.3s 22.8s

@medyagh medyagh merged commit a5b6072 into kubernetes:master Jul 20, 2025
29 of 37 checks passed
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: medyagh, nirs


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 20, 2025
@nirs nirs deleted the iso-minimal-k8s branch July 20, 2025 02:00