tomhesse
Contributor

SUMMARY

Use the correct kernel flavor from the ansible_kernel fact to ensure the correct zfs modules are installed for integration testing.
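
A minimal shell sketch of the idea, not the PR's actual change (the diff is not shown in this conversation). On Alpine the kernel release reported by `uname -r`, which is also what the ansible_kernel fact contains, ends in the flavor, e.g. 6.12.38-0-virt. Assuming the ZFS module package is named zfs-<flavor> (zfs-virt is the name used later in this thread), the package name can be derived from the release string. The function names below are hypothetical.

```shell
# Hypothetical helpers; the "zfs-<flavor>" naming is an assumption
# based on the zfs-virt package mentioned in this thread.

# Print the flavor suffix of a kernel release string,
# e.g. "6.12.38-0-virt" -> "virt".
kernel_flavor() {
    # ${1##*-} strips everything up to and including the last "-".
    printf '%s\n' "${1##*-}"
}

# Print the assumed ZFS module package name for a kernel release.
zfs_package_for_kernel() {
    printf 'zfs-%s\n' "$(kernel_flavor "$1")"
}
```

Calling `zfs_package_for_kernel "$(uname -r)"` on the test VM would then name the package that matches the running kernel's flavor.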

Fixes #10453

ISSUE TYPE
  • Test Pull Request
COMPONENT NAME

zpool

@ansibullbot added the integration (tests/integration), small_patch (Hopefully easy to review), tests (tests), and needs_revision (This PR fails CI tests or a maintainer has requested a review/revision of the PR) labels and removed the small_patch (Hopefully easy to review) label on Jul 25, 2025
@felixfontein added the check-before-release (PR will be looked at again shortly before release and merged if possible.) and backport-11 (Automatically create a backport for the stable-10 branch) labels on Jul 25, 2025
@felixfontein
Collaborator

Can you rebase onto the current main (now that #10462 has been merged)? The VM image for Alpine changed from 3.21 to 3.22 (as ansible-core changed it), so with the new CI matrix the result might be different.

@ansibullbot added the stale_ci (CI is older than 7 days, rerun before merging) label on Aug 3, 2025
@felixfontein
Collaborator

Ping @tomhesse

@felixfontein force-pushed the fix/zpool-alpine-tests branch from a0d0b68 to d04680f on September 4, 2025 05:09
@ansibullbot removed the stale_ci (CI is older than 7 days, rerun before merging) label on Sep 4, 2025
@felixfontein
Collaborator

There is a mismatch between the running kernel version and the installed modules on the Alpine VM:

root@ip-192-168-3-124:/home/alpine/ansible_collections/community/general# find /lib/modules/ | grep zfs
/lib/modules/6.12.44-0-virt/extra/zfs.ko.gz
root@ip-192-168-3-124:/home/alpine/ansible_collections/community/general# modprobe zfs
modprobe: FATAL: Module zfs not found in directory /lib/modules/6.12.38-0-virt
root@ip-192-168-3-124:/home/alpine/ansible_collections/community/general# uname -a
Linux ip-192-168-3-124 6.12.38-0-virt #1-Alpine SMP PREEMPT_DYNAMIC 2025-07-14 19:36:17 x86_64 Linux

@Akasurde @mattclay does this need a change in the Ansible testing infrastructure, or is there something we can do about it in c.g?
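
The mismatch shown in the output above could be detected before calling modprobe. This is only a sketch: the function name has_zfs_for_kernel is hypothetical, and it assumes the module file is named zfs.ko with an optional compression suffix, as in the find output above.

```shell
# Hypothetical check: succeed only if a zfs module exists for the
# given kernel release under the given modules root.
has_zfs_for_kernel() {
    modules_root=$1      # e.g. /lib/modules
    kernel_release=$2    # e.g. the output of `uname -r`
    module_dir="$modules_root/$kernel_release"

    # No directory for this kernel means no modules for it at all.
    [ -d "$module_dir" ] || return 1

    # 'zfs.ko*' matches zfs.ko as well as compressed variants
    # such as zfs.ko.gz.
    found=$(find "$module_dir" -name 'zfs.ko*' 2>/dev/null | head -n 1)
    [ -n "$found" ]
}
```

On the VM above, `has_zfs_for_kernel /lib/modules "$(uname -r)"` would fail, since the only zfs module lives under 6.12.44-0-virt while the running kernel is 6.12.38-0-virt.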

@mattclay
mattclay commented Sep 4, 2025

@felixfontein My first guess is that something is upgrading the kernel modules, either explicitly or as part of installing another package. Without a reboot, that will then prevent those kernel modules from being loaded. I've encountered this issue in tests before. Depending on the exact cause, there are a few options to solve the issue:

  1. Figure out what is upgrading the modules and stop doing that.
  2. Pin the modules to the version that matches the running kernel so they're not upgraded.
  3. As a last resort, reboot the system after the upgraded modules are installed.

Take a look at the test run to see if you can figure out where the mismatched modules come from. It might help to use ansible-test shell to log in to an instance to poke around. If they're already upgraded when you log in, try with the --raw option to bypass most of the bootstrapping to see if that's the cause.

Let me know what you find. If you need help after looking into it, let me know.

@felixfontein
Collaborator

@mattclay the problem is that apk does not allow you to install older versions of packages. Trying to install the right version of zfs-virt (6.12.38-r0) causes it to install 6.12.45-r0 instead (and upgrade all modules), since that's the only version available in the repositories (https://dl-cdn.alpinelinux.org/alpine/v3.22/main/x86_64/). So likely the only way to proceed here is to reboot the VM. I've tried using the ansible.builtin.reboot module for that, but that doesn't work: it fails with "Running ansible.builtin.reboot with local connection would reboot the control node."
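
To make the version relationship in this comment explicit, here is a small sketch that maps a kernel release to the package version that would match it. The mapping (6.12.38-0-virt corresponds to 6.12.38-r0) is inferred only from the versions quoted in this thread and is an assumption, not a documented Alpine rule; the function name expected_pkg_version is hypothetical.

```shell
# Hypothetical helper: derive the module package version that would
# match a kernel release of the form <version>-<pkgrel>-<flavor>.
# The mapping is an assumption inferred from this thread.
expected_pkg_version() {
    release=$1                      # e.g. 6.12.38-0-virt
    without_flavor=${release%-*}    # drop "-virt"  -> 6.12.38-0
    version=${without_flavor%-*}    # drop "-0"     -> 6.12.38
    pkgrel=${without_flavor##*-}    # keep last part -> 0
    printf '%s-r%s\n' "$version" "$pkgrel"
}
```

If the result for the running kernel differs from the only version the repository offers, installing the package necessarily pulls in modules for a newer kernel, which is the situation described here.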

I guess the only way to fix this is to change the VM bootstrap to upgrade to the latest package versions, or to update the VM image every time the kernel version changes so it always comes with the latest kernel. Both are probably problematic for other reasons...

@ansibullbot added the stale_ci (CI is older than 7 days, rerun before merging) label on Sep 14, 2025
Development

Successfully merging this pull request may close these issues.

zpool: fix tests on Alpine
4 participants