Skip to content

Conversation

@hholoubk
Copy link
Contributor

@hholoubk hholoubk commented Oct 24, 2025

  • Remove hardcoded driver values and improve config organization
  • Refactor SRIOV VFIO test to use dynamic driver verification
  • Simplify SR-IOV VFIO driver check logic by removing unnecessary conditionals
  • The check_vfio_pci function can handle None values for exp_driver parameter
  • Reduces code complexity and improves maintainability

Depends on
avocado-framework/avocado-vt#4255

Summary by CodeRabbit

  • Tests
    • Removed a global default driver setting and added a variant-specific driver configuration
    • Added an x86_64-specific test setting and introduced a new default variant
    • Tests now record each device's original driver before operations and log restorations
    • Verification strengthened to ensure devices are restored to their original driver after detach

@coderabbitai
Copy link

coderabbitai bot commented Oct 24, 2025

Walkthrough

The SR-IOV test config was refactored: global expr_driver removed, test_pf overridden for x86_64, expr_driver moved into the mlx5_vfio variant, and a @default variant with empty driver_dict added. Test code now records each VF's original PCI driver and verifies drivers are restored after detach.

Changes

Cohort / File(s) Summary
Configuration restructuring
libvirt/tests/cfg/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.cfg
Removed global expr_driver; added x86_64 test_pf override; placed expr_driver = "mlx5_vfio_pci" in the mlx5_vfio variant; added @default variant with driver_dict = {}.
Test logic enhancement
libvirt/tests/src/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.py
Made expr_driver optional; introduced original_drivers dictionary to lazily capture each VF PCI's current driver before attach; added logging for original-driver capture; after detach, validate each VF's driver is restored to the recorded original.

Sequence Diagram(s)

sequenceDiagram
    participant Test
    participant Host as "Host / VF PCI"
    participant Libvirt as "Libvirt / Guest"

    Test->>Host: Query current driver (if not recorded)
    Host-->>Test: Return driver
    Note right of Test #D3E4CD: Store in original_drivers

    Test->>Libvirt: Attach hostdev to guest
    Libvirt-->>Test: Attachment result

    alt expr_driver provided
        Test->>Libvirt: Validate VF driver == expr_driver
        Libvirt-->>Test: Validation result
    else
        Test->>Libvirt: Validate VF bound to vfio
        Libvirt-->>Test: Validation result
    end

    Test->>Libvirt: Detach hostdev from guest
    Libvirt-->>Test: Detach result

    Test->>Host: Query current driver (post-detach)
    Host-->>Test: Return driver
    Test->>Test: Compare with original_drivers
    alt match
        Note right of Test #D3E4CD: Restoration confirmed
    else
        Note right of Test #F8D7DA: Restoration mismatch reported
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review focus:
    • Race-free capture and per-VF storage of original drivers in original_drivers.
    • Correct handling when expr_driver is None/missing.
    • Post-detach comparison logic and logging paths.

Poem

🐰 I hopped through configs, moved a line or two,
I kept each VF's driver safe and true,
I watched them bind and watched them part,
Then nudged them gently back to start,
A little patch, a joyful chew 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title Check ✅ Passed The pull request title "Refactor SR-IOV VFIO test to use dynamic driver verification" accurately describes the main changes in the changeset. The core modifications involve removing hardcoded driver expectations (the top-level expr_driver = "mlx5_vfio_pci" setting) and replacing them with dynamic per-PCI driver tracking and verification logic in the Python test code. The title is concise, specific, and clearly conveys that this is a refactoring that introduces dynamic driver verification rather than relying on static driver assumptions. The title avoids vague terms and noise, making it clear to reviewers what the primary intent of the change is.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@hholoubk
Copy link
Contributor Author

(.libvirt-ci-venv-ci-runtest-SQR4mv) [root@nvidia-grace-hopper-07 libvirt-ci-latest-venv]# bin/avocado run --vt-type libvirt --vt-omit-data-loss --vt-machine-type arm64-pci --vt-connect-uri qemu:///system sriov.plug_unplug.vfio_variant_driver
No python imaging library installed. Screendump and Windows guest BSOD detection are disabled. In order to enable it, please install python-imaging or the equivalent for your distro.
No python imaging library installed. PPM image conversion to JPEG disabled. In order to enable it, please install python-imaging or the equivalent for your distro.
No python imaging library installed. Screendump and Windows guest BSOD detection are disabled. In order to enable it, please install python-imaging or the equivalent for your distro.
No python imaging library installed. PPM image conversion to JPEG disabled. In order to enable it, please install python-imaging or the equivalent for your distro.
JOB ID : 2f8981813eaa5d253f1d6e0886cbbc6ede58764b
JOB LOG : /var/log/avocado/job-results/job-2025-10-24T08.33-2f89818/job.log
(01/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_interface.mlx5_vfio: STARTED
(01/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_interface.mlx5_vfio: PASS (71.42 s)
(02/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_interface.vfio_vfio: STARTED
(02/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_interface.vfio_vfio: PASS (72.21 s)
(03/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_interface.default: STARTED
(03/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_interface.default: PASS (72.48 s)
(04/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_device.mlx5_vfio: STARTED
(04/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_device.mlx5_vfio: PASS (72.03 s)
(05/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_device.vfio_vfio: STARTED
(05/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_device.vfio_vfio: PASS (72.52 s)
(06/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_device.default: STARTED
(06/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.single_iface.hostdev_device.default: PASS (72.12 s)
(07/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_interface.mlx5_vfio: STARTED
(07/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_interface.mlx5_vfio: PASS (113.45 s)
(08/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_interface.vfio_vfio: STARTED
(08/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_interface.vfio_vfio: PASS (113.56 s)
(09/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_interface.default: STARTED
(09/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_interface.default: PASS (114.01 s)
(10/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_device.mlx5_vfio: STARTED
(10/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_device.mlx5_vfio: PASS (113.70 s)
(11/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_device.vfio_vfio: STARTED
(11/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_device.vfio_vfio: PASS (112.48 s)
(12/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_device.default: STARTED
(12/12) type_specific.io-github-autotest-libvirt.sriov.plug_unplug.vfio_variant_driver.multiple_ifaces.hostdev_device.default: PASS (113.20 s)
RESULTS : PASS 12 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML : /var/log/avocado/job-results/job-2025-10-24T08.33-2f89818/results.html
JOB TIME : 1123.99 s

@hholoubk hholoubk marked this pull request as ready for review October 27, 2025 09:29
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
libvirt/tests/src/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.py (3)

25-31: Bug: parse_iface_dict uses free var vf_pci; ignores its parameter

This can capture the outer loop variable by accident and is fragile. Use the function parameter consistently.

Apply this diff:

-def parse_iface_dict(pf_pci):
+def parse_iface_dict(pci_id):
@@
-        vf_pci_addr = utils_sriov.pci_to_addr(vf_pci)
+        vf_pci_addr = utils_sriov.pci_to_addr(pci_id)

Also ensure the call site remains parse_iface_dict(vf_pci); only the local param name changes.


35-37: Typo: "manged" → "managed"

Key name is misspelled; attribute won’t apply.

-        if managed:
-            iface_dict.update({"manged": managed})
+        if managed:
+            iface_dict.update({"managed": managed})

90-93: Bug: ping condition inverted

You fail the test when ping succeeds. Invert the condition.

-            if utils_test.ping(ping_dest, count=3, timeout=5, session=vm_session):
-                test.fail("Failed to ping %s." % ping_dest)
+            if not utils_test.ping(ping_dest, count=3, timeout=5, session=vm_session):
+                test.fail("Failed to ping %s." % ping_dest)
🧹 Nitpick comments (4)
libvirt/tests/cfg/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.cfg (1)

8-9: Avoid duplicating test_pf; consider arch‑specific blocks only

test_pf is set globally (Line 5) and again under x86_64. Prefer defining per-arch values only and dropping the global to avoid mismatches on aarch64. Please confirm the correct PF name for aarch64 or make it parameterizable.

libvirt/tests/src/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.py (3)

26-33: Replace eval with ast.literal_eval for safety and predictability

Avoid executing arbitrary code from params. literal_eval suffices for dicts.

+import ast
@@
-            iface_dict = eval(params.get('iface_dict', '{}'))
+            iface_dict = ast.literal_eval(params.get('iface_dict', '{}'))
@@
-            iface_dict = eval(params.get('hostdev_dict', '{}'))
+            iface_dict = ast.literal_eval(params.get('hostdev_dict', '{}'))
@@
-        driver_dict = eval(params.get("driver_dict", "{}"))
+        driver_dict = ast.literal_eval(params.get("driver_dict", "{}"))

44-44: Optional expr_driver retrieval looks right

No default assumed now; matches cfg variants. If empty string is possible from params, consider normalizing: expr_driver = params.get("expr_driver") or None.


111-116: Rebind can be async; add short poll before asserting restoration

A brief poll avoids flakes after detach.

-        for idx in range(iface_number):
-            vf_pci = utils_sriov.get_vf_pci_id(pf_pci, idx)
-            original_driver = original_drivers[vf_pci]
-            libvirt_vfio.check_vfio_pci(vf_pci, exp_driver=original_driver)
-            test.log.debug(f"Verify: Original driver {original_driver} restored for {vf_pci} - PASS")
+        for idx in range(iface_number):
+            vf_pci = utils_sriov.get_vf_pci_id(pf_pci, idx)
+            expected = original_drivers[vf_pci]
+            for _ in range(10):  # wait up to ~10s
+                if libvirt_vfio.get_pci_driver(vf_pci) == expected:
+                    break
+                time.sleep(1)
+            else:
+                test.fail(f"Original driver {expected} not restored for {vf_pci}")
+            test.log.debug(f"Verify: Original driver {expected} restored for {vf_pci} - PASS")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9aa7dea and ab57225.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.cfg (1 hunks)
  • libvirt/tests/src/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.py (5 hunks)
🔇 Additional comments (3)
libvirt/tests/cfg/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.cfg (1)

12-12: Good move: scope expr_driver to mlx5_vfio variant

Scoping expr_driver to the mlx5_vfio variant matches the new conditional logic in the test. Please confirm that vfio_vfio intentionally omits expr_driver and relies on the generic VFIO check.

libvirt/tests/src/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.py (2)

84-87: Conditional VFIO check: confirm default path

When expr_driver is absent, you call check_vfio_pci(vf_pci) without exp_driver. Please confirm the helper’s default expectation (e.g., vfio_pci) aligns with the intended variant behavior.


54-54: Verify handling of None return from get_pci_driver

The code at lines 70–74 stores whatever libvirt_vfio.get_pci_driver(vf_pci) returns without checking for None. If the function can return None when a VF has no bound driver, this value is then passed to check_vfio_pci(..., exp_driver=original_driver) at line 115.

Action required: Consult the virttest.utils_libvirt.libvirt_vfio module documentation or source to confirm:

  • Does get_pci_driver() return None for unbound drivers?
  • Does check_vfio_pci() safely handle None as the exp_driver parameter?

If both answers are "yes," add defensive None checks when storing/retrieving from original_drivers. Otherwise, document the assumption that drivers are always bound during test execution.

- Remove hardcoded driver values and improve config organization
- Refactor SRIOV VFIO test to use dynamic driver verification
- Simplify SR-IOV VFIO driver check logic by removing unnecessary conditionals
- The check_vfio_pci function can handle None values for exp_driver parameter
- Reduces code complexity and improves maintainability

Signed-off-by: hholoubk <[email protected]>

AI assisted code and commit. Human reviewed.
@hholoubk hholoubk force-pushed the XXX-538-sriov-vfio-get-driver branch from b0e7864 to 3624234 Compare October 27, 2025 09:57
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
libvirt/tests/src/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.py (2)

19-41: Fix wrong variable usage and 'managed' typo in parse_iface_dict.

The function ignores its parameter and reads the free var vf_pci; also sets key "manged". These are brittle and user-visible bugs.

Apply:

-def parse_iface_dict(pf_pci):
+def parse_iface_dict(vf_pci_param):
@@
-    vf_pci_addr = utils_sriov.pci_to_addr(vf_pci)
+    vf_pci_addr = utils_sriov.pci_to_addr(vf_pci_param)
@@
-    if managed:
-        iface_dict.update({"manged": managed})
+    if managed:
+        iface_dict.update({"managed": managed})

86-90: Fix condition to check status code from utils_test.ping return tuple.

The code treats the return tuple as a boolean; in Python, non-empty tuples are always truthy. Extract and check the status code instead:

status, output = utils_test.ping(ping_dest, count=3, timeout=5, session=vm_session)
if status:
    test.fail("Failed to ping %s." % ping_dest)
🧹 Nitpick comments (4)
libvirt/tests/src/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.py (4)

44-45: Gate VFIO model check when expr_driver is not provided.

With top-level expr_driver removed, None may reach check_vfio_pci. Guard to avoid unintended assertions.

-    libvirt_vfio.check_vfio_pci(vf_pci, exp_driver=expr_driver)
+    if expr_driver:
+        libvirt_vfio.check_vfio_pci(vf_pci, exp_driver=expr_driver)
+    else:
+        test.log.debug("expr_driver not set; skipping VFIO model check")

Please confirm libvirt_vfio.check_vfio_pci explicitly supports exp_driver=None; if it does, we can keep the single call.

Also applies to: 84-85


108-113: Make restore check resilient to async rebind.

Driver rebind can lag detach. Add a short retry/poll to reduce flakes.

-        for idx in range(iface_number):
-            vf_pci = utils_sriov.get_vf_pci_id(pf_pci, idx)
-            original_driver = original_drivers[vf_pci]
-            libvirt_vfio.check_vfio_pci(vf_pci, exp_driver=original_driver)
+        for idx in range(iface_number):
+            vf_pci = utils_sriov.get_vf_pci_id(pf_pci, idx)
+            original_driver = original_drivers[vf_pci]
+            for _ in range(30):
+                try:
+                    libvirt_vfio.check_vfio_pci(vf_pci, exp_driver=original_driver)
+                    break
+                except Exception:
+                    time.sleep(1)
+            else:
+                test.fail(f"Driver {original_driver} not restored for {vf_pci} within timeout")

27-33: Safer parsing for driver_dict; keep eval where templates require variables.

iface_dict/hostdev_dict templates reference variables (e.g., vf_pci_addr, mac_addr) and need eval. driver_dict is a pure literal—use literal_eval.

+import ast
@@
-        driver_dict = eval(params.get("driver_dict", "{}"))
+        driver_dict = ast.literal_eval(params.get("driver_dict", "{}"))

82-83: Harden nested dict access to avoid TypeError when driver is missing.

Avoid None["driver_attr"] crash.

-            if act_hostdev_dict.get("driver")["driver_attr"].get("name") != "vfio":
+            drv_name = (((act_hostdev_dict.get("driver") or {}).get("driver_attr") or {}).get("name"))
+            if drv_name != "vfio":
                 test.fail("The hostdev driver is not set to VFIO.")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b0e7864 and 3624234.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.cfg (1 hunks)
  • libvirt/tests/src/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.py (4 hunks)
🔇 Additional comments (5)
libvirt/tests/src/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.py (2)

54-55: LGTM: per-VF original driver map.

Local dict initialization is clear and scoped correctly.


70-74: Good capture of original driver; add guard for empty/unknown.

If get_pci_driver returns empty/unknown, log and still proceed to avoid KeyError later.

Optionally:

-            if vf_pci not in original_drivers:
-                original_driver = libvirt_vfio.get_pci_driver(vf_pci)
+            if vf_pci not in original_drivers:
+                original_driver = libvirt_vfio.get_pci_driver(vf_pci) or "unknown"
                 original_drivers[vf_pci] = original_driver
libvirt/tests/cfg/sriov/plug_unplug/sriov_attach_detach_device_vfio_variant_driver.cfg (3)

8-9: Platform override present only for x86_64; confirm PF name on aarch64.

aarch64 often uses different PF names. Ensure default test_pf is valid there, or add an aarch64: block.


12-13: Variant-scoped expr_driver is consistent; ensure model matches.

expr_driver="mlx5_vfio_pci" aligns with driver_dict.model="mlx5_vfio_pci". Looks good; just keep them in sync.


16-17: Default variant with empty driver_dict is fine; coordinate with code gating.

This relies on expr_driver being optional in code. Please land the gating change (or confirm check_vfio_pci handles None).

@hholoubk hholoubk changed the title Xxx 538 sriov vfio get driver Refactor SR-IOV VFIO test to use dynamic driver verification Oct 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant