Skip to content

Conversation

@meinaLi
Copy link
Contributor

@meinaLi meinaLi commented Oct 24, 2025

Automate case:
VIRT-303987 - [io-error] Make disk produce I/O error due to "enospc"
Scenario 1: Make disk produce I/O error due to "enospc", then check the error in "virsh event", vm log and "virsh dominfo".
Scenario 2: Based on scenario 1, create disk snapshot, then query domain info again.

Summary by CodeRabbit

  • New Features

    • Added an out-of-space (ENOSPC) test variant with configurable disk/partition/mount parameters.
    • Optional disk snapshot creation and verification during ENOSPC scenarios.
  • Tests

    • Integrated ENOSPC flows into the I/O error test suite with setup, run, and teardown handling, targeted write attempts, and conditional cleanup.
    • Improved error reporting and VM state logging to show full exception text and expected failure messages for easier diagnosis.

@coderabbitai
Copy link

coderabbitai bot commented Oct 24, 2025

Walkthrough

Adds an enospc variant to the virtual_disks_io_error config and updates the test to support enospc_test flows (writing to a non-root qcow2 disk to trigger ENOSPC), optional snapshot creation/cleanup, enospc-specific setup/teardown, and improved logging/error handling.

Changes

Cohort / File(s) Summary
ENOSPC Configuration
libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg
Adds an enospc variant with parameters for ENOSPC scenarios (disk setup, prepare_file, disk_dict, expected messages, cleanup_cmd) and two nested sub-variants: default and with_snapshot (adds snapshot_name).
Test Implementation
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py
Imports data_dir, utils_disk, libvirt_disk; introduces enospc_test, with_snapshot, snapshot_name parameters; branches setup/run/teardown for enospc vs standard modes; attaches/writes to a non-root qcow2 disk for enospc; adds optional snapshot create/delete; enhances error capture, logging, and cleanup logic.

Sequence Diagram(s)

sequenceDiagram
    participant Runner as Test Runner
    participant Setup as Setup Phase
    participant Libvirt as libvirt (host ops)
    participant Guest as Guest (in-guest ops)
    participant Snapshot as Snapshot Manager
    participant Cleanup as Teardown

    Runner->>Setup: load params (enospc_test, with_snapshot, snapshot_name)
    alt enospc_test enabled
        Setup->>Libvirt: attach qcow2 non-root disk (disk_dict)
        Libvirt-->>Runner: disk ready
    else standard mode
        Setup->>Guest: prepare loop/device, partition, mount
        Guest-->>Runner: fs prepared
    end

    alt enospc_test enabled
        Runner->>Libvirt: write large data to disk (expect ENOSPC)
        Libvirt-->>Runner: I/O returns ENOSPC / error
    else standard mode
        Runner->>Guest: run dd in-guest (expect I/O error)
        Guest-->>Runner: dd reports error
    end

    alt with_snapshot enabled
        Runner->>Snapshot: create snapshot (snapshot_name, external file)
        Snapshot-->>Runner: snapshot created
    end

    Runner->>Cleanup: validate events/logs/dominfo and run cleanup_cmds
    alt snapshot exists
        Cleanup->>Snapshot: delete snapshot
    end
    alt enospc_test enabled
        Cleanup->>Cleanup: run enospc-specific cleanup_cmd
    else standard mode
        Cleanup->>Cleanup: remove loop/device artifacts
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • chunfuwen

Poem

I nibble blocks where space ran dry, 🐇
ENOSPC whispered, I gave it a try.
I snapshot a moment, then tidy the den,
Logs checked, disks swept — I'll hop back again.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Virtual disk: add new case to make I/O error due to enospc" clearly and directly describes the main objective of the changeset. The changes add a new enospc variant configuration and extend the test code to support ENOSPC (no space on device) scenarios that produce I/O errors. The title is concise, avoids vague terminology, and accurately conveys that this PR introduces a new test case for inducing I/O errors due to ENOSPC conditions. A teammate reviewing git history would immediately understand the primary focus and purpose of these changes.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (11)
libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg (3)

23-23: Mount loopback explicitly to avoid distro-dependent behavior.

Be explicit with -o loop to ensure mounting a regular file works across environments.

-            prepare_file = "truncate -s 11M ${partition_path}; mkfs.ext3 -F ${partition_path}; mkdir -p ${mount_point}; mount ${partition_path} ${mount_point}; qemu-img create -f qcow2 ${disk_path} 20M"
+            prepare_file = "truncate -s 11M ${partition_path}; mkfs.ext3 -F ${partition_path}; mkdir -p ${mount_point}; mount -o loop ${partition_path} ${mount_point}; qemu-img create -f qcow2 ${disk_path} 20M"

26-26: Make the event regex case-insensitive for robustness.

Libvirt capitalization can vary. Use inline flag.

-            event_msg = "${disk_path}.*pause due to enospc"
+            event_msg = "(?i)${disk_path}.*pause due to enospc"

Please confirm the exact wording from virsh event on your target libvirt/qemu combo.


29-29: Harden cleanup against busy mounts.

Use lazy unmount when needed.

-            cleanup_cmd = "umount ${mount_point} || true; rm -rf ${mount_point}; rm -f ${partition_path}"
+            cleanup_cmd = "mountpoint -q ${mount_point} && umount -l ${mount_point} || true; rm -rf ${mount_point}; rm -f ${partition_path}"
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (8)

89-95: Narrow exception handling; stop catching bare Exception.

Catch the expected command/session errors only. This prevents masking unrelated bugs.

-        except Exception as e:
+        # Catch only expected command errors from avocado/aexpect
+        # (import added below)
+        except (process.CmdError, aexpect.ShellCmdError) as e:
             if dd_msg not in str(e):
                 test.fail("Write data in guest should produce error: %s in %s"
                           % (dd_msg, str(e)))
             elif dd_msg in str(e):
                 test.log.debug("Get expected error: %s in %s" % (dd_msg, str(e)))

Add the missing import near the other virttest imports:

 from virttest import virsh
 from virttest import libvirt_version
+from virttest import aexpect

104-106: Fix fail message: it reads like the opposite of what happened.

Wording currently says “Find unexpected error” when the expected string is missing.

-        if not libvirt.check_logfile(guest_log_msg, qemu_log, str_in_log=True):
-            test.fail('Find unexpected error:%s in log file:%s' % (guest_log_msg, qemu_log))
+        if not libvirt.check_logfile(guest_log_msg, qemu_log, str_in_log=True):
+            test.fail('Not found expected error: %s in log file: %s' % (guest_log_msg, qemu_log))

108-114: Use a unique snapshot filename and keep a reference for cleanup.

A fixed name risks collisions and leaves the overlay behind. Tie it to the VM and store for teardown.

-            snapshot_file = os.path.join(data_dir.get_data_dir(), "vdb.snap")
+            snapshot_file = os.path.join(data_dir.get_data_dir(), f"{vm_name}-vdb.snap")
             snap_options = ("%s --diskspec %s,snapshot=external,file=%s --disk-only"
                             % (snapshot_name, disk_name, snapshot_file))
             virsh.snapshot_create_as(vm_name, snap_options, debug=True)

Also declare and propagate snapshot_file so teardown can remove it:

  • At line 71, extend nonlocal:
-        nonlocal session
+        nonlocal session, snapshot_file
  • Near line 155, initialize:
-    session = None
+    session = None
+    snapshot_file = None

125-135: Remove the external snapshot file during teardown.

virsh snapshot-delete drops metadata; the external file remains. Clean it up.

         if with_snapshot:
             virsh.snapshot_delete(vm_name, snapshot_name, debug=True)
+            if 'snapshot_file' in locals() and snapshot_file and os.path.exists(snapshot_file):
+                os.remove(snapshot_file)

62-63: Safer dmsetup invocation.

Use an absolute path (or which) to avoid PATH surprises, and validate device_manager.

-                subprocess.run(["sudo", "dmsetup", "create", device_manager],
+                import shutil
+                dmsetup_bin = shutil.which("dmsetup") or "/usr/sbin/dmsetup"
+                subprocess.run(["sudo", dmsetup_bin, "create", device_manager],
                                input=dm_table.encode('utf-8'), stderr=subprocess.PIPE, check=True)

Please confirm device_manager comes from trusted cfg and cannot inject names with spaces or special chars.


146-146: Prefer literal_eval over eval for cfg parsing.

Avoid executing arbitrary code from config.

-    disk_dict = eval(params.get("disk_dict", "{}"))
+    import ast
+    disk_dict = ast.literal_eval(params.get("disk_dict", "{}"))

108-114: Guard snapshot path when with_snapshot is set outside ENOSPC variants.

If someone enables with_snapshot without enospc_test, disk_name may be undefined. Add a defensive fallback or assert.

-        if with_snapshot:
+        if with_snapshot:
+            if not enospc_test:
+                # Fallback to target dev from the newly attached disk or default 'vdb'
+                try:
+                    disk_name = new_disk.target.get('dev', 'vdb')
+                except Exception:
+                    disk_name = 'vdb'

To ensure this never regresses, consider asserting with_snapshot => enospc_test in cfg or early in run().

Also applies to: 152-155


79-79: Small: reorder step labels or renumber for clarity.

“TEST_STEP6” occurs before “TEST_STEP5”. Minor readability nit.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9aa7dea and 4cf7289.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg (1 hunks)
  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-24T08:01:27.899Z
Learnt from: hholoubk
PR: autotest/tp-libvirt#6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py
🪛 Ruff (0.14.1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py

62-62: subprocess call: check for execution of untrusted input

(S603)


62-62: Starting a process with a partial executable path

(S607)


89-89: Do not catch blind exception: Exception

(BLE001)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (1)

48-65: Eliminate redundant prepare_file execution.

Both branches execute process.run(prepare_file, **virsh_dargs) identically. Extract this common operation before the conditional to improve clarity and reduce duplication.

Apply this diff:

     def setup_test():
         """
         Prepare env for test.
         """
         test.log.info("Setup env.")
+        process.run(prepare_file, **virsh_dargs)
         if enospc_test:
-            process.run(prepare_file, **virsh_dargs)
+            pass
         else:
-            process.run(prepare_file, **virsh_dargs)
             # Find a free loop device
             global free_loop_dev
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4cf7289 and a57071e.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg (1 hunks)
  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-24T08:01:27.899Z
Learnt from: hholoubk
PR: autotest/tp-libvirt#6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py
🪛 Ruff (0.14.1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py

62-62: subprocess call: check for execution of untrusted input

(S603)


62-62: Starting a process with a partial executable path

(S607)


92-92: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python 3.8
  • GitHub Check: Python 3.12
  • GitHub Check: Python 3.11
  • GitHub Check: Python 3.9
🔇 Additional comments (3)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (3)

19-25: LGTM: Imports support new enospc functionality.

The new imports are properly utilized for the enospc test path (snapshot file handling, disk I/O utilities, and non-root disk detection).


84-90: Good fix: IndexError now prevented.

The code now safely retrieves and validates the disk list before indexing. Using test.fail() is appropriate here since the absence of a non-root disk indicates a test setup issue rather than an environmental limitation.


92-99: Simplify exception handling logic.

The elif at line 96 is redundant (it's the logical inverse of line 93), and the else block at lines 98-99 is unreachable. Since dd_msg in str(e) is the only remaining case after the first if, use else instead.

Apply this diff:

         except Exception as e:
             if dd_msg not in str(e):
                 test.fail("Write data in guest should produce error: %s in %s"
                           % (dd_msg, str(e)))
-            elif dd_msg in str(e):
+            else:
                 test.log.debug("Get expected error: %s in %s" % (dd_msg, str(e)))
-        else:
-            test.fail("Get unexpected result!")

Likely an incorrect or invalid review comment.

Comment on lines +128 to +130
if with_snapshot:
virsh.snapshot_delete(vm_name, snapshot_name, debug=True)
Copy link

@coderabbitai coderabbitai bot Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify cleanup of external snapshot file.

While the snapshot is deleted via virsh.snapshot_delete, the external snapshot file created at line 113 (vdb.snap in data_dir) may need explicit filesystem cleanup depending on libvirt behavior. Confirm whether the file is automatically removed or requires manual deletion.


🏁 Script executed:

#!/bin/bash
# Check if virsh snapshot-delete removes external snapshot files
rg -n -C5 "snapshot_file.*remove|os\.remove.*snapshot|snapshot.*cleanup" --type=py

Length of output: 5998


🏁 Script executed:

#!/bin/bash
# Check the specific file and line 113 to see snapshot_file creation
cat libvirt/tests/src/virtual_disks/virtual_disks_io_error.py | sed -n '110,140p'

Length of output: 1392


Add explicit cleanup for external snapshot file in teardown.

The external snapshot file created at line 113 (vdb.snap) is not being removed in teardown. Although virsh.snapshot_delete() removes snapshot metadata, it does not delete the external disk file. This pattern is consistently handled across related test files (e.g., virtual_disks_nbd.py, npiv_snapshot.py) with explicit os.remove(snapshot_file) calls. Add the following in the teardown_test() function after line 129:

if with_snapshot and os.path.exists(snapshot_file):
    os.remove(snapshot_file)
🤖 Prompt for AI Agents
In libvirt/tests/src/virtual_disks/virtual_disks_io_error.py around lines 128 to
129, the teardown does not remove the external snapshot file created earlier
(vdb.snap), so add explicit cleanup: after the virsh.snapshot_delete call in
teardown_test(), check if with_snapshot is true and if snapshot_file exists,
then remove it (use os.path.exists and os.remove) to ensure the external disk
file is deleted during teardown.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now virsh.snapshot_delete() can delete the external disk file.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for snapshot_delete usage patterns with external snapshots in test files
rg -n -B5 -A5 "snapshot_delete.*with_snapshot|with_snapshot.*snapshot_delete" --type=py -g "libvirt/tests/**"

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for patterns where snapshot files are explicitly removed after snapshot_delete
rg -n -B3 -A3 "snapshot_delete.*\n.*os\.remove|virsh\.snapshot_delete.*\n.*\n.*os\.remove" --type=py -g "libvirt/tests/**" -U

Length of output: 883


🏁 Script executed:

#!/bin/bash
# Look for snapshot_file creation and cleanup patterns in test files
rg -n "snapshot_file.*=.*os\.path\.join|os\.remove.*snapshot" --type=py -g "libvirt/tests/**" -A2 -B2

Length of output: 4700


🏁 Script executed:

#!/bin/bash
# Check the specific files mentioned by CodeRabbit: virtual_disks_nbd.py and npiv_snapshot.py
fd -t f "virtual_disks_nbd.py|npiv_snapshot.py" libvirt/tests/ --exec grep -n -A10 -B10 "snapshot_delete\|os.remove.*snapshot"

Length of output: 5025


@meinaLi, thank you for the clarification! You're absolutely correct.

Looking at the codebase patterns, I can see the key difference:

  • With --metadata flag: virsh.snapshot_delete(vm_name, snap_name, "--metadata") only removes snapshot metadata, leaving external disk files intact. That's why files like virtual_disks_nbd.py and npiv_snapshot.py need explicit os.remove() calls after using this flag.

  • Without --metadata flag (as in your PR): virsh.snapshot_delete(vm_name, snapshot_name) automatically removes both the snapshot metadata AND the external disk files.

Since your implementation doesn't use the --metadata flag, the external snapshot file cleanup is handled automatically by libvirt, making the explicit os.remove() unnecessary. The current implementation is correct.

🐰✨


🧬 Code Graph Analysis Results

libvirt/tests/src/virtual_disks/virtual_disks_io_error.py

  • Snippet: Teardown where external snapshot is deleted

    • Context: Cleanup after tests; uses virsh.snapshot_delete to remove an external disk snapshot if a snapshot was created.
    • Snippet:
      if with_snapshot:
      virsh.snapshot_delete(vm_name, snapshot_name, debug=True)
  • Snippet: External disk file creation via snapshot

    • Context: Creating a disk snapshot with the disk configured as external, pointing to an external file. This is the external disk file whose deletion is being verified by virsh.snapshot_delete.
    • Snippet:
      snapshot_file = os.path.join(data_dir.get_data_dir(), "vdb.snap")
      snap_options = ("%s --diskspec %s,snapshot=external,file=%s --disk-only"
      % (snapshot_name, disk_name, snapshot_file))
      virsh.snapshot_create_as(vm_name, snap_options, debug=True)

✏️ Learnings added
Learnt from: meinaLi
PR: autotest/tp-libvirt#6628
File: libvirt/tests/src/virtual_disks/virtual_disks_io_error.py:128-129
Timestamp: 2025-10-24T06:35:25.658Z
Learning: In libvirt tests, `virsh.snapshot_delete()` without the `--metadata` flag automatically deletes both snapshot metadata and external disk files. Explicit `os.remove()` is only needed when using `virsh.snapshot_delete()` with the `--metadata` flag, which removes only metadata.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: hholoubk
PR: autotest/tp-libvirt#6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (1)

106-110: Fix assertion message (inverted wording).

When the log doesn’t contain the expected message, the failure should say “Not found”.

-        if not libvirt.check_logfile(guest_log_msg, qemu_log, str_in_log=True):
-            test.fail('Find unexpected error:%s in log file:%s' % (guest_log_msg, qemu_log))
+        if not libvirt.check_logfile(guest_log_msg, qemu_log, str_in_log=True):
+            test.fail('Not find expected error: %s in log file: %s' % (guest_log_msg, qemu_log))
🧹 Nitpick comments (5)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (5)

48-57: Remove duplicate prepare step and simplify branching.

process.run(prepare_file, ...) is executed in both branches. Simplify:

-        if enospc_test:
-            process.run(prepare_file, **virsh_dargs)
-        else:
-            process.run(prepare_file, **virsh_dargs)
-            # Find a free loop device
-            global free_loop_dev
-            free_loop_dev = process.run("losetup --find", **virsh_dargs).stdout_text.strip()
+        process.run(prepare_file, **virsh_dargs)
+        if not enospc_test:
+            # Find a free loop device
+            nonlocal free_loop_dev
+            free_loop_dev = process.run("losetup --find", **virsh_dargs).stdout_text.strip()

53-53: Avoid module-level global; keep free_loop_dev scoped to this test.

Use nonlocal and initialize in run() to prevent cross-test leakage.

-            global free_loop_dev
+            nonlocal free_loop_dev

Add in run() near other locals (after Line 158):

# initialize loop device holder for non-enospc path
free_loop_dev = None

92-98: Don’t catch bare Exception; narrow to expected command failures.

Catching all exceptions hides real bugs. Prefer specific exceptions from Avocado/aexpect and re-raise unknowns.

Example:

from avocado.utils import process as av_process
from aexpect import ShellCmdError

try:
    ...
except (av_process.CmdError, ShellCmdError) as e:
    if dd_msg not in str(e):
        test.fail("Expected error '%s' not found in: %s" % (dd_msg, str(e)))
    test.log.debug("Got expected error: %s", e)

131-136: Guard ENOSPC cleanup when cleanup_cmd is unset.

Avoid process.run(None, ...) if the parameter is missing.

-        if enospc_test:
-            process.run(params.get("cleanup_cmd"), **virsh_dargs)
+        if enospc_test:
+            cleanup_cmd = params.get("cleanup_cmd")
+            if cleanup_cmd:
+                process.run(cleanup_cmd, **virsh_dargs)
+            else:
+                test.log.debug("No cleanup_cmd provided; skipping ENOSPC cleanup.")

56-57: Harden host command invocations (shell usage and PATH).

  • Prefer shell=False with argv lists for process.run to avoid quoting/injection issues from params.
  • Consider absolute paths (/usr/sbin/dmsetup, /sbin/losetup) or shutil.which to avoid PATH surprises under test runners.

Example:

# find free loop
free_loop_dev = process.run(["losetup", "--find"], ignore_status=False, shell=False).stdout_text.strip()
# attach
process.run(["losetup", free_loop_dev, device_manager_path], ignore_status=False, shell=False)
# detach/remove
process.run(["losetup", "-d", free_loop_dev], ignore_status=False, shell=False)
process.run(["dmsetup", "remove", device_manager], ignore_status=False, shell=False)

If desired, I can prepare a small patch to split virsh_dargs (for virsh) from host_dargs (for process.run) to default shell=False for host cmds.

Also applies to: 134-136

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a57071e and 9e141ae.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg (1 hunks)
  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-24T06:35:25.658Z
Learnt from: meinaLi
PR: autotest/tp-libvirt#6628
File: libvirt/tests/src/virtual_disks/virtual_disks_io_error.py:128-129
Timestamp: 2025-10-24T06:35:25.658Z
Learning: In libvirt tests, `virsh.snapshot_delete()` without the `--metadata` flag automatically deletes both snapshot metadata and external disk files. Explicit `os.remove()` is only needed when using `virsh.snapshot_delete()` with the `--metadata` flag, which removes only metadata.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py
📚 Learning: 2025-09-24T08:01:27.899Z
Learnt from: hholoubk
PR: autotest/tp-libvirt#6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py
🪛 Ruff (0.14.1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py

62-62: subprocess call: check for execution of untrusted input

(S603)


62-62: Starting a process with a partial executable path

(S607)


92-92: Do not catch blind exception: Exception

(BLE001)


115-115: Undefined name target_disk

(F821)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python 3.8
  • GitHub Check: Python 3.11
  • GitHub Check: Python 3.9
  • GitHub Check: Python 3.12
🔇 Additional comments (1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (1)

128-129: Snapshot cleanup LGTM.

Using virsh.snapshot_delete() without --metadata cleans up metadata and the external file; no extra os.remove() needed. Based on learnings.

Comment on lines +62 to 66
subprocess.run(["sudo", "dmsetup", "create", device_manager],
input=dm_table.encode('utf-8'), stderr=subprocess.PIPE, check=True)
except subprocess.CalledProcessError as e:
test.log.debug("create device manager failed :%s", e.stderr.decode('utf-8'))

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

dmsetup failure should fail the test, not just log.

Currently only logs on CalledProcessError, which can mask setup failure and cause confusing downstream results.

-            except subprocess.CalledProcessError as e:
-                test.log.debug("create device manager failed :%s", e.stderr.decode('utf-8'))
+            except subprocess.CalledProcessError as e:
+                test.error("dmsetup create %s failed: %s"
+                           % (device_manager, e.stderr.decode('utf-8')))
🧰 Tools
🪛 Ruff (0.14.1)

62-62: subprocess call: check for execution of untrusted input

(S603)


62-62: Starting a process with a partial executable path

(S607)

🤖 Prompt for AI Agents
In libvirt/tests/src/virtual_disks/virtual_disks_io_error.py around lines 62 to
66, the subprocess.CalledProcessError from dmsetup creation is currently only
logged which masks setup failures; change the except block to include the error
details and fail the test immediately (e.g., call test.fail(...) with a
descriptive message including e.stderr.decode('utf-8') or re-raise the exception
after logging) so any dmsetup failure stops the test rather than continuing.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (2)

62-65: Fix: dmsetup failure should fail the test immediately.

The CalledProcessError is only logged, which masks setup failures and causes confusing downstream test results. A failed dmsetup create means the test environment is not properly configured.

Apply this diff to fail the test on dmsetup errors:

             except subprocess.CalledProcessError as e:
-                test.log.debug("create device manager failed :%s", e.stderr.decode('utf-8'))
+                test.error("dmsetup create %s failed: %s"
+                           % (device_manager, e.stderr.decode('utf-8')))

111-117: Fix: undefined target_disk, wrong step order, and missing validation.

Three issues:

  1. Critical: target_disk (line 115) may be undefined. The enospc config variant doesn't define target_disk parameter, so params.get("target_disk") at line 156 returns None, causing snapshot creation to use an invalid diskspec.

  2. Minor: TEST_STEP6 is logged before TEST_STEP5, creating confusing step order.

  3. Missing validation: The config structure nests with_snapshot under enospc, but there's no runtime check ensuring snapshots only run with ENOSPC.

Apply this diff to fix all three issues:

         if with_snapshot:
-            test.log.info("TEST_STEP6: Create disk snapshot and check I/O error.")
+            test.log.info("TEST_STEP5: Create disk snapshot and check I/O error.")
+            if not enospc_test:
+                test.error("Snapshot test requires enospc_test=yes per configuration")
+            # Derive target disk from disk_device parameter or disk_dict
+            target_disk_dev = params.get("disk_device") or disk_dict.get("target", {}).get("dev")
+            if not target_disk_dev:
+                test.error("Cannot determine target disk device for snapshot")
             snapshot_file = os.path.join(data_dir.get_data_dir(), "vdb.snap")
             snap_options = ("%s --diskspec %s,snapshot=external,file=%s --disk-only"
-                            % (snapshot_name, target_disk, snapshot_file))
+                            % (snapshot_name, target_disk_dev, snapshot_file))
             virsh.snapshot_create_as(vm_name, snap_options, debug=True)
-        test.log.info("TEST_STEP5: Check about I/O error in virsh domain info.")
+        test.log.info("TEST_STEP6: Check about I/O error in virsh domain info.")
🧹 Nitpick comments (1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (1)

93-99: Simplify redundant conditional logic.

The elif at line 96 is redundant since it checks the exact opposite of line 93's condition.

Apply this diff to simplify:

         except Exception as e:
             if dd_msg not in str(e):
                 test.fail("Write data in guest should produce error: %s in %s"
                           % (dd_msg, str(e)))
-            elif dd_msg in str(e):
+            else:
                 test.log.debug("Get expected error: %s in %s" % (dd_msg, str(e)))
         else:
             test.fail("Get unexpected result!")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e141ae and 6935f62.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg (1 hunks)
  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (4 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-24T06:35:25.658Z
Learnt from: meinaLi
PR: autotest/tp-libvirt#6628
File: libvirt/tests/src/virtual_disks/virtual_disks_io_error.py:128-129
Timestamp: 2025-10-24T06:35:25.658Z
Learning: In libvirt tests, `virsh.snapshot_delete()` without the `--metadata` flag automatically deletes both snapshot metadata and external disk files. Explicit `os.remove()` is only needed when using `virsh.snapshot_delete()` with the `--metadata` flag, which removes only metadata.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py
📚 Learning: 2025-09-24T08:01:27.899Z
Learnt from: hholoubk
PR: autotest/tp-libvirt#6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py
🪛 Ruff (0.14.1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py

62-62: subprocess call: check for execution of untrusted input

(S603)


62-62: Starting a process with a partial executable path

(S607)


92-92: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python 3.12
  • GitHub Check: Python 3.11
  • GitHub Check: Python 3.8
  • GitHub Check: Python 3.9
🔇 Additional comments (5)
libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg (1)

18-35: LGTM! Well-structured ENOSPC test configuration.

The new enospc variant is properly configured with:

  • Appropriate disk setup using a small ext3 filesystem to reliably trigger ENOSPC
  • Complete parameter set including disk_dict, error messages, and cleanup commands
  • Two nested variants (default and with_snapshot) aligning with the test scenarios described in the PR

The cleanup command correctly uses || true to ensure teardown continues even if umount fails.

libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (4)

19-25: LGTM! Necessary imports for ENOSPC functionality.

The new imports support the ENOSPC test flow:

  • data_dir: Used for snapshot file path construction
  • utils_disk: Used for writing data to VM disk
  • libvirt_disk: Used for identifying non-root disks

71-71: LGTM! Appropriate use of nonlocal.

The nonlocal session declaration correctly allows run_test() to modify the session variable for use in teardown.


84-91: LGTM! Proper handling of ENOSPC vs non-ENOSPC test paths.

The conditional logic correctly:

  • Selects a non-root disk for ENOSPC tests via get_non_root_disk_name()
  • Includes a proper empty-list check to avoid IndexError (addresses past comment)
  • Uses appropriate disk writing method for each scenario

128-138: LGTM! Proper conditional teardown for both test variants.

The teardown correctly:

  • Deletes snapshots without --metadata flag, which automatically removes external disk files (per learnings)
  • Applies variant-specific cleanup:
    • ENOSPC: unmounts filesystem and removes backing files
    • Non-ENOSPC: removes loop device, device-mapper, and backing file

Based on learnings.

Automate case:
VIRT-303987 - [io-error] Make disk produce I/O error due to "enospc"
Scenario 1: Make disk produce I/O error due to "enospc", then check the
error in "virsh event", vm log and "virsh dominfo".
Scenario 2: Based on scenario 1, create disk snapshot, then query domain
info again.

Signed-off-by: Meina Li <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (2)

62-65: dmsetup failure should fail the test, not just log.

Currently only logs on CalledProcessError, which can mask setup failure and cause confusing downstream results.

 except subprocess.CalledProcessError as e:
-    test.log.debug("create device manager failed :%s", e.stderr.decode('utf-8'))
+    test.error("dmsetup create %s failed: %s"
+               % (device_manager, e.stderr.decode('utf-8')))

111-116: Validate both target_disk and snapshot_name parameters before snapshot creation.

The concern is valid: target_disk and snapshot_name are both obtained via params.get() without defaults (lines 157, 159), and both are used directly at lines 114-115 in the virsh command construction. If either is missing when with_snapshot=yes, the command will be invalid. Add validation for both parameters.

 if with_snapshot:
+    if not target_disk or not snapshot_name:
+        test.error("with_snapshot=yes requires target_disk and snapshot_name parameters")
     test.log.info("TEST_STEP: Create disk snapshot and check I/O error.")
🧹 Nitpick comments (1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (1)

48-51: Clarify duplicate prepare_file execution.

Both the enospc_test and non-enospc branches execute the same prepare_file command. If this is intentional, consider moving it before the conditional to make the intent clear. If the two branches should run different commands, update one of them accordingly.

If both branches need the same preparation, refactor to:

+    process.run(prepare_file, **virsh_dargs)
     if enospc_test:
-        process.run(prepare_file, **virsh_dargs)
+        pass  # enospc-specific setup can go here if needed
     else:
-        process.run(prepare_file, **virsh_dargs)
         # Find a free loop device
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6935f62 and fe4146b.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg (1 hunks)
  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_io_error.cfg
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-24T06:35:25.658Z
Learnt from: meinaLi
PR: autotest/tp-libvirt#6628
File: libvirt/tests/src/virtual_disks/virtual_disks_io_error.py:128-129
Timestamp: 2025-10-24T06:35:25.658Z
Learning: In libvirt tests, `virsh.snapshot_delete()` without the `--metadata` flag automatically deletes both snapshot metadata and external disk files. Explicit `os.remove()` is only needed when using `virsh.snapshot_delete()` with the `--metadata` flag, which removes only metadata.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py
📚 Learning: 2025-09-24T08:01:27.899Z
Learnt from: hholoubk
PR: autotest/tp-libvirt#6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_io_error.py
🪛 Ruff (0.14.1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py

62-62: subprocess call: check for execution of untrusted input

(S603)


62-62: Starting a process with a partial executable path

(S607)


92-92: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python 3.9
  • GitHub Check: Python 3.12
  • GitHub Check: Python 3.11
  • GitHub Check: Python 3.8
🔇 Additional comments (1)
libvirt/tests/src/virtual_disks/virtual_disks_io_error.py (1)

84-89: Well done addressing the IndexError concern.

The code now properly checks if disk_names is empty before indexing, and fails with a clear message. This resolves the previous concern about potential IndexError.

@meinaLi
Copy link
Contributor Author

meinaLi commented Oct 24, 2025

# avocado run --vt-type libvirt --vt-omit-data-loss --vt-machine-type q35 virtual_disks.io_error
JOB ID     : 95f24d5ed3ca5264a6962a2b0dd76c5e923fc27f
JOB LOG    : /var/log/avocado/job-results/job-2025-10-24T02.42-95f24d5/job.log
 (1/3) type_specific.io-github-autotest-libvirt.virtual_disks.io_error.hypervisor_message: STARTED
 (1/3) type_specific.io-github-autotest-libvirt.virtual_disks.io_error.hypervisor_message: PASS (48.23 s)
 (2/3) type_specific.io-github-autotest-libvirt.virtual_disks.io_error.enospc.default: STARTED
 (2/3) type_specific.io-github-autotest-libvirt.virtual_disks.io_error.enospc.default: PASS (112.81 s)
 (3/3) type_specific.io-github-autotest-libvirt.virtual_disks.io_error.enospc.with_snapshot: STARTED
 (3/3) type_specific.io-github-autotest-libvirt.virtual_disks.io_error.enospc.with_snapshot: PASS (112.98 s)
RESULTS    : PASS 3 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML   : /var/log/avocado/job-results/job-2025-10-24T02.42-95f24d5/results.html
JOB TIME   : 277.77 s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant