smalis-msft

The intention behind this change is to provide better crash stacks and dumps in ICMs. By panicking at the point of error construction, rather than after the error has bubbled up to a higher layer, we can capture the stack context around the cause of the failure and provide a meaningful stack to Watson (instead of blaming the HaltRequest::Panic handler for all of these different error sources). In essence, this simulates 'unwrapping' these errors.

However, the ability to keep a VM 'alive' for inspection after a failure is still useful, and we want to preserve it. Therefore, this change makes the new behavior configurable.

@Copilot Copilot AI review requested due to automatic review settings August 25, 2025 19:39
@smalis-msft smalis-msft requested a review from a team as a code owner August 25, 2025 19:39

@Copilot Copilot AI left a comment

Pull Request Overview

This RFC introduces a new fatal_error method to the CpuIo trait that allows configurable handling of fatal VM errors. Instead of using generic VpHaltReason variants, errors now trigger immediate panics (for better crash stacks) or debug breaks (for VM inspection), based on policy configuration.

Key changes:

  • Adds CpuIo::fatal_error method with configurable FatalErrorPolicy
  • Removes generic VpHaltReason variants (InvalidVmState, Hypervisor, EmulationFailure)
  • Updates all error handling sites to use the new fatal_error method
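The shape of the change summarized above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: only the names CpuIo, fatal_error, and FatalErrorPolicy come from the description. The policy variants, the method signature, the HaltReason enum, the Adapter type, and the EmulationFailure error are all assumptions made up for the example.

```rust
use std::error::Error;
use std::fmt;

/// Assumed variants; the PR only names the `FatalErrorPolicy` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FatalErrorPolicy {
    /// Panic immediately so the crash stack points at the failing call site.
    Panic,
    /// Stop the VP but keep the VM alive for inspection.
    DebugBreak,
}

/// Hypothetical stand-in for the halt reason handed back to the VP run loop.
#[derive(Debug, PartialEq, Eq)]
pub enum HaltReason {
    DebugBreak(String),
}

/// Simplified stand-in for the real trait; the signature is an assumption.
pub trait CpuIo {
    fn fatal_error(&self, error: Box<dyn Error + Send + Sync>) -> HaltReason;
}

/// Hypothetical error type used only for this example.
#[derive(Debug)]
pub struct EmulationFailure(pub String);

impl fmt::Display for EmulationFailure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "emulation failure: {}", self.0)
    }
}

impl Error for EmulationFailure {}

/// Hypothetical adapter that carries the configured policy.
pub struct Adapter {
    pub policy: FatalErrorPolicy,
}

impl CpuIo for Adapter {
    fn fatal_error(&self, error: Box<dyn Error + Send + Sync>) -> HaltReason {
        match self.policy {
            // Panicking here, in the frame that detected the error, keeps the
            // originating stack in the crash dump.
            FatalErrorPolicy::Panic => panic!("fatal VP error: {error}"),
            // Otherwise report a halt and leave the VM available for debugging.
            FatalErrorPolicy::DebugBreak => HaltReason::DebugBreak(error.to_string()),
        }
    }
}
```

In this sketch, a call site that previously returned a generic halt reason would instead call `dev.fatal_error(Box::new(err))` and return the result, so the decision to panic or to break is made in one place according to the configured policy.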

Reviewed Changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| vmm_core/vmm_core_defs/src/lib.rs | Removes InvalidVmState and VpError halt reason variants |
| vmm_core/virt/src/io.rs | Adds new fatal_error method to CpuIo trait |
| vmm_core/virt/src/generic.rs | Removes generic error halt reason variants |
| vmm_core/src/vmotherboard_adapter.rs | Implements fatal_error with configurable policy |
| vmm_core/src/partition_unit/vp_set.rs | Removes handling for removed halt reason variants |
| Multiple virt_* files | Updates error handling to use dev.fatal_error() |
| openhcl/underhill_core/src/worker.rs | Configures fatal error policy and removes panic handling |
