nvme_driver: templatize queue handler in the nvme driver to allow aer handling with low overhead #2138

gurasinghMS · 2025-10-10T04:43:27Z

Every QueueHandler now stores an aer_handler. The aer_handler implements the new AerHandler trait and can either be an AdminAerHandler which handles aer commands and communication properly or it can be a NoOpAerHandler that does nothing. The implementation used is determined at the time of QueuePair creation by the NvmeDriver which provides the is_admin boolean value. The bool is persisted as part of the QueuePair state so that the correct implementation is used post restore.

In the admin queue path (using AdminAerHandler), the NvmeDriver no longer drives the AER loop; it is instead handled by the QueueHandler. When looking for commands to process (in the recv channel), the admin QueueHandler prioritizes sending an AER before processing commands. So when the run loop starts, the QueueHandler automatically sends an AER as the first command, and subsequent AER commands are prioritized whenever an AER isn't already pending.
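A minimal sketch of the prioritization described above, assuming a simplified synchronous model. `Submission`, `AdminState`, and `next_submission` are hypothetical names for illustration and are not the driver's actual types:

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for whatever the queue handler submits next.
#[derive(Debug, PartialEq)]
enum Submission {
    Aer,
    Command(u32),
}

/// Hypothetical admin-side state: is an AER currently outstanding at the controller?
struct AdminState {
    aer_pending: bool,
}

impl AdminState {
    /// An AER should be sent whenever none is currently outstanding.
    fn poll_send_aer(&self) -> bool {
        !self.aer_pending
    }
}

/// Picks the next submission: an AER first if one is needed,
/// otherwise the next command waiting in the receive queue.
fn next_submission(state: &mut AdminState, recv: &mut VecDeque<u32>) -> Option<Submission> {
    if state.poll_send_aer() {
        state.aer_pending = true;
        return Some(Submission::Aer);
    }
    recv.pop_front().map(Submission::Command)
}
```

In this model the first call always yields an AER, and another AER is only issued once the previous one has completed and `aer_pending` has been cleared.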

AER commands are always modeled as detached RPC calls, and the AdminAerHandler just scans every completion on the admin queue while awaiting the AEN. This is not much of a performance deficit because such scanning only happens on the admin queue. The performance of IO queues is not impacted.
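The completion scan can be pictured roughly as follows. This is a sketch under assumptions: `Completion` and `AdminAerScanner` are simplified, hypothetical types, while the field names `await_aen_cid` and `last_aen` are borrowed from the saved-state code quoted later in this PR:

```rust
/// Hypothetical, simplified completion queue entry.
struct Completion {
    cid: u16,
    dw0: u32,
}

/// Hypothetical scanner state for the admin queue.
struct AdminAerScanner {
    /// Command id of the outstanding AER, if any.
    await_aen_cid: Option<u16>,
    /// Raw dw0 of the most recent AEN that nobody has consumed yet.
    last_aen: Option<u32>,
}

impl AdminAerScanner {
    /// Checks one completion. Returns true if it was the awaited AER
    /// completion (and was consumed here), false for any other command.
    fn handle_completion(&mut self, completion: &Completion) -> bool {
        if self.await_aen_cid != Some(completion.cid) {
            return false;
        }
        self.await_aen_cid = None;
        self.last_aen = Some(completion.dw0);
        true
    }
}
```

The per-completion cost is a single comparison against the awaited cid, which is consistent with the claim that the scan is cheap and confined to the admin queue.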

IO QueueHandlers are provided the NoOpAerHandler, which only has empty functions. These empty functions should be compiled away, and thus the IO implementation should remain exactly as performant as it is today (thanks to @alandau for the insights here). This is especially true for poll_send_aer, which sits on the hot/critical path. Since it uses the #[inline] attribute and always returns false, the entire if check should be compiled away for all IO queues. (We don't need to inline the other functions because they don't actually sit on the critical path and won't be hit on IO queues anyway.)
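A rough sketch of the static-dispatch idea, assuming a trait with a default no-op method and a handler that is generic over it. The `step` method and `aers_sent` counter are hypothetical and only exist to make the effect observable; the real QueueHandler looks different:

```rust
/// Trait whose default method is the no-op behavior.
trait AerHandler {
    /// Default: never ask for an AER to be sent.
    #[inline]
    fn poll_send_aer(&self) -> bool {
        false
    }
}

/// Zero-sized handler for IO queues; it inherits the default no-op.
struct NoOpAerHandler;
impl AerHandler for NoOpAerHandler {}

/// Simplified admin handler that wants an AER whenever none is pending.
struct AdminAerHandler {
    aer_pending: bool,
}
impl AerHandler for AdminAerHandler {
    #[inline]
    fn poll_send_aer(&self) -> bool {
        !self.aer_pending
    }
}

/// Hypothetical queue handler, monomorphized per handler type.
struct QueueHandler<A: AerHandler> {
    aer_handler: A,
    aers_sent: u32,
}

impl<A: AerHandler> QueueHandler<A> {
    /// One loop iteration. For `NoOpAerHandler` the condition is a
    /// compile-time constant `false`, so the optimizer can drop the branch.
    fn step(&mut self) {
        if self.aer_handler.poll_send_aer() {
            self.aers_sent += 1;
        }
    }
}
```

Because `QueueHandler<NoOpAerHandler>` and `QueueHandler<AdminAerHandler>` are separate monomorphized types, no `Box<dyn AerHandler>` or runtime flag is needed on the IO path. Whether the branch is actually removed still depends on the optimizer, which is why inspecting the generated assembly is a reasonable check.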

gurasinghMS · 2025-10-10T17:55:20Z

vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

 }

+pub struct NoOpAerHandler;
+impl AerHandler for NoOpAerHandler {}


For the reviewers: Should we be adding a panic in here somewhere to make sure that functions like handle_aen( ) are never invoked?

The functions will be invoked, but the invocation will be inlined (well, modulo the Box<dyn> thing). If you put a panic! here, it will panic.

That's right! I was wondering if we should panic if this function is called. As in, if we end up in a situation where the driver is sending GetAen commands to an IOQueue, something has gone horribly wrong somewhere ....

Ah, sorry, missed that you're talking about handle_aer_request (as opposed to, say, poll_send_aer).
The function can be called only if the hardware decides to send an AEN without an explicit AER, and this is out of spec, right? If we don't anticipate buggy hardware (physical or virtualized), or we want to catch these bugs at the expense of a panic, this looks like a good idea. But I don't know if that's the policy for OpenVMM. As an alternative, we can log a message (that's likely to be ignored).

if the intention is that these functions should never be invoked then a panic is the right thing to do.

I added 2 panics in the NoOp handler code for the functions on the non-critical path, i.e. these functions are only ever invoked if the driver requests an AEN or if the IO queue tries to send an AER, both of which should NEVER happen. This should be more of a failsafe.
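As an illustration of that failsafe, here is a sketch in which the hot-path method stays a cheap constant while the two non-critical-path methods panic. The method names `update_awaiting_cid` and `handle_aen_request` are hypothetical placeholders and may not match the names used in queue_pair.rs:

```rust
/// Simplified trait; the two non-hot-path method names are assumptions.
trait AerHandler {
    fn poll_send_aer(&self) -> bool;
    fn update_awaiting_cid(&mut self, cid: u16);
    fn handle_aen_request(&mut self);
}

struct NoOpAerHandler;

impl AerHandler for NoOpAerHandler {
    /// Hot path: always false, so an IO queue never tries to send an AER.
    #[inline]
    fn poll_send_aer(&self) -> bool {
        false
    }

    /// Reached only if an IO queue somehow sent an AER.
    fn update_awaiting_cid(&mut self, cid: u16) {
        panic!("IO queue must never send an AER (cid {})", cid);
    }

    /// Reached only if the driver asked an IO queue for an AEN.
    fn handle_aen_request(&mut self) {
        panic!("IO queue must never receive an AEN request");
    }
}
```

The panics are not expected to fire in a correct driver; they turn a silent logic error into an immediate, visible failure.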

Copilot

Pull Request Overview

This PR templatizes the NVMe queue handler to optimize AER (Asynchronous Event Request) handling by introducing handler-specific behavior for admin vs IO queues. The key improvement is moving AER handling from the driver level to the queue level while maintaining performance for IO operations.

Introduces an AerHandler trait with AdminAerHandler for admin queues and NoOpAerHandler for IO queues
Moves AER command management from the NvmeDriver to the AdminAerHandler within queue processing
Uses templating and inlining to ensure IO queue performance remains unaffected

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
queue_pair.rs	Adds AER handler trait system, implements admin and no-op handlers, integrates handler into queue processing loop
driver.rs	Updates queue creation to specify admin/IO type, modifies AER handling to use new queue-level API

vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

Co-authored-by: Copilot <[email protected]>

github-actions · 2025-10-10T20:10:19Z

At least one Petri test failed.

alandau · 2025-10-10T20:35:32Z

vm/devices/storage/disk_nvme/nvme_driver/src/driver.rs

        #[mesh(6)]
        pub handler_data: QueueHandlerSavedState,
+        #[mesh(7)]
+        pub is_admin: Option<()>,


nit: Why is this an Option instead of a plain bool?

See https://openvmm.dev/guide/dev_guide/contrib/save-state.html about safely adding fields to saved state. I bet that's what led Guramrit to make this an Option. I do agree, this should be an Option<bool> rather than an Option<()>.

I think there are a few ways to go forward here:

1. The right way: if is_admin is Some(...), then we can trust what it says. Otherwise, we need the code in the restore paths to tell the QueueHandler if it's an admin queue or not. Perhaps this can be handled in the top level restore of the NVMe controller.

2. The okay way: detect that we're the admin queue at the time of issuing first AEN, and reconfigure ourselves for that.

3. Ignore this problem, since we aren't going to ever save nvme driver without using keepalive, and we're not going to use keepalive until after all this code is in.

I actually think (1) is easier than (2), but let me know what you all think.

I thought I was being so smart using Option<()> instead of a bool hehe!! Made it an option because of the saved state guidelines that Matt linked. Will make this an Option<bool> instead!
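The first approach from the thread above (trust the saved value when present, otherwise let the restore path decide) could look roughly like this. It is only a sketch: `QueuePairSavedState` is reduced to the one field under discussion, the mesh/Protobuf attributes are omitted, and `resolve_is_admin` is a hypothetical helper:

```rust
/// Reduced saved state; real saved state carries many more fields.
struct QueuePairSavedState {
    /// `None` when the state was written by a driver that predates the field.
    is_admin: Option<bool>,
}

/// Trust the saved flag if it exists; otherwise fall back to what the
/// restoring caller knows about the queue (hypothetical `caller_hint`).
fn resolve_is_admin(saved: &QueuePairSavedState, caller_hint: bool) -> bool {
    saved.is_admin.unwrap_or(caller_hint)
}
```

Using `Option<bool>` keeps three states distinguishable (admin, not admin, unknown), which `Option<()>` cannot do without overloading `None`.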

gurasinghMS · 2025-10-10T23:40:38Z

I updated the code to use concrete types for the QueueHandler to allow for compiler optimizations in the QueueHandler::run() function. @mattkur @alandau I did have to remove the Option<()> from the saved state of the QueuePair as it is no longer needed. The driver now invokes the appropriate type for the QueuePair<T> during both new() and restore(). Let me know what you think.
I am a little concerned about what happens when the controller returns an error for an AER command. If this is unchecked on the driver side, we end up in a vicious loop where the driver just hammers the controller with AER commands. For now I added a failed state: the handler will stop processing AERs upon receiving the first failure (this is exactly what we do today). I will need to think about this case more and maybe add some sort of throttling mechanism.
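A small sketch of how such a failed state can break the resubmission loop, assuming a simplified handler where the completion status is a plain integer and zero means success. The type and method signatures here are illustrative, not the driver's:

```rust
/// Simplified admin handler with a sticky failure flag.
struct AdminAerHandler {
    failed: bool,
    aer_pending: bool,
    last_aen: Option<u32>,
}

impl AdminAerHandler {
    /// Never request another AER once a failure has been seen.
    fn poll_send_aer(&self) -> bool {
        !self.failed && !self.aer_pending
    }

    /// Called after an AER has been submitted to the controller.
    fn on_aer_sent(&mut self) {
        self.aer_pending = true;
    }

    /// Called when the outstanding AER completes.
    fn on_aer_completion(&mut self, status: u16, dw0: u32) {
        self.aer_pending = false;
        if status != 0 {
            // On error, clean up and stop issuing AERs.
            self.failed = true;
            self.last_aen = None;
            return;
        }
        self.last_aen = Some(dw0);
    }
}
```

Without the `failed` check, an erroring controller would make `poll_send_aer` return true again right after every failed completion, which is the hammering loop described above.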

github-actions · 2025-10-11T00:36:19Z

At least one Petri test failed.

gurasinghMS · 2025-10-11T01:06:22Z

vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

+                last_aen,
+                await_aen_cid,
+            } = state;
+            self.last_aen = last_aen.map(AsynchronousEventRequestDw0::from_bits); // Restore from u32


I am wondering if saving this as a u32 makes sense here. Should we instead be updating the type to allow saving it directly?

gurasinghMS · 2025-10-13T21:33:34Z

In order to verify that the poll_send_aer() function is indeed being compiled out of the driver code, I built and inspected the crate assembly using:

cargo build --release --package nvme_driver && cargo rustc --package nvme_driver --lib -- --emit=asm
rustfilt < nvme_driver-9da3de0345bb5b66.s > ../../demangled_output_release.s
grep -n "poll_send_aer" demangled_output_release.s

Output shows that no function is even created for the NoOpAerHandler type. This should mean that things are properly compiled out:

alandau · 2025-10-13T21:58:25Z

In order to verify that the poll_send_aer() function is indeed being compiled out of the driver code, I built and inspected the crate assembly using:
cargo build --release --package nvme_driver && cargo rustc --package nvme_driver --lib -- --emit=asm
rustfilt < nvme_driver-9da3de0345bb5b66.s > ../../demangled_output_release.s
grep -n "poll_send_aer" demangled_output_release.s

I think this pretty much proves it. That said, I'd look at the disassembly of the compiled executable (with symbols) to find the function that's supposed to call poll_send_aer and see that no call happens (and nothing is inlined in its place).

alandau

The bits around making QueuePair generic look good to me, thanks for addressing the comments.

vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

alandau · 2025-10-13T22:21:01Z

vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

 }

+pub struct NoOpAerHandler;
+impl AerHandler for NoOpAerHandler {}


Ah, sorry, missed that you're talking about handle_aer_request (as opposed to, say, poll_send_aer).
The function can be called only if the hardware decides to send an AEN without an explicit AER, and this is out of spec, right? If we don't anticipate buggy hardware (physical or virtualized), or we want to catch these bugs at the expense of a panic, this looks like a good idea. But I don't know if that's the policy for OpenVMM. As an alternative, we can log a message (that's likely to be ignored).

gurasinghMS · 2025-10-13T22:53:09Z

Ah, sorry, missed that you're talking about handle_aer_request (as opposed to, say, poll_send_aer).
The function can be called only if the hardware decides to send an AEN without an explicit AER, and this is out of spec, right? If we don't anticipate buggy hardware (physical or virtualized), or we want to catch these bugs at the expense of a panic, this looks like a good idea. But I don't know if that's the policy for OpenVMM. As an alternative, we can log a message (that's likely to be ignored).

If a buggy AEN is sent to an IO queue, the code will just panic when it tries to remove the entry from the pending_commands list, because that cid was never sent to the device.

github-actions · 2025-10-13T23:49:56Z

At least one Petri test failed.

chris-oo · 2025-10-14T20:21:05Z

vm/devices/storage/disk_nvme/nvme_driver/src/driver.rs

+
+    #[derive(Clone, Debug, Protobuf)]
+    #[mesh(package = "nvme_driver")]
+    pub struct AerHandlerSavedState {


This looks fine to me but probably needs review from matt?

What happens when we go to a version of the driver that does not support this? Or when we come from a version of the driver that didn't support this?

This is saved as an Option<>:

    pub struct QueueHandlerSavedState {
        #[mesh(1)]
        pub sq_state: SubmissionQueueSavedState,
        #[mesh(2)]
        pub cq_state: CompletionQueueSavedState,
        #[mesh(3)]
        pub pending_cmds: PendingCommandsSavedState,
        #[mesh(4)]
        pub aer_handler: Option<AerHandlerSavedState>,
    }

I am not sure if we go backwards with the versions (i.e. to a driver that doesn't support this) but if we go forward (i.e. coming from a version of a driver that didn't support this), during restore() the admin aer handler would do nothing and just start from a new state:

    fn restore(&mut self, state: &Option<AerHandlerSavedState>) {
        if let Some(state) = state {
            let AerHandlerSavedState {
                last_aen,
                await_aen_cid,
            } = state;
            self.last_aen = last_aen.map(AsynchronousEventRequestDw0::from_bits); // Restore from u32
            self.await_aen_cid = *await_aen_cid;
        }
    }

We discussed this yesterday. Since this hasn't yet shipped, I don't think we need to think too hard about this. But:

new -> old: no worse than now. AER will still be pended in the device (or not), but that's the existing behavior anyways.
old -> new: no worse than now. AER will still be pended in the device, but as Guramrit mentions, the new driver won't know. (which is the same as existing behavior)

vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

chris-oo · 2025-10-14T20:22:52Z

vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

+    /// Returns whether an AER needs to be sent to the controller or not. Since
+    /// this is the only function on the critical path, attempt to inline it.
+    #[inline]
+    fn poll_send_aer(&self) -> bool {


Are default impls the right thing to do here, or do you want to force trait implementers to implement the no-ops themselves? It seems like it should be the latter?

Personally, I can see this going either way and I don't have a strong opinion on this. In my head, I was treating the default implementation of the trait as the NoOp-Handler (indicated this in the trait comment as well). Happy to go either way on this one

mattkur

few minor comments, but looks good. Thanks Guramrit!

mattkur · 2025-10-15T18:29:02Z

vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

+            // If error, cleanup and stop processing AENs.
+            if completion.status.status() != 0 {
+                self.failed = true;
+                self.last_aen = None;
+                return;
+            }


nit: is an error logged in this case?

mattkur · 2025-10-15T19:01:05Z

vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

+            // Complete the AEN or pend it.
+            let aen = AsynchronousEventRequestDw0::from_bits(completion.dw0);
+            if let Some(send_aen) = self.send_aen.take() {
+                send_aen.complete(aen);
+            } else {
+                self.last_aen = Some(aen);
+            }


nit: please add a comment that explains why it's safe to delay sending the AER here. AER will be sent before the command to get log page (which is what clears the bit in the driver that tells the device to send an AER).
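The complete-or-pend logic in the snippet above can be modeled with a standard channel. This is a sketch under assumptions: the real code completes an RPC, whereas here `send_aen` is a hypothetical `Option<Sender<u32>>`, the AEN is a raw `u32`, and `request_aen` is an illustrative name for the driver-side request:

```rust
use std::sync::mpsc::Sender;

/// Simplified handler: either a requester is waiting, or an AEN is parked.
struct AdminAerHandler {
    /// Waiting requester, if the driver asked before an AEN arrived.
    send_aen: Option<Sender<u32>>,
    /// Parked AEN, if one arrived before anyone asked.
    last_aen: Option<u32>,
}

impl AdminAerHandler {
    /// An AEN arrived from the controller: complete the waiter or pend it.
    fn on_aen(&mut self, aen: u32) {
        if let Some(tx) = self.send_aen.take() {
            let _ = tx.send(aen);
        } else {
            self.last_aen = Some(aen);
        }
    }

    /// The driver asks for the next AEN: hand over a parked one or wait.
    fn request_aen(&mut self, tx: Sender<u32>) {
        if let Some(aen) = self.last_aen.take() {
            let _ = tx.send(aen);
        } else {
            self.send_aen = Some(tx);
        }
    }
}
```

Either ordering of arrival and request delivers the AEN exactly once in this model; this sketch does not model a second AEN arriving while one is still parked.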

gurasinghMS and others added 14 commits October 6, 2025 10:33

Very jank v1 implementation

96fb229

Responding to comments from matt

4c33530

Merge branch 'main' into move-aer-implementation

60245cb

Minor fixes

1d1ad1e

Now using the updated pending commands implementation

3cde1c3

Merge branch 'main' into move-aer-implementation

f322a88

some more cleanup

1a2c16f

Small fix for clippy

3caa63e

Merge branch 'main' into move-aer-implementation

88ac1c3

Merge branch 'main' into move-aer-implementation

e29c243

No longer using the modified pending commands structureg

886dade

Created the run_admin function

5db3b9d

Using a template for the AER implementation

b967fef

Merge branch 'main' into template-aer-implementation

816dbaf

gurasinghMS commented Oct 10, 2025


Added some comments

e0e7c80

gurasinghMS changed the title ~~wip: templatize queue handler in the nvme driver to allow aer handling with low overhead~~ nvme_driver: templatize queue handler in the nvme driver to allow aer handling with low overhead Oct 10, 2025

Merge branch 'main' into template-aer-implementation

7df4cad

gurasinghMS marked this pull request as ready for review October 10, 2025 18:34

gurasinghMS requested review from a team as code owners October 10, 2025 18:34

Copilot AI review requested due to automatic review settings October 10, 2025 18:34

Copilot AI reviewed Oct 10, 2025


gurasinghMS mentioned this pull request Oct 10, 2025

nvme_driver: move logic for aer processing to the queue handler instead of the driver #2095

Closed

gurasinghMS and others added 4 commits October 10, 2025 11:41

Update vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

cdd4ab4

Co-authored-by: Copilot <[email protected]>

Update vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

7d80550

Co-authored-by: Copilot <[email protected]>

Update vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs

e58d6f0

Co-authored-by: Copilot <[email protected]>

Remove unused imports

bff5745

alandau reviewed Oct 10, 2025


gurasinghMS and others added 3 commits October 10, 2025 16:25

fmt fix

013eb35

Remove changes from the controller. Those are not required

08d4b32

Merge branch 'main' into template-aer-implementation

30f508e

gurasinghMS added 2 commits October 10, 2025 17:42

Fixed the failing test

ea1bd61

const values are placed properly in queue pair file

bfd4f50

gurasinghMS commented Oct 11, 2025


gurasinghMS and others added 4 commits October 10, 2025 18:14

Some minor fixes

c73006f

Fixed the build issue. I think

5ee0031

Merge branch 'main' into template-aer-implementation

5dcdb40

Removing random blank lines

381c13e

alandau reviewed Oct 13, 2025


Merge branch 'main' into template-aer-implementation

7b55092

gurasinghMS added 2 commits October 14, 2025 09:48

Merge branch 'main' into template-aer-implementation

12589aa

Merge branch 'main' into template-aer-implementation

ca61bd9

alandau previously approved these changes Oct 14, 2025


chris-oo reviewed Oct 14, 2025


Added comments for the AerHandler trait and the AdminAER handler

b58fb31

gurasinghMS dismissed alandau’s stale review via b58fb31 October 14, 2025 20:51

gurasinghMS and others added 4 commits October 14, 2025 14:01

Fixing fmt issue

c98c58d

Added panics for the non-critical path functions in the no-op handler

0a53861

Fixing fmt

4a47a1a

Merge branch 'main' into template-aer-implementation

62f7b3b

mattkur approved these changes Oct 15, 2025


gurasinghMS merged commit 0e50fcf into microsoft:main Oct 15, 2025
50 checks passed
