-
Notifications
You must be signed in to change notification settings - Fork 156
nvme_driver: templatize queue handler in the nvme driver to allow aer handling with low overhead #2138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nvme_driver: templatize queue handler in the nvme driver to allow aer handling with low overhead #2138
Conversation
} | ||
|
||
pub struct NoOpAerHandler; | ||
impl AerHandler for NoOpAerHandler {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the reviewers: Should we be adding a panic in here somewhere to make sure that functions like handle_aen( ) are never invoked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functions will be invoked, but the invocation will be inlined (well, modulo the Box<dyn>
thing). If you put a panic!
here, it will panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right! I was wondering if we should panic if this function is called. As in, if we end up in a situation where the driver is sending GetAen
commands to an IOQueue, something has gone horribly wrong somewhere ....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry, missed that you're talking about handle_aer_request
(as opposed to, say, poll_send_aer
).
The function can be called only if the hardware devices to send an AEN without an explicit AER, and this is out of spec, right? If we don't anticipate buggy hardware (physical or virtualized), or we want to catch these bugs at the expense of a panic, this looks like a good idea. But I don't know if that's the policy for OpenVMM. As an alternative, we can log a message (that's likely to be ignored)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the intention is that these functions should never be invoked then a panic is the right thing to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added 2 panics in the NoOp handler code for the functions on the non-critical path. i.e. these functions are only ever invoked if the driver requests for an AEN or if the IO Queue tries to send an AER .... both of which should NEVER happen. This should be more of a failsafe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR templatizes the NVMe queue handler to optimize AER (Asynchronous Event Request) handling by introducing handler-specific behavior for admin vs IO queues. The key improvement is moving AER handling from the driver level to the queue level while maintaining performance for IO operations.
- Introduces an
AerHandler
trait withAdminAerHandler
for admin queues andNoOpAerHandler
for IO queues - Moves AER command management from the NvmeDriver to the AdminAerHandler within queue processing
- Uses templating and inlining to ensure IO queue performance remains unaffected
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
queue_pair.rs | Adds AER handler trait system, implements admin and no-op handlers, integrates handler into queue processing loop |
driver.rs | Updates queue creation to specify admin/IO type, modifies AER handling to use new queue-level API |
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
#[mesh(6)] | ||
pub handler_data: QueueHandlerSavedState, | ||
#[mesh(7)] | ||
pub is_admin: Option<()>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Why is this an Option
instead of a plain bool
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://openvmm.dev/guide/dev_guide/contrib/save-state.html about safely adding fields to saved state. I bet that's what led Guramrit to make this an Option. I do agree, this should be an Option<bool>
rather than an Option<()>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are two ways to go forward here:
-
The right way: if
is_admin
isSome(...)
, then we can trust what it says. Otherwise, we need the code in the restore paths to tell the QueueHandler if it's an admin queue or not. Perhaps this can be handled in the top levelrestore
of the NVMe controller. -
The okay way: detect that we're the admin queue at the time of issuing first AEN, and reconfigure ourselves for that.
-
Ignore this problem, since we aren't going to ever save nvme driver without using keepalive, and we're not going to use keepalive until after all this code is in.
I actually think (1) is easier than (2), but let me know what you all think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I was being so smart using Option<()> instead of a bool hehe!! Made it an option because of the saved state guidelines that matt linked. Will make this an Option instead!
I updated the code to use concrete types for the QueueHandler to allow for compiler optimizations in the QueueHandler::run() function. @mattkur @alandau I did have to remove the |
last_aen, | ||
await_aen_cid, | ||
} = state; | ||
self.last_aen = last_aen.map(AsynchronousEventRequestDw0::from_bits); // Restore from u32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if saving this as a u32 makes sense here. Should we instead be updating the type to allow saving it directly?
I think this pretty much proves it. That said, I'd look at the disassembly of the compiled executable (with symbols) to find the function that's supposed to call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bits around making QueuePair
generic look good to me, thanks for addressing the comments.
} | ||
|
||
pub struct NoOpAerHandler; | ||
impl AerHandler for NoOpAerHandler {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry, missed that you're talking about handle_aer_request
(as opposed to, say, poll_send_aer
).
The function can be called only if the hardware devices to send an AEN without an explicit AER, and this is out of spec, right? If we don't anticipate buggy hardware (physical or virtualized), or we want to catch these bugs at the expense of a panic, this looks like a good idea. But I don't know if that's the policy for OpenVMM. As an alternative, we can log a message (that's likely to be ignored)
If a buggy AEN is sent to an IO queue, the code will just panic when it tries removing from the pending_commands list because the cid was never sent to the device |
|
||
#[derive(Clone, Debug, Protobuf)] | ||
#[mesh(package = "nvme_driver")] | ||
pub struct AerHandlerSavedState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine to me but probably needs review from matt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens when we go to an version of the driver that does not support this? or we come from a version of a driver that didn't support this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is saved as an Option<>:
pub struct QueueHandlerSavedState {
#[mesh(1)]
pub sq_state: SubmissionQueueSavedState,
#[mesh(2)]
pub cq_state: CompletionQueueSavedState,
#[mesh(3)]
pub pending_cmds: PendingCommandsSavedState,
#[mesh(4)]
pub aer_handler: Option<AerHandlerSavedState>,
}
I am not sure if we go backwards with the versions (i.e. to a driver that doesn't support this) but if we go forward (i.e. coming from a version of a driver that didn't support this), during restore()
the admin aer handler would do nothing and just start from a new
state:
fn restore(&mut self, state: &Option<AerHandlerSavedState>) {
if let Some(state) = state {
let AerHandlerSavedState {
last_aen,
await_aen_cid,
} = state;
self.last_aen = last_aen.map(AsynchronousEventRequestDw0::from_bits); // Restore from u32
self.await_aen_cid = *await_aen_cid;
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this yesterday. Since this hasn't yet shipped, I don't think we need to think too hard about this. But:
new -> old: no worse than now. AER will still be pended in the device (or not), but that's the existing behavior anyways.
old -> new: no worse than now. AER will still be pended in the device, but as Guramrit mentions, the new driver won't know. (which is the same as existing behavior)
/// Returns whether an AER needs to sent to the controller or not. Since | ||
/// this is the only function on the critical path, attempt to inline it. | ||
#[inline] | ||
fn poll_send_aer(&self) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is default impls the right thing to do here, or do you want to force trait implementers to implement the noops themselves? It seems like it should be the latter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I can see this going either way and I don't have a strong opinion on this. In my head, I was treating the default implementation of the trait as the NoOp-Handler (indicated this in the trait comment as well). Happy to go either way on this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few minor comments, but looks good. Thanks Guramrit!
// If error, cleanup and stop processing AENs. | ||
if completion.status.status() != 0 { | ||
self.failed = true; | ||
self.last_aen = None; | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is an error logged in this case?
// Complete the AEN or pend it. | ||
let aen = AsynchronousEventRequestDw0::from_bits(completion.dw0); | ||
if let Some(send_aen) = self.send_aen.take() { | ||
send_aen.complete(aen); | ||
} else { | ||
self.last_aen = Some(aen); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please add a comment that explains why it's safe to delay sending the AER here. AER will be sent before the command to get log page (which is what clears the bit in the driver that tells the device to send an AER).
Every
QueueHandler
now stores anaer_handler
. Theaer_handler
implements the newAerHandler
trait and can either be an AdminAerHandler which handles aer commands and communication properly or it can be a NoOpAerHandler that does nothing. The implementation used is determined at the time of QueuePair creation by the NvmeDriver which provides theis_admin
boolean value. The bool is persisted as part of the QueuePair state so that the correct implementation is used post restore.In the admin queue path (using
AdminAerHandler
), theNvmeDriver
no longer drives the AER loop. It is instead handed by the QueueHandler. When looking for commands to process (in the recv channel), the admin QueueHandler prioritizes sending an AERs before processing commands. So when the run loop starts, the QueueHandler automatically sends and AER as the first command. And that subsequent AER commands are prioritized if an AER isn't already pending.Aer commands are always modeled as detached rpc calls and the AdminAerHandler just scans every completion on the admin queue awaiting the AEN. This is not that much of a performance defecit because such scanning only happens on the admin queue. Performance of IO Queues not impacted.
Io QueueHandlers are provided the NoOpAerHandler which only has empty function. These empty functions should be compiled away and thus the IO implementation should remain exactly as performant as it is today (Thanks to @alandau for the insights here). This is especially true for the
poll_send_aer
that sits on the hot/critical path. Since it uses the#[inline]
tag and always returns false, the entire if check should be compiled away for all Io Queues. (We don't need to inline the other functions because they don't actually sit on the critical path and won't be hit on Io queues anyways.