-
Notifications
You must be signed in to change notification settings - Fork 7.8k
Video: Rework buffer management #92370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Video: Rework buffer management #92370
Conversation
Interesting! Instead of having to carry pointers around, one only says "I need 10 buffers", then to enqueue them, " [EDIT: answering my own questions after discussing on #92366] Q1: Is it important that the application knows exactly which buffer index was given to a device? If not, then enqueue could be A1: Not all buffers will have the same size, so cannot all be mixed between devices. Q2: It seems it is not possible anymore to manually craft a video buffer. For instance, if video data arrives from the network and then goes through PXP to adapt it to a display, how to submit the buffer coming directly from the network into a A2: It is possible to add a Q3: Would it make sense to start making use of A3: This does not seem related to what this PR does: buffer management and accounting vs queueing them to the devices. Possibly solved by Q4: In my impression, an application developer does not want to know how many bytes are allocated to video in total, but instead decide on a resolution and let all allocation be automated. The current approach is to open A4: This can come as an overlay on top of this eventually, these functions allow this to be implemented on top. A different topic to solve in parallel. |
There seems to be several topics involved: T1: Memory management: how much in total, where from, which back-end for it It seems like this current PR aims to address something living between T1 and T2. |
@josuah Hope you don't mind. I implemented the needed stuffs to generalize your RTIO driver demo and integtated it in this buffer management rework. Effectively, RTIO serves as the backend that replaces the FIFO queues (and some more) but it does not overlap or conflict with some changes in this PR which mainly reworks the buffer allocation and interfacing with application. The main idea in the RTIO generalization is It doesn't work yet :). I have error at stream_start(). I will try to debug later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for picking it up so early!
I will try with an i.MXRT1160 I have around.
Essential to get it working is probably just to call __frame_done_cb()
whenever a frame is submitted on io_q()
, which might require api->iodev_submit
...
drivers/video/video_common.c
Outdated
int video_dequeue(const struct device *dev, struct video_buffer **buf, | ||
k_timeout_t timeout) | ||
{ | ||
struct rtio_cqe *cqe = rtio_cqe_consume_block(&rtio); | ||
*buf = cqe->userdata; | ||
|
||
if (cqe->result < 0) { | ||
LOG_ERR("I/O operation failed"); | ||
return cqe->result; | ||
} | ||
|
||
return 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dev
and timeout
are unused so can go away.
rtio_cqe_consume_block()
acts as a combined k_poll()
and video_dequeue()
.
Maybe we need a pointer to struct video_interface
in struct video_buffer
to keep track of things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buffers are kept track of by index and type ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! ->index
type is fully user-defined, which allows user to organize a reference table in the application to store extra metadata about buffers.
I think this solves this point, feel free to fold this.
drivers/video/video_common.c
Outdated
if (sqe->op != RTIO_OP_RX) { | ||
LOG_ERR("Invalid operation %d of length %u for submission %p", sqe->op, | ||
sqe->rx.buf_len, (void *)iodev_sqe); | ||
rtio_iodev_sqe_err(iodev_sqe, -EINVAL); | ||
return NULL; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fragment was there for the driver to validate that the operation requested is supported.
It is the equivalent to if (vbuf->type != VIDEO_BUF_TYPE_INPUT) {
.
Should this go into video_mcux_csi.c
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understand is this is to get the buffer that is filled from the sqe, to do some check (buf type check or buf_addr check as in csi driver). The code is common for all driver so it could be refactored as a function ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense as a helper agreed!
Though this does not allow the caller to report completion/failure, so maybe only the 1st buffer would work.
So maybe something like this?
struct rtio_iodev_sqe *video_pop_io_q(struct mpsc *io_q)
{
struct mpsc_node *node;
struct rtio_iodev_sqe *iodev_sqe;
struct video_buffer *vbuf;
node = mpsc_pop(io_q);
if (node == NULL) {
return NULL;
}
iodev_sqe = CONTAINER_OF(node, struct rtio_iodev_sqe, q);
vbuf = iodev_sqe->sqe.userdata;
__ASSERT_NO_MSG(vbuf != NULL);
if ((vbuf->type == VIDEO_BUF_TYPE_OUTPUT && iodev_sqe->sqe.op == RTIO_OP_RX) ||
(vbuf->type == VIDEO_BUF_TYPE_INPUT && iodev_sqe->sqe.op == RTIO_OP_TX)) {
return iodev_sqe;
} else {
LOG_ERR("Unsupported RTIO operation (%d) or video buffer type (%d)",
iodev_sqe->sqe.op, vbuf->type);
rtio_iodev_sqe_err(iodev_sqe, -EINVAL);
return NULL;
}
}
It is only 1 line to get the vbuf out of it:
iodev_sqe = video_pop_io_q(&data->io_q);
if (iodev_sqe == NULL) {
return;
}
vbuf = iodev_sqe->sqe.userdata;
/* Submit `vbuf` contoent to the hardware */
rtio_iodev_sqe_ok(iodev_sqe, 0);
drivers/video/video_common.c
Outdated
rtio_sqe_prep_read(sqe, ri, RTIO_PRIO_NORM, video_buf[index].buffer, | ||
video_buf[index].size, &video_buf[index]); | ||
|
||
sqe->flags |= RTIO_SQE_MULTISHOT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe there need to be a video_enqueue_read()
and video_enqueue_write()
or use switch (vbuf->type){}
. Currently it is only doing read requests.
RTIO_SQE_MULTISHOT
might not be possible with write requests as the user has to specify the buffer to use every time, or the same buffer would be sent in loop.
c805080
to
d9db3cb
Compare
I think I see the problem. Unlike the SW generator, the HW device like CSI needs to enqueue buffer to the HW, I blindly remove the enqueue() API. |
d9db3cb
to
f4c7b63
Compare
I will follow-up on this conversation: #92370 (comment) It looks like we are close! |
f4c7b63
to
0cb2ab8
Compare
It already know and can track which buffer is enqueued.
It was in TODO list and done now by commit |
include/zephyr/drivers/video.h
Outdated
/** buffer memory is on the video heap */ | ||
VIDEO_MEMORY_VIDEO = 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about VIDEO_MEMORY_POOL
to avoid a repetition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
POOL is a set of buffers of same size and attribute, seems not describe well the situation. Are VIDEO_MEMORY_INTERNAL
and VIDEO_MEMORY_EXTERNAL
better ?
drivers/video/video_buffer.c
Outdated
void video_release_buf() | ||
{ | ||
/* Buffer will be re-queued thanks to RTIO_SQE_MULTISHOT */ | ||
rtio_cqe_release(&rtio, rtio_cqe_consume_block(&rtio)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is video_buffer_release()
and video_release_buf()
. To avoid confusion, how about video_buffer_done()
or other name?
a09740a
to
4cd4e46
Compare
Sorry I ran out of time while trying to load anything on my I just noticed you pushed an update of the API as you described, I will propagate that to the other drivers. |
I removed driver's PS: Driver's implementation of |
That is great! This is the only one I cannot test easily... |
Sorry for not inform you before working on it to avoid wasting your time, but I 've just work on this about 45 minutes ago ... I will have to put this on hold for about the next 10 days due to some obligation. Thank you for the work ! |
No time was lost as I spent it getting familiar with the i.MXRT ecosystem and tooling host-side mostly.
There is time before the end of feature freeze, and plenty of other drivers to convert. What is left to do AFAIU:
Thank you for the work as well ! |
|
Currently, the video_enqueue() API enqueues the whole external video buffer container structure from the application and the driver stores this container in its FIFO queue. While it works in simple applications where the enqueued video_buffer container persists for the whole program lifecycle, it does not work in situations where we cannot keep this container, e.g. enqueuing a buffer inside a function, the local variable will be destroyed when the function returns and hence the buffer is no longer valid. Video buffers are maintained and tracked by the via their indices. Set the index field when buffers are allocated and enqueue the internal buffer rather than the external container will fix the issue. Signed-off-by: Phi Bang Nguyen <[email protected]>
Add a video_buf_state field to track the current state of a buffer when it is flowing. This also helps to check whether a buffer can be safely enqueued or dequeued, data are valid or not, etc. Signed-off-by: Phi Bang Nguyen <[email protected]>
The video subsystem holds an internal buffer pool and allocate buffers of this pool on its own heap. The application hence requests the subsystem to allocate buffers on behalf of it. Application does not need to get handles on the allocated buffers, just read back the first buffer index and the number of allocated buffer. Signed-off-by: Phi Bang Nguyen <[email protected]>
Add support to allow user provide their own allocated buffers. Signed-off-by: Phi Bang Nguyen <[email protected]>
The main device that interface with application on memory. Signed-off-by: Phi Bang Nguyen <[email protected]>
- Application side nearly unchanged, except a call to video_release_buf() after finishing with the full buffer to queue it back to sqe. RTIO is opaque to application. - Driver just needs to hold an io_q and deal a bit with the buffer from sqe in isr. No need to implement rtio_iodev_api's submit() - The video subsystem does everything needed for buffer management. Signed-off-by: Phi Bang Nguyen <[email protected]> Co-developed-by: Josuah Demangeon <[email protected]>
Move buffer management stuffs into a separate video_buffer.c file Signed-off-by: Phi Bang Nguyen <[email protected]>
This contains only the mandatory modifications to get the drivers to work: - expose `struct rtio_iodev_sqe` to the drivers so that the functions `rtio_iodev_sqe_ok()` and `rtio_iodev_sqe_err()` can be called. - expose `stuct rtio_cqe` to the application so that the functions `rtio_cqe_release()` can be called with the same completion event (cqe) that was just dequeued. The implementation can be different, but these points seems to be mandatory to operate RTIO end-to-end.
… API Signed-off-by: Phi Bang Nguyen <[email protected]>
Signed-off-by: Phi Bang Nguyen <[email protected]>
Signed-off-by: Phi Bang Nguyen <[email protected]>
4cd4e46
to
bfc86f6
Compare
Update:
TODO:
|
|
Thanks @ngphibang @josuah for all this work. I haven't yet look at all implementation details but more on the API for the time being. Looking at the API, and especially the request / queue mechanism I am wondering how use-case of dealing with several kind of buffers (type / size) are handled. I probably missed something, could you clarify this to me ? Another case that I can think of is when 2 IPs are in a sense pipelines, for example DCMI/DCMIPP + JPEG encoder. The output of the DCMI/DCMIPP (aka output buffer) will be also used as input to the JPEG encoder. That same buffer can be used it is up to the application to dequeue from one side and queue it to the other size. Does the current API allow to do that ? Aka, in a sense use a output type buffer (from DCMI/DCMIPP) into the JPEG input ? |
Hi @avolmat-st ,
Yes, the request() should return the first index of the batch as in above comment (TODO) and in the commit message. It's not implemented yet in the current code but will be done, sorry I am tight up at this moment.
To enqueue the "empty" buffer (freshly requested / allocated), we have the buffer index so we can enqueue it, video_enqueue() can be rewritten which takes index and type (input / output) To enqueue a "full" buffer dequeued from previous device to another device, since after dequeue, we have a full handle of the video_buffer struct so it's not a problem, we can call the same above video_enqueue(index, type). Another way of designing the video_enqueue() API is PS: I forgot that to support user provided buffers, we need to give the buffer address to the framework, so the video_enqueue() need to take the following parameters:
So, finally, it seems |
drivers/video/video_common.c
Outdated
@@ -70,6 +70,7 @@ struct video_buffer *video_buffer_aligned_alloc(size_t size, size_t align, k_tim | |||
vbuf->buffer = block->data; | |||
vbuf->size = size; | |||
vbuf->bytesused = 0; | |||
vbuf->state = VIDEO_BUF_STATE_DONE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering, it depends on the application logic, but should we consider after allocation that the buffer state is DONE or should it be a state such as INIT for example ?
DONE would mean that it is filled with valid data.
Here, upon allocation, the buffer do not have valid data in it. bytesused is properly set to 0 so there is no chance the application would read bad value from it, so that's the reason why I do not really have strong arguments for a STATE_INIT.
drivers/video/video_common.c
Outdated
return -EINVAL; | ||
} | ||
|
||
video_buf[buf->index].state = VIDEO_BUF_STATE_QUEUED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this state setting be done after a successfull enqueue ? Assuming the enqueue would return an error, this would lead to having a buffer marked as QUEUE while it has failed to be enqueued to the driver, hence I'd expect this mean that it is still not owned by the driver. In such case it would still be in the previous state (most probably DONE)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, maybe this needs to be set only if ret = api->enqueue()
has ret == 0
.
include/zephyr/drivers/video.h
Outdated
ret = api->dequeue(dev, buf, timeout); | ||
|
||
if (ret == 0) { | ||
(*buf)->state = VIDEO_BUF_STATE_DONE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding was that buffer state was partially set by the driver as well. For example the driver, if it knows that the contents of the buffer is bad (ex: failure to DMA data into it or anything ...) would set the buffer state to STATE_ERROR to let the application know that the content of the buffer is bad and it shouldn't be used.
Here, since the framework is setting the buffer state to DONE upon a successfull dequeue, the driver do not have anyway to indicate such thing since the framework will override the state to done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the new video_dequeue()
proposed:
https://github.com/nxp-upstream/zephyr/blob/bfc86f6ecb8f79238086b969abb9bbb35c149de3/drivers/video/video_buffer.c#L172-L175
Here is how it is used for sensors:
cqe = rtio_cqe_consume_block(&ctx);
if (cqe->result != 0) {
printk("I/O failed %d\n", cqe->result);
return;
}
So checking cqe->result == 0
would give the result from the driver I/O operation, rather than just the "buffer pop" operation.
drivers/video/video_common.c
Outdated
return -EINVAL; | ||
} | ||
|
||
video_buf[buf->index].state = VIDEO_BUF_STATE_QUEUED; | ||
video_buf[index].state = VIDEO_BUF_STATE_QUEUED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
state setting to be done after doing the enqueue since enqueue could fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rtio_sqe_prep_read()
allocates a queue buffer.
rtio_submit()
load it into the queue.
This effectively load the buffer in the driver's buffer queue. Is that good enough to say that the buffer id enqueued?
There is no more api->enqueue()
as the queue data structure is not managed by the driver itself anymore, but by the RTIO functions acting as a "queue middleman".
drivers/video/video_common.c
Outdated
@@ -132,13 +147,28 @@ int video_enqueue(const struct device *dev, uint8_t index) | |||
return -ENOSYS; | |||
} | |||
|
|||
if (video_buf[index].state != VIDEO_BUF_STATE_DONE) { | |||
if (video_buf[buf->index].type != buf->type || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is fixed in further commits but I am wondering if this check is ok when put in a chained devices pipeline. For example, in my current system I have two devices:
- DCMIPP capture which output data, so I have output type which will get data from the camera pipeline.
- JPEG Codec which is a M2M and has one input and one output. The M2M input is actually the same buffer as the DCMIPP output. So this unique buffer is sometimes queued / dequeue from the DMCIPP as output and something from the JPEG codec as input.
If the type is hardcoded in the video_buf table and the check is done when we enqueue, it seems it won't be possible to use a unique buffer between 2 devices. Did I understand it correctly ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh actually seems I already asked this similar question before and @ngphibang replied me. Maybe I just haven't properly understood the code then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's similar but not the same question :). The other one is the possibility of enqueuing the freshly dequeued buffer to another device (answer = yes). For this one, you are right, type shouldn't be checked here. Even more, type is not needed when requesting buffer. Initially, I did it by having the idea of supporting driver-provided buffer in mind, so each driver will have 1 or 2 buffer queues and the check is needed (buffer dequeued from one device will change its type when being queued into another device). But now, it seems user-provided and framework provided buffers are sufficient.
This PR reworks the buffer management in the video subsystem step by step to have a common buffer management framework.
It fixes the current issues of video buffer management and may help to facilitate #89858.
Still a work-in-progress. I will try to describe the change in more details and complete it gradually.
TODO: