RFC: net: core: Dynamic priority and deferred processing #93341

Open
wants to merge 6 commits into
base: main
from

Conversation

ClaCodes
Contributor

@ClaCodes ClaCodes commented Jul 18, 2025

Goal

To enable quality-of-service (QoS) applications, allow rescheduling or
deferring the processing of a network packet based on its properties.

Based on: #93298

[Figure: zephr_net_core_qos]

Future Work

  • Implement proper infrastructure to empower application writers to hook in custom rules for changing packet priorities based on properties of the packet at different layers (replacing the sample update_priority)
    • At layer 2: ptype, src mac, dest mac, vlan-stuff
    • At layer 3: src ip, dest ip, protocol
    • At layer 4: port
  • Implement an interruption point between the decoding of the l2 header and the execution of the registered callbacks, so that handling of layer 2 (or 3?) protocols, such as arp, may be deferred
  • Implement an interruption point between the decoding of the l3 header and the handling of udp, tcp, etc.
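
The layer-specific rules listed above could be sketched roughly as follows. This is only an illustration under assumptions: struct pkt_info, struct prio_rule, rule_matches() and apply_rules() are hypothetical names that do not exist in the tree, and a real implementation would read the fields from struct net_pkt instead of a flattened summary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened packet summary; NOT the real struct net_pkt. */
struct pkt_info {
	uint16_t ptype;    /* L2: EtherType, e.g. 0x0806 for ARP */
	uint8_t ip_proto;  /* L3: IP protocol number, e.g. 17 for UDP */
	uint16_t dst_port; /* L4: destination port */
};

enum rule_layer { RULE_L2, RULE_L3, RULE_L4 };

/* Hypothetical rule: match one field at one layer and assign a priority. */
struct prio_rule {
	enum rule_layer layer;
	uint32_t match;
	uint8_t priority;
};

static bool rule_matches(const struct prio_rule *r, const struct pkt_info *p)
{
	switch (r->layer) {
	case RULE_L2:
		return p->ptype == r->match;
	case RULE_L3:
		return p->ip_proto == r->match;
	case RULE_L4:
		return p->dst_port == r->match;
	}
	return false;
}

/* Return the priority of the first rule that matches at the given layer,
 * or the current priority if no rule of that layer matches.
 */
static uint8_t apply_rules(const struct prio_rule *rules, size_t n,
			   enum rule_layer layer, const struct pkt_info *p,
			   uint8_t current)
{
	for (size_t i = 0; i < n; i++) {
		if (rules[i].layer == layer && rule_matches(&rules[i], p)) {
			return rules[i].priority;
		}
	}
	return current;
}
```

Only rules of the layer that has just been decoded are consulted, so an L4 port rule cannot fire before the L4 header has been parsed.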

@ClaCodes ClaCodes force-pushed the feature/delay_processing branch 4 times, most recently from 26d64cc to a4849f5 Compare July 19, 2025 12:37
Contributor

@rlubos rlubos left a comment

Some initial feedback (on the topmost 2 commits).

int priority;
int desired_priority;

thread_priority = tx_tc2thread(tc);
Contributor

Shouldn't this use rx_tc2thread()? It looks like this function is currently used in the RX context.

Contributor Author

Good catch. Fixed.

update_priority(pkt);

if (!being_processed_by_correct_thread(pkt)) {
net_queue_rx(net_pkt_iface(pkt), pkt);
Contributor

If we requeue the packet, I think we should exit processing_data() as well. The packet should then be picked up by the same function, but on a different thread.

Contributor Author

Absolutely. Fixed.

net_queue_rx(net_pkt_iface(pkt), pkt);
}

} while (true);
Contributor

TBH I'm not too fond of this loop approach. It'll add extra processing time for each packet, regardless of whether this TC-requeueing feature is used.

Contributor Author

I did this because there was precedent (the existing again loop).

Contributor

True, but that was only the case with virtual interfaces (will need @jukkar to get more info on how this worked). Now every single packet will need to loop at least once, after l2 processing.

Contributor Author

This is not essential to the RFC. Alternatively, we could just update the priority, conditionally requeue, and return NET_OK in place.

default:
NET_DBG("Dropping pkt %p", pkt);

update_priority(pkt);
Contributor

Plus, I wonder, shouldn't there be a separate update_priority() function for each layer, by design? Just thinking out loud, but I imagine that after L3 has processed the packet, we shouldn't be using L2-specific rules anymore?

Contributor Author

Good catch. Will rethink this.
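
One possible shape for per-layer updates is a small dispatch table indexed by the layer that has just finished. The sketch below is an assumption for discussion, not code from this PR: struct demo_pkt, enum proc_layer, the demo_update_priority_l*() functions and update_priority_after() are hypothetical, and the rules inside them are arbitrary examples.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: which layer has just finished processing the packet. */
enum proc_layer { LAYER_L2, LAYER_L3, LAYER_L4, LAYER_COUNT };

/* Hypothetical packet stand-in; NOT the real struct net_pkt. */
struct demo_pkt {
	uint16_t ptype;
	uint8_t ip_proto;
	uint16_t dst_port;
	uint8_t priority;
};

typedef void (*update_prio_fn)(struct demo_pkt *pkt);

/* Each function only looks at fields that are valid at its layer. */
static void demo_update_priority_l2(struct demo_pkt *pkt)
{
	if (pkt->ptype == 0x0806) { /* ARP */
		pkt->priority = 1;
	}
}

static void demo_update_priority_l3(struct demo_pkt *pkt)
{
	if (pkt->ip_proto == 1) { /* ICMP */
		pkt->priority = 2;
	}
}

static void demo_update_priority_l4(struct demo_pkt *pkt)
{
	if (pkt->dst_port == 5000) { /* arbitrary example port */
		pkt->priority = 7;
	}
}

static const update_prio_fn update_prio_by_layer[LAYER_COUNT] = {
	[LAYER_L2] = demo_update_priority_l2,
	[LAYER_L3] = demo_update_priority_l3,
	[LAYER_L4] = demo_update_priority_l4,
};

/* Run only the rules that belong to the layer that was just decoded. */
static void update_priority_after(enum proc_layer done, struct demo_pkt *pkt)
{
	if ((unsigned int)done < LAYER_COUNT &&
	    update_prio_by_layer[done] != NULL) {
		update_prio_by_layer[done](pkt);
	}
}
```

With this split, L2-specific rules are never re-evaluated once L3 has taken over the packet.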

@@ -71,39 +71,39 @@ static inline enum net_verdict process_data(struct net_pkt *pkt)
if (!net_pkt_is_raw_processed(pkt)) {
net_pkt_set_raw_processed(pkt, true);
net_packet_socket_input(pkt, ETH_P_ALL, SOCK_RAW);
return NET_CONTINUE;
Contributor

I don't think packet-socket processing would require recalculating the packet priority?

Contributor Author

Removed

@ClaCodes ClaCodes force-pushed the feature/delay_processing branch 5 times, most recently from c5f9c6c to 9bd6ae9 Compare July 24, 2025 16:36
@ClaCodes
Contributor Author

@rlubos
I reverted along the lines of discussion.
Note that, if we go with the current approach of the RFC, we would have to add the snippet to multiple places:

+               update_priority_l2(pkt);
+               if (!being_processed_by_correct_thread(pkt)) {
+                       net_queue_rx(net_pkt_iface(pkt), pkt);
+                       return NET_OK;
+               }

For example, after extracting the l2 header but before STRUCT_SECTION_FOREACH(net_l3_register, l3) {, so that arp packets, for instance, can be offloaded onto another traffic class queue,
and again after ip header processing, etc.

I believe that a loop with an if-else chain or a switch would be more readable. Presumably the added computation cost is negligible?

enum net_verdict process_data(struct net_pkt *pkt)
{
    // switch would only work if raw packet sockets not needed for loop back etc.
    switch (pkt->step) {
    case STEP_1:
        do_step_1();
        return NET_CONTINUE;
    case STEP_2:
        do_step_2();
        return NET_CONTINUE;
    case STEP_3:
        do_step_3();
        return NET_CONTINUE;
        // ...
    }
}

void processing_data(struct net_pkt *pkt)
{
    enum net_verdict verdict = NET_CONTINUE;
    do {
        verdict = process_data(pkt);
        if (verdict != NET_CONTINUE) {
            break;
        }
        update_prio(pkt, pkt->step);
        if (!on_correct_thread(pkt)) {
            net_queue_rx(net_pkt_iface(pkt), pkt);
            verdict = NET_OK;
        }
    } while (verdict == NET_CONTINUE);
    // ... unref
}

If we do not do this, it will be very hard to follow at which points the processing can be interrupted and resumed later. This may be an easy source of bugs.
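
To make the interrupt-and-resume behaviour concrete, here is a small simulation that can be compiled outside the network stack. It is a sketch under assumptions: the enum values, struct demo_pkt, process_one_step(), update_prio() and processing_loop() are all hypothetical stand-ins, the requeue is modelled by a counter instead of a call to net_queue_rx(), and the single rule (ARP moves to traffic class 0 after L2) is an arbitrary example.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_verdict { DEMO_OK, DEMO_CONTINUE, DEMO_DROP };
enum demo_step { STEP_L2, STEP_L3, STEP_L4, STEP_DONE };

/* Hypothetical packet stand-in that remembers how far it was processed. */
struct demo_pkt {
	enum demo_step step; /* next step to run */
	uint8_t tc;          /* traffic class the packet should run on */
	bool is_arp;
	int requeued;        /* stand-in for net_queue_rx() calls */
};

/* Run exactly one step and advance the stored step marker. */
static enum demo_verdict process_one_step(struct demo_pkt *pkt)
{
	switch (pkt->step) {
	case STEP_L2:
		pkt->step = STEP_L3;
		return DEMO_CONTINUE;
	case STEP_L3:
		pkt->step = STEP_L4;
		return DEMO_CONTINUE;
	case STEP_L4:
		pkt->step = STEP_DONE;
		return DEMO_OK;
	default:
		return DEMO_DROP;
	}
}

/* Example rule: once L2 is done, ARP is moved to traffic class 0. */
static void update_prio(struct demo_pkt *pkt)
{
	if (pkt->step == STEP_L3 && pkt->is_arp) {
		pkt->tc = 0;
	}
}

/* current_tc models the traffic class of the thread running this call. */
static enum demo_verdict processing_loop(struct demo_pkt *pkt,
					 uint8_t current_tc)
{
	enum demo_verdict verdict;

	do {
		verdict = process_one_step(pkt);
		if (verdict != DEMO_CONTINUE) {
			break;
		}
		update_prio(pkt);
		if (pkt->tc != current_tc) {
			pkt->requeued++; /* hand over to the other queue */
			verdict = DEMO_OK;
		}
	} while (verdict == DEMO_CONTINUE);

	return verdict;
}
```

Calling processing_loop() for an ARP packet on a traffic class 2 thread stops after the L2 step with one requeue. A second call on the traffic class 0 thread resumes at L3 and runs to completion without another requeue.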

@ClaCodes ClaCodes force-pushed the feature/delay_processing branch from 9bd6ae9 to d88b7ac Compare July 25, 2025 05:56
Store the flag in the packet meta-data so that processing may be deferred
if necessary.

Signed-off-by: Cla Mattia Galliard <[email protected]>
Use the l2_processed-flag to decide whether a network packet needs to be
processed by an L2-handler. This could be used in the future to requeue
packets for later processing by a different traffic class queue.

Signed-off-by: Cla Mattia Galliard <[email protected]>
Specify the socket type, when inputting a packet into a packet-socket.

Signed-off-by: Cla Mattia Galliard <[email protected]>
When handling packets for inputting into packet-sockets, unconditionally
forward them, so that they may be handled by the rest of the network
stack afterwards.

Signed-off-by: Cla Mattia Galliard <[email protected]>
Store a flag about which layer has already processed a packet in its meta
information. This enables deferred processing of it.

Signed-off-by: Cla Mattia Galliard <[email protected]>
@ClaCodes ClaCodes force-pushed the feature/delay_processing branch from d88b7ac to 80f5552 Compare August 16, 2025 09:03
@ClaCodes
Contributor Author

  • Rebased

Note: contains all commits from #93246 and adds two commits

To enable quality-of-service (QoS) applications, allow rescheduling or
deferring the processing of a network packet based on its properties.
This means that processing can be offloaded to tc-queue-threads with
different priorities.

Signed-off-by: Cla Mattia Galliard <[email protected]>
@ClaCodes ClaCodes force-pushed the feature/delay_processing branch from 80f5552 to a69163e Compare August 16, 2025 09:35
@ClaCodes ClaCodes requested a review from rlubos August 16, 2025 09:37
Contributor

@rlubos rlubos left a comment

I think the packet rescheduling logic turned out pretty straightforward and looks fine. However, I think we need to consider how the priority updating would look in practice.

@@ -235,6 +235,21 @@ static uint8_t rx_tc2thread(uint8_t tc)
}
#endif

bool net_tc_rx_is_current_thread(uint8_t tc)
Contributor

Just wondering if we'd need a dummy stub for the function for the case when NET_TC_RX_COUNT <= 1. The packet won't be rescheduled in that case anyway, and I'm not sure how well the compiler will optimize that.
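
Such a stub might look roughly like the sketch below. It is only an illustration: DEMO_TC_RX_COUNT stands in for NET_TC_RX_COUNT, and demo_tc_rx_is_current_thread() and demo_needs_requeue() are hypothetical names chosen so the snippet compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the Kconfig-derived NET_TC_RX_COUNT (assumption). */
#ifndef DEMO_TC_RX_COUNT
#define DEMO_TC_RX_COUNT 1
#endif

#if DEMO_TC_RX_COUNT > 1
/* Multi-queue build: the real lookup would be implemented elsewhere. */
bool demo_tc_rx_is_current_thread(uint8_t tc);
#else
/* Single queue or no queue: there is no other thread to move the packet
 * to, so the check is constant and the caller's requeue branch becomes
 * dead code that the compiler can drop.
 */
static inline bool demo_tc_rx_is_current_thread(uint8_t tc)
{
	(void)tc;
	return true;
}
#endif

/* Caller side: with the stub this always evaluates to false. */
static inline bool demo_needs_requeue(uint8_t tc)
{
	return !demo_tc_rx_is_current_thread(tc);
}
```

Because the stub is a static inline function that returns a constant, the decision does not depend on how well the optimizer sees through a function defined in another translation unit.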

Comment on lines +71 to +74
/* This is just an example.
* Similar infrastructure with custom application rules like
* net_pkt_filter could be established
*/
Contributor

I think we need to start thinking about what such infrastructure for updating packet priorities should look like. I think the application should be able to register a function to update priorities? Should it be done globally or on a per-interface basis? Perhaps such a function should become part of struct net_if_api?
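
As a starting point for that discussion, the global and per-interface options can coexist: a per-interface callback takes precedence and a global one acts as the fallback. The sketch below is purely hypothetical. None of these names (struct demo_iface, demo_prio_cb, demo_register_global_prio_cb(), demo_iface_register_prio_cb(), demo_update_priority()) exist in the tree, and it does not touch struct net_if_api.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet stand-in; NOT the real struct net_pkt. */
struct demo_pkt {
	uint16_t ptype;
	uint8_t priority;
};

typedef void (*demo_prio_cb)(struct demo_pkt *pkt);

/* Hypothetical interface stand-in with an optional per-interface hook. */
struct demo_iface {
	demo_prio_cb prio_cb; /* NULL means "use the global callback" */
};

static demo_prio_cb demo_global_prio_cb; /* NULL until registered */

static void demo_register_global_prio_cb(demo_prio_cb cb)
{
	demo_global_prio_cb = cb;
}

static void demo_iface_register_prio_cb(struct demo_iface *iface,
					demo_prio_cb cb)
{
	iface->prio_cb = cb;
}

/* Per-interface callback wins; otherwise fall back to the global one. */
static void demo_update_priority(const struct demo_iface *iface,
				 struct demo_pkt *pkt)
{
	demo_prio_cb cb = iface->prio_cb ? iface->prio_cb : demo_global_prio_cb;

	if (cb != NULL) {
		cb(pkt);
	}
}

/* Example application callbacks. */
static void demo_arp_low_prio(struct demo_pkt *pkt)
{
	if (pkt->ptype == 0x0806) { /* ARP */
		pkt->priority = 1;
	}
}

static void demo_all_high_prio(struct demo_pkt *pkt)
{
	pkt->priority = 7;
}
```

An application could register demo_arp_low_prio globally and override it with demo_all_high_prio on a single time-critical interface. If no callback is registered, the packet priority is left untouched.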


4 participants