
Netstacklat: Evaluation of porting Simon's netstacklat to ebpf_exporter #1

Open
wants to merge 15 commits into base: master

Conversation

netoptimizer
Owner

Created this PR so @simosund can see the changes and evaluation numbers from the prod experiment.

Based on code from PR xdp-project/bpf-examples#129

This is a direct copy from bpf-examples
but from Simon's devel branch 'netstacklat-groupby'

https://github.com/simosund/bpf-examples/tree/netstacklat-groupby/netstacklat

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
Keeping this as a separate commit to track what needed to change

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
Gotcha#1: My devel laptop has a TAI offset of zero
 - Other systems (incl. prod) all have 37 sec

Gotcha#2: RX-timestamping needs to be enabled manually
 - something else has to enable RX-timestamping

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
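As a side note on Gotcha#1, the kernel's current TAI-UTC offset can be read from userspace via adjtimex(2). A minimal sketch (not part of the PR):

/* Minimal sketch, not part of the PR: print the kernel's TAI-UTC offset.
 * Correctly configured systems report 37 s; a machine that never learned
 * the leap-second count (e.g. a devel laptop without full NTP sync)
 * reports 0.
 */
#include <stdio.h>
#include <sys/timex.h>

int main(void)
{
	struct timex tx = { 0 };

	if (adjtimex(&tx) < 0) {
		perror("adjtimex");
		return 1;
	}
	printf("TAI offset: %d s\n", tx.tai);
	return 0;
}

Regarding Gotcha#2: software RX timestamping is normally switched on system-wide as soon as some process requests it, e.g. via the SO_TIMESTAMPING socket option with SOF_TIMESTAMPING_RX_SOFTWARE; presumably the stand-alone netstacklat binary does this itself, so when only the BPF object is loaded by ebpf_exporter, something else has to enable it.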
For ebpf_exporter we cannot control which BPF sections
get loaded. Instead we disable some of the hooks at
compile time via define/ifdefs.

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
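A rough sketch of that pattern (the macro name and which hooks get this treatment are assumptions, not the PR's exact identifiers):

/* Illustrative only: ebpf_exporter attaches every SEC() program it finds
 * in the object, so unwanted hooks are compiled out of the object instead
 * of being skipped at load time.
 */
#ifdef NETSTACKLAT_DISABLE_UDP_HOOK
/* hook left out of the BPF object entirely */
#else
SEC("fentry/skb_consume_udp")
int BPF_PROG(netstacklat_skb_consume_udp, struct sock *sk, struct sk_buff *skb)
{
	/* ... record latency for this hook ... */
	return 0;
}
#endif /* NETSTACKLAT_DISABLE_UDP_HOOK */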
Instead, hardcode ifindex limits based on the prod setup,
as we don't have a way to configure this via YAML.

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
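Roughly like this (the constant values and helper name are assumptions based on the commit text, not the actual code):

/* Sketch only: with no YAML knob available in ebpf_exporter, the accepted
 * ifindex range is baked in at compile time and checked before recording.
 */
#define IFINDEX_MIN 2	/* illustrative values for the prod NICs */
#define IFINDEX_MAX 10

static __always_inline bool filter_ifindex(__u32 ifindex)
{
	return ifindex >= IFINDEX_MIN && ifindex <= IFINDEX_MAX;
}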
Signed-off-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: Jesper Dangaard Brouer <[email protected]>
Execution runtime: netstacklat_tcp_recv_timestamp
 - run_time_ns 18885872401 run_cnt 71,443,820 = 264.34 ns

Execution runtime: netstacklat_skb_consume_udp
 - run_time_ns 6124797061 run_cnt 16,324,309 = 375.19 ns

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
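(For reference: run_time_ns and run_cnt are the kernel's per-program BPF statistics, visible e.g. in bpftool prog show once kernel.bpf_stats_enabled=1; the per-invocation cost above is simply run_time_ns / run_cnt.)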
Something is buggy with this filter
 - All latency records are in the max bucket

The READ_ONCE change doesn't fix the issue

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
This was the real reason for seeing wrong numbers in prod.

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
…sync

Execution runtime: netstacklat_tcp_recv_timestamp
 - run_time_ns 32164560127 run_cnt 116,590,498 = 275.88 ns

Execution runtime: netstacklat_skb_consume_udp
 - run_time_ns 10490230543 run_cnt 23,993,428 = 437.21 ns

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
Moved filter_nonempty_sockqueue to callers
 - because record_socket_latency() becomes a BPF function-call
 - e.g. perf shows bpf_prog_fb69587c6ea462b7_record_socket_latency

Execution runtime: netstacklat_tcp_recv_timestamp
 - run_time_ns 181391511590 run_cnt 788663546 = 229.99 ns

Execution runtime: netstacklat_skb_consume_udp
 - run_time_ns 16212598612 run_cnt 137812779 = 117.64 ns

This is clearly a huge improvement for UDP packets.

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
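A simplified sketch of the resulting caller-side structure (hook and helper names follow the commit; bodies and exact signatures are assumptions):

/* Sketch, not the PR's exact code. The point: filtered packets return
 * before paying for the bpf-to-bpf call into record_socket_latency(),
 * which perf shows as bpf_prog_fb69587c6ea462b7_record_socket_latency.
 */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

static __always_inline bool filter_nonempty_sockqueue(struct sock *sk)
{
	/* assumption: skip sockets whose receive queue is empty */
	return BPF_CORE_READ(sk, sk_receive_queue.qlen) > 0;
}

static __noinline int record_socket_latency(struct sock *sk)
{
	/* histogram update lives here; compiled as a separate BPF function */
	return 0;
}

SEC("fentry/tcp_recv_timestamp")
int BPF_PROG(netstacklat_tcp_recv_timestamp, void *msg, struct sock *sk)
{
	if (!filter_nonempty_sockqueue(sk))
		return 0;	/* cheap inline check, no function call */

	return record_socket_latency(sk);
}

char LICENSE[] SEC("license") = "GPL";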
Enable filter_nonempty_sockqueue at compile time to make sure
that this config setting doesn't influence performance.

I'm only seeing a small effect.

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
The filter_nonempty_sockqueue() is efficient for UDP packets,
but doesn't work well for TCP packets.

Add filtering on socket queue length. For TCP, try filtering
out samples where qlen is not above 3 packets.

Execution runtime: netstacklat_tcp_recv_timestamp
 - run_time_ns 10690540076 run_cnt 117852699 = 90.71 ns

Execution runtime: netstacklat_skb_consume_udp
 - run_time_ns 2206621632 run_cnt 20004338 = 110.30 ns

This is a HUGE improvement for the TCP case.

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
@@ -462,6 +466,9 @@ int BPF_PROG(netstacklat_tcp_recv_timestamp, void *msg, struct sock *sk,
	if (!filter_nonempty_sockqueue(sk))
		return 0;

	if (!filter_queue_len(sk, 3))
		return 0;
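For reference, a guess at what filter_queue_len() looks like (the actual helper in Simon's branch may read the queue length differently):

/* Assumption, not the PR's code: only record latency when the socket's
 * receive queue holds more than 'limit' packets.
 */
static __always_inline bool filter_queue_len(struct sock *sk, __u32 limit)
{
	return BPF_CORE_READ(sk, sk_receive_queue.qlen) > limit;
}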

In a previous commit you mentioned that filter_nonempty_sockqueue() seemed to be bugged, resulting in all entries ending up in the max latency bucket. Does this solution not suffer from that issue?

If this doesn't have that issue, and you have time, I would appreciate it if you could also try without CONFIG_MAP_MACROS, to see if the issue is perhaps related to my increment_exp2_histogram_nosync() implementation (i.e. whether switching over to the macro is what made the bug disappear).

The filter should of course only affect which values are recorded (but not change the values themselves), and if what should be the equivalent of qlen > 0 results in only max latency values, then logically qlen > 3 should also produce only max latency values (as that's a strict subset of qlen > 0).

Also, good to see that it reduced the overhead a lot. But it would also be good to check how this filter affects the latency distribution (compared to running without the filter), and whether it still captures most of the high-latency instances. I guess the assumption behind this filter is that it's mostly interesting to capture the instances where we have high latency on sockets with a bit of a backlog of data to deliver. But if it turns out that most of the high-latency values come from sockets with a single packet (i.e. sockets that become empty after the read), then we may need to reconsider whether this is a sensible filter.
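For context on the CONFIG_MAP_MACROS question above, a rough sketch of what a non-synchronized exp2 histogram update can look like (assumed shape only; Simon's increment_exp2_histogram_nosync() and ebpf_exporter's macro may differ):

/* Assumed shape, not the actual implementation: bucket = floor(log2(val)),
 * clamped into the last (max) bucket, incremented without atomics. With a
 * per-CPU map the plain increment is safe; on a shared map it can lose
 * updates, which is the trade-off "nosync" refers to.
 */
#define HIST_MAX_BUCKET 34

struct {
	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
	__uint(max_entries, HIST_MAX_BUCKET + 1);
	__type(key, __u32);
	__type(value, __u64);
} latency_hist SEC(".maps");

static __always_inline void hist_increment_nosync(__u64 val)
{
	__u32 bucket = 0;
	__u64 *cnt;

	while (val > 1 && bucket < HIST_MAX_BUCKET) {	/* floor(log2(val)) */
		val >>= 1;
		bucket++;
	}

	cnt = bpf_map_lookup_elem(&latency_hist, &bucket);
	if (cnt)
		(*cnt)++;	/* no __sync_fetch_and_add() */
}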
