Skip to content

Conversation

@jalalmostafa
Copy link
Contributor

@jalalmostafa jalalmostafa commented Mar 23, 2025

Closes #493

@jalalmostafa jalalmostafa force-pushed the dev-bound branch 3 times, most recently from 6f40bf5 to 66904b3 Compare April 3, 2025 18:40
@jalalmostafa jalalmostafa marked this pull request as ready for review April 3, 2025 19:08
@jalalmostafa
Copy link
Contributor Author

Hey @tohojo

The PR to support XDP hints is ready for review. The bpf.h has been updated to that of kernel 6.3 to get the BPF_F_XDP_DEV_BOUND_ONLY definition.

@tohojo
Copy link
Member

tohojo commented May 26, 2025

Hmm, so I don't think the same approach we use for frags will work for the dev_bound flag, because the kernel treats it differently:

For frags, the kernel only considers the flag for the dispatches and doesn't care about any freplace programs that are attached to that main XDP program. Meaning that this is completely under the control of libxdp, we can propagate the frags flag between the component programs and the dispatcher as we see fit, and the same program can be attached to both an XDP program that sets the frags flag, and one that doesn't.

For the dev_bound flag, the kernel explicitly handles freplace programs. It does this, by forcing the freplace program to inherit the dev_bound settings of the XDP program it is being attached to at load time. Meaning that it's irreversible for the lifetime of the program. So for a program to be dev_bound, it has to be attached to a dev_bound dispatcher the first time it is loaded, we can't add it later. And the flag itself has no meaning when set at freplace program load time, only the setting when loading the dispatcher matters.

This means we don't really have to track the flag setting of each program after load (so no need to store the flag in the dispatcher data structure for every program), since it will never be unset again. I think we have a couple of options for how to handle enabling dev_bound functionality:

  1. Don't track it, just propagate the dev_bound flag to the dispatcher. Any subsequent program loaded without the flag will then be automatically dev_bound because of the inheritance in the kernel.
  2. As above, but add tracking to the dispatcher so we can check if there's a program loaded already that is dev_bound, reject loading another one that isn't.

If there's an existing program loaded that is not dev_bound, the operation will fail in any case. But we also disallow mixing programs with and without the flag set for the frags case, so that's consistent.

If we go with 1., that will prevent a non-devbound program from being attached to more than one interface if the first one it attaches to has another devbound program, which will probably be surprising. I think 2. is most consistent with how we handle frags anyway.

The last question is whether this should be an attach time argument (flag to xdp_program__attach()), or a "program flag" the way you're doing it currently. The latter is more consistent with frags, but kinda has an impedence mismatch with how the kernel treats things because of the inheritance. So maybe having the difference reflected in the API is actually better? In any case, if we go with the set_dev_bound() type function, supplying the ifindex when setting the flag doesn't make sense, this will be overridden by the attach ifindex anyway. Which also means it may make more sense to make it an attach-time flag. WDYT?

@jalalmostafa
Copy link
Contributor Author

Hi @tohojo ,
I got busy with my PhD thesis lately.

I agree with you: the better approach is to have an attach flag since it will be done once in the lifetime of the program and to reject to load any programs that mismatch with the loaded dispatcher.

I made a new draft of this PR that you can please review. The draft will propagate the device binding flag of the dispatcher from the first attached program, any subsequent program attachments will be rejected if they mismatch with the running dispatcher.

Copy link
Member

@tohojo tohojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty sane overall, but a couple of comments.

Also, please rebase on the latest main and update the test as in c0f1faa - turns out the libxdp compatibility checks were being skipped in CI.

Copy link
Member

@tohojo tohojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI tests fail with the following error:

2025-09-12T15:14:56.7276573Z  test_dispatcher_versions.c -l:libxdp.a -lpthread -l:libbpf.a -lz -lelf -lcap-ng
2025-09-12T15:15:01.8559822Z test_dispatcher_versions.c: In function 'load_dispatcher':
2025-09-12T15:15:01.8599786Z test_dispatcher_versions.c:134:12: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
2025-09-12T15:15:01.8600619Z   134 |         if (ret)
2025-09-12T15:15:01.8600933Z       |            ^
2025-09-12T15:15:02.4406598Z cc1: all warnings being treated as errors
2025-09-12T15:15:02.4750679Z make[2]: *** [Makefile:69: test_dispatcher_versions] Error 1
2025-09-12T15:15:02.4760708Z make[2]: Leaving directory '/home/runner/work/xdp-tools/xdp-tools/lib/libxdp/tests'
2025-09-12T15:15:02.4837721Z make[1]: *** [Makefile:158: test] Error 2
2025-09-12T15:15:02.4860095Z make[1]: Leaving directory '/home/runner/work/xdp-tools/xdp-tools/lib/libxdp'

Also, the first commit now adds that binary file, and the second one removes it again. Could you please rebase them so it's not there in the first commit either? :)

@jalalmostafa jalalmostafa force-pushed the dev-bound branch 2 times, most recently from 27519b7 to 48d5d30 Compare September 15, 2025 12:59
@tohojo
Copy link
Member

tohojo commented Sep 15, 2025

Looks like the selftests are failing on kernels that don't support device-binding; you'll have to feature-detect and skip the parts that don't work on those kernels (so the loading of devbound programs, but not the part that tests non-devbound programs)...

@jalalmostafa jalalmostafa force-pushed the dev-bound branch 4 times, most recently from e9e381e to 3ce2c60 Compare September 16, 2025 15:14
@jalalmostafa
Copy link
Contributor Author

jalalmostafa commented Sep 16, 2025

Selftests under kernel 6.14 are failing at prepare_test_kernel.sh. For some reason, it cannot find the kernel packages from RPM repository. Anyway, it is independent of this PR. Rest tests are all successful.

@tohojo
Copy link
Member

tohojo commented Sep 17, 2025

Right, updated the test matrix in the main branch and pushed the rebase button for your branch; let's see if that helps :)

@jalalmostafa
Copy link
Contributor Author

All tests are successful now

Copy link
Member

@tohojo tohojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor nits on the updated test code, still...

Copy link
Member

@tohojo tohojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, code LGTM - thanks for tackling this!

Just made a bunch of changes to the selftests, so will rebase just to make sure this still works with those. It may be a few days before I merge this anyway, since I want to double check whether there's something else we should land before doing another dispatcher version bump.

Update the bpf.h UAPI header from the Linux to that from kernel version
6.3. We need the definition of BPF_F_XDP_DEV_BOUND_ONLY to support it
in libxdp.

Signed-off-by: Jalal Mostafa <[email protected]>
Kernel 6.3 supported for some NIC offloads for XDP programs.
The feature in XDP is known to be XDP hints. XDP hints are only
supported if the XDP program is bound to the NIC device using the
BPF_F_XDP_DEV_BOUND_ONLY binding flag.

The device binding flag is represented through `XDP_ATTACH_DEVBIND',
a new attach flag for `xdp_program__attach`. Device binding is
propagated to the dispatcher. Any subsequent programs attachments
are rejected if they are different from the already running dispatcher
on a network interface.

The flag is recored using a new variable in `struct xdp_dispatcher_config`.

Signed-off-by: Jalal Mostafa <[email protected]>
Add a selftest program for libxdp device binding, testing the different
permutations of loading a program with and without the flag.

Signed-off-by: Jalal Mostafa <[email protected]>
If no kernel support for `BPF_F_XDP_DEV_BOUND_ONLY`, skip
micro-selftests that has this flag.

Signed-off-by: Jalal Mostafa <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

XDP Hints Support

2 participants