This implements new DNS probes based on cgroup socket programs.
We use BPF_PROG_TYPE_CGROUP_SOCK (cgroup/sock_{create,release}) to populate the
sk_to_tgid map. These programs have process context, since they run on
essentially the same path as socket(2) and close(2).
We also use BPF_PROG_TYPE_CGROUP_SOCK_ADDR (cgroup/{sendmsg4,recvmsg4,connect4})
to populate sk_to_tgid, overwriting the value set by sock_create. This is
intentional: the socket might have been created in one process and then passed
down to a child, which then does sendmsg/recvmsg/connect, so overwriting means
we always get the "correct" tgid for an event.
We use BPF_PROG_TYPE_CGROUP_SKB (cgroup/{egress,ingress}) for the actual tapping
of the packet.
Each event carries the following fields:
- cap_len is how much we captured (like pcap)
- orig_len is the full packet length, i.e. how much we wanted to capture (like pcap)
- direction is ingress/egress
- tgid is the tgid looked up in sk_to_tgid
- data is the full packet, including ip/udp headers (like pcap)
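To make the cap_len/orig_len relationship concrete, here is a minimal Go sketch of a userland decoder for such an event. The wire layout (field order, field widths, little-endian encoding), the direction constants, and the names `PacketEvent` and `decodePacketEvent` are all assumptions made for illustration; the PR only names the fields, not their encoding.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Hypothetical direction values; the PR only says the field is ingress/egress.
const (
	DirIngress uint8 = 1
	DirEgress  uint8 = 2
)

// headerLen is the size of the assumed fixed header: three uint32 fields
// (tgid, cap_len, orig_len) followed by one uint8 (direction).
const headerLen = 13

// PacketEvent mirrors the fields listed in the description. The layout is
// an assumption of this sketch, not the layout used by the probes.
type PacketEvent struct {
	Tgid      uint32
	CapLen    uint32
	OrigLen   uint32
	Direction uint8
	Data      []byte // cap_len bytes of packet, starting at the IP header
}

// decodePacketEvent parses the assumed layout and checks that cap_len does
// not claim more bytes than the buffer actually holds.
func decodePacketEvent(raw []byte) (PacketEvent, error) {
	if len(raw) < headerLen {
		return PacketEvent{}, errors.New("short event header")
	}
	ev := PacketEvent{
		Tgid:      binary.LittleEndian.Uint32(raw[0:4]),
		CapLen:    binary.LittleEndian.Uint32(raw[4:8]),
		OrigLen:   binary.LittleEndian.Uint32(raw[8:12]),
		Direction: raw[12],
	}
	payload := raw[headerLen:]
	if uint64(ev.CapLen) > uint64(len(payload)) {
		return PacketEvent{}, errors.New("cap_len exceeds available data")
	}
	ev.Data = payload[:ev.CapLen]
	return ev, nil
}

// Truncated reports whether fewer bytes were captured than the packet had,
// which is the same check a pcap reader does with caplen and len.
func (e PacketEvent) Truncated() bool {
	return e.CapLen < e.OrigLen
}

func main() {
	// tgid=1234, cap_len=4, orig_len=60, direction=egress, then 4 data bytes.
	raw := []byte{
		0xd2, 0x04, 0x00, 0x00,
		0x04, 0x00, 0x00, 0x00,
		0x3c, 0x00, 0x00, 0x00,
		DirEgress,
		0x45, 0x00, 0x00, 0x3c,
	}
	ev, err := decodePacketEvent(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("tgid=%d cap_len=%d orig_len=%d truncated=%v\n",
		ev.Tgid, ev.CapLen, ev.OrigLen, ev.Truncated())
}
```

A consumer would typically check `Truncated()` before trying to parse a DNS message out of `Data`, since a short capture may cut the payload mid-record.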
Unlike the other probes, we can't stash the full process context here, since
all we really have is the tgid. It would make little sense to stash all the
process info in some map and send it later. This is a departure from tracing and
"context full probes", but we already have all the needed context in userland,
both in quark and $OTHER_PRODUCT.
The main motivation for this change is that the old probes would fail for
non-linear skbs.
Cgroup probes need the file descriptor of the root cgroup, and bluebox doesn't
mount it for us, so we mount it manually.
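As a rough illustration of the mount-then-open step, the following Go sketch mounts a cgroup2 filesystem and opens its root directory to obtain a file descriptor. It is not taken from the PR: the helper names `mountCgroup2` and `openCgroupRoot` and the mount point `/sys/fs/cgroup` are assumptions, the code is Linux-only, and the mount call needs sufficient privileges.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// mountCgroup2 mounts a cgroup2 filesystem at target, creating the
// directory first. Linux-only, and it requires privileges to mount.
func mountCgroup2(target string) error {
	if err := os.MkdirAll(target, 0o755); err != nil {
		return err
	}
	return syscall.Mount("none", target, "cgroup2", 0, "")
}

// openCgroupRoot opens the directory at path and returns the open file.
// Its descriptor is what a cgroup program attach would be given.
func openCgroupRoot(path string) (*os.File, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	st, err := f.Stat()
	if err != nil {
		f.Close()
		return nil, err
	}
	if !st.IsDir() {
		f.Close()
		return nil, errors.New("cgroup root is not a directory")
	}
	return f, nil
}

func main() {
	const root = "/sys/fs/cgroup" // hypothetical mount point for this sketch

	// Mounting may fail when unprivileged or when something is already
	// mounted there; either way we still try to open the directory.
	if err := mountCgroup2(root); err != nil {
		fmt.Println("mount skipped or failed:", err)
	}
	f, err := openCgroupRoot(root)
	if err != nil {
		fmt.Println("open failed:", err)
		return
	}
	defer f.Close()
	fmt.Println("cgroup root fd:", f.Fd())
}
```

The sketch does not verify that the opened directory is actually a cgroup2 mount; a real implementation would likely want that check before attaching programs.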
Regarding the weirdness in probes:
- Old verifiers are sensitive and do not always update umin for a scalar value in
the case of a JNZ, see inline comments.
- Old verifiers also do not like bpf-helper calls with spilled arguments (when
the argument is on the stack), see inline comments.
Regarding tests:
I've zapped the old udp_send.c program, as it makes little sense now that we can
do it all in Go-land. We now also send and receive a packet and verify its
contents. The tests in quark are a bit more advanced and check the ip/udp header
as well, but this is good enough imho.
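For readers unfamiliar with this kind of test, a send/receive round trip in Go might look like the sketch below. It is only an illustration using the standard `net` package over loopback; the function name `udpRoundTrip`, the connected-socket approach, and the payload are assumptions, and the actual test in the PR may be structured differently.

```go
package main

import (
	"bytes"
	"fmt"
	"net"
	"time"
)

// udpRoundTrip sends payload to a loopback UDP listener on an ephemeral
// port and returns the bytes the listener received.
func udpRoundTrip(payload []byte) ([]byte, error) {
	server, err := net.ListenUDP("udp4", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 0})
	if err != nil {
		return nil, err
	}
	defer server.Close()

	// Dial the port the kernel picked for the listener.
	client, err := net.DialUDP("udp4", nil, server.LocalAddr().(*net.UDPAddr))
	if err != nil {
		return nil, err
	}
	defer client.Close()

	if _, err := client.Write(payload); err != nil {
		return nil, err
	}

	// Bound the wait so a lost datagram fails the check instead of hanging.
	if err := server.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return nil, err
	}
	buf := make([]byte, 65535)
	n, _, err := server.ReadFromUDP(buf)
	if err != nil {
		return nil, err
	}
	return buf[:n], nil
}

func main() {
	want := []byte("hello dns probe")
	got, err := udpRoundTrip(want)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println("payload matches:", bytes.Equal(got, want))
}
```

In a probe test, the same payload would also be compared against the `data` field of the captured event, after skipping the ip/udp headers.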
Co-authored-by: Nicholas Berlin <56366649+nicholasberlin@users.noreply.github.com>
Contributor
Author
I think I've addressed all points, thanks for spotting them!
nicholasberlin
approved these changes
Mar 27, 2025
Contributor
nicholasberlin
left a comment
Changes LGTM. Thanks.
Contributor
fearful-symmetry
left a comment
Mostly interested in the documentation here, but I feel like the cgroup root thing is worth investigating.
fearful-symmetry
approved these changes
Apr 1, 2025