Commit b13cfb8

Merge pull request #123 from xdp-project/tailgrow02.public
Examples on howto access XDP packet data at packet end
2 parents 7cd4b4c + 4a627a5 commit b13cfb8

6 files changed

+365
-0
lines changed

experiment01-tailgrow/Makefile

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
1+
# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
2+
3+
XDP_TARGETS := xdp_prog_kern xdp_prog_kern2
4+
XDP_TARGETS += xdp_prog_fail1
5+
XDP_TARGETS += xdp_prog_fail2
6+
7+
# USER_TARGETS :=
8+
9+
LIBBPF_DIR = ../libbpf/src/
10+
COMMON_DIR = ../common
11+
12+
COPY_LOADER := xdp_loader
13+
COPY_STATS := xdp_stats
14+
EXTRA_DEPS := $(COMMON_DIR)/parsing_helpers.h
15+
16+
include $(COMMON_DIR)/common.mk

experiment01-tailgrow/README.org

Lines changed: 77 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,77 @@
1+
# -*- fill-column: 76; -*-
2+
#+TITLE: Experiment01 - Accessing data at packet end
3+
#+OPTIONS: ^:nil
4+
5+
This example shows how to access BPF packet data at XDP =data_end=.
6+
Examples like this are needed, as the programmer needs to convince the
7+
BPF verifier that access bounds are safe.
8+
9+
* Use-case: tail-grow timestamping
10+
11+
The BPF helper =bpf_xdp_adjust_tail= is being extended with
12+
capabilities to grow the packet size at the tail. To use this for
13+
anything, we need to demo how to access packet data at XDP =data_end=.
14+
15+
One use-case is to *add timestamps in extended tailroom* at XDP
16+
processing time, which will survive when the packet is processed by the
17+
network-stack (via XDP_PASS). One way to capture this timestamp is to
18+
use =tcpdump=, which could use this to determine the time spent in the
19+
network-stack (on NICs without hardware timestamps).
20+
21+
In main example [[file:xdp_prog_kern.c]], the =xdp_tailgrow_parse= code
22+
implements this by parsing up to the IP layer, and using the
23+
IP header's total-length field ([[https://elixir.bootlin.com/linux/v5.6.10/source/include/uapi/linux/ip.h#L97][iphdr->tot_len]]). See the code for the
24+
strange bounding checks needed to convince the verifier. Notice, this
25+
is limited to IPv4 ICMP packets for testing purposes.
26+
27+
** Side-note: extra programs
28+
29+
Side-note: [[file:xdp_prog_kern.c]] also contains some other smaller
30+
programs to test that =bpf_xdp_adjust_tail= grow works, and to benchmark
31+
the overhead when doing =XDP_TX=. Select other BPF programs via the
32+
=xdp_loader= option =--prog== like this:
33+
34+
#+begin_src sh
35+
sudo ./xdp_loader --dev mlx5p1 --force --prog xdp_tailgrow
36+
sudo ./xdp_loader --dev mlx5p1 --force --prog xdp_tailgrow_tx
37+
#+end_src
38+
39+
* Alternative methods
40+
41+
** Works: Use loop to access data_end
42+
43+
Code in [[file:xdp_prog_kern2.c]] shows how to find =data_end=,
44+
*without parsing packet contents*, but by advancing a =data= position
45+
pointer one byte at a time in a bounded loop. The bounded loop with a
46+
max number of iterations allows the verifier to see the bound. (This
47+
obviously depends on the bounded loop support that was added in kernel
48+
[[https://git.kernel.org/torvalds/c/v5.3-rc1~140^2~179^2^2~5][v5.3]]).
49+
This is not very efficient, but it works.
50+
51+
* Methods that fail
52+
53+
Methods for accessing BPF packet data at XDP =data_end= that the verifier rejects.
54+
55+
** Fail#1: Using packet length
56+
57+
In example [[file:xdp_prog_fail1.c]], we try to use the packet length
58+
(calculated as =data_end - data=) to access the last byte as an offset
59+
added to =data=. The verifier rejects this, as the dynamic length
60+
calculation cannot be used for static analysis.
61+
62+
#+begin_src sh
63+
sudo ./xdp_loader --dev mlx5p1 --force --file xdp_prog_fail1.o
64+
#+end_src
65+
66+
** Fail#2: Use data_end directly
67+
68+
In example [[file:xdp_prog_fail2.c]], we try to use the =data_end= pointer
69+
more or less directly to find the last byte in the packet. The packet
70+
data [[https://www.mathwords.com/i/interval_notation.htm][interval]] is defined as =[data, data_end)=, meaning that the byte
71+
=data_end= points at is *excluded*. The example tries to access the
72+
2nd-last byte (to have a code if-construct that doesn't get removed by
73+
compiler optimizations).
74+
75+
#+begin_src sh
76+
sudo ./xdp_loader --dev mlx5p1 --force --file xdp_prog_fail2.o
77+
#+end_src

experiment01-tailgrow/xdp_prog_fail1.c

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
1+
/* SPDX-License-Identifier: GPL-2.0 */
2+
#include <linux/bpf.h>
3+
#include <bpf/bpf_helpers.h>
4+
5+
/*
6+
* This BPF-prog will FAIL, due to verifier rejecting it.
7+
*
8+
* General idea: Use packet length to find and access last byte in
9+
* packet. The verifier cannot see this is safe, as it cannot deduce
10+
* the packet length at verification time.
11+
*/
12+
13+
SEC("xdp_fail1")
14+
int _xdp_fail1(struct xdp_md *ctx)
15+
{
16+
void *data_end = (void *)(long)ctx->data_end;
17+
void *data = (void *)(long)ctx->data;
18+
unsigned char *ptr;
19+
void *pos;
20+
21+
/* (Correct me if I'm wrong)
22+
*
23+
* The verifier cannot use this packet length calculation as
24+
* part of its static analysis. It chooses to use zero as the
25+
* static offset value.
26+
*/
27+
unsigned int offset = data_end - data;
28+
29+
pos = data;
30+
31+
if (pos + offset > data_end)
32+
goto out;
33+
34+
/* Fails at this line with:
35+
* "invalid access to packet, off=-1 size=1, R1(id=2,off=0,r=0)"
36+
* "R1 offset is outside of the packet"
37+
*
38+
* Because the verifier used offset==0 it thinks that we are trying
39+
* to access (data - 1), which is not within [data,data_end)
40+
*/
41+
ptr = pos + (offset - sizeof(*ptr));
42+
if (*ptr == 0xFF)
43+
return XDP_ABORTED;
44+
out:
45+
return XDP_PASS;
46+
}
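
The `off=-1` in the verifier message above can be reproduced with plain arithmetic: when the packet length is zero, `offset - sizeof(*ptr)` lands one byte before `data`. The user-space sketch below only models that index calculation. The function names are hypothetical, and it makes no claim about how the verifier tracks value ranges internally.

```c
#include <assert.h>
#include <stddef.h>

/* Index (relative to data) of the byte that xdp_prog_fail1.c tries to read.
 * Signed arithmetic is used so the zero-length case shows up as -1
 * instead of wrapping around. */
static long long fail1_access_index(unsigned int pkt_len)
{
	return (long long)pkt_len - (long long)sizeof(unsigned char);
}

/* Returns 1 when that index lies inside the half-open range [0, pkt_len). */
static int fail1_access_in_bounds(unsigned int pkt_len)
{
	long long idx = fail1_access_index(pkt_len);

	return idx >= 0 && idx < (long long)pkt_len;
}

/* Reads the last byte only after the zero-length case has been ruled out.
 * Returns 0 on success and -1 when there is no byte to read. */
static int read_last_byte(const unsigned char *data, unsigned int pkt_len,
			  unsigned char *out)
{
	assert(data != NULL && out != NULL);

	if (!fail1_access_in_bounds(pkt_len))
		return -1;
	*out = data[fail1_access_index(pkt_len)];
	return 0;
}
```

For a zero-length buffer `fail1_access_index()` yields -1, which matches the offset reported in the rejection message. Whether an explicit lower bound on `offset` would be enough to satisfy the verifier is not something this sketch can show.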

experiment01-tailgrow/xdp_prog_fail2.c

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
/* SPDX-License-Identifier: GPL-2.0 */
2+
#include <linux/bpf.h>
3+
#include <bpf/bpf_helpers.h>
4+
5+
/*
6+
* This BPF-prog will FAIL, due to verifier rejecting it.
7+
*
8+
* General idea: Use data_end pointer to access last (2nd-last) byte in
9+
* packet. That is not allowed by verifier, as pointer arithmetic on
10+
* pkt_end is prohibited.
11+
*/
12+
13+
SEC("xdp_fail2")
14+
int _xdp_fail2(struct xdp_md *ctx)
15+
{
16+
void *data_end = (void *)(long)ctx->data_end;
17+
volatile unsigned char *ptr;
18+
volatile void *pos;
19+
20+
pos = data_end;
21+
22+
#pragma clang optimize off
23+
if (pos - 1 > data_end)
24+
goto out;
25+
#pragma clang optimize on
26+
27+
/* Verifier fails with: "pointer arithmetic on pkt_end prohibited"
28+
*/
29+
ptr = pos - 2;
30+
if (*ptr == 0xFF)
31+
return XDP_ABORTED;
32+
out:
33+
return XDP_PASS;
34+
}

experiment01-tailgrow/xdp_prog_kern.c

Lines changed: 129 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,129 @@
1+
/* SPDX-License-Identifier: GPL-2.0 */
2+
#include <linux/bpf.h>
3+
#include <linux/in.h>
4+
#include <bpf/bpf_helpers.h>
5+
#include <bpf/bpf_endian.h>
6+
7+
// The parsing helper functions from the packet01 lesson have moved here
8+
#include "../common/parsing_helpers.h"
9+
#include "../common/rewrite_helpers.h"
10+
11+
/* Defines xdp_stats_map */
12+
#include "../common/xdp_stats_kern_user.h"
13+
#include "../common/xdp_stats_kern.h"
14+
15+
struct my_timestamp {
16+
__u16 magic;
17+
__u64 time;
18+
} __attribute__((packed));
19+
20+
SEC("xdp_tailgrow_parse")
21+
int grow_parse(struct xdp_md *ctx)
22+
{
23+
void *data_end;
24+
void *data;
25+
int action = XDP_PASS;
26+
int eth_type, ip_type;
27+
struct hdr_cursor nh;
28+
struct iphdr *iphdr;
29+
struct ethhdr *eth;
30+
__u16 ip_tot_len;
31+
32+
struct my_timestamp *ts;
33+
34+
/* Increase packet size (at tail) and reload data pointers */
35+
__u8 offset = sizeof(*ts);
36+
if (bpf_xdp_adjust_tail(ctx, offset))
37+
goto out;
38+
data_end = (void *)(long)ctx->data_end;
39+
data = (void *)(long)ctx->data;
40+
41+
/* These keep track of the next header type and iterator pointer */
42+
nh.pos = data;
43+
44+
eth_type = parse_ethhdr(&nh, data_end, &eth);
45+
if (eth_type < 0) {
46+
action = XDP_ABORTED;
47+
goto out;
48+
}
49+
50+
if (eth_type == bpf_htons(ETH_P_IP)) {
51+
ip_type = parse_iphdr(&nh, data_end, &iphdr);
52+
} else {
53+
action = XDP_PASS;
54+
goto out;
55+
}
56+
57+
/* Demo use-case: Add timestamp in extended tailroom to ICMP packets,
58+
* before sending to network-stack via XDP_PASS. This can be
59+
* captured via tcpdump, and provide earlier (XDP layer) timestamp.
60+
*/
61+
if (ip_type == IPPROTO_ICMP) {
62+
63+
/* Packet size in bytes, including IP header and data */
64+
ip_tot_len = bpf_ntohs(iphdr->tot_len);
65+
66+
/*
67+
* Tricks to get past the verifier. Being allowed to use
68+
* packet value iphdr->tot_len, involves bounding possible
69+
* values to please verifier.
70+
*/
71+
if (ip_tot_len < 2) {
72+
/* This check seems strange on unsigned ip_tot_len,
73+
* but is needed, else verifier complains:
74+
* "unbounded min value is not allowed"
75+
*/
76+
goto out;
77+
}
78+
ip_tot_len &= 0xFFF; /* Max 4095 */
79+
80+
/* Finding end of packet + offset, and bound access */
81+
if ((void *)iphdr + ip_tot_len + offset > data_end) {
82+
action = XDP_ABORTED;
83+
goto out;
84+
}
85+
86+
/* Point ts at the end of the IP packet, where the tail was extended by offset */
87+
ts = (void *)iphdr + ip_tot_len;
88+
ts->magic = 0x5354; /* Bytes "TS" in the packet when stored by a little-endian CPU */
89+
ts->time = bpf_ktime_get_ns();
90+
}
91+
out:
92+
return xdp_stats_record_action(ctx, action);
93+
}
94+
95+
SEC("xdp_tailgrow")
96+
int tailgrow_pass(struct xdp_md *ctx)
97+
{
98+
int offset;
99+
100+
offset = 10;
101+
bpf_xdp_adjust_tail(ctx, offset);
102+
return xdp_stats_record_action(ctx, XDP_PASS);
103+
}
104+
105+
SEC("xdp_pass")
106+
int xdp_pass_func(struct xdp_md *ctx)
107+
{
108+
return xdp_stats_record_action(ctx, XDP_PASS);
109+
}
110+
111+
/* For benchmarking tail grow overhead (does a memset)*/
112+
SEC("xdp_tailgrow_tx")
113+
int tailgrow_tx(struct xdp_md *ctx)
114+
{
115+
int offset;
116+
117+
offset = 32;
118+
bpf_xdp_adjust_tail(ctx, offset);
119+
return xdp_stats_record_action(ctx, XDP_TX);
120+
}
121+
122+
/* Baseline benchmark of XDP_TX */
123+
SEC("xdp_tx")
124+
int xdp_tx_rec(struct xdp_md *ctx)
125+
{
126+
return xdp_stats_record_action(ctx, XDP_TX);
127+
}
128+
129+
char _license[] SEC("license") = "GPL";
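
To check the tail-grow timestamp from user space (for example on frame bytes captured with tcpdump), the trailer can be located the same way `grow_parse` places it: at the start of the IPv4 header plus `tot_len`. The following is a minimal sketch, not part of this commit. It assumes an untagged Ethernet II frame and a reader with the same byte order as the host that ran the XDP program, because `magic` and `time` are stored in host byte order. `read_tail_timestamp` and `ETH_HLEN_NO_VLAN` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HLEN_NO_VLAN 14 /* assumption: no VLAN tag in front of IPv4 */
#define IPV4_MIN_HLEN    20

/* Same layout as struct my_timestamp in xdp_prog_kern.c */
struct my_timestamp {
	uint16_t magic;
	uint64_t time;
} __attribute__((packed));

static_assert(sizeof(struct my_timestamp) == 10, "trailer is 2 + 8 bytes");

/* Returns 0 and fills *ts_ns when a trailer with the expected magic is
 * found right after the IPv4 packet; returns -1 otherwise. */
static int read_tail_timestamp(const uint8_t *frame, size_t frame_len,
			       uint64_t *ts_ns)
{
	struct my_timestamp ts;
	size_t ip_tot_len, ts_off;

	if (frame_len < ETH_HLEN_NO_VLAN + IPV4_MIN_HLEN)
		return -1;
	/* EtherType 0x0800 (IPv4) sits in bytes 12..13 of the frame */
	if (frame[12] != 0x08 || frame[13] != 0x00)
		return -1;

	/* tot_len is bytes 2..3 of the IPv4 header, big-endian.
	 * The mask mirrors the 0xFFF bound used by the XDP program. */
	ip_tot_len = ((size_t)frame[ETH_HLEN_NO_VLAN + 2] << 8) |
		     frame[ETH_HLEN_NO_VLAN + 3];
	ip_tot_len &= 0xFFF;

	ts_off = ETH_HLEN_NO_VLAN + ip_tot_len;
	if (ts_off + sizeof(ts) > frame_len)
		return -1;

	/* memcpy avoids unaligned access to the packed fields */
	memcpy(&ts, frame + ts_off, sizeof(ts));
	if (ts.magic != 0x5354)
		return -1;

	*ts_ns = ts.time;
	return 0;
}
```

The value returned in `*ts_ns` is whatever `bpf_ktime_get_ns()` produced on the receiving host, so it is only comparable with other timestamps taken from the same clock on that machine.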

experiment01-tailgrow/xdp_prog_kern2.c

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
1+
/* SPDX-License-Identifier: GPL-2.0 */
2+
#include <linux/bpf.h>
3+
#include <bpf/bpf_helpers.h>
4+
5+
#define MTU 1536
6+
#define MIN_LEN 14
7+
8+
/*
9+
* This example shows how to access the last byte in an XDP packet,
10+
* without parsing packet contents.
11+
*
12+
* It is not very efficient, as it advances the data pointer one byte at a time in a
13+
* loop until reaching data_end. This is needed as the verifier only allows
14+
* accessing data via advancing the position of the data pointer. The bounded
15+
* loop with a max number of iterations allows the verifier to see the bound.
16+
*/
17+
18+
SEC("xdp_end_loop")
19+
int _xdp_end_loop(struct xdp_md *ctx)
20+
{
21+
void *data_end = (void *)(long)ctx->data_end;
22+
void *data = (void *)(long)ctx->data;
23+
unsigned char *ptr;
24+
unsigned int i;
25+
void *pos;
26+
27+
/* Assume minimum length to reduce loops needed a bit */
28+
unsigned int offset = MIN_LEN;
29+
30+
pos = data;
31+
32+
/* Verifier can handle this bounded 'basic-loop' construct */
33+
for (i = 0; i < (MTU - MIN_LEN); i++) {
34+
if (pos + offset > data_end) {
35+
/* Promise verifier no access beyond data_end */
36+
goto out;
37+
}
38+
if (pos + offset == data_end) {
39+
/* Found data_end, exit for-loop and read data.
40+
*
41+
* It seems strange, that finding data_end via
42+
* moving pos (data) pointer forward is needed.
43+
* This is because pointer arithmetic on pkt_end is
44+
* prohibited by verifier.
45+
*
46+
* In principle data_end points to byte that is not
47+
* accessible. Thus, accessing last readable byte
48+
* via (data_end - 1) is prohibited by verifier.
49+
*/
50+
goto read;
51+
}
52+
offset++;
53+
}
54+
/* Show verifier all other cases exit program */
55+
goto out;
56+
57+
read:
58+
ptr = pos + (offset - sizeof(*ptr)); /* Parentheses needed */
59+
if (*ptr == 0xFF)
60+
return XDP_ABORTED;
61+
out:
62+
return XDP_PASS;
63+
}
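
The range of packet lengths that this loop can handle is easy to check in user space by replacing the pointer comparisons with length comparisons. The sketch below does that under the assumption that `pos + offset` versus `data_end` behaves like `offset` versus the packet length. `find_last_index` is a hypothetical name, while `MTU` and `MIN_LEN` are copied from the program above.

```c
#include <assert.h>
#include <stddef.h>

#define MTU 1536
#define MIN_LEN 14

static_assert(MTU > MIN_LEN, "loop needs at least one iteration");

/* User-space model of the bounded loop in xdp_prog_kern2.c.
 * Returns the index of the last byte, or -1 when the program
 * would take the "out" path without reading anything. */
static long find_last_index(size_t pkt_len)
{
	size_t offset = MIN_LEN;
	unsigned int i;

	for (i = 0; i < (MTU - MIN_LEN); i++) {
		if (offset > pkt_len)   /* models: pos + offset > data_end */
			return -1;
		if (offset == pkt_len)  /* models: pos + offset == data_end */
			return (long)(offset - 1);
		offset++;
	}
	/* Loop bound exhausted before reaching the end of the packet */
	return -1;
}
```

In this model the end is found for lengths from `MIN_LEN` up to `MTU - 1`. A packet of exactly `MTU` bytes falls out of the loop, because `offset` only reaches `MTU` after the final iteration has already done its comparison.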
