
net: ptp: add IEEE 802.3 transport, timestamping fixes, and regression coverage #106464

Open
DBS06 wants to merge 22 commits into zephyrproject-rtos:main from DBS06:ptp/l2-transport-hardening-and-regression-suites


@DBS06 (Contributor) commented Mar 27, 2026

Motivation

I started this work while trying to run the Zephyr PTP sample on a Nucleo-H563ZI with a GPS clock and a PTP-capable switch. That exposed a mix of correctness, timestamping, and usability issues: some paths did not work reliably on hardware, IEEE 802.3 / Layer-2 transport support was missing, and the native_sim/native_tap path was not a good enough stand-in for development and regression testing.

What began as a small hardware bring-up fix turned into a broader hardening pass of the PTP stack. Along the way I fixed several bugs in transport handling, timestamp propagation, foreign-master selection, and BTCA/state handling, and I added documentation, shell diagnostics, and regression coverage so the current behavior is easier to validate and maintain.

Key Features

  • Add IEEE 802.3 / Layer-2 PTP transport support (EtherType 0x88F7).
  • Fix RX/TX timestamp handling across PTP sockets, AF_PACKET, and native_tap.
  • Fix foreign-master tracking and BTCA tie-break behavior.
  • Fix STM32H5 PTP reference-clock handling on Nucleo-H563ZI.
  • Add net ptp shell diagnostics and improve sample/documentation usability.
  • Add focused regression tests for PTP state, clock decision, foreign-master handling, message post-processing, and socket timestamping behavior.
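
As a rough illustration of what Layer-2 transport means on the wire, the sketch below writes the 14-byte Ethernet header of a PTP frame. The EtherType 0x88F7 comes from this PR; the two multicast destinations are the ones IEEE 1588 defines for its Ethernet mapping. The helper name `ptp_l2_write_header()` and its parameters are hypothetical and are not taken from the PR's code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PTP_ETHERTYPE 0x88F7u /* EtherType for PTP over IEEE 802.3 */
#define ETH_ADDR_LEN  6
#define ETH_HDR_LEN   14

/* Multicast destinations defined by IEEE 1588 for the Ethernet mapping. */
static const uint8_t ptp_l2_mcast_default[ETH_ADDR_LEN] = {
	0x01, 0x1B, 0x19, 0x00, 0x00, 0x00 /* all messages except peer delay */
};
static const uint8_t ptp_l2_mcast_pdelay[ETH_ADDR_LEN] = {
	0x01, 0x80, 0xC2, 0x00, 0x00, 0x0E /* peer delay messages */
};

/*
 * Hypothetical helper: write destination MAC, source MAC and EtherType.
 * Returns the header length, or 0 if the arguments are unusable.
 */
static size_t ptp_l2_write_header(uint8_t *buf, size_t len,
				  const uint8_t src[ETH_ADDR_LEN], int pdelay)
{
	if (buf == NULL || src == NULL || len < ETH_HDR_LEN) {
		return 0;
	}

	memcpy(buf, pdelay ? ptp_l2_mcast_pdelay : ptp_l2_mcast_default,
	       ETH_ADDR_LEN);
	memcpy(buf + ETH_ADDR_LEN, src, ETH_ADDR_LEN);

	/* EtherType is transmitted in network byte order (big endian). */
	buf[12] = (uint8_t)(PTP_ETHERTYPE >> 8);
	buf[13] = (uint8_t)(PTP_ETHERTYPE & 0xFFu);

	return ETH_HDR_LEN;
}
```

The PTP message itself follows directly after these 14 bytes, with no IP or UDP header in between.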

Why These Changes Were Needed

  • The PTP stack only supported UDP transport paths. This PR adds IEEE 802.3 / Layer-2 transport support, including AF_PACKET open/send/receive paths, link-layer multicast addressing, and the protocol plumbing needed to run PTP without UDP.
  • Layer-2 RX timestamping is not equally available on every path. The transport now tries recvmsg() first to collect SO_TIMESTAMPING control data, then falls back to recvfrom() plus a PHC read when ancillary RX timestamps are unavailable. ptp_clock_synchronize() now also guards against missing or out-of-range ingress timestamps instead of blindly using bad data.
  • Foreign-master handling had correctness gaps. The first Announce is now stored, new Announce messages are compared against the previous latest record, negative BTCA comparisons no longer replace the current best foreign clock, and the BTCA receiver-port tie-break is corrected. This stabilizes role selection and avoids false BMCA/BTCA decisions.
  • Sync / Follow_Up handling is hardened around TX timestamp callbacks. If a TX timestamp is missing or late, the affected Follow_Up is skipped and the stack continues with the next Sync cycle instead of getting stuck behind a bad timestamp.
  • eth_native_tap.c needed to change because native_tap only queued TX timestamp callbacks for gPTP traffic. PTP socket traffic using SO_TIMESTAMPING therefore missed TX timestamps on native_sim. This PR fixes that by queueing TX callbacks for timestamped packets in general, updating packet timestamps on TX/RX, propagating timestamping flags through net_context, and adding AF_PACKET recvmsg() timestamp support with correct ancillary buffer accounting.
  • eth_stm32_hal_ptp.c needed a separate fix for STM32H5. In RM0481 Rev 4, the Ethernet chapter documents the timestamp engine as running from a dedicated PTP reference clock (clk_ptp_ref_i) that is separate from eth_hclk, and RCC Table 115 ("Kernel clock distribution overview") lists ETH (ptp) on pll1_q_ck. This PR therefore models an explicit mac-clk-ptp source for STM32H5 and uses that clock rate for PTP addend programming. On Nucleo-H563ZI, that corrected the large, repeating delay and offset errors seen when the addend was derived from the wrong clock (see below). This took me quite a while to track down ;-)
  • The sample and docs were also updated so this is easier to understand and reproduce: LOG_INF status prints, run-forever sample behavior, expanded README guidance, explicit PTP shell support, and documentation for transport/timestamping behavior.
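
The RX timestamp selection described in the list above (ancillary timestamp first, PHC read as fallback, and a guard against missing or out-of-range values) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `struct rx_ts`, `rx_ts_usable()` and `rx_ts_select()` are hypothetical names, and treating an all-zero value as "missing" and a nanosecond field of one second or more as "out of range" are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u

/* Hypothetical timestamp type: seconds plus nanoseconds. */
struct rx_ts {
	uint64_t sec;
	uint32_t nsec;
};

enum rx_ts_source {
	RX_TS_NONE,      /* nothing usable, caller should skip the sample */
	RX_TS_ANCILLARY, /* taken from recvmsg() control data */
	RX_TS_PHC_READ,  /* fallback: PHC read after recvfrom() */
};

/* Assumption: all-zero means "missing", nsec >= 1 s means "out of range". */
static bool rx_ts_usable(const struct rx_ts *ts)
{
	return ts != NULL && ts->nsec < NSEC_PER_SEC &&
	       (ts->sec != 0 || ts->nsec != 0);
}

/* Prefer the ancillary timestamp and only then fall back to the PHC read. */
static enum rx_ts_source rx_ts_select(const struct rx_ts *ancillary,
				      const struct rx_ts *phc_now,
				      struct rx_ts *out)
{
	if (rx_ts_usable(ancillary)) {
		*out = *ancillary;
		return RX_TS_ANCILLARY;
	}

	if (rx_ts_usable(phc_now)) {
		*out = *phc_now;
		return RX_TS_PHC_READ;
	}

	return RX_TS_NONE;
}
```

A caller would feed only `RX_TS_ANCILLARY` or `RX_TS_PHC_READ` results into the offset calculation and drop the sample on `RX_TS_NONE`, which is the kind of guard the `ptp_clock_synchronize()` change is meant to provide.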

Testing

  • Added unit/regression suites:
    net.ptp.btca_state_machine, net.ptp.clock_decision, net.ptp.foreign_master, net.ptp.msg_post_recv, and net.ptp.state_matrix.
  • Added socket timestamping coverage for both UDP and AF_PACKET, including recvmsg() ancillary-data handling, msg_controllen accounting, and MSG_CTRUNC behavior.
  • Informally validated with the PTP sample on Nucleo-H563ZI against ptp4l, a GPS clock (direct link), a PTP-capable switch, and a GPS clock through a PTP-capable switch. The exercised transports were UDP/IPv4, UDP/IPv6, and IEEE 802.3 / Layer-2.
  • The attached artifacts are runtime logs rather than interactive net ptp shell transcripts, so the excerpts below use the actual sample output from those runs.
# IEEE 802.3 / L2 excerpt
[00:00:00.051,000] <inf> net_config: Initializing network
[00:00:00.051,000] <inf> net_config: Waiting interface 1 (0x200013d4) to be up...
[00:00:01.852,000] <inf> phy_mii: PHY (0) Link speed 100 Mb, full duplex
[00:00:01.852,000] <inf> net_config: Interface 1 (0x200013d4) coming up
[00:00:01.852,000] <inf> net_config: IPv4 address: 192.168.0.100
[00:00:01.852,000] <dbg> ptp_clock: ptp_clock_init: PTP Clock 02:00:00:FF:FE:00:00:01 initialized
[00:00:01.852,000] <dbg> ptp_port: ptp_port_init: Port 1 initialized
[00:00:01.852,000] <dbg> ptp_port: port_enable: Port 1 opened
[00:00:01.852,000] <dbg> ptp_port: port_state_update: Port 1 changed state from INITIALIZING to LISTENING
[00:00:01.852,000] <inf> net_ptp_sample: Runs forever
[00:00:06.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:06.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Follow_Up message
[00:00:06.161,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Announce message
[00:00:06.161,000] <dbg> ptp_port: ptp_port_add_foreign_tt: Port 1 has a new foreign timeTransmitter 02:00:00:FF:FE:00:00:10-0002
[00:00:07.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:07.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Follow_Up message
[00:00:08.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:08.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Follow_Up message
[00:00:08.161,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Announce message
[00:00:08.161,000] <dbg> ptp_port: port_state_update: Port 1 changed state from LISTENING to TIME RECEIVER
[00:00:08.727,000] <dbg> ptp_port: port_delay_req_msg_transmit: Port 1 sends Delay_Req message
[00:00:08.727,000] <dbg> ptp_port: port_delay_req_timestamp_cb: Port 1 registered timestamp for 0 Delay_Req
[00:00:08.727,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Delay_Resp message
[00:00:09.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:09.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Follow_Up message
[00:00:10.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:10.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Follow_Up message
[00:00:10.160,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Announce message
[00:00:10.727,000] <dbg> ptp_port: port_delay_req_msg_transmit: Port 1 sends Delay_Req message
[00:00:10.727,000] <dbg> ptp_port: port_delay_req_timestamp_cb: Port 1 registered timestamp for 1 Delay_Req
[00:00:10.727,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Delay_Resp message
[00:00:10.727,000] <dbg> ptp_clock: ptp_clock_delay: Delay -7613ns
[00:00:11.155,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:11.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Follow_Up message
[00:00:11.156,000] <wrn> ptp_clock: Clock offset exceeds 1 second (t1=1774476010.144104188 t2=11.106004900 delay=-7613ns offset=-1774476009038091675ns phc_now=11.106402500 |t2-phc|=397600ns)
[00:00:11.156,000] <wrn> ptp_clock: Set clock time: 1774476010.144501775
[00:00:12.155,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:12.156,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Follow_Up message
[00:00:12.156,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset 27500ns
[00:00:12.160,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Announce message
[00:00:12.727,000] <dbg> ptp_port: port_delay_req_msg_transmit: Port 1 sends Delay_Req message
[00:00:12.727,000] <dbg> ptp_port: port_delay_req_timestamp_cb: Port 1 registered timestamp for 2 Delay_Req
[00:00:12.727,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Delay_Resp message
[00:00:12.727,000] <dbg> ptp_clock: ptp_clock_delay: Delay 241ns

[...]
[00:00:19.155,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset 1100ns
[00:00:20.155,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -164ns
[00:00:20.727,000] <dbg> ptp_clock: ptp_clock_delay: Delay 705ns
[00:00:29.153,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset 24ns
[00:00:30.153,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -8ns
[00:00:30.727,000] <dbg> ptp_clock: ptp_clock_delay: Delay 617ns
[00:00:39.152,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -40ns
[00:00:40.152,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -32ns
[00:00:40.727,000] <dbg> ptp_clock: ptp_clock_delay: Delay 601ns
[01:25:54.670,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset 2ns
[01:25:55.670,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset 34ns
[01:25:56.352,000] <dbg> ptp_clock: ptp_clock_delay: Delay 637ns
[01:25:56.670,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -2ns
[01:25:57.670,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -30ns
[01:25:58.352,000] <dbg> ptp_clock: ptp_clock_delay: Delay 617ns
# UDP / IPv4 excerpt
[00:00:01.752,000] <dbg> ptp_port: port_state_update: Port 1 changed state from INITIALIZING to LISTENING
[00:00:03.269,000] <dbg> ptp_port: port_state_update: Port 1 changed state from LISTENING to TIME RECEIVER
[00:00:03.485,000] <dbg> ptp_clock: ptp_clock_delay: Delay 51ns
[00:00:04.445,000] <wrn> ptp_clock: Clock offset exceeds 1 second (...)
[00:00:04.445,000] <wrn> ptp_clock: Set clock time: 1774389604.308807731
[00:00:05.444,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset 28018ns
[00:00:05.485,000] <dbg> ptp_clock: ptp_clock_delay: Delay 617ns
[00:00:10.444,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset 2493ns
[00:00:11.444,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset 635ns
[00:00:11.485,000] <dbg> ptp_clock: ptp_clock_delay: Delay 671ns
[00:00:18.444,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -59ns
[00:00:19.444,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -1ns
[00:00:21.485,000] <dbg> ptp_clock: ptp_clock_delay: Delay 636ns
[00:00:24.444,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset 1ns
[00:00:25.444,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -1ns
[00:00:25.485,000] <dbg> ptp_clock: ptp_clock_delay: Delay 647ns
# Nucleo-H563ZI behavior without the STM32H5 timer-rate fix
[00:00:09.826,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Announce message
[00:00:09.826,000] <dbg> ptp_port: port_state_update: Port 1 changed state from LISTENING to TIME RECEIVER
[00:00:09.827,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:09.827,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Follow_Up message
[00:00:10.512,000] <dbg> ptp_port: port_delay_req_msg_transmit: Port 1 sends Delay_Req message
[00:00:10.512,000] <dbg> ptp_port: port_delay_req_timestamp_cb: Port 1 registered timestamp for 0 Delay_Req
[00:00:10.513,000] <dbg> ptp_clock: ptp_clock_delay: Delay 171454959ns
[00:00:10.827,000] <wrn> ptp_clock: Clock offset exceeds 1 second (t1=67.480646204 t2=5.388464840 delay=171454959ns ...)
[00:00:10.827,000] <wrn> ptp_clock: Set clock time: 67.652311843
[00:00:11.826,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Announce message
[00:00:11.826,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:11.827,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Follow_Up message
[00:00:11.827,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -500042100ns
[00:00:12.512,000] <dbg> ptp_port: port_delay_req_msg_transmit: Port 1 sends Delay_Req message
[00:00:12.512,000] <dbg> ptp_port: port_delay_req_timestamp_cb: Port 1 registered timestamp for 1 Delay_Req
[00:00:12.513,000] <dbg> ptp_clock: ptp_clock_delay: Delay 171503451ns
[00:00:12.827,000] <wrn> ptp_clock: Clock offset exceeds 1 second (t1=69.480643844 t2=68.652121463 delay=171503451ns ...)
[00:00:12.827,000] <wrn> ptp_clock: Set clock time: 69.652360895
[00:00:13.826,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Announce message
[00:00:13.826,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:13.827,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -500042160ns
[00:00:14.512,000] <dbg> ptp_port: port_delay_req_msg_transmit: Port 1 sends Delay_Req message
[00:00:14.512,000] <dbg> ptp_port: port_delay_req_timestamp_cb: Port 1 registered timestamp for 2 Delay_Req
[00:00:14.513,000] <dbg> ptp_clock: ptp_clock_delay: Delay 171583643ns
[00:00:14.826,000] <wrn> ptp_clock: Clock offset exceeds 1 second (t1=71.480646204 t2=70.652169975 delay=171583643ns ...)
[00:00:14.826,000] <wrn> ptp_clock: Set clock time: 71.652440587
[00:00:15.826,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Announce message
[00:00:15.826,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Sync message
[00:00:15.826,000] <dbg> ptp_msg: ptp_msg_post_recv: Port 1 received Follow_Up message
[00:00:15.826,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -500042100ns
[00:00:16.512,000] <dbg> ptp_port: port_delay_req_msg_transmit: Port 1 sends Delay_Req message
[00:00:16.512,000] <dbg> ptp_port: port_delay_req_timestamp_cb: Port 1 registered timestamp for 3 Delay_Req
[00:00:16.513,000] <dbg> ptp_clock: ptp_clock_delay: Delay 171637873ns
[00:00:16.826,000] <wrn> ptp_clock: Clock offset exceeds 1 second (t1=73.480643804 t2=72.652250107 delay=171637873ns ...)
[00:00:16.826,000] <wrn> ptp_clock: Set clock time: 73.652494157
[00:00:17.826,000] <dbg> ptp_clock: ptp_clock_synchronize: Offset -500042180ns

The last excerpt is the reason for the STM32H5 PTP reference-clock fix: without it, the measured delay is off by roughly 170 ms and the clock keeps bouncing by about 500 ms instead of converging. RM0481 Rev 4 backs the final approach here: the IEEE 1588 section says the 64-bit PTP time is updated from clk_ptp_ref_i, and the RCC clock distribution tables list ETH (ptp) on pll1_q_ck.
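
To show why the clock source matters, here is a small sketch of addend arithmetic, assuming the usual fine-correction scheme in which a 32-bit accumulator is advanced by the addend on every reference-clock cycle and each overflow advances the time by one sub-second increment. The function names are hypothetical and the frequencies in the usage note are illustrative only; they are not the Nucleo-H563ZI clock settings.

```c
#include <stdint.h>

/*
 * Addend for a 32-bit accumulator clocked at ref_clk_hz that should
 * overflow target_hz times per second: addend = 2^32 * target / ref.
 * Returns 0 when no valid 32-bit addend exists for the inputs.
 */
static uint32_t ptp_addend(uint32_t ref_clk_hz, uint32_t target_hz)
{
	if (ref_clk_hz == 0 || target_hz == 0 || target_hz >= ref_clk_hz) {
		return 0;
	}

	return (uint32_t)(((uint64_t)target_hz << 32) / ref_clk_hz);
}

/*
 * Rate error in parts per million when the addend was computed for
 * assumed_hz but the timestamp unit is really clocked at actual_hz.
 * The overflow rate scales with actual_hz / assumed_hz.
 */
static double ptp_rate_error_ppm(uint32_t assumed_hz, uint32_t actual_hz)
{
	return ((double)actual_hz / (double)assumed_hz - 1.0) * 1e6;
}
```

For example, `ptp_addend(66000000, 50000000)` yields 0xC1F07C1F. If the addend is derived from a clock other than the one that actually drives the timestamp unit, `ptp_rate_error_ppm()` shows that the PHC runs fast or slow by the ratio of the two frequencies, which is far outside what a servo can correct and is consistent with a clock that keeps stepping instead of converging.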

If this PR is merged and the maintainers think it would be useful, would it be possible to add me as a maintainer for the PTP area? I plan to keep working on this part of the networking stack.

DBS06 added 4 commits March 27, 2026 21:07
Store the first foreign Announce and compare it against the previous entry.
Fix the BTCA receiver-port tie-break to avoid false role decisions.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
RM0481 Rev 4 documents the Ethernet PTP timestamp clock as a dedicated
reference clock (`clk_ptp_ref_i`), not the `eth_hclk` bus clock.
Table 115 ("Kernel clock distribution overview", p. 470/3154) lists
`ETH (ptp)` on `pll1_q_ck`, and the IEEE 1588 section states that the
64-bit PTP time is updated from `clk_ptp_ref_i`.

Add an explicit `mac-clk-ptp` clock for STM32H5 sourced from
`STM32_SRC_PLL1_Q`, use that clock rate for PTP addend programming, and
remove the previous `stm-eth / 2` workaround.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
run clang format to satisfy compliance test

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
run clang format to satisfy compliance test

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
@zephyrbot added the labels area: native port, area: Tests, platform: STM32, area: PTP, area: Samples, area: Sockets, area: Ethernet, and area: Networking on Mar 27, 2026
Implement AF_PACKET recvmsg() support and deliver ancillary
SO_TIMESTAMPING data to packet socket users.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
DBS06 added 17 commits March 28, 2026 00:49
Use NET_CMSG_SPACE() when checking ancillary buffer capacity and
account for aligned cmsg storage in msg_controllen.

This keeps recvmsg() control-data handling consistent with cmsghdr
layout and avoids under-reporting consumed control-buffer space.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
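
The accounting rule from the commit above can be illustrated with the POSIX `CMSG_*` macros, used here as stand-ins for Zephyr's `NET_CMSG_*` macros. This is a sketch under assumptions, not the PR's implementation: `struct ts_payload` stands in for `struct net_ptp_time`, `put_timestamp_cmsg()` and `EXAMPLE_CMSG_TYPE` are hypothetical, and reporting a zero `msg_controllen` together with `MSG_CTRUNC` on a too-small buffer is one plausible policy.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Placeholder cmsg type; real code would use the timestamping type. */
#define EXAMPLE_CMSG_TYPE 0

/* Stand-in for struct net_ptp_time. */
struct ts_payload {
	uint64_t second;
	uint32_t nanosecond;
};

/*
 * On entry msg_controllen holds the buffer capacity. On return it holds
 * the space consumed, including alignment padding (CMSG_SPACE), while
 * cmsg_len holds the unpadded length (CMSG_LEN).
 */
static bool put_timestamp_cmsg(struct msghdr *msg, const struct ts_payload *ts)
{
	struct cmsghdr *cmsg;

	if (msg->msg_control == NULL ||
	    msg->msg_controllen < CMSG_SPACE(sizeof(*ts))) {
		msg->msg_controllen = 0;
		msg->msg_flags |= MSG_CTRUNC; /* control data did not fit */
		return false;
	}

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = EXAMPLE_CMSG_TYPE;
	cmsg->cmsg_len = CMSG_LEN(sizeof(*ts));
	memcpy(CMSG_DATA(cmsg), ts, sizeof(*ts));

	msg->msg_controllen = CMSG_SPACE(sizeof(*ts));
	return true;
}
```

Checking capacity against `CMSG_LEN()` alone would accept buffers that cannot hold the padded entry, and reporting `CMSG_LEN()` as the consumed size would under-report on ABIs where the payload size is not a multiple of the cmsg alignment.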
Add Layer-2 (EtherType 0x88F7) transport support and the
PTP stack updates needed for L2 operation.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
run clang format to satisfy compliance test

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Enable SO_TIMESTAMPING on IEEE 802.3 PTP sockets and read RX timestamps
from recvmsg() control data. If recvmsg() is unavailable or fails at
runtime, fall back to recvfrom() to keep L2 reception working.

Register the Delay_Req TX timestamp callback for both UDP and L2 paths,
and suppress expected UDP parse warnings when running in L2 mode.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Add a new `net ptp` shell command to inspect PTP runtime state from
the Zephyr shell.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Replace printk with LOG_INF and adapt sample config

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Improve the PTP sample documentation by clarifying requirements and
providing a complete Linux host + native_sim run guide.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Document IEEE 802.3 transport support and describe timestamping
behavior for both UDP and Layer-2 operation.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
run clang format to satisfy compliance test

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
native_tap only queued TX timestamp callbacks for gPTP packets, which
left PTP SO_TIMESTAMPING socket traffic without TX timestamps.

Add host-clock packet timestamp updates in native_tap TX/RX paths, queue
TX timestamp callbacks when net_pkt_is_tx_timestamping() is set (while
preserving gPTP behavior without double-queueing), and propagate
SO_TIMESTAMPING TX/RX flags for AF_PACKET packets in net_context.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Add regression coverage for BTCA and PTP state-machine transitions.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Add regression tests for foreign master discovery and update handling.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Add a full state transition matrix test for BTCA event handling.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Add regression coverage for clock decision and BMCA edge cases.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Add a focused unit test suite for ptp_msg_post_recv().

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Enable timestamping in the UDP socket test config and add a recvmsg()
regression that verifies SO_TIMESTAMPING ancillary data updates
msg_controllen with NET_CMSG_SPACE(sizeof(struct net_ptp_time)).

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
Enable packet timestamping in the AF_PACKET socket test config and add
recvmsg() regression coverage for SO_TIMESTAMPING ancillary data.

Signed-off-by: Philipp Steiner <philipp.steiner1987@gmail.com>
@DBS06 force-pushed the ptp/l2-transport-hardening-and-regression-suites branch from 3d92658 to 3270c3a on March 27, 2026 23:54
@sonarqubecloud

Quality Gate failed

Failed conditions
E Reliability Rating on New Code (required ≥ C)

See analysis details on SonarQube Cloud

