Skip to content

Conversation

dvdgomez
Copy link

@dvdgomez dvdgomez commented Feb 19, 2025

jira VULN-12931
cve CVE-2024-56642
commit-author Kuniyuki Iwashima [email protected]
commit 6a2fa13

syzkaller reported a use-after-free of UDP kernel socket in cleanup_bearer() without repro. [0][1]

When bearer_disable() calls tipc_udp_disable(), cleanup of the UDP kernel socket is deferred by work calling cleanup_bearer().

tipc_net_stop() waits for such works to finish by checking tipc_net(net)->wq_count.  However, the work decrements the count too early before releasing the kernel socket, unblocking cleanup_net() and resulting in use-after-free.

Let's move the decrement after releasing the socket in cleanup_bearer().

[0]:
ref_tracker: net notrefcnt@000000009b3d1faf has 1/1 users at
     sk_alloc+0x438/0x608
     inet_create+0x4c8/0xcb0
     __sock_create+0x350/0x6b8
     sock_create_kern+0x58/0x78
     udp_sock_create4+0x68/0x398
     udp_sock_create+0x88/0xc8
     tipc_udp_enable+0x5e8/0x848
     __tipc_nl_bearer_enable+0x84c/0xed8
     tipc_nl_bearer_enable+0x38/0x60
     genl_family_rcv_msg_doit+0x170/0x248
     genl_rcv_msg+0x400/0x5b0
     netlink_rcv_skb+0x1dc/0x398
     genl_rcv+0x44/0x68
     netlink_unicast+0x678/0x8b0
     netlink_sendmsg+0x5e4/0x898
     ____sys_sendmsg+0x500/0x830

[1]:
BUG: KMSAN: use-after-free in udp_hashslot include/net/udp.h:85 [inline] BUG: KMSAN: use-after-free in udp_lib_unhash+0x3b8/0x930 net/ipv4/udp.c:1979
 udp_hashslot include/net/udp.h:85 [inline]
 udp_lib_unhash+0x3b8/0x930 net/ipv4/udp.c:1979
 sk_common_release+0xaf/0x3f0 net/core/sock.c:3820
 inet_release+0x1e0/0x260 net/ipv4/af_inet.c:437
 inet6_release+0x6f/0xd0 net/ipv6/af_inet6.c:489
 __sock_release net/socket.c:658 [inline]
 sock_release+0xa0/0x210 net/socket.c:686
 cleanup_bearer+0x42d/0x4c0 net/tipc/udp_media.c:819
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xcaf/0x1c90 kernel/workqueue.c:3310
 worker_thread+0xf6c/0x1510 kernel/workqueue.c:3391
 kthread+0x531/0x6b0 kernel/kthread.c:389
 ret_from_fork+0x60/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244

Uninit was created at:
 slab_free_hook mm/slub.c:2269 [inline]
 slab_free mm/slub.c:4580 [inline]
 kmem_cache_free+0x207/0xc40 mm/slub.c:4682
 net_free net/core/net_namespace.c:454 [inline]
 cleanup_net+0x16f2/0x19d0 net/core/net_namespace.c:647
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xcaf/0x1c90 kernel/workqueue.c:3310
 worker_thread+0xf6c/0x1510 kernel/workqueue.c:3391
 kthread+0x531/0x6b0 kernel/kthread.c:389
 ret_from_fork+0x60/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244

CPU: 0 UID: 0 PID: 54 Comm: kworker/0:2 Not tainted 6.12.0-rc1-00131-gf66ebf37d69c #7 91723d6f74857f70725e1583cba3cf4adc716cfa
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Workqueue: events cleanup_bearer

Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
	Reported-by: syzkaller <[email protected]>
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Paolo Abeni <[email protected]>

(cherry picked from commit 6a2fa13312e51a621f652d522d7e2df7066330b6)
	Signed-off-by: David Gomez <[email protected]>

Build Log:

INSTALL sound/usb/line6/snd-usb-variax.ko
  INSTALL sound/usb/misc/snd-ua101.ko
  INSTALL sound/usb/snd-usb-audio.ko
  INSTALL sound/usb/snd-usbmidi-lib.ko
  INSTALL sound/usb/usx2y/snd-usb-us122l.ko
  INSTALL sound/usb/usx2y/snd-usb-usx2y.ko
  INSTALL sound/virtio/virtio_snd.ko
  INSTALL sound/x86/snd-hdmi-lpe-audio.ko
  INSTALL sound/xen/snd_xen_front.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-dgomez-fips-8-compliant-CVE-2024-56642+
[TIMER]{MODULES}: 36s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-dgomez-fips-8-compliant-CVE-2024-56642+ arch/x86/boot/bzImage \
	System.map "/boot"
[TIMER]{INSTALL}: 23s
Checking kABI
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-dgomez-fips-8-compliant-CVE-2024-56642+ and Index to 1
The default is /boot/loader/entries/34175980436d43cf9464a93d4e1446da-4.18.0-dgomez-fips-8-compliant-CVE-2024-56642+.conf with index 1 and kernel /boot/vmlinuz-4.18.0-dgomez-fips-8-compliant-CVE-2024-56642+
The default is /boot/loader/entries/34175980436d43cf9464a93d4e1446da-4.18.0-dgomez-fips-8-compliant-CVE-2024-56642+.conf with index 1 and kernel /boot/vmlinuz-4.18.0-dgomez-fips-8-compliant-CVE-2024-56642+
Generating grub configuration file ...
done
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 6s
[TIMER]{BUILD}: 2652s
[TIMER]{MODULES}: 36s
[TIMER]{INSTALL}: 23s
[TIMER]{TOTAL} 2720s
Rebooting in 10 seconds

Kernel Test Logs:
kernel-selftest_before.log
kernel-selftest_after.log

[dgomez@r810-fips-553 kernel-src-tree]$ grep '^ok' kernel-selftest_before.log | wc -l
169
[dgomez@r810-fips-553 kernel-src-tree]$ grep '^ok' kernel-selftest_after.log | wc -l
204
[dgomez@r810-fips-553 kernel-src-tree]$ grep 'not ok' kernel-selftest_before.log | wc -l
36
[dgomez@r810-fips-553 kernel-src-tree]$ grep 'not ok' kernel-selftest_after.log | wc -l
46

jira VULN-12931
cve CVE-2024-56642
commit-author Kuniyuki Iwashima <[email protected]>
commit 6a2fa13

syzkaller reported a use-after-free of UDP kernel socket
in cleanup_bearer() without repro. [0][1]

When bearer_disable() calls tipc_udp_disable(), cleanup
of the UDP kernel socket is deferred by work calling
cleanup_bearer().

tipc_net_stop() waits for such works to finish by checking
tipc_net(net)->wq_count.  However, the work decrements the
count too early before releasing the kernel socket,
unblocking cleanup_net() and resulting in use-after-free.

Let's move the decrement after releasing the socket in
cleanup_bearer().

[0]:
ref_tracker: net notrefcnt@000000009b3d1faf has 1/1 users at
     sk_alloc+0x438/0x608
     inet_create+0x4c8/0xcb0
     __sock_create+0x350/0x6b8
     sock_create_kern+0x58/0x78
     udp_sock_create4+0x68/0x398
     udp_sock_create+0x88/0xc8
     tipc_udp_enable+0x5e8/0x848
     __tipc_nl_bearer_enable+0x84c/0xed8
     tipc_nl_bearer_enable+0x38/0x60
     genl_family_rcv_msg_doit+0x170/0x248
     genl_rcv_msg+0x400/0x5b0
     netlink_rcv_skb+0x1dc/0x398
     genl_rcv+0x44/0x68
     netlink_unicast+0x678/0x8b0
     netlink_sendmsg+0x5e4/0x898
     ____sys_sendmsg+0x500/0x830

[1]:
BUG: KMSAN: use-after-free in udp_hashslot include/net/udp.h:85 [inline]
BUG: KMSAN: use-after-free in udp_lib_unhash+0x3b8/0x930 net/ipv4/udp.c:1979
 udp_hashslot include/net/udp.h:85 [inline]
 udp_lib_unhash+0x3b8/0x930 net/ipv4/udp.c:1979
 sk_common_release+0xaf/0x3f0 net/core/sock.c:3820
 inet_release+0x1e0/0x260 net/ipv4/af_inet.c:437
 inet6_release+0x6f/0xd0 net/ipv6/af_inet6.c:489
 __sock_release net/socket.c:658 [inline]
 sock_release+0xa0/0x210 net/socket.c:686
 cleanup_bearer+0x42d/0x4c0 net/tipc/udp_media.c:819
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xcaf/0x1c90 kernel/workqueue.c:3310
 worker_thread+0xf6c/0x1510 kernel/workqueue.c:3391
 kthread+0x531/0x6b0 kernel/kthread.c:389
 ret_from_fork+0x60/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244

Uninit was created at:
 slab_free_hook mm/slub.c:2269 [inline]
 slab_free mm/slub.c:4580 [inline]
 kmem_cache_free+0x207/0xc40 mm/slub.c:4682
 net_free net/core/net_namespace.c:454 [inline]
 cleanup_net+0x16f2/0x19d0 net/core/net_namespace.c:647
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xcaf/0x1c90 kernel/workqueue.c:3310
 worker_thread+0xf6c/0x1510 kernel/workqueue.c:3391
 kthread+0x531/0x6b0 kernel/kthread.c:389
 ret_from_fork+0x60/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244

CPU: 0 UID: 0 PID: 54 Comm: kworker/0:2 Not tainted 6.12.0-rc1-00131-gf66ebf37d69c #7 91723d6f74857f70725e1583cba3cf4adc716cfa
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Workqueue: events cleanup_bearer

Fixes: 26abe14 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
	Reported-by: syzkaller <[email protected]>
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Paolo Abeni <[email protected]>

(cherry picked from commit 6a2fa13)
	Signed-off-by: David Gomez <[email protected]>
@dvdgomez dvdgomez self-assigned this Feb 19, 2025
Copy link
Collaborator

@bmastbergen bmastbergen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥌

@PlaidCat PlaidCat changed the title tipc: Fix use-after-free of kernel socket in cleanup_bearer(). [fip-8-compliant] tipc: Fix use-after-free of kernel socket in cleanup_bearer(). Feb 19, 2025
Copy link
Collaborator

@PlaidCat PlaidCat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be in Draft?

otherwise
:shipit:

@dvdgomez
Copy link
Author

Yeah I was posting this late yesterday so I wanted to review it today before moving it out of draft. Thanks @PlaidCat for catching the title fix I knew I forgot something 👍

@dvdgomez dvdgomez marked this pull request as ready for review February 19, 2025 17:52
@dvdgomez dvdgomez merged commit 56ab0ac into fips-8-compliant/4.18.0-553.16.1 Feb 19, 2025
4 checks passed
@dvdgomez dvdgomez deleted the dgomez-fips-8-compliant-CVE-2024-56642 branch February 19, 2025 17:53
github-actions bot pushed a commit that referenced this pull request Aug 30, 2025
The fbnic driver was presenting with the following locking assert coming
out of a PM resume:
[   42.208116][  T164] RTNL: assertion failed at drivers/net/phy/phylink.c (2611)
[   42.208492][  T164] WARNING: CPU: 1 PID: 164 at drivers/net/phy/phylink.c:2611 phylink_resume+0x190/0x1e0
[   42.208872][  T164] Modules linked in:
[   42.209140][  T164] CPU: 1 UID: 0 PID: 164 Comm: bash Not tainted 6.17.0-rc2-virtme #134 PREEMPT(full)
[   42.209496][  T164] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014
[   42.209861][  T164] RIP: 0010:phylink_resume+0x190/0x1e0
[   42.210057][  T164] Code: 83 e5 01 0f 85 b0 fe ff ff c6 05 1c cd 3e 02 01 90 ba 33 0a 00 00 48 c7 c6 20 3a 1d a5 48 c7 c7 e0 3e 1d a5 e8 21 b8 90 fe 90 <0f> 0b 90 90 e9 86 fe ff ff e8 42 ea 1f ff e9 e2 fe ff ff 48 89 ef
[   42.210708][  T164] RSP: 0018:ffffc90000affbd8 EFLAGS: 00010296
[   42.210983][  T164] RAX: 0000000000000000 RBX: ffff8880078d8400 RCX: 0000000000000000
[   42.211235][  T164] RDX: 0000000000000000 RSI: 1ffffffff4f10938 RDI: 0000000000000001
[   42.211466][  T164] RBP: 0000000000000000 R08: ffffffffa2ae79ea R09: fffffbfff4b3eb84
[   42.211707][  T164] R10: 0000000000000003 R11: 0000000000000000 R12: ffff888007ad8000
[   42.211997][  T164] R13: 0000000000000002 R14: ffff888006a18800 R15: ffffffffa34c59e0
[   42.212234][  T164] FS:  00007f0dc8e39740(0000) GS:ffff88808f51f000(0000) knlGS:0000000000000000
[   42.212505][  T164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   42.212704][  T164] CR2: 00007f0dc8e9fe10 CR3: 000000000b56d003 CR4: 0000000000772ef0
[   42.213227][  T164] PKRU: 55555554
[   42.213366][  T164] Call Trace:
[   42.213483][  T164]  <TASK>
[   42.213565][  T164]  __fbnic_pm_attach.isra.0+0x8e/0xa0
[   42.213725][  T164]  pci_reset_function+0x116/0x1d0
[   42.213895][  T164]  reset_store+0xa0/0x100
[   42.214025][  T164]  ? pci_dev_reset_attr_is_visible+0x50/0x50
[   42.214221][  T164]  ? sysfs_file_kobj+0xc1/0x1e0
[   42.214374][  T164]  ? sysfs_kf_write+0x65/0x160
[   42.214526][  T164]  kernfs_fop_write_iter+0x2f8/0x4c0
[   42.214677][  T164]  ? kernfs_vma_page_mkwrite+0x1f0/0x1f0
[   42.214836][  T164]  new_sync_write+0x308/0x6f0
[   42.214987][  T164]  ? __lock_acquire+0x34c/0x740
[   42.215135][  T164]  ? new_sync_read+0x6f0/0x6f0
[   42.215288][  T164]  ? lock_acquire.part.0+0xbc/0x260
[   42.215440][  T164]  ? ksys_write+0xff/0x200
[   42.215590][  T164]  ? perf_trace_sched_switch+0x6d0/0x6d0
[   42.215742][  T164]  vfs_write+0x65e/0xbb0
[   42.215876][  T164]  ksys_write+0xff/0x200
[   42.215994][  T164]  ? __ia32_sys_read+0xc0/0xc0
[   42.216141][  T164]  ? do_user_addr_fault+0x269/0x9f0
[   42.216292][  T164]  ? rcu_is_watching+0x15/0xd0
[   42.216442][  T164]  do_syscall_64+0xbb/0x360
[   42.216591][  T164]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[   42.216784][  T164] RIP: 0033:0x7f0dc8ea9986

A bit of digging showed that we were invoking the phylink_resume as a part
of the fbnic_up path when we were enabling the service task while not
holding the RTNL lock. We should be enabling this sooner as a part of the
ndo_open path and then just letting the service task come online later.
This will help to enforce the correct locking and brings the phylink
interface online at the same time as the network interface, instead of at a
later time.

I tested this on QEMU to verify this was working by putting the system to
sleep using "echo mem > /sys/power/state" to put the system to sleep in the
guest and then using the command "system_wakeup" in the QEMU monitor.

Fixes: 6968437 ("eth: fbnic: Add link detection")
Signed-off-by: Alexander Duyck <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Link: https://patch.msgid.link/175616257316.1963577.12238158800417771119.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Jakub Kicinski <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 5, 2025
[ Upstream commit 6ede14a ]

The fbnic driver was presenting with the following locking assert coming
out of a PM resume:
[   42.208116][  T164] RTNL: assertion failed at drivers/net/phy/phylink.c (2611)
[   42.208492][  T164] WARNING: CPU: 1 PID: 164 at drivers/net/phy/phylink.c:2611 phylink_resume+0x190/0x1e0
[   42.208872][  T164] Modules linked in:
[   42.209140][  T164] CPU: 1 UID: 0 PID: 164 Comm: bash Not tainted 6.17.0-rc2-virtme #134 PREEMPT(full)
[   42.209496][  T164] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014
[   42.209861][  T164] RIP: 0010:phylink_resume+0x190/0x1e0
[   42.210057][  T164] Code: 83 e5 01 0f 85 b0 fe ff ff c6 05 1c cd 3e 02 01 90 ba 33 0a 00 00 48 c7 c6 20 3a 1d a5 48 c7 c7 e0 3e 1d a5 e8 21 b8 90 fe 90 <0f> 0b 90 90 e9 86 fe ff ff e8 42 ea 1f ff e9 e2 fe ff ff 48 89 ef
[   42.210708][  T164] RSP: 0018:ffffc90000affbd8 EFLAGS: 00010296
[   42.210983][  T164] RAX: 0000000000000000 RBX: ffff8880078d8400 RCX: 0000000000000000
[   42.211235][  T164] RDX: 0000000000000000 RSI: 1ffffffff4f10938 RDI: 0000000000000001
[   42.211466][  T164] RBP: 0000000000000000 R08: ffffffffa2ae79ea R09: fffffbfff4b3eb84
[   42.211707][  T164] R10: 0000000000000003 R11: 0000000000000000 R12: ffff888007ad8000
[   42.211997][  T164] R13: 0000000000000002 R14: ffff888006a18800 R15: ffffffffa34c59e0
[   42.212234][  T164] FS:  00007f0dc8e39740(0000) GS:ffff88808f51f000(0000) knlGS:0000000000000000
[   42.212505][  T164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   42.212704][  T164] CR2: 00007f0dc8e9fe10 CR3: 000000000b56d003 CR4: 0000000000772ef0
[   42.213227][  T164] PKRU: 55555554
[   42.213366][  T164] Call Trace:
[   42.213483][  T164]  <TASK>
[   42.213565][  T164]  __fbnic_pm_attach.isra.0+0x8e/0xa0
[   42.213725][  T164]  pci_reset_function+0x116/0x1d0
[   42.213895][  T164]  reset_store+0xa0/0x100
[   42.214025][  T164]  ? pci_dev_reset_attr_is_visible+0x50/0x50
[   42.214221][  T164]  ? sysfs_file_kobj+0xc1/0x1e0
[   42.214374][  T164]  ? sysfs_kf_write+0x65/0x160
[   42.214526][  T164]  kernfs_fop_write_iter+0x2f8/0x4c0
[   42.214677][  T164]  ? kernfs_vma_page_mkwrite+0x1f0/0x1f0
[   42.214836][  T164]  new_sync_write+0x308/0x6f0
[   42.214987][  T164]  ? __lock_acquire+0x34c/0x740
[   42.215135][  T164]  ? new_sync_read+0x6f0/0x6f0
[   42.215288][  T164]  ? lock_acquire.part.0+0xbc/0x260
[   42.215440][  T164]  ? ksys_write+0xff/0x200
[   42.215590][  T164]  ? perf_trace_sched_switch+0x6d0/0x6d0
[   42.215742][  T164]  vfs_write+0x65e/0xbb0
[   42.215876][  T164]  ksys_write+0xff/0x200
[   42.215994][  T164]  ? __ia32_sys_read+0xc0/0xc0
[   42.216141][  T164]  ? do_user_addr_fault+0x269/0x9f0
[   42.216292][  T164]  ? rcu_is_watching+0x15/0xd0
[   42.216442][  T164]  do_syscall_64+0xbb/0x360
[   42.216591][  T164]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[   42.216784][  T164] RIP: 0033:0x7f0dc8ea9986

A bit of digging showed that we were invoking the phylink_resume as a part
of the fbnic_up path when we were enabling the service task while not
holding the RTNL lock. We should be enabling this sooner as a part of the
ndo_open path and then just letting the service task come online later.
This will help to enforce the correct locking and brings the phylink
interface online at the same time as the network interface, instead of at a
later time.

I tested this on QEMU to verify this was working by putting the system to
sleep using "echo mem > /sys/power/state" to put the system to sleep in the
guest and then using the command "system_wakeup" in the QEMU monitor.

Fixes: 6968437 ("eth: fbnic: Add link detection")
Signed-off-by: Alexander Duyck <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Link: https://patch.msgid.link/175616257316.1963577.12238158800417771119.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants