@PlaidCat commented Sep 5, 2025

  • Download all unprocessed src.rpm
  • For each src.rpm:
    • Find all commits in the changelog up to the last known tag ... in this case 6.12.0-55
    • Re-play the commits in reverse order (oldest in the changelog to newest) with git cherry-pick
    • After replay replace ENTIRE code in branch with rpmbuild -bp from corresponding src.rpm.
    • Tag Rebuild branch
  • Use New Local Build with prodman and test (note test results will be different than usual)

Checking Rebuild Commits for potentially missing commits:

kernel-6.12.0-55.29.1.el10_0

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-6.12.0-55.29.1.el10_0/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v6.12~1..kernel-mainline: 65214
Number of commits in rpm: 28
Number of commits matched with upstream: 24 (85.71%)
Number of commits in upstream but not in rpm: 65190
Number of commits NOT found in upstream: 4 (14.29%)

Rebuilding Kernel on Branch rocky10_0_rebuild_kernel-6.12.0-55.29.1.el10_0 for kernel-6.12.0-55.29.1.el10_0
Clean Cherry Picks: 23 (95.83%)
Empty Cherry Picks: 1 (4.17%)
_______________________________

__EMPTY COMMITS__________________________
c7f2188d68c114095660a950b7e880a1e5a71c8f selftests/bpf: Adjust data size to have ETH_HLEN

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 10, debranding and Rocky Linux branding'
Add partial riscv64 support for build root'
Provide basic VisionFive 2 support'
redhat: Mark kernel incompatible with xdp-tools<1.5.4

kernel-6.12.0-55.30.1.el10_0

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-6.12.0-55.30.1.el10_0/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v6.12~1..kernel-mainline: 65214
Number of commits in rpm: 15
Number of commits matched with upstream: 12 (80.00%)
Number of commits in upstream but not in rpm: 65202
Number of commits NOT found in upstream: 3 (20.00%)

Rebuilding Kernel on Branch rocky10_0_rebuild_kernel-6.12.0-55.30.1.el10_0 for kernel-6.12.0-55.30.1.el10_0
Clean Cherry Picks: 12 (100.00%)
Empty Cherry Picks: 0 (0.00%)
_______________________________

__EMPTY COMMITS__________________________

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 10, debranding and Rocky Linux branding'
Add partial riscv64 support for build root'
Provide basic VisionFive 2 support'

BUILD

[jmaple@devbox code]$ egrep -B 5 -A 5 "\[TIMER\]|^Starting Build" $(ls -t kbuild* | head -n1)
/mnt/code/kernel-src-tree
Running make mrproper...
  CLEAN   scripts/basic
  CLEAN   scripts/kconfig
  CLEAN   include/config include/generated
[TIMER]{MRPROPER}: 7s
x86_64 architecture detected, copying config
'configs/kernel-x86_64-rhel.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky10_0_rebuild-0b50c952cd24"
Making olddefconfig
--
  HOSTCC  scripts/kconfig/util.o
  HOSTLD  scripts/kconfig/conf
#
# configuration written to .config
#
Starting Build
  GEN     arch/x86/include/generated/asm/orc_hash.h
  WRAP    arch/x86/include/generated/uapi/asm/bpf_perf_event.h
  WRAP    arch/x86/include/generated/uapi/asm/errno.h
  WRAP    arch/x86/include/generated/uapi/asm/fcntl.h
  WRAP    arch/x86/include/generated/uapi/asm/ioctl.h
--
  LD [M]  net/qrtr/qrtr.ko
  LD [M]  net/qrtr/qrtr-mhi.ko
  BTF [M] net/hsr/hsr.ko
  BTF [M] net/qrtr/qrtr.ko
  BTF [M] net/qrtr/qrtr-mhi.ko
[TIMER]{BUILD}: 2090s
Making Modules
  SYMLINK /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/build
  INSTALL /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/modules.order
  INSTALL /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/modules.builtin
  INSTALL /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/modules.builtin.modinfo
--
  SIGN    /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/kernel/net/hsr/hsr.ko
  SIGN    /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/kernel/net/qrtr/qrtr.ko
  SIGN    /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/kernel/net/qrtr/qrtr-mhi.ko
  SIGN    /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/kernel/net/ceph/libceph.ko
  DEPMOD  /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+
[TIMER]{MODULES}: 8s
Making Install
  INSTALL /boot
[TIMER]{INSTALL}: 17s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-6.12.0-rocky10_0_rebuild-0b50c952cd24+ and Index to 2
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 7s
[TIMER]{BUILD}: 2090s
[TIMER]{MODULES}: 8s
[TIMER]{INSTALL}: 17s
[TIMER]{TOTAL} 2127s
Rebooting in 10 seconds

KselfTest

[jmaple@devbox code]$ ls -rt kselftest.* | tail -n4 | while read line; do echo $line; grep '^ok ' $line | wc -l ; done
kselftest.6.12.0-jmaple_sig-cloud-10_6.12.0-55.21.1.el10_0-8cd774f1d1e4+.log
498
kselftest.6.12.0-rocky10_0_rebuild-09e336dca27b+.log
498
kselftest.6.12.0-jmaple_sig-cloud-10_6.12.0-55.27.1.el10_0-7f919518cf4+.log
498
kselftest.6.12.0-rocky10_0_rebuild-0b50c952cd24+.log
497

jira LE-4076
cve CVE-2025-38380
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Michael J. Ruhl <[email protected]>
commit 3d30048

The i2c_dw_xfer_init() function requires msgs and msg_write_idx from the
dev context to be initialized.

amd_i2c_dw_xfer_quirk() inits msgs and msgs_num, but not msg_write_idx.

This could allow an out of bounds access (of msgs).

Initialize msg_write_idx before calling i2c_dw_xfer_init().

	Reviewed-by: Andy Shevchenko <[email protected]>
Fixes: 17631e8 ("i2c: designware: Add driver support for AMD NAVI GPU")
	Cc: <[email protected]> # v5.13+
	Signed-off-by: Michael J. Ruhl <[email protected]>
	Signed-off-by: Andi Shyti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 3d30048)
	Signed-off-by: Jonathan Maple <[email protected]>
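As a rough illustration of the fix described above (a hedged sketch, not the upstream diff; the field names msgs, msgs_num and msg_write_idx come from the commit text, everything else is assumed):

#include <linux/i2c.h>
#include "i2c-designware-core.h"        /* struct dw_i2c_dev (driver-internal) */

/* Illustrative sketch of the quirk path: the point is the msg_write_idx
 * initialization before i2c_dw_xfer_init() walks dev->msgs.
 */
static int amd_i2c_dw_xfer_quirk_sketch(struct i2c_adapter *adap,
                                        struct i2c_msg *msgs, int num)
{
        struct dw_i2c_dev *dev = i2c_get_adapdata(adap);

        dev->msgs = msgs;
        dev->msgs_num = num;
        dev->msg_write_idx = 0;         /* the missing initialization */

        i2c_dw_xfer_init(dev);
        /* ... remainder of the quirk transfer loop ... */
        return 0;
}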
jira LE-4076
cve CVE-2025-21867
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Shigeru Yoshida <[email protected]>
commit 6b3d638

KMSAN reported a use-after-free issue in eth_skb_pkt_type()[1]. The
cause of the issue was that eth_skb_pkt_type() accessed skb's data
that didn't contain an Ethernet header. This occurs when
bpf_prog_test_run_xdp() passes an invalid value as the user_data
argument to bpf_test_init().

Fix this by returning an error when user_data is less than ETH_HLEN in
bpf_test_init(). Additionally, remove the check for "if (user_size >
size)" as it is unnecessary.

[1]
BUG: KMSAN: use-after-free in eth_skb_pkt_type include/linux/etherdevice.h:627 [inline]
BUG: KMSAN: use-after-free in eth_type_trans+0x4ee/0x980 net/ethernet/eth.c:165
 eth_skb_pkt_type include/linux/etherdevice.h:627 [inline]
 eth_type_trans+0x4ee/0x980 net/ethernet/eth.c:165
 __xdp_build_skb_from_frame+0x5a8/0xa50 net/core/xdp.c:635
 xdp_recv_frames net/bpf/test_run.c:272 [inline]
 xdp_test_run_batch net/bpf/test_run.c:361 [inline]
 bpf_test_run_xdp_live+0x2954/0x3330 net/bpf/test_run.c:390
 bpf_prog_test_run_xdp+0x148e/0x1b10 net/bpf/test_run.c:1318
 bpf_prog_test_run+0x5b7/0xa30 kernel/bpf/syscall.c:4371
 __sys_bpf+0x6a6/0xe20 kernel/bpf/syscall.c:5777
 __do_sys_bpf kernel/bpf/syscall.c:5866 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5864 [inline]
 __x64_sys_bpf+0xa4/0xf0 kernel/bpf/syscall.c:5864
 x64_sys_call+0x2ea0/0x3d90 arch/x86/include/generated/asm/syscalls_64.h:322
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xd9/0x1d0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Uninit was created at:
 free_pages_prepare mm/page_alloc.c:1056 [inline]
 free_unref_page+0x156/0x1320 mm/page_alloc.c:2657
 __free_pages+0xa3/0x1b0 mm/page_alloc.c:4838
 bpf_ringbuf_free kernel/bpf/ringbuf.c:226 [inline]
 ringbuf_map_free+0xff/0x1e0 kernel/bpf/ringbuf.c:235
 bpf_map_free kernel/bpf/syscall.c:838 [inline]
 bpf_map_free_deferred+0x17c/0x310 kernel/bpf/syscall.c:862
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xa2b/0x1b60 kernel/workqueue.c:3310
 worker_thread+0xedf/0x1550 kernel/workqueue.c:3391
 kthread+0x535/0x6b0 kernel/kthread.c:389
 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

CPU: 1 UID: 0 PID: 17276 Comm: syz.1.16450 Not tainted 6.12.0-05490-g9bb88c659673 #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014

Fixes: be3d72a ("bpf: move user_size out of bpf_test_init")
	Reported-by: syzkaller <[email protected]>
	Suggested-by: Martin KaFai Lau <[email protected]>
	Signed-off-by: Shigeru Yoshida <[email protected]>
	Signed-off-by: Martin KaFai Lau <[email protected]>
	Acked-by: Stanislav Fomichev <[email protected]>
	Acked-by: Daniel Borkmann <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Alexei Starovoitov <[email protected]>
(cherry picked from commit 6b3d638)
	Signed-off-by: Jonathan Maple <[email protected]>
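A minimal sketch of the size check described above (function shape and parameter names are illustrative, not the exact upstream diff):

#include <linux/if_ether.h>     /* ETH_HLEN */
#include <linux/errno.h>
#include <linux/types.h>

/* Illustrative: reject test inputs that are too small to hold an Ethernet
 * header before anything derives the packet type from skb->data.
 */
static int test_init_check_size(u32 user_size)
{
        if (user_size < ETH_HLEN)
                return -EINVAL;
        return 0;
}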
jira LE-4076
cve CVE-2025-21867
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Shigeru Yoshida <[email protected]>
commit c7f2188
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-6.12.0-55.29.1.el10_0/c7f2188d.failed

The function bpf_test_init() now returns an error if user_size
(.data_size_in) is less than ETH_HLEN, causing the tests to
fail. Adjust the data size to ensure it meets the requirement of
ETH_HLEN.

	Signed-off-by: Shigeru Yoshida <[email protected]>
	Signed-off-by: Martin KaFai Lau <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Alexei Starovoitov <[email protected]>
(cherry picked from commit c7f2188)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c
jira LE-4076
cve CVE-2025-38250
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Kuniyuki Iwashima <[email protected]>
commit 1d61231

syzbot reported use-after-free in vhci_flush() without repro. [0]

From the splat, a thread close()d a vhci file descriptor while
its device was being used by ioctl() on another thread.

Once the last fd refcnt is released, vhci_release() calls
hci_unregister_dev(), hci_free_dev(), and kfree() for struct
vhci_data, which is set to hci_dev->dev->driver_data.

The problem is that there is no synchronisation after unlinking
hdev from hci_dev_list in hci_unregister_dev().  There might be
another thread still accessing the hdev which was fetched before
the unlink operation.

We can use SRCU for such synchronisation.

Let's run hci_dev_reset() under SRCU and wait for its completion
in hci_unregister_dev().

Another option would be to restore hci_dev->destruct(), which was
removed in commit 587ae08 ("Bluetooth: Remove unused
hci-destruct cb").  However, this would not be a good solution, as
we should not run hci_unregister_dev() while there are in-flight
ioctl() requests, which could lead to another data-race KCSAN splat.

Note that other drivers seem to have the same problem, for example,
virtbt_remove().

[0]:
BUG: KASAN: slab-use-after-free in skb_queue_empty_lockless include/linux/skbuff.h:1891 [inline]
BUG: KASAN: slab-use-after-free in skb_queue_purge_reason+0x99/0x360 net/core/skbuff.c:3937
Read of size 8 at addr ffff88807cb8d858 by task syz.1.219/6718

CPU: 1 UID: 0 PID: 6718 Comm: syz.1.219 Not tainted 6.16.0-rc1-syzkaller-00196-g08207f42d3ff #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
Call Trace:
 <TASK>
 dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:408 [inline]
 print_report+0xd2/0x2b0 mm/kasan/report.c:521
 kasan_report+0x118/0x150 mm/kasan/report.c:634
 skb_queue_empty_lockless include/linux/skbuff.h:1891 [inline]
 skb_queue_purge_reason+0x99/0x360 net/core/skbuff.c:3937
 skb_queue_purge include/linux/skbuff.h:3368 [inline]
 vhci_flush+0x44/0x50 drivers/bluetooth/hci_vhci.c:69
 hci_dev_do_reset net/bluetooth/hci_core.c:552 [inline]
 hci_dev_reset+0x420/0x5c0 net/bluetooth/hci_core.c:592
 sock_do_ioctl+0xd9/0x300 net/socket.c:1190
 sock_ioctl+0x576/0x790 net/socket.c:1311
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:907 [inline]
 __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fcf5b98e929
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fcf5c7b9038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fcf5bbb6160 RCX: 00007fcf5b98e929
RDX: 0000000000000000 RSI: 00000000400448cb RDI: 0000000000000009
RBP: 00007fcf5ba10b39 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007fcf5bbb6160 R15: 00007ffd6353d528
 </TASK>

Allocated by task 6535:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
 poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
 __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394
 kasan_kmalloc include/linux/kasan.h:260 [inline]
 __kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4359
 kmalloc_noprof include/linux/slab.h:905 [inline]
 kzalloc_noprof include/linux/slab.h:1039 [inline]
 vhci_open+0x57/0x360 drivers/bluetooth/hci_vhci.c:635
 misc_open+0x2bc/0x330 drivers/char/misc.c:161
 chrdev_open+0x4c9/0x5e0 fs/char_dev.c:414
 do_dentry_open+0xdf0/0x1970 fs/open.c:964
 vfs_open+0x3b/0x340 fs/open.c:1094
 do_open fs/namei.c:3887 [inline]
 path_openat+0x2ee5/0x3830 fs/namei.c:4046
 do_filp_open+0x1fa/0x410 fs/namei.c:4073
 do_sys_openat2+0x121/0x1c0 fs/open.c:1437
 do_sys_open fs/open.c:1452 [inline]
 __do_sys_openat fs/open.c:1468 [inline]
 __se_sys_openat fs/open.c:1463 [inline]
 __x64_sys_openat+0x138/0x170 fs/open.c:1463
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 6535:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576
 poison_slab_object mm/kasan/common.c:247 [inline]
 __kasan_slab_free+0x62/0x70 mm/kasan/common.c:264
 kasan_slab_free include/linux/kasan.h:233 [inline]
 slab_free_hook mm/slub.c:2381 [inline]
 slab_free mm/slub.c:4643 [inline]
 kfree+0x18e/0x440 mm/slub.c:4842
 vhci_release+0xbc/0xd0 drivers/bluetooth/hci_vhci.c:671
 __fput+0x44c/0xa70 fs/file_table.c:465
 task_work_run+0x1d1/0x260 kernel/task_work.c:227
 exit_task_work include/linux/task_work.h:40 [inline]
 do_exit+0x6ad/0x22e0 kernel/exit.c:955
 do_group_exit+0x21c/0x2d0 kernel/exit.c:1104
 __do_sys_exit_group kernel/exit.c:1115 [inline]
 __se_sys_exit_group kernel/exit.c:1113 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1113
 x64_sys_call+0x21ba/0x21c0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff88807cb8d800
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 88 bytes inside of
 freed 1024-byte region [ffff88807cb8d800, ffff88807cb8dc00)

Fixes: bf18c71 ("Bluetooth: vhci: Free driver_data on file release")
	Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=f62d64848fc4c7c30cd6
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Acked-by: Paul Menzel <[email protected]>
	Signed-off-by: Luiz Augusto von Dentz <[email protected]>
(cherry picked from commit 1d61231)
	Signed-off-by: Jonathan Maple <[email protected]>
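The commit message above describes running the reset path under SRCU and waiting for readers before teardown. A minimal sketch of that pattern, assuming a per-hdev srcu_struct member (the exact placement and naming in the upstream patch may differ):

#include <linux/srcu.h>

/* Assumed: struct hci_dev gains a struct srcu_struct named srcu,
 * initialized with init_srcu_struct() when the device is registered.
 */
static int hci_dev_reset_under_srcu(struct hci_dev *hdev)
{
        int idx, err;

        idx = srcu_read_lock(&hdev->srcu);      /* read-side section */
        err = hci_dev_do_reset(hdev);           /* may touch driver_data */
        srcu_read_unlock(&hdev->srcu, idx);
        return err;
}

static void hci_unregister_wait_readers(struct hci_dev *hdev)
{
        /* After unlinking hdev from hci_dev_list, wait for in-flight
         * readers before the driver frees its driver_data.
         */
        synchronize_srcu(&hdev->srcu);
}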
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Mete Durlu <[email protected]>
commit 9988df0

Add early polarization detection instead of assuming horizontal
polarization.

	Signed-off-by: Mete Durlu <[email protected]>
	Reviewed-by: Heiko Carstens <[email protected]>
	Signed-off-by: Alexander Gordeev <[email protected]>
(cherry picked from commit 9988df0)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit c4a585e

In commit 6ee600b ("s390/pci: remove hotplug slot when releasing the
device") the zpci_exit_slot() was moved from zpci_device_reserved() to
zpci_release_device() with the intention of keeping the hotplug slot
around until the device is actually removed.

Now zpci_release_device() is only called once all references are
dropped. Since the zPCI subsystem only drops its reference once the
device is in the reserved state it follows that zpci_release_device()
must only deal with devices in the reserved state. Despite that it
contains code to tear down from both configured and standby state. For
the standby case this already includes the removal of the hotplug slot
so would cause a double removal if a device was ever removed in
either configured or standby state.

Instead of causing a potential double removal in a case that should
never happen explicitly WARN_ON() if a device in non-reserved state is
released and get rid of the dead code cases.

Fixes: 6ee600b ("s390/pci: remove hotplug slot when releasing the device")
	Reviewed-by: Matthew Rosato <[email protected]>
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit c4a585e)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 42420c5

The zpci_create_device() function returns an error pointer that needs to
be checked before dereferencing it as a struct zpci_dev pointer. Add the
missing check in __clp_add() where it was missed when adding the
scan_list in the fixed commit. Simply not adding the device to the scan
list results in the previous behavior.

	Cc: [email protected]
Fixes: 0467cdd ("s390/pci: Sort PCI functions prior to creating virtual busses")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Reviewed-by: Gerd Bayer <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 42420c5)
	Signed-off-by: Jonathan Maple <[email protected]>
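A minimal sketch of the missing error-pointer check (names follow the commit text; the signature and list field are assumed):

#include <linux/err.h>
#include <linux/list.h>

static int clp_add_sketch(u32 fid, u32 fh, int state, struct list_head *scan_list)
{
        struct zpci_dev *zdev = zpci_create_device(fid, fh, state);

        if (IS_ERR(zdev))
                return 0;       /* skip: keep previous behavior, just don't list it */

        list_add_tail(&zdev->entry, scan_list);
        return 0;
}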
…hild VFs

jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 05a2538

With commit bcb5d6c ("s390/pci: introduce lock to synchronize state
of zpci_dev's") the code to ignore power off of a PF that has child VFs
was changed from a direct return to a goto to the unlock and
pci_dev_put() section. The change however left the existing pci_dev_put()
untouched resulting in a doubple put. This can subsequently cause a use
after free if the struct pci_dev is released in an unexpected state.
Fix this by removing the extra pci_dev_put().

	Cc: [email protected]
Fixes: bcb5d6c ("s390/pci: introduce lock to synchronize state of zpci_dev's")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Reviewed-by: Gerd Bayer <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 05a2538)
	Signed-off-by: Jonathan Maple <[email protected]>
…device()

jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit d76f963

Remove zpci_bus_remove_device() and zpci_disable_device() calls from
zpci_release_device(). These calls were done when the device
transitioned into the ZPCI_FN_STATE_STANDBY state which is guaranteed to
happen before it enters the ZPCI_FN_STATE_RESERVED state. When
zpci_release_device() is called the device is known to be in the
ZPCI_FN_STATE_RESERVED state which is also checked by a WARN_ON().

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Reviewed-by: Julian Ruess <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit d76f963)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 47c3978

disable_slot() takes a struct zpci_dev from the Configured to the
Standby state. In Standby there is still a hotplug slot, so this is not
usually a case of sysfs self deletion. This is important because self
deletion gets very hairy in terms of locking (see for example
recover_store() in arch/s390/pci/pci_sysfs.c).

Because the pci_dev_put() is not within the critical section of the
zdev->state_lock however, disable_slot() can turn into a case of self
deletion if zPCI device event handling slips between the mutex_unlock()
and the pci_dev_put(). If the latter is the last put and
zpci_release_device() is called this then tries to remove the hotplug
slot via zpci_exit_slot() which will try to remove the hotplug slot
directory the disable_slot() is part of i.e. self deletion.

Prevent this by widening the zdev->state_lock critical section to
include the pci_dev_put() which is then guaranteed to happen with the
struct zpci_dev still in Standby state ensuring it will not lead to
a zpci_release_device() call as at least the zPCI event handling code
still holds a reference.

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 47c3978)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 4b1815a

The architecture assumes that PCI functions can be removed synchronously
as PCI events are processed. This however clashes with the reference
counting of struct pci_dev which allows device drivers to hold on to a
struct pci_dev reference even as the underlying device is removed. To
bridge this gap commit 2a671f7 ("s390/pci: fix use after free of
zpci_dev") keeps the struct zpci_dev in ZPCI_FN_STATE_RESERVED state
until common code releases the struct pci_dev. Only when all references
are dropped, the struct zpci_dev can be removed and freed.

Later commit a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
moved the deletion of the struct zpci_dev from the zpci_list in
zpci_release_device() to the point where the device is reserved. This
was done to prevent handling events for a device that is already being
removed, e.g. when the platform generates both PCI event codes 0x304
and 0x308. In retrospect, deletion from the zpci_list in the release
function without holding the zpci_list_lock was also racy.

A side effect of this handling is that if the underlying device
re-appears while the struct zpci_dev is in the ZPCI_FN_STATE_RESERVED
state, the new and old instances of the struct zpci_dev and/or struct
pci_dev may clash. For example when trying to create the IOMMU sysfs
files for the new instance. In this case, re-adding the new instance is
aborted. The old instance is removed, and the device will remain absent
until the platform issues another event.

Fix this by allowing the struct zpci_dev to be brought back up right
until it is finally removed. To this end also keep the struct zpci_dev
in the zpci_list until it is finally released when all references have
been dropped.

Deletion from the zpci_list from within the release function is made
safe by using kref_put_lock() with the zpci_list_lock. This ensures that
the releasing code holds the last reference.

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 4b1815a)
	Signed-off-by: Jonathan Maple <[email protected]>
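A minimal sketch of the kref_put_lock() pattern the commit describes: the final put acquires zpci_list_lock, so the release callback can unlink the zdev from zpci_list while holding the last reference. Helper and field names are taken from the commit text where possible; the rest is illustrative:

#include <linux/kref.h>
#include <linux/list.h>
#include <linux/spinlock.h>

/* Called only for the final reference, with zpci_list_lock already held. */
static void zpci_release_device_sketch(struct kref *kref)
{
        struct zpci_dev *zdev = container_of(kref, struct zpci_dev, kref);

        list_del(&zdev->entry);                 /* safe: lock held, last ref */
        spin_unlock(&zpci_list_lock);           /* kref_put_lock() leaves it locked */

        /* ... final tear down and kfree(zdev) ... */
}

static void zpci_zdev_put_sketch(struct zpci_dev *zdev)
{
        kref_put_lock(&zdev->kref, zpci_release_device_sketch, &zpci_list_lock);
}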
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 774a1fa

Prior changes ensured that when zpci_release_device() is called and it
removed the zdev from the zpci_list this instance can not be found via
the zpci_list anymore even while allowing re-add of reserved devices.
This only accounts for the overall lifetime and zpci_list addition and
removal, it does not yet prevent concurrent add of a new instance for
the same underlying device. Such concurrent add would subsequently cause
issues such as attempted re-use of the same IOMMU sysfs directory and is
generally undesired.

Introduce a new zpci_add_remove_lock mutex to serialize adding a new
device with removal. Together this ensures that if a struct zpci_dev is
not found in the zpci_list it was either already removed and torn down,
or its removal and tear down is in progress with the
zpci_add_remove_lock held.

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 774a1fa)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Haren Myneni <[email protected]>
commit 05aa156

The mapping VMA address is saved in VAS window struct when the
paste address is mapped. This VMA address is used during migration
to unmap the paste address if the window is active. The paste
address mapping will be removed when the window is closed or with
the munmap(). But the VMA address in the VAS window is not updated
with munmap() which is causing invalid access during migration.

The KASAN report shows:
[16386.254991] BUG: KASAN: slab-use-after-free in reconfig_close_windows+0x1a0/0x4e8
[16386.255043] Read of size 8 at addr c00000014a819670 by task drmgr/696928

[16386.255096] CPU: 29 UID: 0 PID: 696928 Comm: drmgr Kdump: loaded Tainted: G    B              6.11.0-rc5-nxgzip #2
[16386.255128] Tainted: [B]=BAD_PAGE
[16386.255148] Hardware name: IBM,9080-HEX Power11 (architected) 0x820200 0xf000007 of:IBM,FW1110.00 (NH1110_016) hv:phyp pSeries
[16386.255181] Call Trace:
[16386.255202] [c00000016b297660] [c0000000018ad0ac] dump_stack_lvl+0x84/0xe8 (unreliable)
[16386.255246] [c00000016b297690] [c0000000006e8a90] print_report+0x19c/0x764
[16386.255285] [c00000016b297760] [c0000000006e9490] kasan_report+0x128/0x1f8
[16386.255309] [c00000016b297880] [c0000000006eb5c8] __asan_load8+0xac/0xe0
[16386.255326] [c00000016b2978a0] [c00000000013f898] reconfig_close_windows+0x1a0/0x4e8
[16386.255343] [c00000016b297990] [c000000000140e58] vas_migration_handler+0x3a4/0x3fc
[16386.255368] [c00000016b297a90] [c000000000128848] pseries_migrate_partition+0x4c/0x4c4
...

[16386.256136] Allocated by task 696554 on cpu 31 at 16377.277618s:
[16386.256149]  kasan_save_stack+0x34/0x68
[16386.256163]  kasan_save_track+0x34/0x80
[16386.256175]  kasan_save_alloc_info+0x58/0x74
[16386.256196]  __kasan_slab_alloc+0xb8/0xdc
[16386.256209]  kmem_cache_alloc_noprof+0x200/0x3d0
[16386.256225]  vm_area_alloc+0x44/0x150
[16386.256245]  mmap_region+0x214/0x10c4
[16386.256265]  do_mmap+0x5fc/0x750
[16386.256277]  vm_mmap_pgoff+0x14c/0x24c
[16386.256292]  ksys_mmap_pgoff+0x20c/0x348
[16386.256303]  sys_mmap+0xd0/0x160
...

[16386.256350] Freed by task 0 on cpu 31 at 16386.204848s:
[16386.256363]  kasan_save_stack+0x34/0x68
[16386.256374]  kasan_save_track+0x34/0x80
[16386.256384]  kasan_save_free_info+0x64/0x10c
[16386.256396]  __kasan_slab_free+0x120/0x204
[16386.256415]  kmem_cache_free+0x128/0x450
[16386.256428]  vm_area_free_rcu_cb+0xa8/0xd8
[16386.256441]  rcu_do_batch+0x2c8/0xcf0
[16386.256458]  rcu_core+0x378/0x3c4
[16386.256473]  handle_softirqs+0x20c/0x60c
[16386.256495]  do_softirq_own_stack+0x6c/0x88
[16386.256509]  do_softirq_own_stack+0x58/0x88
[16386.256521]  __irq_exit_rcu+0x1a4/0x20c
[16386.256533]  irq_exit+0x20/0x38
[16386.256544]  interrupt_async_exit_prepare.constprop.0+0x18/0x2c
...

[16386.256717] Last potentially related work creation:
[16386.256729]  kasan_save_stack+0x34/0x68
[16386.256741]  __kasan_record_aux_stack+0xcc/0x12c
[16386.256753]  __call_rcu_common.constprop.0+0x94/0xd04
[16386.256766]  vm_area_free+0x28/0x3c
[16386.256778]  remove_vma+0xf4/0x114
[16386.256797]  do_vmi_align_munmap.constprop.0+0x684/0x870
[16386.256811]  __vm_munmap+0xe0/0x1f8
[16386.256821]  sys_munmap+0x54/0x6c
[16386.256830]  system_call_exception+0x1a0/0x4a0
[16386.256841]  system_call_vectored_common+0x15c/0x2ec

[16386.256868] The buggy address belongs to the object at c00000014a819670
                which belongs to the cache vm_area_struct of size 168
[16386.256887] The buggy address is located 0 bytes inside of
                freed 168-byte region [c00000014a819670, c00000014a819718)

[16386.256915] The buggy address belongs to the physical page:
[16386.256928] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14a81
[16386.256950] memcg:c0000000ba430001
[16386.256961] anon flags: 0x43ffff800000000(node=4|zone=0|lastcpupid=0x7ffff)
[16386.256975] page_type: 0xfdffffff(slab)
[16386.256990] raw: 043ffff800000000 c00000000501c080 0000000000000000 5deadbee00000001
[16386.257003] raw: 0000000000000000 00000000011a011a 00000001fdffffff c0000000ba430001
[16386.257018] page dumped because: kasan: bad access detected

This patch adds close() callback in vas_vm_ops vm_operations_struct
which will be executed during munmap() before freeing VMA. The VMA
address in the VAS window is set to NULL after holding the window
mmap_mutex.

Fixes: 37e6764 ("powerpc/pseries/vas: Add VAS migration handler")
	Signed-off-by: Haren Myneni <[email protected]>
	Signed-off-by: Madhavan Srinivasan <[email protected]>
Link: https://patch.msgid.link/[email protected]

(cherry picked from commit 05aa156)
	Signed-off-by: Jonathan Maple <[email protected]>
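A minimal sketch of the close() callback approach from the last paragraph above; the window lookup and the task_ref field names are assumptions based on the commit text, not the upstream diff:

#include <linux/mm.h>
#include <linux/mutex.h>

/* Illustrative: clear the cached VMA when the mapping is torn down so the
 * migration path never dereferences a stale pointer after munmap().
 */
static void vas_vm_close_sketch(struct vm_area_struct *vma)
{
        struct vas_window *win = vma->vm_file->private_data;    /* assumed lookup */

        mutex_lock(&win->task_ref.mmap_mutex);                  /* assumed fields */
        win->task_ref.vma = NULL;
        mutex_unlock(&win->task_ref.mmap_mutex);
}

static const struct vm_operations_struct vas_vm_ops_sketch = {
        .close = vas_vm_close_sketch,
        /* .fault etc. unchanged */
};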
jira LE-4076
cve CVE-2025-38124
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Shiming Cheng <[email protected]>
commit 3382a1e

Commit a1e40ac ("net: gso: fix udp gso fraglist segmentation after
pull from frag_list") detected invalid geometry in frag_list skbs and
redirects them from skb_segment_list to more robust skb_segment. But some
packets with modified geometry can also hit bugs in that code. We don't
know how many such cases exist. Addressing each one by one also requires
touching the complex skb_segment code, which risks introducing bugs for
other types of skbs. Instead, linearize all these packets that fail the
basic invariants on gso fraglist skbs. That is more robust.

If only part of the fraglist payload is pulled into head_skb, it will
always cause an exception when splitting skbs with skb_segment. For
detailed call stack information, see below.

Valid SKB_GSO_FRAGLIST skbs
- consist of two or more segments
- the head_skb holds the protocol headers plus first gso_size
- one or more frag_list skbs hold exactly one segment
- all but the last must be gso_size

Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
modify fraglist skbs, breaking these invariants.

In extreme cases they pull part of the payload into the skb's linear
area. For UDP, this caused three payloads with lengths of (11,11,10)
bytes to become (12,10,10) bytes after the pull.

The skbs no longer meet the above SKB_GSO_FRAGLIST conditions because
payload was pulled into head_skb; they need to be linearized before
being passed to the regular skb_segment.

    skb_segment+0xcd0/0xd14
    __udp_gso_segment+0x334/0x5f4
    udp4_ufo_fragment+0x118/0x15c
    inet_gso_segment+0x164/0x338
    skb_mac_gso_segment+0xc4/0x13c
    __skb_gso_segment+0xc4/0x124
    validate_xmit_skb+0x9c/0x2c0
    validate_xmit_skb_list+0x4c/0x80
    sch_direct_xmit+0x70/0x404
    __dev_queue_xmit+0x64c/0xe5c
    neigh_resolve_output+0x178/0x1c4
    ip_finish_output2+0x37c/0x47c
    __ip_finish_output+0x194/0x240
    ip_finish_output+0x20/0xf4
    ip_output+0x100/0x1a0
    NF_HOOK+0xc4/0x16c
    ip_forward+0x314/0x32c
    ip_rcv+0x90/0x118
    __netif_receive_skb+0x74/0x124
    process_backlog+0xe8/0x1a4
    __napi_poll+0x5c/0x1f8
    net_rx_action+0x154/0x314
    handle_softirqs+0x154/0x4b8

    [118.376811] [C201134] rxq0_pus: [name:bug&]kernel BUG at net/core/skbuff.c:4278!
    [118.376829] [C201134] rxq0_pus: [name:traps&]Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
    [118.470774] [C201134] rxq0_pus: [name:mrdump&]Kernel Offset: 0x178cc00000 from 0xffffffc008000000
    [118.470810] [C201134] rxq0_pus: [name:mrdump&]PHYS_OFFSET: 0x40000000
    [118.470827] [C201134] rxq0_pus: [name:mrdump&]pstate: 60400005 (nZCv daif +PAN -UAO)
    [118.470848] [C201134] rxq0_pus: [name:mrdump&]pc : [0xffffffd79598aefc] skb_segment+0xcd0/0xd14
    [118.470900] [C201134] rxq0_pus: [name:mrdump&]lr : [0xffffffd79598a5e8] skb_segment+0x3bc/0xd14
    [118.470928] [C201134] rxq0_pus: [name:mrdump&]sp : ffffffc008013770

Fixes: a1e40ac ("gso: fix udp gso fraglist segmentation after pull from frag_list")
	Signed-off-by: Shiming Cheng <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>
(cherry picked from commit 3382a1e)
	Signed-off-by: Jonathan Maple <[email protected]>
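A minimal sketch of the "linearize, then fall back to skb_segment()" idea; the invariant check is condensed into a flag and the surrounding function is illustrative:

#include <linux/err.h>
#include <linux/skbuff.h>

static struct sk_buff *segment_or_linearize(struct sk_buff *skb,
                                            netdev_features_t features,
                                            bool fraglist_invariants_ok)
{
        /* A fraglist skb whose geometry was modified by NAT/BPF pulls is
         * flattened first so the regular skb_segment() path can split it.
         */
        if (!fraglist_invariants_ok && skb_linearize(skb))
                return ERR_PTR(-ENOMEM);        /* linearize may fail */

        return skb_segment(skb, features);
}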
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Jakub Kicinski <[email protected]>
commit 920efe3

bpf_offload caught a spurious warning in TC recently, but the error
message did not provide enough information to know what the problem
is:

  FAIL: Found 'netdevsim' in command output, leaky extack?

Add the extack to the output:

  FAIL: Unexpected command output, leaky extack? ('netdevsim', 'Warning: Filter with specified priority/protocol not found.')

	Acked-by: Stanislav Fomichev <[email protected]>
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 920efe3)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Jakub Kicinski <[email protected]>
commit f9d2f5d

We hit a following exception on timeout, nmaps is never set:

    Test bpftool bound info reporting (own ns)...
    Traceback (most recent call last):
      File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 1128, in <module>
        check_dev_info(False, "")
      File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 583, in check_dev_info
        maps = bpftool_map_list_wait(expected=2, ns=ns)
      File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 215, in bpftool_map_list_wait
        raise Exception("Time out waiting for map counts to stabilize want %d, have %d" % (expected, nmaps))
    NameError: name 'nmaps' is not defined

	Acked-by: Stanislav Fomichev <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit f9d2f5d)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Jakub Kicinski <[email protected]>
commit 56a5869

After installing pahole on the CI image we have a new map created
by libbpf. Ignore it otherwise we see:

  Exception: Time out waiting for map counts to stabilize want 2, have 3

	Acked-by: Stanislav Fomichev <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 56a5869)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-38471
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Jakub Kicinski <[email protected]>
commit 4ab26bc

After recent changes in net-next TCP compacts skbs much more
aggressively. This unearthed a bug in TLS where we may try
to operate on an old skb when checking if all skbs in the
queue have matching decrypt state and geometry.

    BUG: KASAN: slab-use-after-free in tls_strp_check_rcv+0x898/0x9a0 [tls]
    (net/tls/tls_strp.c:436 net/tls/tls_strp.c:530 net/tls/tls_strp.c:544)
    Read of size 4 at addr ffff888013085750 by task tls/13529

    CPU: 2 UID: 0 PID: 13529 Comm: tls Not tainted 6.16.0-rc5-virtme
    Call Trace:
     kasan_report+0xca/0x100
     tls_strp_check_rcv+0x898/0x9a0 [tls]
     tls_rx_rec_wait+0x2c9/0x8d0 [tls]
     tls_sw_recvmsg+0x40f/0x1aa0 [tls]
     inet_recvmsg+0x1c3/0x1f0

Always reload the queue, fast path is to have the record in the queue
when we wake, anyway (IOW the path going down "if !strp->stm.full_len").

Fixes: 0d87bbd ("tls: strp: make sure the TCP skbs do not have overlapping data")
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 4ab26bc)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Anumula Murali Mohan Reddy <[email protected]>
commit 356983f

t4_set_vf_mac_acl() uses the PF to set the MAC address, while
t4vf_get_vf_mac_acl() uses the port number to get it; this leads to an
error when attempting to set the MAC address on VFs of PF2 and PF3.
This patch fixes the issue by using the port number to set the MAC address.

Fixes: e0cdac6 ("cxgb4vf: configure ports accessible by the VF")
	Signed-off-by: Anumula Murali Mohan Reddy <[email protected]>
	Signed-off-by: Potnuri Bharat Teja <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 356983f)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-38200
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Kyungwook Boo <[email protected]>
commit 015bac5

When the device sends a specific input, an integer underflow can occur, leading
to MMIO write access to an invalid page.

Prevent the integer underflow by changing the type of related variables.

	Signed-off-by: Kyungwook Boo <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/T/
	Reviewed-by: Przemek Kitszel <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
	Reviewed-by: Aleksandr Loktionov <[email protected]>
	Tested-by: Rinitha S <[email protected]> (A Contingent worker at Intel)
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 015bac5)
	Signed-off-by: Jonathan Maple <[email protected]>
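The commit text above does not show the variables involved; as a generic, hedged illustration of the bug class (not the actual driver code), a small userspace demo of how an unsigned subtraction can wrap into a huge offset and how a wider signed type makes the value checkable:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uint16_t head = 0, tail = 3;            /* pretend these came from hardware */

        uint16_t bad  = head - tail;            /* wraps to 65533 */
        int32_t  good = (int32_t)head - tail;   /* -3: can be rejected */

        printf("unsigned result: %u\n", bad);
        if (good < 0)
                printf("signed result %d rejected before any MMIO write\n", good);
        return 0;
}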
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Cong Wang <[email protected]>
commit a7a15f3

est_qlen_notify() deletes its class from its active list with
list_del() when qlen is 0, therefore, it is not idempotent and
not friendly to its callers, like fq_codel_dequeue().

Let's make it idempotent to ease qdisc_tree_reduce_backlog() callers'
life. Also change other list_del()'s to list_del_init() just to be
extra safe.

	Reported-by: Gerrard Tai <[email protected]>
	Signed-off-by: Cong Wang <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Acked-by: Jamal Hadi Salim <[email protected]>
	Signed-off-by: Paolo Abeni <[email protected]>
(cherry picked from commit a7a15f3)
	Signed-off-by: Jonathan Maple <[email protected]>
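A minimal sketch of the idempotent deactivation described above: list_del_init() leaves the node pointing at itself, so repeated deactivation (or a later list_empty() check) is harmless, unlike a second plain list_del(). Structure names are illustrative:

#include <linux/list.h>

struct sched_class_sketch {
        struct list_head alist;         /* membership on the qdisc's active list */
};

/* Safe to call any number of times for the same class. */
static void class_deactivate(struct sched_class_sketch *cl)
{
        if (!list_empty(&cl->alist))
                list_del_init(&cl->alist);      /* re-initialized, not poisoned */
}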
jira LE-4076
cve CVE-2025-37914
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Victor Nogueira <[email protected]>
commit 1a6d0c0

As described in Gerrard's report [1], there are use cases where a netem
child qdisc will make the parent qdisc's enqueue callback reentrant.
In the case of ets, there won't be a UAF, but the code will add the same
classifier to the list twice, which will cause memory corruption.

In addition to checking for qlen being zero, this patch checks whether
the class was already added to the active_list (cl_is_active) before
doing the addition to cater for the reentrant case.

[1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/

Fixes: 37d9cf1 ("sched: Fix detection of empty queues in child qdiscs")
	Acked-by: Jamal Hadi Salim <[email protected]>
	Signed-off-by: Victor Nogueira <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 1a6d0c0)
	Signed-off-by: Jonathan Maple <[email protected]>
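A minimal sketch of the guard described above: only add the class to the active list when it is not already on it, which makes a reentrant enqueue through a netem child harmless. The helper name follows the commit text; the structures are illustrative:

#include <linux/list.h>

struct ets_class_sketch {
        struct list_head alist;
};

static bool cl_is_active(struct ets_class_sketch *cl)
{
        return !list_empty(&cl->alist);
}

/* Enqueue path, after the child qdisc accepted the packet: */
static void activate_once(struct ets_class_sketch *cl, struct list_head *active)
{
        if (!cl_is_active(cl))
                list_add_tail(&cl->alist, active);      /* never added twice */
}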
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Victor Nogueira <[email protected]>
commit ffdde7b

Lion's patch [1] revealed an ancient bug in the qdisc API.
Whenever a user creates/modifies a qdisc specifying as a parent another
qdisc, the qdisc API will, during grafting, detect that the user is
not trying to attach to a class and reject. However grafting is
performed after qdisc_create (and thus the qdiscs' init callback) is
executed. In qdiscs that eventually call qdisc_tree_reduce_backlog
during init or change (such as fq, hhf, choke, etc), an issue
arises. For example, executing the following commands:

sudo tc qdisc add dev lo root handle a: htb default 2
sudo tc qdisc add dev lo parent a: handle beef fq

Qdiscs such as fq, hhf, choke, etc unconditionally invoke
qdisc_tree_reduce_backlog() in their control path init() or change() which
then causes a failure to find the child class; however, that does not stop
the unconditional invocation of the assumed child qdisc's qlen_notify with
a null class. All these qdiscs make the assumption that class is non-null.

The solution is ensure that qdisc_leaf() which looks up the parent
class, and is invoked prior to qdisc_create(), should return failure on
not finding the class.
In this patch, we leverage qdisc_leaf to return ERR_PTRs whenever the
parentid doesn't correspond to a class, so that we can detect it
earlier on and abort before qdisc_create is called.

[1] https://lore.kernel.org/netdev/[email protected]/

Fixes: 5e50da0 ("[NET_SCHED]: Fix endless loops (part 2): "simple" qdiscs")
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Acked-by: Jamal Hadi Salim <[email protected]>
	Reviewed-by: Cong Wang <[email protected]>
	Signed-off-by: Victor Nogueira <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit ffdde7b)
	Signed-off-by: Jonathan Maple <[email protected]>
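A minimal sketch of an ERR_PTR-returning lookup in the spirit of the change described above, so callers can distinguish "parent class does not exist" from "class has no leaf qdisc" and abort before qdisc_create(); the real qdisc_leaf() and its callers differ in detail:

#include <linux/err.h>
#include <net/sch_generic.h>

static struct Qdisc *leaf_lookup_sketch(struct Qdisc *p, u32 classid)
{
        unsigned long cl = p->ops->cl_ops->find(p, classid);

        if (!cl)
                return ERR_PTR(-ENOENT);        /* no such class: abort early */

        return p->ops->cl_ops->leaf(p, cl);     /* may legitimately be NULL */
}

/* Caller side: if (IS_ERR(q)) return PTR_ERR(q); */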
jira LE-4076
cve CVE-2025-38417
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Grzegorz Nitka <[email protected]>
commit 48c8b21

Add a simple eswitch mode check to the VF attach procedure and allocate
the required port representor memory structures only in switchdev mode.
The reset flow triggers the VF (if present) detach/attach procedure.
It might involve VF port representor re-creation if the device is
configured in switchdev mode (not legacy mode).
The memory was blindly allocated in the current implementation,
regardless of the mode, and was not freed in legacy mode.

Kmemleak trace:
unreferenced object (percpu) 0x7e3bce5b888458 (size 40):
  comm "bash", pid 1784, jiffies 4295743894
  hex dump (first 32 bytes on cpu 45):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 0):
    pcpu_alloc_noprof+0x4c4/0x7c0
    ice_repr_create+0x66/0x130 [ice]
    ice_repr_create_vf+0x22/0x70 [ice]
    ice_eswitch_attach_vf+0x1b/0xa0 [ice]
    ice_reset_all_vfs+0x1dd/0x2f0 [ice]
    ice_pci_err_resume+0x3b/0xb0 [ice]
    pci_reset_function+0x8f/0x120
    reset_store+0x56/0xa0
    kernfs_fop_write_iter+0x120/0x1b0
    vfs_write+0x31c/0x430
    ksys_write+0x61/0xd0
    do_syscall_64+0x5b/0x180
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

Testing hints (ethX is PF netdev):
- create at least one VF
    echo 1 > /sys/class/net/ethX/device/sriov_numvfs
- trigger the reset
    echo 1 > /sys/class/net/ethX/device/reset

Fixes: 415db83 ("ice: make representor code generic")
	Signed-off-by: Grzegorz Nitka <[email protected]>
	Reviewed-by: Przemek Kitszel <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
	Tested-by: Rinitha S <[email protected]> (A Contingent worker at Intel)
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 48c8b21)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v6.12~1..kernel-mainline: 65214
Number of commits in rpm: 28
Number of commits matched with upstream: 24 (85.71%)
Number of commits in upstream but not in rpm: 65190
Number of commits NOT found in upstream: 4 (14.29%)

Rebuilding Kernel on Branch rocky10_0_rebuild_kernel-6.12.0-55.29.1.el10_0 for kernel-6.12.0-55.29.1.el10_0
Clean Cherry Picks: 23 (95.83%)
Empty Cherry Picks: 1 (4.17%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-6.12.0-55.29.1.el10_0/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual empty-commit failures are contained in the same directory.
The git message for each empty commit has the path to the failed cherry-pick.
File names are the first 8 characters of the upstream SHA.
jira LE-4076
cve CVE-2025-38472
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Florian Westphal <[email protected]>
commit 2d72afb

A crash in conntrack was reported while trying to unlink the conntrack
entry from the hash bucket list:
    [exception RIP: __nf_ct_delete_from_lists+172]
    [..]
 #7 [ff539b5a2b043aa0] nf_ct_delete at ffffffffc124d421 [nf_conntrack]
 #8 [ff539b5a2b043ad0] nf_ct_gc_expired at ffffffffc124d999 [nf_conntrack]
 #9 [ff539b5a2b043ae0] __nf_conntrack_find_get at ffffffffc124efbc [nf_conntrack]
    [..]

The nf_conn struct is marked as allocated from slab but appears to be in
a partially initialised state:

 ct hlist pointer is garbage; looks like the ct hash value
 (hence crash).
 ct->status is equal to IPS_CONFIRMED|IPS_DYING, which is expected
 ct->timeout is 30000 (=30s), which is unexpected.

Everything else looks like normal udp conntrack entry.  If we ignore
ct->status and pretend its 0, the entry matches those that are newly
allocated but not yet inserted into the hash:
  - ct hlist pointers are overloaded and store/cache the raw tuple hash
  - ct->timeout matches the relative time expected for a new udp flow
    rather than the absolute 'jiffies' value.

If it were not for the presence of IPS_CONFIRMED,
__nf_conntrack_find_get() would have skipped the entry.

Theory is that we did hit following race:

cpu x 			cpu y			cpu z
 found entry E		found entry E
 E is expired		<preemption>
 nf_ct_delete()
 return E to rcu slab
					init_conntrack
					E is re-inited,
					ct->status set to 0
					reply tuplehash hnnode.pprev
					stores hash value.

cpu y found E right before it was deleted on cpu x.
E is now re-inited on cpu z.  cpu y was preempted before
checking for expiry and/or confirm bit.

					->refcnt set to 1
					E now owned by skb
					->timeout set to 30000

If cpu y were to resume now, it would observe E as
expired but would skip E due to missing CONFIRMED bit.

					nf_conntrack_confirm gets called
					sets: ct->status |= CONFIRMED
					This is wrong: E is not yet added
					to hashtable.

cpu y resumes, it observes E as expired but CONFIRMED:
			<resumes>
			nf_ct_expired()
			 -> yes (ct->timeout is 30s)
			confirmed bit set.

cpu y will try to delete E from the hashtable:
			nf_ct_delete() -> set DYING bit
			__nf_ct_delete_from_lists

Even this scenario doesn't guarantee a crash:
cpu z still holds the table bucket lock(s) so y blocks:

			wait for spinlock held by z

					CONFIRMED is set but there is no
					guarantee ct will be added to hash:
					"chaintoolong" or "clash resolution"
					logic both skip the insert step.
					reply hnnode.pprev still stores the
					hash value.

					unlocks spinlock
					return NF_DROP
			<unblocks, then
			 crashes on hlist_nulls_del_rcu pprev>

In case CPU z does insert the entry into the hashtable, cpu y will unlink
E again right away but no crash occurs.

Without 'cpu y' race, 'garbage' hlist is of no consequence:
ct refcnt remains at 1, eventually skb will be free'd and E gets
destroyed via: nf_conntrack_put -> nf_conntrack_destroy -> nf_ct_destroy.

To resolve this, move the IPS_CONFIRMED assignment after the table
insertion but before the unlock.

Pablo points out that the confirm-bit-store could be reordered to happen
before hlist add resp. the timeout fixup, so switch to set_bit and
before_atomic memory barrier to prevent this.

It doesn't matter if other CPUs can observe a newly inserted entry right
before the CONFIRMED bit was set:

Such event cannot be distinguished from above "E is the old incarnation"
case: the entry will be skipped.

Also change nf_ct_should_gc() to first check the confirmed bit.

The gc sequence is:
 1. Check if entry has expired, if not skip to next entry
 2. Obtain a reference to the expired entry.
 3. Call nf_ct_should_gc() to double-check step 1.

nf_ct_should_gc() is thus called only for entries that already failed an
expiry check. After this patch, once the confirmed bit check passes
ct->timeout has been altered to reflect the absolute 'best before' date
instead of a relative time.  Step 3 will therefore not remove the entry.

Without this change to nf_ct_should_gc() we could still get this sequence:

 1. Check if entry has expired.
 2. Obtain a reference.
 3. Call nf_ct_should_gc() to double-check step 1:
    4 - entry is still observed as expired
    5 - meanwhile, ct->timeout is corrected to absolute value on other CPU
      and confirm bit gets set
    6 - confirm bit is seen
    7 - valid entry is removed again

First do check 6), then 4) so the gc expiry check always picks up either
confirmed bit unset (entry gets skipped) or expiry re-check failure for
re-inited conntrack objects.

This change cannot be backported to releases before 5.19. Without
commit 8a75a2c ("netfilter: conntrack: remove unconfirmed list")
|= IPS_CONFIRMED line cannot be moved without further changes.

	Cc: Razvan Cojocaru <[email protected]>
Link: https://lore.kernel.org/netfilter-devel/[email protected]/
Link: https://lore.kernel.org/netfilter-devel/[email protected]/
Fixes: 1397af5 ("netfilter: conntrack: remove the percpu dying list")
	Signed-off-by: Florian Westphal <[email protected]>
	Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 2d72afb)
	Signed-off-by: Jonathan Maple <[email protected]>
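A minimal sketch of the ordering described above: publish IPS_CONFIRMED only after the hash insertion and timeout fixup, with a barrier so the bit cannot be observed first, and have the gc helper test the confirmed bit before the expiry check. Simplified, not the upstream diff:

#include <net/netfilter/nf_conntrack.h>

/* Confirmation side: insert first, then publish IPS_CONFIRMED. */
static void confirm_after_insert_sketch(struct nf_conn *ct)
{
        /* ... hlist_nulls insertion and absolute-timeout fixup happen here ... */

        smp_mb__before_atomic();                /* order the above before the bit */
        set_bit(IPS_CONFIRMED_BIT, &ct->status);
}

/* GC side: an unconfirmed entry is never garbage collected. */
static bool should_gc_sketch(const struct nf_conn *ct)
{
        return test_bit(IPS_CONFIRMED_BIT, &ct->status) && nf_ct_is_expired(ct);
}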
jira LE-4076
cve CVE-2025-38461
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Michal Luczaj <[email protected]>
commit 687aa0c

Transport assignment may race with module unload. Protect new_transport
from becoming a stale pointer.

This also takes care of an insecure call in vsock_use_local_transport();
add a lockdep assert.

BUG: unable to handle page fault for address: fffffbfff8056000
Oops: Oops: 0000 [#1] SMP KASAN
RIP: 0010:vsock_assign_transport+0x366/0x600
Call Trace:
 vsock_connect+0x59c/0xc40
 __sys_connect+0xe8/0x100
 __x64_sys_connect+0x6e/0xc0
 do_syscall_64+0x92/0x1c0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fixes: c0cfa2d ("vsock: add multi-transports support")
	Reviewed-by: Stefano Garzarella <[email protected]>
	Signed-off-by: Michal Luczaj <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 687aa0c)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-38464
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Kuniyuki Iwashima <[email protected]>
commit 667eeab

syzbot reported a null-ptr-deref in tipc_conn_close() during netns
dismantle. [0]

tipc_topsrv_stop() iterates tipc_net(net)->topsrv->conn_idr and calls
tipc_conn_close() for each tipc_conn.

The problem is that tipc_conn_close() is called after releasing the
IDR lock.

At the same time, there might be tipc_conn_recv_work() running and it
could call tipc_conn_close() for the same tipc_conn and release its
last ->kref.

Once we release the IDR lock in tipc_topsrv_stop(), there is no
guarantee that the tipc_conn is alive.

Let's hold the ref before releasing the lock and put the ref after
tipc_conn_close() in tipc_topsrv_stop().

[0]:
BUG: KASAN: use-after-free in tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165
Read of size 8 at addr ffff888099305a08 by task kworker/u4:3/435

CPU: 0 PID: 435 Comm: kworker/u4:3 Not tainted 4.19.204-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1fc/0x2ef lib/dump_stack.c:118
 print_address_description.cold+0x54/0x219 mm/kasan/report.c:256
 kasan_report_error.cold+0x8a/0x1b9 mm/kasan/report.c:354
 kasan_report mm/kasan/report.c:412 [inline]
 __asan_report_load8_noabort+0x88/0x90 mm/kasan/report.c:433
 tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165
 tipc_topsrv_stop net/tipc/topsrv.c:701 [inline]
 tipc_topsrv_exit_net+0x27b/0x5c0 net/tipc/topsrv.c:722
 ops_exit_list+0xa5/0x150 net/core/net_namespace.c:153
 cleanup_net+0x3b4/0x8b0 net/core/net_namespace.c:553
 process_one_work+0x864/0x1570 kernel/workqueue.c:2153
 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
 kthread+0x33f/0x460 kernel/kthread.c:259
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

Allocated by task 23:
 kmem_cache_alloc_trace+0x12f/0x380 mm/slab.c:3625
 kmalloc include/linux/slab.h:515 [inline]
 kzalloc include/linux/slab.h:709 [inline]
 tipc_conn_alloc+0x43/0x4f0 net/tipc/topsrv.c:192
 tipc_topsrv_accept+0x1b5/0x280 net/tipc/topsrv.c:470
 process_one_work+0x864/0x1570 kernel/workqueue.c:2153
 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
 kthread+0x33f/0x460 kernel/kthread.c:259
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

Freed by task 23:
 __cache_free mm/slab.c:3503 [inline]
 kfree+0xcc/0x210 mm/slab.c:3822
 tipc_conn_kref_release net/tipc/topsrv.c:150 [inline]
 kref_put include/linux/kref.h:70 [inline]
 conn_put+0x2cd/0x3a0 net/tipc/topsrv.c:155
 process_one_work+0x864/0x1570 kernel/workqueue.c:2153
 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
 kthread+0x33f/0x460 kernel/kthread.c:259
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

The buggy address belongs to the object at ffff888099305a00
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 8 bytes inside of
 512-byte region [ffff888099305a00, ffff888099305c00)
The buggy address belongs to the page:
page:ffffea000264c140 count:1 mapcount:0 mapping:ffff88813bff0940 index:0x0
flags: 0xfff00000000100(slab)
raw: 00fff00000000100 ffffea00028b6b88 ffffea0002cd2b08 ffff88813bff0940
raw: 0000000000000000 ffff888099305000 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888099305900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888099305980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888099305a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                      ^
 ffff888099305a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888099305b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: c5fa7b3 ("tipc: introduce new TIPC server infrastructure")
	Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=27169a847a70550d17be
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Reviewed-by: Tung Nguyen <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 667eeab)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-38220
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Brian Foster <[email protected]>
commit e26268f

fstest generic/388 occasionally reproduces a crash that looks as
follows:

BUG: kernel NULL pointer dereference, address: 0000000000000000
...
Call Trace:
 <TASK>
 ext4_block_zero_page_range+0x30c/0x380 [ext4]
 ext4_truncate+0x436/0x440 [ext4]
 ext4_process_orphan+0x5d/0x110 [ext4]
 ext4_orphan_cleanup+0x124/0x4f0 [ext4]
 ext4_fill_super+0x262d/0x3110 [ext4]
 get_tree_bdev_flags+0x132/0x1d0
 vfs_get_tree+0x26/0xd0
 vfs_cmd_create+0x59/0xe0
 __do_sys_fsconfig+0x4ed/0x6b0
 do_syscall_64+0x82/0x170
 ...

This occurs when processing a symlink inode from the orphan list. The
partial block zeroing code in the truncate path calls
ext4_dirty_journalled_data() -> folio_mark_dirty(). The latter calls
mapping->a_ops->dirty_folio(), but symlink inodes are not assigned an
a_ops vector in ext4, hence the crash.

To avoid this problem, update the ext4_dirty_journalled_data() helper to
only mark the folio dirty on regular files (for which a_ops is
assigned). This also matches the journaling logic in the ext4_symlink()
creation path, where ext4_handle_dirty_metadata() is called directly.
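
As a rough sketch of that guard (the function signature and the inode lookup are
assumptions; only the function names above come from the actual change):

  static int ext4_dirty_journalled_data(handle_t *handle, struct buffer_head *bh)
  {
          struct folio *folio = bh->b_folio;

          /* Symlink inodes have no a_ops in ext4, so only dirty folios of
           * regular files; the journalled-metadata path is taken either way.
           */
          if (S_ISREG(folio->mapping->host->i_mode))
                  folio_mark_dirty(folio);
          return ext4_handle_dirty_metadata(handle, NULL, bh);
  }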

Fixes: d84c9eb ("ext4: Mark pages with journalled data dirty")
	Signed-off-by: Brian Foster <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Theodore Ts'o <[email protected]>
	Reviewed-by: Jan Kara <[email protected]>
	Cc: [email protected]
(cherry picked from commit e26268f)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Kuniyuki Iwashima <[email protected]>
commit 5a465a0

__udp_enqueue_schedule_skb() has the following condition:

  if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
          goto drop;

sk->sk_rcvbuf is initialised by net.core.rmem_default and later can
be configured by SO_RCVBUF, which is limited by net.core.rmem_max,
or SO_RCVBUFFORCE.

If we set INT_MAX to sk->sk_rcvbuf, the condition is always false
as sk->sk_rmem_alloc is also signed int.

Then, the size of the incoming skb is added to sk->sk_rmem_alloc
unconditionally.

This results in integer overflow (possibly multiple times) on
sk->sk_rmem_alloc and allows a single socket to have skb up to
net.core.udp_mem[1].

For example, if we set a large value to udp_mem[1] and INT_MAX to
sk->sk_rcvbuf and flood packets to the socket, we can see multiple
overflows:

  # cat /proc/net/sockstat | grep UDP:
  UDP: inuse 3 mem 7956736  <-- (7956736 << 12) bytes > INT_MAX * 15
                                             ^- PAGE_SHIFT
  # ss -uam
  State  Recv-Q      ...
  UNCONN -1757018048 ...    <-- flipping the sign repeatedly
         skmem:(r2537949248,rb2147483646,t0,tb212992,f1984,w0,o0,bl0,d0)

Previously, we had a boundary check for INT_MAX, which was removed by
commit 6a1f12d ("udp: relax atomic operation on sk->sk_rmem_alloc").

A complete fix would be to revert it and cap the right operand by
INT_MAX:

  rmem = atomic_add_return(size, &sk->sk_rmem_alloc);
  if (rmem > min(size + (unsigned int)sk->sk_rcvbuf, INT_MAX))
          goto uncharge_drop;

but we do not want to add the expensive atomic_add_return() back just
for the corner case.

Casting rmem to unsigned int prevents multiple wraparounds, but we still
allow a single wraparound.

  # cat /proc/net/sockstat | grep UDP:
  UDP: inuse 3 mem 524288  <-- (INT_MAX + 1) >> 12

  # ss -uam
  State  Recv-Q      ...
  UNCONN -2147482816 ...   <-- INT_MAX + 831 bytes
         skmem:(r2147484480,rb2147483646,t0,tb212992,f3264,w0,o0,bl0,d14468947)

So, let's define rmem and rcvbuf as unsigned int and check skb->truesize
only when rcvbuf is large enough to lower the overflow possibility.

Note that we still have a small chance to see overflow if multiple skbs
to the same socket are processed on different cores at the same time and
each size does not exceed the limit but the total size does.

Note also that we must ignore skb->truesize for a small buffer as
explained in commit 363dc73 ("udp: be less conservative with
sock rmem accounting").
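
Illustratively, the shape of that check looks like the fragment below (a sketch of
the idea, not the verbatim diff; sk, skb and the drop label come from the
surrounding function, and the large-buffer threshold is an assumption):

  unsigned int rmem, rcvbuf;

  rmem   = atomic_read(&sk->sk_rmem_alloc);
  rcvbuf = READ_ONCE(sk->sk_rcvbuf);

  /* Unsigned compare: an INT_MAX rcvbuf can no longer make this test
   * permanently false after sk_rmem_alloc wraps around.
   */
  if (rmem > rcvbuf)
          goto drop;

  /* Fold skb->truesize into the check only for large buffers; small
   * buffers keep the relaxed accounting of commit 363dc73 above.
   */
  if (rcvbuf > INT_MAX / 2 && rmem + skb->truesize > rcvbuf)
          goto drop;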

Fixes: 6a1f12d ("udp: relax atomic operation on sk->sk_rmem_alloc")
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 5a465a0)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-22058
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Kuniyuki Iwashima <[email protected]>
commit df207de

Matt Dowling reported a weird UDP memory usage issue.

Under normal operation, the UDP memory usage reported in /proc/net/sockstat
remains close to zero.  However, it occasionally spiked to 524,288 pages
and never dropped.  Moreover, the value doubled when the application was
terminated.  Finally, it caused intermittent packet drops.

We can reproduce the issue with the script below [0]:

  1. /proc/net/sockstat reports 0 pages

    # cat /proc/net/sockstat | grep UDP:
    UDP: inuse 1 mem 0

  2. Run the script till the report reaches 524,288

    # python3 test.py & sleep 5
    # cat /proc/net/sockstat | grep UDP:
    UDP: inuse 3 mem 524288  <-- (INT_MAX + 1) >> PAGE_SHIFT

  3. Kill the socket and confirm the number never drops

    # pkill python3 && sleep 5
    # cat /proc/net/sockstat | grep UDP:
    UDP: inuse 1 mem 524288

  4. (necessary since v6.0) Trigger proto_memory_pcpu_drain()

    # python3 test.py & sleep 1 && pkill python3

  5. The number doubles

    # cat /proc/net/sockstat | grep UDP:
    UDP: inuse 1 mem 1048577

The application set INT_MAX to SO_RCVBUF, which triggered an integer
overflow in udp_rmem_release().

When a socket is close()d, udp_destruct_common() purges its receive
queue and sums up skb->truesize in the queue.  This total is calculated
and stored in a local unsigned integer variable.

The total size is then passed to udp_rmem_release() to adjust memory
accounting.  However, because the function takes a signed integer
argument, the total size can wrap around, causing an overflow.

Then, the released amount is calculated as follows:

  1) Add size to sk->sk_forward_alloc.
  2) Round down sk->sk_forward_alloc to the nearest lower multiple of
      PAGE_SIZE and assign it to amount.
  3) Subtract amount from sk->sk_forward_alloc.
  4) Pass amount >> PAGE_SHIFT to __sk_mem_reduce_allocated().

When the issue occurred, the total in udp_destruct_common() was 2147484480
(INT_MAX + 833), which was cast to -2147482816 in udp_rmem_release().

At 1) sk->sk_forward_alloc is changed from 3264 to -2147479552, and
2) sets -2147479552 to amount.  3) reverts the wraparound, so we don't
see a warning in inet_sock_destruct().  However, udp_memory_allocated
ends up doubling at 4).
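
The cast above can be checked with a few lines of standalone C (the constants are
taken straight from the report):

  #include <limits.h>
  #include <stdio.h>

  int main(void)
  {
          unsigned int total = 2147484480u;  /* INT_MAX + 833, as reported */
          int as_signed = (int)total;        /* implementation-defined; wraps on Linux/glibc */

          printf("unsigned total: %u\n", total);     /* 2147484480 */
          printf("as signed int : %d\n", as_signed); /* -2147482816 */
          printf("INT_MAX       : %d\n", INT_MAX);   /* 2147483647 */
          return 0;
  }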

Since commit 3cd3399 ("net: implement per-cpu reserves for
memory_allocated"), memory usage no longer doubles immediately after
a socket is close()d because __sk_mem_reduce_allocated() caches the
amount in udp_memory_per_cpu_fw_alloc.  However, the next time a UDP
socket receives a packet, the subtraction takes effect, causing UDP
memory usage to double.

This issue makes further memory allocation fail once the socket's
sk->sk_rmem_alloc exceeds net.ipv4.udp_rmem_min, resulting in packet
drops.

To prevent this issue, let's use unsigned int for the calculation and
call sk_forward_alloc_add() only once for the small delta.

Note that first_packet_length() also potentially has the same problem.

[0]:
from socket import *

SO_RCVBUFFORCE = 33
INT_MAX = (2 ** 31) - 1

s = socket(AF_INET, SOCK_DGRAM)
s.bind(('', 0))
s.setsockopt(SOL_SOCKET, SO_RCVBUFFORCE, INT_MAX)

c = socket(AF_INET, SOCK_DGRAM)
c.connect(s.getsockname())

data = b'a' * 100

while True:
    c.send(data)

Fixes: f970bd9 ("udp: implement memory accounting helpers")
	Reported-by: Matt Dowling <[email protected]>
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit df207de)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-38211
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Shin'ichiro Kawasaki <[email protected]>
commit 6883b68

The commit 59c68ac ("iw_cm: free cm_id resources on the last
deref") simplified cm_id resource management by freeing cm_id once all
references to the cm_id were removed. The references are removed either
upon completion of iw_cm event handlers or when the application destroys
the cm_id. This commit introduced the use-after-free condition where
cm_id_private object could still be in use by event handler works during
the destruction of cm_id. The commit aee2424 ("RDMA/iwcm: Fix a
use-after-free related to destroying CM IDs") addressed this use-after-
free by flushing all pending works at the cm_id destruction.

However, still another use-after-free possibility remained. It happens
with the work objects allocated for each cm_id_priv within
alloc_work_entries() during cm_id creation, and subsequently freed in
dealloc_work_entries() once all references to the cm_id are removed.
If the cm_id's last reference is decremented in the event handler work,
the work object for that work itself gets freed, which causes the use-
after-free BUG below:

  BUG: KASAN: slab-use-after-free in __pwq_activate_work+0x1ff/0x250
  Read of size 8 at addr ffff88811f9cf800 by task kworker/u16:1/147091

  CPU: 2 UID: 0 PID: 147091 Comm: kworker/u16:1 Not tainted 6.15.0-rc2+ #27 PREEMPT(voluntary)
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
  Workqueue:  0x0 (iw_cm_wq)
  Call Trace:
   <TASK>
   dump_stack_lvl+0x6a/0x90
   print_report+0x174/0x554
   ? __virt_addr_valid+0x208/0x430
   ? __pwq_activate_work+0x1ff/0x250
   kasan_report+0xae/0x170
   ? __pwq_activate_work+0x1ff/0x250
   __pwq_activate_work+0x1ff/0x250
   pwq_dec_nr_in_flight+0x8c5/0xfb0
   process_one_work+0xc11/0x1460
   ? __pfx_process_one_work+0x10/0x10
   ? assign_work+0x16c/0x240
   worker_thread+0x5ef/0xfd0
   ? __pfx_worker_thread+0x10/0x10
   kthread+0x3b0/0x770
   ? __pfx_kthread+0x10/0x10
   ? rcu_is_watching+0x11/0xb0
   ? _raw_spin_unlock_irq+0x24/0x50
   ? rcu_is_watching+0x11/0xb0
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x30/0x70
   ? __pfx_kthread+0x10/0x10
   ret_from_fork_asm+0x1a/0x30
   </TASK>

  Allocated by task 147416:
   kasan_save_stack+0x2c/0x50
   kasan_save_track+0x10/0x30
   __kasan_kmalloc+0xa6/0xb0
   alloc_work_entries+0xa9/0x260 [iw_cm]
   iw_cm_connect+0x23/0x4a0 [iw_cm]
   rdma_connect_locked+0xbfd/0x1920 [rdma_cm]
   nvme_rdma_cm_handler+0x8e5/0x1b60 [nvme_rdma]
   cma_cm_event_handler+0xae/0x320 [rdma_cm]
   cma_work_handler+0x106/0x1b0 [rdma_cm]
   process_one_work+0x84f/0x1460
   worker_thread+0x5ef/0xfd0
   kthread+0x3b0/0x770
   ret_from_fork+0x30/0x70
   ret_from_fork_asm+0x1a/0x30

  Freed by task 147091:
   kasan_save_stack+0x2c/0x50
   kasan_save_track+0x10/0x30
   kasan_save_free_info+0x37/0x60
   __kasan_slab_free+0x4b/0x70
   kfree+0x13a/0x4b0
   dealloc_work_entries+0x125/0x1f0 [iw_cm]
   iwcm_deref_id+0x6f/0xa0 [iw_cm]
   cm_work_handler+0x136/0x1ba0 [iw_cm]
   process_one_work+0x84f/0x1460
   worker_thread+0x5ef/0xfd0
   kthread+0x3b0/0x770
   ret_from_fork+0x30/0x70
   ret_from_fork_asm+0x1a/0x30

  Last potentially related work creation:
   kasan_save_stack+0x2c/0x50
   kasan_record_aux_stack+0xa3/0xb0
   __queue_work+0x2ff/0x1390
   queue_work_on+0x67/0xc0
   cm_event_handler+0x46a/0x820 [iw_cm]
   siw_cm_upcall+0x330/0x650 [siw]
   siw_cm_work_handler+0x6b9/0x2b20 [siw]
   process_one_work+0x84f/0x1460
   worker_thread+0x5ef/0xfd0
   kthread+0x3b0/0x770
   ret_from_fork+0x30/0x70
   ret_from_fork_asm+0x1a/0x30

This BUG is reproducible by repeating the blktests test case nvme/061
for the rdma transport and the siw driver.

To avoid the use-after-free of cm_id_private work objects, ensure that
the last reference to the cm_id is decremented not in the event handler
works, but in the cm_id destruction context. For that purpose, move
iwcm_deref_id() call from destroy_cm_id() to the callers of
destroy_cm_id(). In iw_destroy_cm_id(), call iwcm_deref_id() after
flushing the pending works.
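
A sketch of that ordering (container_of() into struct iwcm_id_private and the
iwcm_wq workqueue are assumed from the traces above; this is not the verbatim diff):

  void iw_destroy_cm_id(struct iw_cm_id *cm_id)
  {
          struct iwcm_id_private *cm_id_priv =
                  container_of(cm_id, struct iwcm_id_private, id);

          destroy_cm_id(cm_id);        /* no longer drops the last reference itself */
          flush_workqueue(iwcm_wq);    /* wait for pending cm_work_handler() works */
          iwcm_deref_id(cm_id_priv);   /* final deref happens here, after the flush */
  }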

During the fix work, I noticed that iw_destroy_cm_id() is called from
cm_work_handler() and process_event() context. However, the comment of
iw_destroy_cm_id() notes that the function "cannot be called by the
event thread". Drop the false comment.

Closes: https://lore.kernel.org/linux-rdma/r5676e754sv35aq7cdsqrlnvyhiq5zktteaurl7vmfih35efko@z6lay7uypy3c/
Fixes: 59c68ac ("iw_cm: free cm_id resources on the last deref")
	Cc: [email protected]
	Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Reviewed-by: Zhu Yanjun <[email protected]>
	Signed-off-by: Leon Romanovsky <[email protected]>
(cherry picked from commit 6883b68)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit dc287e4

Since commit 25f39d3 ("s390/pci: Ignore RID for isolated VFs") PFs
which are not initially configured but in standby are considered
isolated. That is, they create only a single-function PCI domain. Due to
the PCI domains being created on discovery, this means that even if they
are configured later on, sibling PFs and their child VFs will not be
added to their PCI domain, breaking SR-IOV expectations.

The reason the referenced commit ignored standby PFs for the creation of
multi-function PCI subhierarchies was to work around a PCI domain
renumbering scenario on reboot. The renumbering would occur after
removing a previously in standby PF, whose domain number is used for its
configured sibling PFs and their child VFs, but which itself remained in
standby. When this is followed by a reboot, the sibling PF is used
instead to determine the PCI domain number of it and its child VFs.

In principle it is not possible to know which standby PFs will be
configured later and which may be removed. The PCI domain and root bus
are prerequisites for hotplug slots, so the decision of which functions
belong to which domain cannot be postponed. With the renumbering
occurring only in rare circumstances and being generally benign, accept
it as an oddity and fix SR-IOV for initially standby PFs simply by
allowing them to create PCI domains.

	Cc: [email protected]
	Reviewed-by: Gerd Bayer <[email protected]>
Fixes: 25f39d3 ("s390/pci: Ignore RID for isolated VFs")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Alexander Gordeev <[email protected]>
(cherry picked from commit dc287e4)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 0579388

This creates a new zpci_iov_find_parent_pf() function which a future
commit can use to find if a VF has a configured parent PF. Use
zdev->rid instead of zdev->devfn such that the new function can be used
before it has been decided if the RID will be exposed and zdev->devfn is
set. Also handle the hypothetical case where the RID is not available but
there is an otherwise matching zbus.

Fixes: 25f39d3 ("s390/pci: Ignore RID for isolated VFs")
	Cc: [email protected]
	Reviewed-by: Halil Pasic <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Vasily Gorbik <[email protected]>
(cherry picked from commit 0579388)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 2844ddb

In contrast to the commit message of the fixed commit, VFs whose parent
PF is not configured are not always isolated, that is, put on their own
PCI domain. This is because for VFs to be added to an existing PCI
domain it is enough for that PCI domain to share the same topology ID or
PCHID. Such a matching PCI domain without a parent PF may exist when
a PF from the same PCI card created the domain with the VF being a child
of a different, non-accessible PF. While not causing technical issues,
it makes the rules for which VFs are isolated inconsistent.

Fix this by explicitly checking that the parent PF exists on the PCI
domain determined by the topology ID or PCHID before registering the VF.
This works because a parent PF which is under control of this Linux
instance must be enabled and configured at the point where its child VFs
appear because otherwise SR-IOV could not have been enabled on the
parent.

Fixes: 25f39d3 ("s390/pci: Ignore RID for isolated VFs")
	Cc: [email protected]
	Reviewed-by: Halil Pasic <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Vasily Gorbik <[email protected]>
(cherry picked from commit 2844ddb)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 8691abd

For non-VFs, zpci_bus_is_isolated_vf() should return false because they
aren't VFs. While zpci_iov_find_parent_pf() specifically checks if
a function is a VF, it then simply returns that there is no parent. The
simplistic check for a parent then leads to these functions being
confused with isolated VFs and isolating them on their own domain even
if sibling PFs should share the domain.

Fix this by explicitly checking if a function is not a VF. Note also
that at this point the case where RIDs are ignored is already handled
and in this case all PCI functions get isolated by being detected in
zpci_bus_is_multifunction_root().

	Cc: [email protected]
Fixes: 2844ddb ("s390/pci: Fix handling of isolated VFs")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Reviewed-by: Halil Pasic <[email protected]>
	Signed-off-by: Vasily Gorbik <[email protected]>
(cherry picked from commit 8691abd)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-37823
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Cong Wang <[email protected]>
commit 6ccbda4

Similarly to the previous patch, we need to safeguard hfsc_dequeue()
too. But for this one, we don't have a reliable reproducer.

Fixes: 1da177e ("Linux-2.6.12-rc2")
	Reported-by: Gerrard Tai <[email protected]>
	Signed-off-by: Cong Wang <[email protected]>
	Reviewed-by: Jamal Hadi Salim <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 6ccbda4)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v6.12~1..kernel-mainline: 65214
Number of commits in rpm: 15
Number of commits matched with upstream: 12 (80.00%)
Number of commits in upstream but not in rpm: 65202
Number of commits NOT found in upstream: 3 (20.00%)

Rebuilding Kernel on Branch rocky10_0_rebuild_kernel-6.12.0-55.30.1.el10_0 for kernel-6.12.0-55.30.1.el10_0
Clean Cherry Picks: 12 (100.00%)
Empty Cherry Picks: 0 (0.00%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-6.12.0-55.30.1.el10_0/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures are contained in the same directory.
The git message for empty commits will include the path to the failed commit.
File names are the first 8 characters of the upstream SHA.
Collaborator

@bmastbergen bmastbergen left a comment


🥌


@jdieter jdieter left a comment


LGTM

@PlaidCat PlaidCat changed the title [rocky10_0] History Rebuild to kernel-6.12.0-55.27.1.el10_0 [rocky10_0] History Rebuild to kernel-6.12.0-55.30.1.el10_0 Sep 8, 2025
@PlaidCat PlaidCat merged commit 0b50c95 into rocky10_0 Sep 8, 2025
4 checks passed
@PlaidCat PlaidCat deleted the rocky10_0_rebuild branch September 8, 2025 15:41