@PlaidCat commented Sep 5, 2025

  • Download all unprocessed src.rpm
  • For each src.rpm:
    • Find all commits in the changelog up to the last known tag ... in this case 6.12.0-55
    • Re-play the commits in reverse order (oldest in the changelog to newest) with git cherry-pick
    • After replay replace ENTIRE code in branch with rpmbuild -bp from corresponding src.rpm.
    • Tag Rebuild branch
  • Use New Local Build with prodman and test (note test results will be different than usual)

Checking Rebuild Commits for potentially missing commits:

kernel-6.12.0-55.29.1.el10_0

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-6.12.0-55.29.1.el10_0/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v6.12~1..kernel-mainline: 65214
Number of commits in rpm: 28
Number of commits matched with upstream: 24 (85.71%)
Number of commits in upstream but not in rpm: 65190
Number of commits NOT found in upstream: 4 (14.29%)

Rebuilding Kernel on Branch rocky10_0_rebuild_kernel-6.12.0-55.29.1.el10_0 for kernel-6.12.0-55.29.1.el10_0
Clean Cherry Picks: 23 (95.83%)
Empty Cherry Picks: 1 (4.17%)
_______________________________

__EMPTY COMMITS__________________________
c7f2188d68c114095660a950b7e880a1e5a71c8f selftests/bpf: Adjust data size to have ETH_HLEN

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 10, debranding and Rocky Linux branding'
Add partial riscv64 support for build root'
Provide basic VisionFive 2 support'
redhat: Mark kernel incompatible with xdp-tools<1.5.4

kernel-6.12.0-55.30.1.el10_0

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-6.12.0-55.30.1.el10_0/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v6.12~1..kernel-mainline: 65214
Number of commits in rpm: 15
Number of commits matched with upstream: 12 (80.00%)
Number of commits in upstream but not in rpm: 65202
Number of commits NOT found in upstream: 3 (20.00%)

Rebuilding Kernel on Branch rocky10_0_rebuild_kernel-6.12.0-55.30.1.el10_0 for kernel-6.12.0-55.30.1.el10_0
Clean Cherry Picks: 12 (100.00%)
Empty Cherry Picks: 0 (0.00%)
_______________________________

__EMPTY COMMITS__________________________

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 10, debranding and Rocky Linux branding'
Add partial riscv64 support for build root'
Provide basic VisionFive 2 support'

BUILD

[jmaple@devbox code]$ egrep -B 5 -A 5 "\[TIMER\]|^Starting Build" $(ls -t kbuild* | head -n1)
/mnt/code/kernel-src-tree
Running make mrproper...
  CLEAN   scripts/basic
  CLEAN   scripts/kconfig
  CLEAN   include/config include/generated
[TIMER]{MRPROPER}: 7s
x86_64 architecture detected, copying config
'configs/kernel-x86_64-rhel.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky10_0_rebuild-0b50c952cd24"
Making olddefconfig
--
  HOSTCC  scripts/kconfig/util.o
  HOSTLD  scripts/kconfig/conf
#
# configuration written to .config
#
Starting Build
  GEN     arch/x86/include/generated/asm/orc_hash.h
  WRAP    arch/x86/include/generated/uapi/asm/bpf_perf_event.h
  WRAP    arch/x86/include/generated/uapi/asm/errno.h
  WRAP    arch/x86/include/generated/uapi/asm/fcntl.h
  WRAP    arch/x86/include/generated/uapi/asm/ioctl.h
--
  LD [M]  net/qrtr/qrtr.ko
  LD [M]  net/qrtr/qrtr-mhi.ko
  BTF [M] net/hsr/hsr.ko
  BTF [M] net/qrtr/qrtr.ko
  BTF [M] net/qrtr/qrtr-mhi.ko
[TIMER]{BUILD}: 2090s
Making Modules
  SYMLINK /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/build
  INSTALL /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/modules.order
  INSTALL /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/modules.builtin
  INSTALL /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/modules.builtin.modinfo
--
  SIGN    /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/kernel/net/hsr/hsr.ko
  SIGN    /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/kernel/net/qrtr/qrtr.ko
  SIGN    /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/kernel/net/qrtr/qrtr-mhi.ko
  SIGN    /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+/kernel/net/ceph/libceph.ko
  DEPMOD  /lib/modules/6.12.0-rocky10_0_rebuild-0b50c952cd24+
[TIMER]{MODULES}: 8s
Making Install
  INSTALL /boot
[TIMER]{INSTALL}: 17s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-6.12.0-rocky10_0_rebuild-0b50c952cd24+ and Index to 2
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 7s
[TIMER]{BUILD}: 2090s
[TIMER]{MODULES}: 8s
[TIMER]{INSTALL}: 17s
[TIMER]{TOTAL} 2127s
Rebooting in 10 seconds

KselfTest

[jmaple@devbox code]$ ls -rt kselftest.* | tail -n4 | while read line; do echo $line; grep '^ok ' $line | wc -l ; done
kselftest.6.12.0-jmaple_sig-cloud-10_6.12.0-55.21.1.el10_0-8cd774f1d1e4+.log
498
kselftest.6.12.0-rocky10_0_rebuild-09e336dca27b+.log
498
kselftest.6.12.0-jmaple_sig-cloud-10_6.12.0-55.27.1.el10_0-7f919518cf4+.log
498
kselftest.6.12.0-rocky10_0_rebuild-0b50c952cd24+.log
497

jira LE-4076
cve CVE-2025-38380
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Michael J. Ruhl <[email protected]>
commit 3d30048

The i2c_dw_xfer_init() function requires msgs and msg_write_idx from the
dev context to be initialized.

amd_i2c_dw_xfer_quirk() inits msgs and msgs_num, but not msg_write_idx.

This could allow an out of bounds access (of msgs).

Initialize msg_write_idx before calling i2c_dw_xfer_init().

	Reviewed-by: Andy Shevchenko <[email protected]>
Fixes: 17631e8 ("i2c: designware: Add driver support for AMD NAVI GPU")
	Cc: <[email protected]> # v5.13+
	Signed-off-by: Michael J. Ruhl <[email protected]>
	Signed-off-by: Andi Shyti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 3d30048)
	Signed-off-by: Jonathan Maple <[email protected]>
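As a rough illustration of the fix described above (a hedged sketch, not the upstream diff; the field names msgs, msgs_num and msg_write_idx come from the commit text, everything else is assumed):

#include <linux/i2c.h>
#include "i2c-designware-core.h"        /* struct dw_i2c_dev (driver-internal) */

/* Illustrative sketch of the quirk path: the point is the msg_write_idx
 * initialization before i2c_dw_xfer_init() walks dev->msgs.
 */
static int amd_i2c_dw_xfer_quirk_sketch(struct i2c_adapter *adap,
                                        struct i2c_msg *msgs, int num)
{
        struct dw_i2c_dev *dev = i2c_get_adapdata(adap);

        dev->msgs = msgs;
        dev->msgs_num = num;
        dev->msg_write_idx = 0;         /* the missing initialization */

        i2c_dw_xfer_init(dev);
        /* ... remainder of the quirk transfer loop ... */
        return 0;
}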
jira LE-4076
cve CVE-2025-21867
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Shigeru Yoshida <[email protected]>
commit 6b3d638

KMSAN reported a use-after-free issue in eth_skb_pkt_type()[1]. The
cause of the issue was that eth_skb_pkt_type() accessed skb's data
that didn't contain an Ethernet header. This occurs when
bpf_prog_test_run_xdp() passes an invalid value as the user_data
argument to bpf_test_init().

Fix this by returning an error when user_data is less than ETH_HLEN in
bpf_test_init(). Additionally, remove the check for "if (user_size >
size)" as it is unnecessary.

[1]
BUG: KMSAN: use-after-free in eth_skb_pkt_type include/linux/etherdevice.h:627 [inline]
BUG: KMSAN: use-after-free in eth_type_trans+0x4ee/0x980 net/ethernet/eth.c:165
 eth_skb_pkt_type include/linux/etherdevice.h:627 [inline]
 eth_type_trans+0x4ee/0x980 net/ethernet/eth.c:165
 __xdp_build_skb_from_frame+0x5a8/0xa50 net/core/xdp.c:635
 xdp_recv_frames net/bpf/test_run.c:272 [inline]
 xdp_test_run_batch net/bpf/test_run.c:361 [inline]
 bpf_test_run_xdp_live+0x2954/0x3330 net/bpf/test_run.c:390
 bpf_prog_test_run_xdp+0x148e/0x1b10 net/bpf/test_run.c:1318
 bpf_prog_test_run+0x5b7/0xa30 kernel/bpf/syscall.c:4371
 __sys_bpf+0x6a6/0xe20 kernel/bpf/syscall.c:5777
 __do_sys_bpf kernel/bpf/syscall.c:5866 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5864 [inline]
 __x64_sys_bpf+0xa4/0xf0 kernel/bpf/syscall.c:5864
 x64_sys_call+0x2ea0/0x3d90 arch/x86/include/generated/asm/syscalls_64.h:322
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xd9/0x1d0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Uninit was created at:
 free_pages_prepare mm/page_alloc.c:1056 [inline]
 free_unref_page+0x156/0x1320 mm/page_alloc.c:2657
 __free_pages+0xa3/0x1b0 mm/page_alloc.c:4838
 bpf_ringbuf_free kernel/bpf/ringbuf.c:226 [inline]
 ringbuf_map_free+0xff/0x1e0 kernel/bpf/ringbuf.c:235
 bpf_map_free kernel/bpf/syscall.c:838 [inline]
 bpf_map_free_deferred+0x17c/0x310 kernel/bpf/syscall.c:862
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xa2b/0x1b60 kernel/workqueue.c:3310
 worker_thread+0xedf/0x1550 kernel/workqueue.c:3391
 kthread+0x535/0x6b0 kernel/kthread.c:389
 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

CPU: 1 UID: 0 PID: 17276 Comm: syz.1.16450 Not tainted 6.12.0-05490-g9bb88c659673 #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014

Fixes: be3d72a ("bpf: move user_size out of bpf_test_init")
	Reported-by: syzkaller <[email protected]>
	Suggested-by: Martin KaFai Lau <[email protected]>
	Signed-off-by: Shigeru Yoshida <[email protected]>
	Signed-off-by: Martin KaFai Lau <[email protected]>
	Acked-by: Stanislav Fomichev <[email protected]>
	Acked-by: Daniel Borkmann <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Alexei Starovoitov <[email protected]>
(cherry picked from commit 6b3d638)
	Signed-off-by: Jonathan Maple <[email protected]>
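A minimal sketch of the size check described above (function shape and parameter names are illustrative, not the exact upstream diff):

#include <linux/if_ether.h>     /* ETH_HLEN */
#include <linux/errno.h>
#include <linux/types.h>

/* Illustrative: reject test inputs that are too small to hold an Ethernet
 * header before anything derives the packet type from skb->data.
 */
static int test_init_check_size(u32 user_size)
{
        if (user_size < ETH_HLEN)
                return -EINVAL;
        return 0;
}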
jira LE-4076
cve CVE-2025-21867
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Shigeru Yoshida <[email protected]>
commit c7f2188
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-6.12.0-55.29.1.el10_0/c7f2188d.failed

The function bpf_test_init() now returns an error if user_size
(.data_size_in) is less than ETH_HLEN, causing the tests to
fail. Adjust the data size to ensure it meets the requirement of
ETH_HLEN.

	Signed-off-by: Shigeru Yoshida <[email protected]>
	Signed-off-by: Martin KaFai Lau <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Alexei Starovoitov <[email protected]>
(cherry picked from commit c7f2188)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c
jira LE-4076
cve CVE-2025-38250
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Kuniyuki Iwashima <[email protected]>
commit 1d61231

syzbot reported use-after-free in vhci_flush() without repro. [0]

From the splat, a thread close()d a vhci file descriptor while
its device was being used by ioctl() on another thread.

Once the last fd refcnt is released, vhci_release() calls
hci_unregister_dev(), hci_free_dev(), and kfree() for struct
vhci_data, which is set to hci_dev->dev->driver_data.

The problem is that there is no synchronisation after unlinking
hdev from hci_dev_list in hci_unregister_dev().  There might be
another thread still accessing the hdev which was fetched before
the unlink operation.

We can use SRCU for such synchronisation.

Let's run hci_dev_reset() under SRCU and wait for its completion
in hci_unregister_dev().

Another option would be to restore hci_dev->destruct(), which was
removed in commit 587ae08 ("Bluetooth: Remove unused
hci-destruct cb").  However, this would not be a good solution, as
we should not run hci_unregister_dev() while there are in-flight
ioctl() requests, which could lead to another data-race KCSAN splat.

Note that other drivers seem to have the same problem, for example,
virtbt_remove().

[0]:
BUG: KASAN: slab-use-after-free in skb_queue_empty_lockless include/linux/skbuff.h:1891 [inline]
BUG: KASAN: slab-use-after-free in skb_queue_purge_reason+0x99/0x360 net/core/skbuff.c:3937
Read of size 8 at addr ffff88807cb8d858 by task syz.1.219/6718

CPU: 1 UID: 0 PID: 6718 Comm: syz.1.219 Not tainted 6.16.0-rc1-syzkaller-00196-g08207f42d3ff #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
Call Trace:
 <TASK>
 dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:408 [inline]
 print_report+0xd2/0x2b0 mm/kasan/report.c:521
 kasan_report+0x118/0x150 mm/kasan/report.c:634
 skb_queue_empty_lockless include/linux/skbuff.h:1891 [inline]
 skb_queue_purge_reason+0x99/0x360 net/core/skbuff.c:3937
 skb_queue_purge include/linux/skbuff.h:3368 [inline]
 vhci_flush+0x44/0x50 drivers/bluetooth/hci_vhci.c:69
 hci_dev_do_reset net/bluetooth/hci_core.c:552 [inline]
 hci_dev_reset+0x420/0x5c0 net/bluetooth/hci_core.c:592
 sock_do_ioctl+0xd9/0x300 net/socket.c:1190
 sock_ioctl+0x576/0x790 net/socket.c:1311
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:907 [inline]
 __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fcf5b98e929
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fcf5c7b9038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fcf5bbb6160 RCX: 00007fcf5b98e929
RDX: 0000000000000000 RSI: 00000000400448cb RDI: 0000000000000009
RBP: 00007fcf5ba10b39 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007fcf5bbb6160 R15: 00007ffd6353d528
 </TASK>

Allocated by task 6535:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
 poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
 __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394
 kasan_kmalloc include/linux/kasan.h:260 [inline]
 __kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4359
 kmalloc_noprof include/linux/slab.h:905 [inline]
 kzalloc_noprof include/linux/slab.h:1039 [inline]
 vhci_open+0x57/0x360 drivers/bluetooth/hci_vhci.c:635
 misc_open+0x2bc/0x330 drivers/char/misc.c:161
 chrdev_open+0x4c9/0x5e0 fs/char_dev.c:414
 do_dentry_open+0xdf0/0x1970 fs/open.c:964
 vfs_open+0x3b/0x340 fs/open.c:1094
 do_open fs/namei.c:3887 [inline]
 path_openat+0x2ee5/0x3830 fs/namei.c:4046
 do_filp_open+0x1fa/0x410 fs/namei.c:4073
 do_sys_openat2+0x121/0x1c0 fs/open.c:1437
 do_sys_open fs/open.c:1452 [inline]
 __do_sys_openat fs/open.c:1468 [inline]
 __se_sys_openat fs/open.c:1463 [inline]
 __x64_sys_openat+0x138/0x170 fs/open.c:1463
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 6535:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576
 poison_slab_object mm/kasan/common.c:247 [inline]
 __kasan_slab_free+0x62/0x70 mm/kasan/common.c:264
 kasan_slab_free include/linux/kasan.h:233 [inline]
 slab_free_hook mm/slub.c:2381 [inline]
 slab_free mm/slub.c:4643 [inline]
 kfree+0x18e/0x440 mm/slub.c:4842
 vhci_release+0xbc/0xd0 drivers/bluetooth/hci_vhci.c:671
 __fput+0x44c/0xa70 fs/file_table.c:465
 task_work_run+0x1d1/0x260 kernel/task_work.c:227
 exit_task_work include/linux/task_work.h:40 [inline]
 do_exit+0x6ad/0x22e0 kernel/exit.c:955
 do_group_exit+0x21c/0x2d0 kernel/exit.c:1104
 __do_sys_exit_group kernel/exit.c:1115 [inline]
 __se_sys_exit_group kernel/exit.c:1113 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1113
 x64_sys_call+0x21ba/0x21c0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff88807cb8d800
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 88 bytes inside of
 freed 1024-byte region [ffff88807cb8d800, ffff88807cb8dc00)

Fixes: bf18c71 ("Bluetooth: vhci: Free driver_data on file release")
	Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=f62d64848fc4c7c30cd6
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Acked-by: Paul Menzel <[email protected]>
	Signed-off-by: Luiz Augusto von Dentz <[email protected]>
(cherry picked from commit 1d61231)
	Signed-off-by: Jonathan Maple <[email protected]>
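The commit message above describes running the reset path under SRCU and waiting for readers before teardown. A minimal sketch of that pattern, assuming a per-hdev srcu_struct member (the exact placement and naming in the upstream patch may differ):

#include <linux/srcu.h>

/* Assumed: struct hci_dev gains a struct srcu_struct named srcu,
 * initialized with init_srcu_struct() when the device is registered.
 */
static int hci_dev_reset_under_srcu(struct hci_dev *hdev)
{
        int idx, err;

        idx = srcu_read_lock(&hdev->srcu);      /* read-side section */
        err = hci_dev_do_reset(hdev);           /* may touch driver_data */
        srcu_read_unlock(&hdev->srcu, idx);
        return err;
}

static void hci_unregister_wait_readers(struct hci_dev *hdev)
{
        /* After unlinking hdev from hci_dev_list, wait for in-flight
         * readers before the driver frees its driver_data.
         */
        synchronize_srcu(&hdev->srcu);
}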
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Mete Durlu <[email protected]>
commit 9988df0

Add early polarization detection instead of assuming horizontal
polarization.

	Signed-off-by: Mete Durlu <[email protected]>
	Reviewed-by: Heiko Carstens <[email protected]>
	Signed-off-by: Alexander Gordeev <[email protected]>
(cherry picked from commit 9988df0)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit c4a585e

In commit 6ee600b ("s390/pci: remove hotplug slot when releasing the
device") the zpci_exit_slot() was moved from zpci_device_reserved() to
zpci_release_device() with the intention of keeping the hotplug slot
around until the device is actually removed.

Now zpci_release_device() is only called once all references are
dropped. Since the zPCI subsystem only drops its reference once the
device is in the reserved state it follows that zpci_release_device()
must only deal with devices in the reserved state. Despite that it
contains code to tear down from both configured and standby state. For
the standby case this already includes the removal of the hotplug slot
so would cause a double removal if a device was ever removed in
either configured or standby state.

Instead of causing a potential double removal in a case that should
never happen explicitly WARN_ON() if a device in non-reserved state is
released and get rid of the dead code cases.

Fixes: 6ee600b ("s390/pci: remove hotplug slot when releasing the device")
	Reviewed-by: Matthew Rosato <[email protected]>
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit c4a585e)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 42420c5

The zpci_create_device() function returns an error pointer that needs to
be checked before dereferencing it as a struct zpci_dev pointer. Add the
missing check in __clp_add() where it was missed when adding the
scan_list in the fixed commit. Simply not adding the device to the scan
list results in the previous behavior.

	Cc: [email protected]
Fixes: 0467cdd ("s390/pci: Sort PCI functions prior to creating virtual busses")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Reviewed-by: Gerd Bayer <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 42420c5)
	Signed-off-by: Jonathan Maple <[email protected]>
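A minimal sketch of the missing error-pointer check (names follow the commit text; the signature and list field are assumed):

#include <linux/err.h>
#include <linux/list.h>

static int clp_add_sketch(u32 fid, u32 fh, int state, struct list_head *scan_list)
{
        struct zpci_dev *zdev = zpci_create_device(fid, fh, state);

        if (IS_ERR(zdev))
                return 0;       /* skip: keep previous behavior, just don't list it */

        list_add_tail(&zdev->entry, scan_list);
        return 0;
}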
…hild VFs

jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 05a2538

With commit bcb5d6c ("s390/pci: introduce lock to synchronize state
of zpci_dev's") the code to ignore power off of a PF that has child VFs
was changed from a direct return to a goto to the unlock and
pci_dev_put() section. The change however left the existing pci_dev_put()
untouched resulting in a doubple put. This can subsequently cause a use
after free if the struct pci_dev is released in an unexpected state.
Fix this by removing the extra pci_dev_put().

	Cc: [email protected]
Fixes: bcb5d6c ("s390/pci: introduce lock to synchronize state of zpci_dev's")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Reviewed-by: Gerd Bayer <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 05a2538)
	Signed-off-by: Jonathan Maple <[email protected]>
…device()

jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit d76f963

Remove zpci_bus_remove_device() and zpci_disable_device() calls from
zpci_release_device(). These calls were done when the device
transitioned into the ZPCI_FN_STATE_STANDBY state which is guaranteed to
happen before it enters the ZPCI_FN_STATE_RESERVED state. When
zpci_release_device() is called the device is known to be in the
ZPCI_FN_STATE_RESERVED state which is also checked by a WARN_ON().

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Reviewed-by: Julian Ruess <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit d76f963)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 47c3978

disable_slot() takes a struct zpci_dev from the Configured to the
Standby state. In Standby there is still a hotplug slot, so this is not
usually a case of sysfs self deletion. This is important because self
deletion gets very hairy in terms of locking (see for example
recover_store() in arch/s390/pci/pci_sysfs.c).

Because the pci_dev_put() is not within the critical section of the
zdev->state_lock however, disable_slot() can turn into a case of self
deletion if zPCI device event handling slips between the mutex_unlock()
and the pci_dev_put(). If the latter is the last put and
zpci_release_device() is called this then tries to remove the hotplug
slot via zpci_exit_slot() which will try to remove the hotplug slot
directory the disable_slot() is part of i.e. self deletion.

Prevent this by widening the zdev->state_lock critical section to
include the pci_dev_put() which is then guaranteed to happen with the
struct zpci_dev still in Standby state ensuring it will not lead to
a zpci_release_device() call as at least the zPCI event handling code
still holds a reference.

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 47c3978)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 4b1815a

The architecture assumes that PCI functions can be removed synchronously
as PCI events are processed. This however clashes with the reference
counting of struct pci_dev which allows device drivers to hold on to a
struct pci_dev reference even as the underlying device is removed. To
bridge this gap commit 2a671f7 ("s390/pci: fix use after free of
zpci_dev") keeps the struct zpci_dev in ZPCI_FN_STATE_RESERVED state
until common code releases the struct pci_dev. Only when all references
are dropped, the struct zpci_dev can be removed and freed.

Later commit a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
moved the deletion of the struct zpci_dev from the zpci_list in
zpci_release_device() to the point where the device is reserved. This
was done to prevent handling events for a device that is already being
removed, e.g. when the platform generates both PCI event codes 0x304
and 0x308. In retrospect, deletion from the zpci_list in the release
function without holding the zpci_list_lock was also racy.

A side effect of this handling is that if the underlying device
re-appears while the struct zpci_dev is in the ZPCI_FN_STATE_RESERVED
state, the new and old instances of the struct zpci_dev and/or struct
pci_dev may clash. For example when trying to create the IOMMU sysfs
files for the new instance. In this case, re-adding the new instance is
aborted. The old instance is removed, and the device will remain absent
until the platform issues another event.

Fix this by allowing the struct zpci_dev to be brought back up right
until it is finally removed. To this end also keep the struct zpci_dev
in the zpci_list until it is finally released when all references have
been dropped.

Deletion from the zpci_list from within the release function is made
safe by using kref_put_lock() with the zpci_list_lock. This ensures that
the releasing code holds the last reference.

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 4b1815a)
	Signed-off-by: Jonathan Maple <[email protected]>
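A minimal sketch of the kref_put_lock() pattern the commit describes: the final put acquires zpci_list_lock, so the release callback can unlink the zdev from zpci_list while holding the last reference. Helper and field names are taken from the commit text where possible; the rest is illustrative:

#include <linux/kref.h>
#include <linux/list.h>
#include <linux/spinlock.h>

/* Called only for the final reference, with zpci_list_lock already held. */
static void zpci_release_device_sketch(struct kref *kref)
{
        struct zpci_dev *zdev = container_of(kref, struct zpci_dev, kref);

        list_del(&zdev->entry);                 /* safe: lock held, last ref */
        spin_unlock(&zpci_list_lock);           /* kref_put_lock() leaves it locked */

        /* ... final tear down and kfree(zdev) ... */
}

static void zpci_zdev_put_sketch(struct zpci_dev *zdev)
{
        kref_put_lock(&zdev->kref, zpci_release_device_sketch, &zpci_list_lock);
}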
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 774a1fa

Prior changes ensured that when zpci_release_device() is called and it
removed the zdev from the zpci_list this instance can not be found via
the zpci_list anymore even while allowing re-add of reserved devices.
This only accounts for the overall lifetime and zpci_list addition and
removal, it does not yet prevent concurrent add of a new instance for
the same underlying device. Such concurrent add would subsequently cause
issues such as attempted re-use of the same IOMMU sysfs directory and is
generally undesired.

Introduce a new zpci_add_remove_lock mutex to serialize adding a new
device with removal. Together this ensures that if a struct zpci_dev is
not found in the zpci_list it was either already removed and torn down,
or its removal and tear down is in progress with the
zpci_add_remove_lock held.

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 774a1fa)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Haren Myneni <[email protected]>
commit 05aa156

The mapping VMA address is saved in VAS window struct when the
paste address is mapped. This VMA address is used during migration
to unmap the paste address if the window is active. The paste
address mapping will be removed when the window is closed or with
the munmap(). But the VMA address in the VAS window is not updated
with munmap() which is causing invalid access during migration.

The KASAN report shows:
[16386.254991] BUG: KASAN: slab-use-after-free in reconfig_close_windows+0x1a0/0x4e8
[16386.255043] Read of size 8 at addr c00000014a819670 by task drmgr/696928

[16386.255096] CPU: 29 UID: 0 PID: 696928 Comm: drmgr Kdump: loaded Tainted: G    B              6.11.0-rc5-nxgzip #2
[16386.255128] Tainted: [B]=BAD_PAGE
[16386.255148] Hardware name: IBM,9080-HEX Power11 (architected) 0x820200 0xf000007 of:IBM,FW1110.00 (NH1110_016) hv:phyp pSeries
[16386.255181] Call Trace:
[16386.255202] [c00000016b297660] [c0000000018ad0ac] dump_stack_lvl+0x84/0xe8 (unreliable)
[16386.255246] [c00000016b297690] [c0000000006e8a90] print_report+0x19c/0x764
[16386.255285] [c00000016b297760] [c0000000006e9490] kasan_report+0x128/0x1f8
[16386.255309] [c00000016b297880] [c0000000006eb5c8] __asan_load8+0xac/0xe0
[16386.255326] [c00000016b2978a0] [c00000000013f898] reconfig_close_windows+0x1a0/0x4e8
[16386.255343] [c00000016b297990] [c000000000140e58] vas_migration_handler+0x3a4/0x3fc
[16386.255368] [c00000016b297a90] [c000000000128848] pseries_migrate_partition+0x4c/0x4c4
...

[16386.256136] Allocated by task 696554 on cpu 31 at 16377.277618s:
[16386.256149]  kasan_save_stack+0x34/0x68
[16386.256163]  kasan_save_track+0x34/0x80
[16386.256175]  kasan_save_alloc_info+0x58/0x74
[16386.256196]  __kasan_slab_alloc+0xb8/0xdc
[16386.256209]  kmem_cache_alloc_noprof+0x200/0x3d0
[16386.256225]  vm_area_alloc+0x44/0x150
[16386.256245]  mmap_region+0x214/0x10c4
[16386.256265]  do_mmap+0x5fc/0x750
[16386.256277]  vm_mmap_pgoff+0x14c/0x24c
[16386.256292]  ksys_mmap_pgoff+0x20c/0x348
[16386.256303]  sys_mmap+0xd0/0x160
...

[16386.256350] Freed by task 0 on cpu 31 at 16386.204848s:
[16386.256363]  kasan_save_stack+0x34/0x68
[16386.256374]  kasan_save_track+0x34/0x80
[16386.256384]  kasan_save_free_info+0x64/0x10c
[16386.256396]  __kasan_slab_free+0x120/0x204
[16386.256415]  kmem_cache_free+0x128/0x450
[16386.256428]  vm_area_free_rcu_cb+0xa8/0xd8
[16386.256441]  rcu_do_batch+0x2c8/0xcf0
[16386.256458]  rcu_core+0x378/0x3c4
[16386.256473]  handle_softirqs+0x20c/0x60c
[16386.256495]  do_softirq_own_stack+0x6c/0x88
[16386.256509]  do_softirq_own_stack+0x58/0x88
[16386.256521]  __irq_exit_rcu+0x1a4/0x20c
[16386.256533]  irq_exit+0x20/0x38
[16386.256544]  interrupt_async_exit_prepare.constprop.0+0x18/0x2c
...

[16386.256717] Last potentially related work creation:
[16386.256729]  kasan_save_stack+0x34/0x68
[16386.256741]  __kasan_record_aux_stack+0xcc/0x12c
[16386.256753]  __call_rcu_common.constprop.0+0x94/0xd04
[16386.256766]  vm_area_free+0x28/0x3c
[16386.256778]  remove_vma+0xf4/0x114
[16386.256797]  do_vmi_align_munmap.constprop.0+0x684/0x870
[16386.256811]  __vm_munmap+0xe0/0x1f8
[16386.256821]  sys_munmap+0x54/0x6c
[16386.256830]  system_call_exception+0x1a0/0x4a0
[16386.256841]  system_call_vectored_common+0x15c/0x2ec

[16386.256868] The buggy address belongs to the object at c00000014a819670
                which belongs to the cache vm_area_struct of size 168
[16386.256887] The buggy address is located 0 bytes inside of
                freed 168-byte region [c00000014a819670, c00000014a819718)

[16386.256915] The buggy address belongs to the physical page:
[16386.256928] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14a81
[16386.256950] memcg:c0000000ba430001
[16386.256961] anon flags: 0x43ffff800000000(node=4|zone=0|lastcpupid=0x7ffff)
[16386.256975] page_type: 0xfdffffff(slab)
[16386.256990] raw: 043ffff800000000 c00000000501c080 0000000000000000 5deadbee00000001
[16386.257003] raw: 0000000000000000 00000000011a011a 00000001fdffffff c0000000ba430001
[16386.257018] page dumped because: kasan: bad access detected

This patch adds close() callback in vas_vm_ops vm_operations_struct
which will be executed during munmap() before freeing VMA. The VMA
address in the VAS window is set to NULL after holding the window
mmap_mutex.

Fixes: 37e6764 ("powerpc/pseries/vas: Add VAS migration handler")
	Signed-off-by: Haren Myneni <[email protected]>
	Signed-off-by: Madhavan Srinivasan <[email protected]>
Link: https://patch.msgid.link/[email protected]

(cherry picked from commit 05aa156)
	Signed-off-by: Jonathan Maple <[email protected]>
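A minimal sketch of the close() callback approach from the last paragraph above; the window lookup and the task_ref field names are assumptions based on the commit text, not the upstream diff:

#include <linux/mm.h>
#include <linux/mutex.h>

/* Illustrative: clear the cached VMA when the mapping is torn down so the
 * migration path never dereferences a stale pointer after munmap().
 */
static void vas_vm_close_sketch(struct vm_area_struct *vma)
{
        struct vas_window *win = vma->vm_file->private_data;    /* assumed lookup */

        mutex_lock(&win->task_ref.mmap_mutex);                  /* assumed fields */
        win->task_ref.vma = NULL;
        mutex_unlock(&win->task_ref.mmap_mutex);
}

static const struct vm_operations_struct vas_vm_ops_sketch = {
        .close = vas_vm_close_sketch,
        /* .fault etc. unchanged */
};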
jira LE-4076
cve CVE-2025-38124
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Shiming Cheng <[email protected]>
commit 3382a1e

Commit a1e40ac ("net: gso: fix udp gso fraglist segmentation after
pull from frag_list") detected invalid geometry in frag_list skbs and
redirects them from skb_segment_list to more robust skb_segment. But some
packets with modified geometry can also hit bugs in that code. We don't
know how many such cases exist. Addressing each one by one also requires
touching the complex skb_segment code, which risks introducing bugs for
other types of skbs. Instead, linearize all these packets that fail the
basic invariants on gso fraglist skbs. That is more robust.

If only part of the fraglist payload is pulled into head_skb, it will
always cause an exception when splitting skbs with skb_segment. For
detailed call stack information, see below.

Valid SKB_GSO_FRAGLIST skbs
- consist of two or more segments
- the head_skb holds the protocol headers plus first gso_size
- one or more frag_list skbs hold exactly one segment
- all but the last must be gso_size

Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
modify fraglist skbs, breaking these invariants.

In extreme cases they pull part of the payload into the skb's linear
area. For UDP, this caused three payloads with lengths of (11,11,10)
bytes to become (12,10,10) bytes after the pull.

The skbs no longer meet the above SKB_GSO_FRAGLIST conditions because
payload was pulled into head_skb; they need to be linearized before
being passed to the regular skb_segment.

    skb_segment+0xcd0/0xd14
    __udp_gso_segment+0x334/0x5f4
    udp4_ufo_fragment+0x118/0x15c
    inet_gso_segment+0x164/0x338
    skb_mac_gso_segment+0xc4/0x13c
    __skb_gso_segment+0xc4/0x124
    validate_xmit_skb+0x9c/0x2c0
    validate_xmit_skb_list+0x4c/0x80
    sch_direct_xmit+0x70/0x404
    __dev_queue_xmit+0x64c/0xe5c
    neigh_resolve_output+0x178/0x1c4
    ip_finish_output2+0x37c/0x47c
    __ip_finish_output+0x194/0x240
    ip_finish_output+0x20/0xf4
    ip_output+0x100/0x1a0
    NF_HOOK+0xc4/0x16c
    ip_forward+0x314/0x32c
    ip_rcv+0x90/0x118
    __netif_receive_skb+0x74/0x124
    process_backlog+0xe8/0x1a4
    __napi_poll+0x5c/0x1f8
    net_rx_action+0x154/0x314
    handle_softirqs+0x154/0x4b8

    [118.376811] [C201134] rxq0_pus: [name:bug&]kernel BUG at net/core/skbuff.c:4278!
    [118.376829] [C201134] rxq0_pus: [name:traps&]Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
    [118.470774] [C201134] rxq0_pus: [name:mrdump&]Kernel Offset: 0x178cc00000 from 0xffffffc008000000
    [118.470810] [C201134] rxq0_pus: [name:mrdump&]PHYS_OFFSET: 0x40000000
    [118.470827] [C201134] rxq0_pus: [name:mrdump&]pstate: 60400005 (nZCv daif +PAN -UAO)
    [118.470848] [C201134] rxq0_pus: [name:mrdump&]pc : [0xffffffd79598aefc] skb_segment+0xcd0/0xd14
    [118.470900] [C201134] rxq0_pus: [name:mrdump&]lr : [0xffffffd79598a5e8] skb_segment+0x3bc/0xd14
    [118.470928] [C201134] rxq0_pus: [name:mrdump&]sp : ffffffc008013770

Fixes: a1e40ac ("gso: fix udp gso fraglist segmentation after pull from frag_list")
	Signed-off-by: Shiming Cheng <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>
(cherry picked from commit 3382a1e)
	Signed-off-by: Jonathan Maple <[email protected]>
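A minimal sketch of the "linearize, then fall back to skb_segment()" idea; the invariant check is condensed into a flag and the surrounding function is illustrative:

#include <linux/err.h>
#include <linux/skbuff.h>

static struct sk_buff *segment_or_linearize(struct sk_buff *skb,
                                            netdev_features_t features,
                                            bool fraglist_invariants_ok)
{
        /* A fraglist skb whose geometry was modified by NAT/BPF pulls is
         * flattened first so the regular skb_segment() path can split it.
         */
        if (!fraglist_invariants_ok && skb_linearize(skb))
                return ERR_PTR(-ENOMEM);        /* linearize may fail */

        return skb_segment(skb, features);
}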
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Jakub Kicinski <[email protected]>
commit 920efe3

bpf_offload caught a spurious warning in TC recently, but the error
message did not provide enough information to know what the problem
is:

  FAIL: Found 'netdevsim' in command output, leaky extack?

Add the extack to the output:

  FAIL: Unexpected command output, leaky extack? ('netdevsim', 'Warning: Filter with specified priority/protocol not found.')

	Acked-by: Stanislav Fomichev <[email protected]>
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 920efe3)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Jakub Kicinski <[email protected]>
commit f9d2f5d

We hit a following exception on timeout, nmaps is never set:

    Test bpftool bound info reporting (own ns)...
    Traceback (most recent call last):
      File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 1128, in <module>
        check_dev_info(False, "")
      File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 583, in check_dev_info
        maps = bpftool_map_list_wait(expected=2, ns=ns)
      File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 215, in bpftool_map_list_wait
        raise Exception("Time out waiting for map counts to stabilize want %d, have %d" % (expected, nmaps))
    NameError: name 'nmaps' is not defined

	Acked-by: Stanislav Fomichev <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit f9d2f5d)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Jakub Kicinski <[email protected]>
commit 56a5869

After installing pahole on the CI image we have a new map created
by libbpf. Ignore it otherwise we see:

  Exception: Time out waiting for map counts to stabilize want 2, have 3

	Acked-by: Stanislav Fomichev <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 56a5869)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-38471
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Jakub Kicinski <[email protected]>
commit 4ab26bc

After recent changes in net-next TCP compacts skbs much more
aggressively. This unearthed a bug in TLS where we may try
to operate on an old skb when checking if all skbs in the
queue have matching decrypt state and geometry.

    BUG: KASAN: slab-use-after-free in tls_strp_check_rcv+0x898/0x9a0 [tls]
    (net/tls/tls_strp.c:436 net/tls/tls_strp.c:530 net/tls/tls_strp.c:544)
    Read of size 4 at addr ffff888013085750 by task tls/13529

    CPU: 2 UID: 0 PID: 13529 Comm: tls Not tainted 6.16.0-rc5-virtme
    Call Trace:
     kasan_report+0xca/0x100
     tls_strp_check_rcv+0x898/0x9a0 [tls]
     tls_rx_rec_wait+0x2c9/0x8d0 [tls]
     tls_sw_recvmsg+0x40f/0x1aa0 [tls]
     inet_recvmsg+0x1c3/0x1f0

Always reload the queue, fast path is to have the record in the queue
when we wake, anyway (IOW the path going down "if !strp->stm.full_len").

Fixes: 0d87bbd ("tls: strp: make sure the TCP skbs do not have overlapping data")
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 4ab26bc)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Anumula Murali Mohan Reddy <[email protected]>
commit 356983f

t4_set_vf_mac_acl() uses the PF to set the MAC address, while
t4vf_get_vf_mac_acl() uses the port number to get it; this leads to an
error when attempting to set the MAC address on VFs of PF2 and PF3.
This patch fixes the issue by using the port number to set the MAC address.

Fixes: e0cdac6 ("cxgb4vf: configure ports accessible by the VF")
	Signed-off-by: Anumula Murali Mohan Reddy <[email protected]>
	Signed-off-by: Potnuri Bharat Teja <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 356983f)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-38200
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Kyungwook Boo <[email protected]>
commit 015bac5

When the device sends a specific input, an integer underflow can occur, leading
to MMIO write access to an invalid page.

Prevent the integer underflow by changing the type of related variables.

	Signed-off-by: Kyungwook Boo <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/T/
	Reviewed-by: Przemek Kitszel <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
	Reviewed-by: Aleksandr Loktionov <[email protected]>
	Tested-by: Rinitha S <[email protected]> (A Contingent worker at Intel)
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 015bac5)
	Signed-off-by: Jonathan Maple <[email protected]>
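The commit text above does not show the variables involved; as a generic, hedged illustration of the bug class (not the actual driver code), a small userspace demo of how an unsigned subtraction can wrap into a huge offset and how a wider signed type makes the value checkable:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uint16_t head = 0, tail = 3;            /* pretend these came from hardware */

        uint16_t bad  = head - tail;            /* wraps to 65533 */
        int32_t  good = (int32_t)head - tail;   /* -3: can be rejected */

        printf("unsigned result: %u\n", bad);
        if (good < 0)
                printf("signed result %d rejected before any MMIO write\n", good);
        return 0;
}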
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Cong Wang <[email protected]>
commit a7a15f3

est_qlen_notify() deletes its class from its active list with
list_del() when qlen is 0, therefore, it is not idempotent and
not friendly to its callers, like fq_codel_dequeue().

Let's make it idempotent to ease qdisc_tree_reduce_backlog() callers'
life. Also change other list_del()'s to list_del_init() just to be
extra safe.

	Reported-by: Gerrard Tai <[email protected]>
	Signed-off-by: Cong Wang <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Acked-by: Jamal Hadi Salim <[email protected]>
	Signed-off-by: Paolo Abeni <[email protected]>
(cherry picked from commit a7a15f3)
	Signed-off-by: Jonathan Maple <[email protected]>
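A minimal sketch of the idempotent deactivation described above: list_del_init() leaves the node pointing at itself, so repeated deactivation (or a later list_empty() check) is harmless, unlike a second plain list_del(). Structure names are illustrative:

#include <linux/list.h>

struct sched_class_sketch {
        struct list_head alist;         /* membership on the qdisc's active list */
};

/* Safe to call any number of times for the same class. */
static void class_deactivate(struct sched_class_sketch *cl)
{
        if (!list_empty(&cl->alist))
                list_del_init(&cl->alist);      /* re-initialized, not poisoned */
}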
jira LE-4076
cve CVE-2025-37914
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Victor Nogueira <[email protected]>
commit 1a6d0c0

As described in Gerrard's report [1], there are use cases where a netem
child qdisc will make the parent qdisc's enqueue callback reentrant.
In the case of ets, there won't be a UAF, but the code will add the same
classifier to the list twice, which will cause memory corruption.

In addition to checking for qlen being zero, this patch checks whether
the class was already added to the active_list (cl_is_active) before
doing the addition to cater for the reentrant case.

[1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/

Fixes: 37d9cf1 ("sched: Fix detection of empty queues in child qdiscs")
	Acked-by: Jamal Hadi Salim <[email protected]>
	Signed-off-by: Victor Nogueira <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 1a6d0c0)
	Signed-off-by: Jonathan Maple <[email protected]>
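A minimal sketch of the guard described above: only add the class to the active list when it is not already on it, which makes a reentrant enqueue through a netem child harmless. The helper name follows the commit text; the structures are illustrative:

#include <linux/list.h>

struct ets_class_sketch {
        struct list_head alist;
};

static bool cl_is_active(struct ets_class_sketch *cl)
{
        return !list_empty(&cl->alist);
}

/* Enqueue path, after the child qdisc accepted the packet: */
static void activate_once(struct ets_class_sketch *cl, struct list_head *active)
{
        if (!cl_is_active(cl))
                list_add_tail(&cl->alist, active);      /* never added twice */
}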
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Victor Nogueira <[email protected]>
commit ffdde7b

Lion's patch [1] revealed an ancient bug in the qdisc API.
Whenever a user creates/modifies a qdisc specifying as a parent another
qdisc, the qdisc API will, during grafting, detect that the user is
not trying to attach to a class and reject. However grafting is
performed after qdisc_create (and thus the qdiscs' init callback) is
executed. In qdiscs that eventually call qdisc_tree_reduce_backlog
during init or change (such as fq, hhf, choke, etc), an issue
arises. For example, executing the following commands:

sudo tc qdisc add dev lo root handle a: htb default 2
sudo tc qdisc add dev lo parent a: handle beef fq

Qdiscs such as fq, hhf, choke, etc unconditionally invoke
qdisc_tree_reduce_backlog() in their control path init() or change() which
then causes a failure to find the child class; however, that does not stop
the unconditional invocation of the assumed child qdisc's qlen_notify with
a null class. All these qdiscs make the assumption that class is non-null.

The solution is ensure that qdisc_leaf() which looks up the parent
class, and is invoked prior to qdisc_create(), should return failure on
not finding the class.
In this patch, we leverage qdisc_leaf to return ERR_PTRs whenever the
parentid doesn't correspond to a class, so that we can detect it
earlier on and abort before qdisc_create is called.

[1] https://lore.kernel.org/netdev/[email protected]/

Fixes: 5e50da0 ("[NET_SCHED]: Fix endless loops (part 2): "simple" qdiscs")
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Acked-by: Jamal Hadi Salim <[email protected]>
	Reviewed-by: Cong Wang <[email protected]>
	Signed-off-by: Victor Nogueira <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit ffdde7b)
	Signed-off-by: Jonathan Maple <[email protected]>
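A minimal sketch of an ERR_PTR-returning lookup in the spirit of the change described above, so callers can distinguish "parent class does not exist" from "class has no leaf qdisc" and abort before qdisc_create(); the real qdisc_leaf() and its callers differ in detail:

#include <linux/err.h>
#include <net/sch_generic.h>

static struct Qdisc *leaf_lookup_sketch(struct Qdisc *p, u32 classid)
{
        unsigned long cl = p->ops->cl_ops->find(p, classid);

        if (!cl)
                return ERR_PTR(-ENOENT);        /* no such class: abort early */

        return p->ops->cl_ops->leaf(p, cl);     /* may legitimately be NULL */
}

/* Caller side: if (IS_ERR(q)) return PTR_ERR(q); */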
jira LE-4076
cve CVE-2025-38417
Rebuild_History Non-Buildable kernel-6.12.0-55.29.1.el10_0
commit-author Grzegorz Nitka <[email protected]>
commit 48c8b21

Add a simple eswitch mode check to the VF attach procedure and allocate
the required port representor memory structures only in switchdev mode.
The reset flow triggers the VF (if present) detach/attach procedure.
It might involve VF port representor re-creation if the device is
configured in switchdev mode (not legacy mode).
The memory was blindly allocated in the current implementation,
regardless of the mode, and was not freed in legacy mode.

Kmemleak trace:
unreferenced object (percpu) 0x7e3bce5b888458 (size 40):
  comm "bash", pid 1784, jiffies 4295743894
  hex dump (first 32 bytes on cpu 45):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 0):
    pcpu_alloc_noprof+0x4c4/0x7c0
    ice_repr_create+0x66/0x130 [ice]
    ice_repr_create_vf+0x22/0x70 [ice]
    ice_eswitch_attach_vf+0x1b/0xa0 [ice]
    ice_reset_all_vfs+0x1dd/0x2f0 [ice]
    ice_pci_err_resume+0x3b/0xb0 [ice]
    pci_reset_function+0x8f/0x120
    reset_store+0x56/0xa0
    kernfs_fop_write_iter+0x120/0x1b0
    vfs_write+0x31c/0x430
    ksys_write+0x61/0xd0
    do_syscall_64+0x5b/0x180
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

Testing hints (ethX is PF netdev):
- create at least one VF
    echo 1 > /sys/class/net/ethX/device/sriov_numvfs
- trigger the reset
    echo 1 > /sys/class/net/ethX/device/reset

Fixes: 415db83 ("ice: make representor code generic")
	Signed-off-by: Grzegorz Nitka <[email protected]>
	Reviewed-by: Przemek Kitszel <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
	Tested-by: Rinitha S <[email protected]> (A Contingent worker at Intel)
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 48c8b21)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v6.12~1..kernel-mainline: 65214
Number of commits in rpm: 28
Number of commits matched with upstream: 24 (85.71%)
Number of commits in upstream but not in rpm: 65190
Number of commits NOT found in upstream: 4 (14.29%)

Rebuilding Kernel on Branch rocky10_0_rebuild_kernel-6.12.0-55.29.1.el10_0 for kernel-6.12.0-55.29.1.el10_0
Clean Cherry Picks: 23 (95.83%)
Empty Cherry Picks: 1 (4.17%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-6.12.0-55.29.1.el10_0/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual empty-commit failures are contained in the same directory.
The git message for each empty commit has the path to the failed cherry-pick.
File names are the first 8 characters of the upstream SHA.
jira LE-4076
cve CVE-2025-38472
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Florian Westphal <[email protected]>
commit 2d72afb

A crash in conntrack was reported while trying to unlink the conntrack
entry from the hash bucket list:
    [exception RIP: __nf_ct_delete_from_lists+172]
    [..]
 #7 [ff539b5a2b043aa0] nf_ct_delete at ffffffffc124d421 [nf_conntrack]
 #8 [ff539b5a2b043ad0] nf_ct_gc_expired at ffffffffc124d999 [nf_conntrack]
 #9 [ff539b5a2b043ae0] __nf_conntrack_find_get at ffffffffc124efbc [nf_conntrack]
    [..]

The nf_conn struct is marked as allocated from slab but appears to be in
a partially initialised state:

 ct hlist pointer is garbage; looks like the ct hash value
 (hence crash).
 ct->status is equal to IPS_CONFIRMED|IPS_DYING, which is expected
 ct->timeout is 30000 (=30s), which is unexpected.

Everything else looks like normal udp conntrack entry.  If we ignore
ct->status and pretend its 0, the entry matches those that are newly
allocated but not yet inserted into the hash:
  - ct hlist pointers are overloaded and store/cache the raw tuple hash
  - ct->timeout matches the relative time expected for a new udp flow
    rather than the absolute 'jiffies' value.

If it were not for the presence of IPS_CONFIRMED,
__nf_conntrack_find_get() would have skipped the entry.

Theory is that we did hit following race:

cpu x 			cpu y			cpu z
 found entry E		found entry E
 E is expired		<preemption>
 nf_ct_delete()
 return E to rcu slab
					init_conntrack
					E is re-inited,
					ct->status set to 0
					reply tuplehash hnnode.pprev
					stores hash value.

cpu y found E right before it was deleted on cpu x.
E is now re-inited on cpu z.  cpu y was preempted before
checking for expiry and/or confirm bit.

					->refcnt set to 1
					E now owned by skb
					->timeout set to 30000

If cpu y were to resume now, it would observe E as
expired but would skip E due to missing CONFIRMED bit.

					nf_conntrack_confirm gets called
					sets: ct->status |= CONFIRMED
					This is wrong: E is not yet added
					to hashtable.

cpu y resumes, it observes E as expired but CONFIRMED:
			<resumes>
			nf_ct_expired()
			 -> yes (ct->timeout is 30s)
			confirmed bit set.

cpu y will try to delete E from the hashtable:
			nf_ct_delete() -> set DYING bit
			__nf_ct_delete_from_lists

Even this scenario doesn't guarantee a crash:
cpu z still holds the table bucket lock(s) so y blocks:

			wait for spinlock held by z

					CONFIRMED is set but there is no
					guarantee ct will be added to hash:
					"chaintoolong" or "clash resolution"
					logic both skip the insert step.
					reply hnnode.pprev still stores the
					hash value.

					unlocks spinlock
					return NF_DROP
			<unblocks, then
			 crashes on hlist_nulls_del_rcu pprev>

In case CPU z does insert the entry into the hashtable, cpu y will unlink
E again right away but no crash occurs.

Without 'cpu y' race, 'garbage' hlist is of no consequence:
ct refcnt remains at 1, eventually skb will be free'd and E gets
destroyed via: nf_conntrack_put -> nf_conntrack_destroy -> nf_ct_destroy.

To resolve this, move the IPS_CONFIRMED assignment after the table
insertion but before the unlock.

Pablo points out that the confirm-bit-store could be reordered to happen
before hlist add resp. the timeout fixup, so switch to set_bit and
before_atomic memory barrier to prevent this.

It doesn't matter if other CPUs can observe a newly inserted entry right
before the CONFIRMED bit was set:

Such event cannot be distinguished from above "E is the old incarnation"
case: the entry will be skipped.

Also change nf_ct_should_gc() to first check the confirmed bit.

The gc sequence is:
 1. Check if entry has expired, if not skip to next entry
 2. Obtain a reference to the expired entry.
 3. Call nf_ct_should_gc() to double-check step 1.

nf_ct_should_gc() is thus called only for entries that already failed an
expiry check. After this patch, once the confirmed bit check passes
ct->timeout has been altered to reflect the absolute 'best before' date
instead of a relative time.  Step 3 will therefore not remove the entry.

Without this change to nf_ct_should_gc() we could still get this sequence:

 1. Check if entry has expired.
 2. Obtain a reference.
 3. Call nf_ct_should_gc() to double-check step 1:
    4 - entry is still observed as expired
    5 - meanwhile, ct->timeout is corrected to absolute value on other CPU
      and confirm bit gets set
    6 - confirm bit is seen
    7 - valid entry is removed again

First do check 6), then 4) so the gc expiry check always picks up either
confirmed bit unset (entry gets skipped) or expiry re-check failure for
re-inited conntrack objects.

This change cannot be backported to releases before 5.19. Without
commit 8a75a2c ("netfilter: conntrack: remove unconfirmed list")
|= IPS_CONFIRMED line cannot be moved without further changes.

	Cc: Razvan Cojocaru <[email protected]>
Link: https://lore.kernel.org/netfilter-devel/[email protected]/
Link: https://lore.kernel.org/netfilter-devel/[email protected]/
Fixes: 1397af5 ("netfilter: conntrack: remove the percpu dying list")
	Signed-off-by: Florian Westphal <[email protected]>
	Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 2d72afb)
	Signed-off-by: Jonathan Maple <[email protected]>
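A minimal sketch of the ordering described above: publish IPS_CONFIRMED only after the hash insertion and timeout fixup, with a barrier so the bit cannot be observed first, and have the gc helper test the confirmed bit before the expiry check. Simplified, not the upstream diff:

#include <net/netfilter/nf_conntrack.h>

/* Confirmation side: insert first, then publish IPS_CONFIRMED. */
static void confirm_after_insert_sketch(struct nf_conn *ct)
{
        /* ... hlist_nulls insertion and absolute-timeout fixup happen here ... */

        smp_mb__before_atomic();                /* order the above before the bit */
        set_bit(IPS_CONFIRMED_BIT, &ct->status);
}

/* GC side: an unconfirmed entry is never garbage collected. */
static bool should_gc_sketch(const struct nf_conn *ct)
{
        return test_bit(IPS_CONFIRMED_BIT, &ct->status) && nf_ct_is_expired(ct);
}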
jira LE-4076
cve CVE-2025-38461
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Michal Luczaj <[email protected]>
commit 687aa0c

Transport assignment may race with module unload. Protect new_transport
from becoming a stale pointer.

This also takes care of an insecure call in vsock_use_local_transport();
add a lockdep assert.

BUG: unable to handle page fault for address: fffffbfff8056000
Oops: Oops: 0000 [#1] SMP KASAN
RIP: 0010:vsock_assign_transport+0x366/0x600
Call Trace:
 vsock_connect+0x59c/0xc40
 __sys_connect+0xe8/0x100
 __x64_sys_connect+0x6e/0xc0
 do_syscall_64+0x92/0x1c0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fixes: c0cfa2d ("vsock: add multi-transports support")
	Reviewed-by: Stefano Garzarella <[email protected]>
	Signed-off-by: Michal Luczaj <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 687aa0c)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-38464
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Kuniyuki Iwashima <[email protected]>
commit 667eeab

syzbot reported a null-ptr-deref in tipc_conn_close() during netns
dismantle. [0]

tipc_topsrv_stop() iterates tipc_net(net)->topsrv->conn_idr and calls
tipc_conn_close() for each tipc_conn.

The problem is that tipc_conn_close() is called after releasing the
IDR lock.

At the same time, there might be tipc_conn_recv_work() running and it
could call tipc_conn_close() for the same tipc_conn and release its
last ->kref.

Once we release the IDR lock in tipc_topsrv_stop(), there is no
guarantee that the tipc_conn is alive.

Let's hold the ref before releasing the lock and put the ref after
tipc_conn_close() in tipc_topsrv_stop().

[0]:
BUG: KASAN: use-after-free in tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165
Read of size 8 at addr ffff888099305a08 by task kworker/u4:3/435

CPU: 0 PID: 435 Comm: kworker/u4:3 Not tainted 4.19.204-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1fc/0x2ef lib/dump_stack.c:118
 print_address_description.cold+0x54/0x219 mm/kasan/report.c:256
 kasan_report_error.cold+0x8a/0x1b9 mm/kasan/report.c:354
 kasan_report mm/kasan/report.c:412 [inline]
 __asan_report_load8_noabort+0x88/0x90 mm/kasan/report.c:433
 tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165
 tipc_topsrv_stop net/tipc/topsrv.c:701 [inline]
 tipc_topsrv_exit_net+0x27b/0x5c0 net/tipc/topsrv.c:722
 ops_exit_list+0xa5/0x150 net/core/net_namespace.c:153
 cleanup_net+0x3b4/0x8b0 net/core/net_namespace.c:553
 process_one_work+0x864/0x1570 kernel/workqueue.c:2153
 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
 kthread+0x33f/0x460 kernel/kthread.c:259
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

Allocated by task 23:
 kmem_cache_alloc_trace+0x12f/0x380 mm/slab.c:3625
 kmalloc include/linux/slab.h:515 [inline]
 kzalloc include/linux/slab.h:709 [inline]
 tipc_conn_alloc+0x43/0x4f0 net/tipc/topsrv.c:192
 tipc_topsrv_accept+0x1b5/0x280 net/tipc/topsrv.c:470
 process_one_work+0x864/0x1570 kernel/workqueue.c:2153
 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
 kthread+0x33f/0x460 kernel/kthread.c:259
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

Freed by task 23:
 __cache_free mm/slab.c:3503 [inline]
 kfree+0xcc/0x210 mm/slab.c:3822
 tipc_conn_kref_release net/tipc/topsrv.c:150 [inline]
 kref_put include/linux/kref.h:70 [inline]
 conn_put+0x2cd/0x3a0 net/tipc/topsrv.c:155
 process_one_work+0x864/0x1570 kernel/workqueue.c:2153
 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
 kthread+0x33f/0x460 kernel/kthread.c:259
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

The buggy address belongs to the object at ffff888099305a00
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 8 bytes inside of
 512-byte region [ffff888099305a00, ffff888099305c00)
The buggy address belongs to the page:
page:ffffea000264c140 count:1 mapcount:0 mapping:ffff88813bff0940 index:0x0
flags: 0xfff00000000100(slab)
raw: 00fff00000000100 ffffea00028b6b88 ffffea0002cd2b08 ffff88813bff0940
raw: 0000000000000000 ffff888099305000 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888099305900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888099305980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888099305a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                      ^
 ffff888099305a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888099305b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: c5fa7b3 ("tipc: introduce new TIPC server infrastructure")
	Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=27169a847a70550d17be
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Reviewed-by: Tung Nguyen <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 667eeab)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-38220
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Brian Foster <[email protected]>
commit e26268f

fstest generic/388 occasionally reproduces a crash that looks as
follows:

BUG: kernel NULL pointer dereference, address: 0000000000000000
...
Call Trace:
 <TASK>
 ext4_block_zero_page_range+0x30c/0x380 [ext4]
 ext4_truncate+0x436/0x440 [ext4]
 ext4_process_orphan+0x5d/0x110 [ext4]
 ext4_orphan_cleanup+0x124/0x4f0 [ext4]
 ext4_fill_super+0x262d/0x3110 [ext4]
 get_tree_bdev_flags+0x132/0x1d0
 vfs_get_tree+0x26/0xd0
 vfs_cmd_create+0x59/0xe0
 __do_sys_fsconfig+0x4ed/0x6b0
 do_syscall_64+0x82/0x170
 ...

This occurs when processing a symlink inode from the orphan list. The
partial block zeroing code in the truncate path calls
ext4_dirty_journalled_data() -> folio_mark_dirty(). The latter calls
mapping->a_ops->dirty_folio(), but symlink inodes are not assigned an
a_ops vector in ext4, hence the crash.

To avoid this problem, update the ext4_dirty_journalled_data() helper to
only mark the folio dirty on regular files (for which a_ops is
assigned). This also matches the journaling logic in the ext4_symlink()
creation path, where ext4_handle_dirty_metadata() is called directly.
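
As a rough sketch of that guard (the function signature and the inode lookup are
assumptions; only the function names above come from the actual change):

  static int ext4_dirty_journalled_data(handle_t *handle, struct buffer_head *bh)
  {
          struct folio *folio = bh->b_folio;

          /* Symlink inodes have no a_ops in ext4, so only dirty folios of
           * regular files; the journalled-metadata path is taken either way.
           */
          if (S_ISREG(folio->mapping->host->i_mode))
                  folio_mark_dirty(folio);
          return ext4_handle_dirty_metadata(handle, NULL, bh);
  }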

Fixes: d84c9eb ("ext4: Mark pages with journalled data dirty")
	Signed-off-by: Brian Foster <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Theodore Ts'o <[email protected]>
	Reviewed-by: Jan Kara <[email protected]>
	Cc: [email protected]
(cherry picked from commit e26268f)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Kuniyuki Iwashima <[email protected]>
commit 5a465a0

__udp_enqueue_schedule_skb() has the following condition:

  if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
          goto drop;

sk->sk_rcvbuf is initialised by net.core.rmem_default and later can
be configured by SO_RCVBUF, which is limited by net.core.rmem_max,
or SO_RCVBUFFORCE.

If we set INT_MAX to sk->sk_rcvbuf, the condition is always false
as sk->sk_rmem_alloc is also signed int.

Then, the size of the incoming skb is added to sk->sk_rmem_alloc
unconditionally.

This results in integer overflow (possibly multiple times) on
sk->sk_rmem_alloc and allows a single socket to have skb up to
net.core.udp_mem[1].

For example, if we set a large value to udp_mem[1] and INT_MAX to
sk->sk_rcvbuf and flood packets to the socket, we can see multiple
overflows:

  # cat /proc/net/sockstat | grep UDP:
  UDP: inuse 3 mem 7956736  <-- (7956736 << 12) bytes > INT_MAX * 15
                                             ^- PAGE_SHIFT
  # ss -uam
  State  Recv-Q      ...
  UNCONN -1757018048 ...    <-- flipping the sign repeatedly
         skmem:(r2537949248,rb2147483646,t0,tb212992,f1984,w0,o0,bl0,d0)

Previously, we had a boundary check for INT_MAX, which was removed by
commit 6a1f12d ("udp: relax atomic operation on sk->sk_rmem_alloc").

A complete fix would be to revert it and cap the right operand by
INT_MAX:

  rmem = atomic_add_return(size, &sk->sk_rmem_alloc);
  if (rmem > min(size + (unsigned int)sk->sk_rcvbuf, INT_MAX))
          goto uncharge_drop;

but we do not want to add the expensive atomic_add_return() back just
for the corner case.

Casting rmem to unsigned int prevents multiple wraparounds, but we still
allow a single wraparound.

  # cat /proc/net/sockstat | grep UDP:
  UDP: inuse 3 mem 524288  <-- (INT_MAX + 1) >> 12

  # ss -uam
  State  Recv-Q      ...
  UNCONN -2147482816 ...   <-- INT_MAX + 831 bytes
         skmem:(r2147484480,rb2147483646,t0,tb212992,f3264,w0,o0,bl0,d14468947)

So, let's define rmem and rcvbuf as unsigned int and check skb->truesize
only when rcvbuf is large enough to lower the overflow possibility.

Note that we still have a small chance to see overflow if multiple skbs
to the same socket are processed on different cores at the same time and
each size does not exceed the limit but the total size does.

Note also that we must ignore skb->truesize for a small buffer as
explained in commit 363dc73 ("udp: be less conservative with
sock rmem accounting").
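
Illustratively, the shape of that check looks like the fragment below (a sketch of
the idea, not the verbatim diff; sk, skb and the drop label come from the
surrounding function, and the large-buffer threshold is an assumption):

  unsigned int rmem, rcvbuf;

  rmem   = atomic_read(&sk->sk_rmem_alloc);
  rcvbuf = READ_ONCE(sk->sk_rcvbuf);

  /* Unsigned compare: an INT_MAX rcvbuf can no longer make this test
   * permanently false after sk_rmem_alloc wraps around.
   */
  if (rmem > rcvbuf)
          goto drop;

  /* Fold skb->truesize into the check only for large buffers; small
   * buffers keep the relaxed accounting of commit 363dc73 above.
   */
  if (rcvbuf > INT_MAX / 2 && rmem + skb->truesize > rcvbuf)
          goto drop;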

Fixes: 6a1f12d ("udp: relax atomic operation on sk->sk_rmem_alloc")
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 5a465a0)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-22058
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Kuniyuki Iwashima <[email protected]>
commit df207de

Matt Dowling reported a weird UDP memory usage issue.

Under normal operation, the UDP memory usage reported in /proc/net/sockstat
remains close to zero.  However, it occasionally spiked to 524,288 pages
and never dropped.  Moreover, the value doubled when the application was
terminated.  Finally, it caused intermittent packet drops.

We can reproduce the issue with the script below [0]:

  1. /proc/net/sockstat reports 0 pages

    # cat /proc/net/sockstat | grep UDP:
    UDP: inuse 1 mem 0

  2. Run the script till the report reaches 524,288

    # python3 test.py & sleep 5
    # cat /proc/net/sockstat | grep UDP:
    UDP: inuse 3 mem 524288  <-- (INT_MAX + 1) >> PAGE_SHIFT

  3. Kill the socket and confirm the number never drops

    # pkill python3 && sleep 5
    # cat /proc/net/sockstat | grep UDP:
    UDP: inuse 1 mem 524288

  4. (necessary since v6.0) Trigger proto_memory_pcpu_drain()

    # python3 test.py & sleep 1 && pkill python3

  5. The number doubles

    # cat /proc/net/sockstat | grep UDP:
    UDP: inuse 1 mem 1048577

The application set INT_MAX to SO_RCVBUF, which triggered an integer
overflow in udp_rmem_release().

When a socket is close()d, udp_destruct_common() purges its receive
queue and sums up skb->truesize in the queue.  This total is calculated
and stored in a local unsigned integer variable.

The total size is then passed to udp_rmem_release() to adjust memory
accounting.  However, because the function takes a signed integer
argument, the total size can wrap around, causing an overflow.

Then, the released amount is calculated as follows:

  1) Add size to sk->sk_forward_alloc.
  2) Round down sk->sk_forward_alloc to the nearest lower multiple of
      PAGE_SIZE and assign it to amount.
  3) Subtract amount from sk->sk_forward_alloc.
  4) Pass amount >> PAGE_SHIFT to __sk_mem_reduce_allocated().

When the issue occurred, the total in udp_destruct_common() was 2147484480
(INT_MAX + 833), which was cast to -2147482816 in udp_rmem_release().

At 1) sk->sk_forward_alloc is changed from 3264 to -2147479552, and
2) sets -2147479552 to amount.  3) reverts the wraparound, so we don't
see a warning in inet_sock_destruct().  However, udp_memory_allocated
ends up doubling at 4).
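
The cast above can be checked with a few lines of standalone C (the constants are
taken straight from the report):

  #include <limits.h>
  #include <stdio.h>

  int main(void)
  {
          unsigned int total = 2147484480u;  /* INT_MAX + 833, as reported */
          int as_signed = (int)total;        /* implementation-defined; wraps on Linux/glibc */

          printf("unsigned total: %u\n", total);     /* 2147484480 */
          printf("as signed int : %d\n", as_signed); /* -2147482816 */
          printf("INT_MAX       : %d\n", INT_MAX);   /* 2147483647 */
          return 0;
  }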

Since commit 3cd3399 ("net: implement per-cpu reserves for
memory_allocated"), memory usage no longer doubles immediately after
a socket is close()d because __sk_mem_reduce_allocated() caches the
amount in udp_memory_per_cpu_fw_alloc.  However, the next time a UDP
socket receives a packet, the subtraction takes effect, causing UDP
memory usage to double.

This issue makes further memory allocation fail once the socket's
sk->sk_rmem_alloc exceeds net.ipv4.udp_rmem_min, resulting in packet
drops.

To prevent this issue, let's use unsigned int for the calculation and
call sk_forward_alloc_add() only once for the small delta.

Note that first_packet_length() also potentially has the same problem.

[0]:
from socket import *

SO_RCVBUFFORCE = 33
INT_MAX = (2 ** 31) - 1

s = socket(AF_INET, SOCK_DGRAM)
s.bind(('', 0))
s.setsockopt(SOL_SOCKET, SO_RCVBUFFORCE, INT_MAX)

c = socket(AF_INET, SOCK_DGRAM)
c.connect(s.getsockname())

data = b'a' * 100

while True:
    c.send(data)

Fixes: f970bd9 ("udp: implement memory accounting helpers")
	Reported-by: Matt Dowling <[email protected]>
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit df207de)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-38211
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Shin'ichiro Kawasaki <[email protected]>
commit 6883b68

The commit 59c68ac ("iw_cm: free cm_id resources on the last
deref") simplified cm_id resource management by freeing cm_id once all
references to the cm_id were removed. The references are removed either
upon completion of iw_cm event handlers or when the application destroys
the cm_id. This commit introduced the use-after-free condition where
cm_id_private object could still be in use by event handler works during
the destruction of cm_id. The commit aee2424 ("RDMA/iwcm: Fix a
use-after-free related to destroying CM IDs") addressed this use-after-
free by flushing all pending works at the cm_id destruction.

However, still another use-after-free possibility remained. It happens
with the work objects allocated for each cm_id_priv within
alloc_work_entries() during cm_id creation, and subsequently freed in
dealloc_work_entries() once all references to the cm_id are removed.
If the cm_id's last reference is decremented in the event handler work,
the work object for that work itself gets freed, which causes the use-
after-free BUG below:

  BUG: KASAN: slab-use-after-free in __pwq_activate_work+0x1ff/0x250
  Read of size 8 at addr ffff88811f9cf800 by task kworker/u16:1/147091

  CPU: 2 UID: 0 PID: 147091 Comm: kworker/u16:1 Not tainted 6.15.0-rc2+ #27 PREEMPT(voluntary)
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
  Workqueue:  0x0 (iw_cm_wq)
  Call Trace:
   <TASK>
   dump_stack_lvl+0x6a/0x90
   print_report+0x174/0x554
   ? __virt_addr_valid+0x208/0x430
   ? __pwq_activate_work+0x1ff/0x250
   kasan_report+0xae/0x170
   ? __pwq_activate_work+0x1ff/0x250
   __pwq_activate_work+0x1ff/0x250
   pwq_dec_nr_in_flight+0x8c5/0xfb0
   process_one_work+0xc11/0x1460
   ? __pfx_process_one_work+0x10/0x10
   ? assign_work+0x16c/0x240
   worker_thread+0x5ef/0xfd0
   ? __pfx_worker_thread+0x10/0x10
   kthread+0x3b0/0x770
   ? __pfx_kthread+0x10/0x10
   ? rcu_is_watching+0x11/0xb0
   ? _raw_spin_unlock_irq+0x24/0x50
   ? rcu_is_watching+0x11/0xb0
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x30/0x70
   ? __pfx_kthread+0x10/0x10
   ret_from_fork_asm+0x1a/0x30
   </TASK>

  Allocated by task 147416:
   kasan_save_stack+0x2c/0x50
   kasan_save_track+0x10/0x30
   __kasan_kmalloc+0xa6/0xb0
   alloc_work_entries+0xa9/0x260 [iw_cm]
   iw_cm_connect+0x23/0x4a0 [iw_cm]
   rdma_connect_locked+0xbfd/0x1920 [rdma_cm]
   nvme_rdma_cm_handler+0x8e5/0x1b60 [nvme_rdma]
   cma_cm_event_handler+0xae/0x320 [rdma_cm]
   cma_work_handler+0x106/0x1b0 [rdma_cm]
   process_one_work+0x84f/0x1460
   worker_thread+0x5ef/0xfd0
   kthread+0x3b0/0x770
   ret_from_fork+0x30/0x70
   ret_from_fork_asm+0x1a/0x30

  Freed by task 147091:
   kasan_save_stack+0x2c/0x50
   kasan_save_track+0x10/0x30
   kasan_save_free_info+0x37/0x60
   __kasan_slab_free+0x4b/0x70
   kfree+0x13a/0x4b0
   dealloc_work_entries+0x125/0x1f0 [iw_cm]
   iwcm_deref_id+0x6f/0xa0 [iw_cm]
   cm_work_handler+0x136/0x1ba0 [iw_cm]
   process_one_work+0x84f/0x1460
   worker_thread+0x5ef/0xfd0
   kthread+0x3b0/0x770
   ret_from_fork+0x30/0x70
   ret_from_fork_asm+0x1a/0x30

  Last potentially related work creation:
   kasan_save_stack+0x2c/0x50
   kasan_record_aux_stack+0xa3/0xb0
   __queue_work+0x2ff/0x1390
   queue_work_on+0x67/0xc0
   cm_event_handler+0x46a/0x820 [iw_cm]
   siw_cm_upcall+0x330/0x650 [siw]
   siw_cm_work_handler+0x6b9/0x2b20 [siw]
   process_one_work+0x84f/0x1460
   worker_thread+0x5ef/0xfd0
   kthread+0x3b0/0x770
   ret_from_fork+0x30/0x70
   ret_from_fork_asm+0x1a/0x30

This BUG is reproducible by repeating the blktests test case nvme/061
for the rdma transport and the siw driver.

To avoid the use-after-free of cm_id_private work objects, ensure that
the last reference to the cm_id is decremented not in the event handler
works, but in the cm_id destruction context. For that purpose, move
iwcm_deref_id() call from destroy_cm_id() to the callers of
destroy_cm_id(). In iw_destroy_cm_id(), call iwcm_deref_id() after
flushing the pending works.
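
A sketch of that ordering (container_of() into struct iwcm_id_private and the
iwcm_wq workqueue are assumed from the traces above; this is not the verbatim diff):

  void iw_destroy_cm_id(struct iw_cm_id *cm_id)
  {
          struct iwcm_id_private *cm_id_priv =
                  container_of(cm_id, struct iwcm_id_private, id);

          destroy_cm_id(cm_id);        /* no longer drops the last reference itself */
          flush_workqueue(iwcm_wq);    /* wait for pending cm_work_handler() works */
          iwcm_deref_id(cm_id_priv);   /* final deref happens here, after the flush */
  }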

During the fix work, I noticed that iw_destroy_cm_id() is called from
cm_work_handler() and process_event() context. However, the comment of
iw_destroy_cm_id() notes that the function "cannot be called by the
event thread". Drop the false comment.

Closes: https://lore.kernel.org/linux-rdma/r5676e754sv35aq7cdsqrlnvyhiq5zktteaurl7vmfih35efko@z6lay7uypy3c/
Fixes: 59c68ac ("iw_cm: free cm_id resources on the last deref")
	Cc: [email protected]
	Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Reviewed-by: Zhu Yanjun <[email protected]>
	Signed-off-by: Leon Romanovsky <[email protected]>
(cherry picked from commit 6883b68)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit dc287e4

Since commit 25f39d3 ("s390/pci: Ignore RID for isolated VFs") PFs
which are not initially configured but in standby are considered
isolated. That is, they create only a single-function PCI domain. Due to
the PCI domains being created on discovery, this means that even if they
are configured later on, sibling PFs and their child VFs will not be
added to their PCI domain, breaking SR-IOV expectations.

The reason the referenced commit ignored standby PFs for the creation of
multi-function PCI subhierarchies was to work around a PCI domain
renumbering scenario on reboot. The renumbering would occur after
removing a previously in standby PF, whose domain number is used for its
configured sibling PFs and their child VFs, but which itself remained in
standby. When this is followed by a reboot, the sibling PF is used
instead to determine the PCI domain number of it and its child VFs.

In principle it is not possible to know which standby PFs will be
configured later and which may be removed. The PCI domain and root bus
are prerequisites for hotplug slots, so the decision of which functions
belong to which domain cannot be postponed. With the renumbering
occurring only in rare circumstances and being generally benign, accept
it as an oddity and fix SR-IOV for initially standby PFs simply by
allowing them to create PCI domains.

	Cc: [email protected]
	Reviewed-by: Gerd Bayer <[email protected]>
Fixes: 25f39d3 ("s390/pci: Ignore RID for isolated VFs")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Alexander Gordeev <[email protected]>
(cherry picked from commit dc287e4)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 0579388

This creates a new zpci_iov_find_parent_pf() function which a future
commit can use to find if a VF has a configured parent PF. Use
zdev->rid instead of zdev->devfn such that the new function can be used
before it has been decided if the RID will be exposed and zdev->devfn is
set. Also handle the hypothetical case where the RID is not available but
there is an otherwise matching zbus.

Fixes: 25f39d3 ("s390/pci: Ignore RID for isolated VFs")
	Cc: [email protected]
	Reviewed-by: Halil Pasic <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Vasily Gorbik <[email protected]>
(cherry picked from commit 0579388)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 2844ddb

In contrast to the commit message of the fixed commit, VFs whose parent
PF is not configured are not always isolated, that is, put on their own
PCI domain. This is because for VFs to be added to an existing PCI
domain it is enough for that PCI domain to share the same topology ID or
PCHID. Such a matching PCI domain without a parent PF may exist when
a PF from the same PCI card created the domain with the VF being a child
of a different, non-accessible PF. While not causing technical issues,
it makes the rules for which VFs are isolated inconsistent.

Fix this by explicitly checking that the parent PF exists on the PCI
domain determined by the topology ID or PCHID before registering the VF.
This works because a parent PF which is under control of this Linux
instance must be enabled and configured at the point where its child VFs
appear because otherwise SR-IOV could not have been enabled on the
parent.

Fixes: 25f39d3 ("s390/pci: Ignore RID for isolated VFs")
	Cc: [email protected]
	Reviewed-by: Halil Pasic <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Vasily Gorbik <[email protected]>
(cherry picked from commit 2844ddb)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Niklas Schnelle <[email protected]>
commit 8691abd

For non-VFs, zpci_bus_is_isolated_vf() should return false because they
aren't VFs. While zpci_iov_find_parent_pf() specifically checks if
a function is a VF, it then simply returns that there is no parent. The
simplistic check for a parent then leads to these functions being
confused with isolated VFs and isolating them on their own domain even
if sibling PFs should share the domain.

Fix this by explicitly checking if a function is not a VF. Note also
that at this point the case where RIDs are ignored is already handled
and in this case all PCI functions get isolated by being detected in
zpci_bus_is_multifunction_root().

	Cc: [email protected]
Fixes: 2844ddb ("s390/pci: Fix handling of isolated VFs")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Reviewed-by: Halil Pasic <[email protected]>
	Signed-off-by: Vasily Gorbik <[email protected]>
(cherry picked from commit 8691abd)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4076
cve CVE-2025-37823
Rebuild_History Non-Buildable kernel-6.12.0-55.30.1.el10_0
commit-author Cong Wang <[email protected]>
commit 6ccbda4

Similarly to the previous patch, we need to safeguard hfsc_dequeue()
too. But for this one, we don't have a reliable reproducer.

Fixes: 1da177e ("Linux-2.6.12-rc2")
	Reported-by: Gerrard Tai <[email protected]>
	Signed-off-by: Cong Wang <[email protected]>
	Reviewed-by: Jamal Hadi Salim <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 6ccbda4)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v6.12~1..kernel-mainline: 65214
Number of commits in rpm: 15
Number of commits matched with upstream: 12 (80.00%)
Number of commits in upstream but not in rpm: 65202
Number of commits NOT found in upstream: 3 (20.00%)

Rebuilding Kernel on Branch rocky10_0_rebuild_kernel-6.12.0-55.30.1.el10_0 for kernel-6.12.0-55.30.1.el10_0
Clean Cherry Picks: 12 (100.00%)
Empty Cherry Picks: 0 (0.00%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-6.12.0-55.30.1.el10_0/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures are contained in the same directory.
The git message for empty commits will include the path to the failed commit.
File names are the first 8 characters of the upstream SHA.
Collaborator

@bmastbergen bmastbergen left a comment


🥌


@jdieter jdieter left a comment


LGTM

@PlaidCat PlaidCat changed the title [rocky10_0] History Rebuild to kernel-6.12.0-55.27.1.el10_0 [rocky10_0] History Rebuild to kernel-6.12.0-55.30.1.el10_0 Sep 8, 2025
@PlaidCat PlaidCat merged commit 0b50c95 into rocky10_0 Sep 8, 2025
4 checks passed
@PlaidCat PlaidCat deleted the rocky10_0_rebuild branch September 8, 2025 15:41