summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2026-04-27wifi: mac80211: check ieee80211_rx_data_set_link return in pubsta MLO pathMichael Bommarito
__ieee80211_rx_handle_packet() resolves the link via ieee80211_rx_data_set_link() on the pubsta->mlo path but ignores the helper's return value. Inside the helper, rx->link = rcu_dereference(rx->sdata->link[link_id]); can leave rx->link NULL if link_id references a slot already cleared by ieee80211_vif_set_links() during station-initiated ML reconfiguration (see mlme.c's ieee80211_ml_reconfiguration(), which invalidates sdata->link[] before the matching ieee80211_sta_remove_link() loop walks the link-sta hash). RX dispatch still resolves a link_sta from the hash and then drops into ieee80211_prepare_and_rx_handle(), which dereferences link->conf->addr. Every other user site of ieee80211_rx_data_set_link() checks the return and bails on failure; only this branch did not. Mirror the safe pattern. Fixes: e66b7920aa5a ("wifi: mac80211: fix initialization of rx->link and rx->link_sta") Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://patch.msgid.link/20260422000651.4184602-1-michael.bommarito@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-27wifi: nl80211: require admin perm on SET_PMK / DEL_PMKMichael Bommarito
NL80211_CMD_SET_PMK and NL80211_CMD_DEL_PMK manage the offloaded 4-way-handshake PMK state used by drivers advertising NL80211_EXT_FEATURE_4WAY_HANDSHAKE_STA_1X. The only in-tree driver that wires up both ->set_pmk / ->del_pmk and advertises the feature today is brcmfmac, so the practical reach of this patch is narrow. Both ops were introduced without a .flags gate, so the generic netlink layer dispatches them to an unprivileged caller instead of rejecting with -EPERM at the permission check. Every other connection-state op in the adjacent block (CONNECT, ASSOCIATE, AUTHENTICATE, SET_KEY, ...) carries GENL_UNS_ADMIN_PERM; SET_PMK / DEL_PMK were introduced without the flag in 2017 and left unchanged by later refactors. Johannes checked the original Intel submission history and confirmed there is no admin check in any prior revision either, so this seems likely to be a simple oversight rather than an intentional carve-out. Require GENL_UNS_ADMIN_PERM so the genl layer performs the same capable(CAP_NET_ADMIN) check as its siblings. wpa_supplicant already needs CAP_NET_ADMIN for every other nl80211 op it issues, so supplicant operation is unaffected. The worst case the missing gate enables today is an unprivileged local process on a multi-user system invalidating the offloaded PMK state of another user's 4-way-handshake session, forcing a full EAP re-auth on the next reconnect. Verified in UML: an unprivileged probe (uid=1000) sees SET_MULTICAST_TO_UNICAST (sibling op with GENL_UNS_ADMIN_PERM) return -EPERM on both pre- and post-fix kernels, while SET_PMK / DEL_PMK return -ENODEV from nl80211_pre_doit()'s wdev lookup pre- fix (proving dispatch crossed the genl permission check) and -EPERM post-fix (rejected at the genl layer as intended). Suggested-by: Johannes Berg <johannes@sipsolutions.net> Fixes: 3a00df5707b6 ("cfg80211: support 4-way handshake offloading for 802.1X") Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Arend van Spriel <arend.vanspriel@broadcom> Link: https://patch.msgid.link/20260421224552.4044147-1-michael.bommarito@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-27wifi: mac80211: skip ieee80211_verify_sta_ht_mcs_support check in non-strict ↵Rio Liu
mode Some Xfinity XB8 firmware advertises >1 spatial stream MCS indexes in their basic HT-MCS set. On cards with lower spatial streams, the check would fail, and we'd be stuck with no HT when in fact work fine with its own supported rate. This change makes it so the check is only performed in strict mode. Fixes: 574faa0e936d ("wifi: mac80211: add HT and VHT basic set verification") Signed-off-by: Rio Liu <rio@r26.me> Link: https://patch.msgid.link/99Mv9QEceyPrQhSP52MtAVmz0_kWJmzqotJjD9YW6LGLqk-AZloAueUyHCURilFkuqOh6Ecv8i2KKdSE1ujP3AnbU5QEouVisT1w_V3xdfc=@r26.me Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-24Merge tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Bugfixes: - Fix handling of ENOSPC so that if we have to resend writes, they are written synchronously - SUNRPC RDMA transport fixes from Chuck - Several fixes for delegated timestamps in NFSv4.2 - Failure to obtain a directory delegation should not cause stat() to fail with NFSv4 - Rename was failing to update timestamps when a directory delegation is held on NFSv4 - Ensure we check rsize/wsize after crossing a NFSv4 filesystem boundary - NFSv4/pnfs: - If the server is down, retry the layout returns on reboot - Fallback to MDS could result in a short write being incorrectly logged Cleanups: - Use memcpy_and_pad in decode_fh" * tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (21 commits) NFS: Fix RCU dereference of cl_xprt in nfs_compare_super_address NFS: remove redundant __private attribute from nfs_page_class NFSv4.2: fix CLONE/COPY attrs in presence of delegated attributes NFS: fix writeback in presence of errors nfs: use memcpy_and_pad in decode_fh NFSv4.1: Apply session size limits on clone path NFSv4: retry GETATTR if GET_DIR_DELEGATION failed NFS: fix RENAME attr in presence of directory delegations pnfs/flexfiles: validate ds_versions_cnt is non-zero NFS/blocklayout: print each device used for SCSI layouts xprtrdma: Post receive buffers after RPC completion xprtrdma: Scale receive batch size with credit window xprtrdma: Replace rpcrdma_mr_seg with xdr_buf cursor xprtrdma: Decouple frwr_wp_create from frwr_map xprtrdma: Close lost-wakeup race in xprt_rdma_alloc_slot xprtrdma: Avoid 250 ms delay on backlog wakeup xprtrdma: Close sendctx get/put race that can block a transport nfs: update inode ctime after removexattr operation nfs: fix utimensat() for atime with delegated timestamps NFS: improve "Server wrote zero bytes" error ...
2026-04-24Merge tag 'ceph-for-7.1-rc1' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: "We have a series from Alex which extends CephFS client metrics with support for per-subvolume data I/O performance and latency tracking (metadata operations aren't included) and a good variety of fixes and cleanups across RBD and CephFS" * tag 'ceph-for-7.1-rc1' of https://github.com/ceph/ceph-client: ceph: add subvolume metrics collection and reporting ceph: parse subvolume_id from InodeStat v9 and store in inode ceph: handle InodeStat v8 versioned field in reply parsing libceph: Fix slab-out-of-bounds access in auth message processing rbd: fix null-ptr-deref when device_add_disk() fails crush: cleanup in crush_do_rule() method ceph: clear s_cap_reconnect when ceph_pagelist_encode_32() fails ceph: only d_add() negative dentries when they are unhashed libceph: update outdated comment in ceph_sock_write_space() libceph: Remove obsolete session key alignment logic ceph: fix num_ops off-by-one when crypto allocation fails libceph: Prevent potential null-ptr-deref in ceph_handle_auth_reply()
2026-04-24Merge tag '9p-for-7.1-rc1' of https://github.com/martinetd/linuxLinus Torvalds
Pull 9p updates from Dominique Martinet: - 9p access flag fix (cannot change access flag since new mount API implem) - some minor cleanup * tag '9p-for-7.1-rc1' of https://github.com/martinetd/linux: 9p/trans_xen: replace simple_strto* with kstrtouint 9p/trans_xen: make cleanup idempotent after dataring alloc errors 9p: document missing enum values in kernel-doc comments 9p: fix access mode flags being ORed instead of replaced 9p: fix memory leak in v9fs_init_fs_context error path
2026-04-24bpf: Fix sk_local_storage diag dumping uninitialized special fieldsAmery Hung
Call check_and_init_map_value() after the copy_map_value() to zero out special field regions. diag_get() copies sk_local_storage map values into a netlink message using copy_map_value{_locked}(), which intentionally skip special fields. However, the destination buffer from nla_reserve_64bit() is not zeroed and the skipped regions contain uninitialized skb data can be sent to userspace. Fixes: 1ed4d92458a9 ("bpf: INET_DIAG support in bpf_sk_storage") Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260423222356.155387-1-ameryhung@gmail.com
2026-04-24netfilter: nf_conntrack_sip: don't use simple_strtoulFlorian Westphal
Replace unsafe port parsing in epaddr_len(), ct_sip_parse_header_uri(), and ct_sip_parse_request() with a new sip_parse_port() helper that validates each digit against the buffer limit, eliminating the use of simple_strtoul() which assumes NUL-terminated strings. The previous code dereferenced pointers without bounds checks after sip_parse_addr() and relied on simple_strtoul() on non-NUL-terminated skb data. A port that reaches the buffer limit without a trailing character is also rejected as malformed. Also get rid of all simple_strtoul() usage in conntrack, prefer a stricter version instead. There are intentional changes: - Bail out if number is > UINT_MAX and indicate a failure, same for too long sequences. While we do accept 05535 as port 5535, we will not accept e.g. 'sip:10.0.0.1:005060'. While its syntactically valid under RFC 3261, we should restrict this to not waste cycles when presented with malformed packets with 64k '0' characters. - Force base 10 in ct_sip_parse_numerical_param(). This is used to fetch 'expire=' and 'rports='; both are expected to use base-10. - In nf_nat_sip.c, only accept the parsed value if its within the 1k-64k range. - epaddr_len now returns 0 if the port is invalid, as it already does for invalid ip addresses. This is intentional. nf_conntrack_sip performs lots of guesswork to find the right parts of the message to parse. Being stricter could break existing setups. Connection tracking helpers are designed to allow traffic to pass, not to block it. Based on an earlier patch from Jenny Guanni Qu <qguanni@gmail.com>. Fixes: 05e3ced297fe ("[NETFILTER]: nf_conntrack_sip: introduce SIP-URI parsing helper") Reported-by: Klaudia Kloc <klaudia@vidocsecurity.com> Reported-by: Dawid Moczadło <dawid@vidocsecurity.com> Reported-by: Jenny Guanni Qu <qguanni@gmail.com>. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-24netfilter: reject zero shift in nft_bitwiseKai Ma
Reject zero shift operands for nft_bitwise left and right shift expressions during initialization. The carry propagation logic computes the carry from the adjacent 32-bit word using BITS_PER_TYPE(u32) - shift. A zero shift operand turns this into a 32-bit shift, which is undefined behaviour. Reject zero shift operands in the control plane, alongside the existing check for values greater than or equal to 32, so malformed rules never reach the packet path. Fixes: 567d746b55bc ("netfilter: bitwise: add support for shifts.") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Kai Ma <k4729.23098@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-24netfilter: xt_policy: fix strict mode inbound policy matchingJiexun Wang
match_policy_in() walks sec_path entries from the last transform to the first one, but strict policy matching needs to consume info->pol[] in the same forward order as the rule layout. Derive the strict-match policy position from the number of transforms already consumed so that multi-element inbound rules are matched consistently. Fixes: c4b885139203 ("[NETFILTER]: x_tables: replace IPv4/IPv6 policy match by address family independant version") Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-24Merge tag 'net-deletions' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking deletions from Jakub Kicinski: "Delete some obsolete networking code Old code like amateur radio and NFC have long been a burden to core networking developers. syzbot loves to find bugs in BKL-era code, and noobs try to fix them. If we want to have a fighting chance of surviving the LLM-pocalypse this code needs to find a dedicated owner or get deleted. We've talked about these deletions multiple times in the past and every time someone wanted the code to stay. It is never very clear to me how many of those people actually use the code vs are just nostalgic to see it go. Amateur radio did have occasional users (or so I think) but most users switched to user space implementations since its all super slow stuff. Nobody stepped up to maintain the kernel code. We were lucky enough to find someone who wants to help with NFC so we're giving that a chance. Let's try to put the rest of this code behind us" * tag 'net-deletions' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: drivers: net: 8390: wd80x3: Remove this driver drivers: net: 8390: ultra: Remove this driver drivers: net: 8390: AX88190: Remove this driver drivers: net: fujitsu: fmvj18x: Remove this driver drivers: net: smsc: smc91c92: Remove this driver drivers: net: smsc: smc9194: Remove this driver drivers: net: amd: nmclan: Remove this driver drivers: net: amd: lance: Remove this driver drivers: net: 3com: 3c589: Remove this driver drivers: net: 3com: 3c574: Remove this driver drivers: net: 3com: 3c515: Remove this driver drivers: net: 3com: 3c509: Remove this driver net: packetengines: remove obsolete yellowfin driver and vendor dir net: packetengines: remove obsolete hamachi driver net: remove unused ATM protocols and legacy ATM device drivers net: remove ax25 and amateur radio (hamradio) subsystem net: remove ISDN subsystem and Bluetooth CMTP caif: remove CAIF NETWORK LAYER
2026-04-23bpf: Fix NULL pointer dereference in bpf_skb_fib_lookup()Weiming Shi
When tot_len is not provided by the user, bpf_skb_fib_lookup() resolves the FIB result's output device via dev_get_by_index_rcu() to check skb forwardability and fill in mtu_result. The returned pointer is dereferenced without a NULL check. If the device is concurrently unregistered, dev_get_by_index_rcu() returns NULL and is_skb_forwardable() crashes at dev->flags: KASAN: null-ptr-deref in range [0x00000000000000b0-0x00000000000000b7] Call Trace: is_skb_forwardable (include/linux/netdevice.h:4365) bpf_skb_fib_lookup (net/core/filter.c:6446) bpf_prog_test_run_skb (net/bpf/test_run.c) __sys_bpf (kernel/bpf/syscall.c) Add the missing NULL check, returning -ENODEV to be consistent with how bpf_ipv4_fib_lookup() and bpf_ipv6_fib_lookup() handle the same condition. Fixes: 4f74fede40df ("bpf: Add mtu checking to FIB forwarding helper") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://patch.msgid.link/20260423183831.1325480-2-bestswngs@gmail.com
2026-04-23sockmap: Fix sk_psock_drop() race vs sock_map_{unhash,close,destroy}().Kuniyuki Iwashima
syzbot reported a splat in sock_map_destroy() [0], where psock was NULL even though sk->sk_prot still pointed to tcp_bpf_prots[][]. The stack trace shows how badly the path was excercised, see inet_release() calls tcp_close(), not sock_map_close() yet, but finally reaching sock_map_destroy(). The root cause is a lack of synchronisation. Even if sk_psock_get() fails to bump psock->refcnt, it does not guarantee that sk_psock_drop() has finished, and thus sk->sk_prot might not have been restored to the original one. Commit 4b4647add7d3 ("sock_map: avoid race between sock_map_close and sk_psock_put") attempted to address this, but it was insufficient for two reasons. It did not cover sock_map_unhash() and sock_map_destroy(), and it missed the corner case where sk_psock() is NULL. On non-x86 platforms, sk_psock_restore_proto(sk, psock) and rcu_assign_sk_user_data(sk, NULL) can be reordered because there is no address dependency between sk->sk_prot and sk->sk_user_data. sk_psock_get() returning NULL implies nothing about sk->sk_prot. Let's simply retry sk_psock_get() in the unlikely case. Note that we cannot avoid loop even if we added memory barrier in sk_psock_drop() and sock_map_psock_get_checked(). Also note that sock_map_destroy() cannot be called from softirq while sock_map_close() has also been running. It is because sock_map_destroy() requires SOCK_DEAD, so sock_map_destroy() cannot happen until sock_map_close() has finished the saved_close() (which is tcp_close()). [0]: WARNING: CPU: 1 PID: 8459 at net/core/sock_map.c:1667 sock_map_destroy+0x28b/0x2b0 net/core/sock_map.c:1667 Modules linked in: CPU: 1 UID: 0 PID: 8459 Comm: syz.0.1109 Not tainted syzkaller #0 PREEMPT_{RT,(full)} Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025 RIP: 0010:sock_map_destroy+0x28b/0x2b0 net/core/sock_map.c:1667 Code: 8b 36 49 83 c6 38 4c 89 f0 48 c1 e8 03 42 80 3c 38 00 74 08 4c 89 f7 e8 93 62 22 f9 4d 8b 3e e9 79 ff ff ff e8 a6 2b c3 f8 90 <0f> 0b 90 eb 9c e8 9b 2b c3 f8 4c 89 e7 be 03 00 00 00 e8 0e 4e bc RSP: 0018:ffffc9000d067be8 EFLAGS: 00010293 RAX: ffffffff88fb30aa RBX: ffff888024832000 RCX: ffff888024283b80 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 R10: dffffc0000000000 R11: ffffed100862e946 R12: dffffc0000000000 R13: ffff888024832000 R14: ffffffff995b2208 R15: ffffffff88fb2e20 FS: 0000555579a7d500(0000) GS:ffff8881269c2000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00002000000048c0 CR3: 000000003713a000 CR4: 00000000003526f0 Call Trace: <TASK> inet_csk_destroy_sock+0x166/0x3a0 net/ipv4/inet_connection_sock.c:1294 __tcp_close+0xcc1/0xfd0 net/ipv4/tcp.c:3262 tcp_close+0x28/0x110 net/ipv4/tcp.c:3274 inet_release+0x144/0x190 net/ipv4/af_inet.c:435 __sock_release net/socket.c:649 [inline] sock_close+0xc0/0x240 net/socket.c:1439 __fput+0x45b/0xa80 fs/file_table.c:468 task_work_run+0x1d4/0x260 kernel/task_work.c:227 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop+0xec/0x110 kernel/entry/common.c:43 exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline] syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline] syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline] do_syscall_64+0x2bd/0x3b0 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f265847ebe9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffd158dfbd8 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4 RAX: 0000000000000000 RBX: 000000000002ddb0 RCX: 00007f265847ebe9 RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003 RBP: 00007f26586a7da0 R08: 0000000000000001 R09: 0000000e158dfecf R10: 0000001b30a20000 R11: 0000000000000246 R12: 00007f26586a5fac R13: 00007f26586a5fa0 R14: ffffffffffffffff R15: 00007ffd158dfcf0 </TASK> Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks") Fixes: b05545e15e1f ("bpf: sockmap, fix transition through disconnect without close") Fixes: d8616ee2affc ("bpf, sockmap: Fix sk->sk_forward_alloc warn_on in sk_stream_kill_queues") Reported-by: syzbot+b0842d38af58376d1fdc@syzkaller.appspotmail.com Closes: https://lore.kernel.org/bpf/69cec5ef.050a0220.2dbe29.0009.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://patch.msgid.link/20260420194846.1089595-1-kuniyu@google.com
2026-04-23bpf: Fix NULL pointer dereference in bpf_sk_storage_clone and diag pathsWeiming Shi
bpf_selem_unlink_nofail() sets SDATA(selem)->smap to NULL before removing the selem from the storage hlist. A concurrent RCU reader in bpf_sk_storage_clone() can observe the selem still on the list with smap already NULL, causing a NULL pointer dereference. general protection fault, probably for non-canonical address 0xdffffc000000000a: KASAN: null-ptr-deref in range [0x0000000000000050-0x0000000000000057] RIP: 0010:bpf_sk_storage_clone+0x1cd/0xaa0 net/core/bpf_sk_storage.c:174 Call Trace: <IRQ> sk_clone+0xfed/0x1980 net/core/sock.c:2591 inet_csk_clone_lock+0x30/0x760 net/ipv4/inet_connection_sock.c:1222 tcp_create_openreq_child+0x35/0x2680 net/ipv4/tcp_minisocks.c:571 tcp_v4_syn_recv_sock+0x123/0xf90 net/ipv4/tcp_ipv4.c:1729 tcp_check_req+0x8e1/0x2580 include/net/tcp.h:855 tcp_v4_rcv+0x1845/0x3b80 net/ipv4/tcp_ipv4.c:2347 Add a NULL check for smap in bpf_sk_storage_clone(). bpf_sk_storage_diag_put_all() has the same issue. Add a NULL check and pass the validated smap directly to diag_get(), which is refactored to take smap as a parameter instead of reading it internally. bpf_sk_storage_diag_put() uses diag->maps[i] which is always valid under its refcount, so diag->maps[i] is passed directly to diag_get(). Fixes: 5d800f87d0a5 ("bpf: Support lockless unlink when freeing map or local storage") Reported-by: Xiang Mei <xmei5@asu.edu> Acked-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260422065411.1007737-2-bestswngs@gmail.com
2026-04-23Merge tag 'net-7.1-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Netfilter. Steady stream of fixes. Last two weeks feel comparable to the two weeks before the merge window. Lots of AI-aided bug discovery. A newer big source is Sashiko/Gemini (Roman Gushchin's system), which points out issues in existing code during patch review (maybe 25% of fixes here likely originating from Sashiko). Nice thing is these are often fixed by the respective maintainers, not drive-bys. Current release - new code bugs: - kconfig: MDIO_PIC64HPSC should depend on ARCH_MICROCHIP Previous releases - regressions: - add async ndo_set_rx_mode and switch drivers which we promised to be called under the per-netdev mutex to it - dsa: remove duplicate netdev_lock_ops() for conduit ethtool ops - hv_sock: report EOF instead of -EIO for FIN - vsock/virtio: fix MSG_PEEK calculation on bytes to copy Previous releases - always broken: - ipv6: fix possible UAF in icmpv6_rcv() - icmp: validate reply type before using icmp_pointers - af_unix: drop all SCM attributes for SOCKMAP - netfilter: fix a number of bugs in the osf (OS fingerprinting) - eth: intel: fix timestamp interrupt configuration for E825C Misc: - bunch of data-race annotations" * tag 'net-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (148 commits) rxrpc: Fix error handling in rxgk_extract_token() rxrpc: Fix re-decryption of RESPONSE packets rxrpc: Fix rxrpc_input_call_event() to only unshare DATA packets rxrpc: Fix missing validation of ticket length in non-XDR key preparsing rxgk: Fix potential integer overflow in length check rxrpc: Fix conn-level packet handling to unshare RESPONSE packets rxrpc: Fix potential UAF after skb_unshare() failure rxrpc: Fix rxkad crypto unalignment handling rxrpc: Fix memory leaks in rxkad_verify_response() net: rds: fix MR cleanup on copy error m68k: mvme147: Make me the maintainer net: txgbe: fix firmware version check selftests/bpf: check epoll readiness during reuseport migration tcp: call sk_data_ready() after listener migration vhost_net: fix sleeping with preempt-disabled in vhost_net_busy_poll() ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim tipc: fix double-free in tipc_buf_append() llc: Return -EINPROGRESS from llc_ui_connect() ipv4: icmp: validate reply type before using icmp_pointers selftests/net: packetdrill: cover RFC 5961 5.2 challenge ACK on both edges ...
2026-04-23rxrpc: Fix error handling in rxgk_extract_token()David Howells
Fix a missing bit of error handling in rxgk_extract_token(): in the event that rxgk_decrypt_skb() returns -ENOMEM, it should just return that rather than continuing on (for anything else, it generates an abort). Fixes: 64863f4ca494 ("rxrpc: Fix unhandled errors in rxgk_verify_packet_integrity()") Closes: https://sashiko.dev/#/patchset/20260422161438.2593376-4-dhowells@redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260423200909.3049438-4-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23rxrpc: Fix re-decryption of RESPONSE packetsDavid Howells
If a RESPONSE packet gets a temporary failure during processing, it may end up in a partially decrypted state - and then get requeued for a retry. Fix this by just discarding the packet; we will send another CHALLENGE packet and thereby elicit a further response. Similarly, discard an incoming CHALLENGE packet if we get an error whilst generating a RESPONSE; the server will send another CHALLENGE. Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Closes: https://sashiko.dev/#/patchset/20260422161438.2593376-4-dhowells@redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260423200909.3049438-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23rxrpc: Fix rxrpc_input_call_event() to only unshare DATA packetsDavid Howells
Fix rxrpc_input_call_event() to only unshare DATA packets and not ACK, ABORT, etc.. And with that, rxrpc_input_packet() doesn't need to take a pointer to the pointer to the packet, so change that to just a pointer. Fixes: 1f2740150f90 ("rxrpc: Fix potential UAF after skb_unshare() failure") Closes: https://sashiko.dev/#/patchset/20260422161438.2593376-4-dhowells@redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260423200909.3049438-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23rxrpc: Fix missing validation of ticket length in non-XDR key preparsingAnderson Nascimento
In rxrpc_preparse(), there are two paths for parsing key payloads: the XDR path (for large payloads) and the non-XDR path (for payloads <= 28 bytes). While the XDR path (rxrpc_preparse_xdr_rxkad()) correctly validates the ticket length against AFSTOKEN_RK_TIX_MAX, the non-XDR path fails to do so. This allows an unprivileged user to provide a very large ticket length. When this key is later read via rxrpc_read(), the total token size (toksize) calculation results in a value that exceeds AFSTOKEN_LENGTH_MAX, triggering a WARN_ON(). [ 2001.302904] WARNING: CPU: 2 PID: 2108 at net/rxrpc/key.c:778 rxrpc_read+0x109/0x5c0 [rxrpc] Fix this by adding a check in the non-XDR parsing path of rxrpc_preparse() to ensure the ticket length does not exceed AFSTOKEN_RK_TIX_MAX, bringing it into parity with the XDR parsing logic. Fixes: 8a7a3eb4ddbe ("KEYS: RxRPC: Use key preparsing") Fixes: 84924aac08a4 ("rxrpc: Fix checker warning") Reported-by: Anderson Nascimento <anderson@allelesecurity.com> Signed-off-by: Anderson Nascimento <anderson@allelesecurity.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-7-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23rxgk: Fix potential integer overflow in length checkDavid Howells
Fix potential integer overflow in rxgk_extract_token() when checking the length of the ticket. Rather than rounding up the value to be tested (which might overflow), round down the size of the available data. Fixes: 2429a1976481 ("rxrpc: Fix untrusted unsigned subtract") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-6-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23rxrpc: Fix conn-level packet handling to unshare RESPONSE packetsDavid Howells
The security operations that verify the RESPONSE packets decrypt bits of it in place - however, the sk_buff may be shared with a packet sniffer, which would lead to the sniffer seeing an apparently corrupt packet (actually decrypted). Fix this by handing a copy of the packet off to the specific security handler if the packet was cloned. Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-5-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23rxrpc: Fix potential UAF after skb_unshare() failureDavid Howells
If skb_unshare() fails to unshare a packet due to allocation failure in rxrpc_input_packet(), the skb pointer in the parent (rxrpc_io_thread()) will be NULL'd out. This will likely cause the call to trace_rxrpc_rx_done() to oops. Fix this by moving the unsharing down to where rxrpc_input_call_event() calls rxrpc_input_call_packet(). There are a number of places prior to that where we ignore DATA packets for a variety of reasons (such as the call already being complete) for which an unshare is then avoided. And with that, rxrpc_input_packet() doesn't need to take a pointer to the pointer to the packet, so change that to just a pointer. Fixes: 2d1faf7a0ca3 ("rxrpc: Simplify skbuff accounting in receive path") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-4-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23rxrpc: Fix rxkad crypto unalignment handlingDavid Howells
Fix handling of a packet with a misaligned crypto length. Also handle non-ENOMEM errors from decryption by aborting. Further, remove the WARN_ON_ONCE() so that it can't be remotely triggered (a trace line can still be emitted). Fixes: f93af41b9f5f ("rxrpc: Fix missing error checks for rxkad encryption/decryption failure") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23rxrpc: Fix memory leaks in rxkad_verify_response()David Howells
Fix rxkad_verify_response() to free the ticket and the server key under all circumstances by initialising the ticket pointer to NULL and then making all paths through the function after the first allocation has been done go through a single common epilogue that just releases everything - where all the releases skip on a NULL pointer. Fixes: 57af281e5389 ("rxrpc: Tidy up abort generation infrastructure") Fixes: ec832bd06d6f ("rxrpc: Don't retain the server key in the connection") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23net: remove unused ATM protocols and legacy ATM device driversJakub Kicinski
Remove the ATM protocol modules and PCI/SBUS ATM device drivers that are no longer in active use. The ATM core protocol stack, PPPoATM, BR2684, and USB DSL modem drivers (drivers/usb/atm/) are retained in-tree to maintain PPP over ATM (PPPoA) and PPPoE-over-BR2684 support for DSL connections. The Solos ADSL2+ PCI driver is also retained. Removed ATM protocol modules: - net/atm/clip.c - Classical IP over ATM (RFC 2225) - net/atm/lec.c - LAN Emulation Client (LANE) - net/atm/mpc.c, mpoa_caches.c, mpoa_proc.c - Multi-Protocol Over ATM Removed PCI/SBUS ATM device drivers (drivers/atm/): - adummy, atmtcp - software/testing ATM devices - eni - Efficient Networks ENI155P (OC-3, ~1995) - fore200e - FORE Systems 200E PCI/SBUS (OC-3, ~1999) - he - ForeRunner HE (OC-3/OC-12, ~2000) - idt77105 - IDT 77105 25 Mbps ATM PHY - idt77252 - IDT 77252 NICStAR II (OC-3, ~2000) - iphase - Interphase ATM PCI (OC-3/DS3/E3) - lanai - Efficient Networks Speedstream 3010 - nicstar - IDT 77201 NICStAR (155/25 Mbps, ~1999) - suni - PMC S/UNI SONET PHY library Also clean up references in: - net/bridge/ - remove ATM LANE hook (br_fdb_test_addr_hook, br_fdb_test_addr) - net/core/dev.c - remove br_fdb_test_addr_hook export - defconfig files - remove ATM driver config options The removed code is moved to an out-of-tree module package (mod-orphan). Acked-by: Andy Shevchenko <andriy.shevchenko@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260422041846.2035118-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23net: rds: fix MR cleanup on copy errorAo Zhou
__rds_rdma_map() hands sg/pages ownership to the transport after get_mr() succeeds. If copying the generated cookie back to user space fails after that point, the error path must not free those resources again before dropping the MR reference. Remove the duplicate unpin/free from the put_user() failure branch so that MR teardown is handled only through the existing final cleanup path. Fixes: 0d4597c8c5ab ("net/rds: Track user mapped pages through special API") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Ao Zhou <draw51280@163.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Allison Henderson <achender@kernel.org> Link: https://patch.msgid.link/79c8ef73ec8e5844d71038983940cc2943099baf.1776764247.git.draw51280@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23tcp: call sk_data_ready() after listener migrationZhenzhong Wu
When inet_csk_listen_stop() migrates an established child socket from a closing listener to another socket in the same SO_REUSEPORT group, the target listener gets a new accept-queue entry via inet_csk_reqsk_queue_add(), but that path never notifies the target listener's waiters. A nonblocking accept() still works because it checks the queue directly, but poll()/epoll_wait() waiters and blocking accept() callers can also remain asleep indefinitely. Call READ_ONCE(nsk->sk_data_ready)(nsk) after a successful migration in inet_csk_listen_stop(). However, after inet_csk_reqsk_queue_add() succeeds, the ref acquired in reuseport_migrate_sock() is effectively transferred to nreq->rsk_listener. Another CPU can then dequeue nreq via accept() or listener shutdown, hit reqsk_put(), and drop that listener ref. Since listeners are SOCK_RCU_FREE, wrap the post-queue_add() dereferences of nsk in rcu_read_lock()/rcu_read_unlock(), which also covers the existing sock_net(nsk) access in that path. The reqsk_timer_handler() path does not need the same changes for two reasons: half-open requests become readable only after the final ACK, where tcp_child_process() already wakes the listener; and once nreq is visible via inet_ehash_insert(), the success path no longer touches nsk directly. Fixes: 54b92e841937 ("tcp: Migrate TCP_ESTABLISHED/TCP_SYN_RECV sockets in accept queues.") Cc: stable@vger.kernel.org Suggested-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Zhenzhong Wu <jt26wzz@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260422024554.130346-2-jt26wzz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_limDaniel Borkmann
Commit 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination options") added net.ipv6.max_{hbh,dst}_opts_{cnt,len} and applied them in ip6_parse_tlv(), the generic TLV walker invoked from ipv6_destopt_rcv() and ipv6_parse_hopopts(). ip6_tnl_parse_tlv_enc_lim() does not go through ip6_parse_tlv(); it has its own hand-rolled TLV scanner inside its NEXTHDR_DEST branch which looks for IPV6_TLV_TNL_ENCAP_LIMIT. That inner loop is bounded only by optlen, which can be up to 2048 bytes. Stuffing the Destination Options header with 2046 Pad1 (type=0) entries advances the scanner a single byte at a time, yielding ~2000 TLV iterations per extension header. Reusing max_dst_opts_cnt to bound the TLV iterations, matching the semantics from 47d3d7ac656a, would require duplicating ip6_parse_tlv() to also validate Pad1/PadN payload. It would also mandate enforcing max_dst_opts_len, since otherwise an attacker shifts the axis to few options with a giant PadN and recovers the original DoS. Allowing up to 8 options before the tunnel encapsulation limit TLV is liberal enough; in practice encap limit is the first TLV. Thus, go with a hard-coded limit IP6_TUNNEL_MAX_DEST_TLVS (8). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Justin Iurman <justin.iurman@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23tipc: fix double-free in tipc_buf_append()Lee Jones
tipc_msg_validate() can potentially reallocate the skb it is validating, freeing the old one. In tipc_buf_append(), it was being called with a pointer to a local variable which was a copy of the caller's skb pointer. If the skb was reallocated and validation subsequently failed, the error handling path would free the original skb pointer, which had already been freed, leading to double-free. Fix this by checking if head now points to a newly allocated reassembled skb. If it does, reassign *headbuf for later freeing operations. Fixes: d618d09a68e4 ("tipc: enforce valid ratio between skb truesize and contents") Suggested-by: Tung Nguyen <tung.quang.nguyen@est.tech> Signed-off-by: Lee Jones <lee@kernel.org> Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23llc: Return -EINPROGRESS from llc_ui_connect()Ernestas Kulik
Given a zero sk_sndtimeo, llc_ui_connect() skips waiting for state change and returns 0, confusing userspace applications that will assume the socket is connected, making e.g. getpeername() calls error out. More specifically, the issue was discovered in libcoap, where newly-added AF_LLC socket support was behaving differently from AF_INET connections due to EINPROGRESS handling being skipped. Set rc to -EINPROGRESS if connect() would not block, akin to AF_INET sockets. Signed-off-by: Ernestas Kulik <ernestas.k@iconn-networks.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260421060304.285419-1-ernestas.k@iconn-networks.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23ipv4: icmp: validate reply type before using icmp_pointersRuide Cao
Extended echo replies use ICMP_EXT_ECHOREPLY as the outbound reply type. That value is outside the range covered by icmp_pointers[], which only describes the traditional ICMP types up to NR_ICMP_TYPES. Avoid consulting icmp_pointers[] for reply types outside that range, and use array_index_nospec() for the remaining in-range lookup. Normal ICMP replies keep their existing behavior unchanged. Fixes: d329ea5bd884 ("icmp: add response to RFC 8335 PROBE messages") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Ruide Cao <caoruide123@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/0dace90c01a5978e829ca741ef684dbd7304ce62.1776628519.git.caoruide123@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23tcp: send a challenge ACK on SEG.ACK > SND.NXTJiayuan Chen
RFC 5961 Section 5.2 validates an incoming segment's ACK value against the range [SND.UNA - MAX.SND.WND, SND.NXT] and states: "All incoming segments whose ACK value doesn't satisfy the above condition MUST be discarded and an ACK sent back." Commit 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") opted Linux into this mitigation and implements the challenge ACK on the lower side (SEG.ACK < SND.UNA - MAX.SND.WND), but the symmetric upper side (SEG.ACK > SND.NXT) still takes the pre-RFC-5961 path and silently returns SKB_DROP_REASON_TCP_ACK_UNSENT_DATA, even though RFC 793 Section 3.9 (now RFC 9293 Section 3.10.7.4) has always required: "If the ACK acknowledges something not yet sent (SEG.ACK > SND.NXT) then send an ACK, drop the segment, and return." Complete the mitigation by sending a challenge ACK on that branch, reusing the existing tcp_send_challenge_ack() path which already enforces the per-socket RFC 5961 Section 7 rate limit via __tcp_oow_rate_limited(). FLAG_NO_CHALLENGE_ACK is honoured for symmetry with the lower-edge case. Update the existing tcp_ts_recent_invalid_ack.pkt selftest, which drives this exact path, to consume the new challenge ACK. Fixes: 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260422123605.320000-2-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23net/smc: avoid early lgr access in smc_clc_wait_msgRuijie Li
A CLC decline can be received while the handshake is still in an early stage, before the connection has been associated with a link group. The decline handling in smc_clc_wait_msg() updates link-group level sync state for first-contact declines, but that state only exists after link group setup has completed. Guard the link-group update accordingly and keep the per-socket peer diagnosis handling unchanged. This preserves the existing sync_err handling for established link-group contexts and avoids touching link-group state before it is available. Fixes: 0cfdd8f92cac ("smc: connection and link group creation") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Ruijie Li <ruijieli51@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Link: https://patch.msgid.link/08c68a5c817acf198cce63d22517e232e8d60718.1776850759.git.ruijieli51@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23hv_sock: Return -EIO for malformed/short packetsDexuan Cui
Commit f63152958994 fixes a regression, however it fails to report an error for malformed/short packets -- normally we should never see such packets, but let's report an error for them just in case. Fixes: f63152958994 ("hv_sock: Report EOF instead of -EIO for FIN") Cc: stable@vger.kernel.org Signed-off-by: Dexuan Cui <decui@microsoft.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20260423064811.1371749-1-decui@microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23net: remove ax25 and amateur radio (hamradio) subsystemJakub Kicinski
Remove the amateur radio (AX.25, NET/ROM, ROSE) protocol implementation and all associated hamradio device drivers from the kernel tree. This set of protocols has long been a huge bug/syzbot magnet, and since nobody stepped up to help us deal with the influx of the AI-generated bug reports we need to move it out of tree to protect our sanity. The code is moved to an out-of-tree repo: https://github.com/linux-netdev/mod-orphan if it's cleaned up and reworked there we can accept it back. Minimal stub headers are kept for include/net/ax25.h (AX25_P_IP, AX25_ADDR_LEN, ax25_address) and include/net/rose.h (ROSE_ADDR_LEN) so that the conditional integration code in arp.c and tun.c continues to compile and work when the out-of-tree modules are loaded. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Carlos Bilbao <carlos.bilbao@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://patch.msgid.link/20260421021824.1293976-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-23net: remove ISDN subsystem and Bluetooth CMTPJakub Kicinski
Remove the ISDN (mISDN, CAPI) subsystem and Bluetooth CMTP protocol from the kernel tree. ISDN is a pretty old technology and it's unclear whether anyone still uses it. I went over the last few years of git history and all the commits are either tree-wide conversions or syzbot/static analyzer fixes. When we discussed removal in the past IIRC there were some concerns about ISDN still being used in parts of Germany. Unfortunately, the code base is quite old, none of the current maintainers are familiar with it and AI tools will have a field day finding bugs here. Delete this code and preserve it in an out-of-tree repository for any remaining users: https://github.com/linux-netdev/mod-orphan UAPI constants AF_ISDN/PF_ISDN and the SELinux isdn_socket class are preserved for ABI stability, but the rest of uAPI is removed. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260421022108.1299678-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-23caif: remove CAIF NETWORK LAYERJakub Kicinski
Remove CAIF (Communication CPU to Application CPU Interface), the ST-Ericsson modem protocol. The subsystem has been orphaned since 2013. The last meaningful changes from the maintainers were in March 2013: a8c7687bf216 ("caif_virtio: Check that vringh_config is not null") b2273be8d2df ("caif_virtio: Use vringh_notify_enable correctly") 0d2e1a2926b1 ("caif_virtio: Introduce caif over virtio") Not-so-coincidentally, according to "the Internet" ST-Ericsson officially shut down its modem joint venture in Aug 2013. If anyone is using this code please yell! In the 13 years since, the code has accumulated 200 non-merge commits, of which 71 were cross-tree API changes, 21 carried Fixes: tags, and the remaining ~110 were cleanups, doc conversions, treewide refactors, and one partial removal (caif_hsi, ca75bcf0a83b). We are still getting fixes to this code, in the last 10 days there were 3 reports on security@ about CAIF that I have been CCed on. UAPI constants (AF_CAIF, ARPHRD_CAIF, N_CAIF, VIRTIO_ID_CAIF) and the SELinux classmap entry are intentionally kept for ABI stability. Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Linus Walleij <linusw@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260416182829.1440262-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23mptcp: sync the msk->sndbuf at accept() timeGang Yan
On passive MPTCP connections, the msk sndbuf is not updated correctly. The root cause is an order issue in the accept path: - tcp_check_req() -> subflow_syn_recv_sock() -> mptcp_sk_clone_init() calls __mptcp_propagate_sndbuf() to copy the ssk sndbuf into msk - Later, tcp_child_process() -> tcp_init_transfer() -> tcp_sndbuf_expand() grows the ssk sndbuf. So __mptcp_propagate_sndbuf() runs before the ssk sndbuf has been expanded and the msk ends up with a much smaller sndbuf than the subflow: MPTCP: msk->sndbuf:20480, msk->first->sndbuf:2626560 Fix this by moving the __mptcp_propagate_sndbuf() call from mptcp_sk_clone_init() -- the ssk sndbuf is not yet finalized there -- to __mptcp_propagate_sndbuf() at accept() time, when the ssk sndbuf has been fully expanded by tcp_sndbuf_expand(). Fixes: 8005184fd1ca ("mptcp: refactor sndbuf auto-tuning") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/602 Signed-off-by: Gang Yan <yangang@kylinos.cn> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260420-net-mptcp-sync-sndbuf-accept-v1-1-e3523e3aeb44@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-23vsock/virtio: fix MSG_ZEROCOPY pinned-pages accountingStefano Garzarella
virtio_transport_init_zcopy_skb() uses iter->count as the size argument for msg_zerocopy_realloc(), which in turn passes it to mm_account_pinned_pages() for RLIMIT_MEMLOCK accounting. However, this function is called after virtio_transport_fill_skb() has already consumed the iterator via __zerocopy_sg_from_iter(), so on the last skb, iter->count will be 0, skipping the RLIMIT_MEMLOCK enforcement. Pass pkt_len (the total bytes being sent) as an explicit parameter to virtio_transport_init_zcopy_skb() instead of reading the already-consumed iter->count. This matches TCP and UDP, which both call msg_zerocopy_realloc() with the original message size. Fixes: 581512a6dc93 ("vsock/virtio: MSG_ZEROCOPY flag support") Reported-by: Yiming Qian <yimingqian591@gmail.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260420132051.217589-1-sgarzare@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-238021q: delete cleared egress QoS mappingsLongxuan Yu
vlan_dev_set_egress_priority() currently keeps cleared egress priority mappings in the hash as tombstones. Repeated set/clear cycles with distinct skb priorities therefore accumulate mapping nodes until device teardown and leak memory. Delete mappings when vlan_prio is cleared instead of keeping tombstones. Now that the egress mapping lists are RCU protected, the node can be unlinked safely and freed after a grace period. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@kernel.org Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Longxuan Yu <ylong030@ucr.edu> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Link: https://patch.msgid.link/ecfa6f6ce2467a42647ff4c5221238ae85b79a59.1776647968.git.yuantan098@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-238021q: use RCU for egress QoS mappingsLongxuan Yu
The TX fast path and reporting paths walk egress QoS mappings without RTNL. Convert the mapping lists to RCU-protected pointers, use RCU reader annotations in readers, and defer freeing mapping nodes with an embedded rcu_head. This prepares the egress QoS mapping code for safe removal of mapping nodes in a follow-up change while preserving the current behavior. Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Longxuan Yu <ylong030@ucr.edu> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Link: https://patch.msgid.link/9136768189f8c6d3f824f476c62d2fa1111688e8.1776647968.git.yuantan098@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-23Merge tag 'nf-26-04-20' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following batch contains Netfilter/IPVS fixes for net: 1) nft_osf actually only supports IPv4, restrict it. 2) Address possible division by zero in nfnetlink_osf, from Xiang Mei. 3) Remove unsafe use of sprintf to fix possible buffer overflow in the SIP NAT helper, from Florian Westphal. 4) Restrict xt_mac, xt_owner and xt_physdev to inet families only; xt_realm is only for ipv4, otherwise null-pointer-deref is possible. 5) Use kfree_rcu() in nat core to release hooks, this can be an issue once nfnetlink_hook gets support to dump NAT hook information, not currently a real issue but better fix it now. From Florian Westphal. 6) Fix MTU checks in IPVS, from Yingnan Zhang. 7) Fix possible out-of-bounds when matching TCP options in nfnetlink_osf, from Fernando Fernandez Mancera. 8) Fix potential nul-ptr-deref in ttl check in nfnetlink_osf, remove useless loop to fix this, also from Fernando. This is a smaller batch, there are more patches pending in the queue to arm another pull request as soon as this is considered good enough. AI might complain again about one more issue regarding osf and big-endian arches in osf but this batch is targetting crash fixes for osf at this stage. netfilter pull request 26-04-20 * tag 'nf-26-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nfnetlink_osf: fix potential NULL dereference in ttl check netfilter: nfnetlink_osf: fix out-of-bounds read on option matching ipvs: fix MTU check for GSO packets in tunnel mode netfilter: nat: use kfree_rcu to release ops netfilter: xtables: restrict several matches to inet family netfilter: conntrack: remove sprintf usage netfilter: nfnetlink_osf: fix divide-by-zero in OSF_WSS_MODULO netfilter: nft_osf: restrict it to ipv4 ==================== Link: https://patch.msgid.link/20260420220215.111510-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-22net/sched: sch_sfb: annotate data-races in sfb_dump_stats()Eric Dumazet
sfb_dump_stats() only runs with RTNL held, reading fields that can be changed in qdisc fast path. Add READ_ONCE()/WRITE_ONCE() annotations. Alternative would be to acquire the qdisc spinlock, but our long-term goal is to make qdisc dump operations lockless as much as we can. tc_sfb_xstats fields don't need to be latched atomically, otherwise this bug would have been caught earlier. Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260421141655.3953721-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-22net/sched: sch_red: annotate data-races in red_dump_stats()Eric Dumazet
red_dump_stats() only runs with RTNL held, reading fields that can be changed in qdisc fast path. Add READ_ONCE()/WRITE_ONCE() annotations. Alternative would be to acquire the qdisc spinlock, but our long-term goal is to make qdisc dump operations lockless as much as we can. tc_red_xstats fields don't need to be latched atomically, otherwise this bug would have been caught earlier. Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260421142309.3964322-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-22net/sched: sch_fq_codel: remove data-races from fq_codel_dump_stats()Eric Dumazet
fq_codel_dump_stats() acquires the qdisc spinlock a bit too late. Move this acquisition before we fill st.qdisc_stats with live data. Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260421142509.3967231-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-22net/sched: sch_pie: annotate data-races in pie_dump_stats()Eric Dumazet
pie_dump_stats() only runs with RTNL held, reading fields that can be changed in qdisc fast path. Add READ_ONCE()/WRITE_ONCE() annotations. Alternative would be to acquire the qdisc spinlock, but our long-term goal is to make qdisc dump operations lockless as much as we can. tc_pie_xstats fields don't need to be latched atomically, otherwise this bug would have been caught earlier. Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260421142944.4009941-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-22net_sched: sch_hhf: annotate data-races in hhf_dump_stats()Eric Dumazet
hhf_dump_stats() only runs with RTNL held, reading fields that can be changed in qdisc fast path. Add READ_ONCE()/WRITE_ONCE() annotations. Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260421143349.4052215-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-22net/rds: zero per-item info buffer before handing it to visitorsMichael Bommarito
rds_for_each_conn_info() and rds_walk_conn_path_info() both hand a caller-allocated on-stack u64 buffer to a per-connection visitor and then copy the full item_len bytes back to user space via rds_info_copy() regardless of how much of the buffer the visitor actually wrote. rds_ib_conn_info_visitor() and rds6_ib_conn_info_visitor() only write a subset of their output struct when the underlying rds_connection is not in state RDS_CONN_UP (src/dst addr, tos, sl and the two GIDs via explicit memsets). Several u32 fields (max_send_wr, max_recv_wr, max_send_sge, rdma_mr_max, rdma_mr_size, cache_allocs) and the 2-byte alignment hole between sl and cache_allocs remain as whatever stack contents preceded the visitor call and are then memcpy_to_user()'d out to user space. struct rds_info_rdma_connection and struct rds6_info_rdma_connection are the only rds_info_* structs in include/uapi/linux/rds.h that are not marked __attribute__((packed)), so they have a real alignment hole. The other info visitors (rds_conn_info_visitor, rds6_conn_info_visitor, rds_tcp_tc_info, ...) write all fields of their packed output struct today and are not known to be vulnerable, but a future visitor that adds a conditional write-path would have the same bug. Reproduction on a kernel built without CONFIG_INIT_STACK_ALL_ZERO=y: a local unprivileged user opens AF_RDS, sets SO_RDS_TRANSPORT=IB, binds to a local address on an RDMA-capable netdev (rxe soft-RoCE on any netdev is sufficient), sendto()'s any peer on the same subnet (fails cleanly but installs an rds_connection in the global hash in RDS_CONN_CONNECTING), then calls getsockopt(SOL_RDS, RDS_INFO_IB_CONNECTIONS). The returned 68-byte item contains 26 bytes of stack garbage including kernel text/data pointers: 0..7 0a 63 00 01 0a 63 00 02 src=10.99.0.1 dst=10.99.0.2 8..39 00 ... gids (memset-zeroed) 40..47 e0 92 a3 81 ff ff ff ff kernel pointer (max_send_wr) 48..55 7f 37 b5 81 ff ff ff ff kernel pointer (rdma_mr_max) 56..59 01 00 08 00 rdma_mr_size (garbage) 60..61 00 00 tos, sl 62..63 00 00 alignment padding 64..67 18 00 00 00 cache_allocs (garbage) Fix by zeroing the per-item buffer in both rds_for_each_conn_info() and rds_walk_conn_path_info() before invoking the visitor. This covers the IPv4/IPv6 IB visitors and hardens all current and future visitors against the same class of bug. No functional change for visitors that fully populate their output. Changes in v2: - retarget at the net tree (subject prefix "[PATCH net v2]", net/rds: prefix in the title) - pick up Reviewed-by tags from Sharath Srinivasan and Allison Henderson Fixes: ec16227e1414 ("RDS/IB: Infiniband transport") Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Reviewed-by: Sharath Srinivasan <sharath.srinivasan@oracle.com> Reviewed-by: Allison Henderson <achender@kernel.org> Assisted-by: Claude:claude-opus-4-7 Link: https://patch.msgid.link/20260418141047.3398203-1-michael.bommarito@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-22seg6: fix seg6 lwtunnel output redirect for L2 reduced encap modeAndrea Mayer
When SEG6_IPTUN_MODE_L2ENCAP_RED (L2ENCAP_RED) was introduced, the condition in seg6_build_state() that excludes L2 encap modes from setting LWTUNNEL_STATE_OUTPUT_REDIRECT was not updated to account for the new mode. As a consequence, L2ENCAP_RED routes incorrectly trigger seg6_output() on the output path, where the packet is silently dropped because skb_mac_header_was_set() fails on L3 packets. Extend the check to also exclude L2ENCAP_RED, consistent with L2ENCAP. Fixes: 13f0296be8ec ("seg6: add support for SRv6 H.L2Encaps.Red behavior") Cc: stable@vger.kernel.org Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: Justin Iurman <justin.iurman@gmail.com> Link: https://patch.msgid.link/20260418162838.31979-1-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-22sctp: fix sockets_allocated imbalance after sk_clone()Xin Long
sk_clone() increments sockets_allocated and sets the socket refcount to 2. SCTP performs additional accounting in sctp_clone_sock(), so the clone-time increment must be undone to avoid double counting. Note we cannot simply remove the SCTP-side increment, because the SCTP destroy path in sctp_destroy_sock() only decrements sockets_allocated when sp->ep is set, which may not be true for all failure paths in sctp_clone_sock(). Fixes: 16942cf4d3e3 ("sctp: Use sk_clone() in sctp_accept().") Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/af8d66f928dec3e9fcbee8d4a85b7d5a6b86f515.1776460180.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>