summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
13 daysMerge tag 'fuse-update-7.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: - Fix lots of bugs, most from the late 6.x era, but some going back to 2.6.x - Add subsystems (io-uring, passthrough) and respective maintainers (Bernd, Joanne and Amir) - Separate transport and fs layers (Miklos) - Don't block on cat /dev/fuse (Joanne) - Perform some refactoring in fuse-uring (Joanne) - Don't use bounce-buffer for READDIR reply in virtio-fs (Matthew Ochs) - Clean up documentation (Randy) - Improve tracing (Amir) - Extend page cache invalidation after DIO (Cheng Ding) - Invalidate readdir cache on epoch change (Jun Wu) - Misc cleanups * tag 'fuse-update-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (81 commits) fuse-uring: clear ent->fuse_req in commit_fetch error path fuse-uring: use named constants for io-uring iovec indices fuse-uring: refactor setting up copy state for payload copying fuse-uring: use enum types for header copying fuse-uring: refactor io-uring header copying from ring fuse-uring: refactor io-uring header copying to ring fuse-uring: separate next request fetching from sending logic fuse: invalidate readdir cache on epoch bump virtio-fs: avoid double-free on failed queue setup fuse: invalidate page cache after DIO and async DIO writes fuse: set ff->flock only on success fuse: clean up interrupt reading fuse: remove stray newline in fuse_dev_do_read() fuse: use READ_ONCE in fuse_chan_num_background() fuse: dax: Move long delayed work on system_dfl_long_wq fuse: add fuse_request_sent tracepoint fuse: Add SPDX ID lines to some files fuse: use QSTR() instead of QSTR_INIT() in fuse_get_dentry fuse: convert page array allocation to kcalloc() fuse: use current creds for backing files ...
13 daysocfs2: fix circular locking dependency in ocfs2_dio_end_io_writeAleksandr Nogikh
A circular locking dependency involves INODE_ALLOC_SYSTEM_INODE, EXTENT_ALLOC_SYSTEM_INODE, and ORPHAN_DIR_SYSTEM_INODE. 1. ocfs2_mknod() acquires INODE_ALLOC then EXTENT_ALLOC. 2. ocfs2_dio_end_io_write() acquires EXTENT_ALLOC for unwritten extents, then ORPHAN_DIR via ocfs2_del_inode_from_orphan() while still holding EXTENT_ALLOC. 3. ocfs2_wipe_inode() acquires ORPHAN_DIR then INODE_ALLOC via ocfs2_remove_inode. Break the cycle in ocfs2_dio_end_io_write() by freeing the allocation contexts (releasing EXTENT_ALLOC) before acquiring ORPHAN_DIR. WARNING: possible circular locking dependency detected ------------------------------------------------------ is trying to acquire lock: ffff8881e78b33a0 (&ocfs2_sysfile_lock_key[INODE_ALLOC_SYSTEM_INODE]){+.+.}-{4:4}, at: ocfs2_evict_inode+0x1539/0x43b0 fs/ocfs2/inode.c:1299 but task is already holding lock: ffff8881e78b4fa0 (&ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE]){+.+.}-{4:4}, at: ocfs2_evict_inode+0xe97/0x43b0 fs/ocfs2/inode.c:1299 the existing dependency chain (in reverse order) is: -> #2 (&ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE]){+.+.}-{4:4}: inode_lock include/linux/fs.h:1029 [inline] ocfs2_del_inode_from_orphan+0x12e/0x7a0 fs/ocfs2/namei.c:2728 ocfs2_dio_end_io+0xf9c/0x1370 fs/ocfs2/aops.c:2418 dio_complete+0x25b/0x790 fs/direct-io.c:281 -> #1 (&ocfs2_sysfile_lock_key[EXTENT_ALLOC_SYSTEM_INODE]){+.+.}-{4:4}: inode_lock include/linux/fs.h:1029 [inline] ocfs2_reserve_suballoc_bits+0x16d/0x4840 fs/ocfs2/suballoc.c:882 ocfs2_reserve_new_metadata_blocks+0x415/0x9a0 fs/ocfs2/suballoc.c:1078 ocfs2_mknod+0x10f3/0x2260 fs/ocfs2/namei.c:351 -> #0 (&ocfs2_sysfile_lock_key[INODE_ALLOC_SYSTEM_INODE]){+.+.}-{4:4}: __lock_acquire+0x15a5/0x2cf0 kernel/locking/lockdep.c:5237 lock_acquire+0x106/0x350 kernel/locking/lockdep.c:5868 down_write+0x96/0x200 kernel/locking/rwsem.c:1625 inode_lock include/linux/fs.h:1029 [inline] ocfs2_remove_inode fs/ocfs2/inode.c:733 [inline] ocfs2_wipe_inode fs/ocfs2/inode.c:896 [inline] ocfs2_delete_inode fs/ocfs2/inode.c:1157 [inline] ocfs2_evict_inode+0x1539/0x43b0 fs/ocfs2/inode.c:1299 Chain exists of: &ocfs2_sysfile_lock_key[INODE_ALLOC_SYSTEM_INODE] --> &ocfs2_sysfile_lock_key[EXTENT_ALLOC_SYSTEM_INODE] --> &ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE]); lock(&ocfs2_sysfile_lock_key[EXTENT_ALLOC_SYSTEM_INODE]); lock(&ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE]); lock(&ocfs2_sysfile_lock_key[INODE_ALLOC_SYSTEM_INODE]); *** DEADLOCK *** Link: https://lore.kernel.org/97c902a6-3bcf-43ea-9b70-f1f136a6c3f2@mail.kernel.org Fixes: d647c5b2fbf8 ("ocfs2: split transactions in dio completion to avoid credit exhaustion") Assisted-by: Gemini:gemini-3.1-pro-preview Gemini:gemini-3-flash-preview syzbot Reported-by: syzbot+b225d4dfce6219600c42@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b225d4dfce6219600c42 Link: https://syzkaller.appspot.com/ai_job?id=0b53ce1e-2972-4192-aa85-8097a702762c Signed-off-by: Aleksandr Nogikh <nogikh@google.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
13 daysocfs2: fix NULL h_transaction deref in ocfs2_assure_trans_creditsIan Bridges
[BUG] A direct write over unwritten extents can panic the kernel in ocfs2_assure_trans_credits() when the journal aborts during DIO completion. The crash is a general protection fault from a NULL pointer dereference. [CAUSE] ocfs2_dio_end_io_write() loops over a direct write's unwritten extents, marking each written under a single journal handle. If the journal aborts (for example after an I/O error) while the extent tree is being updated, the handle is left aborted with its transaction pointer cleared. The extent merge treats that failure as not critical and reports success, so the loop keeps using the handle. ocfs2_assure_trans_credits() reads the handle's remaining credits without first checking whether the handle is aborted, and that read dereferences the cleared transaction pointer. [FIX] A journal abort is recorded in the handle itself, so callers are expected to test the handle rather than rely on a returned error. Make ocfs2_assure_trans_credits() do that, as the other ocfs2 journal helpers already do, and return -EROFS when the handle is aborted. Link: https://lore.kernel.org/airKTsM1fRVN-Wj7@dev Fixes: be346c1a6eeb ("ocfs2: fix DIO failure due to insufficient transaction credits") Signed-off-by: Ian Bridges <icb@fastmail.org> Reported-by: syzbot+e9c15ff790cea6a0cfae@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=e9c15ff790cea6a0cfae Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
13 daysocfs2: avoid moving extents to occupied clustersKyle Zeng
For non-auto OCFS2_IOC_MOVE_EXT operations, userspace supplies a physical me_goal. ocfs2_move_extent() initializes new_phys_cpos from that goal and expects ocfs2_probe_alloc_group() to replace it with a free run in the target block group. The probe currently leaves *phys_cpos unchanged if the scan reaches the end of the group without finding a free run. An occupied goal at the last bit can therefore survive the probe and be passed to __ocfs2_move_extent(), which copies file data into a cluster still owned by another inode before the bitmap is updated. When the probe does find a free run, it also subtracts move_len from the ending bit. The start of an N-bit run ending at i is i - N + 1, so the current calculation can report the bit immediately before the free run. Clear *phys_cpos before scanning and use the correct free-run start. Callers already treat a zero result as -ENOSPC, so failed probes no longer continue with an occupied caller-controlled goal. Link: https://lore.kernel.org/20260611213510.16956-1-kylebot@openai.com Fixes: e6b5859cccfa ("Ocfs2/move_extents: helper to probe a proper region to move in an alloc group.") Fixes: 236b9254f8d1 ("ocfs2: fix non-auto defrag path not working issue") Assisted-by: Codex:gpt-5.5 Signed-off-by: Kyle Zeng <kylebot@openai.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
13 daysocfs2: fix UBSAN array-index-out-of-bounds in ocfs2_sum_rightmost_recIan Bridges
[BUG] On-disk corruption setting l_next_free_rec to 0 in an inode's embedded extent list triggers a UBSAN panic on the next write to that file. [CAUSE] ocfs2_sum_rightmost_rec() computes i = le16_to_cpu(el->l_next_free_rec) - 1 and accesses el->l_recs[i] without validating i. When l_next_free_rec is 0, i becomes -1; when l_next_free_rec exceeds l_count, i falls past the end of the array. Either case violates the __counted_by_le(l_count) annotation on l_recs[] and triggers UBSAN. [FIX] Validate the inode's embedded extent list when the inode is read, in ocfs2_validate_inode_block(): l_count must be non-zero and no larger than the inode block can hold, and l_next_free_rec must not exceed l_count. A corrupt list is rejected at read time, before the b-tree code can index l_recs[] out of bounds. Link: https://lore.kernel.org/ain_780qc0P4ypNd@dev Signed-off-by: Ian Bridges <icb@fastmail.org> Reported-by: syzbot+be16e33db01e6644db7a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=be16e33db01e6644db7a Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
13 daysfat: reject BPB volumes whose data area starts beyond total sectorsSamuel Moelius
fat_fill_super() subtracts sbi->data_start from the BPB total sector count before computing the number of clusters. A malformed image can declare a total sector count smaller than data_start, causing the subtraction to underflow and the mount code to derive a plausible cluster count from the FAT length instead. Reject such images before the subtraction. In QEMU, a crafted FAT image with total_sectors=2 and data_start=3 mounted successfully before the fix and reading a file returned bytes stored past the BPB-declared end of the volume. With this change, the same image is rejected during mount. Assisted-by: Codex:gpt-5.5-cyber-preview Link: https://lore.kernel.org/20260605155216.2126545-1-sam.moelius@trailofbits.com Signed-off-by: Samuel Moelius <sam.moelius@trailofbits.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
13 daysNFS: Use common error handling code in nfs_alloc_server()Markus Elfring
Use an additional label so that a bit of exception handling can be better reused at the end of this function implementation. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
13 daysNFS: Prevent resource leak in nfs_alloc_server()Markus Elfring
It was overlooked to call ida_free() after a failed nfs_alloc_iostats() call. Thus add the missed function call in an if branch. Fixes: 1c7251187dc067a6d460cf33ca67da9c1dd87807 ("NFS: add superblock sysfs entries") Cc: stable@vger.kernel.org Reported-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr> Closes: https://lore.kernel.org/linux-nfs/1c8e10c9-def7-4f0d-8aa1-23c8035a38c8@wanadoo.fr/ Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
14 daysMerge tag 'lsm-pr-20260615' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm update from Paul Moore: "A single LSM update the security_inode_listsecurity() hook to be able to leverage the xattr_list_one() helper function. We wanted to do this for a while, but we needed to fixup the callers in the NFS code first. With the NFS code changes shipping in Linux v7.0 and no one complaining, it seemed a good time to complete the shift" * tag 'lsm-pr-20260615' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: security,fs,nfs,net: update security_inode_listsecurity() interface
14 daysMerge tag 'wq-for-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue updates from Tejun Heo: - Continued progress toward making alloc_workqueue() unbound by default: more callers converted to WQ_PERCPU / system_percpu_wq / system_dfl_wq, and new warnings for queues that use neither WQ_PERCPU nor WQ_UNBOUND or the legacy system_wq / system_unbound_wq. - Misc: drop the now-trivial apply_wqattrs_lock()/unlock() wrappers, forbid the TEST_WORKQUEUE benchmark from being built-in, and fix a spurious pointer level in the worker debug-dump path. * tag 'wq-for-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: drm/bridge: anx7625: Add WQ_PERCPU add to alloc_workqueue wifi: ath6kl: fix invalid workqueue flags in ath6kl_usb_create() btrfs: Drop WQ_PERCPU from ordered_flags in btrfs_init_workqueues() workqueue: Add warnings and ensure one among WQ_PERCPU or WQ_UNBOUND is present workqueue: Add warnings and fallback if system_{unbound}_wq is used workqueue: drop spurious '*' from print_worker_info() fn declaration workqueue: forbid TEST_WORKQUEUE from being built-in workqueue: drop apply_wqattrs_lock()/unlock() wrappers umh: replace use of system_unbound_wq with system_dfl_wq rapidio: rio: add WQ_PERCPU to alloc_workqueue users media: ddbridge: add WQ_PERCPU to alloc_workqueue users platform: cznic: turris-omnia-mcu: replace use of system_wq with system_percpu_wq media: synopsys: hdmirx: replace use of system_unbound_wq with system_dfl_wq virt: acrn: Add WQ_PERCPU to alloc_workqueue users
14 daysMerge tag 'bpf-next-7.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: "Major changes: - Recover from BPF arena page faults using a scratch page and add ptep_try_set() for lockless empty-slot installs on x86 and arm64. This allows BPF kfuncs to access arena pointers directly. The 'arena_direct_access' stable branch was created for this work and was pulled into sched-ext and bpf-next trees (Tejun Heo, Kumar Kartikeya Dwivedi) - Lift old restriction and support 6+ arguments in BPF programs and kfuncs on x86 and arm64 (Yonghong Song, Puranjay Mohan) Other features and fixes: - Add 24-bit BTF vlen and reclaim unused bits in the BTF UAPI to ease addition of new BTF kinds (Alan Maguire) - Raise the maximum BPF call chain depth from 8 to 16 frames (Alexei Starovoitov) - Refactor object relationship tracking in the verifier and fix a dynptr use-after-free bug (Amery Hung) - Harden the signed program loader and reject exclusive maps as inner maps (Daniel Borkmann) - Replace the verifier min/max bounds fields with a circular number (cnum) representation and improve 32->64 bit range refinements (Eduard Zingerman) - Introduce the arena library and runtime (libarena) with a buddy allocator, rbtree and SPMC queue data structures, ASAN support and a parallel test harness. Allow subprograms to return arena pointers and switch to a BTF type-tag based __arena annotation (Emil Tsalapatis) - Cache build IDs in the sleepable stackmap path and avoid faultable build ID reads under mm locks (Ihor Solodrai) - Introduce the tracing_multi link to attach a single BPF program to many kernel functions at once. Allow specifying the uprobe_multi target via FD (Jiri Olsa) - Extend the bpf_list family of kfuncs with bpf_list_add/del(), and bpf_list_is_first/is_last/empty() (Kaitao Cheng) - Extend the BPF syscall with common attributes support for prog_load, btf_load and map_create (Leon Hwang) - Wrap rhashtable as BPF map (Mykyta Yatsenko, Herbert Xu) - Add sleepable support for tracepoint programs and fix deadlocks in LRU map due to NMI reentry (Mykyta Yatsenko) - Fix OOB access in bpf_flow_keys, fix nullness analysis of inner arrays, enforce write checks for global subprograms (Nuoqi Gui) - Report the maximum combined stack depth and print a breakdown of instructions processed per subprogram (Paul Chaignon) - Add an XDP load-balancer benchmark and arm64 JIT support for stack arguments (Puranjay Mohan) - Add kfuncs to traverse over wakeup_sources (Samuel Wu) - Allow sleepable BPF programs to use LPM trie maps directly (Vlad Poenaru) - Many more fixes and cleanups across the verifier, BTF, sockmap, devmap, bpffs, security hooks, s390/riscv/loongarch JITs, rqspinlock, libbpf, bpftool, selftests" * tag 'bpf-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (336 commits) selftests/bpf: Work around llvm stack overflow in crypto progs selftests/bpf: add test for bpf_msg_pop_data() overflow bpf, sockmap: fix integer overflow in bpf_msg_pop_data() bounds check sockmap: Fix use-after-free in udp_bpf_recvmsg() bpf, sockmap: keep sk_msg copy state in sync bpf, sockmap: Fix wrong rsge offset in bpf_msg_push_data() bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data() selftsets/bpf: Retry map update on helper_fill_hashmap() selftests/bpf: Add test for sleepable lsm_cgroup rejection selftests/bpf: Add test to verify the fix for bpf_setsockopt() helper bpf: Fix bpf_get/setsockopt to tos for ipv4-mapped ipv6 socket selftests/bpf: Avoid static LLVM linking for cross builds selftests/bpf: Use common CFLAGS for urandom_read selftests/bpf: Initialize operation name before use tools/bpf: build: Append extra cflags libbpf: Initialize CFLAGS before including Makefile.include bpftool: Append extra host flags bpftool: Avoid adding EXTRA_CFLAGS to HOST_CFLAGS bpftool: Pass host flags to bootstrap libbpf selftests/bpf: correct CONFIG_PPC64 macro name in comment ...
14 daysMerge tag 'net-next-7.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Work on removing rtnl_lock protection throughout the stack continues. In this chapter: - don't use rtnl_lock for IPv6 multicast routing configuration - don't take rtnl_lock in ethtool for modern drivers - prepare Qdisc dump callbacks for rtnl_lock removal - Support dumping just ifindex + name of all interfaces, under RCU. It's a common operation for Netlink CLI tools (when translating names to ifindexes) and previously required full rtnl_lock. - Support dumping qdiscs and page pools for a specific netdev. Even tho user space wants a dump of all netdevs, most of the time, the OOO programming model results in repeating the dump for each netdev. Which, in absence of a cache, leads to a O(n^2) behavior. - Flush nexthops once on multi-nexthop removal (e.g. when device goes down), another O(n^2) -> O(n) improvement. - Rehash locally generated traffic to a different nexthop on retransmit timeout. - Honor oif when choosing nexthop for locally generated IPv6 traffic. - Convert TCP Auth Option to crypto library, and drop non-RFC algos. - Increase subflow limits in MPTCP to 64 and endpoint limit to 256. - Support MPTCP signaling of IPv6 address + port (ADD_ADDR). We need to selectively skip reporting of the standard TCP Timestamp option, because they won't fit into the header space together (12 + 30 > 40). - Support using bridge neighbor suppression, Duplicate Address Detection, Gratuitous ARP and unsolicited NA forwarding - in EVPN deployments, e.g. VXLAN fabrics (IPv4 and IPv6). - Improve link state reporting for upper netdevs (e.g. macvlan) over tunnel devices (again, mostly for EVPN deployments). - Support binding GENEVE tunnels to a local address. - Speed up UDP tunnel destruction (remove one synchronize_rcu()). - Support exponential field encoding in multicast (IGMPv3 and MLDv2). - Support attaching PSP crypto offload to containers (veth, netkit). - Add a new IPSec Netlink message XFRM_MSG_MIGRATE_STATE that allows migrating individual IPsec SAs independently of their policies. The existing XFRM_MSG_MIGRATE is tightly coupled to policy+SA migration, lacks SPI for unique SA identification, and cannot express reqid changes or migrate Transport mode selectors. The new interface identifies the SA via SPI and mark, supports reqid changes, address family changes, encap removal, and uses an atomic create+install flow under x->lock to prevent SN/IV reuse during AEAD SA migration. - Implement GRO/GSO support for PPPoE. - Convert sockopt callbacks in a number of protocols to iov_iter. Cross-tree stuff: - Remove support for Crypto TFM cloning (unblocked after the TCP Auth Option rework). This feature regressed performance for all crypto API users, since it changed crypto transformation objects into reference-counted objects. - Add FCrypt-PCBC implementation to rxrpc and remove it from the global crypto API as obsolete and insecure. Wireless: - Major rework of station bandwidth handling, fixing issues with lower capability than AP. - Cleanups for EMLSR spec issues (drafts differed). - More Neighbor Awareness Networking (Wi-Fi Aware) work (multicast, schedule improvements, multi-station etc.) - Some Ultra High Reliability (UHR) / IEEE 802.11bn (D1.4) work (e.g. non-primary channel access, UHR DBE support). - Fine Timing Measurement ranging (i.e. distance measurement) APIs. Netfilter: - Use per-rule hash initval in nf_conncount. This avoids unnecessary lock contention with short keys (e.g. conntrack zones) in different namespaces. - Various safety improvements, both in packet parsing and object lifetimes. Notably add refcounts to conntrack timeout policy. Deletions: - Remove TLS + sockmap integration. TLS wants to pin user pages to avoid a copy, and sockmap wants to write to the input stream. More work on this integration is clearly needed, and we can't find any users (original author admitted that they never deployed it). - Remove support for TLS offload with TCP Offload Engine (the far more common opportunistic offload is retained). The locking looks unfixable (driver sleeps under TCP spin locks) and people from the vendor that added this are AWOL. - Remove more ATM code, trying to leave behind only what PPPoATM needs, AAL5 and br2684 with permanent circuits. - Remove AppleTalk. Let it join hamradio in our out of tree protocol graveyard, I mean, repository. - Disable 32-bit x_tables compatibility (32bit binaries on 64bit kernel) interface in user namespaces. To be deleted completely, soon. - Remove 5/10 MHz support from cfg80211/mac80211. Drivers: - Software: - Support DEVMEM/DMABUF Tx over NETMEM_TX_NO_DMA devices (netkit) - bonding: add knob to strictly follow 802.3ad for link state - New drivers: - Alibaba Elastic Ethernet Adaptor (cloud vNIC). - NXP NETC switch within i.MX94. - DPLL: - Add operational state to pins (implement in zl3073x). - Add generic DPLL type, for daisy-chaining DPLLs (implement in ice). - Ethernet high-speed NICs: - Huawei (hinic3): - enhance tc flow offload support with queue selection, tunnels - nVidia/Mellanox: - avoid over-copying payload to the skb's linear part (up to 60% win for LRO on slow CPUs like ARM64 V2) - expose more per-queue stats over the standard API - support additional, unprivileged PFs in the DPU configuration - support Socket Direct (multi-PF) with switchdev offloads - add a pool / frag allocator for DMA mapped buffers for control objects, save memory on systems with 64kB page size - take advantage of the ability to dynamically change RSS table size, even when table is configured by the user - increase the max RSS table size for even traffic distribution - Ethernet NICs: - Marvell/Aquantia: - AQC113 PTP support - Realtek USB (r8152): - support 10Gbit Link Speeds and Energy-Efficient Ethernet (EEE) - support firmware loaded (for RTL8157/RTL8159) - support for the RTL8159 - Intel (ixgbe): - support Energy-Efficient Ethernet (EEE) on E610 devices - Ethernet switches: - Airoha: - support multiple netdevs on a single GDM block / port - Marvell (mv88e6xxx): - support SERDES of mv88e6321 - Microchip (ksz8/9): - rework the driver callbacks to remove one indirection layer - Motorcomm (yt921x): - support port rate policing - support TBF qdisc offload - support ACL/flower offload - nVidia/Mellanox: - expose per-PG rx_discards - Realtek: - rtl8365mb: bridge offloading and VLAN support - Ethernet PHYs: - Airoha: - support Airoha AN8801R Gigabit PHYs. - Micrel: - implement 3 low-loss cable tunables - Realtek: - support MDI swapping for RTL8226-CG - support MDIO for RTL931x - Qualcomm: - at803x: Rx and Tx clock management for IPQ5018 PHY - Motorcomm: - support YT8522 100M RMII PHY - set drive strength in YT8531s RGMII - TI: - dp83822: add optional external PHY clock - Bluetooth: - hci_sync: add support for HCI_LE_Set_Host_Feature [v2] - SMP: use AES-CMAC library API - Intel: - support Product level reset - support smart trigger dump - Mediatek: - add event filter to filter specific event - Realtek: - fix RTL8761B/BU broken LE extended scan - WiFi: - Broadcom (b43): - new support for a 11n device - MediaTek (mt76): - support mt7927 - mt792x: broken usb transport detection - mt7921: regulatory improvements - Qualcomm (ath9k): - GPIO interface improvements - Qualcomm (ath12k): - WDS support - replace dynamic memory allocation in WMI Rx path - thermal throttling/cooling device support - 6 GHz incumbent interference detection - channel 177 in 5 GHz - Realtek (rt89): - RTL8922AU support - USB 3 mode switch for performance - better monitor radiotap support - RTL8922DE preparations" * tag 'net-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1778 commits) ipv4: fib_rule: Move fib4_rules_exit() to ->exit(). net: serialize netif_running() check in enqueue_to_backlog() net: skmsg: preserve sg.copy across SG transforms appletalk: move the protocol out of tree appletalk: stop storing per-interface state in struct net_device selftests/bpf: test that TLS crypto is rejected on a sockmap socket selftests/bpf: drop the unused kTLS program from test_sockmap selftests/bpf: remove sockmap + ktls tests tls: remove dead sockmap (psock) handling from the SW path tls: reject the combination of TLS and sockmap atm: remove orphaned uAPI for deleted drivers, protocols and SVCs atm: remove unused ATM PHY operations atm: remove the unused pre_send and send_bh device operations atm: remove the unused change_qos device operation atm: remove SVC socket support and the signaling daemon interface atm: remove the local ATM (NSAP) address registry atm: remove dead SONET PHY ioctls atm: remove the unused send_oam / push_oam callbacks atm: remove AAL3/4 transport support net: dsa: sja1105: fix lastused timestamp in flower stats ...
14 dayserofs: introduce erofs_map_chunks()Gao Xiang
Try to map more chunks in the same metadata on-disk block for more efficient IO performance. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
14 dayserofs: call erofs_exit_ishare() before rcu_barrier()Gao Xiang
Ensure all inode free callbacks have completed before destroying the inode slab cache. Fixes: 5ef3208e3be5 ("erofs: introduce the page cache share feature") Reviewed-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2026-06-17erofs: clean up erofs_ishare_fill_inode()Gao Xiang
- Use the shorthand `si` to replace the overly long `sharedinode`; - Introduce erofs_warn() and get rid of barely-used _erofs_printk(); - Get rid of the variable `hash`; - Simplify error paths. Reviewed-by: Hongbo Li <lihongbo22@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2026-06-16ksmbd: fix path resolution in ksmbd_vfs_kern_path_createDavide Ornaghi
The SMB2 open lookup is rooted at the share with LOOKUP_BENEATH, but the create/mkdir/hardlink sink is not: ksmbd_vfs_kern_path_create() builds an absolute path with convert_to_unix_name() and resolves it from AT_FDCWD via start_creating_path(), so a ".." component is walked from the real filesystem root and escapes the export. An authenticated client races a missing path component so the rooted open lookup returns -ENOENT (taking the create branch) while the same component is present (a directory) when the create walk runs; the create then resolves ".." out of the share. Root the create walk at the share like the lookup and rename paths already are: resolve the parent with vfs_path_parent_lookup(..., LOOKUP_BENEATH, &share_conf->vfs_path) and create the final component with start_creating_noperm(). convert_to_unix_name() then has no callers and is removed. Fixes: 265fd1991c1d ("ksmbd: use LOOKUP_BENEATH to prevent the out of share access") Cc: stable@vger.kernel.org Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: use opener credentials for FSCTL mutationsNamjae Jeon
SET_SPARSE, SET_ZERO_DATA and SET_COMPRESSION operate on an open SMB handle but call VFS xattr, fallocate or fileattr helpers with the current ksmbd worker credentials. Those helpers can revalidate inode permissions, ownership and LSM policy independently of the SMB handle access mask. Run each operation with the credentials captured in the target file when the handle was opened. Keep credential handling local to these single-file FSCTLs rather than applying session credentials to the complete IOCTL handler, which also contains handle-less and multi-handle operations. Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: use opener credentials for ADS I/ONamjae Jeon
Alternate data streams are stored as xattrs. Unlike regular file I/O, their read and write paths therefore call VFS xattr helpers which recheck inode permissions and LSM policy using the current task credentials. Run ADS I/O with the credentials captured when the SMB handle was opened. Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: require source read access for duplicate extentsNamjae Jeon
FSCTL_DUPLICATE_EXTENTS_TO_FILE passes the source file directly to vfs_clone_file_range() or vfs_copy_file_range() without checking the SMB access mask granted to the source handle. A handle opened with attribute access can consequently be used to copy file contents into an attacker-readable destination. Require FILE_READ_DATA on the source handle before either VFS operation, matching other ksmbd data-copy paths. Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: run set info with opener credentialsNamjae Jeon
SMB2 SET_INFO handlers call path-based VFS helpers after checking the access mask granted to the SMB handle. Those helpers perform their owner, inode permission and LSM checks using the current ksmbd worker credentials. Run the complete SET_INFO dispatch with the credentials captured when the handle was opened. This also removes the separate security information credential setup and keeps all SET_INFO classes under one credential scope. Direct override_creds() is used because it can nest with the request credential overrides already used by rename and link helpers. Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: use opener credentials for delete-on-closeNamjae Jeon
Delete-on-close can be completed by deferred or durable handle teardown, where no request work is available. Both the base-file unlink and the ADS xattr removal consequently run with the ksmbd worker credentials and can bypass filesystem permission checks. Run both operations with the credentials captured in struct file when the handle was opened. This preserves the authenticated user's fsuid, fsgid, supplementary groups and capability restrictions at final close. Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: serialize QUERY_DIRECTORY requests per fileNamjae Jeon
smb2_query_dir() stores a pointer to its stack-allocated private data in the ksmbd_file readdir_data. Concurrent QUERY_DIRECTORY requests using the same file handle can overwrite this pointer while an iterate_dir() callback is still using it, resulting in a stack use-after-free. Add a per-file mutex and hold it while accessing the shared directory enumeration state. The lock covers scan restart, dot entry state, readdir_data setup and iteration, and response construction. This prevents another request from replacing readdir_data.private before the current request has finished using it and also serializes the shared file position. Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-30527 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: add permission checks for FSCTL_DUPLICATE_EXTENTS_TO_FILEGil Portnoy
The FSCTL_DUPLICATE_EXTENTS_TO_FILE arm of smb2_ioctl() overwrites the destination file's data via vfs_clone_file_range() with neither the share-level KSMBD_TREE_CONN_FLAG_WRITABLE check nor a per-handle fp->daccess check that the other write-bearing arms carry. A client can overwrite destination data on a read-only share, or from a handle opened with only FILE_WRITE_ATTRIBUTES (which still yields an FMODE_WRITE filp). FILE_WRITE_ATTRIBUTES-only destination handle overwrote the file's data via the clone. Add both checks, matching the FSCTL_SET_SPARSE permission fix; require FILE_WRITE_DATA since this writes data. Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: enforce FILE_READ_ATTRIBUTES on SMB_FIND_FILE_POSIX_INFORMATIONGil Portnoy
find_file_posix_info() in smb2_query_info() returns file metadata (owner uid, group gid, mode, inode, size, allocation size, hard-link count and all four timestamps) but performs no per-handle access check. Every sibling query handler gates on the handle's granted access first -- get_file_basic_info(), get_file_all_info(), get_file_network_open_info() and get_file_attribute_tag_info() all reject a handle lacking FILE_READ_ATTRIBUTES_LE with -EACCES. The POSIX handler is gated only by the connection-scoped tcon->posix_extensions flag, which is not a per-handle authorization, so a handle opened with only FILE_WRITE_DATA is correctly denied FileBasicInformation yet is allowed the strict-superset POSIX info. Mirror the FILE_READ_ATTRIBUTES_LE gate the sibling info handlers already use. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: reject non-VALID session in compound request branchGil Portnoy
smb2_check_user_session() takes a shortcut for any operation that is not the first in a COMPOUND request: it reuses work->sess (the session bound by the first operation) and validates only the SessionId, then returns "valid". It never re-checks work->sess->state == SMB2_SESSION_VALID, and a SessionId of 0xFFFFFFFFFFFFFFFF (ULLONG_MAX, the MS-SMB2 related-operation value) skips even the id comparison. The standalone path (ksmbd_session_lookup_all() plus the SESSION_SETUP state machine) does enforce the VALID state; the compound branch bypasses all of it. A SESSION_SETUP carrying only an NTLM Type-1 (NtLmNegotiate) blob publishes a fresh SMB2_SESSION_IN_PROGRESS session whose sess->user is still NULL (->user is assigned later, by ntlm_authenticate()). Used as operation 1 of a COMPOUND with operation 2 = TREE_CONNECT (related, SessionId=ULLONG_MAX, \\host\IPC$), the tree-connect then runs on that IN_PROGRESS session and reaches ksmbd_ipc_tree_connect_request(), which dereferences user_name(sess->user) with sess->user == NULL (transport_ipc.c:687/701/704) -> remote NULL-pointer dereference and a kernel Oops that wedges the ksmbd worker for all clients. Reject any non-first compound operation that lands on a session which is not SMB2_SESSION_VALID, mirroring the validity the standalone lookup path enforces. SESSION_SETUP itself legitimately runs on an IN_PROGRESS session, but it is never carried as a non-first compound operation, so multi-leg authentication is unaffected by this check. Fixes: 5005bcb42191 ("ksmbd: validate session id and tree id in the compound request") Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: compress SMB2 READ responsesNamjae Jeon
Handle SMB2_READFLAG_REQUEST_COMPRESSED for non-RDMA reads. Flatten the response iov, emit chained or unchained LZ77 transforms when compression is beneficial, and retain the generated buffer until the work item is released. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: negotiate and decode SMB2 compressionNamjae Jeon
Parse the SMB 3.1.1 compression capabilities context and negotiate LZ77 with optional chained Pattern_V1 support. Advertise compression on tree connections and decode compressed requests before normal SMB dispatch. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16cifs: negotiate chained SMB2 compression capabilitiesNamjae Jeon
Advertise LZ77 and Pattern_V1 with chained transform support in the SMB 3.1.1 compression negotiate context. Validate the server's returned algorithm list and flags, then retain the negotiated capabilities for a future compressed transform receive implementation. This patch only negotiates capabilities. It does not request compressed READ responses or add a compressed transform receive path. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb: add common SMB2 compression transform helpersNamjae Jeon
Implement common validation, compression and decompression for SMB2 compression transforms. Support unchained LZ77 and chained NONE, LZ77 and Pattern_V1 payloads. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb: move LZ77 compression into common codeNamjae Jeon
Move the LZ77 codec in cifs.ko to smb/common/ so both the SMB client and ksmbd can use it. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: add per-handle permission check to FILE_LINK_INFORMATIONGil Portnoy
The FILE_LINK_INFORMATION arm of smb2_set_info_file() calls smb2_create_link() with no per-handle fp->daccess check. On the ReplaceIfExists path smb2_create_link() unlinks an existing file at the target name (ksmbd_vfs_remove_file) and creates a hardlink (ksmbd_vfs_link); neither helper checks daccess. A handle opened with FILE_READ_DATA only (no FILE_DELETE, no FILE_WRITE_DATA) can therefore delete an arbitrary file in the share and plant a hardlink over its name. The sibling delete/move arms in the same switch already gate: FILE_RENAME_INFORMATION and FILE_DISPOSITION_INFORMATION both require FILE_DELETE_LE; FILE_FULL_EA_INFORMATION requires FILE_WRITE_EA_LE. Gate the link arm the same way as its closest analogue (rename), since it mutates the namespace and, on replace, deletes an existing entry. This is a sibling of commit cc57232cae23 ("ksmbd: fix FSCTL permission bypass by adding a permission check for FSCTL_SET_SPARSE"). Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: add a permission check for FSCTL_SET_ZERO_DATAGil Portnoy
FSCTL_SET_ZERO_DATA in smb2_ioctl() destroys file data via ksmbd_vfs_zero_data() -> vfs_fallocate(PUNCH_HOLE/ZERO_RANGE) after checking only the share-level KSMBD_TREE_CONN_FLAG_WRITABLE, with no per-handle access check. A handle opened with only FILE_WRITE_ATTRIBUTES still yields an FMODE_WRITE filp (FILE_WRITE_ATTRIBUTES is part of FILE_WRITE_DESIRE_ACCESS_LE, so smb2_create_open_flags() opens it O_WRONLY), so the vfs_fallocate FMODE_WRITE check does not stop it; only the missing fp->daccess gate would. Reproduced on mainline 7.1-rc7 with KASAN by an authenticated SMB client: a FILE_WRITE_ATTRIBUTES-only handle zeroed 4096 bytes of file data it had no FILE_WRITE_DATA right to (6/6; a FILE_READ_DATA-only handle was correctly denied). This is the unfixed sibling of commit cc57232cae23 ("ksmbd: fix FSCTL permission bypass by adding a permission check for FSCTL_SET_SPARSE"). Because SET_ZERO_DATA writes data (not an attribute), require FILE_WRITE_DATA. Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: add a WRITE_DAC/WRITE_OWNER check to SMB2 SET_INFO SECURITYGil Portnoy
commit cc57232cae23 ("ksmbd: fix FSCTL permission bypass by adding a permission check for FSCTL_SET_SPARSE") added a fp->daccess gate to fsctl_set_sparse and noted that "similar handle-level checks exist in other functions but are missing here." The SMB2 SET_INFO SECURITY arm is one of the missing ones, and the most security-relevant: smb2_set_info_sec() calls set_info_sec() with no per-handle access check. set_info_sec() (fs/smb/server/smbacl.c) re-permissions the file: it rewrites owner/group/mode via notify_change(), rewrites the POSIX ACL via set_posix_acl(), and on KSMBD_SHARE_FLAG_ACL_XATTR shares removes and rewrites the Windows security descriptor via ksmbd_vfs_set_sd_xattr(). Every other persistent-mutation arm of the sibling handler smb2_set_info_file() checks fp->daccess first (FILE_WRITE_DATA / FILE_DELETE / FILE_WRITE_EA / FILE_WRITE_ATTRIBUTES); the SECURITY arm — which mutates the access control itself — is the only one with no gate. A client can therefore open a handle with FILE_WRITE_ATTRIBUTES only (no FILE_WRITE_DAC / FILE_WRITE_OWNER) and use SMB2_SET_INFO with InfoType SMB2_O_INFO_SECURITY to rewrite the file's DACL and owner, granting itself access the handle's daccess never carried. Unlike the FSCTL data arms this is a metadata/xattr operation, so there is no FMODE_WRITE VFS backstop — the missing fp->daccess check is the entire gate. Setting a security descriptor is the WRITE_DAC / WRITE_OWNER operation, so require at least one of those on the handle before re-permissioning the file. -EACCES is mapped to STATUS_ACCESS_DENIED by smb2_set_info(). Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: fix use-after-free of a deferred file_lock on SMB2_CLOSE then SMB2_CANCELGil Portnoy
Commit f580d27e8928 ("ksmbd: fix use-after-free of a deferred file_lock on double SMB2_CANCEL") made smb2_cancel() skip a work whose state is KSMBD_WORK_CANCELLED, so its cancel_fn cannot be fired a second time. But KSMBD_WORK has three states (ACTIVE, CANCELLED, CLOSED), and the same freeing producer path is reached for CLOSED too: SMB2_CLOSE on the locking handle -> set_close_state_blocked_works() sets the deferred work's state to KSMBD_WORK_CLOSED and wakes the smb2_lock() worker. The worker takes the non-ACTIVE early-exit, locks_free_lock()s the file_lock and, because the state is not KSMBD_WORK_CANCELLED, takes the STATUS_RANGE_NOT_LOCKED branch with "goto out2" -- which, like the cancelled branch, skips release_async_work(). The work stays on conn->async_requests with a live cancel_fn = smb2_remove_blocked_lock pointing at the freed file_lock. A subsequent SMB2_CANCEL for the same AsyncId then passes the KSMBD_WORK_CANCELLED-only guard (its state is KSMBD_WORK_CLOSED), so smb2_cancel() fires cancel_fn again over the freed file_lock -- the same use-after-free fixed, via SMB2_CLOSE instead of a first SMB2_CANCEL: BUG: KASAN: slab-use-after-free in __locks_delete_block __locks_delete_block locks_delete_block ksmbd_vfs_posix_lock_unblock smb2_remove_blocked_lock smb2_cancel <- 2nd SMB2_CANCEL fires cancel_fn handle_ksmbd_work Allocated by ...: locks_alloc_lock <- smb2_lock Freed by ...: locks_free_lock <- smb2_lock (non-ACTIVE early-exit) ... cache file_lock_cache of size 192 Reproduced on mainline 7.1-rc7 (which already contains f580d27e8928) with KASAN by an authenticated SMB client; the double-SMB2_CANCEL control is silent on that kernel, so the splat is attributable to the CLOSE trigger. Only an ACTIVE deferred work may have its cancel_fn fired: both terminal states (CANCELLED and CLOSED) reach the smb2_lock() early-exit that frees the file_lock and skips release_async_work(). Guard on KSMBD_WORK_ACTIVE so any non-active work is skipped. Fixes: f580d27e8928 ("ksmbd: fix use-after-free of a deferred file_lock on double SMB2_CANCEL") Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb: server: remove code guarded by nonexistent config optionEthan Nelson-Moore
A small piece of code in fs/smb/server/smb_common.c depends on CONFIG_SMB_INSECURE_SERVER, which has never been defined in the mainline kernel, but was present in old out-of-tree versions of ksmbd. Remove this dead code. Discovered while searching for CONFIG_* symbols referenced in code but not defined in any Kconfig file. Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb/server: fix incorrect file size in get_file_compression_info()ChenXiaoSong
Before this patch, we got the wrong file size: - client: touch /mnt/file - client: smbinfo setcompression default /mnt/file - client: dd if=/dev/zero of=/mnt/file bs=1 count=1000 - client: smbinfo filecompressioninfo /mnt/file Compressed File Size: 4096 Compression Format: 2 (LZNT1) After this patch, we get the correct file size: - client: smbinfo filecompressioninfo /mnt/file Compressed File Size: 1000 Compression Format: 2 (LZNT1) Note that the actual compressed file size must be got by other methods. For Btrfs, use the following command to get actual compressed file size: - server: compsize /export/file Processed 1 file, 0 regular extents (0 refs), 1 inline. Type Perc Disk Usage Uncompressed Referenced TOTAL 4% 47B 1000B 1000B zlib 4% 47B 1000B 1000B Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb/server: get compression format in get_file_compression_info()ChenXiaoSong
I have added `filecompressioninfo` subcommand to `smbinfo` in cifs-utils.git. Example: 1. client: smbinfo setcompression lznt1 /mnt/file 2. client: smbinfo filecompressioninfo /mnt/file Compressed File Size: 104857600 Compression Format: 2 (LZNT1) Compression Unit Shift: 0 Chunk Shift: 0 Cluster Shift: 0 Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb/server: implement FSCTL_SET_COMPRESSION ioctl handlerChenXiaoSong
Example: 1. client: smbinfo setcompression no /mnt/file 2. client: smbinfo getcompression /mnt/file Compression: 0 (NONE) 3. client: smbinfo setcompression lznt1 /mnt/file 4. client: smbinfo getcompression /mnt/file Compression: 2 (LZNT1) 5. client: smbinfo setcompression default /mnt/file 6. client: smbinfo getcompression /mnt/file Compression: 2 (LZNT1) Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb/server: implement FSCTL_GET_COMPRESSION ioctl handlerChenXiaoSong
Example: 1. server: chattr +c /export/file 2. client: smbinfo getcompression /mnt/file Compression: 2 (LZNT1) Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb/server: get compression file attribute on openChenXiaoSong
Example: 1. server: chattr +c /export/file 2. client: lsattr /mnt/file --------c------------- /mnt/file Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb: move compression definitions into common/fscc.hChenXiaoSong
These definitions will also be used by ksmbd, move them into common header file. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb: remove duplicate server/smbfsctl.hChenXiaoSong
Rename the following places: - FSCTL_COPYCHUNK -> FSCTL_SRV_COPYCHUNK - FSCTL_COPYCHUNK_WRITE -> FSCTL_SRV_COPYCHUNK_WRITE - FSCTL_REQUEST_RESUME_KEY -> FSCTL_SRV_REQUEST_RESUME_KEY server/smbfsctl.h contains the following additional definitions compared to common/smbfsctl.h: - IO_REPARSE_TAG_LX_SYMLINK_LE - IO_REPARSE_TAG_AF_UNIX_LE - IO_REPARSE_TAG_LX_FIFO_LE - IO_REPARSE_TAG_LX_CHR_LE - IO_REPARSE_TAG_LX_BLK_LE Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: prevent path traversal bypass by restricting caseless retryNamjae Jeon
ksmbd_vfs_path_lookup() enforces LOOKUP_BENEATH to restrict path resolution within the share root. When a crafted path attempts to escape the share boundary using parent-directory components ('..'), vfs_path_parent_lookup() detects this and immediately fails, returning -EXDEV. However, a bug exists in __ksmbd_vfs_kern_path() under caseless mode. The function fails to intercept the -EXDEV error and erroneously falls through to the caseless retry logic, which is intended only for genuinely missing files. During this retry process, the path is reconstructed, leading to an unintended LOOKUP_BENEATH bypass that allows write-capable users to create zero-length files or directories outside the exported share. Fix this by ensuring that the execution only proceeds to the caseless lookup retry when the error is specifically -ENOENT. Any other errors, such as -EXDEV from a path traversal attempt, must be returned immediately. Cc: stable@vger.kernel.org Reported-by: Y s65 <yu4ys@outlook.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: fix UAF of struct file_lock in SMB2_LOCK deferred-lock cancellationDavide Ornaghi
When a blocking byte-range lock request is deferred in the FILE_LOCK_DEFERRED path, ksmbd registers the asynchronous work into the connection's async_requests list via setup_async_work(). The cancel callback smb2_remove_blocked_lock() holds a reference to the flock. If the lock waiter is subsequently woken up but the work state is no longer KSMBD_WORK_ACTIVE (e.g., due to a concurrent cancellation), the cleanup path calls locks_free_lock(flock) without dequeuing the work from the async_requests list. Concurrently, smb2_cancel() walks the list under conn->request_lock and invokes the cancel callback, which then dereferences the already freed 'flock'. This leads to a slab-use-after-free inside __wake_up_common. Fix this by restructuring the cleanup logic after the worker returns from ksmbd_vfs_posix_lock_wait(). Move list_del(&smb_lock->llist) and release_async_work(work) to the top of the cleanup block. This guarantees that the async work is completely dequeued and serialized under conn->request_lock before locks_free_lock(flock) is called, rendering the flock unreachable for any concurrent smb2_cancel(). Cc: stable@vger.kernel.org Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: fix use-after-free in same_client_has_lease()Guangshuo Li
same_client_has_lease() returns an opinfo pointer from ci->m_op_list after dropping ci->m_lock without taking a reference. smb_grant_oplock() then dereferences that pointer in copy_lease() and when checking breaking_cnt. A concurrent close can remove the old lease from ci->m_op_list and drop the last reference before the caller uses the returned pointer, leading to a use-after-free. Take a reference when same_client_has_lease() selects an existing lease, drop any previous match while scanning, and release the returned reference in smb_grant_oplock() after copying the lease state. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: fix out-of-bounds read in smb_check_perm_dacl()Hem Parekh
The permission-check ACE walk in smb_check_perm_dacl() validates the ACE header size and caps sid.num_subauth at SID_MAX_SUB_AUTHORITIES, but it never checks that ace->size is actually large enough to contain num_subauth sub-authorities before compare_sids() dereferences them. CIFS_SID_BASE_SIZE covers the SID header up to but excluding the sub_auth[] array, and offsetof(struct smb_ace, sid) is the ACE header, so the existing guards only guarantee the 8-byte SID base, i.e. zero sub-authorities. compare_sids() then reads ace->sid.sub_auth[i] for i < min(local_sid->num_subauth, ace->sid.num_subauth). The local comparison SIDs (sid_everyone, sid_unix_NFS_mode, and the id_to_sid() result) always have at least one sub-authority, and an attacker controls the ACE revision and authority bytes (which lie within the in-bounds SID base), so they can match one of those SIDs and force the sub_auth read. A crafted ACE with size == 16 and num_subauth >= 1 placed at the tail of the security descriptor therefore causes a heap out-of-bounds read of up to SID_MAX_SUB_AUTHORITIES * sizeof(__le32) bytes past the pntsd allocation. The security descriptor is loaded by ksmbd_vfs_get_sd_xattr() into a buffer sized exactly to the on-disk data (kzalloc(sd_size) in ndr_decode_v4_ntacl()), so the read lands past the allocation. The malformed descriptor can be stored verbatim via SMB2_SET_INFO (the DACL is not normalised before being written to the security.NTACL xattr) and the read fires on a subsequent SMB2_CREATE access check, making this reachable by an authenticated client on a share that uses ACL xattrs. Add the missing num_subauth-versus-ace_size check, mirroring the identical guards already present in the sibling parsers parse_dacl() and smb_inherit_dacl(). Fixes: d07b26f39246 ("ksmbd: require minimum ACE size in smb_check_perm_dacl()") Cc: stable@vger.kernel.org Signed-off-by: Hem Parekh <hemparekh1596@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16Merge tag 'for-7.2/block-20260615' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - NVMe pull request via Keith: - Per-controller admin and IO timeout sysfs attributes, and letting the block layer set request timeouts (Maurizio, Maximilian) - Multipath passthrough iostats, and PCI P2PDMA enablement for multipath devices (Keith, Kiran) - A new diag sysfs attribute group exporting per-controller counters (retries, multipath failover, error counters, requeue and failure counts, reset and reconnect events) (Nilay) - FDP configuration validation and bounds check fixes (liuxixin) - Various nvmet fixes, including a pre-auth out-of-bounds read in the Discovery Get Log Page handler, auth payload bounds validation, and tcp error-path leak fixes (Bryam, Tianchu, Geliang) - nvme-tcp lockdep and workqueue fixes (Shin'ichiro, Kuniyuki, Eric) - Assorted other fixes and cleanups (John, Yao, Chao, Mateusz, Achkinazi, Wentao) - MD pull request via Yu Kuai: - raid1/raid10 fixes for a deadlock in the read error recovery path, error-path detection and bio accounting with cloned bios, and an nr_pending leak in the REQ_ATOMIC bad-block error path (Abd-Alrhman) - PCI P2PDMA propagation from member devices to the RAID device (Kiran) - dm-raid bio requeue fix, and various smaller fixes and cleanups (Benjamin, Chen, Li, Thorsten) - Enable Clang lock context analysis for the block layer, with the accompanying annotations across queue limits, the blk_holder_ops callbacks, crypto, cgroup, iocost, kyber and mq-deadline (Bart) - Block status code infrastructure work: a tagged status table, a str_to_blk_op() helper, a bio_endio_status() helper, and on top of that a new configurable block-layer error injection facility (Christoph) - DRBD netlink rework, replacing the genl_magic machinery with explicit netlink serialization and moving the DRBD UAPI headers to include/uapi/linux/ (Christoph Böhmwalder) - bvec improvements: a bvec_folio() helper and making the bvec_iter helpers proper inline functions (Willy, Christoph) - ublk cleanups and a canceling-flag fix for the disk-not-allocated case (Caleb, Ming) - Partition handling fixes: bound the AIX pp_count scan, fix an of_node refcount leak, and replace __get_free_page() with kmalloc() (Bryam, Wentao, Mike) - Convert numa_node to int in blk_mq_hw_ctx and ->init_request, and add WQ_PERCPU to the block workqueue users (Mateusz, Marco) - Block statistics and tracing: propagate in-flight to the whole disk on partition IO, export passthrough stats, and a new block_rq_tag_wait tracepoint (Tang, Keith, Aaron) - A round of removals, unexports and cleanups across bio, direct-io and the bvec helpers (Christoph) - Various driver fixes (mtip32xx use-after-free, rbd snap_count validation and strscpy conversion, nbd socket lockdep reclassify, virtio-blk zone report clamp, floppy) and a batch of MAINTAINERS email/list updates (Coly, Li, Yu, Christoph Böhmwalder) - Other little fixes and cleanups all over * tag 'for-7.2/block-20260615' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (117 commits) MAINTAINERS: Update Coly Li's email address block: check bio split for unaligned bvec nbd: Reclassify sockets to avoid lockdep circular dependency block: add configurable error injection block: add a str_to_blk_op helper block: add a "tag" for block status codes block: add a macro to initialize the status table floppy: Drop unused pnp driver data block: propagate in_flight to whole disk on partition I/O virtio-blk: clamp zone report to the report buffer capacity block: optimize I/O merge hot path with unlikely() hints drivers/block/rbd: Use strscpy() to copy strings into arrays partitions: aix: bound the pp_count scan to the ppe array block: Enable lock context analysis block/mq-deadline: Make the lock context annotations compatible with Clang block/Kyber: Make the lock context annotations compatible with Clang block/blk-mq-debugfs: Improve lock context annotations block/blk-iocost: Inline iocg_lock() and iocg_unlock() block/blk-iocost: Split ioc_rqos_throttle() block/crypto: Annotate the crypto functions ...
2026-06-16Merge tag 'hfs-v7.2-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs Pull hfs/hfsplus updates from Viacheslav Dubeyko: "Several fixes in HFS/HFS+ of syzbot reported issues and HFS//HFS+ fixes of xfstests failures. - fix a null-ptr-deref issue reported by syzbot (Edward Adam Davis) If the attributes file is not loaded during system mount hfsplus_create_attributes_file can dereference a NULL pointer. Also, add a b-tree node size check in hfs_btree_open() with the goal to prevent an uninit-value bug reported by syzbot for the case of corrupted HFS+ image. - fix __hfs_bnode_create() by using kzalloc_flex() instead of kzalloc() (Rosen Penev) - fix early return in hfs_bnode_read() (Tristan Madani) hfs_bnode_read() can return early without writing to the output buffer when is_bnode_offset_valid() fails or when check_and_correct_requested_ length() corrects the length to zero. Callers such as hfs_bnode_read_ u16() and hfs_bnode_read_u8() pass stack-allocated buffers and use the result unconditionally, leading to KMSAN uninit-value reports. The rest fix (1) generic/637, generic/729 issue for the case of HFS+ file system, (2) generic/003, generic/637 for the case of HFS file system" * tag 'hfs-v7.2-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs: hfs: rework hfsplus_readdir() logic hfs: disable the updating of file access times (atime) hfs: fix incorrect inode ID assignment in hfs_new_inode() hfsplus: rework hfsplus_readdir() logic hfs/hfsplus: zero-initialize buffer in hfs_bnode_read hfs/hfsplus: fix u32 overflow in check_and_correct_requested_length hfsplus: Add a sanity check for btree node size hfsplus: fix issue of direct writes beyond end-of-file hfs/hfxplus: use kzalloc_flex() hfsplus: Remove the duplicate attr inode dirty marking action
2026-06-16Merge tag 'nilfs2-v7.2-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2 Pull nilfs2 updates from Viacheslav Dubeyko: "Fixes of syzbot reported issue and various small fixes in NILFS2 functionality. - fix hung task in nilfs_transaction_begin() (Deepanshu Kartikey) Reported by syzbot. The root cause is that user-supplied segment numbers were not validated before nilfs_clean_segments() began doing work; the range check on each segnum was performed deep inside the call chain by nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry while still holding the segctor lock and the sufile mi_sem. Fix it by validating the contents of kbufs[4] in nilfs_clean_segments() immediately after acquiring ns_segctor_sem via nilfs_transaction_lock(). - fix a smatch warning in nilfs_mkdir() warn (Hongling Zeng) This corrects a semantic issue related to the use of the ERR_PTR macro that arose from a recent VFS change. - fix a backing_dev_info reference leak (Shuangpeng Bai) setup_bdev_super() initializes sb->s_bdev and takes a reference on the block device backing_dev_info when assigning sb->s_bdi. nilfs_fill_super() takes another reference to the same backing_dev_info and stores it in sb->s_bdi again. The extra reference is not paired with a matching bdi_put(), since generic_shutdown_super() releases sb->s_bdi only once. Drop the redundant bdi_get() in nilfs_fill_super(). The single reference taken by setup_bdev_super() is enough and is released during superblock shutdown" * tag 'nilfs2-v7.2-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2: nilfs2: Fix return in nilfs_mkdir nilfs2: fix backing_dev_info reference leak nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
2026-06-16Merge tag 'for-7.2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "The most noticeable change is to enable large folios by default, it's been in testing for a few releases. Related to that is huge folio support (still under experimental config). Otherwise a few ioctl updates, performance improvements and usual fixes and core changes. User visible changes: - enable large folios by default, added in 6.17 (under experimental build), no feature limitations, a big change internally - new ioctl to return raw checksums to userspace (a bit tricky given compression and tail extents), can be used for mkfs and deduplication optimizations - provide stable UUID for e.g. overlayfs and temp_fsid, also reflected in statvfs() field f_fsid, internal dev_t is hashed in to allow cloning - add 32bit compat version of GET_SUBVOL_INFO ioctl - in experimental build, support huge folios (up to 2M) Performance related improvements/changes: - limit bio size to the estimated optimum derived from the queue, this prevents build up of too much data for writeback, which could cause latency spikes (reported improvement 15% on sequential writes) - don't force direct IO to be serialized, forgotten change during mount API port, brings back +60% of throughput - lockless calculation of number of shrinkable extent maps, improve performance with many memcg allocated objects Notable fixes: - in zoned mode, fix a deadlock due to zone reclaim and relocation when space needs to be flushed - don't trim device which is internally not tracked as writeable (e.g. when missing device is being rescanned) - fix deadlock when cloning inline extent and mounted with flushoncommit - fix false IO failures after direct IO falls back to buffered write in some cases Core: - remove COW fixup mechanism completely; detect and fix changes to pages outside of filesystem tracking, guaranteed since 5.8, grace period is over - remove 2K block size support, experimental to test subpage code on x86_64 but now it would block folio changes - tree-checker improvements of: - free-space cache and tree items - root reference and backref items - extent state exceptions in reloc tree - subpage mode updates: - code optimizations, simplify tracking bitmaps - re-enable readahead of compressed extent - extend bitmap size to cover huge folios - add tracepoints related to sync, tree-log and transactions - device stats item tracking unification, remove item if there are no stats recorded, also don't leave stale stats on replaced device - allow extent buffer pages to be allocated as movable, to help page migration - added checks for proper extent buffer release - btrfs.ko code size reduction due to transaction abort call simplifications - several struct size reductions - more auto free conversions - more verbose assertions" * tag 'for-7.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (130 commits) btrfs: fix use-after-free after relocation failure with concurrent COW btrfs: move WARN_ON on unexpected error in __add_tree_block() btrfs: move locking into btrfs_get_reloc_bg_bytenr() btrfs: lzo: reject compressed segment that overflows the compressed input btrfs: retry faulting in the pages after a zero sized short direct write btrfs: fix incorrect buffered IO fallback for append direct writes btrfs: fix false IO failure after falling back to buffered write btrfs: use verbose assertions in backref.c btrfs: print a message when a missing device re-appears btrfs: do not trim a device which is not writeable btrfs: return real error after lookup failure in btrfs_ioctl_default_subvol() btrfs: use mapping shared locking for reading super block btrfs: use lockless read in nr_cached_objects shrinker callback btrfs: switch local indicator variables to bools btrfs: send: pass bool for pending_move and refs_processed parameters btrfs: use shifts for sectorsize and nodesize btrfs: fix deadlock cloning inline extent when using flushoncommit btrfs: allocate eb-attached btree pages as movable btrfs: add 32-bit compat ioctl for BTRFS_IOC_GET_SUBVOL_INFO btrfs: derive f_fsid from on-disk fsid and dev_t ...