path: root/fs
Age | Commit message | Author
9 hours | Merge tag 'vfs-6.18-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs (HEAD, master) | Linus Torvalds
Pull vfs fixes from Christian Brauner:
- Handle inode number mismatches in nsfs file handles
- Update the comment to init_file()
- Add documentation link for EBADF in the rust file code
- Skip read lock assertion for read-only filesystems when using dax
- Don't leak disconnected dentries during umount
- Fix new coredump input pattern validation
- Handle ENOIOCTLCMD conversion in vfs_fileattr_{g,s}et() correctly
- Remove redundant IOCB_DIO_CALLER_COMP clearing in overlayfs
* tag 'vfs-6.18-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  ovl: remove redundant IOCB_DIO_CALLER_COMP clearing
  fs: return EOPNOTSUPP from file_setattr/file_getattr syscalls
  Revert "fs: make vfs_fileattr_[get|set] return -EOPNOTSUPP"
  coredump: fix core_pattern input validation
  vfs: Don't leak disconnected dentries on umount
  dax: skip read lock assertion for read-only filesystems
  rust: file: add intra-doc link for 'EBADF'
  fs: update comment in init_file()
  nsfs: handle inode number mismatches gracefully in file handles
17 hours | Merge tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 | Linus Torvalds
Pull ext4 bug fixes from Ted Ts'o:
- Fix regression caused by removing CONFIG_EXT3_FS when testing some very old defconfigs
- Avoid a BUG_ON when opening a file on a maliciously corrupted file system
- Avoid mm warnings when freeing very large orphan file metadata
- Avoid theoretical races between metadata writeback and checkpoints (they are very hard to hit in practice, since the race requires that the writeback take a very long time)
* tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs
  ext4: free orphan info with kvfree
  ext4: detect invalid INLINE_DATA + EXTENTS flag combination
  ext4, doc: fix and improve directory hash tree description
  ext4: wait for ongoing I/O to complete before freeing blocks
  jbd2: ensure that all ongoing I/O complete before freeing blocks
39 hours | Merge tag 'nfsd-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux | Linus Torvalds
Pull nfsd fix from Chuck Lever:
- Fix a crasher reported by rtm@csail.mit.edu
* tag 'nfsd-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: Define a proc_layoutcommit for the FlexFiles layout type
5 days | Merge tag 'for-6.18/hpfs-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm | Linus Torvalds
Pull hpfs updates from Mikulas Patocka:
- Avoid -Wflex-array-member-not-at-end warnings
- Replace simple_strtoul with kstrtoint
- Fix error code for new_inode() failure
* tag 'for-6.18/hpfs-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  fs/hpfs: Fix error code for new_inode() failure in mkdir/create/mknod/symlink
  hpfs: Replace simple_strtoul with kstrtoint in hpfs_parse_param
  fs: hpfs: Avoid multiple -Wflex-array-member-not-at-end warnings
5 days | ext4: free orphan info with kvfree | Jan Kara
Orphan info is now getting allocated with kvmalloc_array(). Free it with kvfree() instead of kfree() to avoid complaints from mm. Reported-by: Chris Mason <clm@meta.com> Fixes: 0a6ce20c1564 ("ext4: verify orphan file size is not too big") Cc: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Message-ID: <20251007134936.7291-2-jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
5 days | ext4: detect invalid INLINE_DATA + EXTENTS flag combination | Deepanshu Kartikey
syzbot reported a BUG_ON in ext4_es_cache_extent() when opening a verity file on a corrupted ext4 filesystem mounted without a journal. The issue is that the filesystem has an inode with both the INLINE_DATA and EXTENTS flags set: EXT4-fs error (device loop0): ext4_cache_extents:545: inode #15: comm syz.0.17: corrupted extent tree: lblk 0 < prev 66 Investigation revealed that the inode has both flags set: DEBUG: inode 15 - flag=1, i_inline_off=164, has_inline=1, extents_flag=1 This is an invalid combination since an inode should have either: - INLINE_DATA: data stored directly in the inode - EXTENTS: data stored in extent-mapped blocks Having both flags causes ext4_has_inline_data() to return true, skipping extent tree validation in __ext4_iget(). The unvalidated out-of-order extents then trigger a BUG_ON in ext4_es_cache_extent() due to integer underflow when calculating hole sizes. Fix this by detecting this invalid flag combination early in ext4_iget() and rejecting the corrupted inode. Cc: stable@kernel.org Reported-and-tested-by: syzbot+038b7bf43423e132b308@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=038b7bf43423e132b308 Suggested-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Message-ID: <20250930112810.315095-1-kartikey406@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
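The flag check described above can be modelled in user space as follows. This is only a minimal sketch, not the kernel patch: the two flag values are the ones defined in fs/ext4/ext4.h, while the function name `has_conflicting_data_flags` is hypothetical and chosen for illustration.

```python
# Flag values as defined in the kernel's fs/ext4/ext4.h.
EXT4_EXTENTS_FL = 0x00080000      # inode data is mapped through extents
EXT4_INLINE_DATA_FL = 0x10000000  # inode data is stored inside the inode


def has_conflicting_data_flags(i_flags: int) -> bool:
    """Return True when both INLINE_DATA and EXTENTS are set.

    Hypothetical helper: it models the early rejection the commit
    message describes, not the exact code added to ext4_iget().
    """
    both = EXT4_EXTENTS_FL | EXT4_INLINE_DATA_FL
    return (i_flags & both) == both
```

An inode for which this returns True would be treated as corrupted and rejected before any extent caching is attempted.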
6 days | Merge tag 'ceph-for-6.18-rc1' of https://github.com/ceph/ceph-client | Linus Torvalds
Pull ceph updates from Ilya Dryomov: - some messenger improvements (Eric and Max) - address an issue (also affected userspace) of incorrect permissions being granted to users who have access to multiple different CephFS instances within the same cluster (Kotresh) - a bunch of assorted CephFS fixes (Slava) * tag 'ceph-for-6.18-rc1' of https://github.com/ceph/ceph-client: ceph: add bug tracking system info to MAINTAINERS ceph: fix multifs mds auth caps issue ceph: cleanup in ceph_alloc_readdir_reply_buffer() ceph: fix potential NULL dereference issue in ceph_fill_trace() libceph: add empty check to ceph_con_get_out_msg() libceph: pass the message pointer instead of loading con->out_msg libceph: make ceph_con_get_out_msg() return the message pointer ceph: fix potential race condition on operations with CEPH_I_ODIRECT flag ceph: refactor wake_up_bit() pattern of calling ceph: fix potential race condition in ceph_ioctl_lazyio() ceph: fix overflowed constant issue in ceph_do_objects_copy() ceph: fix wrong sizeof argument issue in register_session() ceph: add checking of wait_for_completion_killable() return value ceph: make ceph_start_io_*() killable libceph: Use HMAC-SHA256 library instead of crypto_shash
6 days | Merge tag 'v6.18-rc-part2-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 | Linus Torvalds
Pull more smb client updates from Steve French:
- fix i_size in fallocate
- two truncate fixes
- utime fix
- minor cleanups
- SMB1 fixes
- improve error check in read
- improve perf of copy file_range (copy_chunk)
* tag 'v6.18-rc-part2-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal version number
  cifs: Add comments for DeletePending assignments in open functions
  cifs: Add fallback code path for cifs_mkdir_setinfo()
  cifs: Allow fallback code in smb_set_file_info() also for directories
  cifs: Query EA $LXMOD in cifs_query_path_info() for WSL reparse points
  smb: client: remove cfids_invalidation_worker
  smb: client: remove redudant assignment in cifs_strict_fsync()
  smb: client: fix race with fallocate(2) and AIO+DIO
  smb: client: fix missing timestamp updates after utime(2)
  smb: client: fix missing timestamp updates after ftruncate(2)
  smb: client: fix missing timestamp updates with O_TRUNC
  cifs: Fix copy_to_iter return value check
  smb: client: batch SRV_COPYCHUNK entries to cut round trips
  smb: client: Omit an if branch in smb2_find_smb_tcon()
  smb: client: Return directly after a failed genlmsg_new() in cifs_swn_send_register_message()
  smb: client: Use common code in cifs_do_create()
  smb: client: Improve unlocking of a mutex in cifs_get_swn_reg()
  smb: client: Return a status code only as a constant in cifs_spnego_key_instantiate()
  smb: client: Use common code in cifs_lookup()
  smb: client: Reduce the scopes for a few variables in two functions
6 days | Merge tag 'block-6.18-20251009' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux | Linus Torvalds
Pull block fixes from Jens Axboe:
- Don't include __GFP_NOWARN for loop worker allocation, as it already uses GFP_NOWAIT which has __GFP_NOWARN set already
- Small series cleaning up the recent bio_iov_iter_get_pages() changes
- loop fix for leaking the backing reference file, if validation fails
- Update of a comment pertaining to disk/partition stat locking
* tag 'block-6.18-20251009' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  loop: remove redundant __GFP_NOWARN flag
  block: move bio_iov_iter_get_bdev_pages to block/fops.c
  iomap: open code bio_iov_iter_get_bdev_pages
  block: rename bio_iov_iter_get_pages_aligned to bio_iov_iter_get_pages
  block: remove bio_iov_iter_get_pages
  block: Update a comment of disk statistics
  loop: fix backing file reference leak on validation error
6 days | ext4: wait for ongoing I/O to complete before freeing blocks | Zhang Yi
When freeing metadata blocks in nojournal mode, ext4_forget() calls bforget() to clear the dirty flag on the buffer_head and remove associated mappings. This is acceptable if the metadata has not yet begun to be written back. However, if the write-back has already started but is not yet completed, ext4_forget() will have no effect. Subsequently, ext4_mb_clear_bb() will immediately return the block to the mb allocator. This block can then be reallocated immediately, potentially causing a data corruption issue.
Fix this by clearing the buffer's dirty flag and waiting for the ongoing I/O to complete, ensuring that no further writes to stale data will occur.
Fixes: 16e08b14a455 ("ext4: cleanup clean_bdev_aliases() calls")
Cc: stable@kernel.org
Reported-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Closes: https://lore.kernel.org/linux-ext4/a9417096-9549-4441-9878-b1955b899b4e@huaweicloud.com/
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20250916093337.3161016-3-yi.zhang@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
6 days | jbd2: ensure that all ongoing I/O complete before freeing blocks | Zhang Yi
When releasing file system metadata blocks in jbd2_journal_forget(), if this buffer has not yet been checkpointed, it may have already been written back, may currently be in the process of being written back, or may not yet have been written back. jbd2_journal_forget() calls jbd2_journal_try_remove_checkpoint() to check the buffer's status and add it to the current transaction if it has not been written back. This buffer can only be reallocated after the transaction is committed.
jbd2_journal_try_remove_checkpoint() attempts to lock the buffer and check its dirty status while holding the buffer lock. If the buffer has already been written back, everything proceeds normally. However, there are two issues.
First, the function returns immediately if the buffer is locked by the write-back process. It does not wait for the write-back to complete. Consequently, until the current transaction is committed and the block is reallocated, there is no guarantee that the I/O will complete. This means that ongoing I/O could write stale metadata to the newly allocated block, potentially corrupting data.
Second, the function unlocks the buffer as soon as it detects that the buffer is still dirty. If a concurrent write-back occurs immediately after this unlocking and before clear_buffer_dirty() is called in jbd2_journal_forget(), data corruption can theoretically still occur.
Although these two issues are unlikely to occur in practice, since ongoing metadata writeback I/O does not take this long to complete, it's better to explicitly ensure that all ongoing I/O operations are completed.
Fixes: 597599268e3b ("jbd2: discard dirty data when forgetting an un-journalled buffer")
Cc: stable@kernel.org
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20250916093337.3161016-2-yi.zhang@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
6 days | NFSD: Define a proc_layoutcommit for the FlexFiles layout type | Chuck Lever
Avoid a crash if a pNFS client should happen to send a LAYOUTCOMMIT operation on a FlexFiles layout. Reported-by: Robert Morris <rtm@csail.mit.edu> Closes: https://lore.kernel.org/linux-nfs/152f99b2-ba35-4dec-93a9-4690e625dccd@oracle.com/T/#t Cc: Thomas Haynes <loghyr@hammerspace.com> Cc: stable@vger.kernel.org Fixes: 9b9960a0ca47 ("nfsd: Add a super simple flex file server") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
6 days | cifs: update internal version number | Steve French
to 2.57 Signed-off-by: Steve French <stfrench@microsoft.com>
6 days | ovl: remove redundant IOCB_DIO_CALLER_COMP clearing | Seong-Gwang Heo
The backing_file_write_iter() function, which is called immediately after this code, already contains identical logic to clear the IOCB_DIO_CALLER_COMP flag along with the same explanatory comment. There is no need to duplicate this operation in the overlayfs code. Signed-off-by: Seong-Gwang Heo <heo@mykernel.net> Fixes: a6293b3e285c ("fs: factor out backing_file_{read,write}_iter() helpers") Acked-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
6 days | fs: return EOPNOTSUPP from file_setattr/file_getattr syscalls | Andrey Albershteyn
These syscalls call the vfs_fileattr_get/set functions, which return ENOIOCTLCMD if the filesystem doesn't support setting file attributes on an inode. For syscalls, EOPNOTSUPP is a more appropriate error to return.
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
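The error conversion this commit describes can be sketched as a small user-space model. ENOIOCTLCMD is the kernel-internal value 515 from include/linux/errno.h; the function name `syscall_return_code` is hypothetical and only illustrates the mapping, not the kernel's actual code path.

```python
import errno

# Kernel-internal error code (include/linux/errno.h); it should never
# be returned to user space from a regular syscall.
ENOIOCTLCMD = 515


def syscall_return_code(vfs_ret: int) -> int:
    """Translate -ENOIOCTLCMD into -EOPNOTSUPP, pass other values through.

    Hypothetical helper modelling what the file_getattr/file_setattr
    syscalls do with the result of vfs_fileattr_get/set.
    """
    if vfs_ret == -ENOIOCTLCMD:
        return -errno.EOPNOTSUPP
    return vfs_ret
```

Keeping the translation at the syscall boundary leaves the ioctl path free to handle ENOIOCTLCMD in its own way.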
6 days | Revert "fs: make vfs_fileattr_[get|set] return -EOPNOTSUPP" | Andrey Albershteyn
This reverts commit 474b155adf3927d2c944423045757b54aa1ca4de.
This patch caused a regression in ioctl_setflags(). Underlying filesystems use EOPNOTSUPP to indicate that a flag is not supported. This error also gets converted in ioctl_setflags(). Therefore, for unsupported flags the error changed from EOPNOTSUPP to ENOIOCTLCMD.
Link: https://lore.kernel.org/linux-xfs/a622643f-1585-40b0-9441-cf7ece176e83@kernel.org/
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
6 days | cifs: Add comments for DeletePending assignments in open functions | Pali Rohár
The DeletePending member is set to 0 in several places. Add comments explaining why 0 is the correct value.
Paths in the DELETE_PENDING state cannot be opened by new calls. So if a newly issued open for a path succeeds, the path cannot be in the DELETE_PENDING state.
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
6 days | cifs: Add fallback code path for cifs_mkdir_setinfo() | Pali Rohár
Use SMBSetInformation() as a fallback function (when CIFSSMBSetPathInfo() fails) which can set attributes on the directory, including changing the read-only attribute.
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
6 days | cifs: Allow fallback code in smb_set_file_info() also for directories | Pali Rohár
On NT systems, it is possible to do SMB open call also for directories. Open argument CREATE_NOT_DIR disallows opening directories. So in fallback code path in smb_set_file_info() remove CREATE_NOT_DIR restriction to allow it also for directories. Similar fallback is implemented also in CIFSSMBSetPathInfoFB() function and this function already allows to call operation for directories. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
6 days | cifs: Query EA $LXMOD in cifs_query_path_info() for WSL reparse points | Pali Rohár
EA $LXMOD is required for WSL non-symlink reparse points. Fixes: ef86ab131d91 ("cifs: Fix querying of WSL CHR and BLK reparse points over SMB1") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
7 days | Merge tag '9p-for-6.18-rc1' of https://github.com/martinetd/linux | Linus Torvalds
Pull 9p updates from Dominique Martinet: "A bunch of unrelated fixes: - polling fix for trans fd that ought to have been fixed otherwise back in March, but apparently came back somewhere else... - USB transport buffer overflow fix - Some dentry lifetime rework to handle metadata update for currently opened files in uncached mode, or inode type change in cached mode - a double-put on invalid flush found by syzbot - and finally /sys/fs/9p/caches not advancing buffer and overwriting itself for large contents Thanks to everyone involved!" * tag '9p-for-6.18-rc1' of https://github.com/martinetd/linux: 9p: sysfs_init: don't hardcode error to ENOMEM 9p: fix /sys/fs/9p/caches overwriting itself 9p: clean up comment typos 9p/trans_fd: p9_fd_request: kick rx thread if EPOLLIN net/9p: fix double req put in p9_fd_cancelled net/9p: Fix buffer overflow in USB transport layer fs/9p: Add p9_debug(VFS) in d_revalidate fs/9p: Invalidate dentry if inode type change detected in cached mode fs/9p: Refresh metadata in d_revalidate for uncached mode too
7 days | smb: client: remove cfids_invalidation_worker | Enzo Matsumiya
We can do the same cleanup on laundromat. On invalidate_all_cached_dirs(), run laundromat worker with 0 timeout and flush it for immediate + sync cleanup. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
7 days | smb: client: remove redudant assignment in cifs_strict_fsync() | Paulo Alcantara
Remove the redundant assignment of @rc, as it will be overwritten by the following cifs_file_flush() call.
Reported-by: Steve French <stfrench@microsoft.com>
Addresses-Coverity: 1665925
Fixes: 210627b0aca9 ("smb: client: fix missing timestamp updates with O_TRUNC")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
7 days | smb: client: fix race with fallocate(2) and AIO+DIO | Paulo Alcantara
AIO+DIO may extend the file size, hence we need to make sure ->i_size is stable across the entire fallocate(2) operation, otherwise it would become a truncate and then inode size reduced back down when it finishes. Fix this by calling netfs_wait_for_outstanding_io() right after acquiring ->i_rwsem exclusively in cifs_fallocate() and then guarantee a stable ->i_size across fallocate(2). Also call netfs_wait_for_outstanding_io() after truncating pagecache to avoid any potential races with writeback. Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Fixes: 210627b0aca9 ("smb: client: fix missing timestamp updates with O_TRUNC") Cc: Frank Sorenson <sorenson@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
7 days | smb: client: fix missing timestamp updates after utime(2) | Paulo Alcantara
Don't reuse open handle when changing timestamps to prevent the server from disabling automatic timestamp updates as per MS-FSA 2.1.4.17.
---8<---
import os
import time

filename = '/mnt/foo'

def print_stat(prefix):
    st = os.stat(filename)
    print(prefix, ': ', time.ctime(st.st_atime), time.ctime(st.st_ctime))

fd = os.open(filename, os.O_CREAT|os.O_TRUNC|os.O_WRONLY, 0o644)
print_stat('old')
os.utime(fd, None)
time.sleep(2)
os.write(fd, b'foo')
os.close(fd)
time.sleep(2)
print_stat('new')
---8<---
Before patch:
$ mount.cifs //srv/share /mnt -o ...
$ python3 run.py
old : Fri Oct 3 14:01:21 2025 Fri Oct 3 14:01:21 2025
new : Fri Oct 3 14:01:21 2025 Fri Oct 3 14:01:21 2025
After patch:
$ mount.cifs //srv/share /mnt -o ...
$ python3 run.py
old : Fri Oct 3 17:03:34 2025 Fri Oct 3 17:03:34 2025
new : Fri Oct 3 17:03:36 2025 Fri Oct 3 17:03:36 2025
Fixes: b6f2a0f89d7e ("cifs: for compound requests, use open handle if possible")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
7 days | smb: client: fix missing timestamp updates after ftruncate(2) | Paulo Alcantara
Mask off ATTR_MTIME|ATTR_CTIME bits on ATTR_SIZE (e.g. ftruncate(2)) to prevent the client from sending set info calls and then disabling automatic timestamp updates on server side as per MS-FSA 2.1.4.17.
---8<---
import os
import time

filename = '/mnt/foo'

def print_stat(prefix):
    st = os.stat(filename)
    print(prefix, ': ', time.ctime(st.st_atime), time.ctime(st.st_ctime))

fd = os.open(filename, os.O_CREAT|os.O_TRUNC|os.O_WRONLY, 0o644)
print_stat('old')
os.ftruncate(fd, 10)
time.sleep(2)
os.write(fd, b'foo')
os.close(fd)
time.sleep(2)
print_stat('new')
---8<---
Before patch:
$ mount.cifs //srv/share /mnt -o ...
$ python3 run.py
old : Fri Oct 3 13:47:03 2025 Fri Oct 3 13:47:03 2025
new : Fri Oct 3 13:47:00 2025 Fri Oct 3 13:47:03 2025
After patch:
$ mount.cifs //srv/share /mnt -o ...
$ python3 run.py
old : Fri Oct 3 13:48:39 2025 Fri Oct 3 13:48:39 2025
new : Fri Oct 3 13:48:41 2025 Fri Oct 3 13:48:41 2025
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
7 days | smb: client: fix missing timestamp updates with O_TRUNC | Paulo Alcantara
Don't call ->set_file_info() on open handle to prevent the server from stopping [cm]time updates automatically as per MS-FSA 2.1.4.17.
Fix this by checking for ATTR_OPEN bit earlier in cifs_setattr() to prevent ->set_file_info() from being called when opening a file with O_TRUNC. Do the truncation in ->open() instead.
This also saves two roundtrips when opening a file with O_TRUNC and there are currently no open handles to be reused.
Before patch:
$ mount.cifs //srv/share /mnt -o ...
$ cd /mnt
$ exec 3>foo; stat -c 'old: %z %y' foo; sleep 2; echo test >&3; exec 3>&-; sleep 2; stat -c 'new: %z %y' foo
old: 2025-10-03 13:26:23.151030500 -0300 2025-10-03 13:26:23.151030500 -0300
new: 2025-10-03 13:26:23.151030500 -0300 2025-10-03 13:26:23.151030500 -0300
After patch:
$ mount.cifs //srv/share /mnt -o ...
$ cd /mnt
$ exec 3>foo; stat -c 'old: %z %y' foo; sleep 2; echo test >&3; exec 3>&-; sleep 2; stat -c 'new: %z %y' foo
old: 2025-10-03 13:28:13.911933800 -0300 2025-10-03 13:28:13.911933800 -0300
new: 2025-10-03 13:28:26.647492700 -0300 2025-10-03 13:28:26.647492700 -0300
Reported-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
7 days | cifs: Fix copy_to_iter return value check | Fushuai Wang
The return value of copy_to_iter() function will never be negative, it is the number of bytes copied, or zero if nothing was copied. Update the check to treat 0 as an error, and return -1 in that case. Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list") Acked-by: Tom Talpey <tom@talpey.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Fushuai Wang <wangfushuai@baidu.com> Signed-off-by: Steve French <stfrench@microsoft.com>
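To see why a "less than zero" test can never fire for a function that returns a byte count, consider this user-space model. It is a sketch under assumptions: `copy_to_iter_model`, `old_check` and `new_check` are hypothetical names, and the real kernel code operates on struct iov_iter rather than Python byte buffers.

```python
def copy_to_iter_model(src: bytes, dst: bytearray, offset: int) -> int:
    """Copy as much of src as fits into dst at offset.

    Like copy_to_iter(), the result is the number of bytes copied and
    is therefore never negative.
    """
    room = max(len(dst) - offset, 0)
    n = min(len(src), room)
    dst[offset:offset + n] = src[:n]
    return n


def old_check(copied: int) -> int:
    """Pre-fix style check: the error branch is unreachable."""
    if copied < 0:
        return copied
    return copied


def new_check(copied: int) -> int:
    """Post-fix style check: zero bytes copied is reported as -1."""
    if copied == 0:
        return -1
    return copied
```

With the old check, a copy that transfers nothing is indistinguishable from success; the new check turns that case into an error.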
7 days | smb: client: batch SRV_COPYCHUNK entries to cut round trips | Henrique Carvalho
smb2_copychunk_range() used to send a single SRV_COPYCHUNK per SRV_COPYCHUNK_COPY IOCTL. Implement variable Chunks[] array in struct copychunk_ioctl and fill it with struct copychunk (MS-SMB2 2.2.31.1.1), bounded by server-advertised limits. This reduces the number of IOCTL requests for large copies. While we are at it, rename a couple variables to follow the terminology used in the specification. Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
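The batching idea can be sketched as a plain range-splitting routine. This is an illustrative model only: the function `batch_copy_chunks` and its three limit parameters are hypothetical stand-ins for the server-advertised limits mentioned above, and the real client builds struct copychunk entries inside an IOCTL request instead of Python tuples.

```python
from typing import List, Tuple

Chunk = Tuple[int, int, int]  # (source offset, target offset, length)


def batch_copy_chunks(src_off: int, dst_off: int, length: int,
                      max_chunks: int, max_chunk_bytes: int,
                      max_total_bytes: int) -> List[List[Chunk]]:
    """Split a copy range into requests, each holding several chunks.

    Each inner list models one IOCTL request. A request is closed when
    it reaches max_chunks entries or max_total_bytes bytes.
    """
    if min(max_chunks, max_chunk_bytes, max_total_bytes) <= 0:
        raise ValueError("limits must be positive")
    requests: List[List[Chunk]] = []
    while length > 0:
        batch: List[Chunk] = []
        budget = max_total_bytes  # bytes still allowed in this request
        while length > 0 and len(batch) < max_chunks and budget > 0:
            n = min(length, max_chunk_bytes, budget)
            batch.append((src_off, dst_off, n))
            src_off += n
            dst_off += n
            length -= n
            budget -= n
        requests.append(batch)
    return requests
```

Setting `max_chunks=1` reproduces the old one-chunk-per-request behaviour, which makes the reduction in round trips easy to compare.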
7 days | smb: client: Omit an if branch in smb2_find_smb_tcon() | Markus Elfring
Statements from an if branch and the end of this function implementation were equivalent. Thus delete duplicate source code. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steve French <stfrench@microsoft.com>
7 days | ceph: fix multifs mds auth caps issue | Kotresh HR
The mds auth caps check should also validate the fsname along with the associated caps. Not doing so would result in applying the mds auth caps of one fs on to the other fs in a multifs ceph cluster. The bug causes multiple issues w.r.t. user authentication; the following is one such example.
Steps to Reproduce (on vstart cluster):
1. Create two file systems in a cluster, say 'fsname1' and 'fsname2'
2. Authorize read only permission to the user 'client.usr' on fs 'fsname1'
   $ ceph fs authorize fsname1 client.usr / r
3. Authorize read and write permission to the same user 'client.usr' on fs 'fsname2'
   $ ceph fs authorize fsname2 client.usr / rw
4. Update the keyring
   $ ceph auth get client.usr >> ./keyring
With the above permissions for the user 'client.usr', the following is the expectation.
a. The 'client.usr' should be able to only read the contents and not allowed to create or delete files on file system 'fsname1'.
b. The 'client.usr' should be able to read/write on file system 'fsname2'.
But, with this bug, the 'client.usr' is allowed to read/write on file system 'fsname1'. See below.
5. Mount the file system 'fsname1' with the user 'client.usr'
   $ sudo bin/mount.ceph usr@.fsname1=/ /kmnt_fsname1_usr/
6. Try creating a file on file system 'fsname1' with user 'client.usr'. This should fail but passes with this bug.
   $ touch /kmnt_fsname1_usr/file1
7. Mount the file system 'fsname1' with the user 'client.admin' and create a file.
   $ sudo bin/mount.ceph admin@.fsname1=/ /kmnt_fsname1_admin
   $ echo "data" > /kmnt_fsname1_admin/admin_file1
8. Try removing an existing file on file system 'fsname1' with the user 'client.usr'. This shouldn't succeed but succeeds with the bug.
   $ rm -f /kmnt_fsname1_usr/admin_file1
For more information, please take a look at the corresponding mds/fuse patch and tests added by looking into the tracker mentioned below.
v2: Fix a possible null dereference in doutc
v3: Don't store fsname from mdsmap, validate against ceph_mount_options's fsname and use it
v4: Code refactor, better warning message and fix possible compiler warning
[ Slava.Dubeyko: "fsname check failed" -> "fsname mismatch" ]
Link: https://tracker.ceph.com/issues/72167
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
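The effect of the missing fsname validation can be illustrated with a toy model of the reproduction steps. Everything here is hypothetical: the `MdsCap` class, the `may_write` function and its `check_fsname` switch exist only for this sketch and do not mirror the kernel client's data structures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MdsCap:
    fsname: str     # file system the cap was granted for
    path: str       # path prefix the cap applies to
    writable: bool  # True for "rw", False for "r"


def may_write(caps: List[MdsCap], mounted_fsname: str, path: str,
              check_fsname: bool = True) -> bool:
    """Return True if any applicable cap grants write access.

    check_fsname=False models the buggy behaviour, where caps granted
    for another file system are applied to the mounted one.
    """
    for cap in caps:
        if check_fsname and cap.fsname != mounted_fsname:
            continue  # cap belongs to a different file system
        if path.startswith(cap.path) and cap.writable:
            return True
    return False


# Caps matching steps 2 and 3 of the reproduction recipe.
caps = [MdsCap("fsname1", "/", False), MdsCap("fsname2", "/", True)]
```

With `check_fsname=False` the rw cap for 'fsname2' leaks into 'fsname1', which is the symptom seen in steps 6 and 8.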
7 days | ceph: cleanup in ceph_alloc_readdir_reply_buffer() | Viacheslav Dubeyko
The Coverity Scan service has reported a potential issue in ceph_alloc_readdir_reply_buffer() [1]. If order could be -1, Coverity expects a problem in this logic:
    num_entries = (PAGE_SIZE << order) / size;
Technically speaking, this check [2] should prevent the order variable from becoming negative:
    if (!rinfo->dir_entries)
            return -ENOMEM;
However, the allocation logic requires some cleanup. This patch makes sure that the calculated byte count never exceeds ULONG_MAX before the get_order() calculation. It also adds a check for a negative order value, to guarantee that the second half of the function never operates on a negative order even if something goes wrong, or is changed, in the first half of the function's logic.
v2: Alex Markuze suggested adding the unlikely() macro to the introduced condition checks.
[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1198252
[2] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/ceph/mds_client.c#L2553
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
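The arithmetic behind the entry count can be sketched in user space. This is a simplified model under stated assumptions: PAGE_SIZE is taken as 4096 and ULONG_MAX as the 64-bit maximum, `readdir_entries` is a hypothetical name, and the retry loop that lowers the order on allocation failure in the real function is not modelled.

```python
PAGE_SIZE = 4096       # assumption: 4 KiB pages
ULONG_MAX = 2**64 - 1  # assumption: 64-bit unsigned long


def get_order(size: int) -> int:
    """Smallest order such that (PAGE_SIZE << order) >= size, for size >= 1."""
    pages = (size + PAGE_SIZE - 1) // PAGE_SIZE
    return max(pages - 1, 0).bit_length()


def readdir_entries(max_entries: int, entry_size: int) -> int:
    """Number of entries that fit in the buffer chosen for max_entries."""
    if max_entries <= 0 or entry_size <= 0:
        raise ValueError("arguments must be positive")
    total = max_entries * entry_size
    if total > ULONG_MAX:
        # Models the guard applied before the get_order() calculation.
        raise OverflowError("byte count exceeds ULONG_MAX")
    order = get_order(total)
    return (PAGE_SIZE << order) // entry_size
```

For example, 100 entries of 64 bytes need 6400 bytes, which rounds up to an order-1 buffer of 8192 bytes with room for 128 entries.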
7 days | ceph: fix potential NULL dereference issue in ceph_fill_trace() | Viacheslav Dubeyko
The Coverity Scan service has detected a potential dereference of an explicit NULL value in ceph_fill_trace() [1].
The variable in is declared at the beginning of ceph_fill_trace() [2]:
    struct inode *in = NULL;
However, the variable is only initialized under a condition [3]:
    if (rinfo->head->is_target) {
        <skipped>
        in = req->r_target_inode;
        <skipped>
    }
Potentially, if rinfo->head->is_target == FALSE, the in variable remains NULL and a NULL dereference could happen later in the ceph_fill_trace() logic [4,5]:
    else if ((req->r_op == CEPH_MDS_OP_LOOKUPSNAP ||
              req->r_op == CEPH_MDS_OP_MKSNAP) &&
             test_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags) &&
             !test_bit(CEPH_MDS_R_ABORTED, &req->r_req_flags)) {
        <skipped>
        ihold(in);
        err = splice_dentry(&req->r_dentry, in);
        if (err < 0)
            goto done;
    }
This patch adds a check of the in variable for NULL and returns the -EINVAL error code if it is NULL.
v2: Alex Markuze suggested adding the unlikely macro to the check condition.
[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1141197
[2] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/ceph/inode.c#L1522
[3] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/ceph/inode.c#L1629
[4] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/ceph/inode.c#L1745
[5] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/ceph/inode.c#L1777
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
7 days | ceph: fix potential race condition on operations with CEPH_I_ODIRECT flag | Viacheslav Dubeyko
The Coverity Scan service has detected potential race conditions in ceph_block_o_direct(), ceph_start_io_read(), ceph_block_buffered(), and ceph_start_io_direct() [1 - 4].
CIDs 1590942, 1590665, 1589664 and 1590377 contain this explanation: "The value of the shared data will be determined by the interleaving of thread execution. Thread shared data is accessed without holding an appropriate lock, possibly causing a race condition (CWE-366)".
This patch reworks the pattern of accessing and modifying the CEPH_I_ODIRECT flag by adding smp_mb__before_atomic() before reading the status of the flag and smp_mb__after_atomic() after setting or clearing it. It also reworks the usage of ci->i_ceph_lock in the ceph_block_o_direct(), ceph_start_io_read(), ceph_block_buffered(), and ceph_start_io_direct() methods.
[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1590942
[2] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1590665
[3] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1589664
[4] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1590377
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
7 days | ceph: refactor wake_up_bit() pattern of calling | Viacheslav Dubeyko
The wake_up_bit() is called in ceph_async_unlink_cb(), wake_async_create_waiters(), and ceph_finish_async_create(). It makes sense to switch on clear_bit() function, because it makes the code much cleaner and easier to understand. More important rework is the adding of smp_mb__after_atomic() memory barrier after the bit modification and before wake_up_bit() call. It can prevent potential race condition of accessing the modified bit in other threads. Luckily, clear_and_wake_up_bit() already implements the required functionality pattern:
    static inline void clear_and_wake_up_bit(int bit, unsigned long *word)
    {
        clear_bit_unlock(bit, word);
        /* See wake_up_bit() for which memory barrier you need to use. */
        smp_mb__after_atomic();
        wake_up_bit(word, bit);
    }
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
7 daysceph: fix potential race condition in ceph_ioctl_lazyio()Viacheslav Dubeyko
The Coverity Scan service has detected a potential race condition in ceph_ioctl_lazyio() [1]. CID 1591046 contains the explanation: "Check of thread-shared field evades lock acquisition (LOCK_EVASION). Thread1 sets fmode to a new value. Now the two threads have an inconsistent view of fmode and updates to fields correlated with fmode may be lost. The data guarded by this critical section may be read while in an inconsistent state or modified by multiple racing threads. In ceph_ioctl_lazyio: Checking the value of a thread-shared field outside of a locked region to determine if a locked operation involving that thread shared field has completed. (CWE-543)". The patch places the fi->fmode field access under ci->i_ceph_lock protection. It also introduces the is_file_already_lazy variable, which is set under the lock and checked later, outside the critical section. [1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1591046 Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
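The locking pattern described in this commit message can be illustrated with a small userspace sketch. This is not the Ceph code: struct file_info, FMODE_LAZY, and mark_lazy() are hypothetical names, and a pthread mutex stands in for ci->i_ceph_lock. The point is only that the shared field is both tested and updated inside the critical section, while later decisions use a local snapshot.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define FMODE_LAZY 0x1u /* hypothetical flag bit, not the Ceph value */

struct file_info {
    pthread_mutex_t lock; /* stands in for ci->i_ceph_lock */
    unsigned int fmode;   /* stands in for fi->fmode */
};

/* Returns true if this call switched the file to lazy mode. */
bool mark_lazy(struct file_info *fi)
{
    bool is_file_already_lazy;

    assert(fi != NULL);

    /* Check and update the shared field under the same lock. */
    pthread_mutex_lock(&fi->lock);
    is_file_already_lazy = (fi->fmode & FMODE_LAZY) != 0;
    if (!is_file_already_lazy)
        fi->fmode |= FMODE_LAZY;
    pthread_mutex_unlock(&fi->lock);

    /* Decide on follow-up work from the local snapshot, outside the lock. */
    return !is_file_already_lazy;
}
```

Because the test and the update happen under one lock acquisition, two racing callers cannot both observe the flag as unset.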
7 daysceph: fix overflowed constant issue in ceph_do_objects_copy()Viacheslav Dubeyko
The Coverity Scan service has detected an overflowed constant issue in ceph_do_objects_copy() [1]. The CID 1624308 defect contains the explanation: "The overflowed value due to arithmetic on constants is too small or unexpectedly negative, causing incorrect computations. Expression bytes, which is equal to -95, where ret is known to be equal to -95, underflows the type that receives it, an unsigned integer 64 bits wide. In ceph_do_objects_copy: Integer overflow occurs in arithmetic on constant operands (CWE-190)". The patch changes the type of the bytes variable from size_t to ssize_t so that it can hold negative values. [1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1624308 Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
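The effect of storing a negative error code in an unsigned variable can be shown with a short userspace sketch. The function names below are hypothetical and are not taken from ceph_do_objects_copy(); the value -95 is the one quoted in the Coverity report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h> /* ssize_t (POSIX) */

/* Buggy variant: an unsigned type cannot represent the error code. */
size_t stored_as_size_t(int ret)
{
    size_t bytes = (size_t)ret; /* -95 wraps to SIZE_MAX - 94 */
    return bytes;
}

/*
 * Fixed variant: a signed type keeps the error code intact.
 * Returns the new running total, or the negative error code.
 */
long long add_copied(long long total, int ret)
{
    ssize_t bytes = ret; /* -95 stays -95 */

    if (bytes < 0)
        return bytes;
    return total + bytes;
}
```

With size_t, a check such as `bytes < 0` can never be true, so the error would be treated as a very large byte count; with ssize_t the error path is taken as intended.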
7 daysceph: fix wrong sizeof argument issue in register_session()Viacheslav Dubeyko
The Coverity Scan service has detected the wrong sizeof argument in register_session() [1]. The CID 1598909 defect contains explanation: "The wrong sizeof value is used in an expression or as argument to a function. The result is an incorrect value that may cause unexpected program behaviors. In register_session: The sizeof operator is invoked on the wrong argument (CWE-569)". The patch introduces a ptr_size variable that is initialized by sizeof(struct ceph_mds_session *). And this variable is used instead of sizeof(void *) in the code. [1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1598909 Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
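A minimal sketch of the idea, assuming a table of session pointers is being allocated. struct session and alloc_session_table() are hypothetical stand-ins, not the code in register_session(). On common platforms both sizeof expressions yield the same number, so the change is mainly about naming the element type that the array really holds.

```c
#include <assert.h>
#include <stdlib.h>

struct session { int id; }; /* hypothetical stand-in for struct ceph_mds_session */

/* Allocate a zeroed table of `count` session pointers (caller frees it). */
struct session **alloc_session_table(size_t count)
{
    /* Name the element type explicitly instead of using sizeof(void *). */
    size_t ptr_size = sizeof(struct session *);

    assert(count > 0);
    return calloc(count, ptr_size);
}
```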
7 daysceph: add checking of wait_for_completion_killable() return valueViacheslav Dubeyko
The Coverity Scan service has detected a call to wait_for_completion_killable() without checking the return value in ceph_lock_wait_for_completion() [1]. The CID 1636232 defect contains the explanation: "If the function returns an error value, the error value may be mistaken for a normal value. In ceph_lock_wait_for_completion(): Value returned from a function is not checked for errors before being used. (CWE-252)". The patch adds a check of the wait_for_completion_killable() return value and returns the error code from ceph_lock_wait_for_completion(). [1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1636232 Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
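The shape of the fix can be sketched in userspace as follows. wait_killable_stub() is a hypothetical stand-in for wait_for_completion_killable(), which returns 0 on completion and -ERESTARTSYS when the wait is interrupted by a fatal signal; struct request and lock_wait() are likewise invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ERESTARTSYS 512 /* mirrors the kernel's ERESTARTSYS value */

struct request {
    int killed; /* simulates a fatal signal arriving during the wait */
    int result; /* only meaningful once the wait has completed */
};

/* Hypothetical stand-in: 0 on completion, negative if the wait was killed. */
int wait_killable_stub(const struct request *req)
{
    return req->killed ? -SKETCH_ERESTARTSYS : 0;
}

/* Propagate the wait error instead of reading a result that is not ready. */
int lock_wait(const struct request *req)
{
    int err;

    assert(req != NULL);
    err = wait_killable_stub(req);
    if (err)
        return err;
    return req->result;
}
```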
7 daysceph: make ceph_start_io_*() killableMax Kellermann
This allows killing processes that wait for a lock when one process is stuck waiting for the Ceph server. This is similar to the NFS commit 38a125b31504 ("fs/nfs/io: make nfs_start_io_*() killable"). [ idryomov: drop comment on include, formatting ] Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
9 dayssmb: client: Return directly after a failed genlmsg_new() in ↵Markus Elfring
cifs_swn_send_register_message() * Return directly after a call of the function “genlmsg_new” failed at the beginning. * Delete the label “fail” which became unnecessary with this refactoring. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steve French <stfrench@microsoft.com>
9 dayssmb: client: Use common code in cifs_do_create()Markus Elfring
Use a label once more so that a bit of common code can be better reused at the end of this function implementation. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Enzo Matsumiya <ematsumiya@suse.de> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
9 dayssmb: client: Improve unlocking of a mutex in cifs_get_swn_reg()Markus Elfring
Use two additional labels so that another bit of common code can be better reused at the end of this function implementation. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steve French <stfrench@microsoft.com>
9 dayssmb: client: Return a status code only as a constant in ↵Markus Elfring
cifs_spnego_key_instantiate() * Return a status code without storing it in an intermediate variable. * Delete the local variable “ret” and the label “error” which became unnecessary with this refactoring. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysiomap: open code bio_iov_iter_get_bdev_pagesChristoph Hellwig
Prepare for passing different alignments, and for retiring bio_iov_iter_get_bdev_pages as a global helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
9 dayscoredump: fix core_pattern input validationChristian Brauner
In be1e0283021e ("coredump: don't pointlessly check and spew warnings") we tried to fix input validation so it only happens during a write to core_pattern. This would avoid needlessly logging a lot of warnings during a read operation. However, the logic accidentally got inverted in this commit. Fix it so the input validation only happens on write and is skipped on read. Fixes: be1e0283021e ("coredump: don't pointlessly check and spew warnings") Fixes: 16195d2c7dd2 ("coredump: validate socket name as it is written") Reviewed-by: Jan Kara <jack@suse.cz> Reported-by: Yu Watanabe <watanabe.yu@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
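The intended control flow can be sketched as below. This is an illustration only: pattern_is_valid() uses a made-up rule (non-empty string) and handle_pattern() is a hypothetical handler, not the kernel's sysctl code. What matters is the direction of the `write` test, which is the part the message says was inverted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rule for the sketch: a pattern is valid if it is non-empty. */
bool pattern_is_valid(const char *pattern)
{
    return pattern[0] != '\0';
}

/* Returns 0 on success, -1 if a written pattern is rejected. */
int handle_pattern(bool write, const char *pattern)
{
    assert(pattern != NULL);

    /* Reads skip validation entirely, so they never emit warnings. */
    if (!write)
        return 0;

    /* Only a write can change the value, so only a write is validated. */
    return pattern_is_valid(pattern) ? 0 : -1;
}
```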
9 daysvfs: Don't leak disconnected dentries on umountJan Kara
When a user calls open_by_handle_at() on some inode that is not cached, we will create a disconnected dentry for it. If such a dentry is a directory, exportfs_decode_fh_raw() will then try to connect this dentry to the dentry tree through reconnect_path(). It may happen for various reasons (such as a corrupted fs or a race with rename) that the call to lookup_one_unlocked() in reconnect_one() will fail to find the dentry we are trying to reconnect and instead create a new dentry under the parent. Now this dentry will not be marked as disconnected although the parent still may well be disconnected (at least in case this inconsistency happened because the fs is corrupted and .. doesn't point to the real parent directory). This creates an inconsistency in disconnected flags but AFAICS it was mostly harmless. At least until commit f1ee616214cb ("VFS: don't keep disconnected dentries on d_anon") which removed adding of most disconnected dentries to the sb->s_anon list. Thus after this commit, cleanup of disconnected dentries implicitly relies on the fact that dput() will immediately reclaim such dentries. However, when some leaf dentry isn't marked as disconnected, as in the scenario described above, the reclaim doesn't happen and the dentries are "leaked". Memory reclaim can eventually reclaim them but otherwise they stay in memory and if umount comes first, we hit the infamous "Busy inodes after unmount" bug. Make sure all dentries created under a disconnected parent are marked as disconnected as well. Reported-by: syzbot+1d79ebe5383fc016cf07@syzkaller.appspotmail.com Fixes: f1ee616214cb ("VFS: don't keep disconnected dentries on d_anon") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
9 daysdax: skip read lock assertion for read-only filesystemsYuezhang Mo
The commit 168316db3583 ("dax: assert that i_rwsem is held exclusive for writes") added lock assertions to ensure proper locking in DAX operations. However, these assertions trigger false-positive lockdep warnings since the read lock is unnecessary on read-only filesystems (e.g., erofs). This patch skips the read lock assertion for read-only filesystems, eliminating the spurious warnings while maintaining the integrity checks for writable filesystems. Fixes: 168316db3583 ("dax: assert that i_rwsem is held exclusive for writes") Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Friendy Su <friendy.su@sony.com> Reviewed-by: Daniel Palmer <daniel.palmer@sony.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
9 daysfs: update comment in init_file()Zhou Yuhang
The f_count member in struct file has been replaced by f_ref, so update f_count to f_ref in the comment. Signed-off-by: Zhou Yuhang <zhouyuhang@kylinos.cn> Signed-off-by: Christian Brauner <brauner@kernel.org>
9 daysnsfs: handle inode number mismatches gracefully in file handlesDeepanshu Kartikey
Replace VFS_WARN_ON_ONCE() with graceful error handling when file handles contain inode numbers that don't match the actual namespace inode. This prevents userspace from triggering kernel warnings by providing malformed file handles to open_by_handle_at(). The issue occurs when userspace provides a file handle with valid namespace type and ID that successfully locates a namespace, but specifies an incorrect inode number. Previously, this would trigger VFS_WARN_ON_ONCE() when comparing the real inode number against the provided value. Since file handle data is user-controllable, inode number mismatches should be treated as invalid input rather than kernel consistency errors. Handle this case by returning NULL to indicate the file handle is invalid, rather than warning about what is essentially user input validation. Reported-by: syzbot+9eefe09bedd093f156c2@syzkaller.appspotmail.com Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
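The approach described above, treating a mismatching inode number as invalid input rather than as a kernel inconsistency, might look roughly like this in a userspace sketch. struct ns, struct handle, and decode_handle() are hypothetical and do not mirror the nsfs data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ns     { uint64_t id;    uint64_t ino; }; /* hypothetical namespace record */
struct handle { uint64_t ns_id; uint64_t ino; }; /* user-supplied, untrusted */

/* Returns the matching namespace, or NULL if the handle is invalid. */
const struct ns *decode_handle(const struct ns *table, size_t n,
                               const struct handle *h)
{
    assert(table != NULL && h != NULL);

    for (size_t i = 0; i < n; i++) {
        if (table[i].id != h->ns_id)
            continue;
        /* The ID matched but the inode number did not: this is bad */
        /* user input, so reject it quietly instead of warning.     */
        if (table[i].ino != h->ino)
            return NULL;
        return &table[i];
    }
    return NULL; /* no namespace with that ID */
}
```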