summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2026-02-28Merge tag 'v7.0rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - Two multichannel fixes - Locking fix for superblock flags - Fix to remove debug message that could log password - Cleanup fix for setting credentials * tag 'v7.0rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: Use snprintf in cifs_set_cifscreds smb: client: Don't log plaintext credentials in cifs_set_cifscreds smb: client: fix broken multichannel with krb5+signing smb: client: use atomic_t for mnt_cifs_flags smb: client: fix cifs_pick_channel when channels are equally loaded
2026-02-27nsfs: tighten permission checks for handle openingChristian Brauner
Even privileged services should not necessarily be able to see other privileged service's namespaces so they can't leak information to each other. Use may_see_all_namespaces() helper that centralizes this policy until the nstree adapts. Link: https://patch.msgid.link/20260226-work-visibility-fixes-v1-2-d2c2853313bd@kernel.org Fixes: 5222470b2fbb ("nsfs: support file handles") Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: stable@kernel.org # v6.18+ Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-27nsfs: tighten permission checks for ns iteration ioctlsChristian Brauner
Even privileged services should not necessarily be able to see other privileged service's namespaces so they can't leak information to each other. Use may_see_all_namespaces() helper that centralizes this policy until the nstree adapts. Link: https://patch.msgid.link/20260226-work-visibility-fixes-v1-1-d2c2853313bd@kernel.org Fixes: a1d220d9dafa ("nsfs: iterate through mount namespaces") Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: stable@kernel.org # v6.12+ Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-27NFS: Fix NFS KConfig typosAnna Schumaker
Two issues were noticed after the NFS v4.0 KConfig changes were merged upstream. First, the text of CONFIG_NFS_V4 should not encourage people to select it if they are unsure. Second, the new CONFIG_NFS_V4_0 option should default to "on" instead of "off" to avoid breaking people's setups if they are using NFS v4.0. Reported-by: Niklas Cassel <cassel@kernel.org> Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Fixes: 4e0269352534 ("NFS: Add a way to disable NFS v4.0 via KConfig") Fixes: 7537db24806f ("NFS: Merge CONFIG_NFS_V4_1 with CONFIG_NFS_V4") Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2026-02-27Merge tag 'xfs-fixes-7.0-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Carlos Maiolino: "Nothing reeeally stands out here: a few bug fixes, some refactoring to easily fit the bug fixes, and a couple cosmetic changes" * tag 'xfs-fixes-7.0-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: add static size checks for ioctl UABI xfs: remove duplicate static size checks xfs: Add comments for usages of some macros. xfs: Update lazy counters in xfs_growfs_rt_bmblock() xfs: Add a comment in xfs_log_sb() xfs: Fix xfs_last_rt_bmblock() xfs: don't report half-built inodes to fserror xfs: don't report metadata inodes to fserror xfs: fix potential pointer access race in xfs_healthmon_get xfs: fix xfs_group release bug in xfs_dax_notify_dev_failure xfs: fix xfs_group release bug in xfs_verify_report_losses xfs: fix copy-paste error in previous fix xfs: Fix error pointer dereference xfs: remove metafile inodes from the active inode stat xfs: cleanup inode counter stats xfs: fix code alignment issues in xfs_ondisk.c xfs: Replace &rtg->rtg_group with rtg_group() xfs: Refactoring the nagcount and delta calculation xfs: Replace ASSERT with XFS_IS_CORRUPT in xfs_rtcopy_summary()
2026-02-27smb: client: Use snprintf in cifs_set_cifscredsThorsten Blum
Replace unbounded sprintf() calls with the safer snprintf(). Avoid using magic numbers and use strlen() to calculate the key descriptor buffer size. Save the size in a local variable and reuse it for the bounded snprintf() calls. Remove CIFSCREDS_DESC_SIZE. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-02-26smb: client: Don't log plaintext credentials in cifs_set_cifscredsThorsten Blum
When debug logging is enabled, cifs_set_cifscreds() logs the key payload and exposes the plaintext username and password. Remove the debug log to avoid exposing credentials. Fixes: 8a8798a5ff90 ("cifs: fetch credentials out of keyring for non-krb5 auth multiuser mounts") Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-02-26smb: client: fix broken multichannel with krb5+signingPaulo Alcantara
When mounting a share with 'multichannel,max_channels=n,sec=krb5i', the client was duplicating signing key for all secondary channels, thus making the server fail all commands sent from secondary channels due to bad signatures. Every channel has its own signing key, so when establishing a new channel with krb5 auth, make sure to use the new session key as the derived key to generate channel's signing key in SMB2_auth_kerberos(). Repro: $ mount.cifs //srv/share /mnt -o multichannel,max_channels=4,sec=krb5i $ sleep 5 $ umount /mnt $ dmesg ... CIFS: VFS: sign fail cmd 0x5 message id 0x2 CIFS: VFS: \\srv SMB signature verification returned error = -13 CIFS: VFS: sign fail cmd 0x5 message id 0x2 CIFS: VFS: \\srv SMB signature verification returned error = -13 CIFS: VFS: sign fail cmd 0x4 message id 0x2 CIFS: VFS: \\srv SMB signature verification returned error = -13 Reported-by: Xiaoli Feng <xifeng@redhat.com> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Cc: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2026-02-26smb: client: use atomic_t for mnt_cifs_flagsPaulo Alcantara
Use atomic_t for cifs_sb_info::mnt_cifs_flags as it's currently accessed locklessly and may be changed concurrently in mount/remount and reconnect paths. Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2026-02-26Merge tag 'v7.0-rc1-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fixes from Steve French: - auth security improvement - fix potential buffer overflow in smbdirect negotiation * tag 'v7.0-rc1-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix signededness bug in smb_direct_prepare_negotiation() ksmbd: Compare MACs in constant time
2026-02-26Merge tag 'mm-hotfixes-stable-2026-02-26-14-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "12 hotfixes. 7 are cc:stable. 8 are for MM. All are singletons - please see the changelogs for details" * tag 'mm-hotfixes-stable-2026-02-26-14-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: MAINTAINERS: update Yosry Ahmed's email address mailmap: add entry for Daniele Alessandrelli mm: fix NULL NODE_DATA dereference for memoryless nodes on boot mm/tracing: rss_stat: ensure curr is false from kthread context mm/kfence: fix KASAN hardware tag faults during late enablement mm/damon/core: disallow non-power of two min_region_sz Squashfs: check metadata block offset is within range MAINTAINERS, mailmap: update e-mail address for Vlastimil Babka liveupdate: luo_file: remember retrieve() status mm: thp: deny THP for files on anonymous inodes mm: change vma_alloc_folio_noprof() macro to inline function mm/kfence: disable KFENCE upon KASAN HW tags enablement
2026-02-26btrfs: check block group lookup in remove_range_from_remap_tree()Mark Harmstone
Add a check in remove_range_from_remap_tree() after we call btrfs_lookup_block_group(), to check if it is NULL. This shouldn't happen, but if it does we at least get an error rather than a segfault. Reported-by: Chris Mason <clm@fb.com> Link: https://lore.kernel.org/linux-btrfs/20260125125129.2245240-1-clm@meta.com/ Fixes: 979e1dc3d69e ("btrfs: handle deletions from remapped block group") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: fix transaction handle leaks in btrfs_last_identity_remap_gone()Mark Harmstone
btrfs_abort_transaction(), unlike btrfs_commit_transaction(), doesn't also free the transaction handle. Fix the instances in btrfs_last_identity_remap_gone() where we're also leaking the transaction on abort. Reported-by: Chris Mason <clm@fb.com> Link: https://lore.kernel.org/linux-btrfs/20260125125129.2245240-1-clm@meta.com/ Fixes: 979e1dc3d69e ("btrfs: handle deletions from remapped block group") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: fix chunk map leak in btrfs_map_block() after btrfs_translate_remap()Mark Harmstone
If the call to btrfs_translate_remap() in btrfs_map_block() returns an error code, we were leaking the chunk map. Fix it by jumping to out rather than returning directly. Reported-by: Chris Mason <clm@fb.com> Link: https://lore.kernel.org/linux-btrfs/20260125125830.2352988-1-clm@meta.com/ Fixes: 18ba64992871 ("btrfs: redirect I/O for remapped block groups") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: fix chunk map leak in btrfs_map_block() after ↵Mark Harmstone
btrfs_chunk_map_num_copies() Fix a chunk map leak in btrfs_map_block(): if we return early with -EINVAL, we're not freeing the chunk map that we've just looked up. Fixes: 0ae653fbec2b ("btrfs: reduce chunk_map lookups in btrfs_map_block()") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: fix compat mask in error messages in btrfs_check_features()Mark Harmstone
Commit d7f67ac9a928 ("btrfs: relax block-group-tree feature dependency checks") introduced a regression when it comes to handling unsupported incompat or compat_ro flags. Beforehand we only printed the flags that we didn't recognize, afterwards we printed them all, which is less useful. Fix the error handling so it behaves like it used to. Fixes: d7f67ac9a928 ("btrfs: relax block-group-tree feature dependency checks") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: print correct subvol num if active swapfile prevents deletionMark Harmstone
Fix the error message in btrfs_delete_subvolume() if we can't delete a subvolume because it has an active swapfile: we were printing the number of the parent rather than the target. Fixes: 60021bd754c6 ("btrfs: prevent subvol with swapfile from being deleted") Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: fix warning in scrub_verify_one_metadata()Mark Harmstone
Commit b471965fdb2d ("btrfs: fix replace/scrub failure with metadata_uuid") fixed the comparison in scrub_verify_one_metadata() to use metadata_uuid rather than fsid, but left the warning as it was. Fix it so it matches what we're doing. Fixes: b471965fdb2d ("btrfs: fix replace/scrub failure with metadata_uuid") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: fix objectid value in error message in check_extent_data_ref()Mark Harmstone
Fix a copy-paste error in check_extent_data_ref(): we're printing root as in the message above, we should be printing objectid. Fixes: f333a3c7e832 ("btrfs: tree-checker: validate dref root and objectid") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: fix incorrect key offset in error message in check_dev_extent_item()Mark Harmstone
Fix the error message in check_dev_extent_item(), when an overlapping stripe is encountered. For dev extents, objectid is the disk number and offset the physical address, so prev_key->objectid should actually be prev_key->offset. (I can't take any credit for this one - this was discovered by Chris and his friend Claude.) Reported-by: Chris Mason <clm@fb.com> Fixes: 008e2512dc56 ("btrfs: tree-checker: add dev extent item checks") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: fix error message order of parameters in btrfs_delete_delayed_dir_index()Mark Harmstone
Fix the error message in btrfs_delete_delayed_dir_index() if __btrfs_add_delayed_item() fails: the message says root, inode, index, error, but we're actually passing index, root, inode, error. Fixes: adc1ef55dc04 ("btrfs: add details to error messages at btrfs_delete_delayed_dir_index()") Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: don't commit the super block when unmounting a shutdown filesystemMiquel Sabaté Solà
When unmounting a filesystem we will try, among many other things, to commit the super block. On a filesystem that was shutdown, though, this will always fail with -EROFS as writes are forbidden on this context; and an error will be reported. Don't commit the super block on this situation, which should be fine as the filesystem is frozen before shutdown and, therefore, it should be at a consistent state. Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: free pages on error in btrfs_uring_read_extent()Miquel Sabaté Solà
In this function the 'pages' object is never freed in the hopes that it is picked up by btrfs_uring_read_finished() whenever that executes in the future. But that's just the happy path. Along the way previous allocations might have gone wrong, or we might not get -EIOCBQUEUED from btrfs_encoded_read_regular_fill_pages(). In all these cases, we go to a cleanup section that frees all memory allocated by this function without assuming any deferred execution, and this also needs to happen for the 'pages' allocation. Fixes: 34310c442e17 ("btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl)") Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: fix referenced/exclusive check in squota_check_parent_usage()Boris Burkov
We compared rfer_cmpr against excl_cmpr_sum instead of rfer_cmpr_sum which is confusing. I expect that rfer_cmpr == excl_cmpr in squota, but it is much better to be consistent in case of any surprises or bugs. Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-btrfs/cover.1764796022.git.boris@bur.io/T/#mccb231643ffd290b44a010d4419474d280be5537 Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: remove pointless WARN_ON() in cache_save_setup()Filipe Manana
This WARN_ON(ret) is never executed since the previous if statement makes us jump into the 'out_put' label when ret is not zero. The existing transaction abort inside the if statement also gives us a stack trace, so we don't need to move the WARN_ON(ret) into the if statement either. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: convert log messages to error level in btrfs_replay_log()Filipe Manana
We are logging messages as warnings but they should really have an error level instead, as if the respective conditions are met the mount will fail. So convert them to error level and also log the error code returned by read_tree_block(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: remove btrfs_handle_fs_error() after failure to recover log treesFilipe Manana
There is no need to call btrfs_handle_fs_error() (which we are trying to deprecate) if we fail to recover log trees: 1) Such a failure results in failing the mount immediately; 2) If the recovery started a transaction before failing, it has already aborted the transaction down in the call chain. So remove the btrfs_handle_fs_error() call, replace it with an error message and assert that the FS is in error state (so that no partial updates are committed due to a transaction that was not aborted). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: remove redundant warning message in btrfs_check_uuid_tree()Filipe Manana
If we fail to start the UUID rescan kthread, btrfs_check_uuid_tree() logs an error message and returns the error to the single caller, open_ctree(). This however is redundant since the caller already logs an error message, which is also more informative since it logs the error code. Some remove the warning message from btrfs_check_uuid_tree() as it doesn't add any value. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: change warning messages to error level in open_ctree()Filipe Manana
Failure to read the fs root results in a mount error, but we log a warning message. Same goes for checking the UUID tree, an error results in a mount failure but we log a warning message. Change the level of the logged messages from warning to error. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: fix a double release on reserved extents in cow_one_range()Qu Wenruo
[BUG] Commit c28214bde6da ("btrfs: refactor the main loop of cow_file_range()") refactored the handling of COWing one range. However it changed the error handling of the reserved extent. The old cleanup looks like this: out_drop_extent_cache: btrfs_drop_extent_map_range(inode, start, start + cur_alloc_size - 1, false); out_reserve: btrfs_dec_block_group_reservations(fs_info, ins.objectid); btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, true); [...] clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV; page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK; /* * For the range (2). If we reserved an extent for our delalloc range * (or a subrange) and failed to create the respective ordered extent, * then it means that when we reserved the extent we decremented the * extent's size from the data space_info's bytes_may_use counter and * incremented the space_info's bytes_reserved counter by the same * amount. We must make sure extent_clear_unlock_delalloc() does not try * to decrement again the data space_info's bytes_may_use counter, * therefore we do not pass it the flag EXTENT_CLEAR_DATA_RESV. */ if (cur_alloc_size) { extent_clear_unlock_delalloc(inode, start, start + cur_alloc_size - 1, locked_folio, &cached, clear_bits, page_ops); btrfs_qgroup_free_data(inode, NULL, start, cur_alloc_size, NULL); } Which only calls EXTENT_CLEAR_META_RESV. As the reserved extent is properly handled by btrfs_free_reserved_extent(). However the new cleanup is: extent_clear_unlock_delalloc(inode, file_offset, cur_end, locked_folio, cached, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_DO_ACCOUNTING, PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK); btrfs_qgroup_free_data(inode, NULL, file_offset, cur_len, NULL); btrfs_dec_block_group_reservations(fs_info, ins->objectid); btrfs_free_reserved_extent(fs_info, ins->objectid, ins->offset, true); The flag EXTENT_DO_ACCOUNTING implies both EXTENT_CLEAR_META_RESV and EXTENT_CLEAR_DATA_RESV, which will release the bytes_may_use, which later btrfs_free_reserved_extent() will do again, causing incorrect double release (and may underflow bytes_may_use). [FIX] Use EXTENT_CLEAR_META_RESV to replace EXTENT_DO_ACCOUNTING, and add back the comments on why we only use EXTENT_CLEAR_META_RESV. Fixes: c28214bde6da ("btrfs: refactor the main loop of cow_file_range()") Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-btrfs/20260208184920.1102719-1-clm@meta.com/ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26btrfs: handle discard errors in in btrfs_finish_extent_commit()Jingkai Tan
Coverity (ID: 1226842) reported that the return value of btrfs_discard_extent() is assigned to ret but is immediately overwritten by unpin_extent_range() without being checked. Use the same error handling that is done later in the same function. Signed-off-by: Jingkai Tan <contact@jingk.ai> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-26netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequenceDavid Howells
Fix netfslib such that when it's making an unbuffered or DIO write, to make sure that it sends each subrequest strictly sequentially, waiting till the previous one is 'committed' before sending the next so that we don't have pieces landing out of order and potentially leaving a hole if an error occurs (ENOSPC for example). This is done by copying in just those bits of issuing, collecting and retrying subrequests that are necessary to do one subrequest at a time. Retrying, in particular, is simpler because if the current subrequest needs retrying, the source iterator can just be copied again and the subrequest prepped and issued again without needing to be concerned about whether it needs merging with the previous or next in the sequence. Note that the issuing loop waits for a subrequest to complete right after issuing it, but this wait could be moved elsewhere allowing preparatory steps to be performed whilst the subrequest is in progress. In particular, once content encryption is available in netfslib, that could be done whilst waiting, as could cleanup of buffers that have been completed. Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/58526.1772112753@warthog.procyon.org.uk Tested-by: Steve French <sfrench@samba.org> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-26iomap: don't report direct-io retries to fserrorDarrick J. Wong
iomap's directio implementation has two magic errno codes that it uses to signal callers -- ENOTBLK tells the filesystem that it should retry a write with the pagecache; and EAGAIN tells the caller that pagecache flushing or invalidation failed and that it should try again. Neither of these indicate data loss, so let's not report them. Fixes: a9d573ee88af98 ("iomap: report file I/O errors to the VFS") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://patch.msgid.link/20260224154637.GD2390381@frogsfrogsfrogs Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-25Merge tag 'erofs-for-7.0-rc2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: - Do not share the page cache if the real @aops differs - Fix the incomplete condition for interlaced plain extents - Get rid of more unnecessary #ifdefs * tag 'erofs-for-7.0-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix interlaced plain identification for encoded extents erofs: remove more unnecessary #ifdefs erofs: allow sharing page cache with the same aops only
2026-02-25Merge tag 'vfs-7.0-rc2.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix an uninitialized variable in file_getattr(). The flags_valid field wasn't initialized before calling vfs_fileattr_get(), triggering KMSAN uninit-value reports in fuse - Fix writeback wakeup and logging timeouts when DETECT_HUNG_TASK is not enabled. sysctl_hung_task_timeout_secs is 0 in that case causing spurious "waiting for writeback completion for more than 1 seconds" warnings - Fix a null-ptr-deref in do_statmount() when the mount is internal - Add missing kernel-doc description for the @private parameter in iomap_readahead() - Fix mount namespace creation to hold namespace_sem across the mount copy in create_new_namespace(). The previous drop-and-reacquire pattern was fragile and failed to clean up mount propagation links if the real rootfs was a shared or dependent mount - Fix /proc mount iteration where m->index wasn't updated when m->show() overflows, causing a restart to repeatedly show the same mount entry in a rapidly expanding mount table - Return EFSCORRUPTED instead of ENOSPC in minix_new_inode() when the inode number is out of range - Fix unshare(2) when CLONE_NEWNS is set and current->fs isn't shared. copy_mnt_ns() received the live fs_struct so if a subsequent namespace creation failed the rollback would leave pwd and root pointing to detached mounts. Always allocate a new fs_struct when CLONE_NEWNS is requested - fserror bug fixes: - Remove the unused fsnotify_sb_error() helper now that all callers have been converted to fserror_report_metadata - Fix a lockdep splat in fserror_report() where igrab() takes inode::i_lock which can be held in IRQ context. Replace igrab() with a direct i_count bump since filesystems should not report inodes that are about to be freed or not yet exposed - Handle error pointer in procfs for try_lookup_noperm() - Fix an integer overflow in ep_loop_check_proc() where recursive calls returning INT_MAX would overflow when +1 is added, breaking the recursion depth check - Fix a misleading break in pidfs * tag 'vfs-7.0-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: pidfs: avoid misleading break eventpoll: Fix integer overflow in ep_loop_check_proc() proc: Fix pointer error dereference fserror: fix lockdep complaint when igrabbing inode fsnotify: drop unused helper unshare: fix unshare_fs() handling minix: Correct errno in minix_new_inode namespace: fix proc mount iteration mount: hold namespace_sem across copy in create_new_namespace() iomap: Describe @private in iomap_readahead() statmount: Fix the null-ptr-deref in do_statmount() writeback: Fix wakeup and logging timeouts for !DETECT_HUNG_TASK fs: init flags_valid before calling vfs_fileattr_get
2026-02-25xfs: add static size checks for ioctl UABIWilfred Mallawa
The ioctl structures in libxfs/xfs_fs.h are missing static size checks. It is useful to have static size checks for these structures as adding new fields to them could cause issues (e.g. extra padding that may be inserted by the compiler). So add these checks to xfs/xfs_ondisk.h. Due to different padding/alignment requirements across different architectures, to avoid build failures, some structures are ommited from the size checks. For example, structures with "compat_" definitions in xfs/xfs_ioctl32.h are ommited. Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: remove duplicate static size checksWilfred Mallawa
In libxfs/xfs_ondisk.h, remove some duplicate entries of XFS_CHECK_STRUCT_SIZE(). Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: Add comments for usages of some macros.Nirjhar Roy (IBM)
Add comments explaining when to use XFS_IS_CORRUPT() and ASSERT() Suggested-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: Update lazy counters in xfs_growfs_rt_bmblock()Nirjhar Roy (IBM)
Update lazy counters in xfs_growfs_rt_bmblock() similar to the way it is done xfs_growfs_data_private(). This is because the lazy counters are not always updated and synching the counters will avoid inconsistencies between frexents and rtextents(total realtime extent count). This will be more useful once realtime shrink is implemented as this will prevent some transient state to occur where frexents might be greater than total rtextents. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: Add a comment in xfs_log_sb()Nirjhar Roy (IBM)
Add a comment explaining why the sb_frextents are updated outside the if (xfs_has_lazycount(mp) check even though it is a lazycounter. RT groups are supported only in v5 filesystems which always have lazycounter enabled - so putting it inside the if(xfs_has_lazycount(mp) check is redundant. Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: Fix xfs_last_rt_bmblock()Nirjhar Roy (IBM)
Bug description: If the size of the last rtgroup i.e, the rtg passed to xfs_last_rt_bmblock() is such that the last rtextent falls in 0th word offset of a bmblock of the bitmap file tracking this (last) rtgroup, then in that case xfs_last_rt_bmblock() incorrectly returns the next bmblock number instead of the current/last used bmblock number. When xfs_last_rt_bmblock() incorrectly returns the next bmblock, the loop to grow/modify the bmblocks in xfs_growfs_rtg() doesn't execute and xfs_growfs basically does a nop in certain cases. xfs_growfs will do a nop when the new size of the fs will have the same number of rtgroups i.e, we are only growing the last rtgroup. Reproduce: $ mkfs.xfs -m metadir=0 -r rtdev=/dev/loop1 /dev/loop0 \ -r size=32769b -f $ mount -o rtdev=/dev/loop1 /dev/loop0 /mnt/scratch $ xfs_growfs -R $(( 32769 + 1 )) /mnt/scratch $ xfs_info /mnt/scratch | grep rtextents $ # We can see that rtextents hasn't changed Fix: Fix this by returning the current/last used bmblock when the last rtgroup size is not a multiple xfs_rtbitmap_rtx_per_rbmblock() and the next bmblock when the rtgroup size is a multiple of xfs_rtbitmap_rtx_per_rbmblock() i.e, the existing blocks are completely used up. Also, I have renamed xfs_last_rt_bmblock() to xfs_last_rt_bmblock_to_extend() to signify that this function returns the bmblock number to extend and NOT always the last used bmblock number. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: don't report half-built inodes to fserrorDarrick J. Wong
Sam Sun apparently found a syzbot way to fuzz a filesystem such that xfs_iget_cache_miss would free the inode before the fserror code could catch up. Frustratingly he doesn't use the syzbot dashboard so there's no C reproducer and not even a full error report, so I'm guessing that: Inodes that are being constructed or torn down inside XFS are not visible to the VFS. They should never be reported to fserror. Also, any inode that has been freshly allocated in _cache_miss should be marked INEW immediately because, well, it's an incompletely constructed inode that isn't yet visible to the VFS. Reported-by: Sam Sun <samsun1006219@gmail.com> Fixes: 5eb4cb18e445d0 ("xfs: convey metadata health events to the health monitor") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: don't report metadata inodes to fserrorDarrick J. Wong
Internal metadata inodes are not exposed to userspace programs, so it makes no sense to pass them to the fserror functions (aka fsnotify). Instead, report metadata file problems as general filesystem corruption. Fixes: 5eb4cb18e445d0 ("xfs: convey metadata health events to the health monitor") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: fix potential pointer access race in xfs_healthmon_getDarrick J. Wong
Pankaj Raghav asks about this code in xfs_healthmon_get: hm = mp->m_healthmon; if (hm && !refcount_inc_not_zero(&hm->ref)) hm = NULL; rcu_read_unlock(); return hm; (slightly edited to compress a mailing list thread) "Nit: Should we do a READ_ONCE(mp->m_healthmon) here to avoid any compiler tricks that can result in an undefined behaviour? I am not sure if I am being paranoid here. "So this is my understanding: RCU guarantees that we get a valid object (actual data of m_healthmon) but does not guarantee the compiler will not reread the pointer between checking if hm is !NULL and accessing the pointer as we are doing it lockless. "So just a barrier() call in rcu_read_lock is enough to make sure this doesn't happen and probably adding a READ_ONCE() is not needed?" After some initial confusion I concluded that he's correct. The compiler could very well eliminate the hm variable in favor of walking the pointers twice, turning the code into: if (mp->m_healthmon && !refcount_inc_not_zero(&mp->m_healthmon->ref)) If this happens, then xfs_healthmon_detach can sneak in between the two sides of the && expression and set mp->m_healthmon to NULL, and thereby cause a null pointer dereference crash. Fix this by using the rcu pointer assignment and dereference functions, which ensure that the proper reordering barriers are in place. Practically speaking, gcc seems to allocate an actual variable for hm and only reads mp->m_healthmon once (as intended), but we ought to be more explicit about requiring this. Reported-by: Pankaj Raghav <pankaj.raghav@linux.dev> Fixes: a48373e7d35a89f6f ("xfs: start creating infrastructure for health monitoring") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: fix xfs_group release bug in xfs_dax_notify_dev_failureDarrick J. Wong
Chris Mason reports that his AI tools noticed that we were using xfs_perag_put and xfs_group_put to release the group reference returned by xfs_group_next_range. However, the iterator function returns an object with an active refcount, which means that we must use the correct function to release the active refcount, which is _rele. Cc: <stable@vger.kernel.org> # v6.0 Fixes: 6f643c57d57c56 ("xfs: implement ->notify_failure() for XFS") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: fix xfs_group release bug in xfs_verify_report_lossesDarrick J. Wong
Chris Mason reports that his AI tools noticed that we were using xfs_perag_put and xfs_group_put to release the group reference returned by xfs_group_next_range. However, the iterator function returns an object with an active refcount, which means that we must use the correct function to release the active refcount, which is _rele. Fixes: b8accfd65d31f2 ("xfs: add media verification ioctl") Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-xfs/20260206030527.2506821-1-clm@meta.com/ Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: fix copy-paste error in previous fixDarrick J. Wong
Chris Mason noticed that there is a copy-paste error in a recent change to xrep_dir_teardown that nulls out pointers after freeing the resources. Fixes: ba408d299a3bb3c ("xfs: only call xf{array,blob}_destroy if we have a valid pointer") Link: https://lore.kernel.org/linux-xfs/20260205194211.2307232-1-clm@meta.com/ Reported-by: Chris Mason <clm@meta.com> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: Fix error pointer dereferenceEthan Tidmore
The function try_lookup_noperm() can return an error pointer and is not checked for one. Add checks for error pointer in xrep_adoption_check_dcache() and xrep_adoption_zap_dcache(). Detected by Smatch: fs/xfs/scrub/orphanage.c:449 xrep_adoption_check_dcache() error: 'd_child' dereferencing possible ERR_PTR() fs/xfs/scrub/orphanage.c:485 xrep_adoption_zap_dcache() error: 'd_child' dereferencing possible ERR_PTR() Fixes: 73597e3e42b4 ("xfs: ensure dentry consistency when the orphanage adopts a file") Cc: stable@vger.kernel.org # v6.16 Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: remove metafile inodes from the active inode statChristoph Hellwig
The active inode (or active vnode until recently) stat can get much larger than expected on file systems with a lot of metafile inodes like zoned file systems on SMR hard disks with 10.000s of rtg rmap inodes. Remove all metafile inodes from the active counter to make it more useful to track actual workloads and add a separate counter for active metafile inodes. This fixes xfs/177 on SMR hard drives. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25xfs: cleanup inode counter statsChristoph Hellwig
Most of them are unused, so mark them as such. Give the remaining ones names that match their use instead of the historic IRIX ones based on vnodes. Note that the names are purely internal to the XFS code, the user interface is based on section names and arrays of counters. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>