| Age | Commit message (Collapse) | Author |
|
Syzbot creates a fuzzed record where xfs_has_logv2() but the
xlog_rec_header h_version != XLOG_VERSION_2. This causes a
KASAN: slab-out-of-bounds read in xlog_do_recovery_pass() ->
xlog_recover_process() -> xlog_cksum().
Fix by adding a check to xlog_valid_rec_header() to abort journal
recovery if the xlog_rec_header h_version does not match the super
block log version.
A file system with a version 2 log will only ever set
XLOG_VERSION_2 in its headers (and v1 will only ever set V_1), so if
there is any mismatch, either the journal or the superblock has been
corrupted and therefore we abort processing with a -EFSCORRUPTED error
immediately.
Also, refactor the structure of the validity checks for better
readability. At the default error level (LOW), XFS_IS_CORRUPT() emits
the condition that failed, the file and line number it is
located at, then dumps the stack. This gives us everything we need
to know about the failure if we do a single validity check per
XFS_IS_CORRUPT().
Reported-by: syzbot+9f6d080dece587cfdd4c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9f6d080dece587cfdd4c
Tested-by: syzbot+9f6d080dece587cfdd4c@syzkaller.appspotmail.com
Fixes: 45cf976008dd ("xfs: fix log recovery buffer allocation for the legacy h_size fixup")
Signed-off-by: Raphael Pinsonneault-Thibeault <rpthibeault@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
This patch fixes the sparse warning issue in inode.c
by adding static to hfsplus_symlink_inode_operations
and hfsplus_special_inode_operations declarations.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601291957.bunRsD8R-lkp@intel.com/
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20260129195442.594884-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
|
|
ext4 and f2fs are largely using the same code to read a page full
of Merkle tree blocks from the page cache, and the upcoming xfs
fsverity support would add another copy.
Move the ext4 code to fs/verity/ and use it in f2fs as well. For f2fs
this removes the previous f2fs-specific error injection, but otherwise
the behavior remains unchanged.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Link: https://lore.kernel.org/r/20260128152630.627409-7-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
This will make an iomap implementation of the method easier.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> # btrfs
Link: https://lore.kernel.org/r/20260128152630.627409-6-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Use IS_ENABLED to disable this code, leading to a slight size reduction:
text data bss dec hex filename
25709 2412 24 28145 6df1 fs/f2fs/compress.o.old
25198 2252 24 27474 6b52 fs/f2fs/compress.o
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20260128152630.627409-5-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Use IS_ENABLED to disable this code, leading to a slight size reduction:
text data bss dec hex filename
4121 376 16 4513 11a1 fs/ext4/readpage.o.old
4030 328 16 4374 1116 fs/ext4/readpage.o
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Link: https://lore.kernel.org/r/20260128152630.627409-4-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Free the fsverity_info directly in clear_inode instead of requiring file
systems to handle it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> # btrfs
Link: https://lore.kernel.org/r/20260128152630.627409-3-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add the check to reject truncates of fsverity files directly to
setattr_prepare instead of requiring the file system to handle it.
Besides removing boilerplate code, this also fixes the complete lack of
such check in btrfs.
Fixes: 146054090b08 ("btrfs: initial fsverity support")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Link: https://lore.kernel.org/r/20260128152630.627409-2-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix leaked folio refcount on s390x when using hw zlib compression
acceleration
- remove own threshold from ->writepages() which could collide with
cgroup limits and lead to a deadlock when metadadata are not written
because the amount is under the internal limit
* tag 'for-6.19-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zlib: fix the folio leak on S390 hardware acceleration
btrfs: do not strictly require dirty metadata threshold for metadata writepages
|
|
pidfs and nsfs recently gained support for encode/decode of file handles
via name_to_handle_at(2)/open_by_handle_at(2).
These special kernel filesystems have custom ->open() and ->permission()
export methods, which nfsd does not respect and it was never meant to be
used for exporting those filesystems by nfsd.
Therefore, do not allow nfsd to export filesystems with custom ->open()
or ->permission() methods.
Fixes: b3caba8f7a34a ("pidfs: implement file handle support")
Fixes: 5222470b2fbb3 ("nsfs: support file handles")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://patch.msgid.link/20260129100212.49727-3-amir73il@gmail.com
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
fs-verity previously had debug printk but it was removed. This patch
adds trace points to similar places, as a better alternative.
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: fix formatting]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Link: https://patch.msgid.link/20260126115658.27656-3-aalbersh@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
fs-verity introduced inode flag for inodes with enabled fs-verity on
them. This patch adds FS_XFLAG_VERITY file attribute which can be
retrieved with FS_IOC_FSGETXATTR ioctl() and file_getattr() syscall.
This flag is read-only and can not be set with corresponding set ioctl()
and file_setattr(). The FS_IOC_SETFLAGS requires file to be opened for
writing which is not allowed for verity files. The FS_IOC_FSSETXATTR and
file_setattr() clears this flag from the user input.
As this is now common flag for both flag interfaces (flags/xflags) add
it to overlapping flags list to exclude it from overwrite.
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://patch.msgid.link/20260126115658.27656-2-aalbersh@kernel.org
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Now that infrastructure for NFSv4 POSIX draft ACL has been added
to NFSD, it should be safe to advertise support to NFS clients.
NFSD_SUPPATTR_EXCLCREAT_WORD2 includes NFSv4.2-only attributes,
but version filtering occurs via nfsd_suppattrs[] before this
mask is applied, ensuring pre-4.2 clients never see unsupported
attributes.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The POSIX ACL extension to NFSv4 enables clients to set access
and default ACLs via FATTR4_POSIX_ACCESS_ACL and
FATTR4_POSIX_DEFAULT_ACL attributes. Integration of these
attributes into SETATTR processing requires wiring them through
the nfsd_attrs structure and ensuring proper cleanup on all
code paths.
This patch connects the na_pacl and na_dpacl fields in
nfsd_attrs to the decoded ACL pointers from the NFSv4 SETATTR
decoder. Ownership of these ACL references transfers to attrs
immediately after initialization, with the decoder's pointers
cleared to NULL. This transfer ensures nfsd_attrs_free()
releases the ACLs on normal completion, while new error paths
call posix_acl_release() directly when cleanup occurs before
nfsd_attrs_free() runs.
Early returns in the nfsd4_setattr() function gain conversions
to goto statements that branch to proper cleanup handlers.
Error paths before fh_want_write() branch to out_err for ACL
release only; paths after fh_want_write() use the existing out
label for full cleanup via nfsd_attrs_free().
The patch adds mutual exclusion between NFSv4 ACLs (sa_acl) and
POSIX ACLs. Setting both types simultaneously returns
nfserr_inval because these ACL models cannot coexist on the
same file object.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
NFSv4.2 clients can specify POSIX draft ACLs when creating file
objects via OPEN(CREATE) and CREATE operations. The previous patch
added POSIX ACL support to the NFSv4 SETATTR operation for modifying
existing objects, but file creation follows different code paths
that also require POSIX ACL handling.
This patch integrates POSIX ACL support into nfsd4_create() and
nfsd4_create_file(). Ownership of the decoded ACL pointers
(op_dpacl, op_pacl, cr_dpacl, cr_pacl) transfers to the nfsd_attrs
structure immediately, with the original fields cleared to NULL.
This transfer ensures nfsd_attrs_free() releases the ACLs upon
completion while preventing double-free on error paths.
Mutual exclusion between NFSv4 ACLs and POSIX ACLs is enforced:
setting both op_acl and op_dpacl/op_pacl simultaneously returns
nfserr_inval. Errors during ACL application clear the corresponding
bits in the result bitmask (fattr->bmval), signaling partial
completion to the client.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The POSIX ACL extension to NFSv4 defines FATTR4_POSIX_ACCESS_ACL
and FATTR4_POSIX_DEFAULT_ACL for setting access and default ACLs
via CREATE, OPEN, and SETATTR operations. This patch adds the XDR
decoders for those attributes.
The nfsd4_decode_fattr4() function gains two additional parameters
for receiving decoded POSIX ACLs. CREATE, OPEN, and SETATTR
decoders pass pointers to these new parameters, enabling clients
to set POSIX ACLs during object creation or modification.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Support for FATTR4_POSIX_ACCESS_ACL and FATTR4_POSIX_DEFAULT_ACL
attributes in subsequent patches allows clients to set both ACL
types simultaneously during SETATTR and file creation. Each ACL
type can succeed or fail independently, requiring the server to
clear individual attribute bits in the reply bitmap when one
fails while the other succeeds.
The existing na_aclerr field cannot distinguish which ACL type
encountered an error. Separate error fields (na_paclerr for
access ACLs, na_dpaclerr for default ACLs) enable the server to
report per-ACL-type failures accurately.
This refactoring also adds validation previously absent: default
ACL processing rejects non-directory targets with EINVAL and
passes NULL to set_posix_acl() when a_count is zero to delete
the ACL. Access ACL processing rejects zero a_count with EINVAL
for ACL_SCOPE_FILE_SYSTEM semantics (the only scope currently
supported).
The changes preserve compatibility with existing NFSv4 ACL code.
NFSv4 ACL conversion (nfs4_acl_nfsv4_to_posix()) never produces
POSIX ACLs with a_count == 0, so the new validation logic only
affects future POSIX ACL attribute handling.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Section 9.3 of draft-ietf-nfsv4-posix-acls-00 prohibits use of
the POSIX ACL attributes with VERIFY and NVERIFY operations: the
server MUST reply NFS4ERR_INVAL when a client attempts this.
Beyond the protocol requirement, comparison of POSIX draft ACLs
via (N)VERIFY presents an implementation challenge. Clients are
not required to order the ACEs within a POSIX ACL in any
particular way, making reliable attribute comparison impractical.
Return nfserr_inval when the client requests FATTR4_POSIX_ACCESS_ACL
or FATTR4_POSIX_DEFAULT_ACL in a VERIFY or NVERIFY operation.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The POSIX ACL extension to NFSv4 defines FATTR4_POSIX_ACCESS_ACL
for retrieving the access ACL of a file or directory. This patch
adds the XDR encoder for that attribute.
The access ACL is retrieved via get_inode_acl(). If the filesystem
provides no explicit access ACL, one is synthesized from the file
mode via posix_acl_from_mode(). Each entry is encoded as a
posixace4: tag type, permission bits, and principal name (empty
for structural entries, resolved via idmapping for USER/GROUP
entries).
Unlike the default ACL encoder which applies only to directories,
this encoder handles all inode types and ensures an access ACL is
always available through mode-based synthesis when needed.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The POSIX ACL extension to NFSv4 defines FATTR4_POSIX_DEFAULT_ACL
for retrieving a directory's default ACL. This patch adds the
XDR encoder for that attribute.
For directories, the default ACL is retrieved via get_inode_acl()
and each entry is encoded as a posixace4: tag type, permission
bits, and principal name (empty for structural entries like
USER_OBJ/GROUP_OBJ/MASK/OTHER, resolved via idmapping for
USER/GROUP entries).
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The FATTR4_ACL_TRUEFORM_SCOPE attribute indicates the granularity at
which the ACL model can vary: per file object, per file system, or
uniformly across the entire server.
In Linux, the ACL model is determined by the SB_POSIXACL superblock
flag, which applies uniformly to all files within a file system.
Different exported file systems can have different ACL models, but
individual files cannot differ from their containing file system.
ACL_SCOPE_FILE_SYSTEM accurately reflects this behavior.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Mapping between NFSv4 ACLs and POSIX ACLs is semantically imprecise:
a client that sets an NFSv4 ACL and reads it back may see a different
ACL than it wrote. The proposed NFSv4 POSIX ACL extension introduces
the FATTR4_ACL_TRUEFORM attribute, which reports whether a file
object stores its access control permissions using NFSv4 ACLs or
POSIX ACLs.
A client aware of this extension can avoid lossy translation by
requesting and setting ACLs in their native format.
When NFSD is built with CONFIG_NFSD_V4_POSIX_ACLS, report
ACL_MODEL_POSIX_DRAFT for file objects on file systems with the
SB_POSIXACL flag set, and ACL_MODEL_NONE otherwise. Linux file
systems do not store NFSv4 ACLs natively, so ACL_MODEL_NFS4 is never
reported.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The language definition was extracted from the new
draft-ietf-nfsv4-posix-acls specification. This ensures good
constant and type name alignment between the spec and the Linux
kernel source code, and brings in some basic XDR utilities for
handling NFSv4 POSIX draft ACLs.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
A new IETF draft extends NFSv4.2 with POSIX ACL attributes:
https://www.ietf.org/archive/id/draft-ietf-nfsv4-posix-acls-00.txt
This draft has not yet been ratified. A build-time configuration
option allows developers and distributors to decide whether to
expose this experimental protocol extension to NFSv4 clients. The
option is disabled by default to prevent unintended deployment of
potentially unstable protocol features in production environments.
This approach mirrors the existing NFSD_V4_DELEG_TIMESTAMPS option,
which gates another experimental NFSv4 extension based on an
unratified IETF draft.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
XDR specification files can contain lines prefixed with '%' that
pass through unchanged to generated output. Traditional rpcgen
removes the '%' and emits the remainder verbatim, allowing direct
insertion of C includes, pragma directives, or other language-
specific content into the generated code.
Until now, xdrgen silently discarded these lines during parsing.
This prevented specifications from including necessary headers or
preprocessor directives that might be required for the generated
code to compile correctly.
The grammar now captures pass-through lines instead of ignoring
them. A new AST node type represents pass-through content, and
the AST transformer strips the leading '%' character. Definition
and source generators emit pass-through content in document order,
preserving the original placement within the specification.
This brings xdrgen closer to feature parity with traditional
rpcgen while maintaining the existing document-order processing
model.
Existing generated xdrgen source code has been regenerated.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
If the folio does not have an iomap_folio_state (ifs) attached and the
folio gets read in by the filesystem's IO helper, folio_end_read() will
be called by the IO helper at any time. For this case, we cannot access
the folio after dispatching it to the IO helper, eg subsequent accesses
like
if (ctx->cur_folio &&
offset_in_folio(ctx->cur_folio, iter->pos) == 0) {
are incorrect.
Fix these invalid accesses by invalidating ctx->cur_folio if all bytes
of the folio have been read in by the IO helper.
This allows us to also remove the +1 bias added for the ifs case. The
bias was previously added to ensure that if all bytes are read in, the
IO helper does not end the read on the folio until iomap has decremented
the bias.
Fixes: b2f35ac4146d ("iomap: add caller-provided callbacks for read and readahead")
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://patch.msgid.link/20260126224107.2182262-2-joannelkoong@gmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Fix minor spelling and indentation errors in the
documentation comments.
Signed-off-by: Chelsy Ratnawat <chelsyratnawat2001@gmail.com>
Link: https://patch.msgid.link/20260128143150.3674284-1-chelsyratnawat2001@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Fix checkpatch.pl errors regarding missing spaces around assignment
operators in xfs_alloc_compute_diff() and xfs_alloc_fixup_trees().
Adhere to the Linux kernel coding style by ensuring spaces are placed
around the assignment operator '='.
Signed-off-by: Shin Seong-jun <shinsj4653@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
xfs_zone_gc_space_available only has one caller left, so fold it into
that. Reorder the checks so that the cheaper scratch_available check
is done first.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
When scratch_head wraps back to 0 and scratch_tail is also 0 because no
I/O has completed yet, the ring buffer could be mistaken for empty.
Fix this by introducing a separate scratch_available member in
struct xfs_zone_gc_data. This actually ends up simplifying the code as
well.
Reported-by: Chris Mason <clm@meta.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
Use the helpers in place of all the different open coded variants.
This makes the code more readable and robust.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://patch.msgid.link/20260128132406.23768-4-amir73il@gmail.com
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Rename the helper is_dot_dotdot() into the name_ namespace
and add complementary helpers to check for dot and dotdot
names individually.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://patch.msgid.link/20260128132406.23768-3-amir73il@gmail.com
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Syzbot reported a KMSAN uninit-value issue in ovl_fill_real.
This iusse's call chain is:
__do_sys_getdents64()
-> iterate_dir()
...
-> ext4_readdir()
-> fscrypt_fname_alloc_buffer() // alloc
-> fscrypt_fname_disk_to_usr // write without tail '\0'
-> dir_emit()
-> ovl_fill_real() // read by strcmp()
The string is used to store the decrypted directory entry name for an
encrypted inode. As shown in the call chain, fscrypt_fname_disk_to_usr()
write it without null-terminate. However, ovl_fill_real() uses strcmp() to
compare the name against "..", which assumes a null-terminated string and
may trigger a KMSAN uninit-value warning when the buffer tail contains
uninit data.
Reported-by: syzbot+d130f98b2c265fae5297@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d130f98b2c265fae5297
Fixes: 4edb83bb1041 ("ovl: constant d_ino for non-merge dirs")
Signed-off-by: Qing Wang <wangqing7171@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://patch.msgid.link/20260128132406.23768-2-amir73il@gmail.com
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The xfstests' test-case generic/062 fails to execute
correctly:
FSTYP -- hfsplus
PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.15.0-rc4+ #8 SMP PREEMPT_DYNAMIC Thu May 1 16:43:22 PDT 2025
MKFS_OPTIONS -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch
generic/062 - output mismatch (see xfstests-dev/results//generic/062.out.bad)
The generic/062 test tries to set and get xattrs for various types
of objects (regular file, folder, block device, character
device, pipe, etc) with the goal to check that xattr operations
works correctly for all possible types of file system objects.
But current HFS+ implementation somehow hasn't support of
xattr operatioons for the case of block device, character
device, and pipe objects. Also, it has not completely correct
set of operations for the case symlinks.
This patch implements proper declaration of xattrs operations
hfsplus_special_inode_operations and hfsplus_symlink_inode_operations.
Also, it slightly corrects the logic of hfsplus_listxattr()
method.
sudo ./check generic/062
FSTYP -- hfsplus
PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.19.0-rc1+ #59 SMP PREEMPT_DYNAMIC Mon Jan 19 16:26:21 PST 2026
MKFS_OPTIONS -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch
generic/062 20s ... 20s
Ran: generic/062
Passed all 1 tests
[1] https://github.com/hfs-linux-kernel/hfs-linux-kernel/issues/93
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20260120041937.3450928-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
|
|
Async COPY operations hold copy stateids that represent NFSv4 state.
Thus, when the NFS server administrator revokes all NFSv4 state for
a filesystem via the unlock_fs interface, ongoing async COPY
operations referencing that filesystem must also be canceled.
Each cancelled copy triggers a CB_OFFLOAD callback carrying the
NFS4ERR_ADMIN_REVOKED status to notify the client that the server
terminated the operation.
The static drop_client() function is renamed to nfsd4_put_client()
and exported. The function must be exported because both the new
nfsd4_cancel_copy_by_sb() and the CB_OFFLOAD release callback in
nfs4proc.c need to release client references.
Reviewed-by: NeilBrown <neil@brown.name>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Add a new "min_threads" variable to the nfsd_net, along with the
corresponding netlink interface, to set that value from userland.
Pass that value to svc_set_pool_threads() and svc_set_num_threads().
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
nfsd() is changed to pass a timeout to svc_recv() when there is a min
number of threads set, and to handle error returns from it:
In the case of -ETIMEDOUT, if the service mutex can be taken (via
trylock), the thread becomes an RQ_VICTIM so that it will exit,
providing that the actual number of threads is above pool->sp_nrthrmin.
In the case of -EBUSY, if the actual number of threads is below
pool->sp_nrthrmax, it will attempt to start a new thread. This attempt
is gated on a new SP_TASK_STARTING pool flag that serializes thread
creation attempts within a pool, and further by mutex_trylock().
Neil says: "I think we want memory pressure to be able to push a thread
into returning -ETIMEDOUT. That can come later."
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
To dynamically adjust the thread count, nfsd requires some information
about how busy things are.
Change svc_recv() to take a timeout value, and then allow the wait for
work to time out if it's set. If a timeout is not defined, then the
schedule will be set to MAX_SCHEDULE_TIMEOUT. If the task waits for the
full timeout, then have it return -ETIMEDOUT to the caller.
If it wakes up, finds that there is more work and that no threads are
available, then attempt to set SP_TASK_STARTING. If wasn't already set,
have the task return -EBUSY to cue to the caller that the service could
use more threads.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Add a new pool->sp_nrthrmin field to track the minimum number of threads
in a pool. Add min_threads parameters to both svc_set_num_threads() and
svc_set_pool_threads(). If min_threads is non-zero and less than the
max, svc_set_num_threads() will ensure that the number of running
threads is between the min and the max.
If the min is 0 or greater than the max, then it is ignored, and the
maximum number of threads will be started, and never spun down.
For now, the min_threads is always 0, but a later patch will pass the
proper value through from nfsd.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
svc_set_num_threads() will set the number of running threads for a given
pool. If the pool argument is set to NULL however, it will distribute
the threads among all of the pools evenly.
These divergent codepaths complicate the move to dynamic threading.
Simplify the API by splitting these two cases into different helpers:
Add a new svc_set_pool_threads() function that sets the number of
threads in a single, given pool. Modify svc_set_num_threads() to
distribute the threads evenly between all of the pools and then call
svc_set_pool_threads() for each.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Fix direct I/O on devices that require stable pages by asking iomap
to bounce buffer. To support this, ioends are used for direct reads
in this case to provide a user context for copying data back from the
bounce buffer.
This fixes qemu when used on devices using T10 protection information
and probably other cases like iSCSI using data digests.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a new flag that request bounce buffering for direct I/O. This is
needed to provide the stable pages requirement requested by devices
that need to calculate checksums or parity over the data and allows
file systems to properly work with things like T10 protection
information. The implementation just calls out to the new bio bounce
buffering helpers to allocate a bounce buffer, which is used for
I/O and to copy to/from it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Support using the ioend structure to defer I/O completion for direct
reads in addition to writes. This requires a check for the operation
to not merge reads and writes in iomap_ioend_can_merge. This support
will be used for bounce buffered direct I/O reads that need to copy
data back to the user address space on read completion.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Match the more descriptive iov_iter terminology instead of encoding
what we do with them for reads only.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There are good arguments for processing the user completions ASAP vs.
freeing resources ASAP, but freeing the bio first here removes potential
use after free hazards when checking flags, and will simplify the
upcoming bounce buffer support.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Refactor the two per-bio completion handlers to share common code using
a new helper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Factor out a separate helper that builds and submits a single bio.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use iov_iter_count to check if we need to continue as that just reads
a field in the iov_iter, and only use bio_iov_vecs_to_alloc to calculate
the actual number of vectors to allocate for the bio.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The "if (dio->error)" in iomap_dio_bio_iter exists to stop submitting
more bios when a completion already return an error. Commit cfe057f7db1f
("iomap_dio_actor(): fix iov_iter bugs") made it revert the iov by
"copied", which is very wrong given that we've already consumed that
range and submitted a bio for it.
Fixes: cfe057f7db1f ("iomap_dio_actor(): fix iov_iter bugs")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-7.0-merge
xfs: syzbot fixes for online fsck [3/3]
Fix various syzbot complaints about scrub that Jiaming Zhang found.
With a bit of luck, this should all go splendidly.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|