diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2026-04-20 10:44:02 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2026-04-20 10:44:02 -0700 |
| commit | 36d179fd6bea35698d53444b7bd3025fa3788266 (patch) | |
| tree | 399ec5f312f24b6d6eadc60125960e1d53216086 /Documentation | |
| parent | c1f49dea2b8f335813d3b348fd39117fb8efb428 (diff) | |
| parent | d644a698de12e996778657f65a4608299368e138 (diff) | |
Merge tag 'nfsd-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
- filehandle signing to defend against filehandle-guessing attacks
(Benjamin Coddington)
The server now appends a SipHash-2-4 MAC to each filehandle when
the new "sign_fh" export option is enabled. NFSD then verifies
filehandles received from clients against the expected MAC;
mismatches return NFS error STALE
- convert the entire NLMv4 server-side XDR layer from hand-written C to
xdrgen-generated code, spanning roughly thirty patches (Chuck Lever)
XDR functions are generally boilerplate code and are easy to get
wrong. The goals of this conversion are improved memory safety, lower
maintenance burden, and groundwork for eventual Rust code generation
for these functions.
- improve pNFS block/SCSI layout robustness with two related changes
(Dai Ngo)
SCSI persistent reservation fencing is now tracked per client and
per device via an xarray, to avoid both redundant preempt operations
on devices already fenced and a potential NFSD deadlock when all nfsd
threads are waiting for a layout return.
- scalability and infrastructure improvements
Sincere thanks to all contributors, reviewers, testers, and bug
reporters who participated in the v7.1 NFSD development cycle.
* tag 'nfsd-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (83 commits)
NFSD: Docs: clean up pnfs server timeout docs
nfsd: fix comment typo in nfsxdr
nfsd: fix comment typo in nfs3xdr
NFSD: convert callback RPC program to per-net namespace
NFSD: use per-operation statidx for callback procedures
svcrdma: Use contiguous pages for RDMA Read sink buffers
SUNRPC: Add svc_rqst_page_release() helper
SUNRPC: xdr.h: fix all kernel-doc warnings
svcrdma: Factor out WR chain linking into helper
svcrdma: Add Write chunk WRs to the RPC's Send WR chain
svcrdma: Clean up use of rdma->sc_pd->device
svcrdma: Clean up use of rdma->sc_pd->device in Receive paths
svcrdma: Add fair queuing for Send Queue access
SUNRPC: Optimize rq_respages allocation in svc_alloc_arg
SUNRPC: Track consumed rq_pages entries
svcrdma: preserve rq_next_page in svc_rdma_save_io_pages
SUNRPC: Handle NULL entries in svc_rqst_release_pages
SUNRPC: Allocate a separate Reply page array
SUNRPC: Tighten bounds checking in svc_rqst_replace_page
NFSD: Sign filehandles
...
Diffstat (limited to 'Documentation')
| -rw-r--r-- | Documentation/admin-guide/nfs/pnfs-block-server.rst | 30 | ||||
| -rw-r--r-- | Documentation/admin-guide/nfs/pnfs-scsi-server.rst | 31 | ||||
| -rw-r--r-- | Documentation/filesystems/locking.rst | 2 | ||||
| -rw-r--r-- | Documentation/filesystems/nfs/exporting.rst | 85 | ||||
| -rw-r--r-- | Documentation/netlink/specs/nfsd.yaml | 6 | ||||
| -rw-r--r-- | Documentation/sunrpc/xdr/nlm4.x | 211 |
6 files changed, 365 insertions, 0 deletions
diff --git a/Documentation/admin-guide/nfs/pnfs-block-server.rst b/Documentation/admin-guide/nfs/pnfs-block-server.rst index 20fe9f5117fe..7667dd2e17f1 100644 --- a/Documentation/admin-guide/nfs/pnfs-block-server.rst +++ b/Documentation/admin-guide/nfs/pnfs-block-server.rst @@ -40,3 +40,33 @@ how to translate the device into a serial number from SCSI EVPD 0x80:: echo "fencing client ${CLIENT} serial ${EVPD}" >> /var/log/pnfsd-fence.log EOF + +If the nfsd server needs to fence a non-responding client and the +fencing operation fails, the server logs a warning message in the +system log with the following format: + + FENCE failed client[IP_address] clid[#n] device[dev_name] + + where: + + - IP_address: refers to the IP address of the affected client. + - #n: indicates the unique client identifier. + - dev_name: specifies the name of the block device related + to the fencing attempt. + +The server will repeatedly retry the operation indefinitely. During +this time, access to the affected file is restricted for all other +clients. This is to prevent potential data corruption if multiple +clients access the same file simultaneously. + +To restore access to the affected file for other clients, the admin +needs to take the following actions: + + - shutdown or power off the client being fenced. + - manually expire the client to release all its state on the server:: + + echo 'expire' > /proc/fs/nfsd/clients/clid/ctl + + where: + + - clid: is the unique client identifier displayed in the system log. diff --git a/Documentation/admin-guide/nfs/pnfs-scsi-server.rst b/Documentation/admin-guide/nfs/pnfs-scsi-server.rst index b2eec2288329..b202508d281d 100644 --- a/Documentation/admin-guide/nfs/pnfs-scsi-server.rst +++ b/Documentation/admin-guide/nfs/pnfs-scsi-server.rst @@ -22,3 +22,34 @@ option and the underlying SCSI device support persistent reservations. On the client make sure the kernel has the CONFIG_PNFS_BLOCK option enabled, and the file system is mounted using the NFSv4.1 protocol version (mount -o vers=4.1). + +If the nfsd server needs to fence a non-responding client and the +fencing operation fails, the server logs a warning message in the +system log with the following format: + + FENCE failed client[IP_address] clid[#n] device[dev_name] + + where: + + - IP_address: refers to the IP address of the affected client. + - #n: indicates the unique client identifier. + - dev_name: specifies the name of the block device related + to the fencing attempt. + +The server will repeatedly retry the operation indefinitely. During +this time, access to the affected file is restricted for all other +clients. This is to prevent potential data corruption if multiple +clients access the same file simultaneously. + +To restore access to the affected file for other clients, the admin +needs to take the following actions: + + - shutdown or power off the client being fenced. + - manually expire the client to release all its state on the server:: + + echo 'expire' > /proc/fs/nfsd/clients/clid/ctl + + where: + + - clid: is the unique client identifier displayed in the system log. + diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index 8025df6e6499..8421ea21bd35 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -398,6 +398,7 @@ prototypes:: bool (*lm_breaker_owns_lease)(struct file_lock *); bool (*lm_lock_expirable)(struct file_lock *); void (*lm_expire_lock)(void); + bool (*lm_breaker_timedout)(struct file_lease *); locking rules: @@ -412,6 +413,7 @@ lm_breaker_owns_lease: yes no no lm_lock_expirable yes no no lm_expire_lock no no yes lm_open_conflict yes no no +lm_breaker_timedout yes no no ====================== ============= ================= ========= buffer_head diff --git a/Documentation/filesystems/nfs/exporting.rst b/Documentation/filesystems/nfs/exporting.rst index a01d9b9b5bc3..4aa59b0bf253 100644 --- a/Documentation/filesystems/nfs/exporting.rst +++ b/Documentation/filesystems/nfs/exporting.rst @@ -206,3 +206,88 @@ following flags are defined: all of an inode's dirty data on last close. Exports that behave this way should set EXPORT_OP_FLUSH_ON_CLOSE so that NFSD knows to skip waiting for writeback when closing such files. + +Signed Filehandles +------------------ + +To protect against filehandle guessing attacks, the Linux NFS server can be +configured to sign filehandles with a Message Authentication Code (MAC). + +Standard NFS filehandles are often predictable. If an attacker can guess +a valid filehandle for a file they do not have permission to access via +directory traversal, they may be able to bypass path-based permissions +(though they still remain subject to inode-level permissions). + +Signed filehandles prevent this by appending a MAC to the filehandle +before it is sent to the client. Upon receiving a filehandle back from a +client, the server re-calculates the MAC using its internal key and +verifies it against the one provided. If the signatures do not match, +the server treats the filehandle as invalid (returning NFS[34]ERR_STALE). + +Note that signing filehandles provides integrity and authenticity but +not confidentiality. The contents of the filehandle remain visible to +the client; they simply cannot be forged or modified. + +Configuration +~~~~~~~~~~~~~ + +To enable signed filehandles, the administrator must provide a signing +key to the kernel and enable the "sign_fh" export option. + +1. Providing a Key + The signing key is managed via the nfsd netlink interface. This key + is per-network-namespace and must be set before any exports using + "sign_fh" become active. + +2. Export Options + The feature is controlled on a per-export basis in /etc/exports: + + sign_fh + Enables signing for all filehandles generated under this export. + + no_sign_fh + (Default) Disables signing. + +Key Management and Rotation +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The security of this mechanism relies entirely on the secrecy of the +signing key. + +Initial Setup: + The key should be generated using a high-quality random source and + loaded early in the boot process or during the nfs-server startup + sequence. + +Changing Keys: + If a key is changed while clients have active mounts, existing + filehandles held by those clients will become invalid, resulting in + "Stale file handle" errors on the client side. + +Safe Rotation: + Currently, there is no mechanism for "graceful" key rotation + (maintaining multiple valid keys). Changing the key is an atomic + operation that immediately invalidates all previous signatures. + +Transitioning Exports +~~~~~~~~~~~~~~~~~~~~~ + +When adding or removing the "sign_fh" flag from an active export, the +following behaviors should be expected: + ++-------------------+---------------------------------------------------+ +| Change | Result for Existing Clients | ++===================+===================================================+ +| Adding sign_fh | Clients holding unsigned filehandles will find | +| | them rejected, as the server now expects a | +| | signature. | ++-------------------+---------------------------------------------------+ +| Removing sign_fh | Clients holding signed filehandles will find them | +| | rejected, as the server now expects the | +| | filehandle to end at its traditional boundary | +| | without a MAC. | ++-------------------+---------------------------------------------------+ + +Because filehandles are often cached persistently by clients, adding or +removing this option should generally be done during a scheduled maintenance +window involving a NFS client unmount/remount. diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml index f87b5a05e5e9..8ab43c8253b2 100644 --- a/Documentation/netlink/specs/nfsd.yaml +++ b/Documentation/netlink/specs/nfsd.yaml @@ -81,6 +81,11 @@ attribute-sets: - name: min-threads type: u32 + - + name: fh-key + type: binary + checks: + exact-len: 16 - name: version attributes: @@ -163,6 +168,7 @@ operations: - leasetime - scope - min-threads + - fh-key - name: threads-get doc: get the maximum number of running threads diff --git a/Documentation/sunrpc/xdr/nlm4.x b/Documentation/sunrpc/xdr/nlm4.x new file mode 100644 index 000000000000..0c44a80ef674 --- /dev/null +++ b/Documentation/sunrpc/xdr/nlm4.x @@ -0,0 +1,211 @@ +/* + * This file was extracted by hand from + * https://www.rfc-editor.org/rfc/rfc1813.html . + * + * Note that RFC 1813 is Informational. Its official date of + * publication (June 1995) is before the IETF required its RFCs to + * carry an explicit copyright or other IP ownership notices. + * + * Note also that RFC 1813 does not specify the whole NLM4 protocol. + * In particular, the argument and result types are not present in + * that document, and had to be reverse-engineered. + */ + +/* + * The NLMv4 protocol + */ + +pragma header nlm4; + +/* + * The following definitions are missing in RFC 1813, + * but can be found in the OpenNetworking Network Lock + * Manager protocol: + * + * https://pubs.opengroup.org/onlinepubs/9629799/chap10.htm + */ + +const LM_MAXSTRLEN = 1024; + +const LM_MAXNAMELEN = 1025; + +const MAXNETOBJ_SZ = 1024; + +typedef opaque netobj<MAXNETOBJ_SZ>; + +enum fsh4_mode { + fsm_DN = 0, /* deny none */ + fsm_DR = 1, /* deny read */ + fsm_DW = 2, /* deny write */ + fsm_DRW = 3 /* deny read/write */ +}; + +enum fsh4_access { + fsa_NONE = 0, /* for completeness */ + fsa_R = 1, /* read-only */ + fsa_W = 2, /* write-only */ + fsa_RW = 3 /* read/write */ +}; + +/* + * The following definitions come from the OpenNetworking + * Network Status Monitor protocol: + * + * https://pubs.opengroup.org/onlinepubs/9629799/chap11.htm + */ + +const SM_MAXSTRLEN = 1024; + +/* + * The NLM protocol as extracted from: + * https://tools.ietf.org/html/rfc1813 Appendix II + */ + +typedef unsigned hyper uint64; + +typedef hyper int64; + +typedef unsigned long uint32; + +typedef long int32; + +enum nlm4_stats { + NLM4_GRANTED = 0, + NLM4_DENIED = 1, + NLM4_DENIED_NOLOCKS = 2, + NLM4_BLOCKED = 3, + NLM4_DENIED_GRACE_PERIOD = 4, + NLM4_DEADLCK = 5, + NLM4_ROFS = 6, + NLM4_STALE_FH = 7, + NLM4_FBIG = 8, + NLM4_FAILED = 9 +}; + +pragma big_endian nlm4_stats; + +struct nlm4_holder { + bool exclusive; + int32 svid; + netobj oh; + uint64 l_offset; + uint64 l_len; +}; + +union nlm4_testrply switch (nlm4_stats stat) { + case NLM4_DENIED: + nlm4_holder holder; + default: + void; +}; + +struct nlm4_stat { + nlm4_stats stat; +}; + +struct nlm4_res { + netobj cookie; + nlm4_stat stat; +}; + +struct nlm4_testres { + netobj cookie; + nlm4_testrply stat; +}; + +struct nlm4_lock { + string caller_name<LM_MAXSTRLEN>; + netobj fh; + netobj oh; + int32 svid; + uint64 l_offset; + uint64 l_len; +}; + +struct nlm4_lockargs { + netobj cookie; + bool block; + bool exclusive; + nlm4_lock alock; + bool reclaim; + int32 state; +}; + +struct nlm4_cancargs { + netobj cookie; + bool block; + bool exclusive; + nlm4_lock alock; +}; + +struct nlm4_testargs { + netobj cookie; + bool exclusive; + nlm4_lock alock; +}; + +struct nlm4_unlockargs { + netobj cookie; + nlm4_lock alock; +}; + +struct nlm4_share { + string caller_name<LM_MAXSTRLEN>; + netobj fh; + netobj oh; + fsh4_mode mode; + fsh4_access access; +}; + +struct nlm4_shareargs { + netobj cookie; + nlm4_share share; + bool reclaim; +}; + +struct nlm4_shareres { + netobj cookie; + nlm4_stats stat; + int32 sequence; +}; + +struct nlm4_notify { + string name<LM_MAXNAMELEN>; + int32 state; +}; + +/* + * Argument for the Linux-private SM_NOTIFY procedure + */ +const SM_PRIV_SIZE = 16; + +struct nlm4_notifyargs { + nlm4_notify notify; + opaque private[SM_PRIV_SIZE]; +}; + +program NLM4_PROG { + version NLM4_VERS { + void NLMPROC4_NULL(void) = 0; + nlm4_testres NLMPROC4_TEST(nlm4_testargs) = 1; + nlm4_res NLMPROC4_LOCK(nlm4_lockargs) = 2; + nlm4_res NLMPROC4_CANCEL(nlm4_cancargs) = 3; + nlm4_res NLMPROC4_UNLOCK(nlm4_unlockargs) = 4; + nlm4_res NLMPROC4_GRANTED(nlm4_testargs) = 5; + void NLMPROC4_TEST_MSG(nlm4_testargs) = 6; + void NLMPROC4_LOCK_MSG(nlm4_lockargs) = 7; + void NLMPROC4_CANCEL_MSG(nlm4_cancargs) = 8; + void NLMPROC4_UNLOCK_MSG(nlm4_unlockargs) = 9; + void NLMPROC4_GRANTED_MSG(nlm4_testargs) = 10; + void NLMPROC4_TEST_RES(nlm4_testres) = 11; + void NLMPROC4_LOCK_RES(nlm4_res) = 12; + void NLMPROC4_CANCEL_RES(nlm4_res) = 13; + void NLMPROC4_UNLOCK_RES(nlm4_res) = 14; + void NLMPROC4_GRANTED_RES(nlm4_res) = 15; + void NLMPROC4_SM_NOTIFY(nlm4_notifyargs) = 16; + nlm4_shareres NLMPROC4_SHARE(nlm4_shareargs) = 20; + nlm4_shareres NLMPROC4_UNSHARE(nlm4_shareargs) = 21; + nlm4_res NLMPROC4_NM_LOCK(nlm4_lockargs) = 22; + void NLMPROC4_FREE_ALL(nlm4_notify) = 23; + } = 4; +} = 100021; |
