linux-toradex.git/drivers/nvme/host, branch master

Merge tag 'for-7.2/block-20260615' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

2026-06-16T07:32:47+00:00

Pull block updates from Jens Axboe:

 - NVMe pull request via Keith:
     - Per-controller admin and IO timeout sysfs attributes, and
       letting the block layer set request timeouts (Maurizio,
       Maximilian)
     - Multipath passthrough iostats, and PCI P2PDMA enablement for
       multipath devices (Keith, Kiran)
     - A new diag sysfs attribute group exporting per-controller
       counters (retries, multipath failover, error counters, requeue
       and failure counts, reset and reconnect events) (Nilay)
     - FDP configuration validation and bounds check fixes (liuxixin)
     - Various nvmet fixes, including a pre-auth out-of-bounds read in
       the Discovery Get Log Page handler, auth payload bounds
       validation, and tcp error-path leak fixes (Bryam, Tianchu,
       Geliang)
     - nvme-tcp lockdep and workqueue fixes (Shin'ichiro, Kuniyuki,
       Eric)
     - Assorted other fixes and cleanups (John, Yao, Chao, Mateusz,
       Achkinazi, Wentao)

 - MD pull request via Yu Kuai:
     - raid1/raid10 fixes for a deadlock in the read error recovery
       path, error-path detection and bio accounting with cloned bios,
       and an nr_pending leak in the REQ_ATOMIC bad-block error path
       (Abd-Alrhman)
     - PCI P2PDMA propagation from member devices to the RAID device
       (Kiran)
     - dm-raid bio requeue fix, and various smaller fixes and cleanups
       (Benjamin, Chen, Li, Thorsten)

 - Enable Clang lock context analysis for the block layer, with the
   accompanying annotations across queue limits, the blk_holder_ops
   callbacks, crypto, cgroup, iocost, kyber and mq-deadline (Bart)

 - Block status code infrastructure work: a tagged status table, a
   str_to_blk_op() helper, a bio_endio_status() helper, and on top of
   that a new configurable block-layer error injection facility
   (Christoph)

 - DRBD netlink rework, replacing the genl_magic machinery with explicit
   netlink serialization and moving the DRBD UAPI headers to
   include/uapi/linux/ (Christoph Böhmwalder)

 - bvec improvements: a bvec_folio() helper and making the bvec_iter
   helpers proper inline functions (Willy, Christoph)

 - ublk cleanups and a canceling-flag fix for the disk-not-allocated
   case (Caleb, Ming)

 - Partition handling fixes: bound the AIX pp_count scan, fix an of_node
   refcount leak, and replace __get_free_page() with kmalloc() (Bryam,
   Wentao, Mike)

 - Convert numa_node to int in blk_mq_hw_ctx and ->init_request, and add
   WQ_PERCPU to the block workqueue users (Mateusz, Marco)

 - Block statistics and tracing: propagate in-flight to the whole disk
   on partition IO, export passthrough stats, and a new
   block_rq_tag_wait tracepoint (Tang, Keith, Aaron)

 - A round of removals, unexports and cleanups across bio, direct-io and
   the bvec helpers (Christoph)

 - Various driver fixes (mtip32xx use-after-free, rbd snap_count
   validation and strscpy conversion, nbd socket lockdep reclassify,
   virtio-blk zone report clamp, floppy) and a batch of MAINTAINERS
   email/list updates (Coly, Li, Yu, Christoph Böhmwalder)

 - Other little fixes and cleanups all over

* tag 'for-7.2/block-20260615' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (117 commits)
  MAINTAINERS: Update Coly Li's email address
  block: check bio split for unaligned bvec
  nbd: Reclassify sockets to avoid lockdep circular dependency
  block: add configurable error injection
  block: add a str_to_blk_op helper
  block: add a "tag" for block status codes
  block: add a macro to initialize the status table
  floppy: Drop unused pnp driver data
  block: propagate in_flight to whole disk on partition I/O
  virtio-blk: clamp zone report to the report buffer capacity
  block: optimize I/O merge hot path with unlikely() hints
  drivers/block/rbd: Use strscpy() to copy strings into arrays
  partitions: aix: bound the pp_count scan to the ppe array
  block: Enable lock context analysis
  block/mq-deadline: Make the lock context annotations compatible with Clang
  block/Kyber: Make the lock context annotations compatible with Clang
  block/blk-mq-debugfs: Improve lock context annotations
  block/blk-iocost: Inline iocg_lock() and iocg_unlock()
  block/blk-iocost: Split ioc_rqos_throttle()
  block/crypto: Annotate the crypto functions
  ...

Merge tag 'nvme-7.2-2026-06-04' of git://git.infradead.org/nvme into for-7.2/block

2026-06-05T11:18:58+00:00

Pull NVMe updates from Keith:

"- Per-controller timeouts
 - Multipath telemetry
 - Namespace format validation
 - Various other fixes"

* tag 'nvme-7.2-2026-06-04' of git://git.infradead.org/nvme: (34 commits)
  nvme: export controller reconnect event count via sysfs
  nvme: export controller reset event count via sysfs
  nvme: export I/O failure count when no path is available via sysfs
  nvme: export I/O requeue count when no path is usable via sysfs
  nvme: export command error counters via sysfs
  nvme: export multipath failover count via sysfs
  nvme: export command retry count via sysfs
  nvme: add diag attribute group under sysfs
  nvme-tcp: lockdep: use dynamic lockdep keys per socket instance
  nvme-tcp: move nvme_tcp_reclassify_socket()
  nvme: validate FDP configuration descriptor sizes
  nvmet-auth: validate reply message payload bounds against transfer length
  nvme: refresh multipath head zoned limits from path limits
  nvme: fix FDP fdpcidx bounds check
  nvme-tcp: Use WQ_PERCPU explicitly if wq_unbound is false.
  nvmet: fix pre-auth out-of-bounds heap read in Discovery Get Log Page
  nvme-multipath: set BIO_REMAPPED on bios remapped to per-path namespace disks
  nvme-multipath: require exact iopolicy names for module parameter
  nvme-multipath: pass NS head to nvme_mpath_revalidate_paths()
  nvme-pci: fix out-of-bounds access in nvme_setup_descriptor_pools
  ...

nvme: export controller reconnect event count via sysfs

2026-06-04T08:57:40+00:00

When an NVMe-oF link goes down, the driver attempts to recover the
connection by repeatedly reconnecting to the remote controller at
configured intervals. A maximum number of reconnect attempts is also
configured, after which recovery stops and the controller is removed
if the connection cannot be re-established.

The driver maintains a counter, nr_reconnects, which is incremented on
each reconnect attempt. However if in case the reconnect is successful
then this counter reset to zero. Moreover, currently, this counter is
only reported via kernel log messages and is not exposed to userspace.
Since dmesg is a circular buffer, this information may be lost over
time.

So introduce a new accumulator which accumulates nr_reconnect attempts
and also expose this accumulator per-fabric ctrl via a new sysfs
attribute reconnect_count, under diag attribute grroup to provide
persistent visibility into the number of reconnect attempts made by the
host. This information can help users diagnose unstable links or
connectivity issues. Furthermore, this sysfs attribute is also writable
so user may reset it to zero, if needed.

The reconnect_count can also be consumed by monitoring tools such as
nvme-top to improve controller-level observability.

Tested-by: Venkat Rao Bagalkote 
Signed-off-by: Nilay Shroff 
Signed-off-by: Keith Busch

nvme: export controller reset event count via sysfs

2026-06-04T08:57:36+00:00

The NVMe controller transitions into the RESETTING state during error
recovery, link instability, firmware activation, or when a reset is
explicitly triggered by the user.

Expose a per-ctrl sysfs attribute reset_count, under diag attribute
group to provide visibility into these RESETTING state transitions.
Observing the frequency of reset events can help users identify issues
such as PCIe errors or unstable fabric links. This counter is also
writable thus allowing user to reset its value, if needed.

This counter can also be consumed by monitoring tools such as nvme-top
to improve controller-level observability.

Tested-by: Venkat Rao Bagalkote 
Signed-off-by: Nilay Shroff 
Signed-off-by: Keith Busch

nvme: export I/O failure count when no path is available via sysfs

2026-06-04T08:57:32+00:00

When I/O is submitted to the NVMe namespace head and no available path
can handle the request, the driver fails the I/O immediately. Currently,
such failures are only reported via kernel log messages, which may be
lost over time since dmesg is a circular buffer.

Add a new ns-head sysfs counter io_fail_no_available_path_count, under
diag attribute group to expose the number of I/Os that failed due to the
absence of an available path. This provides persistent visibility into
path-related I/O failures and can help users diagnose the cause of I/O
errors. This counter is also writable and so user may reset its value,
if needed.

This counter can also be consumed by monitoring tools such as nvme-top.

Tested-by: Venkat Rao Bagalkote 
Signed-off-by: Nilay Shroff 
Signed-off-by: Keith Busch

nvme: export I/O requeue count when no path is usable via sysfs

2026-06-04T08:57:28+00:00

When the NVMe namespace head determines that there is no currently
available path to handle I/O (for example, while a controller is
resetting/connecting or due to a transient link failure), incoming
I/Os are added to the requeue list.

Currently, there is no visibility into how many I/Os have been requeued
in this situation. Add a new ns-head sysfs counter
io_requeue_no_usable_path_count, under diag attribute group to expose
the number of I/Os that were requeued due to the absence of an available
path. This counter is also writable thus allowing user to reset it, if
needed.

This statistic can help users understand I/O slowdowns or stalls caused
by temporary path unavailability, and can be consumed by monitoring
tools such as nvme-top for real-time observability.

Tested-by: Venkat Rao Bagalkote 
Signed-off-by: Nilay Shroff 
Signed-off-by: Keith Busch

nvme: export command error counters via sysfs

2026-06-04T08:57:25+00:00

When an NVMe command completes with an error status, the driver
logs the error to the kernel log. However, these messages may be
lost or overwritten over time since dmesg is a circular buffer.

Expose per-path and ctrl sysfs attribute command_error_count, under
diag attribute group to provide persistent visibility into error
occurrences. This allows users to observe the total number of commands
that have failed on a given path over time, which can be useful for
diagnosing path health and stability.

This attribute is both readable and writable thus allowing user to reset
these counters. These counters can also be consumed by observability
tools such as nvme-top to provide additional insight into NVMe error
behavior.

Tested-by: Venkat Rao Bagalkote 
Signed-off-by: Nilay Shroff 
Signed-off-by: Keith Busch

nvme: export multipath failover count via sysfs

2026-06-04T08:57:19+00:00

When an NVMe command completes with a path-specific error, the NVMe
driver may retry the command on an alternate controller or path if one
is available. These failover events indicate that I/O was redirected
away from the original path.

Currently, the number of times requests are failed over to another
available path is not visible to userspace. Exposing this information
can be useful for diagnosing path health and stability.

Export per-path sysfs attribute "multipath_failover_count" under diag
attribute group. This attribute is both readable and writable and thus
allowing user to reset the counter. This counter can be consumed by
monitoring tools such as nvme-top to help identify paths that
consistently trigger failovers under load.

Tested-by: Venkat Rao Bagalkote 
Signed-off-by: Nilay Shroff 
Signed-off-by: Keith Busch

nvme: export command retry count via sysfs

2026-06-04T08:57:13+00:00

When Advanced Command Retry Enable (ACRE) is configured, a controller
may interrupt command execution and return a completion status
indicating command interrupted with the DNR bit cleared. In this case,
the driver retries the command based on the Command Retry Delay (CRD)
value provided in the completion status.

Currently, these command retries are handled entirely within the NVMe
driver and are not visible to userspace. As a result, there is no
observability into retry behavior, which can be a useful diagnostic
signal.

Expose a per-namespace sysfs attribute command_retries_count, under
diag attribute group to provide visibility into retry activity. This
information can help identify controller-side congestion under load
and enables comparison across paths in multipath setups (for example,
detecting cases where one path experiences significantly more retries
than another under identical workloads).

This exported metric is intended for diagnostics and monitoring tools
such as nvme-top, and does not change command retry behavior. A new
sysfs attribute named "command_retries_count" is added for this purpose.
This attribute is both readable as well as writable. So user could
reset this counter if needed.

Tested-by: Venkat Rao Bagalkote 
Signed-off-by: Nilay Shroff 
Signed-off-by: Keith Busch

nvme: add diag attribute group under sysfs

2026-06-04T08:57:06+00:00

Add a new diag attribute group under:
/sys/class/nvme//
/sys/block//
/sys/block//

This new sysfs attribute group will be used to organize NVMe diagnostic
and telemetry-related counters under it.

Tested-by: Venkat Rao Bagalkote 
Signed-off-by: Nilay Shroff 
Signed-off-by: Keith Busch