<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/nvme/host, branch v6.16</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge tag 'block-6.16-20250718' of git://git.kernel.dk/linux</title>
<updated>2025-07-18T19:16:13+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-07-18T19:16:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e5ac874257bf06eba00282e5b627f23b72ac8f7d'/>
<id>e5ac874257bf06eba00282e5b627f23b72ac8f7d</id>
<content type='text'>
Pull block fixes from Jens Axboe:

 - NVMe changes via Christoph:
     - revert the cross-controller atomic write size validation
       that caused regressions (Christoph Hellwig)
     - fix endianness of command word printout in
       nvme_log_err_passthru() (John Garry)
     - fix callback lock for TLS handshake (Maurizio Lombardi)
     - fix misaccounting of nvme-mpath inflight I/O (Yu Kuai)
     - fix inconsistent RCU list manipulation in
       nvme_ns_add_to_ctrl_list() (Zheng Qixing)

 - Fix for a kobject leak in queue unregistration

 - Fix for loop async file write start/end handling

* tag 'block-6.16-20250718' of git://git.kernel.dk/linux:
  loop: use kiocb helpers to fix lockdep warning
  nvmet-tcp: fix callback lock for TLS handshake
  nvme: fix misaccounting of nvme-mpath inflight I/O
  nvme: revert the cross-controller atomic write size validation
  nvme: fix endianness of command word prints in nvme_log_err_passthru()
  nvme: fix inconsistent RCU list manipulation in nvme_ns_add_to_ctrl_list()
  block: fix kobject leak in blk_unregister_queue
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull block fixes from Jens Axboe:

 - NVMe changes via Christoph:
     - revert the cross-controller atomic write size validation
       that caused regressions (Christoph Hellwig)
     - fix endianness of command word printout in
       nvme_log_err_passthru() (John Garry)
     - fix callback lock for TLS handshake (Maurizio Lombardi)
     - fix misaccounting of nvme-mpath inflight I/O (Yu Kuai)
     - fix inconsistent RCU list manipulation in
       nvme_ns_add_to_ctrl_list() (Zheng Qixing)

 - Fix for a kobject leak in queue unregistration

 - Fix for loop async file write start/end handling

* tag 'block-6.16-20250718' of git://git.kernel.dk/linux:
  loop: use kiocb helpers to fix lockdep warning
  nvmet-tcp: fix callback lock for TLS handshake
  nvme: fix misaccounting of nvme-mpath inflight I/O
  nvme: revert the cross-controller atomic write size validation
  nvme: fix endianness of command word prints in nvme_log_err_passthru()
  nvme: fix inconsistent RCU list manipulation in nvme_ns_add_to_ctrl_list()
  block: fix kobject leak in blk_unregister_queue
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme: fix misaccounting of nvme-mpath inflight I/O</title>
<updated>2025-07-15T07:49:13+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2025-07-15T01:28:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=71257925e83eae1cb6913d65ca71927d2220e6d1'/>
<id>71257925e83eae1cb6913d65ca71927d2220e6d1</id>
<content type='text'>
Procedures for nvme-mpath IO accounting:

 1) initialize nvme_request and clear flags;
 2) set NVME_MPATH_IO_STATS and increase inflight counter when IO
    started;
 3) check NVME_MPATH_IO_STATS and decrease inflight counter when IO is
    done;

However, for the case nvme_fail_nonready_command(), both step 1) and 2)
are skipped, and if old nvme_request set NVME_MPATH_IO_STATS and then
request is reused, step 3) will still be executed, causing inflight I/O
counter to be negative.

Fix the problem by clearing nvme_request in nvme_fail_nonready_command().

Fixes: ea5e5f42cd2c ("nvme-fabrics: avoid double completions in nvmf_fail_nonready_command")
Reported-by: Yi Zhang &lt;yi.zhang@redhat.com&gt;
Closes: https://lore.kernel.org/all/CAHj4cs_+dauobyYyP805t33WMJVzOWj=7+51p4_j9rA63D9sog@mail.gmail.com/
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Procedures for nvme-mpath IO accounting:

 1) initialize nvme_request and clear flags;
 2) set NVME_MPATH_IO_STATS and increase inflight counter when IO
    started;
 3) check NVME_MPATH_IO_STATS and decrease inflight counter when IO is
    done;

However, for the case nvme_fail_nonready_command(), both step 1) and 2)
are skipped, and if old nvme_request set NVME_MPATH_IO_STATS and then
request is reused, step 3) will still be executed, causing inflight I/O
counter to be negative.

Fix the problem by clearing nvme_request in nvme_fail_nonready_command().

Fixes: ea5e5f42cd2c ("nvme-fabrics: avoid double completions in nvmf_fail_nonready_command")
Reported-by: Yi Zhang &lt;yi.zhang@redhat.com&gt;
Closes: https://lore.kernel.org/all/CAHj4cs_+dauobyYyP805t33WMJVzOWj=7+51p4_j9rA63D9sog@mail.gmail.com/
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme: revert the cross-controller atomic write size validation</title>
<updated>2025-07-15T07:49:13+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2025-07-09T09:54:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1fc09f2961f5c6d8bb53bc989f17b12fdc6bc93d'/>
<id>1fc09f2961f5c6d8bb53bc989f17b12fdc6bc93d</id>
<content type='text'>
This was originally added by commit 8695f060a029 ("nvme: all namespaces
in a subsystem must adhere to a common atomic write size") to check
the all controllers in a subsystem report the same atomic write size,
but the check wasn't quite correct and caused problems for devices
with multiple namespaces that report different LBA sizes.  Commit
f46d273449ba ("nvme: fix atomic write size validation") tried to fix
this, but then caused problems for namespace rediscovery after a
format with an LBA size change that changes the AWUPF value.

This drops the validation and essentially reverts those two commits while
keeping the cleanup that went in between the two.  We'll need to figure
out how to properly check for the mouse trap that nvme left us, but for
now revert the check to keep devices working for users who couldn't care
less about the atomic write feature.

Fixes: 8695f060a029 ("nvme: all namespaces in a subsystem must adhere to a common atomic write size")
Fixes: f46d273449ba ("nvme: fix atomic write size validation")
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Alan Adamson &lt;alan.adamson@oracle.com&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: John Garry &lt;john.g.garry@oracle.com&gt;
Reviewed-by: Chaitanya Kulkarni &lt;kch@nvidia.com&gt;
Tested-by: Alan Adamson &lt;alan.adamson@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was originally added by commit 8695f060a029 ("nvme: all namespaces
in a subsystem must adhere to a common atomic write size") to check
the all controllers in a subsystem report the same atomic write size,
but the check wasn't quite correct and caused problems for devices
with multiple namespaces that report different LBA sizes.  Commit
f46d273449ba ("nvme: fix atomic write size validation") tried to fix
this, but then caused problems for namespace rediscovery after a
format with an LBA size change that changes the AWUPF value.

This drops the validation and essentially reverts those two commits while
keeping the cleanup that went in between the two.  We'll need to figure
out how to properly check for the mouse trap that nvme left us, but for
now revert the check to keep devices working for users who couldn't care
less about the atomic write feature.

Fixes: 8695f060a029 ("nvme: all namespaces in a subsystem must adhere to a common atomic write size")
Fixes: f46d273449ba ("nvme: fix atomic write size validation")
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Alan Adamson &lt;alan.adamson@oracle.com&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: John Garry &lt;john.g.garry@oracle.com&gt;
Reviewed-by: Chaitanya Kulkarni &lt;kch@nvidia.com&gt;
Tested-by: Alan Adamson &lt;alan.adamson@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme: fix endianness of command word prints in nvme_log_err_passthru()</title>
<updated>2025-07-14T13:51:06+00:00</updated>
<author>
<name>John Garry</name>
<email>john.g.garry@oracle.com</email>
</author>
<published>2025-06-30T16:21:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dd8e34afd6709cb2f9c0e63340f567e6c066ed8e'/>
<id>dd8e34afd6709cb2f9c0e63340f567e6c066ed8e</id>
<content type='text'>
The command word members of struct nvme_common_command are __le32 type,
so use helper le32_to_cpu() to read them properly.

Fixes: 9f079dda1433 ("nvme: allow passthru cmd error logging")
Signed-off-by: John Garry &lt;john.g.garry@oracle.com&gt;
Reviewed-by: Alan Adamson &lt;alan.adamson@oracle.com&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The command word members of struct nvme_common_command are __le32 type,
so use helper le32_to_cpu() to read them properly.

Fixes: 9f079dda1433 ("nvme: allow passthru cmd error logging")
Signed-off-by: John Garry &lt;john.g.garry@oracle.com&gt;
Reviewed-by: Alan Adamson &lt;alan.adamson@oracle.com&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme: fix inconsistent RCU list manipulation in nvme_ns_add_to_ctrl_list()</title>
<updated>2025-07-14T13:50:32+00:00</updated>
<author>
<name>Zheng Qixing</name>
<email>zhengqixing@huawei.com</email>
</author>
<published>2025-07-01T07:17:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=80d7762e0a42307ee31b21f090e21349b98c14f6'/>
<id>80d7762e0a42307ee31b21f090e21349b98c14f6</id>
<content type='text'>
When inserting a namespace into the controller's namespace list, the
function uses list_add_rcu() when the namespace is inserted in the middle
of the list, but falls back to a regular list_add() when adding at the
head of the list.

This inconsistency could lead to race conditions during concurrent
access, as users might observe a partially updated list. Fix this by
consistently using list_add_rcu() in both code paths to ensure proper
RCU protection throughout the entire function.

Fixes: be647e2c76b2 ("nvme: use srcu for iterating namespace list")
Signed-off-by: Zheng Qixing &lt;zhengqixing@huawei.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When inserting a namespace into the controller's namespace list, the
function uses list_add_rcu() when the namespace is inserted in the middle
of the list, but falls back to a regular list_add() when adding at the
head of the list.

This inconsistency could lead to race conditions during concurrent
access, as users might observe a partially updated list. Fix this by
consistently using list_add_rcu() in both code paths to ensure proper
RCU protection throughout the entire function.

Fixes: be647e2c76b2 ("nvme: use srcu for iterating namespace list")
Signed-off-by: Zheng Qixing &lt;zhengqixing@huawei.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'block-6.16-20250704' of git://git.kernel.dk/linux</title>
<updated>2025-07-04T16:33:59+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-07-04T16:33:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1880df2cf44af6266b48a905596726c267bc2b04'/>
<id>1880df2cf44af6266b48a905596726c267bc2b04</id>
<content type='text'>
Pull block fixes from Jens Axboe:

 - NVMe fixes via Christoph:
     - fix incorrect cdw15 value in passthru error logging (Alok Tiwari)
     - fix memory leak of bio integrity in nvmet (Dmitry Bogdanov)
     - refresh visible attrs after being checked (Eugen Hristev)
     - fix suspicious RCU usage warning in the multipath code (Geliang Tang)
     - correctly account for namespace head reference counter (Nilay Shroff)

 - Fix for a regression introduced in ublk in this cycle, where it would
   attempt to queue a canceled request.

 - brd RCU sleeping fix, also introduced in this cycle. Bare bones fix,
   should be improved upon for the next release.

* tag 'block-6.16-20250704' of git://git.kernel.dk/linux:
  brd: fix sleeping function called from invalid context in brd_insert_page()
  ublk: don't queue request if the associated uring_cmd is canceled
  nvme-multipath: fix suspicious RCU usage warning
  nvme-pci: refresh visible attrs after being checked
  nvmet: fix memory leak of bio integrity
  nvme: correctly account for namespace head reference counter
  nvme: Fix incorrect cdw15 value in passthru error logging
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull block fixes from Jens Axboe:

 - NVMe fixes via Christoph:
     - fix incorrect cdw15 value in passthru error logging (Alok Tiwari)
     - fix memory leak of bio integrity in nvmet (Dmitry Bogdanov)
     - refresh visible attrs after being checked (Eugen Hristev)
     - fix suspicious RCU usage warning in the multipath code (Geliang Tang)
     - correctly account for namespace head reference counter (Nilay Shroff)

 - Fix for a regression introduced in ublk in this cycle, where it would
   attempt to queue a canceled request.

 - brd RCU sleeping fix, also introduced in this cycle. Bare bones fix,
   should be improved upon for the next release.

* tag 'block-6.16-20250704' of git://git.kernel.dk/linux:
  brd: fix sleeping function called from invalid context in brd_insert_page()
  ublk: don't queue request if the associated uring_cmd is canceled
  nvme-multipath: fix suspicious RCU usage warning
  nvme-pci: refresh visible attrs after being checked
  nvmet: fix memory leak of bio integrity
  nvme: correctly account for namespace head reference counter
  nvme: Fix incorrect cdw15 value in passthru error logging
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme-multipath: fix suspicious RCU usage warning</title>
<updated>2025-07-01T06:17:02+00:00</updated>
<author>
<name>Geliang Tang</name>
<email>tanggeliang@kylinos.cn</email>
</author>
<published>2025-06-30T07:48:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d6811074203b13f715ce2480ac64c5b1c773f2a5'/>
<id>d6811074203b13f715ce2480ac64c5b1c773f2a5</id>
<content type='text'>
When I run the NVME over TCP test in virtme-ng, I get the following
"suspicious RCU usage" warning in nvme_mpath_add_sysfs_link():

'''
[    5.024557][   T44] nvmet: Created nvm controller 1 for subsystem nqn.2025-06.org.nvmexpress.mptcp for NQN nqn.2014-08.org.nvmexpress:uuid:f7f6b5e0-ff97-4894-98ac-c85309e0bc77.
[    5.027401][  T183] nvme nvme0: creating 2 I/O queues.
[    5.029017][  T183] nvme nvme0: mapped 2/0/0 default/read/poll queues.
[    5.032587][  T183] nvme nvme0: new ctrl: NQN "nqn.2025-06.org.nvmexpress.mptcp", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:f7f6b5e0-ff97-4894-98ac-c85309e0bc77
[    5.042214][   T25]
[    5.042440][   T25] =============================
[    5.042579][   T25] WARNING: suspicious RCU usage
[    5.042705][   T25] 6.16.0-rc3+ #23 Not tainted
[    5.042812][   T25] -----------------------------
[    5.042934][   T25] drivers/nvme/host/multipath.c:1203 RCU-list traversed in non-reader section!!
[    5.043111][   T25]
[    5.043111][   T25] other info that might help us debug this:
[    5.043111][   T25]
[    5.043341][   T25]
[    5.043341][   T25] rcu_scheduler_active = 2, debug_locks = 1
[    5.043502][   T25] 3 locks held by kworker/u9:0/25:
[    5.043615][   T25]  #0: ffff888008730948 ((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0x7ed/0x1350
[    5.043830][   T25]  #1: ffffc900001afd40 ((work_completion)(&amp;entry-&gt;work)){+.+.}-{0:0}, at: process_one_work+0xcf3/0x1350
[    5.044084][   T25]  #2: ffff888013ee0020 (&amp;head-&gt;srcu){.+.+}-{0:0}, at: nvme_mpath_add_sysfs_link.part.0+0xb4/0x3a0
[    5.044300][   T25]
[    5.044300][   T25] stack backtrace:
[    5.044439][   T25] CPU: 0 UID: 0 PID: 25 Comm: kworker/u9:0 Not tainted 6.16.0-rc3+ #23 PREEMPT(full)
[    5.044441][   T25] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    5.044442][   T25] Workqueue: async async_run_entry_fn
[    5.044445][   T25] Call Trace:
[    5.044446][   T25]  &lt;TASK&gt;
[    5.044449][   T25]  dump_stack_lvl+0x6f/0xb0
[    5.044453][   T25]  lockdep_rcu_suspicious.cold+0x4f/0xb1
[    5.044457][   T25]  nvme_mpath_add_sysfs_link.part.0+0x2fb/0x3a0
[    5.044459][   T25]  ? queue_work_on+0x90/0xf0
[    5.044461][   T25]  ? lockdep_hardirqs_on+0x78/0x110
[    5.044466][   T25]  nvme_mpath_set_live+0x1e9/0x4f0
[    5.044470][   T25]  nvme_mpath_add_disk+0x240/0x2f0
[    5.044472][   T25]  ? __pfx_nvme_mpath_add_disk+0x10/0x10
[    5.044475][   T25]  ? add_disk_fwnode+0x361/0x580
[    5.044480][   T25]  nvme_alloc_ns+0x81c/0x17c0
[    5.044483][   T25]  ? kasan_quarantine_put+0x104/0x240
[    5.044487][   T25]  ? __pfx_nvme_alloc_ns+0x10/0x10
[    5.044495][   T25]  ? __pfx_nvme_find_get_ns+0x10/0x10
[    5.044496][   T25]  ? rcu_read_lock_any_held+0x45/0xa0
[    5.044498][   T25]  ? validate_chain+0x232/0x4f0
[    5.044503][   T25]  nvme_scan_ns+0x4c8/0x810
[    5.044506][   T25]  ? __pfx_nvme_scan_ns+0x10/0x10
[    5.044508][   T25]  ? find_held_lock+0x2b/0x80
[    5.044512][   T25]  ? ktime_get+0x16d/0x220
[    5.044517][   T25]  ? kvm_clock_get_cycles+0x18/0x30
[    5.044520][   T25]  ? __pfx_nvme_scan_ns_async+0x10/0x10
[    5.044522][   T25]  async_run_entry_fn+0x97/0x560
[    5.044523][   T25]  ? rcu_is_watching+0x12/0xc0
[    5.044526][   T25]  process_one_work+0xd3c/0x1350
[    5.044532][   T25]  ? __pfx_process_one_work+0x10/0x10
[    5.044536][   T25]  ? assign_work+0x16c/0x240
[    5.044539][   T25]  worker_thread+0x4da/0xd50
[    5.044545][   T25]  ? __pfx_worker_thread+0x10/0x10
[    5.044546][   T25]  kthread+0x356/0x5c0
[    5.044548][   T25]  ? __pfx_kthread+0x10/0x10
[    5.044549][   T25]  ? ret_from_fork+0x1b/0x2e0
[    5.044552][   T25]  ? __lock_release.isra.0+0x5d/0x180
[    5.044553][   T25]  ? ret_from_fork+0x1b/0x2e0
[    5.044555][   T25]  ? rcu_is_watching+0x12/0xc0
[    5.044557][   T25]  ? __pfx_kthread+0x10/0x10
[    5.044559][   T25]  ret_from_fork+0x218/0x2e0
[    5.044561][   T25]  ? __pfx_kthread+0x10/0x10
[    5.044562][   T25]  ret_from_fork_asm+0x1a/0x30
[    5.044570][   T25]  &lt;/TASK&gt;
'''

This patch uses sleepable RCU version of helper list_for_each_entry_srcu()
instead of list_for_each_entry_rcu() to fix it.

Fixes: 4dbd2b2ebe4c ("nvme-multipath: Add visibility for round-robin io-policy")
Signed-off-by: Geliang Tang &lt;tanggeliang@kylinos.cn&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When I run the NVME over TCP test in virtme-ng, I get the following
"suspicious RCU usage" warning in nvme_mpath_add_sysfs_link():

'''
[    5.024557][   T44] nvmet: Created nvm controller 1 for subsystem nqn.2025-06.org.nvmexpress.mptcp for NQN nqn.2014-08.org.nvmexpress:uuid:f7f6b5e0-ff97-4894-98ac-c85309e0bc77.
[    5.027401][  T183] nvme nvme0: creating 2 I/O queues.
[    5.029017][  T183] nvme nvme0: mapped 2/0/0 default/read/poll queues.
[    5.032587][  T183] nvme nvme0: new ctrl: NQN "nqn.2025-06.org.nvmexpress.mptcp", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:f7f6b5e0-ff97-4894-98ac-c85309e0bc77
[    5.042214][   T25]
[    5.042440][   T25] =============================
[    5.042579][   T25] WARNING: suspicious RCU usage
[    5.042705][   T25] 6.16.0-rc3+ #23 Not tainted
[    5.042812][   T25] -----------------------------
[    5.042934][   T25] drivers/nvme/host/multipath.c:1203 RCU-list traversed in non-reader section!!
[    5.043111][   T25]
[    5.043111][   T25] other info that might help us debug this:
[    5.043111][   T25]
[    5.043341][   T25]
[    5.043341][   T25] rcu_scheduler_active = 2, debug_locks = 1
[    5.043502][   T25] 3 locks held by kworker/u9:0/25:
[    5.043615][   T25]  #0: ffff888008730948 ((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0x7ed/0x1350
[    5.043830][   T25]  #1: ffffc900001afd40 ((work_completion)(&amp;entry-&gt;work)){+.+.}-{0:0}, at: process_one_work+0xcf3/0x1350
[    5.044084][   T25]  #2: ffff888013ee0020 (&amp;head-&gt;srcu){.+.+}-{0:0}, at: nvme_mpath_add_sysfs_link.part.0+0xb4/0x3a0
[    5.044300][   T25]
[    5.044300][   T25] stack backtrace:
[    5.044439][   T25] CPU: 0 UID: 0 PID: 25 Comm: kworker/u9:0 Not tainted 6.16.0-rc3+ #23 PREEMPT(full)
[    5.044441][   T25] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    5.044442][   T25] Workqueue: async async_run_entry_fn
[    5.044445][   T25] Call Trace:
[    5.044446][   T25]  &lt;TASK&gt;
[    5.044449][   T25]  dump_stack_lvl+0x6f/0xb0
[    5.044453][   T25]  lockdep_rcu_suspicious.cold+0x4f/0xb1
[    5.044457][   T25]  nvme_mpath_add_sysfs_link.part.0+0x2fb/0x3a0
[    5.044459][   T25]  ? queue_work_on+0x90/0xf0
[    5.044461][   T25]  ? lockdep_hardirqs_on+0x78/0x110
[    5.044466][   T25]  nvme_mpath_set_live+0x1e9/0x4f0
[    5.044470][   T25]  nvme_mpath_add_disk+0x240/0x2f0
[    5.044472][   T25]  ? __pfx_nvme_mpath_add_disk+0x10/0x10
[    5.044475][   T25]  ? add_disk_fwnode+0x361/0x580
[    5.044480][   T25]  nvme_alloc_ns+0x81c/0x17c0
[    5.044483][   T25]  ? kasan_quarantine_put+0x104/0x240
[    5.044487][   T25]  ? __pfx_nvme_alloc_ns+0x10/0x10
[    5.044495][   T25]  ? __pfx_nvme_find_get_ns+0x10/0x10
[    5.044496][   T25]  ? rcu_read_lock_any_held+0x45/0xa0
[    5.044498][   T25]  ? validate_chain+0x232/0x4f0
[    5.044503][   T25]  nvme_scan_ns+0x4c8/0x810
[    5.044506][   T25]  ? __pfx_nvme_scan_ns+0x10/0x10
[    5.044508][   T25]  ? find_held_lock+0x2b/0x80
[    5.044512][   T25]  ? ktime_get+0x16d/0x220
[    5.044517][   T25]  ? kvm_clock_get_cycles+0x18/0x30
[    5.044520][   T25]  ? __pfx_nvme_scan_ns_async+0x10/0x10
[    5.044522][   T25]  async_run_entry_fn+0x97/0x560
[    5.044523][   T25]  ? rcu_is_watching+0x12/0xc0
[    5.044526][   T25]  process_one_work+0xd3c/0x1350
[    5.044532][   T25]  ? __pfx_process_one_work+0x10/0x10
[    5.044536][   T25]  ? assign_work+0x16c/0x240
[    5.044539][   T25]  worker_thread+0x4da/0xd50
[    5.044545][   T25]  ? __pfx_worker_thread+0x10/0x10
[    5.044546][   T25]  kthread+0x356/0x5c0
[    5.044548][   T25]  ? __pfx_kthread+0x10/0x10
[    5.044549][   T25]  ? ret_from_fork+0x1b/0x2e0
[    5.044552][   T25]  ? __lock_release.isra.0+0x5d/0x180
[    5.044553][   T25]  ? ret_from_fork+0x1b/0x2e0
[    5.044555][   T25]  ? rcu_is_watching+0x12/0xc0
[    5.044557][   T25]  ? __pfx_kthread+0x10/0x10
[    5.044559][   T25]  ret_from_fork+0x218/0x2e0
[    5.044561][   T25]  ? __pfx_kthread+0x10/0x10
[    5.044562][   T25]  ret_from_fork_asm+0x1a/0x30
[    5.044570][   T25]  &lt;/TASK&gt;
'''

This patch uses sleepable RCU version of helper list_for_each_entry_srcu()
instead of list_for_each_entry_rcu() to fix it.

Fixes: 4dbd2b2ebe4c ("nvme-multipath: Add visibility for round-robin io-policy")
Signed-off-by: Geliang Tang &lt;tanggeliang@kylinos.cn&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme-pci: refresh visible attrs after being checked</title>
<updated>2025-06-30T06:42:47+00:00</updated>
<author>
<name>Eugen Hristev</name>
<email>eugen.hristev@collabora.com</email>
</author>
<published>2025-06-13T19:21:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=14005c96d6649b27fa52d0cd0f492eb3b5586c07'/>
<id>14005c96d6649b27fa52d0cd0f492eb3b5586c07</id>
<content type='text'>
The sysfs attributes are registered early, but the driver does not know
whether they are needed or not at that moment.

For the CMB attributes, commit e917a849c3fc ("nvme-pci: refresh visible
attrs for cmb attributes") solved this problem by
calling nvme_update_attrs after mapping the CMB.  However the issue
persists for the HMB attributes. To solve the problem, moved the call to
nvme_update_attrs after nvme_setup_host_mem, which sets up the HMB.

Fixes: e917a849c3fc ("nvme-pci: refresh visible attrs for cmb attributes")
Fixes: 86adbf0cdb9e ("nvme: simplify transport specific device attribute handling")
Signed-off-by: Eugen Hristev &lt;eugen.hristev@collabora.com&gt;
Signed-off-by: André Almeida &lt;andrealmeid@igalia.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The sysfs attributes are registered early, but the driver does not know
whether they are needed or not at that moment.

For the CMB attributes, commit e917a849c3fc ("nvme-pci: refresh visible
attrs for cmb attributes") solved this problem by
calling nvme_update_attrs after mapping the CMB.  However the issue
persists for the HMB attributes. To solve the problem, moved the call to
nvme_update_attrs after nvme_setup_host_mem, which sets up the HMB.

Fixes: e917a849c3fc ("nvme-pci: refresh visible attrs for cmb attributes")
Fixes: 86adbf0cdb9e ("nvme: simplify transport specific device attribute handling")
Signed-off-by: Eugen Hristev &lt;eugen.hristev@collabora.com&gt;
Signed-off-by: André Almeida &lt;andrealmeid@igalia.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme: correctly account for namespace head reference counter</title>
<updated>2025-06-30T06:31:49+00:00</updated>
<author>
<name>Nilay Shroff</name>
<email>nilay@linux.ibm.com</email>
</author>
<published>2025-06-26T05:19:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ba806c900379899e5cdd6ca165b900e2081e1c99'/>
<id>ba806c900379899e5cdd6ca165b900e2081e1c99</id>
<content type='text'>
The blktests nvme/058 manifests an issue where the NVMe subsystem
kobject entry remains stale in sysfs, causing a failure during
subsequent NVMe module reloads[1]. Specifically, when attempting to
register a new NVMe subsystem, the driver encounters a kobejct name
collision because a stale kobject still exists. Though, please note
that nvme/058 doesn't report any failure and test case passes and
it's only during subsequent NVMe module reloads, the stale nvme sub-
system kobject entry in sysfs causes the observed symptom[1].

This issue stems from an imbalance in the get/put usage of the namespace
head (nshead) reference counter. The nshead holds a reference to the
associated NVMe subsystem. If the nshead reference is not properly
released, it prevents the cleanup of the subsystem's kobject, leaving
nvme subsystem stale entry behind in sysfs.

During the failure case, the last namespace path referencing a nshead
is removed, but the nshead reference was not released. This occurs
because the release logic currently only puts the nshead reference
when its state is LIVE. However, in configurations where ANA (Asymmetric
Namespace Access) is enabled, a namespace may be associated with an ANA
state that is neither optimized nor non-optimized. In this case, the
nshead may never transition to LIVE, and the corresponding nshead
reference is then never dropped. In fact nvme/058 associates some of
nvme namespaces to an inaccessible ANA state and with that nshead is
created but it's state is not transitioned to LIVE. So the current
logic would then causes nshead reference to be leaked for non-LIVE
states.

Another scenario, during namespace allocation, the driver first
allocates a nshead and then issues an Identify Namespace command. If
this command fails — which can happen in tests like nvme/058 that
rapidly enables and disables namespaces — we must release the reference
to the newly allocated nshead. However this reference release is
currently missing in the failure, causing a nshead reference leak.

To fix this, we now unconditionally release the nshead reference when
the last nvme path referencing to the nshead is removed, regardless of
the head’s state. Also during identify namespace failure case we now
properly release the nshead reference. So this ensures proper cleanup
of the nshead, and consequently, the NVMe subsystem and its associated
kobject.

This change prevents stale kobject entries from lingering in sysfs and
eliminates the module reload failures observed just after running
nvme/058.

[1] https://lore.kernel.org/all/CAHj4cs8fOBS-eSjsd5LUBzy7faKXJtgLkCN+mDy_-ezCLLLq+Q@mail.gmail.com/

Reported-by: yi.zhang@redhat.com
Closes: https://lore.kernel.org/all/CAHj4cs8fOBS-eSjsd5LUBzy7faKXJtgLkCN+mDy_-ezCLLLq+Q@mail.gmail.com/
Fixes: 62188639ec16 ("nvme-multipath: introduce delayed removal of the multipath head node")
Tested-by: yi.zhang@redhat.com
Signed-off-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The blktests nvme/058 manifests an issue where the NVMe subsystem
kobject entry remains stale in sysfs, causing a failure during
subsequent NVMe module reloads[1]. Specifically, when attempting to
register a new NVMe subsystem, the driver encounters a kobejct name
collision because a stale kobject still exists. Though, please note
that nvme/058 doesn't report any failure and test case passes and
it's only during subsequent NVMe module reloads, the stale nvme sub-
system kobject entry in sysfs causes the observed symptom[1].

This issue stems from an imbalance in the get/put usage of the namespace
head (nshead) reference counter. The nshead holds a reference to the
associated NVMe subsystem. If the nshead reference is not properly
released, it prevents the cleanup of the subsystem's kobject, leaving
nvme subsystem stale entry behind in sysfs.

During the failure case, the last namespace path referencing a nshead
is removed, but the nshead reference was not released. This occurs
because the release logic currently only puts the nshead reference
when its state is LIVE. However, in configurations where ANA (Asymmetric
Namespace Access) is enabled, a namespace may be associated with an ANA
state that is neither optimized nor non-optimized. In this case, the
nshead may never transition to LIVE, and the corresponding nshead
reference is then never dropped. In fact nvme/058 associates some of
nvme namespaces to an inaccessible ANA state and with that nshead is
created but it's state is not transitioned to LIVE. So the current
logic would then causes nshead reference to be leaked for non-LIVE
states.

Another scenario, during namespace allocation, the driver first
allocates a nshead and then issues an Identify Namespace command. If
this command fails — which can happen in tests like nvme/058 that
rapidly enables and disables namespaces — we must release the reference
to the newly allocated nshead. However this reference release is
currently missing in the failure, causing a nshead reference leak.

To fix this, we now unconditionally release the nshead reference when
the last nvme path referencing to the nshead is removed, regardless of
the head’s state. Also during identify namespace failure case we now
properly release the nshead reference. So this ensures proper cleanup
of the nshead, and consequently, the NVMe subsystem and its associated
kobject.

This change prevents stale kobject entries from lingering in sysfs and
eliminates the module reload failures observed just after running
nvme/058.

[1] https://lore.kernel.org/all/CAHj4cs8fOBS-eSjsd5LUBzy7faKXJtgLkCN+mDy_-ezCLLLq+Q@mail.gmail.com/

Reported-by: yi.zhang@redhat.com
Closes: https://lore.kernel.org/all/CAHj4cs8fOBS-eSjsd5LUBzy7faKXJtgLkCN+mDy_-ezCLLLq+Q@mail.gmail.com/
Fixes: 62188639ec16 ("nvme-multipath: introduce delayed removal of the multipath head node")
Tested-by: yi.zhang@redhat.com
Signed-off-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme: Fix incorrect cdw15 value in passthru error logging</title>
<updated>2025-06-30T06:31:45+00:00</updated>
<author>
<name>Alok Tiwari</name>
<email>alok.a.tiwari@oracle.com</email>
</author>
<published>2025-06-28T18:12:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2e96d2d8c2a7a6c2cef45593c028d9c5ef180316'/>
<id>2e96d2d8c2a7a6c2cef45593c028d9c5ef180316</id>
<content type='text'>
Fix an error in nvme_log_err_passthru() where cdw14 was incorrectly
printed twice instead of cdw15. This fix ensures accurate logging of
the full passthrough command payload.

Fixes: 9f079dda1433 ("nvme: allow passthru cmd error logging")
Signed-off-by: Alok Tiwari &lt;alok.a.tiwari@oracle.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix an error in nvme_log_err_passthru() where cdw14 was incorrectly
printed twice instead of cdw15. This fix ensures accurate logging of
the full passthrough command payload.

Fixes: 9f079dda1433 ("nvme: allow passthru cmd error logging")
Signed-off-by: Alok Tiwari &lt;alok.a.tiwari@oracle.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
