linux-toradex.git/drivers/nvme/host, branch v5.17

nvme-tcp: send H2CData PDUs based on MAXH2CDATA

2022-02-23T13:43:11+00:00

As per NVMe/TCP specification (revision 1.0a, section 3.6.2.3)
Maximum Host to Controller Data length (MAXH2CDATA): Specifies the
maximum number of PDU-Data bytes per H2CData PDU in bytes. This value
is a multiple of dwords and should be no less than 4,096.

The current code sets the H2CData PDU data_length to r2t_length and
does not check the MAXH2CDATA value. Fix this by setting the H2CData PDU
data_length to min(req->h2cdata_left, queue->maxh2cdata).

Also validate the MAXH2CDATA value returned by the target in the ICResp
PDU: if it is not a multiple of a dword or is less than 4096, return
-EINVAL from nvme_tcp_init_connection().
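
The two rules above (validate the advertised limit, then cap each PDU at
min(bytes left, limit)) can be sketched in standalone C. This is only an
illustration, not the driver code: validate_maxh2cdata(),
next_h2cdata_len() and count_h2cdata_pdus() are hypothetical helper names,
and plain uint32_t values stand in for the req->h2cdata_left and
queue->maxh2cdata fields named in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Lower bound quoted from the specification text above. */
#define MIN_MAXH2CDATA 4096u

/*
 * Hypothetical helper: accept the target-advertised MAXH2CDATA only if it
 * is a multiple of a dword (4 bytes) and at least 4096 bytes.
 */
static int validate_maxh2cdata(uint32_t maxh2cdata)
{
	if (maxh2cdata % 4 != 0 || maxh2cdata < MIN_MAXH2CDATA)
		return -EINVAL;
	return 0;
}

/* Payload length of the next H2CData PDU: min(bytes left, MAXH2CDATA). */
static uint32_t next_h2cdata_len(uint32_t h2cdata_left, uint32_t maxh2cdata)
{
	return h2cdata_left < maxh2cdata ? h2cdata_left : maxh2cdata;
}

/* Number of H2CData PDUs needed to satisfy one R2T of r2t_length bytes. */
static uint32_t count_h2cdata_pdus(uint32_t r2t_length, uint32_t maxh2cdata)
{
	uint32_t left = r2t_length;
	uint32_t pdus = 0;

	/* The limit must already have passed validation (so it is non-zero). */
	assert(validate_maxh2cdata(maxh2cdata) == 0);

	while (left > 0) {
		left -= next_h2cdata_len(left, maxh2cdata);
		pdus++;
	}
	return pdus;
}
```

With a limit of 4096 bytes, an R2T for 10000 bytes would be answered with
three PDUs carrying 4096, 4096 and 1808 bytes, instead of a single
10000-byte PDU.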

Signed-off-by: Varun Prakash 
Reviewed-by: Sagi Grimberg 
Signed-off-by: Christoph Hellwig

nvme: also mark passthrough-only namespaces ready in nvme_update_ns_info

2022-02-23T13:42:58+00:00

Commit e7d65803e2bb ("nvme-multipath: revalidate paths during rescan")
introduced the NVME_NS_READY flag, which nvme_path_is_disabled() uses
to check if a path can be used or not.  We also need to set this flag
for devices that fail the ZNS feature validation and which are available
through passthrough devices only so that they can be used in multipathing
setups.

Fixes: e7d65803e2bb ("nvme-multipath: revalidate paths during rescan")
Reported-by: Kanchan Joshi 
Signed-off-by: Christoph Hellwig 
Reviewed-by: Sagi Grimberg 
Reviewed-by: Daniel Wagner 
Tested-by: Kanchan Joshi

nvme: don't return an error from nvme_configure_metadata

2022-02-23T13:42:51+00:00

When a fabrics controller claims to support an invalid metadata
configuration we already warn and disable metadata support.  No need to
also return an error during revalidation.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Sagi Grimberg 
Reviewed-by: Daniel Wagner 
Tested-by: Kanchan Joshi

block: fix surprise removal for drivers calling blk_set_queue_dying

2022-02-17T14:54:03+00:00

Various block drivers call blk_set_queue_dying to mark a disk as dead due
to surprise removal events, but since commit 8e141f9eb803 that doesn't
work given that the GD_DEAD flag needs to be set to stop I/O.

Replace the driver calls to blk_set_queue_dying with a new (and properly
documented) blk_mark_disk_dead API, and fold blk_set_queue_dying into the
only remaining caller.

Fixes: 8e141f9eb803 ("block: drain file system I/O on del_gendisk")
Reported-by: Markus Blöchl 
Signed-off-by: Christoph Hellwig 
Reviewed-by: Sagi Grimberg 
Link: https://lore.kernel.org/r/20220217075231.1140-1-hch@lst.de
Signed-off-by: Jens Axboe

nvme-tcp: fix bogus request completion when failing to send AER

2022-02-09T13:50:42+00:00

An AER is not backed by a real request, so when we fail to send an NVMe
command we must not assume that it is a normal request. Instead, check
whether it is an AER and, if so, complete the AER (similar to the normal
completion path).
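
The decision described above can be sketched as a small standalone C
function. This is a simplified model, not the nvme-tcp code: struct cmd,
its is_aer flag, the fail_action enum and on_send_failure() are all
hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Which completion path to take when sending a command fails. */
enum fail_action {
	COMPLETE_ASYNC_EVENT,	/* AER: no block-layer request behind it */
	END_REQUEST,		/* normal command: complete the real request */
};

/* Hypothetical stand-in for a queued command. */
struct cmd {
	bool is_aer;
};

/* Pick the completion path instead of assuming a normal request. */
static enum fail_action on_send_failure(const struct cmd *cmd)
{
	return cmd->is_aer ? COMPLETE_ASYNC_EVENT : END_REQUEST;
}
```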

Cc: stable@vger.kernel.org
Signed-off-by: Sagi Grimberg 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Christoph Hellwig

nvme: add nvme_complete_req tracepoint for batched completion

2022-02-09T13:50:42+00:00

Add the NVMe request completion trace to nvme_complete_batch_req(),
because the nvme:nvme_complete_req tracepoint is missing for batched
request completion.

Signed-off-by: Bean Huo 
Signed-off-by: Christoph Hellwig

nvme-fabrics: fix state check in nvmf_ctlr_matches_baseopts()

2022-02-03T06:30:57+00:00

Controller deletion/reset, immediately followed by or concurrent with
a reconnect, is hard failing the connect attempt resulting in a
complete loss of connectivity to the controller.

In the connect request, fabrics looks for an existing controller with
the same address components and aborts the connect if a controller
already exists and the duplicate connect option isn't set. The match
routine filters out controllers that are dead or dying, so they don't
interfere with the new connect request.

When NVME_CTRL_DELETING_NOIO was added, it missed updating the state
filters in the nvmf_ctlr_matches_baseopts() routine. Thus, when in this
new state, it's seen as a live controller and fails the connect request.

Correct by adding the DELETING_NOIO state to the match checks.
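
The filtering described above can be sketched in standalone C. This is a
simplified model under stated assumptions, not the fabrics code: the
ctrl_state enum only mirrors the state names mentioned in this message,
and ctrl_counts_as_existing() and connect_allowed() are hypothetical
helpers.

```c
#include <stdbool.h>

/* Simplified stand-in for the controller states (names are illustrative). */
enum ctrl_state {
	CTRL_NEW,
	CTRL_LIVE,
	CTRL_RESETTING,
	CTRL_CONNECTING,
	CTRL_DELETING,
	CTRL_DELETING_NOIO,
	CTRL_DEAD,
};

/*
 * A controller that is dead or dying must not block a new connect.
 * The fix amounts to listing DELETING_NOIO next to DELETING and DEAD.
 */
static bool ctrl_counts_as_existing(enum ctrl_state state)
{
	switch (state) {
	case CTRL_DELETING:
	case CTRL_DELETING_NOIO:
	case CTRL_DEAD:
		return false;
	default:
		return true;
	}
}

/*
 * Abort the connect only if a matching, still usable controller exists
 * and the duplicate connect option is not set.
 */
static bool connect_allowed(bool addr_matches, enum ctrl_state existing,
			    bool duplicate_connect)
{
	if (duplicate_connect)
		return true;
	return !(addr_matches && ctrl_counts_as_existing(existing));
}
```

Without the DELETING_NOIO case, a controller in that state would fall
through to the default branch and be treated as live, which is the
failure described above.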

Fixes: ecca390e8056 ("nvme: fix deadlock in disconnect during scan_work and/or ana_work")
Cc:  # v5.7+
Signed-off-by: Uday Shankar 
Reviewed-by: James Smart 
Reviewed-by: Sagi Grimberg 
Signed-off-by: Christoph Hellwig

nvme-rdma: fix possible use-after-free in transport error_recovery work

2022-02-02T08:19:07+00:00

nvme_rdma_submit_async_event_work checks the ctrl and queue state before
preparing the AER command and scheduling io_work, but that check alone
is not reliable. To fully prevent the race, the error recovery work must
flush async_event_work after setting the ctrl state to RESETTING and
before it continues to destroy the admin queue, so that there is no race
between .submit_async_event and the error recovery handler itself
changing the ctrl state.

Signed-off-by: Sagi Grimberg

nvme-tcp: fix possible use-after-free in transport error_recovery work

2022-02-02T08:19:07+00:00

nvme_tcp_submit_async_event_work checks the ctrl and queue state before
preparing the AER command and scheduling io_work, but that check alone
is not reliable. To fully prevent the race, the error recovery work must
flush async_event_work after setting the ctrl state to RESETTING and
before it continues to destroy the admin queue, so that there is no race
between .submit_async_event and the error recovery handler itself
changing the ctrl state.

Tested-by: Chris Leech 
Signed-off-by: Sagi Grimberg

nvme: fix a possible use-after-free in controller reset during load

2022-02-02T08:19:05+00:00

Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl
readiness for AER submission. This may lead to a use-after-free
condition that was observed with nvme-tcp.

The race condition may happen in the following scenario:
1. driver executes its reset_ctrl_work
2. -> nvme_stop_ctrl - flushes ctrl async_event_work
3. ctrl sends AEN which is received by the host, which in turn
   schedules AEN handling
4. teardown admin queue (which releases the queue socket)
5. AEN processed, submits another AER, calling the driver to submit
6. driver attempts to send the cmd
==> use-after-free

In order to fix that, add ctrl state check to validate the ctrl
is actually able to accept the AER submission.

This addresses the above race in controller resets because the driver
during teardown should:
1. change ctrl state to RESETTING
2. flush async_event_work (as well as other async work elements)

So after steps 1 and 2, any further AER command will find the
ctrl state to be RESETTING and bail out without submitting the AER.
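
The ordering argument above can be sketched in standalone, single-threaded
C. This is a toy model, not the driver code: struct ctrl, its fields,
submit_async_event() and teardown_for_reset() are hypothetical names, and
the flush of async_event_work is represented only by a comment.

```c
#include <stdbool.h>

enum ctrl_state { CTRL_LIVE, CTRL_RESETTING };

/* Hypothetical, much reduced controller model. */
struct ctrl {
	enum ctrl_state state;
	bool admin_queue_alive;	/* false once the queue socket is released */
	int aers_sent;
};

/*
 * AER submission with the added state check: bail out unless the
 * controller is live, so a torn-down admin queue is never touched.
 */
static bool submit_async_event(struct ctrl *c)
{
	if (c->state != CTRL_LIVE)
		return false;
	c->aers_sent++;
	return true;
}

/* Teardown order the fix relies on. */
static void teardown_for_reset(struct ctrl *c)
{
	c->state = CTRL_RESETTING;	/* 1. change ctrl state */
	/* 2. a real driver would flush async_event_work here */
	c->admin_queue_alive = false;	/* only then release the admin queue */
}
```

Any AER handling that runs after teardown_for_reset() sees RESETTING and
returns without sending, which closes the window in steps 3 to 6 of the
scenario.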

Signed-off-by: Sagi Grimberg