linux-toradex.git/drivers/nvme/host/fc.c, branch v5.5

nvme-fc: fix double-free scenarios on hw queues

2019-11-26T18:00:13+00:00

If an error occurs on one of the ios used for creating an
association, the creating routine has error paths that are
invoked by the command failure and the error paths will free
up the controller resources created to that point.

But... the io failure was ultimately detected by an asynchronous
completion routine, which unconditionally invokes the
error_recovery path, which in turn calls delete_association.
Delete association deletes all outstanding
io then tears down the controller resources. So the
create_association thread can be running in parallel with
the error_recovery thread. What was seen was the LLDD received
a call to delete a queue, causing the LLDD to do a free of a
resource, then the transport called the delete queue again
causing the driver to repeat the free call. The second free
routine corrupted the allocator. The transport shouldn't be
making the duplicate call, and the delete queue is just one
of the resources being freed.

To fix, note that the create_association path is completely
serialized, with one command outstanding at a time. So the
failed io completion will always be seen by the create_association
path and, at the time of the failure, there are no ios to terminate
and there is no reason to be manipulating queue freeze states, etc.
The serialized condition stays true until the controller is
transitioned to the LIVE state. Thus the fix is to change the
error recovery path to check the controller state and only
invoke the teardown path if not already in the CONNECTING state.
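
The state check described above can be sketched as a small user-space
model. This is not the driver code: the enum, the struct and the
function names are hypothetical stand-ins, and the teardown is reduced
to a counter so a double invocation is easy to spot.

```c
#include <stdbool.h>

/* Hypothetical, simplified controller states (user-space model). */
enum ctrl_state { CTRL_CONNECTING, CTRL_LIVE, CTRL_RESETTING };

struct ctrl {
	enum ctrl_state state;
	int teardown_calls;	/* how many times resources were torn down */
};

/* Stand-in for the teardown that deletes the association. */
static void teardown_association(struct ctrl *c)
{
	c->teardown_calls++;
	c->state = CTRL_RESETTING;
}

/*
 * Error recovery entry point. While the controller is CONNECTING, the
 * create path is serialized and frees its own resources on failure, so
 * recovery must not start a second teardown.
 * Returns true if this call started a teardown.
 */
static bool error_recovery(struct ctrl *c)
{
	if (c->state == CTRL_CONNECTING)
		return false;
	teardown_association(c);
	return true;
}
```

With a controller in CTRL_CONNECTING, error_recovery() leaves
teardown_calls untouched; once the state is CTRL_LIVE, the same call
performs exactly one teardown.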

Reviewed-by: Himanshu Madhani 
Reviewed-by: Ewan D. Milne 
Signed-off-by: James Smart 
Signed-off-by: Keith Busch

nvme_fc: add module to ops template to allow module references

2019-11-26T17:48:27+00:00

In nvme-fc, it's possible to have connected active controllers
and as no references are taken on the LLDD, the LLDD can be
unloaded.  The controller would enter a reconnect state and as
long as the LLDD resumed within the reconnect timeout, the
controller would resume.  But if a namespace on the controller
is the root device, allowing the driver to unload can be problematic.
To reload the driver, it may require new io to the boot device,
and as it's no longer connected we get into a catch-22 that
eventually fails, and the system locks up.

Fix this issue by taking a module reference for every connected
controller (which is what the core layer did to the transport
module). The reference is released when the controller is removed.
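
As a rough illustration of the reference scheme, here is a user-space
model. In the kernel this role is normally played by try_module_get()
and module_put(); everything named below (struct lldd_module,
ctrl_create(), ctrl_remove(), module_try_unload()) is hypothetical and
only mimics that behaviour.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for an LLDD module. */
struct lldd_module {
	int refcount;
	bool unloading;
};

/* Take a reference; fails once unload has begun. */
static bool module_ref_get(struct lldd_module *m)
{
	if (m->unloading)
		return false;
	m->refcount++;
	return true;
}

/* Drop a reference taken by module_ref_get(). */
static void module_ref_put(struct lldd_module *m)
{
	m->refcount--;
}

/* Unload is refused while any controller still holds a reference. */
static bool module_try_unload(struct lldd_module *m)
{
	if (m->refcount > 0)
		return false;
	m->unloading = true;
	return true;
}

/* Controller create and remove bracket the module reference. */
static bool ctrl_create(struct lldd_module *m)
{
	return module_ref_get(m);
}

static void ctrl_remove(struct lldd_module *m)
{
	module_ref_put(m);
}
```

While one controller exists, module_try_unload() returns false; after
ctrl_remove() the unload succeeds, and later ctrl_create() calls fail.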

Acked-by: Himanshu Madhani 
Reviewed-by: Christoph Hellwig 
Signed-off-by: James Smart 
Signed-off-by: Keith Busch

nvme-fc: Avoid preallocating big SGL for data

2019-11-26T17:14:01+00:00

nvme_fc_create_io_queues() preallocates a big buffer for the IO SGL based
on SG_CHUNK_SIZE.

Modern DMA engines are often capable of dealing with very big segments so
the SG_CHUNK_SIZE is often too big. SG_CHUNK_SIZE results in a static 4KB
SGL allocation per command.

If a controller has lots of deep queues, preallocation for the sg list can
consume substantial amounts of memory. For nvme-fc, nr_hw_queues can be
128 and each queue's depth 128. This means the resulting preallocation
for the data SGL is 128*128*4K = 64MB per controller.
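
The arithmetic above can be checked with a tiny helper. The function
name is made up for illustration; the inputs are the figures quoted in
this message (128 queues, depth 128, 4KB per command).

```c
#include <stddef.h>

/*
 * Total bytes preallocated for data SGLs across all queues of one
 * controller: one fixed-size SGL per command slot.
 */
static size_t sgl_prealloc_bytes(size_t nr_hw_queues, size_t queue_depth,
				 size_t bytes_per_cmd)
{
	return nr_hw_queues * queue_depth * bytes_per_cmd;
}
```

sgl_prealloc_bytes(128, 128, 4096) gives 67108864 bytes, i.e. 64MB.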

Switch to runtime allocation for SGL for lists longer than 2 entries. This
is the approach used by NVMe PCI so it should be reasonable for NVMeOF as
well. Runtime SGL allocation has always been the case for the legacy I/O
path so this is nothing new.
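
A minimal user-space sketch of the idea, assuming a threshold of 2
inline entries as stated above. It does not use the kernel scatterlist
API; INLINE_SG_CNT, struct cmd_sgl and the two helpers are hypothetical
names chosen for the example.

```c
#include <stdbool.h>
#include <stdlib.h>

#define INLINE_SG_CNT 2	/* hypothetical name for the inline threshold */

struct sg_entry {
	void *addr;
	unsigned int len;
};

struct cmd_sgl {
	struct sg_entry inline_sg[INLINE_SG_CNT]; /* lives in the command */
	struct sg_entry *sg;	/* points at inline_sg or at heap memory */
	unsigned int nents;
};

/* Use inline storage for short lists, runtime allocation otherwise. */
static bool cmd_sgl_init(struct cmd_sgl *s, unsigned int nents)
{
	if (nents <= INLINE_SG_CNT) {
		s->sg = s->inline_sg;
	} else {
		s->sg = calloc(nents, sizeof(*s->sg));
		if (!s->sg)
			return false;
	}
	s->nents = nents;
	return true;
}

/* Free only what was allocated at runtime. */
static void cmd_sgl_free(struct cmd_sgl *s)
{
	if (s->sg != s->inline_sg)
		free(s->sg);
	s->sg = NULL;
	s->nents = 0;
}
```

The per-command footprint is then two entries instead of a full chunk,
and only commands with longer lists pay for an allocation.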

Reviewed-by: Max Gurtovoy 
Reviewed-by: James Smart 
Signed-off-by: Israel Rukshin 
Signed-off-by: Keith Busch

nvme: move common call to nvme_cleanup_cmd to core layer

2019-11-04T17:56:41+00:00

nvme_cleanup_cmd should be called for each call to nvme_setup_cmd
(symmetrical functions). Move the call for nvme_cleanup_cmd to the common
core layer and call it during nvme_complete_rq for the good flow. For
error flow, each transport will call nvme_cleanup_cmd independently. Also
take care of a special case of path failure, where we call
nvme_complete_rq without doing nvme_setup_cmd.
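
The symmetry can be pictured with a small user-space model. How the
real patch recognises a request that never went through nvme_setup_cmd
is not stated in this message; the setup_done flag below is purely an
assumption made for the sketch, as are all the names.

```c
#include <stdbool.h>

/* Hypothetical request model; setup_done is an illustrative flag. */
struct request {
	bool setup_done;	/* setup ran and still holds resources */
	int cleanups;		/* number of effective cleanup calls */
};

/* Counterpart of the setup step: marks resources as held. */
static void setup_cmd(struct request *rq)
{
	rq->setup_done = true;
}

/* Undo setup exactly once; a no-op if setup never ran. */
static void cleanup_cmd(struct request *rq)
{
	if (!rq->setup_done)
		return;		/* e.g. path failure before setup */
	rq->setup_done = false;
	rq->cleanups++;
}

/* Common completion path: cleanup is done here for the good flow. */
static void complete_rq(struct request *rq)
{
	cleanup_cmd(rq);
}
```

Completing a request that was never set up performs no cleanup, and a
request that was set up is cleaned up once even if completed twice.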

Signed-off-by: Max Gurtovoy 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Keith Busch 
Signed-off-by: Jens Axboe

nvme-fc: ensure association_id is cleared regardless of a Disconnect LS

2019-11-04T17:56:40+00:00

The code today only clears the association_id if a Disconnect LS is
transmitted.

Remove ambiguity and unconditionally clear the association_id if the
association has been terminated.

Signed-off-by: James Smart 
Signed-off-by: Keith Busch 
Signed-off-by: Jens Axboe

nvme-fc: clarify error messages

2019-11-04T17:56:40+00:00

Change wording on a couple of messages to clarify what happened.

Signed-off-by: Ewan D. Milne 
Signed-off-by: James Smart 
Signed-off-by: Keith Busch 
Signed-off-by: Jens Axboe

nvme-fc: Set new cmd set indicator in nvme-fc cmnd iu

2019-11-04T17:56:40+00:00

Set the new category field in the FC-NVME CMND_IU based on queue number.

Signed-off-by: James Smart 
Signed-off-by: Keith Busch 
Signed-off-by: Jens Axboe

nvme-fc and nvmet-fc: sync with FC-NVME-2 header changes

2019-11-04T17:56:40+00:00

Sync sources with revised structure and field names to correspond with
FC-NVME-2 header sync-up.

Tested interoperability with success:
- prior initiator with new target
- prior target with new initiator
- new on new

Signed-off-by: James Smart 
Signed-off-by: Keith Busch 
Signed-off-by: Jens Axboe

nvme-fc: Fail transport errors with NVME_SC_HOST_PATH

2019-09-12T15:50:45+00:00

NVME_SC_INTERNAL should indicate internal controller errors
and not host transport errors. These errors will propagate to
upper layers (essentially nvme core) and be interpreted as
transport errors which should not be taken into account for
namespace state or condition.
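
A small user-space sketch of the distinction. The numeric values are
what I believe include/linux/nvme.h uses for NVME_SC_INTERNAL and
NVME_SC_HOST_PATH_ERROR, but treat them as assumptions; the enum
err_origin and both helpers are hypothetical.

```c
#include <stdbool.h>

/* Assumed status values; check include/linux/nvme.h before relying on them. */
enum {
	SC_SUCCESS		= 0x0,
	SC_INTERNAL		= 0x6,	/* internal controller error */
	SC_HOST_PATH_ERROR	= 0x370	/* host-side path/transport error */
};

/* Hypothetical classification of where a failure came from. */
enum err_origin { ERR_NONE, ERR_CONTROLLER, ERR_TRANSPORT };

/* Pick the completion status according to the error origin. */
static int completion_status(enum err_origin origin)
{
	switch (origin) {
	case ERR_TRANSPORT:
		return SC_HOST_PATH_ERROR;
	case ERR_CONTROLLER:
		return SC_INTERNAL;
	default:
		return SC_SUCCESS;
	}
}

/* Path-related statuses carry status code type 3 in bits 10:8. */
static bool is_path_error(int status)
{
	return (status & 0x700) == 0x300;
}
```

An upper layer using a check like is_path_error() can then tell a
transport failure apart from a controller-reported one.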

Reviewed-by: Hannes Reinecke 
Reviewed-by: James Smart 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Sagi Grimberg

nvme-fc: Use rq_dma_dir macro

2019-08-29T19:55:03+00:00

Remove code duplication.

Signed-off-by: Israel Rukshin 
Reviewed-by: Max Gurtovoy 
Reviewed-by: James Smart 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Sagi Grimberg