<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/nvme, branch v4.9.110</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>nvmet: don't overwrite identify sn/fr with 0-bytes</title>
<updated>2018-06-16T07:52:33+00:00</updated>
<author>
<name>Martin Wilck</name>
<email>mwilck@suse.com</email>
</author>
<published>2017-08-14T20:12:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1c4eb2a50e77109b3b89c57e80fce54a72f69fcf'/>
<id>1c4eb2a50e77109b3b89c57e80fce54a72f69fcf</id>
<content type='text'>
commit 42819eb7a0957cc340ad4ed8bba736bab5ebc464 upstream.

The merged version of my patch "nvmet: don't report 0-bytes in serial
number" fails to remove two lines which should have been replaced,
so that the space-padded strings are overwritten again with 0-bytes.
Fix it.

Fixes: 42de82a8b544 nvmet: don't report 0-bytes in serial number
Signed-off-by: Martin Wilck &lt;mwilck@suse.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimbeg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 42819eb7a0957cc340ad4ed8bba736bab5ebc464 upstream.

The merged version of my patch "nvmet: don't report 0-bytes in serial
number" fails to remove two lines which should have been replaced,
so that the space-padded strings are overwritten again with 0-bytes.
Fix it.

Fixes: 42de82a8b544 nvmet: don't report 0-bytes in serial number
Signed-off-by: Martin Wilck &lt;mwilck@suse.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimbeg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>nvmet: don't report 0-bytes in serial number</title>
<updated>2018-06-16T07:52:32+00:00</updated>
<author>
<name>Martin Wilck</name>
<email>mwilck@suse.com</email>
</author>
<published>2017-07-13T22:25:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f43d8e4c8619c5ac15697ca63747737000441aca'/>
<id>f43d8e4c8619c5ac15697ca63747737000441aca</id>
<content type='text'>
commit 42de82a8b544fa55670feef7d6f85085fba48fc0 upstream.

The NVME standard mandates that the SN, MN, and FR fields of the Identify
Controller Data Structure be "ASCII strings".  That means that they may
not contain 0-bytes, not even string terminators.

Signed-off-by: Martin Wilck &lt;mwilck@suse.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
[hch: fixed for the move of the serial field, updated description]
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 42de82a8b544fa55670feef7d6f85085fba48fc0 upstream.

The NVME standard mandates that the SN, MN, and FR fields of the Identify
Controller Data Structure be "ASCII strings".  That means that they may
not contain 0-bytes, not even string terminators.

Signed-off-by: Martin Wilck &lt;mwilck@suse.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
[hch: fixed for the move of the serial field, updated description]
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>nvmet: Move serial number from controller to subsystem</title>
<updated>2018-06-16T07:52:32+00:00</updated>
<author>
<name>Johannes Thumshirn</name>
<email>jthumshirn@suse.de</email>
</author>
<published>2017-07-14T13:36:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1e38f8e9864f2c52775743694f0517c6bd276d83'/>
<id>1e38f8e9864f2c52775743694f0517c6bd276d83</id>
<content type='text'>
commit 2e7f5d2af2155084c6f7c86328d36e698cd84954 upstream.

The NVMe specification defines the serial number as:

"Serial Number (SN): Contains the serial number for the NVM subsystem
that is assigned by the vendor as an ASCII string. Refer to section
7.10 for unique identifier requirements. Refer to section 1.5 for ASCII
string requirements"

So move it from the controller to the subsystem, where it belongs.

Signed-off-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 2e7f5d2af2155084c6f7c86328d36e698cd84954 upstream.

The NVMe specification defines the serial number as:

"Serial Number (SN): Contains the serial number for the NVM subsystem
that is assigned by the vendor as an ASCII string. Refer to section
7.10 for unique identifier requirements. Refer to section 1.5 for ASCII
string requirements"

So move it from the controller to the subsystem, where it belongs.

Signed-off-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>nvme-pci: initialize queue memory before interrupts</title>
<updated>2018-06-16T07:52:32+00:00</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2017-09-14T17:54:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b53761a18e71a5ce699c1e93d0f9a7c59b186f08'/>
<id>b53761a18e71a5ce699c1e93d0f9a7c59b186f08</id>
<content type='text'>
commit 161b8be2bd6abad250d4b3f674bdd5480f15beeb upstream.

A spurious interrupt before the nvme driver has initialized the completion
queue may inadvertently cause the driver to believe it has a completion
to process. This may result in a NULL dereference since the nvmeq's tags
are not set at this point.

The patch initializes the host's CQ memory so that a spurious interrupt
isn't mistaken for a real completion.

Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 161b8be2bd6abad250d4b3f674bdd5480f15beeb upstream.

A spurious interrupt before the nvme driver has initialized the completion
queue may inadvertently cause the driver to believe it has a completion
to process. This may result in a NULL dereference since the nvmeq's tags
are not set at this point.

The patch initializes the host's CQ memory so that a spurious interrupt
isn't mistaken for a real completion.

Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>nvme: don't send keep-alives to the discovery controller</title>
<updated>2018-05-30T05:50:39+00:00</updated>
<author>
<name>Johannes Thumshirn</name>
<email>jthumshirn@suse.de</email>
</author>
<published>2018-04-12T15:16:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f2a1bf451130e7141e3cf88f7e682c0d6afa9612'/>
<id>f2a1bf451130e7141e3cf88f7e682c0d6afa9612</id>
<content type='text'>
[ Upstream commit 74c6c71530847808d4e3be7b205719270efee80c ]

NVMe over Fabrics 1.0 Section 5.2 "Discovery Controller Properties and
Command Support" Figure 31 "Discovery Controller – Admin Commands"
explicitly listst all commands but "Get Log Page" and "Identify" as
reserved, but NetApp report the Linux host is sending Keep Alive
commands to the discovery controller, which is a violation of the
Spec.

We're already checking for discovery controllers when configuring the
keep alive timeout but when creating a discovery controller we're not
hard wiring the keep alive timeout to 0 and thus remain on
NVME_DEFAULT_KATO for the discovery controller.

This can be easily remproduced when issuing a direct connect to the
discovery susbsystem using:
'nvme connect [...] --nqn=nqn.2014-08.org.nvmexpress.discovery'

Signed-off-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Fixes: 07bfcd09a288 ("nvme-fabrics: add a generic NVMe over Fabrics library")
Reported-by: Martin George &lt;marting@netapp.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 74c6c71530847808d4e3be7b205719270efee80c ]

NVMe over Fabrics 1.0 Section 5.2 "Discovery Controller Properties and
Command Support" Figure 31 "Discovery Controller – Admin Commands"
explicitly listst all commands but "Get Log Page" and "Identify" as
reserved, but NetApp report the Linux host is sending Keep Alive
commands to the discovery controller, which is a violation of the
Spec.

We're already checking for discovery controllers when configuring the
keep alive timeout but when creating a discovery controller we're not
hard wiring the keep alive timeout to 0 and thus remain on
NVME_DEFAULT_KATO for the discovery controller.

This can be easily remproduced when issuing a direct connect to the
discovery susbsystem using:
'nvme connect [...] --nqn=nqn.2014-08.org.nvmexpress.discovery'

Signed-off-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Fixes: 07bfcd09a288 ("nvme-fabrics: add a generic NVMe over Fabrics library")
Reported-by: Martin George &lt;marting@netapp.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvmet: fix PSDT field check in command format</title>
<updated>2018-05-30T05:50:33+00:00</updated>
<author>
<name>Max Gurtovoy</name>
<email>maxg@mellanox.com</email>
</author>
<published>2018-01-24T15:31:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c0074250ea9064ddb9c0476b6bec75a4d862cc01'/>
<id>c0074250ea9064ddb9c0476b6bec75a4d862cc01</id>
<content type='text'>
[ Upstream commit bffd2b61670feef18d2535e9b53364d270a1c991 ]

PSDT field section according to NVM_Express-1.3:
"This field specifies whether PRPs or SGLs are used for any data
transfer associated with the command. PRPs shall be used for all
Admin commands for NVMe over PCIe. SGLs shall be used for all Admin
and I/O commands for NVMe over Fabrics. This field shall be set to
01b for NVMe over Fabrics 1.0 implementations.

Suggested-by: Idan Burstein &lt;idanb@mellanox.com&gt;
Signed-off-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit bffd2b61670feef18d2535e9b53364d270a1c991 ]

PSDT field section according to NVM_Express-1.3:
"This field specifies whether PRPs or SGLs are used for any data
transfer associated with the command. PRPs shall be used for all
Admin commands for NVMe over PCIe. SGLs shall be used for all Admin
and I/O commands for NVMe over Fabrics. This field shall be set to
01b for NVMe over Fabrics 1.0 implementations.

Suggested-by: Idan Burstein &lt;idanb@mellanox.com&gt;
Signed-off-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme-pci: Fix nvme queue cleanup if IRQ setup fails</title>
<updated>2018-05-30T05:50:31+00:00</updated>
<author>
<name>Jianchao Wang</name>
<email>jianchao.w.wang@oracle.com</email>
</author>
<published>2018-02-15T11:13:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=23f9fb0f53b3add8669f093d58fe193846154a87'/>
<id>23f9fb0f53b3add8669f093d58fe193846154a87</id>
<content type='text'>
[ Upstream commit f25a2dfc20e3a3ed8fe6618c331799dd7bd01190 ]

This patch fixes nvme queue cleanup if requesting an IRQ handler for
the queue's vector fails. It does this by resetting the cq_vector to
the uninitialized value of -1 so it is ignored for a controller reset.

Signed-off-by: Jianchao Wang &lt;jianchao.w.wang@oracle.com&gt;
[changelog updates, removed misc whitespace changes]
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit f25a2dfc20e3a3ed8fe6618c331799dd7bd01190 ]

This patch fixes nvme queue cleanup if requesting an IRQ handler for
the queue's vector fails. It does this by resetting the cq_vector to
the uninitialized value of -1 so it is ignored for a controller reset.

Signed-off-by: Jianchao Wang &lt;jianchao.w.wang@oracle.com&gt;
[changelog updates, removed misc whitespace changes]
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme: fix hang in remove path</title>
<updated>2018-04-13T17:48:23+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2017-06-02T08:32:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4aae4388165a2611fa4206363ccb243c1622446c'/>
<id>4aae4388165a2611fa4206363ccb243c1622446c</id>
<content type='text'>
[ Upstream commit 82654b6b8ef8b93ee87a97fc562f87f081fc2f91 ]

We need to start admin queues too in nvme_kill_queues()
for avoiding hang in remove path[1].

This patch is very similar with 806f026f9b901eaf(nvme: use
blk_mq_start_hw_queues() in nvme_kill_queues()).

[1] hang stack trace
[&lt;ffffffff813c9716&gt;] blk_execute_rq+0x56/0x80
[&lt;ffffffff815cb6e9&gt;] __nvme_submit_sync_cmd+0x89/0xf0
[&lt;ffffffff815ce7be&gt;] nvme_set_features+0x5e/0x90
[&lt;ffffffff815ce9f6&gt;] nvme_configure_apst+0x166/0x200
[&lt;ffffffff815cef45&gt;] nvme_set_latency_tolerance+0x35/0x50
[&lt;ffffffff8157bd11&gt;] apply_constraint+0xb1/0xc0
[&lt;ffffffff8157cbb4&gt;] dev_pm_qos_constraints_destroy+0xf4/0x1f0
[&lt;ffffffff8157b44a&gt;] dpm_sysfs_remove+0x2a/0x60
[&lt;ffffffff8156d951&gt;] device_del+0x101/0x320
[&lt;ffffffff8156db8a&gt;] device_unregister+0x1a/0x60
[&lt;ffffffff8156dc4c&gt;] device_destroy+0x3c/0x50
[&lt;ffffffff815cd295&gt;] nvme_uninit_ctrl+0x45/0xa0
[&lt;ffffffff815d4858&gt;] nvme_remove+0x78/0x110
[&lt;ffffffff81452b69&gt;] pci_device_remove+0x39/0xb0
[&lt;ffffffff81572935&gt;] device_release_driver_internal+0x155/0x210
[&lt;ffffffff81572a02&gt;] device_release_driver+0x12/0x20
[&lt;ffffffff815d36fb&gt;] nvme_remove_dead_ctrl_work+0x6b/0x70
[&lt;ffffffff810bf3bc&gt;] process_one_work+0x18c/0x3a0
[&lt;ffffffff810bf61e&gt;] worker_thread+0x4e/0x3b0
[&lt;ffffffff810c5ac9&gt;] kthread+0x109/0x140
[&lt;ffffffff8185800c&gt;] ret_from_fork+0x2c/0x40
[&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Fixes: c5552fde102fc("nvme: Enable autonomous power state transitions")
Reported-by: Rakesh Pandit &lt;rakesh@tuxera.com&gt;
Tested-by: Rakesh Pandit &lt;rakesh@tuxera.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 82654b6b8ef8b93ee87a97fc562f87f081fc2f91 ]

We need to start admin queues too in nvme_kill_queues()
for avoiding hang in remove path[1].

This patch is very similar with 806f026f9b901eaf(nvme: use
blk_mq_start_hw_queues() in nvme_kill_queues()).

[1] hang stack trace
[&lt;ffffffff813c9716&gt;] blk_execute_rq+0x56/0x80
[&lt;ffffffff815cb6e9&gt;] __nvme_submit_sync_cmd+0x89/0xf0
[&lt;ffffffff815ce7be&gt;] nvme_set_features+0x5e/0x90
[&lt;ffffffff815ce9f6&gt;] nvme_configure_apst+0x166/0x200
[&lt;ffffffff815cef45&gt;] nvme_set_latency_tolerance+0x35/0x50
[&lt;ffffffff8157bd11&gt;] apply_constraint+0xb1/0xc0
[&lt;ffffffff8157cbb4&gt;] dev_pm_qos_constraints_destroy+0xf4/0x1f0
[&lt;ffffffff8157b44a&gt;] dpm_sysfs_remove+0x2a/0x60
[&lt;ffffffff8156d951&gt;] device_del+0x101/0x320
[&lt;ffffffff8156db8a&gt;] device_unregister+0x1a/0x60
[&lt;ffffffff8156dc4c&gt;] device_destroy+0x3c/0x50
[&lt;ffffffff815cd295&gt;] nvme_uninit_ctrl+0x45/0xa0
[&lt;ffffffff815d4858&gt;] nvme_remove+0x78/0x110
[&lt;ffffffff81452b69&gt;] pci_device_remove+0x39/0xb0
[&lt;ffffffff81572935&gt;] device_release_driver_internal+0x155/0x210
[&lt;ffffffff81572a02&gt;] device_release_driver+0x12/0x20
[&lt;ffffffff815d36fb&gt;] nvme_remove_dead_ctrl_work+0x6b/0x70
[&lt;ffffffff810bf3bc&gt;] process_one_work+0x18c/0x3a0
[&lt;ffffffff810bf61e&gt;] worker_thread+0x4e/0x3b0
[&lt;ffffffff810c5ac9&gt;] kthread+0x109/0x140
[&lt;ffffffff8185800c&gt;] ret_from_fork+0x2c/0x40
[&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Fixes: c5552fde102fc("nvme: Enable autonomous power state transitions")
Reported-by: Rakesh Pandit &lt;rakesh@tuxera.com&gt;
Tested-by: Rakesh Pandit &lt;rakesh@tuxera.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme-pci: fix multiple ctrl removal scheduling</title>
<updated>2018-04-13T17:48:22+00:00</updated>
<author>
<name>Rakesh Pandit</name>
<email>rakesh@tuxera.com</email>
</author>
<published>2017-06-05T11:43:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=383f15ee34fb21137a6486162132f79fc049cb3e'/>
<id>383f15ee34fb21137a6486162132f79fc049cb3e</id>
<content type='text'>
[ Upstream commit 82b057caefaff2a891f821a617d939f46e03e844 ]

Commit c5f6ce97c1210 tries to address multiple resets but fails as
work_busy doesn't involve any synchronization and can fail.  This is
reproducible easily as can be seen by WARNING below which is triggered
with line:

WARN_ON(dev-&gt;ctrl.state == NVME_CTRL_RESETTING)

Allowing multiple resets can result in multiple controller removal as
well if different conditions inside nvme_reset_work fail and which
might deadlock on device_release_driver.

[  480.327007] WARNING: CPU: 3 PID: 150 at drivers/nvme/host/pci.c:1900 nvme_reset_work+0x36c/0xec0
[  480.327008] Modules linked in: rfcomm fuse nf_conntrack_netbios_ns nf_conntrack_broadcast...
[  480.327044]  btusb videobuf2_core ghash_clmulni_intel snd_hwdep cfg80211 acer_wmi hci_uart..
[  480.327065] CPU: 3 PID: 150 Comm: kworker/u16:2 Not tainted 4.12.0-rc1+ #13
[  480.327065] Hardware name: Acer Predator G9-591/Mustang_SLS, BIOS V1.10 03/03/2016
[  480.327066] Workqueue: nvme nvme_reset_work
[  480.327067] task: ffff880498ad8000 task.stack: ffffc90002218000
[  480.327068] RIP: 0010:nvme_reset_work+0x36c/0xec0
[  480.327069] RSP: 0018:ffffc9000221bdb8 EFLAGS: 00010246
[  480.327070] RAX: 0000000000460000 RBX: ffff880498a98128 RCX: dead000000000200
[  480.327070] RDX: 0000000000000001 RSI: ffff8804b1028020 RDI: ffff880498a98128
[  480.327071] RBP: ffffc9000221be50 R08: 0000000000000000 R09: 0000000000000000
[  480.327071] R10: ffffc90001963ce8 R11: 000000000000020d R12: ffff880498a98000
[  480.327072] R13: ffff880498a53500 R14: ffff880498a98130 R15: ffff880498a98128
[  480.327072] FS:  0000000000000000(0000) GS:ffff8804c1cc0000(0000) knlGS:0000000000000000
[  480.327073] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  480.327074] CR2: 00007ffcf3c37f78 CR3: 0000000001e09000 CR4: 00000000003406e0
[  480.327074] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  480.327075] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  480.327075] Call Trace:
[  480.327079]  ? __switch_to+0x227/0x400
[  480.327081]  process_one_work+0x18c/0x3a0
[  480.327082]  worker_thread+0x4e/0x3b0
[  480.327084]  kthread+0x109/0x140
[  480.327085]  ? process_one_work+0x3a0/0x3a0
[  480.327087]  ? kthread_park+0x60/0x60
[  480.327102]  ret_from_fork+0x2c/0x40
[  480.327103] Code: e8 5a dc ff ff 85 c0 41 89 c1 0f.....

This patch addresses the problem by using state of controller to
decide whether reset should be queued or not as state change is
synchronizated using controller spinlock.  Also cancel_work_sync is
used to make sure remove cancels the reset_work and waits for it to
finish.  This patch also changes return value from -ENODEV to more
appropriate -EBUSY if nvme_reset fails to change state.

Fixes: c5f6ce97c1210 ("nvme: don't schedule multiple resets")
Signed-off-by: Rakesh Pandit &lt;rakesh@tuxera.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 82b057caefaff2a891f821a617d939f46e03e844 ]

Commit c5f6ce97c1210 tries to address multiple resets but fails as
work_busy doesn't involve any synchronization and can fail.  This is
reproducible easily as can be seen by WARNING below which is triggered
with line:

WARN_ON(dev-&gt;ctrl.state == NVME_CTRL_RESETTING)

Allowing multiple resets can result in multiple controller removal as
well if different conditions inside nvme_reset_work fail and which
might deadlock on device_release_driver.

[  480.327007] WARNING: CPU: 3 PID: 150 at drivers/nvme/host/pci.c:1900 nvme_reset_work+0x36c/0xec0
[  480.327008] Modules linked in: rfcomm fuse nf_conntrack_netbios_ns nf_conntrack_broadcast...
[  480.327044]  btusb videobuf2_core ghash_clmulni_intel snd_hwdep cfg80211 acer_wmi hci_uart..
[  480.327065] CPU: 3 PID: 150 Comm: kworker/u16:2 Not tainted 4.12.0-rc1+ #13
[  480.327065] Hardware name: Acer Predator G9-591/Mustang_SLS, BIOS V1.10 03/03/2016
[  480.327066] Workqueue: nvme nvme_reset_work
[  480.327067] task: ffff880498ad8000 task.stack: ffffc90002218000
[  480.327068] RIP: 0010:nvme_reset_work+0x36c/0xec0
[  480.327069] RSP: 0018:ffffc9000221bdb8 EFLAGS: 00010246
[  480.327070] RAX: 0000000000460000 RBX: ffff880498a98128 RCX: dead000000000200
[  480.327070] RDX: 0000000000000001 RSI: ffff8804b1028020 RDI: ffff880498a98128
[  480.327071] RBP: ffffc9000221be50 R08: 0000000000000000 R09: 0000000000000000
[  480.327071] R10: ffffc90001963ce8 R11: 000000000000020d R12: ffff880498a98000
[  480.327072] R13: ffff880498a53500 R14: ffff880498a98130 R15: ffff880498a98128
[  480.327072] FS:  0000000000000000(0000) GS:ffff8804c1cc0000(0000) knlGS:0000000000000000
[  480.327073] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  480.327074] CR2: 00007ffcf3c37f78 CR3: 0000000001e09000 CR4: 00000000003406e0
[  480.327074] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  480.327075] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  480.327075] Call Trace:
[  480.327079]  ? __switch_to+0x227/0x400
[  480.327081]  process_one_work+0x18c/0x3a0
[  480.327082]  worker_thread+0x4e/0x3b0
[  480.327084]  kthread+0x109/0x140
[  480.327085]  ? process_one_work+0x3a0/0x3a0
[  480.327087]  ? kthread_park+0x60/0x60
[  480.327102]  ret_from_fork+0x2c/0x40
[  480.327103] Code: e8 5a dc ff ff 85 c0 41 89 c1 0f.....

This patch addresses the problem by using state of controller to
decide whether reset should be queued or not as state change is
synchronizated using controller spinlock.  Also cancel_work_sync is
used to make sure remove cancels the reset_work and waits for it to
finish.  This patch also changes return value from -ENODEV to more
appropriate -EBUSY if nvme_reset fails to change state.

Fixes: c5f6ce97c1210 ("nvme: don't schedule multiple resets")
Signed-off-by: Rakesh Pandit &lt;rakesh@tuxera.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nvme: check hw sectors before setting chunk sectors</title>
<updated>2018-03-03T09:23:20+00:00</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2017-12-14T18:20:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d4ea6118d3e36ea87b02607aea8f173b4c69f808'/>
<id>d4ea6118d3e36ea87b02607aea8f173b4c69f808</id>
<content type='text'>
[ Upstream commit 249159c5f15812140fa216f9997d799ac0023a1f ]

Some devices with IDs matching the "stripe" quirk don't actually have
this quirk, and don't have an MDTS value. When MDTS is not set, the
driver sets the max sectors to UINT_MAX, which is not a power of 2,
hitting a BUG_ON from blk_queue_chunk_sectors. This patch skips setting
chunk sectors for such devices.

Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 249159c5f15812140fa216f9997d799ac0023a1f ]

Some devices with IDs matching the "stripe" quirk don't actually have
this quirk, and don't have an MDTS value. When MDTS is not set, the
driver sets the max sectors to UINT_MAX, which is not a power of 2,
hitting a BUG_ON from blk_queue_chunk_sectors. This patch skips setting
chunk sectors for such devices.

Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
