<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/scsi/sd.h, branch v3.11</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>[SCSI] sd: Update WRITE SAME heuristics</title>
<updated>2013-06-27T00:56:18+00:00</updated>
<author>
<name>Martin K. Petersen</name>
<email>martin.petersen@oracle.com</email>
</author>
<published>2013-06-07T02:15:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=66c28f97120e8a621afd5aa7a31c4b85c547d33d'/>
<id>66c28f97120e8a621afd5aa7a31c4b85c547d33d</id>
<content type='text'>
SATA drives located behind a SAS controller would incorrectly receive
WRITE SAME commands. Tweak the heuristics so that:

 - If REPORT SUPPORTED OPERATION CODES is provided we will use that to
   choose between WRITE SAME(16), WRITE SAME(10) and disabled. This also
   fixes an issue with the old code which would issue WRITE SAME(10)
   despite the command not being whitelisted in REPORT SUPPORTED
   OPERATION CODES.

 - If REPORT SUPPORTED OPERATION CODES is not provided we will fall back
   to WRITE SAME(10) unless the device has an ATA Information VPD page.
   The assumption is that a SATL which is smart enough to implement
   WRITE SAME would also provide REPORT SUPPORTED OPERATION CODES.

To facilitate the new heuristics scsi_report_opcode() has been modified
to so we can distinguish between "operation not supported" and "RSOC not
supported".

Reported-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Tested-by: Bernd Schubert &lt;bernd.schubert@itwm.fraunhofer.de&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SATA drives located behind a SAS controller would incorrectly receive
WRITE SAME commands. Tweak the heuristics so that:

 - If REPORT SUPPORTED OPERATION CODES is provided we will use that to
   choose between WRITE SAME(16), WRITE SAME(10) and disabled. This also
   fixes an issue with the old code which would issue WRITE SAME(10)
   despite the command not being whitelisted in REPORT SUPPORTED
   OPERATION CODES.

 - If REPORT SUPPORTED OPERATION CODES is not provided we will fall back
   to WRITE SAME(10) unless the device has an ATA Information VPD page.
   The assumption is that a SATL which is smart enough to implement
   WRITE SAME would also provide REPORT SUPPORTED OPERATION CODES.

To facilitate the new heuristics scsi_report_opcode() has been modified
to so we can distinguish between "operation not supported" and "RSOC not
supported".

Reported-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Tested-by: Bernd Schubert &lt;bernd.schubert@itwm.fraunhofer.de&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[SCSI] sd: fix array cache flushing bug causing performance problems</title>
<updated>2013-05-02T23:15:54+00:00</updated>
<author>
<name>James Bottomley</name>
<email>JBottomley@Parallels.com</email>
</author>
<published>2013-04-24T21:02:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=39c60a0948cc06139e2fbfe084f83cb7e7deae3b'/>
<id>39c60a0948cc06139e2fbfe084f83cb7e7deae3b</id>
<content type='text'>
Some arrays synchronize their full non volatile cache when the sd driver sends
a SYNCHRONIZE CACHE command.  Unfortunately, they can have Terrabytes of this
and we send a SYNCHRONIZE CACHE for every barrier if an array reports it has a
writeback cache.  This leads to massive slowdowns on journalled filesystems.

The fix is to allow userspace to turn off the writeback cache setting as a
temporary measure (i.e. without doing the MODE SELECT to write it back to the
device), so even though the device reported it has a writeback cache, the
user, knowing that the cache is non volatile and all they care about is
filesystem correctness, can turn that bit off in the kernel and avoid the
performance ruinous (and safety irrelevant) SYNCHRONIZE CACHE commands.

The way you do this is add a 'temporary' prefix when performing the usual
cache setting operations, so

echo temporary write through &gt; /sys/class/scsi_disk/&lt;disk&gt;/cache_type

Reported-by: Ric Wheeler &lt;rwheeler@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some arrays synchronize their full non volatile cache when the sd driver sends
a SYNCHRONIZE CACHE command.  Unfortunately, they can have Terrabytes of this
and we send a SYNCHRONIZE CACHE for every barrier if an array reports it has a
writeback cache.  This leads to massive slowdowns on journalled filesystems.

The fix is to allow userspace to turn off the writeback cache setting as a
temporary measure (i.e. without doing the MODE SELECT to write it back to the
device), so even though the device reported it has a writeback cache, the
user, knowing that the cache is non volatile and all they care about is
filesystem correctness, can turn that bit off in the kernel and avoid the
performance ruinous (and safety irrelevant) SYNCHRONIZE CACHE commands.

The way you do this is add a 'temporary' prefix when performing the usual
cache setting operations, so

echo temporary write through &gt; /sys/class/scsi_disk/&lt;disk&gt;/cache_type

Reported-by: Ric Wheeler &lt;rwheeler@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[SCSI] sd: Implement support for WRITE SAME</title>
<updated>2012-11-14T06:45:42+00:00</updated>
<author>
<name>Martin K. Petersen</name>
<email>martin.petersen@oracle.com</email>
</author>
<published>2012-09-18T16:19:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5db44863b6ebbb400c5e61d56ebe8f21ef48b1bd'/>
<id>5db44863b6ebbb400c5e61d56ebe8f21ef48b1bd</id>
<content type='text'>
Implement support for WRITE SAME(10) and WRITE SAME(16) in the SCSI disk
driver.

 - We set the default maximum to 0xFFFF because there are several
   devices out there that only support two-byte block counts even with
   WRITE SAME(16). We only enable transfers bigger than 0xFFFF if the
   device explicitly reports MAXIMUM WRITE SAME LENGTH in the BLOCK
   LIMITS VPD.

 - max_write_same_blocks can be overriden per-device basis in sysfs.

 - The UNMAP discovery heuristics remain unchanged but the discard
   limits are tweaked to match the "real" WRITE SAME commands.

 - In the error handling logic we now distinguish between WRITE SAME
   with and without UNMAP set.

The discovery process heuristics are:

 - If the device reports a SCSI level of SPC-3 or greater we'll issue
   READ SUPPORTED OPERATION CODES to find out whether WRITE SAME(16) is
   supported. If that's the case we will use it.

 - If the device supports the block limits VPD and reports a MAXIMUM
   WRITE SAME LENGTH bigger than 0xFFFF we will use WRITE SAME(16).

 - Otherwise we will use WRITE SAME(10) unless the target LBA is beyond
   0xFFFFFFFF or the block count exceeds 0xFFFF.

 - no_write_same is set for ATA, FireWire and USB.

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Reviewed-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reviewed-by: Jeff Garzik &lt;jgarzik@redhat.com&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement support for WRITE SAME(10) and WRITE SAME(16) in the SCSI disk
driver.

 - We set the default maximum to 0xFFFF because there are several
   devices out there that only support two-byte block counts even with
   WRITE SAME(16). We only enable transfers bigger than 0xFFFF if the
   device explicitly reports MAXIMUM WRITE SAME LENGTH in the BLOCK
   LIMITS VPD.

 - max_write_same_blocks can be overriden per-device basis in sysfs.

 - The UNMAP discovery heuristics remain unchanged but the discard
   limits are tweaked to match the "real" WRITE SAME commands.

 - In the error handling logic we now distinguish between WRITE SAME
   with and without UNMAP set.

The discovery process heuristics are:

 - If the device reports a SCSI level of SPC-3 or greater we'll issue
   READ SUPPORTED OPERATION CODES to find out whether WRITE SAME(16) is
   supported. If that's the case we will use it.

 - If the device supports the block limits VPD and reports a MAXIMUM
   WRITE SAME LENGTH bigger than 0xFFFF we will use WRITE SAME(16).

 - Otherwise we will use WRITE SAME(10) unless the target LBA is beyond
   0xFFFFFFFF or the block count exceeds 0xFFFF.

 - no_write_same is set for ATA, FireWire and USB.

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Reviewed-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reviewed-by: Jeff Garzik &lt;jgarzik@redhat.com&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[SCSI] sd: Avoid remapping bad reference tags</title>
<updated>2012-09-24T08:10:59+00:00</updated>
<author>
<name>Martin K. Petersen</name>
<email>martin.petersen@oracle.com</email>
</author>
<published>2012-08-28T18:29:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8c579ab69d50a416887390ba4b89598a7b2fa0b6'/>
<id>8c579ab69d50a416887390ba4b89598a7b2fa0b6</id>
<content type='text'>
It does not make sense to translate ref tags with unexpected values.
Instead we simply ignore them and let the upper layers catch the
problem. Ref tags that contain the expected value are still remapped.

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It does not make sense to translate ref tags with unexpected values.
Instead we simply ignore them and let the upper layers catch the
problem. Ref tags that contain the expected value are still remapped.

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[SCSI] Handle disk devices which can not process medium access commands</title>
<updated>2012-02-19T16:14:52+00:00</updated>
<author>
<name>Martin K. Petersen</name>
<email>martin.petersen@oracle.com</email>
</author>
<published>2012-02-09T18:48:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=18a4d0a22ed6c54b67af7718c305cd010f09ddf8'/>
<id>18a4d0a22ed6c54b67af7718c305cd010f09ddf8</id>
<content type='text'>
We have experienced several devices which fail in a fashion we do not
currently handle gracefully in SCSI. After a failure these devices will
respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.)
but any command accessing the storage medium will time out.

The following patch adds an callback that can be used by upper level
drivers to inspect the results of an error handling command. This in
turn has been used to implement additional checking in the SCSI disk
driver.

If a medium access command fails twice but TEST UNIT READY succeeds both
times in the subsequent error handling we will offline the device. The
maximum number of failed commands required to take a device offline can
be tweaked in sysfs.

Also add a new error flag to scsi_debug which allows this scenario to be
easily reproduced.

[jejb: fix up integer parsing to use kstrtouint]
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We have experienced several devices which fail in a fashion we do not
currently handle gracefully in SCSI. After a failure these devices will
respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.)
but any command accessing the storage medium will time out.

The following patch adds an callback that can be used by upper level
drivers to inspect the results of an error handling command. This in
turn has been used to implement additional checking in the SCSI disk
driver.

If a medium access command fails twice but TEST UNIT READY succeeds both
times in the subsequent error handling we will offline the device. The
maximum number of failed commands required to take a device offline can
be tweaked in sysfs.

Also add a new error flag to scsi_debug which allows this scenario to be
easily reproduced.

[jejb: fix up integer parsing to use kstrtouint]
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[SCSI] sd: remove arbitrary SD_MAX_DISKS namespace limit</title>
<updated>2011-10-30T08:58:11+00:00</updated>
<author>
<name>Dave Kleikamp</name>
<email>dave.kleikamp@oracle.com</email>
</author>
<published>2011-10-19T16:49:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=21208ae5a21fd5f337e987cde11374eaf2fe70b4'/>
<id>21208ae5a21fd5f337e987cde11374eaf2fe70b4</id>
<content type='text'>
There is no reason to limit the SCSI disk namespace to sdXXX.

Add new error messages to sd_probe() in the unlikely event that either
ida_get_new() or sd_format_disk_name() fail.

Signed-off-by: Dave Kleikamp &lt;dave.kleikamp@oracle.com&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no reason to limit the SCSI disk namespace to sdXXX.

Add new error messages to sd_probe() in the unlikely event that either
ida_get_new() or sd_format_disk_name() fail.

Signed-off-by: Dave Kleikamp &lt;dave.kleikamp@oracle.com&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[SCSI] sd: Logical Block Provisioning update</title>
<updated>2011-03-14T23:37:34+00:00</updated>
<author>
<name>Martin K. Petersen</name>
<email>martin.petersen@oracle.com</email>
</author>
<published>2011-03-08T07:07:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c98a0eb0e90d1caa8a92913cd45462102cbd5eaf'/>
<id>c98a0eb0e90d1caa8a92913cd45462102cbd5eaf</id>
<content type='text'>
SBC3r26 contains many changes to the Logical Block Provisioning
interfaces (formerly known as Thin Provisioning ditto). This patch
implements support for both the old and new schemes using the same
heuristic as before (whether the LBP VPD page is present).

The new code also allows the provisioning mode (i.e. choice of command)
to be overridden on a per-device basis via sysfs. Two additional modes
are supported in this version:

 - WRITE SAME(10) with the UNMAP bit set

 - WRITE SAME(10) without the UNMAP bit set. This allows us to support
   devices that predate the TP/LBP enhancements in SBC3 and which work
   by way zero-detection

Switching between modes has been consolidated in a helper function that
also updates the block layer topology according to the limitations of
the chosen command.

I experimented with trying WRITE SAME(16) if UNMAP fails, WRITE SAME(10)
if WRITE SAME(16) fails, etc. but found several devices that got
cranky. So for now we'll disable discard if one of the commands
fail. The user still has the option of selecting a different mode in
sysfs.

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: James Bottomley &lt;James.Bottomley@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SBC3r26 contains many changes to the Logical Block Provisioning
interfaces (formerly known as Thin Provisioning ditto). This patch
implements support for both the old and new schemes using the same
heuristic as before (whether the LBP VPD page is present).

The new code also allows the provisioning mode (i.e. choice of command)
to be overridden on a per-device basis via sysfs. Two additional modes
are supported in this version:

 - WRITE SAME(10) with the UNMAP bit set

 - WRITE SAME(10) without the UNMAP bit set. This allows us to support
   devices that predate the TP/LBP enhancements in SBC3 and which work
   by way zero-detection

Switching between modes has been consolidated in a helper function that
also updates the block layer topology according to the limitations of
the chosen command.

I experimented with trying WRITE SAME(16) if UNMAP fails, WRITE SAME(10)
if WRITE SAME(16) fails, etc. but found several devices that got
cranky. So for now we'll disable discard if one of the commands
fail. The user still has the option of selecting a different mode in
sysfs.

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: James Bottomley &lt;James.Bottomley@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[SCSI] sd: implement sd_check_events()</title>
<updated>2011-01-14T15:17:34+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2010-12-18T17:42:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2bae0093cab4ee0a7a8728fdfc35b74569350863'/>
<id>2bae0093cab4ee0a7a8728fdfc35b74569350863</id>
<content type='text'>
Replace sd_media_change() with sd_check_events().

* Move media removed logic into set_media_not_present() and
  media_not_present() and set sdev-&gt;changed iff an existing media is
  removed or the device indicates UNIT_ATTENTION.

* Make sd_check_events() sets sdev-&gt;changed if previously missing
  media becomes present.

* Event is reported only if sdev-&gt;changed is set.

This makes media presence event reported if scsi_disk-&gt;media_present
actually changed or the device indicated UNIT_ATTENTION.  For backward
compatibility, SDEV_EVT_MEDIA_CHANGE is generated each time
sd_check_events() detects media change event.

[jejb: fix boot failure]
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
Signed-off-by: James Bottomley &lt;James.Bottomley@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace sd_media_change() with sd_check_events().

* Move media removed logic into set_media_not_present() and
  media_not_present() and set sdev-&gt;changed iff an existing media is
  removed or the device indicates UNIT_ATTENTION.

* Make sd_check_events() sets sdev-&gt;changed if previously missing
  media becomes present.

* Event is reported only if sdev-&gt;changed is set.

This makes media presence event reported if scsi_disk-&gt;media_present
actually changed or the device indicated UNIT_ATTENTION.  For backward
compatibility, SDEV_EVT_MEDIA_CHANGE is generated each time
sd_check_events() detects media change event.

[jejb: fix boot failure]
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
Signed-off-by: James Bottomley &lt;James.Bottomley@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[SCSI] sd: Fix overflow with big physical blocks</title>
<updated>2010-10-11T22:33:20+00:00</updated>
<author>
<name>Martin K. Petersen</name>
<email>martin.petersen@oracle.com</email>
</author>
<published>2010-09-28T18:48:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=526f7c7950bbf1271e59177d70d74438c2ef96de'/>
<id>526f7c7950bbf1271e59177d70d74438c2ef96de</id>
<content type='text'>
The hw_sector_size variable could overflow if a device reported huge
physical blocks.  Switch to the more accurate physical_block_size
terminology and make sure we use an unsigned int to match the range
permitted by READ CAPACITY(16).

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: James Bottomley &lt;James.Bottomley@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The hw_sector_size variable could overflow if a device reported huge
physical blocks.  Switch to the more accurate physical_block_size
terminology and make sure we use an unsigned int to match the range
permitted by READ CAPACITY(16).

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: James Bottomley &lt;James.Bottomley@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[SCSI] sd: Update thin provisioning support</title>
<updated>2010-09-17T17:07:55+00:00</updated>
<author>
<name>Martin K. Petersen</name>
<email>martin.petersen@oracle.com</email>
</author>
<published>2010-09-10T05:22:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=045d3fe766b01921e24e2d4178e011b3b09ad4d6'/>
<id>045d3fe766b01921e24e2d4178e011b3b09ad4d6</id>
<content type='text'>
Add support for the Thin Provisioning VPD page and use the TPU and TPWS
bits to switch between UNMAP and WRITE SAME(16) for discards.  If no TP
VPD page is present we fall back to old scheme where the max descriptor
count combined with the max lba count are used trigger UNMAP.

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: James Bottomley &lt;James.Bottomley@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add support for the Thin Provisioning VPD page and use the TPU and TPWS
bits to switch between UNMAP and WRITE SAME(16) for discards.  If no TP
VPD page is present we fall back to old scheme where the max descriptor
count combined with the max lba count are used trigger UNMAP.

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: James Bottomley &lt;James.Bottomley@suse.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
