linux-toradex.git/drivers/scsi/scsi_sysfs.c, branch v3.11

[SCSI] Allow error handling timeout to be specified

2013-06-04T18:16:24+00:00

Introduce eh_timeout which can be used for error handling purposes. This
was previously hardcoded to 10 seconds in the SCSI error handling
code. However, for some fast-fail scenarios it is necessary to be able
to tune this as it can take several iterations (bus device, target, bus,
controller) before we give up.
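
A minimal userspace sketch of the idea, not the kernel code: a per-device timeout that defaults to the old hardcoded 10 seconds and can be overwritten with a value given in seconds. The `model_` names, the tick rate `HZ` and the parsing rules are assumptions made for illustration.

```c
#include <errno.h>
#include <stdlib.h>

#define HZ 100u                      /* assumed tick rate for this model */
#define DEFAULT_EH_TIMEOUT_SECS 10u  /* the previously hardcoded value */

/* Hypothetical stand-in for the per-device state. */
struct model_sdev {
    unsigned int eh_timeout;         /* kept in ticks */
};

static void model_sdev_init(struct model_sdev *sdev)
{
    sdev->eh_timeout = DEFAULT_EH_TIMEOUT_SECS * HZ;
}

/* Parse a decimal number of seconds, optionally followed by a newline. */
static int model_store_eh_timeout(struct model_sdev *sdev, const char *buf)
{
    char *end;
    unsigned long secs;

    errno = 0;
    secs = strtoul(buf, &end, 10);
    if (errno || end == buf)
        return -EINVAL;
    if (*end == '\n')
        end++;
    /* Reject trailing junk and values that would overflow when scaled. */
    if (*end != '\0' || secs > 0xffffffffu / HZ)
        return -EINVAL;

    sdev->eh_timeout = (unsigned int)secs * HZ;
    return 0;
}

/* Report the timeout back in seconds. */
static unsigned int model_show_eh_timeout(const struct model_sdev *sdev)
{
    return sdev->eh_timeout / HZ;
}
```

After model_sdev_init() the reported value is 10; storing "3\n" changes it to 3, and a rejected write leaves the previous value in place.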

Signed-off-by: Martin K. Petersen 
Signed-off-by: James Bottomley

[SCSI] prevent stack buffer overflow in host_reset

2012-11-30T09:08:16+00:00

store_host_reset() tried to reinvent the wheel for comparing sysfs strings.
Unfortunately it did so poorly and never bothered to check the input from
userspace before overwriting the stack with it, so something as simple as:

echo "WoopsieWoopsie" > /sys/devices/pseudo_0/adapter0/host0/scsi_host/host0/host_reset

would result in:

[  316.310101] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81f5bac7
[  316.310101]
[  316.320051] Pid: 6655, comm: sh Tainted: G        W    3.7.0-rc5-next-20121114-sasha-00016-g5c9d68d-dirty #129
[  316.320051] Call Trace:
[  316.340058] pps pps0: PPS event at 1352918752.620355751
[  316.340062] pps pps0: capture assert seq #303
[  316.320051]  [] panic+0xcd/0x1f4
[  316.320051]  [] ? store_host_reset+0xd7/0x100
[  316.320051]  [] __stack_chk_fail+0x16/0x20
[  316.320051]  [] store_host_reset+0xd7/0x100
[  316.320051]  [] dev_attr_store+0x13/0x30
[  316.320051]  [] sysfs_write_file+0x101/0x170
[  316.320051]  [] vfs_write+0xb8/0x180
[  316.320051]  [] sys_write+0x50/0xa0
[  316.320051]  [] tracesys+0xe1/0xe6

Fix this by uninventing whatever was going on there and just using sysfs_streq().
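
A rough userspace sketch of the approach, assuming a comparison helper that behaves like sysfs_streq (equal strings, ignoring a single trailing newline). The input is compared in place and never copied into a fixed-size stack buffer, so its length no longer matters. The `model_` names and the enum values are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical reset types for this sketch. */
enum model_reset_type {
    MODEL_RESET_NONE = 0,
    MODEL_ADAPTER_RESET = 1,
    MODEL_FIRMWARE_RESET = 2
};

/* Userspace stand-in: true if s1 and s2 match, ignoring one trailing '\n'. */
static bool model_sysfs_streq(const char *s1, const char *s2)
{
    while (*s1 && *s1 == *s2) {
        s1++;
        s2++;
    }
    if (*s1 == *s2)
        return true;
    if (!*s1 && *s2 == '\n' && !s2[1])
        return true;
    if (*s1 == '\n' && !s1[1] && !*s2)
        return true;
    return false;
}

/* Compare the user buffer in place; nothing is copied onto the stack. */
static enum model_reset_type model_parse_host_reset(const char *buf)
{
    if (model_sysfs_streq(buf, "adapter"))
        return MODEL_ADAPTER_RESET;
    if (model_sysfs_streq(buf, "firmware"))
        return MODEL_FIRMWARE_RESET;
    return MODEL_RESET_NONE;
}
```

With this shape, "adapter\n" and "firmware" are accepted, while an oversized or unknown string such as "WoopsieWoopsie\n" is simply rejected.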

Bug introduced by 29443691 ("[SCSI] scsi: Added support for adapter and
firmware reset").

[jejb: added necessary const to prevent compile warnings]
Signed-off-by: Sasha Levin 
Cc:  #3.2+
Signed-off-by: James Bottomley

[SCSI] scsi_remove_target: fix softlockup regression on hot remove

2012-09-24T08:17:49+00:00

John reports:
 BUG: soft lockup - CPU#2 stuck for 23s! [kworker/u:8:2202]
 [..]
 Call Trace:
  [] scsi_remove_target+0xda/0x1f0
  [] sas_rphy_remove+0x55/0x60
  [] sas_rphy_delete+0x11/0x20
  [] sas_port_delete+0x25/0x160
  [] mptsas_del_end_device+0x183/0x270

...introduced by commit 3b661a9 "[SCSI] fix hot unplug vs async scan race".

Don't restart lookup of more stargets in the multi-target case, just
arrange to traverse the list once, on the assumption that new targets
are always added at the end.  There is no guarantee that the target will
change state in scsi_target_reap() so we can end up spinning if we
restart.
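
A simplified userspace model of the single-pass traversal, not the kernel implementation: locking and reference counting are left out, and the `model_` names and the `can_die` flag are hypothetical. The point is that each matching target is visited exactly once, so a target whose state does not change in the reap step cannot make the loop spin.

```c
#include <stddef.h>

enum model_state { MODEL_STARGET_RUNNING, MODEL_STARGET_DEL };

/* Hypothetical stand-in for a target on the host's target list. */
struct model_starget {
    int parent;                  /* identifies the parent device */
    enum model_state state;
    int reap_calls;              /* how often the reap step ran */
    int can_die;                 /* 0 models "state does not change" */
    struct model_starget *next;
};

/* The reap step may or may not move the target to the deleted state. */
static void model_reap(struct model_starget *t)
{
    t->reap_calls++;
    if (t->can_die)
        t->state = MODEL_STARGET_DEL;
}

/*
 * Walk the list once from head to tail. New targets are assumed to be
 * appended at the end, so nothing is missed by not restarting.
 * Returns the number of targets handled.
 */
static int model_remove_targets(struct model_starget *head, int parent)
{
    struct model_starget *t;
    int handled = 0;

    for (t = head; t != NULL; t = t->next) {
        if (t->parent != parent || t->state == MODEL_STARGET_DEL)
            continue;
        model_reap(t);
        handled++;
    }
    return handled;
}
```

A loop that restarted from the head whenever it found a matching, not yet deleted target would never terminate for an entry with `can_die == 0`; the single pass handles it once and moves on.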

Acked-by: Jack Wang 
Reported-by: John Drescher 
Tested-by: John Drescher 
Signed-off-by: Dan Williams 
Signed-off-by: James Bottomley

[SCSI] fix hot unplug vs async scan race

2012-07-20T07:58:45+00:00

The following crash results from cases where the end_device has been
removed before scsi_sysfs_add_sdev has had a chance to run.

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
 IP: [] sysfs_create_dir+0x32/0xb6
 ...
 Call Trace:
  [] kobject_add_internal+0x120/0x1e3
  [] ? trace_hardirqs_on+0xd/0xf
  [] kobject_add_varg+0x41/0x50
  [] kobject_add+0x64/0x66
  [] device_add+0x12d/0x63a
  [] ? _raw_spin_unlock_irqrestore+0x47/0x56
  [] ? module_refcount+0x89/0xa0
  [] scsi_sysfs_add_sdev+0x4e/0x28a
  [] do_scan_async+0x9c/0x145

...teach scsi_sysfs_add_devices() to check for deleted devices before
trying to add them, and teach scsi_remove_target() how to remove targets
that have not been added via device_add().

Cc: 
Reported-by: Dariusz Majchrzak 
Signed-off-by: Dan Williams 
Signed-off-by: James Bottomley

[SCSI] Stop accepting SCSI requests before removing a device

2012-07-20T07:58:41+00:00

Prevent the code for requeueing SCSI requests from triggering a
crash by making sure that it is no longer scheduled after a
device has been removed.

Also, source code inspection of __scsi_remove_device() revealed
a race condition in this function: no new SCSI requests must be
accepted for a SCSI device after device removal has started.

Signed-off-by: Bart Van Assche 
Reviewed-by: Mike Christie 
Acked-by: Tejun Heo 
Signed-off-by: James Bottomley

[SCSI] Fix device removal NULL pointer dereference

2012-07-20T07:58:40+00:00

Use blk_queue_dead() to test whether the queue is dead instead
of !sdev. Since scsi_prep_fn() may be invoked concurrently with
__scsi_remove_device(), keep the queuedata (sdev) pointer in
__scsi_remove_device(). This patch fixes a kernel oops that
can be triggered by USB device removal. See also
http://www.spinics.net/lists/linux-scsi/msg56254.html.

Other changes included in this patch:
- Swap the blk_cleanup_queue() and kfree() calls in
  scsi_host_dev_release() to make that code easier to grasp.
- Remove the queue dead check from scsi_run_queue() since the
  queue state can change anyway at any point in that function
  where the queue lock is not held.
- Remove the queue dead check from the start of scsi_request_fn()
  since it is redundant with the scsi_device_online() check.

Reported-by: Jun'ichi Nomura 
Signed-off-by: Bart Van Assche 
Reviewed-by: Mike Christie 
Reviewed-by: Tejun Heo 
Cc: 
Signed-off-by: James Bottomley

[SCSI] add new SDEV_TRANSPORT_OFFLINE state

2012-07-20T07:58:21+00:00

This patch adds a new state, SDEV_TRANSPORT_OFFLINE. Transport
classes will use it to offline devices in cases such as when
fast_io_fail/recovery_tmo fires. In those cases we want all IO
to fail, but we have not yet escalated to dev_loss_tmo behavior,
where we remove the devices.

Currently, to handle this case, transport classes set the
scsi_device's state to running, set their internal session/port
struct's state to something that indicates failure, and then
fail IO from a transport check in queuecommand.

The reason for the new value is so that users can distinguish
between a device failure caused by a transport problem and the
wide range of errors for which devices are offlined when a SCSI
command times out. It also removes the confusion over why the
transport class is failing IO while having set the device state
from blocked to running.
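
A small userspace sketch of how such a state can sit next to the existing ones. The enum ordering, the `model_` names and the state strings (including "transport-offline") are assumptions for illustration and are not copied from the patch.

```c
#include <stddef.h>

/* Hypothetical mirror of the device states, with the new one added. */
enum model_sdev_state {
    MODEL_SDEV_CREATED = 1,
    MODEL_SDEV_RUNNING,
    MODEL_SDEV_CANCEL,
    MODEL_SDEV_DEL,
    MODEL_SDEV_QUIESCE,
    MODEL_SDEV_OFFLINE,
    MODEL_SDEV_TRANSPORT_OFFLINE,   /* offlined by the transport class */
    MODEL_SDEV_BLOCK,
    MODEL_SDEV_CREATED_BLOCK
};

/* Assumed user-visible names, one per state. */
static const struct {
    enum model_sdev_state value;
    const char *name;
} model_sdev_states[] = {
    { MODEL_SDEV_CREATED,           "created" },
    { MODEL_SDEV_RUNNING,           "running" },
    { MODEL_SDEV_CANCEL,            "cancel" },
    { MODEL_SDEV_DEL,               "deleted" },
    { MODEL_SDEV_QUIESCE,           "quiesce" },
    { MODEL_SDEV_OFFLINE,           "offline" },
    { MODEL_SDEV_TRANSPORT_OFFLINE, "transport-offline" },
    { MODEL_SDEV_BLOCK,             "blocked" },
    { MODEL_SDEV_CREATED_BLOCK,     "created-blocked" },
};

/* Look up the display name for a state; NULL if unknown. */
static const char *model_sdev_state_name(enum model_sdev_state state)
{
    size_t i;

    for (i = 0; i < sizeof(model_sdev_states) / sizeof(model_sdev_states[0]); i++) {
        if (model_sdev_states[i].value == state)
            return model_sdev_states[i].name;
    }
    return NULL;
}

/* IO fails for both offline flavours and for deleted devices. */
static int model_sdev_online(enum model_sdev_state state)
{
    return state != MODEL_SDEV_OFFLINE &&
           state != MODEL_SDEV_TRANSPORT_OFFLINE &&
           state != MODEL_SDEV_DEL;
}
```

Both offline states fail IO in this model, but they report different names, which is what lets a user tell a transport problem apart from a command timeout.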

Signed-off-by: Mike Christie 
Signed-off-by: James Bottomley

[SCSI] scsi: Added support for adapter and firmware reset

2011-08-27T14:36:46+00:00

Added a new sysfs attribute 'host_reset' in scsi_sysfs.c to
perform an adapter or firmware reset, as suggested by
Mike Christie here:
http://marc.info/?l=linux-scsi&m=127359347111167&w=2

A user or application can write "adapter" or "firmware" to
this attribute, which calls the newly added function hook
in scsi_host_template, which in turn calls the LDD's adapter
or firmware reset implementation.
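
A userspace sketch of the hook dispatch described above, assuming an optional function pointer in the host template that receives the reset type. The `model_` names, the numeric reset types and the -EINVAL return for a missing hook are assumptions of this sketch, not taken from the patch.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_ADAPTER_RESET  1   /* assumed numeric values */
#define MODEL_FIRMWARE_RESET 2

struct model_host;

/* Hypothetical stand-in for the host template with the new hook. */
struct model_host_template {
    int (*host_reset)(struct model_host *host, int reset_type);
};

struct model_host {
    const struct model_host_template *hostt;
    int last_reset;              /* recorded by the example LDD below */
};

/* Dispatch a parsed reset type to the LDD, if it provides the hook. */
static int model_do_host_reset(struct model_host *host, int reset_type)
{
    if (reset_type != MODEL_ADAPTER_RESET &&
        reset_type != MODEL_FIRMWARE_RESET)
        return -EINVAL;
    if (host->hostt == NULL || host->hostt->host_reset == NULL)
        return -EINVAL;          /* the driver does not implement resets */
    return host->hostt->host_reset(host, reset_type);
}

/* Example LDD implementation: just remember which reset was requested. */
static int model_ldd_host_reset(struct model_host *host, int reset_type)
{
    host->last_reset = reset_type;
    return 0;
}
```

Keeping the hook optional means drivers that do not implement a reset need no changes; a write to the attribute simply fails for them.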

Signed-off-by: Vikas Chaudhary 
Reviewed-by: Mike Christie 
Signed-off-by: James Bottomley

[SCSI] Fix oops caused by queue refcounting failure

2011-06-02T09:34:43+00:00

In certain circumstances, we can get an oops from a torn down device.
Most notably this comes from CD-ROMs trying to call scsi_ioctl.  The
root cause of the problem is that after scsi_remove_device() has
been called, the queue is fully torn down.  This is actually wrong
since the queue can be used until the sdev release function is called.
Therefore, we add an extra reference to the queue which is released in
sdev->release, so the queue always exists.
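
The lifetime rule can be modelled with a plain reference count. This is a userspace sketch only; the `model_` names are hypothetical and a `freed` flag stands in for actually freeing the queue.

```c
#include <stddef.h>

/* Hypothetical stand-in for the request queue. */
struct model_rq {
    int refs;
    int freed;                   /* set once the last reference is dropped */
};

/* Hypothetical stand-in for the device that uses the queue. */
struct model_dev {
    struct model_rq *q;
};

static void model_rq_get(struct model_rq *q)
{
    q->refs++;
}

static void model_rq_put(struct model_rq *q)
{
    if (--q->refs == 0)
        q->freed = 1;
}

/* Set up the queue and take the extra reference on behalf of the device. */
static void model_dev_alloc(struct model_dev *dev, struct model_rq *q)
{
    q->refs = 1;                 /* reference owned by queue setup */
    q->freed = 0;
    dev->q = q;
    model_rq_get(q);             /* extra reference, dropped in release */
}

/* Removal drops only the setup reference; the queue must survive. */
static void model_remove_device(struct model_dev *dev)
{
    model_rq_put(dev->q);
}

/* The release function drops the extra reference last. */
static void model_dev_release(struct model_dev *dev)
{
    model_rq_put(dev->q);
    dev->q = NULL;
}
```

In this model the queue is still alive after model_remove_device() and only goes away in model_dev_release(), which matches the ordering the fix relies on.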

Reported-by: Parag Warudkar 
Cc: stable@kernel.org
Signed-off-by: James Bottomley

[SCSI] put stricter guards on queue dead checks

2011-04-24T16:02:17+00:00

SCSI uses request_queue->queuedata == NULL as a signal that the queue
is dying.  We set this state in the sdev release function.  However,
this leaves a small window where we have released the last reference
but haven't quite got to this stage yet, so something will try to take
a reference in scsi_request_fn and oops.  It's very rare, but we had a
report here, so we're pushing this as a bug fix.

The actual fix is to set request_queue->queuedata to NULL in
scsi_remove_device() before we drop the reference.  This causes
correct automatic rejects from scsi_request_fn as people who hold
additional references try to submit work and prevents anything from
getting a new reference to the sdev that way.

Cc: stable@kernel.org
Signed-off-by: James Bottomley