summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2012-08-02libsas: fix sas_discover_devices return code handlingDan Williams
commit b17caa174a7e1fd2e17b26e210d4ee91c4c28b37 upstream. commit 198439e4 [SCSI] libsas: do not set res = 0 in sas_ex_discover_dev() commit 19252de6 [SCSI] libsas: fix wide port hotplug issues The above commits seem to have confused the return value of sas_ex_discover_dev which is non-zero on failure and sas_ex_join_wide_port which just indicates short circuiting discovery on already established ports. The result is random discovery failures depending on configuration. Calls to sas_ex_join_wide_port are the source of the trouble as its return value is errantly assigned to 'res'. Convert it to bool and stop returning its result up the stack. Tested-by: Dan Melnic <dan.melnic@amd.com> Reported-by: Dan Melnic <dan.melnic@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jack Wang <jack_wang@usish.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-08-02libsas: continue revalidationDan Williams
commit 26f2f199ff150d8876b2641c41e60d1c92d2fb81 upstream. Continue running revalidation until no more broadcast devices are discovered. Fixes cases where re-discovery completes too early in a domain with multiple expanders with pending re-discovery events. Servicing BCNs can get backed up behind error recovery. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-08-02fix eh wakeup (scsi_schedule_eh vs scsi_restart_operations)Dan Williams
commit 57fc2e335fd3c2f898ee73570dc81426c28dc7b4 upstream. Rapid ata hotplug on a libsas controller results in cases where libsas is waiting indefinitely on eh to perform an ata probe. A race exists between scsi_schedule_eh() and scsi_restart_operations() in the case when scsi_restart_operations() issues i/o to other devices in the sas domain. When this happens the host state transitions from SHOST_RECOVERY (set by scsi_schedule_eh) back to SHOST_RUNNING and ->host_busy is non-zero so we put the eh thread to sleep even though ->host_eh_scheduled is active. Before putting the error handler to sleep we need to check if the host_state needs to return to SHOST_RECOVERY for another trip through eh. Since i/o that is released by scsi_restart_operations has been blocked for at least one eh cycle, this implementation allows those i/o's to run before another eh cycle starts to discourage hung task timeouts. Reported-by: Tom Jackson <thomas.p.jackson@intel.com> Tested-by: Tom Jackson <thomas.p.jackson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-08-02fix hot unplug vs async scan raceDan Williams
commit 3b661a92e869ebe2358de8f4b3230ad84f7fce51 upstream. The following crash results from cases where the end_device has been removed before scsi_sysfs_add_sdev has had a chance to run. BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 IP: [<ffffffff8115e100>] sysfs_create_dir+0x32/0xb6 ... Call Trace: [<ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3 [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8125e641>] kobject_add_varg+0x41/0x50 [<ffffffff8125e70b>] kobject_add+0x64/0x66 [<ffffffff8131122b>] device_add+0x12d/0x63a [<ffffffff814b65ea>] ? _raw_spin_unlock_irqrestore+0x47/0x56 [<ffffffff8107de15>] ? module_refcount+0x89/0xa0 [<ffffffff8132f348>] scsi_sysfs_add_sdev+0x4e/0x28a [<ffffffff8132dcbb>] do_scan_async+0x9c/0x145 ...teach scsi_sysfs_add_devices() to check for deleted devices() before trying to add them, and teach scsi_remove_target() how to remove targets that have not been added via device_add(). Reported-by: Dariusz Majchrzak <dariusz.majchrzak@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-08-02Avoid dangling pointer in scsi_requeue_command()Bart Van Assche
commit 940f5d47e2f2e1fa00443921a0abf4822335b54d upstream. When we call scsi_unprep_request() the command associated with the request gets destroyed and therefore drops its reference on the device. If this was the only reference, the device may get released and we end up with a NULL pointer deref when we call blk_requeue_request. Reported-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Tejun Heo <tj@kernel.org> [jejb: enhance commend and add commit log for stable] Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-08-02Fix device removal NULL pointer dereferenceBart Van Assche
commit 67bd94130015c507011af37858989b199c52e1de upstream. Use blk_queue_dead() to test whether the queue is dead instead of !sdev. Since scsi_prep_fn() may be invoked concurrently with __scsi_remove_device(), keep the queuedata (sdev) pointer in __scsi_remove_device(). This patch fixes a kernel oops that can be triggered by USB device removal. See also http://www.spinics.net/lists/linux-scsi/msg56254.html. Other changes included in this patch: - Swap the blk_cleanup_queue() and kfree() calls in scsi_host_dev_release() to make that code easier to grasp. - Remove the queue dead check from scsi_run_queue() since the queue state can change anyway at any point in that function where the queue lock is not held. - Remove the queue dead check from the start of scsi_request_fn() since it is redundant with the scsi_device_online() check. Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Tejun Heo <tj@kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-08-02Fix NULL dereferences in scsi_cmd_to_driverMark Rustad
commit 222a806af830fda34ad1f6bc991cd226916de060 upstream. Avoid crashing if the private_data pointer happens to be NULL. This has been seen sometimes when a host reset happens, notably when there are many LUNs: host3: Assigned Port ID 0c1601 scsi host3: libfc: Host reset succeeded on port (0c1601) BUG: unable to handle kernel NULL pointer dereference at 0000000000000350 IP: [<ffffffff81352bb8>] scsi_send_eh_cmnd+0x58/0x3a0 <snip> Process scsi_eh_3 (pid: 4144, threadinfo ffff88030920c000, task ffff880326b160c0) Stack: 000000010372e6ba 0000000000000282 000027100920dca0 ffffffffa0038ee0 0000000000000000 0000000000030003 ffff88030920dc80 ffff88030920dc80 00000002000e0000 0000000a00004000 ffff8803242f7760 ffff88031326ed80 Call Trace: [<ffffffff8105b590>] ? lock_timer_base+0x70/0x70 [<ffffffff81352fbe>] scsi_eh_tur+0x3e/0xc0 [<ffffffff81353a36>] scsi_eh_test_devices+0x76/0x170 [<ffffffff81354125>] scsi_eh_host_reset+0x85/0x160 [<ffffffff81354291>] scsi_eh_ready_devs+0x91/0x110 [<ffffffff813543fd>] scsi_unjam_host+0xed/0x1f0 [<ffffffff813546a8>] scsi_error_handler+0x1a8/0x200 [<ffffffff81354500>] ? scsi_unjam_host+0x1f0/0x1f0 [<ffffffff8106ec3e>] kthread+0x9e/0xb0 [<ffffffff81509264>] kernel_thread_helper+0x4/0x10 [<ffffffff8106eba0>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff81509260>] ? gs_change+0x13/0x13 Code: 25 28 00 00 00 48 89 45 c8 31 c0 48 8b 87 80 00 00 00 48 8d b5 60 ff ff ff 89 d1 48 89 fb 41 89 d6 4c 89 fa 48 8b 80 b8 00 00 00 <48> 8b 80 50 03 00 00 48 8b 00 48 89 85 38 ff ff ff 48 8b 07 4c RIP [<ffffffff81352bb8>] scsi_send_eh_cmnd+0x58/0x3a0 RSP <ffff88030920dc50> CR2: 0000000000000350 Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> [bwh: Backported to 3.2: adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-07-25libsas: fix taskfile corruption in sas_ata_qc_fill_rtfDan Williams
commit 6ef1b512f4e6f936d89aa20be3d97a7ec7c290ac upstream. fill_result_tf() grabs the taskfile flags from the originating qc which sas_ata_qc_fill_rtf() promptly overwrites. The presence of an ata_taskfile in the sata_device makes it tempting to just copy the full contents in sas_ata_qc_fill_rtf(). However, libata really only wants the fis contents and expects the other portions of the taskfile to not be touched by ->qc_fill_rtf. To that end store a fis buffer in the sata_device and use ata_tf_from_fis() like every other ->qc_fill_rtf() implementation. Reported-by: Praveen Murali <pmurali@logicube.com> Tested-by: Praveen Murali <pmurali@logicube.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-06-19mpt2sas: Fix unsafe using smp_processor_id() in preemptiblenagalakshmi.nandigama@lsi.com
commit a2c658505bf5c75516ee0a79287223e86a2474af upstream. When CONFIG_DEBUG_PREEMPT is enabled, bug is observed in the smp_processor_id(). This is because smp_processor_id() is not called in preempt safe condition. To fix this issue, use raw_smp_processor_id instead of smp_processor_id. Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-06-10fix scsi_wait_scanJames Bottomley
commit 1ff2f40305772b159a91c19590ee159d3a504afc upstream. Commit c751085943362143f84346d274e0011419c84202 Author: Rafael J. Wysocki <rjw@sisk.pl> Date: Sun Apr 12 20:06:56 2009 +0200 PM/Hibernate: Wait for SCSI devices scan to complete during resume Broke the scsi_wait_scan module in 2.6.30. Apparently debian still uses it so fix it and backport to stable before removing it in 3.6. The breakage is caused because the function template in include/scsi/scsi_scan.h is defined to be a nop unless SCSI is built in. That means that in the modular case (which is every distro), the scsi_wait_scan module does a simple async_synchronize_full() instead of waiting for scans. Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-06-10Fix dm-multipath starvation when scsi host is busyJun'ichi Nomura
commit b7e94a1686c5daef4f649f7f4f839cc294f07710 upstream. block congestion control doesn't have any concept of fairness across multiple queues. This means that if SCSI reports the host as busy in the queue congestion control it can result in an unfair starvation situation in dm-mp if there are multiple multipath devices on the same host. For example: http://www.redhat.com/archives/dm-devel/2012-May/msg00123.html The fix for this is to report only the sdev busy state (and ignore the host busy state) in the block congestion control call back. The host is still congested, but the SCSI subsystem will sort out the congestion in a fair way because it knows the relation between the queues and the host. [jejb: fixed up trailing whitespace] Reported-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Tested-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-31hpsa: Add IRQF_SHARED back in for the non-MSI(X) interrupt handlerStephen M. Cameron
commit 45bcf018d1a4779d592764ef57517c92589d55d7 upstream. IRQF_SHARED is required for older controllers that don't support MSI(X) and which may end up sharing an interrupt. All the controllers hpsa normally supports have MSI(X) capability, but older controllers may be encountered via the hpsa_allow_any=1 module parameter. Also remove deprecated IRQF_DISABLED. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-31isci: fix oem parameter validation on single controller skusDan Williams
commit fc25f79af321c01a739150ba2c09435cf977a63d upstream. OEM parameters [1] are parsed from the platform option-rom / efi driver. By default the driver was validating the parameters for the dual-controller case, but in single-controller case only the first set of parameters may be valid. Limit the validation to the number of actual controllers detected otherwise the driver may fail to parse the valid parameters leading to driver-load or runtime failures. [1] the platform specific set of phy address, configuration,and analog tuning values [stable v3.0+] Reported-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-31mpt2sas: Fix for panic happening because of improper memory allocationnagalakshmi.nandigama@lsi.com
commit e42fafc25fa86c61824e8d4c5e7582316415d24f upstream. The ioc->pfacts member in the IOC structure is getting set to zero following a call to _base_get_ioc_facts due to the memset in that routine. So if the ioc->pfacts was read after a host reset, there would be a NULL pointer dereference. The routine _base_get_ioc_facts is called from context of host reset. The problem in _base_get_ioc_facts is the size of Mpi2IOCFactsReply is 64, whereas the sizeof "struct mpt2sas_facts" is 60, so there is a four byte overflow resulting from the memset. Also, there is memset in _base_get_port_facts using the incorrect structure, it should be "struct mpt2sas_port_facts" instead of Mpi2PortFactsReply. Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-31hpsa: Fix problem with MSA2xxx devicesStephen M. Cameron
commit 9bc3711cbb67ac620bf09b4a147cbab45b2c36c0 upstream. Upgraded firmware on Smart Array P7xx (and some others) made them show up as SCSI revision 5 devices and this caused the driver to fail to map MSA2xxx logical drives to the correct bus/target/lun. A symptom of this would be that the target ID of the logical drives as presented by the external storage array is ignored, and all such logical drives are assigned to target zero, differentiated only by LUN. Some multipath software reportedly does not deal well with this behavior, failing to recognize different paths to the same device as such. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Scott Teel <scott.teel@hp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-11libsas: fix false positive 'device attached' conditionsDan Williams
commit 7d1d865181185bdf1316d236b1b4bd02c9020729 upstream. Normalize phy->attached_sas_addr to return a zero-address in the case when device-type == NO_DEVICE or the linkrate is invalid to handle expanders that put non-zero sas addresses in the discovery response: sas: ex 5001b4da000f903f phy02:U:0 attached: 0100000000000000 (no device) sas: ex 5001b4da000f903f phy01:U:0 attached: 0100000000000000 (no device) sas: ex 5001b4da000f903f phy03:U:0 attached: 0100000000000000 (no device) sas: ex 5001b4da000f903f phy00:U:0 attached: 0100000000000000 (no device) Reported-by: Andrzej Jakowski <andrzej.jakowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-11libsas: fix sas_find_bcast_phy() in the presence of 'vacant' physThomas Jackson
commit 1699490db339e2c6b3037ea8e7dcd6b2755b688e upstream. If an expander reports 'PHY VACANT' for a phy index prior to the one that generated a BCN libsas fails rediscovery. Since a vacant phy is defined as a valid phy index that will never have an attached device just continue the search. Signed-off-by: Thomas Jackson <thomas.p.jackson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-03-12osd_uld: Bump MAX_OSD_DEVICES from 64 to 1,048,576Boaz Harrosh
commit 41f8ad76362e7aefe3a03949c43e23102dae6e0b upstream. It used to be that minors where 8 bit. But now they are actually 20 bit. So the fix is simplicity itself. I've tested with 300 devices and all user-mode utils work just fine. I have also mechanically added 10,000 to the ida (so devices are /dev/osd10000, /dev/osd10001 ...) and was able to mkfs an exofs filesystem and access osds from user-mode. All the open-osd user-mode code uses the same library to access devices through their symbolic names in /dev/osdX so I'd say it's pretty safe. (Well tested) This patch is very important because some of the systems that will be deploying the 3.2 pnfs-objects code are larger than 64 OSDs and will stop to work properly when reaching that number. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-29scsi_pm: Fix bug in the SCSI power management handlerAlan Stern
commit fea6d607e154cf96ab22254ccb48addfd43d4cb5 upstream. This patch (as1520) fixes a bug in the SCSI layer's power management implementation. LUN scanning can be carried out asynchronously in do_scan_async(), and sd uses an asynchronous thread for the time-consuming parts of disk probing in sd_probe_async(). Currently nothing coordinates these async threads with system sleep transitions; they can and do attempt to continue scanning/probing SCSI devices even after the host adapter has been suspended. As one might expect, the outcome is not ideal. This is what the "prepare" stage of system suspend was created for. After the prepare callback has been called for a host, target, or device, drivers are not allowed to register any children underneath them. Currently the SCSI prepare callback is not implemented; this patch rectifies that omission. For SCSI hosts, the prepare routine calls scsi_complete_async_scans() to wait until async scanning is finished. It might be slightly more efficient to wait only until the host in question has been scanned, but there's currently no way to do that. Besides, during a sleep transition we will ultimately have to wait until all the host scanning has finished anyway. For SCSI devices, the prepare routine calls async_synchronize_full() to wait until sd probing is finished. The routine does nothing for SCSI targets, because asynchronous target scanning is done only as part of host scanning. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-29scsi_scan: Fix 'Poison overwritten' warning caused by using freed 'shost'Huajun Li
commit 267a6ad4aefaafbde607804c60945bcf97f91c1b upstream. In do_scan_async(), calling scsi_autopm_put_host(shost) may reference freed shost, and cause Posison overwitten warning. Yes, this case can happen, for example, an USB is disconnected just when do_scan_async() thread starts to run, then scsi_host_put() called in scsi_finish_async_scan() will lead to shost be freed(because the refcount of shost->shost_gendev decreases to 1 after USB disconnects), at this point, if references shost again, system will show following warning msg. To make scsi_autopm_put_host(shost) always reference a valid shost, put it just before scsi_host_put() in function scsi_finish_async_scan(). [ 299.281565] ============================================================================= [ 299.281634] BUG kmalloc-4096 (Tainted: G I ): Poison overwritten [ 299.281682] ----------------------------------------------------------------------------- [ 299.281684] [ 299.281752] INFO: 0xffff880056c305d0-0xffff880056c305d0. First byte 0x6a instead of 0x6b [ 299.281816] INFO: Allocated in scsi_host_alloc+0x4a/0x490 age=1688 cpu=1 pid=2004 [ 299.281870] __slab_alloc+0x617/0x6c1 [ 299.281901] __kmalloc+0x28c/0x2e0 [ 299.281931] scsi_host_alloc+0x4a/0x490 [ 299.281966] usb_stor_probe1+0x5b/0xc40 [usb_storage] [ 299.282010] storage_probe+0xa4/0xe0 [usb_storage] [ 299.282062] usb_probe_interface+0x172/0x330 [usbcore] [ 299.282105] driver_probe_device+0x257/0x3b0 [ 299.282138] __driver_attach+0x103/0x110 [ 299.282171] bus_for_each_dev+0x8e/0xe0 [ 299.282201] driver_attach+0x26/0x30 [ 299.282230] bus_add_driver+0x1c4/0x430 [ 299.282260] driver_register+0xb6/0x230 [ 299.282298] usb_register_driver+0xe5/0x270 [usbcore] [ 299.282337] 0xffffffffa04ab03d [ 299.282364] do_one_initcall+0x47/0x230 [ 299.282396] sys_init_module+0xa0f/0x1fe0 [ 299.282429] INFO: Freed in scsi_host_dev_release+0x18a/0x1d0 age=85 cpu=0 pid=2008 [ 299.282482] __slab_free+0x3c/0x2a1 [ 299.282510] kfree+0x296/0x310 [ 299.282536] scsi_host_dev_release+0x18a/0x1d0 [ 299.282574] device_release+0x74/0x100 [ 299.282606] kobject_release+0xc7/0x2a0 [ 299.282637] kobject_put+0x54/0xa0 [ 299.282668] put_device+0x27/0x40 [ 299.282694] scsi_host_put+0x1d/0x30 [ 299.282723] do_scan_async+0x1fc/0x2b0 [ 299.282753] kthread+0xdf/0xf0 [ 299.282782] kernel_thread_helper+0x4/0x10 [ 299.282817] INFO: Slab 0xffffea00015b0c00 objects=7 used=7 fp=0x (null) flags=0x100000000004080 [ 299.282882] INFO: Object 0xffff880056c30000 @offset=0 fp=0x (null) [ 299.282884] ... Signed-off-by: Huajun Li <huajun.li.lee@gmail.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03mpt2sas: Removed redundant calling of _scsih_probe_devices() from _scsih_probenagalakshmi.nandigama@lsi.com
commit 2cb6fc8c014b9b00c4487a79b8f6ed0da4121f45 upstream. Removed redundant calling of _scsih_probe_devices() from _scsih_probe as it is getting called from _scsih_scan_finished. Also moved the function scsi_scan_host(shost) to get called after the volumes on warp drive are reported to the OS. Otherwise by the time the (ioc->hide_drives) flags is set, the volumes on warp drive are reported to the OS already. Also modified the initialization of reply queues only in case of driver load time in the function _base_make_ioc_operational(). Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-01-25sym53c8xx: Fix NULL pointer dereference in slave_destroyStratos Psomadakis
commit cced5041ed5a2d1352186510944b0ddfbdbe4c0b upstream. sym53c8xx_slave_destroy unconditionally assumes that sym53c8xx_slave_alloc has succesesfully allocated a sym_lcb. This can lead to a NULL pointer dereference (exposed by commit 4e6c82b). Signed-off-by: Stratos Psomadakis <psomas@gentoo.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-25block: fail SCSI passthrough ioctls on partition devicesPaolo Bonzini
commit 0bfc96cb77224736dfa35c3c555d37b3646ef35e upstream. [ Changes with respect to 3.3: return -ENOTTY from scsi_verify_blk_ioctl and -ENOIOCTLCMD from sd_compat_ioctl. ] Linux allows executing the SG_IO ioctl on a partition or LVM volume, and will pass the command to the underlying block device. This is well-known, but it is also a large security problem when (via Unix permissions, ACLs, SELinux or a combination thereof) a program or user needs to be granted access only to part of the disk. This patch lets partitions forward a small set of harmless ioctls; others are logged with printk so that we can see which ioctls are actually sent. In my tests only CDROM_GET_CAPABILITY actually occurred. Of course it was being sent to a (partition on a) hard disk, so it would have failed with ENOTTY and the patch isn't changing anything in practice. Still, I'm treating it specially to avoid spamming the logs. In principle, this restriction should include programs running with CAP_SYS_RAWIO. If for example I let a program access /dev/sda2 and /dev/sdb, it still should not be able to read/write outside the boundaries of /dev/sda2 independent of the capabilities. However, for now programs with CAP_SYS_RAWIO will still be allowed to send the ioctls. Their actions will still be logged. This patch does not affect the non-libata IDE driver. That driver however already tests for bd != bd->bd_contains before issuing some ioctl; it could be restricted further to forbid these ioctls even for programs running with CAP_SYS_ADMIN/CAP_SYS_RAWIO. Cc: linux-scsi@vger.kernel.org Cc: Jens Axboe <axboe@kernel.dk> Cc: James Bottomley <JBottomley@parallels.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [ Make it also print the command name when warning - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-25block: add and use scsi_blk_cmd_ioctlPaolo Bonzini
commit 577ebb374c78314ac4617242f509e2f5e7156649 upstream. Introduce a wrapper around scsi_cmd_ioctl that takes a block device. The function will then be enhanced to detect partition block devices and, in that case, subject the ioctls to whitelisting. Cc: linux-scsi@vger.kernel.org Cc: Jens Axboe <axboe@kernel.dk> Cc: James Bottomley <JBottomley@parallels.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-25SCSI: mpt2sas : Fix for memory allocation error for large host creditsnagalakshmi.nandigama@lsi.com
commit aff132d95ffe14eca96cab90597cdd010b457af7 upstream. The amount of memory required for tracking chain buffers is rather large, and when the host credit count is big, memory allocation failure occurs inside __get_free_pages. The fix is to limit the number of chains to 100,000. In addition, the number of host credits is limited to 30,000 IOs. However this limitation can be overridden this using the command line option max_queue_depth. The algorithm for calculating the reply_post_queue_depth is changed so that it is equal to (reply_free_queue_depth + 16), previously it was (reply_free_queue_depth * 2). Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-25SCSI: mpt2sas: Release spinlock for the raid device list before blocking itnagalakshmi.nandigama@lsi.com
commit 30c43282f3d347f47f9e05199d2b14f56f3f2837 upstream. Added code to release the spinlock that is used to protect the raid device list before calling a function that can block. The blocking was causing a reschedule, and subsequently it is tried to acquire the same lock, resulting in a panic (NMI Watchdog detecting a CPU lockup). Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-15[SCSI] fcoe: fix fcoe in a DCB environment by adding DCB notifiers to set ↵john fastabend
skb priority Use DCB notifiers to set the skb priority to allow packets to be steered and tagged correctly over DCB enabled drivers that setup traffic classes. This allows queue_mapping() routines to be removed in these drivers that were previously inspecting the ethertype of every skb to mark FCoE/FIP frames. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-14[SCSI] bnx2i: Fixed kernel panic caused by unprotected task->sc->request derefEddie Wai
During session recovery, the conn_stop call will trigger a flush to all outstanding SCSI cmds in the xmit queue. This will set all outstanding task->sc to NULL prior to the session_teardown call which frees the task memory. In the bnx2i SCSI response processing path, only the task was being checked for NULL under the session lock before the task->sc->request dereferencing. If there are outstanding SCSI cmd responses pending for process, the following kernel panic can be exposed where task->sc was found to be NULL. Call Trace: [ 69.720205] [<ffffffffa040d0d0>] bnx2i_process_new_cqes+0x290/0x3c0 [bnx2i] [ 69.804289] [<ffffffffa040d233>] bnx2i_fastpath_notification+0x33/0xa0 [bnx2 i] [ 69.891490] [<ffffffffa040d37b>] bnx2i_indicate_kcqe+0xdb/0x330 [bnx2i] [ 69.971427] [<ffffffffa03eac5e>] service_kcqes+0x16e/0x1d0 [cnic] [ 70.045132] [<ffffffffa03eacea>] cnic_service_bnx2x_kcq+0x2a/0x50 [cnic] [ 70.126105] [<ffffffffa03ead53>] cnic_service_bnx2x_bh+0x43/0x140 [cnic] [ 70.207081] [<ffffffff81060676>] tasklet_action+0x66/0x110 [ 70.273521] [<ffffffff8106025f>] __do_softirq+0xef/0x220 [ 70.337887] [<ffffffff81447ebc>] call_softirq+0x1c/0x30 This patch adds the !task->sc check and also protects the sc dereferencing under the session lock. Signed-off-by: Eddie Wai <eddie.wai@broadcom.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-14[SCSI] qla4xxx: check for failed conn setupMike Christie
iscsi_conn_setup can fail so we must check for NULL being returned. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-14[SCSI] qla4xxx: a small loop fixTomas Henzl
When the qla4xxx_get_fwddb_entry returns QLA_ERROR the nex_idx is not updated, for (idx = 0; idx < max_ddbs; idx = next_idx) { ret = qla4xxx_get_fwddb_entry(ha, idx, NULL, 0, NULL, &next_idx, &state, &conn_err, NULL, NULL); if (ret == QLA_ERROR) continue; This means there is a risk that the 'idx < max_ddbs' condition will never met and the loop will loop forever. Fix this by explicitly increasing the next_idx in the error condition. Maybe a break instead of continue is more appropriate, leaving the decision on the qlogic maintainer. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-14[SCSI] qla4xxx: fix flash/ddb supportMike Christie
With open-iscsi support, target entries persisted in the FLASH were not login. Added support in the qla4xxx driver to do the login on probe time to the target entries saved in the FLASH by user. With this changes upgrade to the new kernel with open-iscsi support in qla4xxx will ensure users original target entries login on driver load Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com> Signed-off-by: Ravi Anand <ravi.anand@qlogic.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-14[SCSI] fcoe: Fix preempt count leak in fcoe_filter_frames()Thomas Gleixner
The error exit path leaks preempt count. Add the missing put_cpu(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Yi Zou <yi.zou@intel.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Update version number to 8.03.07.12-k.Chad Dupuis
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Submit all chained IOCBs for passthrough commands on request ↵Giridhar Malavali
queue 0. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Correct fc_host port_state display.Saurav Kashyap
[jejb: checkpatch fixes] Add more fine grain parsing of vha->loop_state to export a more accurate fc_host port_state. Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Disable generating pause frames when firmware hang detected ↵Giridhar Malavali
for ISP82xx. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Clear mailbox busy flag during premature mailbox completion ↵Giridhar Malavali
for ISP82xx. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Encapsulate prematurely completing mailbox commands during ↵Chad Dupuis
ISP82xx firmware hang. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Display IPE error message for ISP82xx.Chad Dupuis
[jejb: fixup checkpatch error] Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Return the correct value for a mailbox command if 82xx is in ↵Andrew Vasquez
reset recovery. We need to return QLA_FUNCTION_TIMEOUT immediately otherwise we mess up the mailbox command state machine. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Enable Minidump by default with default capture mask 0x1f.Giridhar Malavali
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Stop unconditional completion of mailbox commands issued in ↵Giridhar Malavali
interrupt mode during firmware hang. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Revert back the request queue mapping to request queue 0.Giridhar Malavali
If there is an error creating multiple response queues then we need to revert the request queue mapping back to request queue 0. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Don't call alloc_fw_dump for ISP82XX.Saurav Kashyap
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Check for SCSI status on underruns.Arun Easi
Signed-off-by: Arun Easi <arun.easi@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] qla2xxx: Remove qla2x00_wait_for_loop_ready function.Saurav Kashyap
This function can wait for 5min under certain scenarios. One of them is when the port is down from switch and bus reset is issued. The bus reset used to wait for 5 minutes for the loop and upper layer callers used to hang and give stack trace because of getting stuck for 120 sec. It is legacy code that was used when the driver used to do queuing of the commands. Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-12[SCSI] mpt2sas: _scsih_smart_predicted_fault uses GFP_KERNEL in interrupt ↵Anton Blanchard
context _scsih_smart_predicted_fault is called in an interrupt and therefore must allocate memory using GFP_ATOMIC. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-11-14[SCSI] hpsa: Disable ASPMMatthew Garrett
The Windows driver .inf disables ASPM on hpsa devices. Do the same because the selection of a non default ASPM policy can cause the device to hang. Signed-off-by: Matthew Garrett <mjg@redhat.com> Cc: stable@kernel.org Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-11-11[SCSI] aacraid: controller hangs if kernel uses non-default ASPM policyVasily Averin
Aacraid controller can hang on some nodes if kernel uses non-default (powersave) ASPM policy. Controller hangs shortly after successful load and hardware detection. Scsi error handler detects this hang and tries to restart hardware but it does not help. Initially it was noticed on RHEL6-based openVZ kernel after backporting aacraid driver from mainline (RHEL6 kernel with original driver works well) http://bugzilla.openvz.org/show_bug.cgi?id=2043 This issue happens because default ASPM policy was changed in Red Hat kernels. Therefore guys from Red Hat have noticed this problem long time ago: on Fedora 12 https://bugzilla.redhat.com/show_bug.cgi?id=540478 on Fedora 14 https://bugzilla.redhat.com/show_bug.cgi?id=679385 In RHEL6 kernel this issue was fixed, ASPM was disabled in aacraid driver. In kernel changelog I've found that seems it was done by Matthew Garrett: - [scsi] aacraid: Disable ASPM by default (Matthew Garrett) [599735] However seems this patch was not submitted to mainline. I've reproduced this issue on vanilla 3.1.0 kernel booted with "pcie_aspm.policy=powersave" option, So I believe it makes sense to do it now. Signed-off-by: Vasily Averin <vvs@sw.ru> [mjg: Checking the Windows drivers indicates that they disable ASPM under all circumstances, so:] Acked-by: Matthew Garrett <mjg@redhat.com> Acked-by: Achim Leubner <Achim_Leubner@pmc-sierra.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-11-10[SCSI] mpt2sas: add missing allocation.Dan Carpenter
There was supposed to be a kzalloc() here and the compiler complained about it. mpt2sas_scsih.c: In function ‘mpt2sas_scsih_reset_handler’: mpt2sas_scsih.c:2807:21: warning: ‘fw_event’ may be used uninitialized in this function [-Wuninitialized] Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>