summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2011-12-09SCSI: Silencing 'killing requests for dead queue'Hannes Reinecke
commit 745718132c3c7cac98a622b610e239dcd5217f71 upstream. When we tear down a device we try to flush all outstanding commands in scsi_free_queue(). However the check in scsi_request_fn() is imperfect as it only signals that we _might start_ aborting commands, not that we've actually aborted some. So move the printk inside the scsi_kill_request function, this will also give us a hint about which commands are aborted. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Cc: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-26aacraid: controller hangs if kernel uses non-default ASPM policyVasily Averin
commit cf16123c9c8e346ed1dd171295a678d77648d7f8 upstream. Aacraid controller can hang on some nodes if kernel uses non-default (powersave) ASPM policy. Controller hangs shortly after successful load and hardware detection. Scsi error handler detects this hang and tries to restart hardware but it does not help. Initially it was noticed on RHEL6-based openVZ kernel after backporting aacraid driver from mainline (RHEL6 kernel with original driver works well) http://bugzilla.openvz.org/show_bug.cgi?id=2043 This issue happens because default ASPM policy was changed in Red Hat kernels. Therefore guys from Red Hat have noticed this problem long time ago: on Fedora 12 https://bugzilla.redhat.com/show_bug.cgi?id=540478 on Fedora 14 https://bugzilla.redhat.com/show_bug.cgi?id=679385 In RHEL6 kernel this issue was fixed, ASPM was disabled in aacraid driver. In kernel changelog I've found that seems it was done by Matthew Garrett: - [scsi] aacraid: Disable ASPM by default (Matthew Garrett) [599735] However seems this patch was not submitted to mainline. I've reproduced this issue on vanilla 3.1.0 kernel booted with "pcie_aspm.policy=powersave" option, So I believe it makes sense to do it now. Signed-off-by: Vasily Averin <vvs@sw.ru> [mjg: Checking the Windows drivers indicates that they disable ASPM under all circumstances, so:] Acked-by: Matthew Garrett <mjg@redhat.com> Acked-by: Achim Leubner <Achim_Leubner@pmc-sierra.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-26hpsa: Disable ASPMMatthew Garrett
commit e5a44df85e8d78e5c2d3d2e4f59b460905691e2f upstream. The Windows driver .inf disables ASPM on hpsa devices. Do the same because the selection of a non default ASPM policy can cause the device to hang. Signed-off-by: Matthew Garrett <mjg@redhat.com> Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-26fix WARNING: at drivers/scsi/scsi_lib.c:1704James Bottomley
commit 4e6c82b3614a18740ef63109d58743a359266daf upstream. On Mon, 2011-11-07 at 17:24 +1100, Stephen Rothwell wrote: > Hi all, > > Starting some time last week I am getting the following during boot on > our PPC970 blade: > > calling .ipr_init+0x0/0x68 @ 1 > ipr: IBM Power RAID SCSI Device Driver version: 2.5.2 (April 27, 2011) > ipr 0000:01:01.0: Found IOA with IRQ: 26 > ipr 0000:01:01.0: Starting IOA initialization sequence. > ipr 0000:01:01.0: Adapter firmware version: 06160039 > ipr 0000:01:01.0: IOA initialized. > scsi0 : IBM 572E Storage Adapter > ------------[ cut here ]------------ > WARNING: at drivers/scsi/scsi_lib.c:1704 > Modules linked in: > NIP: c00000000053b3d4 LR: c00000000053e5b0 CTR: c000000000541d70 > REGS: c0000000783c2f60 TRAP: 0700 Not tainted (3.1.0-autokern1) > MSR: 8000000000029032 <EE,ME,CE,IR,DR> CR: 24002024 XER: 20000002 > TASK = c0000000783b8000[1] 'swapper' THREAD: c0000000783c0000 CPU: 0 > GPR00: 0000000000000001 c0000000783c31e0 c000000000cf38b0 c00000000239a9d0 > GPR04: c000000000cbe8f8 0000000000000000 c0000000783c3040 0000000000000000 > GPR08: c000000075daf488 c000000078a3b7ff c000000000bcacc8 0000000000000000 > GPR12: 0000000044002028 c000000007ffb000 0000000002e40000 000000000099b800 > GPR16: 0000000000000000 c000000000bba5fc c000000000a61db8 0000000000000000 > GPR20: 0000000001b77200 0000000000000000 c000000078990000 0000000000000001 > GPR24: c000000002396828 0000000000000000 0000000000000000 c000000078a3b938 > GPR28: fffffffffffffffa c0000000008ad2c0 c000000000c7faa8 c00000000239a9d0 > NIP [c00000000053b3d4] .scsi_free_queue+0x24/0x90 > LR [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0 > Call Trace: > [c0000000783c31e0] [c000000000c7faa8] wireless_seq_fops+0x278d0/0x2eb88 (unreliable) > [c0000000783c3270] [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0 > [c0000000783c3330] [c00000000053eba0] .scsi_probe_and_add_lun+0x390/0xb40 > [c0000000783c34a0] [c00000000053f7ec] .__scsi_scan_target+0x16c/0x650 > [c0000000783c35f0] [c00000000053fd90] .scsi_scan_channel+0xc0/0x100 > [c0000000783c36a0] [c00000000053fefc] .scsi_scan_host_selected+0x12c/0x1c0 > [c0000000783c3750] [c00000000083dcb4] .ipr_probe+0x2c0/0x390 > [c0000000783c3830] [c0000000003f50b4] .local_pci_probe+0x34/0x50 > [c0000000783c38a0] [c0000000003f5f78] .pci_device_probe+0x148/0x150 > [c0000000783c3950] [c0000000004e1e8c] .driver_probe_device+0xdc/0x210 > [c0000000783c39f0] [c0000000004e20cc] .__driver_attach+0x10c/0x110 > [c0000000783c3a80] [c0000000004e1228] .bus_for_each_dev+0x98/0xf0 > [c0000000783c3b30] [c0000000004e1bf8] .driver_attach+0x28/0x40 > [c0000000783c3bb0] [c0000000004e07d8] .bus_add_driver+0x218/0x340 > [c0000000783c3c60] [c0000000004e2a2c] .driver_register+0x9c/0x1b0 > [c0000000783c3d00] [c0000000003f62d4] .__pci_register_driver+0x64/0x140 > [c0000000783c3da0] [c000000000b99f88] .ipr_init+0x4c/0x68 > [c0000000783c3e20] [c00000000000ad24] .do_one_initcall+0x1a4/0x1e0 > [c0000000783c3ee0] [c000000000b512d0] .kernel_init+0x14c/0x1fc > [c0000000783c3f90] [c000000000022468] .kernel_thread+0x54/0x70 > Instruction dump: > ebe1fff8 7c0803a6 4e800020 7c0802a6 fba1ffe8 fbe1fff8 7c7f1b78 f8010010 > f821ff71 e8030398 3120ffff 7c090110 <0b000000> e86303b0 482de065 60000000 > ---[ end trace 759bed76a85e8dec ]--- > scsi 0:0:1:0: Direct-Access IBM-ESXS MAY2036RC T106 PQ: 0 ANSI: 5 > ------------[ cut here ]------------ > > I get lots more of these. The obvious commit to point the finger at > is 3308511c93e6 ("[SCSI] Make scsi_free_queue() kill pending SCSI > commands") but the root cause may be something different. Caused by commit f7c9c6bb14f3104608a3a83cadea10a6943d2804 Author: Anton Blanchard <anton@samba.org> Date: Thu Nov 3 08:56:22 2011 +1100 [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev Doesn't completely do the teardown. The true fix is to do a proper teardown instead of hand rolling it Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11hpsa: add small delay when using PCI Power Management to reset for kumpMike Miller
commit c4853efec665134b2e6fc9c13447323240980351 upstream. The P600 requires a small delay when changing states. Otherwise we may think the board did not reset and we bail. This for kdump only and is particular to the P600. Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11mpt2sas: Fix for system hang when discovery in progressnagalakshmi.nandigama@lsi.com
commit 0167ac67ff6f35bf2364f7672c8012b0cd40277f upstream. Fix for issue : While discovery is in progress, hot unplug and hot plug of enclosure connected to the controller card is causing system to hang. When a device is in the process of being detected at driver load time then if it is removed, the device that is no longer present will not be added to the list. So the code in _scsih_probe_sas() is rearranged as such so the devices that failed to be detected are not added to the list. Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11Fix block queue and elevator memory leak in scsi_alloc_sdevAnton Blanchard
commit f7c9c6bb14f3104608a3a83cadea10a6943d2804 upstream. When looking at memory consumption issues I noticed quite a lot of memory in the kmalloc-2048 bucket: OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME 6561 6471 98% 2.30K 243 27 15552K kmalloc-2048 Over 15MB. slub debug shows that cfq is responsible for almost all of it: # sort -nr /sys/kernel/slab/kmalloc-2048/alloc_calls 6402 .cfq_init_queue+0xec/0x460 age=43423/43564/43655 pid=1 cpus=4,11,13 In scsi_alloc_sdev we do scsi_alloc_queue but if slave_alloc fails we don't free it with scsi_free_queue. The patch below fixes the issue: OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME 135 72 53% 2.30K 5 27 320K kmalloc-2048 # cat /sys/kernel/slab/kmalloc-2048/alloc_calls 3 .cfq_init_queue+0xec/0x460 age=3811/3876/3925 pid=1 cpus=4,11,13 Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11Make scsi_free_queue() kill pending SCSI commandsBart Van Assche
commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 upstream. Make sure that SCSI device removal via scsi_remove_host() does finish all pending SCSI commands. Currently that's not the case and hence removal of a SCSI host during I/O can cause a deadlock. See also "blkdev_issue_discard() hangs forever if underlying storage device is removed" (http://bugzilla.kernel.org/show_bug.cgi?id=40472). See also http://lkml.org/lkml/2011/8/27/6. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11scsi_dh: check queuedata pointer before proceeding furtherMoger, Babu
commit a18a920c70d48a8e4a2b750d8a183b3c1a4be514 upstream. This patch validates sdev pointer in scsi_dh_activate before proceeding further. Without this check we might see the panic as below. I have seen this panic multiple times.. Call trace: #0 [ffff88007d647b50] machine_kexec at ffffffff81020902 #1 [ffff88007d647ba0] crash_kexec at ffffffff810875b0 #2 [ffff88007d647c70] oops_end at ffffffff8139c650 #3 [ffff88007d647c90] __bad_area_nosemaphore at ffffffff8102dd15 #4 [ffff88007d647d50] page_fault at ffffffff8139b8cf [exception RIP: scsi_dh_activate+0x82] RIP: ffffffffa0041922 RSP: ffff88007d647e00 RFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000093c5 RDX: 00000000000093c5 RSI: ffffffffa02e6640 RDI: ffff88007cc88988 RBP: 000000000000000f R8: ffff88007d646000 R9: 0000000000000000 R10: ffff880082293790 R11: 00000000ffffffff R12: ffff88007cc88988 R13: 0000000000000000 R14: 0000000000000286 R15: ffff880037b845e0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 #5 [ffff88007d647e38] run_workqueue at ffffffff81060268 #6 [ffff88007d647e78] worker_thread at ffffffff81060386 #7 [ffff88007d647ee8] kthread at ffffffff81064436 #8 [ffff88007d647f48] kernel_thread at ffffffff81003fba Signed-off-by: Babu Moger <babu.moger@netapp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11st: fix race in st_scsi_execute_endPetr Uzel
commit c68bf8eeaa57c852e74adcf597237be149eef830 upstream. The call to complete() in st_scsi_execute_end() wakes up sleeping thread in write_behind_check(), which frees the st_request, thus invalidating the pointer to the associated bio structure, which is then passed to the blk_rq_unmap_user(). Fix by storing pointer to bio structure into temporary local variable. This bug is present since at least linux-2.6.32. Signed-off-by: Petr Uzel <petr.uzel@suse.cz> Reported-by: Juergen Groß <juergen.gross@ts.fujitsu.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11isci: fix missed unlock in apc_agent_timeout()Jeff Skirvin
commit 983d3fdd332742167d0482c06fd29cf4b8a687c0 upstream. Needed to jump to scic_lock unlock. Also spotted by coccicheck. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11isci: fix support for large smp requestsDan Williams
commit 54b5e3a4bfa3452bc10cd4da672099ccc46b8c09 upstream. Kill the local smp response buffer. Besides being unnecessary, it is too small (currently truncates responses to 60 bytes). The mid-layer will have already allocated a sufficiently sized buffer, just kmap and copy into it directly. Reported-by: Derick Marks <derick.w.marks@intel.com> Tested-by: Derick Marks <derick.w.marks@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11libsas: set sas_address and device type of rphyJack Wang
commit bb041a0e9c31229071b6e56e1d0d8374af0d2038 upstream. Libsas forget to set the sas_address and device type of rphy lead to file under /sys/class/sas_x show wrong value, fix that. Signed-off-by: Jack Wang <jack_wang@usish.com> Tested-by: Crystal Yu <crystal_yu@usish.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11ipr: Always initiate hard reset in kdump kernelAnton Blanchard
commit 5d7c20b7fa5c6ca19e871b4050e321c99d32bd43 upstream. During kdump testing I noticed timeouts when initialising each IPR adapter. While the driver has logic to detect an adapter in an indeterminate state, it wasn't triggering and each adapter went through a 5 minute timeout before finally going operational. Some analysis showed the needs_hard_reset flag wasn't getting set. We can check the reset_devices kernel parameter which is set by kdump and force a full reset. This fixes the problem. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11megaraid_sas: Fix instance access in megasas_reset_timeradam radford
commit f575c5d3ebdca3b0482847d8fcba971767754a9e upstream. The following patch for megaraid_sas will fix a potential bad pointer access in megasas_reset_timer(), when a MegaRAID 9265/9285 or 9360/9380 gets a timeout. megasas_build_io_fusion() sets SCp.ptr to be a struct megasas_cmd_fusion *, but then megasas_reset_timer() was casting SCp.ptr to be a struct megasas_cmd *, then trying to access cmd->instance, which is invalid. Just loading instance from scmd->device->host->hostdata in megasas_reset_timer() fixes the issue. Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-16libsas: fix panic when single phy is disabled on a wide portMark Salyzyn
commit a73914c35b05d80f8ce78288e10056c91090b666 upstream. When a wide port is being utilized to a target, if one disables only one of the phys, we get an OS crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000238 IP: [<ffffffff814ca9b1>] mutex_lock+0x21/0x50 PGD 4103f5067 PUD 41dba9067 PMD 0 Oops: 0002 [#1] SMP last sysfs file: /sys/bus/pci/slots/5/address CPU 0 Modules linked in: pm8001(U) ses enclosure fuse nfsd exportfs autofs4 ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache nfs_acl auth_rpcgss 8021q fcoe libfcoe garp libfc scsi_transport_fc stp scsi_tgt llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 sr_mod cdrom dm_mirror dm_region_hash dm_log uinput sg i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e1000e mlx4_ib ib_mad ib_core mlx4_en mlx4_core ext3 jbd mbcache sd_mod crc_t10dif usb_storage ata_generic pata_acpi ata_piix libsas(U) scsi_transport_sas dm_mod [last unloaded: pm8001] Modules linked in: pm8001(U) ses enclosure fuse nfsd exportfs autofs4 ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache nfs_acl auth_rpcgss 8021q fcoe libfcoe garp libfc scsi_transport_fc stp scsi_tgt llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 sr_mod cdrom dm_mirror dm_region_hash dm_log uinput sg i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e1000e mlx4_ib ib_mad ib_core mlx4_en mlx4_core ext3 jbd mbcache sd_mod crc_t10dif usb_storage ata_generic pata_acpi ata_piix libsas(U) scsi_transport_sas dm_mod [last unloaded: pm8001] Pid: 5146, comm: scsi_wq_5 Not tainted 2.6.32-71.29.1.el6.lustre.7.x86_64 #1 Storage Server RIP: 0010:[<ffffffff814ca9b1>] [<ffffffff814ca9b1>] mutex_lock+0x21/0x50 RSP: 0018:ffff8803e4e33d30 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000238 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8803e664c800 RDI: 0000000000000238 RBP: ffff8803e4e33d40 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000238 R14: ffff88041acb7200 R15: ffff88041c51ada0 FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000238 CR3: 0000000410143000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process scsi_wq_5 (pid: 5146, threadinfo ffff8803e4e32000, task ffff8803e4e294a0) Stack: ffff8803e664c800 0000000000000000 ffff8803e4e33d70 ffffffffa001f06e <0> ffff8803e4e33d60 ffff88041c51ada0 ffff88041acb7200 ffff88041bc0aa00 <0> ffff8803e4e33d90 ffffffffa0032b6c 0000000000000014 ffff88041acb7200 Call Trace: [<ffffffffa001f06e>] sas_port_delete_phy+0x2e/0xa0 [scsi_transport_sas] [<ffffffffa0032b6c>] sas_unregister_devs_sas_addr+0xac/0xe0 [libsas] [<ffffffffa0034914>] sas_ex_revalidate_domain+0x204/0x330 [libsas] [<ffffffffa00307f0>] ? sas_revalidate_domain+0x0/0x90 [libsas] [<ffffffffa0030855>] sas_revalidate_domain+0x65/0x90 [libsas] [<ffffffff8108c7d0>] worker_thread+0x170/0x2a0 [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff8108c660>] ? worker_thread+0x0/0x2a0 [<ffffffff81091b36>] kthread+0x96/0xa0 [<ffffffff810141ca>] child_rip+0xa/0x20 [<ffffffff81091aa0>] ? kthread+0x0/0xa0 [<ffffffff810141c0>] ? child_rip+0x0/0x20 Code: ff ff 85 c0 75 ed eb d6 66 90 55 48 89 e5 48 83 ec 10 48 89 1c 24 4c 89 64 24 08 0f 1f 44 00 00 48 89 fb e8 92 f4 ff ff 48 89 df <f0> ff 0f 79 05 e8 25 00 00 00 65 48 8b 04 25 08 cc 00 00 48 2d RIP [<ffffffff814ca9b1>] mutex_lock+0x21/0x50 RSP <ffff8803e4e33d30> CR2: 0000000000000238 The following patch is admittedly a band-aid, and does not solve the root cause, but it still is a good candidate for hardening as a pointer check before reference. Signed-off-by: Mark Salyzyn <mark_salyzyn@us.xyratex.com> Tested-by: Jack Wang <jack_wang@usish.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-16qla2xxx: Fix crash in qla2x00_abort_all_cmds() on unloadRoland Dreier
commit 9bfacd01dc9b7519e1e6da12b01963550b9d09a2 upstream. I hit a crash in qla2x00_abort_all_cmds() if the qla2xxx module is unloaded right after it is loaded. I debugged this down to the abort handling improperly treating a command of type SRB_ADISC_CMD as if it had a bsg_job to complete when that command actually uses the iocb_cmd part of the union. (I guess to hit this one has to unload the module while the async FC initialization is still in progress) It seems we should only look for a bsg_job if type is SRB_ELS_CMD_RPT, SRB_ELS_CMD_HST or SRB_CT_CMD, so switch the test to make that explicit. Signed-off-by: Roland Dreier <roland@purestorage.com> Acked-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03scsi: qla4xxx needs libiscsi.oRandy Dunlap
commit 3538a001ea7db13fa1be2966b71f69d808acff01 upstream. qla4xxx driver needs to be linked with libiscsi.o to fix build errors. This happens when no other drivers that use libiscsi.o are enabled. ERROR: "iscsi_conn_stop" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_get_addr_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_session_teardown" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_host_alloc" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_start" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_send_pdu" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_session_get_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_get_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_set_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_session_failure" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_complete_pdu" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_session_setup" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_bind" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_setup" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_itt_to_task" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03libsas: fix failure to revalidate domain for anything but the first expander ↵Mark Salyzyn
child. commit 24926dadc41cc566e974022b0e66231b82c6375f upstream. In an enclosure model where there are chaining expanders to a large body of storage, it was discovered that libsas, responding to a broadcast event change, would only revalidate the domain of first child expander in the list. The issue is that the pointer value to the discovered source device was used to break out of the loop, rather than the content of the pointer. This still remains non-compliant as the revalidate domain code is supposed to loop through all child expanders, and not stop at the first one it finds that reports a change count. However, the design of this routine does not allow multiple device discoveries and that would be a more complicated set of patches reserved for another day. We are fixing the glaring bug rather than refactoring the code. Signed-off-by: Mark Salyzyn <msalyzyn@us.xyratex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03aacraid: reset should disable MSI interruptVasily Averin
commit d0efab26f89506387a1bde898556660e06d7eb15 upstream. scsi reset on hardware with enabled MSI interrupts generates WARNING message [11027.798722] aacraid: Host adapter abort request (0,0,0,0) [11027.798814] aacraid: Host adapter reset request. SCSI hang ? [11087.762237] aacraid: SCSI bus appears hung [11135.082543] ------------[ cut here ]------------ [11135.082646] WARNING: at drivers/pci/msi.c:658 pci_enable_msi_block+0x251/0x290() Signed-off-by: Vasily Averin <vvs@sw.ru> Acked-by: Mark Salyzyn <mark_salyzyn@us.xyratex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-033w-9xxx: fix iommu_iova leakJames Bottomley
commit 96067723e46b0dd24ae7b934085ab4eff4d26a1b upstream. Following reports on the list, it looks like the 3e-9xxx driver will leak dma mappings every time we get a transient queueing error back from the card. This is because it maps the sg list in the routine that sends the command, but doesn't unmap again in the transient failure path (even though the command is sent back to the block layer). Fix by unmapping before returning the status. Reported-by: Chris Boot <bootc@bootc.net> Tested-by: Chris Boot <bootc@bootc.net> Acked-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereferenceNeil Horman
commit e48f129c2f200dde8899f6ea5c6e7173674fc482 upstream. This oops was reported recently: d:mon> e cpu 0xd: Vector: 300 (Data Access) at [c0000000fd4c7120] pc: d00000000076f194: .t3_l2t_get+0x44/0x524 [cxgb3] lr: d000000000b02108: .init_act_open+0x150/0x3d4 [cxgb3i] sp: c0000000fd4c73a0 msr: 8000000000009032 dar: 0 dsisr: 40000000 current = 0xc0000000fd640d40 paca = 0xc00000000054ff80 pid = 5085, comm = iscsid d:mon> t [c0000000fd4c7450] d000000000b02108 .init_act_open+0x150/0x3d4 [cxgb3i] [c0000000fd4c7500] d000000000e45378 .cxgbi_ep_connect+0x784/0x8e8 [libcxgbi] [c0000000fd4c7650] d000000000db33f0 .iscsi_if_rx+0x71c/0xb18 [scsi_transport_iscsi2] [c0000000fd4c7740] c000000000370c9c .netlink_data_ready+0x40/0xa4 [c0000000fd4c77c0] c00000000036f010 .netlink_sendskb+0x4c/0x9c [c0000000fd4c7850] c000000000370c18 .netlink_sendmsg+0x358/0x39c [c0000000fd4c7950] c00000000033be24 .sock_sendmsg+0x114/0x1b8 [c0000000fd4c7b50] c00000000033d208 .sys_sendmsg+0x218/0x2ac [c0000000fd4c7d70] c00000000033f55c .sys_socketcall+0x228/0x27c [c0000000fd4c7e30] c0000000000086a4 syscall_exit+0x0/0x40
2011-10-03bnx2fc: scsi_dma_unmap() not invoked on IO completionsBhanu Prakash Gollapudi
commit b5a95fe7ef464a67fab6ff870aa740739e788f90 upstream. Do not set io_req->sc_cmd to NULL until bnx2fc_unmap_sg_list() is called to enable it to unmap the DMA mappings. Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-03bnx2fc: Fix kernel panic when deleting NPIV portsBhanu Prakash Gollapudi
commit d36b3279e157641c345b12eddb3db78fb42da80f upstream. Deleting NPIV port causes a kernel panic when the NPIV port is in the same zone as the physical port and shares the same LUN. This happens due to the fact that vport destroy and unsolicited ELS are scheduled to run on the same workqueue, and vport destroy destroys the lport and the unsolicited ELS tries to access the invalid lport. This patch fixes this issue by maintaining a list of valid lports and verifying if the lport is valid or not before accessing it. Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03libiscsi_tcp: fix LLD data allocationMike Christie
commit 74dcd0ec735ba9c5bef254b2f6e53068cf3f9ff0 upstream. Have libiscsi_tcp have upper layers allocate the LLD data along with the iscsi_cls_conn struct, so it is refcounted. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03isci: fix event-get pointer incrementDan Williams
commit 77cd72a53f6426f81b7f56a862402849ee903bda upstream. Hardware only increments the put pointer on event types >= 4. Do not increment the get pointer for event type 3. Reported-by: Kapil Karkra <kapil.karkra@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03isci: Leave requests alone if already terminating.Jeff Skirvin
commit 39ea2c5b5ffaa344467da53e885cfa4ac0105050 upstream. Instead of immediately completing any request that has a second termination call made on it, wait for the TC done/abort HW event. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03isci: change sas phy timeouts from 54us to 59usMarcin Tomczak
commit 985af6f70dbb8a33b3af8a7c7df508d924650e37 upstream. Need the following workaround in the driver for interoperability with the older Intel SSD drives and any other SATA drive that may exhibit the same behavior. This is a corner case where SCU speed is limited to either 3G or 1.5G and the drive has a period of DC idle when it switches speed during SATA speed negotiation. Workaround :change PHYTOV[31:24] from 0x36 to 0x3B. Signed-off-by: Marcin Tomczak <marcin.tomczak@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03lpfc 8.3.25: PCI and SR-IOV FixesJames Smart
commit 0a96e9754d6c4a2a31e50ee6c6e36ec13f80bc25 upstream. PCI and SR-IOV Fixes - Call pci_save_state after the pci_restore_state completes. - After calling pci_enable_pcie_error_reporting() and checking the return value for logging messages from rc, reset rc to 0 to it will not later be interpreted for error. - Read PCI config space SR-IOV capability to get the number of VFs supported. - Check for the PF's supported number of VFs before invoking PCI enable sriov API call and log error message that user requested number of VFs is beyond the PF capability if such request is passed in. - Added check for Physical function with Virtual Functions attached. If so, first disable all the VFs before proceeding to device reset. Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com> Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03lpfc 8.3.25: Fabric and Target Discovery FixesJames Smart
commit 5248a7498e5f6f3d6d276080466946f82f0ea56a upstream. Fabric and Target Discovery Fixes - Clear FC_VPORT_NEEDS_INIT_VPI flag during completion of REG_VFI mailbox command. - Prevent SLI3 Code from unregistering the physical VPI. - Add an else clause to the code that checks and sets sp->cmn.request_multiple_Nport to clear the bit. - Remove a redundant mbox free. - Modified lpfc_sli4_async_fip_evt to pass in physical VPI toi lpfc_find_vport_by_vpid function. - Modified lpfc_find_vport_by_vpid to translate physical VPI to logical VPI before comparing with vport VPI. Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com> Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03lpfc 8.3.25: Adapter Interface fixes and changesJames Smart
commit 7851fe2c7f294d0beccf4c3d6af52e8247b89f00 upstream. Adapter Interface fixes and changes - Modify the macro field from lpfc_init_vpi_vpi to lpfc_init_vfi_vpi - Add the new CQE_CODE_RECEIVE_V1 CQE Code, add code in the driver to handle the new Code the same as the CQE_CODE_RECEIVE code except that there are two new checks for this code that will cause the driver to use the new V1 macros for rq_id and fcf_id. - Fix a bug in lpfc_prep_seq() where the size out of the first CQE was ONLY being used, even though multiple dmabufs make up the sequence, each have their own CQE with potentially different sizes. - Fix bug in lpfc_bsg_ct_unsol_event() where the ulpContext and ulpWord[3] fields of the XMIT_SEQUENCE64_CX IOCB were being calculated incorrectly. - Do physical to logical translation before indexing into the active XRI array. - Populate physical vpi in the iocb data structure. - Put the current accumulated total in each IOCB in the chain as we are walking thru then. The last IOCB in the chain should have the total length of the sequence. Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com> Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03lpfc 8.3.25: Miscellaneous Bug fixes and code cleanupJames Smart
commit 88a2cfbb8bf3802ca5a90c7d1567a1e542e6ef0c upstream. Miscellaneous Bug fixes and code cleanup - Fix 16G link speed reporting by adding check for 16G check. - Change the check and enforcement of MAILBOX_EXT_SIZE (2048B) to the check and enforcement of BSG_MBOX_SIZE - sizeof(MAILBOX_t) (3840B). - Instead of waiting for a fixed amount of time after performing firmware reset, the driver shall wait for the Lancer SLIPORT_STATUS register for the readiness of the firmware for bring up. - Add logging to indicate when dynamic parameters are changed. - Add revision and date to the firmware image format. - Use revision instead of rev_name to check firmware image version. - Update temporary offset after memcopy is complete for firmware update. - Consolidated the use of the macros to get rid of duplicated register offset definitions. - Removed the unused second parameter in routine lpfc_bsg_diag_mode_enter() - Enable debugfs when debugfs is enabled. - Update function comments for lpfc_sli4_alloc_xri and lpfc_sli4_init_rpi_hdrs. Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com> Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03lpfc 8.3.25: T10 DIF FixesJames Smart
commit 7c56b9fd3b6d2d933075d12abee67ceb7c90d04a upstream. T10 DIF Fixes - Fix the case where the SCSI Host supplies the CRC and driver to controller protection is on. - Only support T10 DIF type 1. LBA always goes in ref tag and app tag is not checked. - Change the format of the sense data passed up to the SCSI layer to match the Descriptor Format Sense Data found in SPC-4 sections 4.5.2.1 and 4.5.2.2. - Fix Slip PDE implementation. - Remove BUG() in else casein lpfc_sc_to_bg_opcodes. Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com> Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03mpt2sas: Adding support for customer specific brandingKashyap, Desai
commit ab3e5f60d1fc8fe725d02510ff820ff207a8dbef upstream. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03mpt2sas: Added DID_NO_CONNECT return when driver remove and avoid shutdown callKashyap, Desai
commit 7821578caa8cb831868989041112ab808029ca65 upstream. Driver should not call shutdown call from _scsih_remove otherwise, The scsi midlayer can be deadlocked when devices are removed from the driver pci_driver->shutdown handler. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03fcoe: Unable to select the exchangeID from offload pool for storage targetsKiran Patil
commit 1ff9918b625457ce20d450d00f9ed0a12ba191b7 upstream. Problem: When initiator sends write command to target, target tries to assign new sequence. It allocates new exchangeID (RX_ID) always from non-offloaded pool (Non-offload EMA) Fix: Enhanced fcoe_oem_match routine to look at F_CTL flags and if it is exchange responder and command type is WRITEDATA, then function returns TRUE instead of FALSE. This function is used to determine which pool to use (offload pool of exchange is used only if this function returns TRUE). Technical Notes: N/A Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03libfc: Enhancement to RPORT state machine applicable only for VN2VN modeKiran Patil
commit 480584818a4bb3655d8d0d875ed60b427fc61cc5 upstream. Problem: Existing RPORT state machine continues witg FLOGI/PLOGI process only after it receices beacon from other end. Once claiming stage is over (either clain notify or clain repose), beacon is sent and state machine enters into operational mode where it initiates the rlogin process (FLOGI/PLOGI) to the peer but before this rlogin is initiated, exitsing implementation checks if it received beacon from other end, it beacon is not received yet, rlogin process is not initiated. Other end initiates FLOGI but peer end keeps on rejecting FLOGI, hence after 3 retries other end deletes associated rport, then sends a beacon. Once the beacon is received, peer end now initiates rlogin to the peer end but since associated rport is deleted FLOGI is neither accepted nor the reject response send out because rport is deleted. Hence unable to proceed withg FLOGI/PLOGI process and fails to establish VN2VN connection. Fix: VN2VN spec is not standard yet but based on exitsing collateral on T11, it appears that, both end shall send beacon and enter into 'operational mode' without explictly waiting for beacon from other end. Fix is to allow the RPORT login process as long as respective RPORT is created (as part of claim notification / claim response) even though state of RPORT is INIT. Means don't wait for beacon from peer end, if peer end initiates FLOGI (means peer end exist and responding). Notes: This patch is preparing the FCoE stack for target wrt offload. This is generic patch and harmless even if applied on storage initiator because 'else if' condition of function 'fcoe_oem_found' shall evaluate to TRUE only for targets. Dependencies: None Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03iscsi_tcp: fix locking around iscsi sk user dataMike Christie
commit 03adb5f91280b433c3685d0ee86b2e1424af3d88 upstream. iscsi_sw_tcp_conn_restore_callbacks could have set the sk_user_data field to NULL then iscsi_sw_tcp_data_ready could read that and try to access the NULL pointer. This adds some checks for NULL sk_user_data in the sk callback functions and it uses the sk_callback_lock to set/get that sk_user_data field. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03qla2xxx: Correct inadvertent loop state transitions during port-update handling.Andrew Vasquez
commit 58b48576966ed0afd3f63ef17480ec12748a7119 upstream. Transitioning to a LOOP_UPDATE loop-state could cause the driver to miss normal link/target processing. LOOP_UPDATE is a crufty artifact leftover from at time the driver performed it's own internal command-queuing. Safely remove this state. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03hpsa: fix physical device lun and target numbering problemStephen M. Cameron
commit 01350d05539d1c95ef3568d062d864ab76ae7670 upstream. If a physical device exposed to the OS by hpsa is replaced (e.g. one hot plug tape drive is replaced by another, or a tape drive is placed into "OBDR" mode in which it acts like a CD-ROM device) and a rescan is initiated, the replaced device will be added to the SCSI midlayer with target and lun numbers set to -1. After that, a panic is likely to ensue. When a physical device is replaced, the lun and target number should be preserved. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03hpsa: fix problem that OBDR devices are not detectedStephen M. Cameron
commit 0b0e1d6cbcc8627970e0399df8f06edd690ec7d9 upstream. The test to detect OBDR ("One Button Disaster Recovery") cd-rom devices was comparing against uninitialized data. Fixed by moving the test for the device to where the inquiry data is collected, and uninitialized variable altogether as it wasn't really being used. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03isci: fix 32-bit operation when CONFIG_HIGHMEM64G=nDan Williams
commit ee33e2b771f9e9e4aaba2bb2ace7b727fe451a8b upstream. The unsolicited frame control infrastructure requires a table of dma addresses for the hardware to lookup the frame buffer location by an index. The hardware expects the elements of this table to be 64-bit quantities, so we cannot reference these elements as dma_addr_t. All unsolicited frame protocols are affected, particularly SATA-PIO and SMP which prevented direct-attached SATA drives and expander-attached drives to not be discovered. Reported-by: Jacek Danecki <jacek.danecki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03isci: fix sata response handlingDan Williams
commit 1a878284473284f9577d44babf16d87152a05c33 upstream. A bug (likely copy/paste) that has been carried from the original implementation. The unsolicited frame handling structure returns the d2h fis in the isci_request.stp.rsp buffer. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-15mpt2sas: Fixed Big Indian Issues on 32 bit PPCKashyap, Desai
commit c97951ec46d4b076c2236b77db34eeed6dddb8eb upstream. This patch addresses many endian issues solved by runing sparse with the option __CHECK_ENDIAN__ turned on. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04hpsa: do not attempt to read from a write-only registerStephen M. Cameron
commit fec62c368b9c8b05d5124ca6c3b8336b537f26f3 upstream. Most smartarrays tolerate it, but a few new ones don't. Without this change some newer Smart Arrays will lock up and i/o will grind to a halt. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04pmcraid: reject negative request sizeDan Rosenberg
commit b5b515445f4f5a905c5dd27e6e682868ccd6c09d upstream. There's a code path in pmcraid that can be reached via device ioctl that causes all sorts of ugliness, including heap corruption or triggering the OOM killer due to consecutive allocation of large numbers of pages. First, the user can call pmcraid_chr_ioctl(), with a type PMCRAID_PASSTHROUGH_IOCTL. This calls through to pmcraid_ioctl_passthrough(). Next, a pmcraid_passthrough_ioctl_buffer is copied in, and the request_size variable is set to buffer->ioarcb.data_transfer_length, which is an arbitrary 32-bit signed value provided by the user. If a negative value is provided here, bad things can happen. For example, pmcraid_build_passthrough_ioadls() is called with this request_size, which immediately calls pmcraid_alloc_sglist() with a negative size. The resulting math on allocating a scatter list can result in an overflow in the kzalloc() call (if num_elem is 0, the sglist will be smaller than expected), or if num_elem is unexpectedly large the subsequent loop will call alloc_pages() repeatedly, a high number of pages will be allocated and the OOM killer might be invoked. It looks like preventing this value from being negative in pmcraid_ioctl_passthrough() would be sufficient. Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04fix crash in scsi_dispatch_cmd()James Bottomley
commit bfe159a51203c15d23cb3158fffdc25ec4b4dda1 upstream. USB surprise removal of sr is triggering an oops in scsi_dispatch_command(). What seems to be happening is that USB is hanging on to a queue reference until the last close of the upper device, so the crash is caused by surprise remove of a mounted CD followed by attempted unmount. The problem is that USB doesn't issue its final commands as part of the SCSI teardown path, but on last close when the block queue is long gone. The long term fix is probably to make sr do the teardown in the same way as sd (so remove all the lower bits on ejection, but keep the upper disk alive until last close of user space). However, the current oops can be simply fixed by not allowing any commands to be sent to a dead queue. Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04ses: requesting a fault indicationDouglas Gilbert
commit 2a350cab9daf9a46322d83b091bb05cf54ccf6ab upstream. Noticed that when the sysfs interface of the SCSI SES driver was used to request a fault indication the LED flashed but the buzzer didn't sound. So it was doing what REQUEST IDENT (locate) should do. Changelog: - fix the setting of REQUEST FAULT for the device slot and array device slot elements in the enclosure control diagnostic page - note the potentially defective code that reads the FAULT SENSED and FAULT REQUESTED bits from the enclosure status diagnostic page The attached patch is against git/scsi-misc-2.6 Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04sr: check_events() ignore GET_EVENT when TUR says otherwiseKay Sievers
commit 79b9677d885d1a792bc103f2febb06f91f92de43 upstream. Some broken devices indicates that media has changed on every GET_EVENT_STATUS_NOTIFICATION. This translates into MEDIA_CHANGE uevent on every open() which lets udev run into a loop. Verify GET_EVENT result against TUR and if it generates spurious events for several times in a row, ignore the GET_EVENT events, and trust only the TUR status. This is the log of a USB stick with a (broken) fake CDROM drive: scsi 5:0:0:0: Direct-Access SanDisk U3 Cruzer Micro 8.02 PQ: 0 ANSI: 0 CCS sd 5:0:0:0: Attached scsi generic sg3 type 0 scsi 5:0:0:1: CD-ROM SanDisk U3 Cruzer Micro 8.02 PQ: 0 ANSI: 0 sd 5:0:0:0: [sdb] Attached SCSI removable disk sr2: scsi3-mmc drive: 48x/48x tray sr 5:0:0:1: Attached scsi CD-ROM sr2 sr 5:0:0:1: Attached scsi generic sg4 type 5 sr2: GET_EVENT and TUR disagree continuously, suppress GET_EVENT events sd 5:0:0:0: [sdb] 31777279 512-byte logical blocks: (16.2 GB/15.1 GiB) sd 5:0:0:0: [sdb] No Caching mode page present sd 5:0:0:0: [sdb] Assuming drive cache: write through sd 5:0:0:0: [sdb] No Caching mode page present sd 5:0:0:0: [sdb] Assuming drive cache: write through sdb: sdb1 -tj: Updated to consider only spurious GET_EVENT events among different types of disagreement and allow using TUR for kernel event polling after GET_EVENT is ignored. Reported-By: Markus Rathgeb maggu2810@googlemail.com Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04Blacklist Traxdata CDR4120 and IOMEGA Zip drive to avoid lock ups.Werner Fink
commit 82103978189e9731658cd32da5eb85ab7b8542b8 upstream. This patch resulted from the discussion at https://bugzilla.novell.com/show_bug.cgi?id=679277, https://bugzilla.novell.com/show_bug.cgi?id=681840 . Signed-off-by: Werner Fink <werner@novell.com> Signed-off-by: Ankit Jain <jankit@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>