linux-toradex.git - Linux kernel for Apalis and Colibri modules

Age	Commit message (Collapse)	Author
2017-01-19	xhci: fix deadlock at host remove by running watchdog correctly	Mathias Nyman
	commit d6169d04097fd9ddf811e63eae4e5cd71e6666e2 upstream. If a URB is killed while the host is removed we can end up in a situation where the hub thread takes the roothub device lock, and waits for the URB to be given back by xhci-hcd, blocking the host remove code. xhci-hcd tries to stop the endpoint and give back the urb, but can't as the host is removed from PCI bus at the same time, preventing the normal way of giving back urb. Instead we need to rely on the stop command timeout function to give back the urb. This xhci_stop_endpoint_command_watchdog() timeout function used a XHCI_STATE_DYING flag to indicate if the timeout function is already running, but later this flag has been taking into use in other places to mark that xhci is dying. Remove checks for XHCI_STATE_DYING in xhci_urb_dequeue. We are still checking that reading from pci state does not return 0xffffffff or that host is not halted before trying to stop the endpoint. This whole area of stopping endpoints, giving back URBs, and the wathdog timeout need rework, this fix focuses on solving a specific deadlock issue that we can then send to stable before any major rework. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	xhci: Fix race related to abort operation	OGAWA Hirofumi
	commit 1c111b6c3844a142e03bcfc2fa17bfbdea08e9dc upstream. Current abort operation has race. xhci_handle_command_timeout() xhci_abort_cmd_ring() xhci_write_64(CMD_RING_ABORT) xhci_handshake(5s) do { check CMD_RING_RUNNING udelay(1) ... COMP_CMD_ABORT event COMP_CMD_STOP event xhci_handle_stopped_cmd_ring() restart cmd_ring CMD_RING_RUNNING become 1 again } while () return -ETIMEDOUT xhci_write_64(CMD_RING_ABORT) /* can abort random command */ To do abort operation correctly, we have to wait both of COMP_CMD_STOP event and negation of CMD_RING_RUNNING. But like above, while timeout handler is waiting negation of CMD_RING_RUNNING, event handler can restart cmd_ring. So timeout handler never be notice negation of CMD_RING_RUNNING, and retry of CMD_RING_ABORT can abort random command (BTW, I guess retry of CMD_RING_ABORT was workaround of this race). To fix this race, this moves xhci_handle_stopped_cmd_ring() to xhci_abort_cmd_ring(). And timeout handler waits COMP_CMD_STOP event. At this point, timeout handler is owner of cmd_ring, and safely restart cmd_ring by using xhci_handle_stopped_cmd_ring(). [FWIW, as bonus, this way would be easily extend to add CMD_RING_PAUSE operation] [locks edited as patch is rebased on other locking fixes -Mathias] Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	xhci: Use delayed_work instead of timer for command timeout	OGAWA Hirofumi
	commit cb4d5ce588c5ff68e0fdd30370a0e6bc2c0a736b upstream. This is preparation to fix abort operation race (See "xhci: Fix race related to abort operation"). To make timeout sleepable, use delayed_work instead of timer. [change a newly added pending timer fix to pending work -Mathias] Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	usb: xhci-mem: use passed in GFP flags instead of GFP_KERNEL	Dan Carpenter
	commit c95a9f83711bf53faeb4ed9bbb63a3f065613dfb upstream. We normally use the passed in gfp flags for allocations, it's just these two which were missed. Fixes: 22d45f01a836 ("usb/xhci: replace pci__consistent() with dma__coherent()") Cc: Mathias Nyman <mathias.nyman@intel.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	usb: xhci: hold lock over xhci_abort_cmd_ring()	Lu Baolu
	commit 4dea70778c0f48b4385c7720c363ec8d37a401b4 upstream. In command timer function, xhci_handle_command_timeout(), xhci->lock is unlocked before call into xhci_abort_cmd_ring(). This might cause race between the timer function and the event handler. The xhci_abort_cmd_ring() function sets the CMD_RING_ABORT bit in the command register and polling it until the setting takes effect. A stop command ring event might be handled between writing the abort bit and polling for it. The event handler will restart the command ring, which causes the failure of polling, and we ever believed that we failed to stop it. As a bonus, this also fixes some issues of calling functions without locking in xhci_handle_command_timeout(). Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	xhci: Handle command completion and timeout race	Mathias Nyman
	commit a5a1b9514154437aa1ed35c291191f82fd3e941a upstream. If we get a command completion event at the same time as the command timeout work starts on another cpu we might end up aborting the wrong command. If the command completion takes the xhci lock before the timeout work, it will handle the command, pick the next command, mark it as current_cmd, and re-queue the timeout work. When the timeout work finally gets the lock It will start aborting the wrong command. This case can be resolved by checking if the timeout work is pending inside the timeout function itself. A new timeout work can only be pending if the command completed and a new command was queued. If there are no more commands pending then command completion will set the current_cmd to NULL, which is already handled in the timeout work. Reported-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	usb: host: xhci: Fix possible wild pointer when handling abort command	Baolin Wang
	commit 2a7cfdf37b7c08ac29df4c62ea5ccb01474b6597 upstream. When current command was supposed to be aborted, host will free the command in handle_cmd_completion() function. But it might be still referenced by xhci->current_cmd, which need to set NULL. Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	usb: xhci: fix return value of xhci_setup_device()	Lu Baolu
	commit 90797aee5d6902b49a453c97d83c326408aeb5a8 upstream. xhci_setup_device() should return failure with correct error number when xhci host has died, removed or halted. During usb device enumeration, if usb host is not accessible (died, removed or halted), the hc_driver->address_device() should return a corresponding error code to usb core. But current xhci driver just returns success. This misleads usb core to continue the enumeration by reading the device descriptor, which will result in failure, and users will get a misleading message like "device descriptor read/8, error -110". Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	xhci: free xhci virtual devices with leaf nodes first	Mathias Nyman
	commit ee8665e28e8d90ce69d4abe5a469c14a8707ae0e upstream. the tt_info provided by a HS hub might be in use to by a child device Make sure we free the devices in the correct order. This is needed in special cases such as when xhci controller is reset when resuming from hibernate, and all virt_devices are freed. Also free the virt_devices starting from max slot_id as children more commonly have higher slot_id than parent. Reported-by: Guenter Roeck <groeck@chromium.org> Tested-by: Guenter Roeck <groeck@chromium.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Apollo Lake	Wan Ahmad Zainie
	commit 6c97cfc1a097b1e0786c836e92b7a72b4d031e25 upstream. Intel Apollo Lake also requires XHCI_PME_STUCK_QUIRK. Adding its PCI ID to quirk. Signed-off-by: Wan Ahmad Zainie <wan.ahmad.zainie.wan.mohamad@intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	xhci: workaround for hosts missing CAS bit	Mathias Nyman
	commit 346e99736c3ce328fd42d678343b70243aca5f36 upstream. If a device is unplugged and replugged during Sx system suspend some Intel xHC hosts will overwrite the CAS (Cold attach status) flag and no device connection is noticed in resume. A device in this state can be identified in resume if its link state is in polling or compliance mode, and the current connect status is 0. A device in this state needs to be warm reset. Intel 100/c230 series PCH specification update Doc #332692-006 Errata #8 Observed on Cherryview and Apollolake as they go into compliance mode if LFPS times out during polling, and re-plugged devices are not discovered at resume. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12	usb: xhci: fix possible wild pointer	Lu Baolu
	commit 2b985467371a58ae44d76c7ba12b0951fee6ed98 upstream. handle_cmd_completion() frees a command structure which might be still referenced by xhci->current_cmd. This might cause problem when xhci->current_cmd is accessed after that. A real-life case could be like this. The host takes a very long time to respond to a command, and the command timer is fired at the same time when the command completion event arrives. The command completion handler frees xhci->current_cmd before the timer function can grab xhci->lock. Afterward, timer function grabs the lock and go ahead with checking and setting members of xhci->current_cmd. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-06	USB: UHCI: report non-PME wakeup signalling for Intel hardware	Alan Stern
	commit ccdb6be9ec6580ef69f68949ebe26e0fb58a6fb0 upstream. The UHCI controllers in Intel chipsets rely on a platform-specific non-PME mechanism for wakeup signalling. They can generate wakeup signals even though they don't support PME. We need to let the USB core know this so that it will enable runtime suspend for UHCI controllers. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	xhci: add restart quirk for Intel Wildcatpoint PCH	Mathias Nyman
	commit 4c39135aa412d2f1381e43802523da110ca7855c upstream. xHC in Wildcatpoint-LP PCH is similar to LynxPoint-LP and need the same quirks to prevent machines from spurious restart while shutting them down. Reported-by: Hasan Mahmood <hasan.mahm@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	usb: increase ohci watchdog delay to 275 msec	Bryan Paluch
	commit ed6d6f8f42d7302f6f9b6245f34927ec20d26c12 upstream. Increase ohci watchout delay to 275 ms. Previous delay was 250 ms with 20 ms of slack, after removing slack time some ohci controllers don't respond in time. Logs from systems with controllers that have the issue would show "HcDoneHead not written back; disabled" Signed-off-by: Bryan Paluch <bryanpaluch@gmail.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	xhci: use default USB_RESUME_TIMEOUT when resuming ports.	Mathias Nyman
	commit 7d3b016a6f5a0fa610dfd02b05654c08fa4ae514 upstream. USB2 host inititated resume, and system suspend bus resume need to use the same USB_RESUME_TIMEOUT as elsewhere. This resolves a device disconnect issue at system resume seen on Intel Braswell and Apollolake, but is in no way limited to those platforms. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24	xhci: fix null pointer dereference in stop command timeout function	Mathias Nyman
	commit bcf42aa60c2832510b9be0f30c090bfd35bb172d upstream. The stop endpoint command has its own 5 second timeout timer. If the timeout function is triggered between USB3 and USB2 host removal it will try to call usb_hc_died(xhci_to_hcd(xhci)->primary_hcd) the ->primary_hcd will be set to NULL at USB3 hcd removal. Fix this by first checking if the PCI host is being removed, and also by using only xhci_to_hcd() as it will always return the primary hcd. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07	xhci: Make sure xhci handles USB_SPEED_SUPER_PLUS devices.	Mathias Nyman
	commit 0caf6b33452112e5a1186c8c964e90310e49e6bd upstream. In most cases the devices with the speed set to USB_SPEED_SUPER_PLUS are handled like regular SuperSpeed devices. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07	xhci: don't dereference a xhci member after removing xhci	Mathias Nyman
	commit f1f6d9a8b540df22b87a5bf6bc104edaade81f47 upstream. Remove the hcd after checking for the xhci last quirks, not before. This caused a hang on a Alpine Ridge xhci based maching which remove the whole xhci controller when unplugging the last usb device Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07	usb: xhci: Fix panic if disconnect	Jim Lin
	commit 88716a93766b8f095cdef37a8e8f2c93aa233b21 upstream. After a device is disconnected, xhci_stop_device() will be invoked in xhci_bus_suspend(). Also the "disconnect" IRQ will have ISR to invoke xhci_free_virt_device() in this sequence. xhci_irq -> xhci_handle_event -> handle_cmd_completion -> xhci_handle_cmd_disable_slot -> xhci_free_virt_device If xhci->devs[slot_id] has been assigned to NULL in xhci_free_virt_device(), then virt_dev->eps[i].ring in xhci_stop_device() may point to an invlid address to cause kernel panic. virt_dev = xhci->devs[slot_id]; : if (virt_dev->eps[i].ring && virt_dev->eps[i].ring->dequeue) [] Unable to handle kernel paging request at virtual address 00001a68 [] pgd=ffffffc001430000 [] [00001a68] pgd=000000013c807003, pud=000000013c807003, pmd=000000013c808003, pte=0000000000000000 [] Internal error: Oops: 96000006 [#1] PREEMPT SMP [] CPU: 0 PID: 39 Comm: kworker/0:1 Tainted: G U [] Workqueue: pm pm_runtime_work [] task: ffffffc0bc0e0bc0 ti: ffffffc0bc0ec000 task.ti: ffffffc0bc0ec000 [] PC is at xhci_stop_device.constprop.11+0xb4/0x1a4 This issue is found when running with realtek ethernet device (0bda:8153). Signed-off-by: Jim Lin <jilin@nvidia.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07	xhci: always handle "Command Ring Stopped" events	Mathias Nyman
	commit 33be126510974e2eb9679f1ca9bca4f67ee4c4c7 upstream. Fix "Command completion event does not match command" errors by always handling the command ring stopped events. The command ring stopped event is generated as a result of aborting or stopping the command ring with a register write. It is not caused by a command in the command queue, and thus won't have a matching command in the comman list. Solve it by handling the command ring stopped event before checking for a matching command. In most command time out cases we abort the command ring, and get a command ring stopped event. The events command pointer will point at the current command ring dequeue, which in most cases matches the timed out command in the command list, and no error messages are seen. If we instead get a command aborted event before the command ring stopped event, the abort event will increse the command ring dequeue pointer, and the following command ring stopped events command pointer will point at the next, not yet queued command. This case triggered the error message Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07	usb: ehci: change order of register cleanup during shutdown	Marc Ohlf
	commit bc337b51508beb2d039aff5074a76cfe1c212030 upstream. In ehci_turn_off_all_ports() all EHCI port registers are cleared to zero. On some hardware, this can lead to an system hang, when ehci_port_power() accesses the already cleared registers. This patch changes the order of cleanup. First call ehci_port_power() which respects the current bits in port status registers and afterwards cleanup the hard way by setting everything to zero. Signed-off-by: Marc Ohlf <ohlf@mkt-sys.de> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-10	USB: OHCI: Don't mark EDs as ED_OPER if scheduling fails	Michał Pecio
	commit c66f59ee5050447b3da92d36f5385a847990a894 upstream. Since ed_schedule begins with marking the ED as "operational", the ED may be left in such state even if scheduling actually fails. This allows future submission attempts to smuggle this ED to the hardware behind the scheduler's back and without linking it to the ohci->eds_in_use list. The former causes bandwidth saturation and data loss on isoc endpoints, the latter crashes the kernel when attempt is made to unlink such ED from this list. Fix ed_schedule to update ED state only on successful return. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-11	usb: host: ehci-tegra: Grab the correct UTMI pads reset	Thierry Reding
	commit f8a15a9650694feaa0dabf197b0c94d37cd3fb42 upstream. There are three EHCI controllers on Tegra SoCs, each with its own reset line. However, the first controller contains a set of UTMI configuration registers that are shared with its siblings. These registers will only be reset as part of the first controller's reset. For proper operation it must be ensured that the UTMI configuration registers are reset before any of the EHCI controllers are enabled, irrespective of the probe order. Commit a47cc24cd1e5 ("USB: EHCI: tegra: Fix probe order issue leading to broken USB") introduced code that ensures the first controller is always reset before setting up any of the controllers, and is never again reset afterwards. This code, however, grabs the wrong reset. Each EHCI controller has two reset controls attached: 1) the USB controller reset and 2) the UTMI pads reset (really the first controller's reset). In order to reset the UTMI pads registers the code must grab the second reset, but instead it grabbing the first. Fixes: a47cc24cd1e5 ("USB: EHCI: tegra: Fix probe order issue leading to broken USB") Acked-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-11	xhci: Fix handling timeouted commands on hosts in weird states.	Mathias Nyman
	commit 3425aa03f484d45dc21e0e791c2f6c74ea656421 upstream. If commands timeout we mark them for abortion, then stop the command ring, and turn the commands to no-ops and finally restart the command ring. If the host is working properly the no-op commands will finish and pending completions are called. If we notice the host is failing, driver clears the command ring and completes, deletes and frees all pending commands. There are two separate cases reported where host is believed to work properly but is not. In the first case we successfully stop the ring but no abort or stop command ring event is ever sent and host locks up. The second case is if a host is removed, command times out and driver believes the ring is stopped, and assumes it will be restarted, but actually ends up timing out on the same command forever. If one of the pending commands has the xhci->mutex held it will block xhci_stop() in the remove codepath which otherwise would cleanup pending commands. Add a check that clears all pending commands in case host is removed, or we are stuck timing out on the same command. Also restart the command timeout timer when stopping the command ring to ensure we recive an ring stop/abort event. Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-11	USB: xhci: Add broken streams quirk for Frescologic device id 1009	Hans de Goede
	commit d95815ba6a0f287213118c136e64d8c56daeaeab upstream. I got one of these cards for testing uas with, it seems that with streams it dma-s all over the place, corrupting memory. On my first tests it managed to dma over the BIOS of the motherboard somehow and completely bricked it. Tests on another motherboard show that it does work with streams disabled. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-11	usb: xhci-plat: properly handle probe deferral for devm_clk_get()	Thomas Petazzoni
	commit de95c40d5beaa47f6dc8fe9ac4159b4672b51523 upstream. On some platforms, the clocks might be registered by a platform driver. When this is the case, the clock platform driver may very well be probed after xhci-plat, in which case the first probe() invocation of xhci-plat will receive -EPROBE_DEFER as the return value of devm_clk_get(). The current code handles that as a normal error, and simply assumes that this means that the system doesn't have a clock for the XHCI controller, and continues probing without calling clk_prepare_enable(). Unfortunately, this doesn't work on systems where the XHCI controller does have a clock, but that clock is provided by another platform driver. In order to fix this situation, we handle the -EPROBE_DEFER error condition specially, and abort the XHCI controller probe(). It will be retried later automatically, the clock will be available, devm_clk_get() will succeed, and the probe() will continue with the clock prepared and enabled as expected. In practice, such issue is seen on the ARM64 Marvell 7K/8K platform, where the clocks are registered by a platform driver. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-11	xhci: Cleanup only when releasing primary hcd	Gabriel Krisman Bertazi
	commit 27a41a83ec54d0edfcaf079310244e7f013a7701 upstream. Under stress occasions some TI devices might not return early when reading the status register during the quirk invocation of xhci_irq made by usb_hcd_pci_remove. This means that instead of returning, we end up handling this interruption in the middle of a shutdown. Since xhci->event_ring has already been freed in xhci_mem_cleanup, we end up accessing freed memory, causing the Oops below. commit 8c24d6d7b09d ("usb: xhci: stop everything on the first call to xhci_stop") is the one that changed the instant in which we clean up the event queue when stopping a device. Before, we didn't call xhci_mem_cleanup at the first time xhci_stop is executed (for the shared HCD), instead, we only did it after the invocation for the primary HCD, much later at the removal path. The code flow for this oops looks like this: xhci_pci_remove() usb_remove_hcd(xhci->shared) xhci_stop(xhci->shared) xhci_halt() xhci_mem_cleanup(xhci); // Free the event_queue usb_hcd_pci_remove(primary) xhci_irq() // Access the event_queue if STS_EINT is set. Crash. xhci_stop() xhci_halt() // return early The fix modifies xhci_stop to only cleanup the xhci data when releasing the primary HCD. This way, we still have the event_queue configured when invoking xhci_irq. We still halt the device on the first call to xhci_stop, though. I could reproduce this issue several times on the mainline kernel by doing a bind-unbind stress test with a specific storage gadget attached. I also ran the same test over-night with my patch applied and didn't observe the issue anymore. [ 113.334124] Unable to handle kernel paging request for data at address 0x00000028 [ 113.335514] Faulting instruction address: 0xd00000000d4f767c [ 113.336839] Oops: Kernel access of bad area, sig: 11 [#1] [ 113.338214] SMP NR_CPUS=1024 NUMA PowerNV [c000000efe47ba90] c000000000720850 usb_hcd_irq+0x50/0x80 [c000000efe47bac0] c00000000073d328 usb_hcd_pci_remove+0x68/0x1f0 [c000000efe47bb00] d00000000daf0128 xhci_pci_remove+0x78/0xb0 [xhci_pci] [c000000efe47bb30] c00000000055cf70 pci_device_remove+0x70/0x110 [c000000efe47bb70] c00000000061c6bc __device_release_driver+0xbc/0x190 [c000000efe47bba0] c00000000061c7d0 device_release_driver+0x40/0x70 [c000000efe47bbd0] c000000000619510 unbind_store+0x120/0x150 [c000000efe47bc20] c0000000006183c4 drv_attr_store+0x64/0xa0 [c000000efe47bc60] c00000000039f1d0 sysfs_kf_write+0x80/0xb0 [c000000efe47bca0] c00000000039e14c kernfs_fop_write+0x18c/0x1f0 [c000000efe47bcf0] c0000000002e962c __vfs_write+0x6c/0x190 [c000000efe47bd90] c0000000002eab40 vfs_write+0xc0/0x200 [c000000efe47bde0] c0000000002ec85c SyS_write+0x6c/0x110 [c000000efe47be30] c000000000009260 system_call+0x38/0x108 Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Cc: Roger Quadros <rogerq@ti.com> Cc: joel@jms.id.au Reviewed-by: Roger Quadros <rogerq@ti.com> Tested-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04	xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers	Mathias Nyman
	commit 98d74f9ceaefc2b6c4a6440050163a83be0abede upstream. PCI hotpluggable xhci controllers such as some Alpine Ridge solutions will remove the xhci controller from the PCI bus when the last USB device is disconnected. Add a flag to indicate that the host is being removed to avoid queueing configure_endpoint commands for the dropped endpoints. For PCI hotplugged controllers this will prevent 5 second command timeouts For static xhci controllers the configure_endpoint command is not needed in the removal case as everything will be returned, freed, and the controller is reset. For now the flag is only set for PCI connected host controllers. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04	usb: xhci: fix wild pointers in xhci_mem_cleanup	Lu Baolu
	commit 71504062a7c34838c3fccd92c447f399d3cb5797 upstream. This patch fixes some wild pointers produced by xhci_mem_cleanup. These wild pointers will cause system crash if xhci_mem_cleanup() is called twice. Reported-and-tested-by: Pengcheng Li <lpc.li@hisilicon.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04	xhci: resume USB 3 roothub first	Mathias Nyman
	commit 671ffdff5b13314b1fc65d62cf7604b873fb5dc4 upstream. Give USB3 devices a better chance to enumerate at USB 3 speeds if they are connected to a suspended host. Solves an issue with NEC uPD720200 host hanging when partially enumerating a USB3 device as USB2 after host controller runtime resume. Tested-by: Mike Murdoch <main.haarp@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04	usb: xhci: applying XHCI_PME_STUCK_QUIRK to Intel BXT B0 host	Rafal Redzimski
	commit 0d46faca6f887a849efb07c1655b5a9f7c288b45 upstream. Broxton B0 also requires XHCI_PME_STUCK_QUIRK. Adding PCI device ID for Broxton B and adding to quirk. Signed-off-by: Rafal Redzimski <rafal.f.redzimski@intel.com> Signed-off-by: Robert Dobrowolski <robert.dobrowolski@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25	xhci: Fix list corruption in urb dequeue at host removal	Mathias Nyman
	commit 5c82171167adb8e4ac77b91a42cd49fb211a81a0 upstream. xhci driver frees data for all devices, both usb2 and and usb3 the first time usb_remove_hcd() is called, including td_list and and xhci_ring structures. When usb_remove_hcd() is called a second time for the second xhci bus it will try to dequeue all pending urbs, and touches td_list which is already freed for that endpoint. Reported-by: Joe Lawrence <joe.lawrence@stratus.com> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25	Revert "xhci: don't finish a TD if we get a short-transfer event mid TD"	Mathias Nyman
	commit a6835090716a85f2297668ba593bd00e1051e662 upstream. This reverts commit e210c422b6fd ("xhci: don't finish a TD if we get a short transfer event mid TD") Turns out that most host controllers do not follow the xHCI specs and never send the second event for the last TRB in the TD if there was a short event mid-TD. Returning the URB directly after the first short-transfer event is far better than never returning the URB. (class drivers usually timeout after 30sec). For the hosts that do send the second event we will go back to treating it as misplaced event and print an error message for it. The origial patch was sent to stable kernels and needs to be reverted from there as well Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-17	usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Broxton-M platforms	Lu Baolu
	commit ccc04afb72cddbdf7c0e1c17e92886405a71b754 upstream. Intel Broxton M was verifed to require XHCI_PME_STUCK_QUIRK quirk as well. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-17	usb: xhci: handle both SSIC ports in PME stuck quirk	Lu Baolu
	commit fa89537783cb442263fa5a14df6c7693eaf32f11 upstream. Commit abce329c27b3 ("xhci: Workaround to get D3 working in Intel xHCI") adds a workaround for a limitation of PME storm caused by SSIC port in some Intel SoCs. This commit only handled one SSIC port, while there are actually two SSIC ports in the chips. This patch handles both SSIC ports. Without this fix, users still see PME storm. Signed-off-by: Zhuang Jin Can <jin.can.zhuang@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31	xhci: refuse loading if nousb is used	Oliver Neukum
	commit 1eaf35e4dd592c59041bc1ed3248c46326da1f5f upstream. The module should fail to load. Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-11	xhci: fix usb2 resume timing and races.	Mathias Nyman
	According to USB 2 specs ports need to signal resume for at least 20ms, in practice even longer, before moving to U0 state. Both host and devices can initiate resume. On device initiated resume, a port status interrupt with the port in resume state in issued. The interrupt handler tags a resume_done[port] timestamp with current time + USB_RESUME_TIMEOUT, and kick roothub timer. Root hub timer requests for port status, finds the port in resume state, checks if resume_done[port] timestamp passed, and set port to U0 state. On host initiated resume, current code sets the port to resume state, sleep 20ms, and finally sets the port to U0 state. This should also be changed to work in a similar way as the device initiated resume, with timestamp tagging, but that is not yet tested and will be a separate fix later. There are a few issues with this approach 1. A host initiated resume will also generate a resume event. The event handler will find the port in resume state, believe it's a device initiated resume, and act accordingly. 2. A port status request might cut the resume signalling short if a get_port_status request is handled during the host resume signalling. The port will be found in resume state. The timestamp is not set leading to time_after_eq(jiffies, timestamp) returning true, as timestamp = 0. get_port_status will proceed with moving the port to U0. 3. If an error, or anything else happens to the port during device initiated resume signalling it will leave all the device resume parameters hanging uncleared, preventing further suspend, returning -EBUSY, and cause the pm thread to busyloop trying to enter suspend. Fix this by using the existing resuming_ports bitfield to indicate that resume signalling timing is taken care of. Check if the resume_done[port] is set before using it for timestamp comparison, and also clear out any resume signalling related variables if port is not in U0 or Resume state This issue was discovered when a PM thread busylooped, trying to runtime suspend the xhci USB 2 roothub on a Dell XPS Cc: stable <stable@vger.kernel.org> Reported-by: Daniel J Blueman <daniel@quora.org> Tested-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-04	USB: host: ohci-at91: fix a crash in ohci_hcd_at91_overcurrent_irq	Alexandre Belloni
	The interrupt handler, ohci_hcd_at91_overcurrent_irq may be called right after registration. At that time, pdev->dev.platform_data is not yet set, leading to a NULL pointer dereference. Fixes: e4df92279fd9 (USB: host: ohci-at91: merge loops in ohci_hcd_at91_drv_probe) Reported-by: Peter Rosin <peda@axentia.se> Tested-by: Peter Rosin <peda@axentia.se> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Cc: stable@vger.kernel.org # 4.3+ Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-04	usb: xhci: fix config fail of FS hub behind a HS hub with MTT	Chunfeng Yun
	if a full speed hub connects to a high speed hub which supports MTT, the MTT field of its slot context will be set to 1 when xHCI driver setups an xHCI virtual device in xhci_setup_addressable_virt_dev(); once usb core fetch its hub descriptor, and need to update the xHC's internal data structures for the device, the HUB field of its slot context will be set to 1 too, meanwhile MTT is also set before, this will cause configure endpoint command fail, so in the case, we should clear MTT to 0 for full speed hub according to section 6.2.2 Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-04	xhci: Fix memory leak in xhci_pme_acpi_rtd3_enable()	Mika Westerberg
	There is a memory leak because acpi_evaluate_dsm() actually returns an object which the caller is supposed to release. Fix this by calling ACPI_FREE() for the returned object (this expands to kfree() so passing NULL there is fine as well). While there correct indentation in !CONFIG_ACPI case. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable <stable@vger.kernel.org> # v4.2 Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-01	USB: whci-hcd: add check for dma mapping error	Alexey Khoroshilov
	qset_fill_page_list() do not check for dma mapping errors. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-11-18	xhci: Fix a race in usb2 LPM resume, blocking U3 for usb2 devices	Mathias Nyman
	Clear device initiated resume variables once device is fully up and running in U0 state. Resume needs to be signaled for 20ms for usb2 devices before they can be moved to U0 state. An interrupt is triggered if a device initiates resume. As we handle the event in interrupt context we can not sleep for 20ms, so we instead set a resume flag, a timestamp, and start the roothub polling. The roothub code will later move the port to U0 when it finds a port in resume state with the resume flag set, and timestamp passed by 20ms. A host initiated resume is however not done in interrupt context, and host initiated resume code will directly signal resume, wait 20ms and then move the port to U0. These two codepaths can race, if we are in the middle of a host initated resume, while sleeping for 20ms, we may handle a port event and find the port in resume state. The port event handling code will assume the resume was device initiated and set the resume flag and timestamp. Root hub code will however not catch the port in resume state again as the host initated resume code has already moved the port to U0. The resume flag and timestamp will remain set for this port preventing port from suspending again (LPM setting port to U3) Fix this for now by always clearing the device initated resume parameters once port is in U0 Cc: stable <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-11-18	usb: xhci: fix checking ep busy for CFC	Lu Baolu
	Function ep_ring_is_processing() checks the dequeue pointer in endpoint context to know whether an endpoint is busy with processing TRBs. This is not correct since dequeue pointer field in an endpoint context is only valid when the endpoint is in Halted or Stopped states. This buggy code causes audio noise when playing sound with USB headset connected to host controllers which support CFC (one of xhci 1.1 features). This patch should exist in stable kernel since v4.3. Reported-and-tested-by: YD Tseng <yd_tseng@asmedia.com.tw> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: stable <stable@vger.kernel.org> # v4.3 Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-11-18	xhci: Workaround to get Intel xHCI reset working more reliably	Rajmohan Mani
	Existing Intel xHCI controllers require a delay of 1 mS, after setting the CMD_RESET bit in command register, before accessing any HC registers. This allows the HC to complete the reset operation and be ready for HC register access. Without this delay, the subsequent HC register access, may result in a system hang, very rarely. Verified CherryView / Braswell platforms go through over 5000 warm reboot cycles (which was not possible without this patch), without any xHCI reset hang. Signed-off-by: Rajmohan Mani <rajmohan.mani@intel.com> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-11-09	dma: remove external references to dma_supported	Christoph Hellwig
	Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-07	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge second patch-bomb from Andrew Morton: - most of the rest of MM - procfs - lib/ updates - printk updates - bitops infrastructure tweaks - checkpatch updates - nilfs2 update - signals - various other misc bits: coredump, seqfile, kexec, pidns, zlib, ipc, dma-debug, dma-mapping, ... * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (102 commits) ipc,msg: drop dst nil validation in copy_msg include/linux/zutil.h: fix usage example of zlib_adler32() panic: release stale console lock to always get the logbuf printed out dma-debug: check nents in dma_sync_sg* dma-mapping: tidy up dma_parms default handling pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode kexec: use file name as the output message prefix fs, seqfile: always allow oom killer seq_file: reuse string_escape_str() fs/seq_file: use seq_* helpers in seq_hex_dump() coredump: change zap_threads() and zap_process() to use for_each_thread() coredump: ensure all coredumping tasks have SIGNAL_GROUP_COREDUMP signal: remove jffs2_garbage_collect_thread()->allow_signal(SIGCONT) signal: introduce kernel_signal_stop() to fix jffs2_garbage_collect_thread() signal: turn dequeue_signal_lock() into kernel_dequeue_signal() signals: kill block_all_signals() and unblock_all_signals() nilfs2: fix gcc uninitialized-variable warnings in powerpc build nilfs2: fix gcc unused-but-set-variable warnings MAINTAINERS: nilfs2: add header file for tracing nilfs2: add tracepoints for analyzing reading and writing metadata files ...
2015-11-06	mm, page_alloc: distinguish between being unable to sleep, unwilling to ↵	Mel Gorman
	sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06	Merge tag 'asm-generic-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic cleanups from Arnd Bergmann: "The asm-generic changes for 4.4 are mostly a series from Christoph Hellwig to clean up various abuses of headers in there. The patch to rename the io-64-nonatomic-.h headers caused some conflicts with new users, so I added a workaround that we can remove in the next merge window. The only other patch is a warning fix from Marek Vasut" tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: temporarily add back asm-generic/io-64-nonatomic.h asm-generic: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations gpio-mxc: stop including <asm-generic/bug> n_tracesink: stop including <asm-generic/bug> n_tracerouter: stop including <asm-generic/bug> mlx5: stop including <asm-generic/kmap_types.h> hifn_795x: stop including <asm-generic/kmap_types.h> drbd: stop including <asm-generic/kmap_types.h> move count_zeroes.h out of asm-generic move io-64-nonatomic.h out of asm-generic
2015-11-05	Merge tag 'spi-v4.4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi updates from Mark Brown: "Quite a lot of activity in SPI this cycle, almost all of it in drivers with a few minor improvements and tweaks in the core. - Updates to pxa2xx to support Intel Broxton and multiple chip selects. - Support for big endian in the bcm63xx driver. - Multiple slave support for the mt8173 - New driver for the auxiliary SPI controller in bcm2835 SoCs. - Support for Layerscale SoCs in the Freescale DSPI driver" * tag 'spi-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (87 commits) spi: pxa2xx: Rework self-initiated platform data creation for non-ACPI spi: pxa2xx: Add support for Intel Broxton spi: pxa2xx: Detect number of enabled Intel LPSS SPI chip select signals spi: pxa2xx: Add output control for multiple Intel LPSS chip selects spi: pxa2xx: Use LPSS prefix for defines that are Intel LPSS specific spi: Add DSPI support for layerscape family spi: ti-qspi: improve ->remove() callback spi/spi-xilinx: Fix race condition on last word read spi: Drop owner assignment from spi_drivers spi: Add THIS_MODULE to spi_driver in SPI core spi: Setup the master controller driver before setting the chipselect spi: dw: replace magic constant by DW_SPI_DR spi: mediatek: mt8173 spi multiple devices support spi: mediatek: handle controller_data in mtk_spi_setup spi: mediatek: remove mtk_spi_config spi: mediatek: Update document devicetree bindings to support multiple devices spi: fix kernel-doc warnings about missing return desc in spi.c spi: fix kernel-doc warnings about missing return desc in spi.h spi: pxa2xx: Align a few defines spi: pxa2xx: Save other reg_cs_ctrl bits when configuring chip select ...