linux-toradex.git/kernel/workqueue.c, branch v2.6.32.60

usb: Fix deadlock in hid_reset when Dell iDRAC is reset

2012-10-07T21:37:12+00:00

This was fixed upstream by commit e22bee782b3b00bd4534ae9b1c5fb2e8e6573c5c
('workqueue: implement concurrency managed dynamic worker pool'), but
that is far too large a change for stable.

When Dell iDRAC is reset, the iDRAC's USB keyboard/mouse device stops
responding but is not actually disconnected.  This causes usbhid to
hid hid_io_error(), and you get a chain of calls like...

hid_reset()
 usb_reset_device()
  usb_reset_and_verify_device()
   usb_ep0_reinit()
    usb_disble_endpoint()
     usb_hcd_disable_endpoint()
      ehci_endpoint_disable()

Along the way, as a result of an error/timeout with a USB transaction,
ehci_clear_tt_buffer() calls usb_hub_clear_tt_buffer() (to clear a failed
transaction out of the transaction translator in the hub), which schedules
hub_tt_work() to be run (using keventd), and then sets qh->clearing_tt=1 so
that nobody will mess with that qh until the TT is cleared.

But run_workqueue() never happens for the keventd workqueue on that CPU, so
hub_tt_work() never gets run.  And qh->clearing_tt never gets changed back to
0.

This causes ehci_endpoint_disable() to get stuck in a loop waiting for
qh->clearing_tt to go to 0.

Part of the problem is hid_reset() is itself running on keventd.  So
when that thread gets a timeout trying to talk to the HID device, it
schedules clear_work (to run hub_tt_work) to run, and then gets stuck
in ehci_endpoint_disable waiting for it to run.

However, clear_work never gets run because the workqueue for that CPU
is still waiting for hid_reset to finish.

A much less invasive patch for earlier kernels is to just schedule
clear_work on khubd if the usb code needs to clear the TT and it sees
that it is already running on keventd.  Khubd isn't used by default
because it can get blocked by device enumeration sometimes, but I
think it should be ok for a backup for unusual cases like this just to
prevent deadlock.

Signed-off-by: Stuart Hayes 
Signed-off-by: Shyam Iyer 
[bwh: Use current_is_keventd() rather than checking current->{flags,comm}]
Signed-off-by: Ben Hutchings 
Signed-off-by: Willy Tarreau

workqueue: fix race condition in schedule_on_each_cpu()

2009-11-18T01:40:33+00:00

Commit 65a64464349883891e21e74af16c05d6e1eeb4e9 ("HWPOISON: Allow
schedule_on_each_cpu() from keventd") which allows schedule_on_each_cpu()
to be called from keventd added a race condition.  schedule_on_each_cpu()
may race with cpu hotplug and end up executing the function twice on a
cpu.

Fix it by moving direct execution into the section protected with
get/put_online_cpus().  While at it, update code such that direct
execution is done after works have been scheduled for all other cpus and
drop unnecessary cpu != orig test from flush loop.

Signed-off-by: Tejun Heo 
Cc: Andi Kleen 
Acked-by: Oleg Nesterov 
Cc: Ingo Molnar 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

Merge branch 'hwpoison-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6

2009-10-29T15:20:00+00:00

* 'hwpoison-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6:
  HWPOISON: fix invalid page count in printk output
  HWPOISON: Allow schedule_on_each_cpu() from keventd
  HWPOISON: fix/proc/meminfo alignment
  HWPOISON: fix oops on ksm pages
  HWPOISON: Fix page count leak in hwpoison late kill in do_swap_page
  HWPOISON: return early on non-LRU pages
  HWPOISON: Add brief hwpoison description to Documentation
  HWPOISON: Clean up PR_MCE_KILL interface

HWPOISON: Allow schedule_on_each_cpu() from keventd

2009-10-19T05:29:22+00:00

Right now when calling schedule_on_each_cpu() from keventd there
is a deadlock because it tries to schedule a work item on the current CPU
too. This happens via lru_add_drain_all() in hwpoison.

Just call the function for the current CPU in this case. This is actually
faster too.

Debugging with Fengguang Wu & Max Asbock

Signed-off-by: Andi Kleen

workqueue: add 'flush_delayed_work()' to run and wait for delayed work

2009-10-14T22:11:35+00:00

It basically turns a delayed work into an immediate work, and then waits
for it to finish, thus allowing you to force (and wait for) an immediate
flush of a delayed work.

We'll want to use this in the tty layer to clean up tty_flush_to_ldisc().

Acked-by: Oleg Nesterov 
[ Fixed to use 'del_timer_sync()' as noted by Oleg ]
Signed-off-by: Linus Torvalds

Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

2009-09-11T20:23:18+00:00

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (64 commits)
  sched: Fix sched::sched_stat_wait tracepoint field
  sched: Disable NEW_FAIR_SLEEPERS for now
  sched: Keep kthreads at default priority
  sched: Re-tune the scheduler latency defaults to decrease worst-case latencies
  sched: Turn off child_runs_first
  sched: Ensure that a child can't gain time over it's parent after fork()
  sched: enable SD_WAKE_IDLE
  sched: Deal with low-load in wake_affine()
  sched: Remove short cut from select_task_rq_fair()
  sched: Turn on SD_BALANCE_NEWIDLE
  sched: Clean up topology.h
  sched: Fix dynamic power-balancing crash
  sched: Remove reciprocal for cpu_power
  sched: Try to deal with low capacity, fix update_sd_power_savings_stats()
  sched: Try to deal with low capacity
  sched: Scale down cpu_power due to RT tasks
  sched: Implement dynamic cpu_power
  sched: Add smt_gain
  sched: Update the cpu_power sum during load-balance
  sched: Add SD_PREFER_SIBLING
  ...

sched: Keep kthreads at default priority

2009-09-09T15:30:06+00:00

Removes kthread/workqueue priority boost, they increase worst-case
desktop latencies.

Signed-off-by: Mike Galbraith 
Acked-by: Peter Zijlstra 
LKML-Reference: <1252486344.28645.18.camel@marge.simson.net>
Signed-off-by: Ingo Molnar

workqueues: Improve schedule_work() documentation

2009-08-04T13:21:16+00:00

Two important aspects of the schedule_work() function are not
yet documented:

 - that it is allowed to pass a struct work_struct * to this
   function that is already on the kernel-global workqueue;

 - the meaning of its return value.

The patch below documents both aspects.

Signed-off-by: Bart Van Assche 
Cc: "Greg Kroah-Hartman" 
Cc: Andrew Morton 
LKML-Reference: <200907301900.54202.bart.vanassche@gmail.com>
Signed-off-by: Ingo Molnar

ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro

2009-06-01T23:10:40+00:00

v3: zhaolei@cn.fujitsu.com: Change TRACE_EVENT definition to new format
    introduced by Steven Rostedt: consolidate trace and trace_event headers
v2: kosaki@jp.fujitsu.com: print the function names instead of addr, and zap
    the work addr
v1: zhaolei@cn.fujitsu.com: Make workqueue tracepoints use TRACE_EVENT macro

TRACE_EVENT is a more generic way to define tracepoints.
Doing so adds these new capabilities to the tracepoints:

  - zero-copy and per-cpu splice() tracing
  - binary tracing without printf overhead
  - structured logging records exposed under /debug/tracing/events
  - trace events embedded in function tracer output and other plugins
  - user-defined, per tracepoint filter expressions

Then, this patch converts DEFINE_TRACE to TRACE_EVENT in workqueue related
tracepoints.

[ Impact: expand workqueue tracer to events tracing ]

Signed-off-by: Zhao Lei 
Cc: Steven Rostedt 
Cc: Tom Zanussi 
Cc: Oleg Nesterov 
Cc: Andrew Morton 
Signed-off-by: KOSAKI Motohiro 
Signed-off-by: Frederic Weisbecker

work_on_cpu(): rewrite it to create a kernel thread on demand

2009-04-09T00:20:37+00:00

Impact: circular locking bugfix

The various implemetnations and proposed implemetnations of work_on_cpu()
are vulnerable to various deadlocks because they all used queues of some
form.

Unrelated pieces of kernel code thus gained dependencies wherein if one
work_on_cpu() caller holds a lock which some other work_on_cpu() callback
also takes, the kernel could rarely deadlock.

Fix this by creating a short-lived kernel thread for each work_on_cpu()
invokation.

This is not terribly fast, but the only current caller of work_on_cpu() is
pci_call_probe().

It would be nice to find some other way of doing the node-local
allocations in the PCI probe code so that we can zap work_on_cpu()
altogether.  The code there is rather nasty.  I can't think of anything
simple at this time...

Cc: Ingo Molnar 
Signed-off-by: Andrew Morton 
Signed-off-by: Rusty Russell