linux-toradex.git/kernel/watchdog.c, branch v4.8-rc6

watchdog: don't run proc_watchdog_update if new value is same as old

2016-03-17T22:09:34+00:00

While working on a script to restore all sysctl params before a series of
tests I found that writing any value into
/proc/sys/kernel/{nmi_watchdog,soft_watchdog,watchdog,watchdog_thresh}
causes the handler to call proc_watchdog_update(), even when the value
written is the same as the current one.

  NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
  NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
  NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
  NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.

There doesn't appear to be a reason for doing this work every time a write
occurs, so only do it when the values change.
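
The idea can be sketched in user-space C. This is a minimal model, not the
kernel code: struct watchdog_model, model_update() and model_write() are
hypothetical names, and model_update() only counts how often the expensive
update path (proc_watchdog_update() in the kernel) would have run.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the watchdog parameter state. */
struct watchdog_model {
	int value;              /* current parameter value */
	unsigned int updates;   /* how often the expensive update ran */
};

/* Stand-in for proc_watchdog_update(): only counts invocations. */
static void model_update(struct watchdog_model *wd)
{
	wd->updates++;
}

/*
 * Model of a write to one of the /proc/sys/kernel knobs.
 * Returns true if the update path ran, false if it was skipped
 * because the new value equals the old one.
 */
static bool model_write(struct watchdog_model *wd, int new_value)
{
	int old_value = wd->value;

	wd->value = new_value;
	if (old_value == new_value)
		return false;

	model_update(wd);
	return true;
}
```

Writing the current value back, as a restore script would, leaves the
update counter untouched in this model; only a real change triggers the
update.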

Signed-off-by: Josh Hunt 
Acked-by: Don Zickus 
Reviewed-by: Aaron Tomlin 
Cc: Ulrich Obergfell 
Cc: [4.1.x+]
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

Merge branch 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

2016-01-12T02:53:13+00:00

Pull workqueue update from Tejun Heo:
 "Workqueue changes for v4.5.  One cleanup patch and three to improve
  the debuggability.

  Workqueue now has a stall detector which dumps workqueue state if any
  worker pool hasn't made forward progress over a certain amount of time
  (30s by default) and also triggers a warning if a workqueue which can
  be used in memory reclaim path tries to wait on something which can't
  be.

  These should make workqueue hangs a lot easier to debug."

* 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: simplify the apply_workqueue_attrs_locked()
  workqueue: implement lockup detector
  watchdog: introduce touch_softlockup_watchdog_sched()
  workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue

panic, x86: Allow CPUs to save registers even if looping in NMI context

2015-12-19T10:07:01+00:00

Currently, kdump_nmi_shootdown_cpus(), a subroutine of crash_kexec(),
sends an NMI IPI to CPUs which haven't called panic() to stop them,
save their register information and do some cleanups for crash dumping.
However, if such a CPU is infinitely looping in NMI context, we fail to
save its register information into the crash dump.

For example, this can happen when unknown NMIs are broadcast to all
CPUs as follows:

  CPU 0                             CPU 1
  ===========================       ==========================
  receive an unknown NMI
  unknown_nmi_error()
    panic()                         receive an unknown NMI
      spin_trylock(&panic_lock)     unknown_nmi_error()
      crash_kexec()                   panic()
                                        spin_trylock(&panic_lock)
                                        panic_smp_self_stop()
                                          infinite loop
        kdump_nmi_shootdown_cpus()
          issue NMI IPI -----------> blocked until IRET
                                          infinite loop...

Here, since CPU 1 is in NMI context, the second NMI from CPU 0 is
blocked until CPU 1 executes IRET. However, CPU 1 never executes IRET,
so the NMI is not handled and the callback function to save registers is
never called.

In practice, this can happen on some servers which broadcast NMIs to all
CPUs when the NMI button is pushed.

To save registers in this case, we need to:

  a) Return from NMI handler instead of looping infinitely
  or
  b) Call the callback function directly from the infinite loop

Inherently, a) is risky because NMI is also used to prevent corrupted
data from being propagated to devices.  So, we chose b).

This patch does the following:

1. Move the infinite looping of CPUs which haven't called panic() in NMI
   context (actually done by panic_smp_self_stop()) outside of panic() to
   enable us to refer to pt_regs. Please note that panic_smp_self_stop() is
   still used for normal context.

2. Call a callback of kdump_nmi_shootdown_cpus() directly to save
   registers and do some cleanups after setting waiting_for_crash_ipi, which
   is used for counting down the number of CPUs which handled the callback.

Signed-off-by: Hidehiro Kawai 
Acked-by: Michal Hocko 
Cc: Aaron Tomlin 
Cc: Andrew Morton 
Cc: Andy Lutomirski 
Cc: Baoquan He 
Cc: Chris Metcalf 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Don Zickus 
Cc: Eric Biederman 
Cc: Frederic Weisbecker 
Cc: Gobinda Charan Maji 
Cc: HATAYAMA Daisuke 
Cc: Hidehiro Kawai 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Javi Merino 
Cc: Jiang Liu 
Cc: Jonathan Corbet 
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: lkml 
Cc: Masami Hiramatsu 
Cc: Michal Nazarewicz 
Cc: Nicolas Iooss 
Cc: Oleg Nesterov 
Cc: Peter Zijlstra 
Cc: Prarit Bhargava 
Cc: Rasmus Villemoes 
Cc: Seth Jennings 
Cc: Stefan Lippers-Hollmann 
Cc: Steven Rostedt 
Cc: Thomas Gleixner 
Cc: Ulrich Obergfell 
Cc: Vitaly Kuznetsov 
Cc: Vivek Goyal 
Cc: Yasuaki Ishimatsu 
Link: http://lkml.kernel.org/r/20151210014628.25437.75256.stgit@softrs
[ Cleanup comments, fixup formatting. ]
Signed-off-by: Borislav Petkov 
Signed-off-by: Thomas Gleixner

panic, x86: Fix re-entrance problem due to panic on NMI

2015-12-19T10:07:00+00:00

If panic on NMI happens just after panic() on the same CPU, panic() is
called recursively. As a result, the kernel stalls after failing to
acquire panic_lock.

To avoid this problem, don't call panic() in NMI context if we've
already entered panic().

For that, introduce the nmi_panic() macro to reduce code duplication. In
the case of panic on NMI, don't return from NMI handlers if another CPU
has already panicked.
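
The decision described above can be modelled in user-space C with C11
atomics. This is a sketch under assumptions, not the code of the patch:
the owner variable, PANIC_OWNER_NONE, the enum and claim_panic() are all
hypothetical names, and a single compare-and-exchange stands in for
whatever mechanism the kernel uses to record which CPU entered panic()
first.

```c
#include <stdatomic.h>

#define PANIC_OWNER_NONE (-1)   /* hypothetical: no CPU has panicked yet */

enum panic_claim {
	PANIC_FIRST,      /* this CPU is the first to panic: call panic() */
	PANIC_REENTRANT,  /* this CPU is already in panic(): do not re-enter */
	PANIC_OTHER_CPU,  /* another CPU panicked: stay put, do not return */
};

/*
 * Try to record 'cpu' as the panicking CPU and report what the caller
 * should do next. The compare-and-exchange succeeds only for the first
 * caller; on failure 'expected' holds the CPU that got there first.
 */
static enum panic_claim claim_panic(atomic_int *owner, int cpu)
{
	int expected = PANIC_OWNER_NONE;

	if (atomic_compare_exchange_strong(owner, &expected, cpu))
		return PANIC_FIRST;

	return expected == cpu ? PANIC_REENTRANT : PANIC_OTHER_CPU;
}
```

In this model an NMI that arrives on the CPU already inside panic() gets
PANIC_REENTRANT and skips the second panic() call, which is the
re-entrance case the message describes.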

Signed-off-by: Hidehiro Kawai 
Acked-by: Michal Hocko 
Cc: Aaron Tomlin 
Cc: Andrew Morton 
Cc: Andy Lutomirski 
Cc: Baoquan He 
Cc: Chris Metcalf 
Cc: David Hildenbrand 
Cc: Don Zickus 
Cc: "Eric W. Biederman" 
Cc: Frederic Weisbecker 
Cc: Gobinda Charan Maji 
Cc: HATAYAMA Daisuke 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Javi Merino 
Cc: Jonathan Corbet 
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: lkml 
Cc: Masami Hiramatsu 
Cc: Michal Nazarewicz 
Cc: Nicolas Iooss 
Cc: Peter Zijlstra 
Cc: Prarit Bhargava 
Cc: Rasmus Villemoes 
Cc: Rusty Russell 
Cc: Seth Jennings 
Cc: Steven Rostedt 
Cc: Thomas Gleixner 
Cc: Ulrich Obergfell 
Cc: Vitaly Kuznetsov 
Cc: Vivek Goyal 
Link: http://lkml.kernel.org/r/20151210014626.25437.13302.stgit@softrs
[ Cleanup comments, fixup formatting. ]
Signed-off-by: Borislav Petkov 
Signed-off-by: Thomas Gleixner

workqueue: implement lockup detector

2015-12-08T16:29:47+00:00

Workqueue stalls can happen from a variety of usage bugs such as a
missing WQ_MEM_RECLAIM flag or a concurrency managed work item
indefinitely staying RUNNING.  These stalls can be extremely difficult
to hunt down because the usual warning mechanisms can't detect
workqueue stalls and the internal state is pretty opaque.

To alleviate the situation, this patch implements a workqueue lockup
detector.  It periodically monitors all worker_pools and, if any pool
has failed to make forward progress for longer than the threshold
duration, triggers a warning and dumps workqueue state as follows.

 BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 31s!
 Showing busy workqueues and worker pools:
 workqueue events: flags=0x0
   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=17/256
     pending: monkey_wrench_fn, e1000_watchdog, cache_reap, vmstat_shepherd, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, cgroup_release_agent
 workqueue events_power_efficient: flags=0x80
   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
     pending: check_lifetime, neigh_periodic_work
 workqueue cgroup_pidlist_destroy: flags=0x0
   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
     pending: cgroup_pidlist_destroy_work_fn
 ...

The detection mechanism is controlled through the kernel parameter
workqueue.watchdog_thresh and can be updated at runtime through the
sysfs module parameter file.
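
The periodic check can be sketched in user-space C as follows. This is an
illustrative model only: struct pool_model, its fields and count_stalled()
are hypothetical, times are plain seconds, and a pool is treated as
stalled when it has pending work and its last recorded progress is older
than the threshold.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one worker pool. */
struct pool_model {
	unsigned long last_progress;  /* time of last forward progress, in s */
	bool has_pending;             /* work items are waiting to run */
};

/*
 * Count the pools that have pending work but made no forward progress
 * for longer than 'thresh' seconds as of time 'now'.
 */
static size_t count_stalled(const struct pool_model *pools, size_t n,
			    unsigned long now, unsigned long thresh)
{
	size_t stalled = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pools[i].has_pending)
			continue;
		/* Guard against a timestamp that is ahead of 'now'. */
		if (now > pools[i].last_progress &&
		    now - pools[i].last_progress > thresh)
			stalled++;
	}
	return stalled;
}
```

With a 30 second threshold, a pool whose last progress was 31 seconds ago
is counted as stalled in this model, matching the "stuck for 31s" report
in the example output above.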

v2: Decoupled from softlockup control knobs.

Signed-off-by: Tejun Heo 
Acked-by: Don Zickus 
Cc: Ulrich Obergfell 
Cc: Michal Hocko 
Cc: Chris Mason 
Cc: Andrew Morton

watchdog: introduce touch_softlockup_watchdog_sched()

2015-12-08T16:29:42+00:00

touch_softlockup_watchdog() is used to tell the watchdog that a scheduler
stall is expected.  One group of callers is paths where the task may not
be able to yield for a long time, such as performing slow PIO to a
finicky device or coming out of suspend.  The other group accounts for
the scheduler and timer going idle.

For scheduler softlockup detection, there's no reason to distinguish
the two cases; however, a workqueue lockup detector is planned and it
can use the same signals from the former group while the latter would
spuriously prevent detection.  This patch introduces a new function
touch_softlockup_watchdog_sched() and converts the latter group to call
it instead.  For now, it just calls touch_softlockup_watchdog() and
there's no functional difference.
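
The split can be pictured with a small user-space model. All names
prefixed with model_ are hypothetical, and the counter merely records how
often the watchdog was touched; the point is only that the _sched variant
is a separate entry point that currently forwards to the original one.

```c
/* Hypothetical counter: how often the softlockup watchdog was touched. */
static unsigned int model_softlockup_touches;

/* Model of touch_softlockup_watchdog(): used by long non-yielding paths. */
static void model_touch_softlockup_watchdog(void)
{
	model_softlockup_touches++;
}

/*
 * Model of touch_softlockup_watchdog_sched(): used by the scheduler and
 * timer idle paths. For now it simply forwards, so behavior is unchanged,
 * but a later detector can treat the two groups of callers differently.
 */
static void model_touch_softlockup_watchdog_sched(void)
{
	model_touch_softlockup_watchdog();
}
```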

Signed-off-by: Tejun Heo 
Cc: Ulrich Obergfell 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Andrew Morton

kernel/watchdog.c: fix race between proc_watchdog_thresh() and watchdog_timer_fn()

2015-11-06T03:34:48+00:00

Theoretically it is possible that the watchdog timer expires right at the
time when a user sets 'watchdog_thresh' to zero (note: this disables the
lockup detectors).  In this scenario, the is_softlockup() function - which
is called by the timer - could produce a false positive.

Fix this by checking the current value of 'watchdog_thresh'.
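
The false positive and the fix can be illustrated with a user-space
sketch. The function names are hypothetical, times are plain seconds, and
the soft lockup limit of twice the threshold is an assumption made for
this model rather than something stated in the message.

```c
/*
 * Unchecked model: with thresh == 0 the limit collapses to touch_ts,
 * so any later timestamp is reported as a soft lockup.
 * Returns the stall duration, or 0 if no lockup is detected.
 */
static unsigned long model_is_softlockup_unchecked(unsigned long now,
						   unsigned long touch_ts,
						   unsigned int thresh)
{
	unsigned long limit = touch_ts + 2UL * thresh;  /* assumed limit */

	if (now > limit)
		return now - touch_ts;
	return 0;
}

/*
 * Checked model: look at the current threshold first. A value of zero
 * means the lockup detectors are disabled, so never report a lockup.
 */
static unsigned long model_is_softlockup(unsigned long now,
					 unsigned long touch_ts,
					 unsigned int thresh)
{
	if (thresh == 0)
		return 0;
	return model_is_softlockup_unchecked(now, touch_ts, thresh);
}
```

If the timer fires one second after the last touch while the threshold has
just been set to zero, the unchecked variant reports a one second "lockup"
and the checked variant reports none.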

Signed-off-by: Ulrich Obergfell 
Acked-by: Don Zickus 
Reviewed-by: Aaron Tomlin 
Cc: Ulrich Obergfell 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

kernel/watchdog.c: remove {get|put}_online_cpus() from watchdog_{park|unpark}_threads()

2015-11-06T03:34:48+00:00

watchdog_{park|unpark}_threads() are now called in code paths that protect
themselves against CPU hotplug, so {get|put}_online_cpus() calls are
redundant and can be removed.

Signed-off-by: Ulrich Obergfell 
Acked-by: Don Zickus 
Reviewed-by: Aaron Tomlin 
Cc: Ulrich Obergfell 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

kernel/watchdog.c: avoid races between /proc handlers and CPU hotplug

2015-11-06T03:34:48+00:00

The handler functions for watchdog parameters in /proc/sys/kernel do not
protect themselves against races with CPU hotplug.  Hence, theoretically
it is possible that a new watchdog thread is started on a hotplugged CPU
while a parameter is being modified, and the thread could thus use a
parameter value that is 'in transition'.

For example, if 'watchdog_thresh' is being set to zero (note: this
disables the lockup detectors) the thread would erroneously use the value
zero as the sample period.

To avoid such races and to keep the /proc handler code consistent, call

  {get|put}_online_cpus() in proc_watchdog_common()
  {get|put}_online_cpus() in proc_watchdog_thresh()
  {get|put}_online_cpus() in proc_watchdog_cpumask()

Signed-off-by: Ulrich Obergfell 
Acked-by: Don Zickus 
Reviewed-by: Aaron Tomlin 
Cc: Ulrich Obergfell 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

kernel/watchdog.c: avoid race between lockup detector suspend/resume and CPU hotplug

2015-11-06T03:34:48+00:00

The lockup detector suspend/resume interface that was introduced by
commit 8c073d27d7ad ("watchdog: introduce watchdog_suspend() and
watchdog_resume()") does not protect itself against races with CPU
hotplug.  Hence, theoretically it is possible that a new watchdog thread
is started on a hotplugged CPU while the lockup detector is suspended,
and the thread could thus interfere unexpectedly with the code that
requested to suspend the lockup detector.

Avoid the race by calling

  get_online_cpus() in lockup_detector_suspend()
  put_online_cpus() in lockup_detector_resume()
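
The effect of holding the hotplug reference across suspend and resume can
be modelled in single-threaded user-space C. Everything here is a
hypothetical simplification: struct hotplug_model and the model_ functions
are invented for illustration, and a hotplug attempt that would block in
the kernel simply fails in this model.

```c
#include <stdbool.h>

/* Hypothetical model of the CPU hotplug read-side reference count. */
struct hotplug_model {
	unsigned int readers;     /* outstanding get_online_cpus()-like refs */
	unsigned int cpus_added;  /* hotplug operations that were allowed */
};

static void model_get_online_cpus(struct hotplug_model *hp)
{
	hp->readers++;
}

static void model_put_online_cpus(struct hotplug_model *hp)
{
	if (hp->readers > 0)
		hp->readers--;
}

/* A CPU may only be brought up while no reference is held. */
static bool model_try_cpu_up(struct hotplug_model *hp)
{
	if (hp->readers > 0)
		return false;
	hp->cpus_added++;
	return true;
}

/* Suspend takes the reference and keeps it until resume. */
static void model_lockup_detector_suspend(struct hotplug_model *hp)
{
	model_get_online_cpus(hp);
	/* ... watchdog threads would be parked here ... */
}

static void model_lockup_detector_resume(struct hotplug_model *hp)
{
	/* ... watchdog threads would be unparked here ... */
	model_put_online_cpus(hp);
}
```

Between suspend and resume no CPU can come online in this model, so no new
watchdog thread can start while the detector is suspended.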

Signed-off-by: Ulrich Obergfell 
Acked-by: Don Zickus 
Reviewed-by: Aaron Tomlin 
Cc: Ulrich Obergfell 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds