linux-toradex.git/kernel, branch v3.10.73

workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE

2015-03-26T14:00:58+00:00

commit 8603e1b30027f943cc9c1eef2b291d42c3347af1 upstream.

cancel[_delayed]_work_sync() are implemented using
__cancel_work_timer() which grabs the PENDING bit using
try_to_grab_pending() and then flushes the work item with PENDING set
to prevent the on-going execution of the work item from requeueing
itself.

try_to_grab_pending() can always grab PENDING bit without blocking
except when someone else is doing the above flushing during
cancelation.  In that case, try_to_grab_pending() returns -ENOENT.  In
this case, __cancel_work_timer() currently invokes flush_work().  The
assumption is that the completion of the work item is what the other
canceling task would be waiting for too and thus waiting for the same
condition and retrying should allow forward progress without excessive
busy looping

Unfortunately, this doesn't work if preemption is disabled or the
latter task has real time priority.  Let's say task A just got woken
up from flush_work() by the completion of the target work item.  If,
before task A starts executing, task B gets scheduled and invokes
__cancel_work_timer() on the same work item, its try_to_grab_pending()
will return -ENOENT as the work item is still being canceled by task A
and flush_work() will also immediately return false as the work item
is no longer executing.  This puts task B in a busy loop possibly
preventing task A from executing and clearing the canceling state on
the work item leading to a hang.

task A			task B			worker

						executing work
__cancel_work_timer()
  try_to_grab_pending()
  set work CANCELING
  flush_work()
    block for work completion
						completion, wakes up A
			__cancel_work_timer()
			while (forever) {
			  try_to_grab_pending()
			    -ENOENT as work is being canceled
			  flush_work()
			    false as work is no longer executing
			}

This patch removes the possible hang by updating __cancel_work_timer()
to explicitly wait for clearing of CANCELING rather than invoking
flush_work() after try_to_grab_pending() fails with -ENOENT.

Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com

v3: bit_waitqueue() can't be used for work items defined in vmalloc
    area.  Switched to custom wake function which matches the target
    work item and exclusive wait and wakeup.

v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if
    the target bit waitqueue has wait_bit_queue's on it.  Use
    DEFINE_WAIT_BIT() and __wake_up_bit() instead.  Reported by Tomeu
    Vizoso.

Signed-off-by: Tejun Heo 
Reported-by: Rabin Vincent 
Cc: Tomeu Vizoso 
Tested-by: Jesper Nilsson 
Tested-by: Rabin Vincent 
Signed-off-by: Greg Kroah-Hartman

PM / QoS: remove duplicate call to pm_qos_update_target

2015-03-18T12:22:28+00:00

In 3.10.y backport patch 1dba303727f52ea062580b0a9b3f0c3b462769cf,
the logic to call pm_qos_update_target was moved to __pm_qos_update_request.
However, the original code was left in function pm_qos_update_request.

Currently, if pm_qos_update_request is called where new_value !=
req->node.prio then pm_qos_update_target will be called twice in a row.
Once in pm_qos_update_request and then again in the following call to
_pm_qos_update_request.

Removing the left over code from pm_qos_update_request stops this second
call to pm_qos_update_target where the work of removing / re-adding the
new_value in the constraints list would be duplicated.

Signed-off-by: Michael Scott 
Signed-off-by: Greg Kroah-Hartman

ntp: Fixup adjtimex freq validation on 32-bit systems

2015-03-06T22:40:52+00:00

commit 29183a70b0b828500816bd794b3fe192fce89f73 upstream.

Additional validation of adjtimex freq values to avoid
potential multiplication overflows were added in commit
5e5aeb4367b (time: adjtimex: Validate the ADJ_FREQUENCY values)

Unfortunately the patch used LONG_MAX/MIN instead of
LLONG_MAX/MIN, which was fine on 64-bit systems, but being
much smaller on 32-bit systems caused false positives
resulting in most direct frequency adjustments to fail w/
EINVAL.

ntpd only does direct frequency adjustments at startup, so
the issue was not as easily observed there, but other time
sync applications like ptpd and chrony were more effected by
the bug.

See bugs:

  https://bugzilla.kernel.org/show_bug.cgi?id=92481
  https://bugzilla.redhat.com/show_bug.cgi?id=1188074

This patch changes the checks to use LLONG_MAX for
clarity, and additionally the checks are disabled
on 32-bit systems since LLONG_MAX/PPM_SCALE is always
larger then the 32-bit long freq value, so multiplication
overflows aren't possible there.

Reported-by: Josh Boyer 
Reported-by: George Joseph 
Tested-by: George Joseph 
Signed-off-by: John Stultz 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Linus Torvalds 
Cc: Sasha Levin 
Link: http://lkml.kernel.org/r/1423553436-29747-1-git-send-email-john.stultz@linaro.org
[ Prettified the changelog and the comments a bit. ]
Signed-off-by: Ingo Molnar 
Signed-off-by: Greg Kroah-Hartman

kdb: fix incorrect counts in KDB summary command output

2015-03-06T22:40:52+00:00

commit 146755923262037fc4c54abc28c04b1103f3cc51 upstream.

The output of KDB 'summary' command should report MemTotal, MemFree
and Buffers output in kB. Current codes report in unit of pages.

A define of K(x) as
is defined in the code, but not used.

This patch would apply the define to convert the values to kB.
Please include me on Cc on replies. I do not subscribe to linux-kernel.

Signed-off-by: Jay Lan 
Signed-off-by: Jason Wessel 
Signed-off-by: Greg Kroah-Hartman

tracing: Fix unmapping loop in tracing_mark_write

2015-03-06T22:40:50+00:00

commit 7215853e985a4bef1a6c14e00e89dfec84f1e457 upstream.

Commit 6edb2a8a385f0cdef51dae37ff23e74d76d8a6ce introduced
an array map_pages that contains the addresses returned by
kmap_atomic. However, when unmapping those pages, map_pages[0]
is unmapped before map_pages[1], breaking the nesting requirement
as specified in the documentation for kmap_atomic/kunmap_atomic.

This was caught by the highmem debug code present in kunmap_atomic.
Fix the loop to do the unmapping properly.

Link: http://lkml.kernel.org/r/1418871056-6614-1-git-send-email-markivx@codeaurora.org

Reviewed-by: Stephen Boyd 
Reported-by: Lime Yang 
Signed-off-by: Vikram Mulukutla 
Signed-off-by: Steven Rostedt 
Signed-off-by: Greg Kroah-Hartman

smpboot: Add missing get_online_cpus() in smpboot_register_percpu_thread()

2015-02-11T06:48:17+00:00

commit 4bee96860a65c3a62d332edac331b3cf936ba3ad upstream.

The following race exists in the smpboot percpu threads management:

CPU0	      	   	     CPU1
cpu_up(2)
  get_online_cpus();
  smpboot_create_threads(2);
			     smpboot_register_percpu_thread();
			     for_each_online_cpu();
			       __smpboot_create_thread();
  __cpu_up(2);

This results in a missing per cpu thread for the newly onlined cpu2 and
in a NULL pointer dereference on a consecutive offline of that cpu.

Proctect smpboot_register_percpu_thread() with get_online_cpus() to
prevent that.

[ tglx: Massaged changelog and removed the change in
        smpboot_unregister_percpu_thread() because that's an
        optimization and therefor not stable material. ]

Signed-off-by: Lai Jiangshan 
Cc: Thomas Gleixner 
Cc: Rusty Russell 
Cc: Peter Zijlstra 
Cc: Srivatsa S. Bhat 
Cc: David Rientjes 
Link: http://lkml.kernel.org/r/1406777421-12830-1-git-send-email-laijs@cn.fujitsu.com
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman

workqueue: fix subtle pool management issue which can stall whole worker_pool

2015-02-06T06:35:40+00:00

commit 29187a9eeaf362d8422e62e17a22a6e115277a49 upstream.

A worker_pool's forward progress is guaranteed by the fact that the
last idle worker assumes the manager role to create more workers and
summon the rescuers if creating workers doesn't succeed in timely
manner before proceeding to execute work items.

This manager role is implemented in manage_workers(), which indicates
whether the worker may proceed to work item execution with its return
value.  This is necessary because multiple workers may contend for the
manager role, and, if there already is a manager, others should
proceed to work item execution.

Unfortunately, the function also indicates that the worker may proceed
to work item execution if need_to_create_worker() is false at the head
of the function.  need_to_create_worker() tests the following
conditions.

	pending work items && !nr_running && !nr_idle

The first and third conditions are protected by pool->lock and thus
won't change while holding pool->lock; however, nr_running can change
asynchronously as other workers block and resume and while it's likely
to be zero, as someone woke this worker up in the first place, some
other workers could have become runnable inbetween making it non-zero.

If this happens, manage_worker() could return false even with zero
nr_idle making the worker, the last idle one, proceed to execute work
items.  If then all workers of the pool end up blocking on a resource
which can only be released by a work item which is pending on that
pool, the whole pool can deadlock as there's no one to create more
workers or summon the rescuers.

This patch fixes the problem by removing the early exit condition from
maybe_create_worker() and making manage_workers() return false iff
there's already another manager, which ensures that the last worker
doesn't start executing work items.

We can leave the early exit condition alone and just ignore the return
value but the only reason it was put there is because the
manage_workers() used to perform both creations and destructions of
workers and thus the function may be invoked while the pool is trying
to reduce the number of workers.  Now that manage_workers() is called
only when more workers are needed, the only case this early exit
condition is triggered is rare race conditions rendering it pointless.

Tested with simulated workload and modified workqueue code which
trigger the pool deadlock reliably without this patch.

Signed-off-by: Tejun Heo 
Reported-by: Eric Sandeen 
Link: http://lkml.kernel.org/g/54B019F4.8030009@sandeen.net
Cc: Dave Chinner 
Cc: Lai Jiangshan 
Signed-off-by: Greg Kroah-Hartman

time: adjtimex: Validate the ADJ_FREQUENCY values

2015-01-30T01:40:56+00:00

commit 5e5aeb4367b450a28f447f6d5ab57d8f2ab16a5f upstream.

Verify that the frequency value from userspace is valid and makes sense.

Unverified values can cause overflows later on.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Signed-off-by: Sasha Levin 
[jstultz: Fix up bug for negative values and drop redunent cap check]
Signed-off-by: John Stultz 
Signed-off-by: Greg Kroah-Hartman

time: settimeofday: Validate the values of tv from user

2015-01-30T01:40:56+00:00

commit 6ada1fc0e1c4775de0e043e1bd3ae9d065491aa5 upstream.

An unvalidated user input is multiplied by a constant, which can result in
an undefined behaviour for large values. While this is validated later,
we should avoid triggering undefined behaviour.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Signed-off-by: Sasha Levin 
[jstultz: include trivial milisecond->microsecond correction noticed
by Andy]
Signed-off-by: John Stultz 
Signed-off-by: Greg Kroah-Hartman

perf: Fix events installation during moving group

2015-01-16T14:59:03+00:00

commit 9fc81d87420d0d3fd62d5e5529972c0ad9eab9cc upstream.

We allow PMU driver to change the cpu on which the event
should be installed to. This happened in patch:

  e2d37cd213dc ("perf: Allow the PMU driver to choose the CPU on which to install events")

This patch also forces all the group members to follow
the currently opened events cpu if the group happened
to be moved.

This and the change of event->cpu in perf_install_in_context()
function introduced in:

  0cda4c023132 ("perf: Introduce perf_pmu_migrate_context()")

forces group members to change their event->cpu,
if the currently-opened-event's PMU changed the cpu
and there is a group move.

Above behaviour causes problem for breakpoint events,
which uses event->cpu to touch cpu specific data for
breakpoints accounting. By changing event->cpu, some
breakpoints slots were wrongly accounted for given
cpu.

Vinces's perf fuzzer hit this issue and caused following
WARN on my setup:

   WARNING: CPU: 0 PID: 20214 at arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0x142/0x150()
   Can't find any breakpoint slot
   [...]

This patch changes the group moving code to keep the event's
original cpu.

Reported-by: Vince Weaver 
Signed-off-by: Jiri Olsa 
Cc: Arnaldo Carvalho de Melo 
Cc: Frederic Weisbecker 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Vince Weaver 
Cc: Yan, Zheng 
Link: http://lkml.kernel.org/r/1418243031-20367-3-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar 
Signed-off-by: Greg Kroah-Hartman