<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/workqueue.c, branch v4.4.5</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Revert "workqueue: make sure delayed work run in local cpu"</title>
<updated>2016-03-03T23:07:27+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2016-02-09T21:11:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6684710434d07dfd7e512e5ecc77eefb5a30151e'/>
<id>6684710434d07dfd7e512e5ecc77eefb5a30151e</id>
<content type='text'>
commit 041bd12e272c53a35c54c13875839bcb98c999ce upstream.

This reverts commit 874bbfe600a660cba9c776b3957b1ce393151b76.

Workqueue used to implicity guarantee that work items queued without
explicit CPU specified are put on the local CPU.  Recent changes in
timer broke the guarantee and led to vmstat breakage which was fixed
by 176bed1de5bf ("vmstat: explicitly schedule per-cpu work on the CPU
we need it to run on").

vmstat is the most likely to expose the issue and it's quite possible
that there are other similar problems which are a lot more difficult
to trigger.  As a preventive measure, 874bbfe600a6 ("workqueue: make
sure delayed work run in local cpu") was applied to restore the local
CPU guarnatee.  Unfortunately, the change exposed a bug in timer code
which got fixed by 22b886dd1018 ("timers: Use proper base migration in
add_timer_on()").  Due to code restructuring, the commit couldn't be
backported beyond certain point and stable kernels which only had
874bbfe600a6 started crashing.

The local CPU guarantee was accidental more than anything else and we
want to get rid of it anyway.  As, with the vmstat case fixed,
874bbfe600a6 is causing more problems than it's fixing, it has been
decided to take the chance and officially break the guarantee by
reverting the commit.  A debug feature will be added to force foreign
CPU assignment to expose cases relying on the guarantee and fixes for
the individual cases will be backported to stable as necessary.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Fixes: 874bbfe600a6 ("workqueue: make sure delayed work run in local cpu")
Link: http://lkml.kernel.org/g/20160120211926.GJ10810@quack.suse.cz
Cc: Mike Galbraith &lt;umgwanakikbuti@gmail.com&gt;
Cc: Henrique de Moraes Holschuh &lt;hmh@hmh.eng.br&gt;
Cc: Daniel Bilik &lt;daniel.bilik@neosystem.cz&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Shaohua Li &lt;shli@fb.com&gt;
Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Daniel Bilik &lt;daniel.bilik@neosystem.cz&gt;
Cc: Jiri Slaby &lt;jslaby@suse.cz&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 041bd12e272c53a35c54c13875839bcb98c999ce upstream.

This reverts commit 874bbfe600a660cba9c776b3957b1ce393151b76.

Workqueue used to implicity guarantee that work items queued without
explicit CPU specified are put on the local CPU.  Recent changes in
timer broke the guarantee and led to vmstat breakage which was fixed
by 176bed1de5bf ("vmstat: explicitly schedule per-cpu work on the CPU
we need it to run on").

vmstat is the most likely to expose the issue and it's quite possible
that there are other similar problems which are a lot more difficult
to trigger.  As a preventive measure, 874bbfe600a6 ("workqueue: make
sure delayed work run in local cpu") was applied to restore the local
CPU guarnatee.  Unfortunately, the change exposed a bug in timer code
which got fixed by 22b886dd1018 ("timers: Use proper base migration in
add_timer_on()").  Due to code restructuring, the commit couldn't be
backported beyond certain point and stable kernels which only had
874bbfe600a6 started crashing.

The local CPU guarantee was accidental more than anything else and we
want to get rid of it anyway.  As, with the vmstat case fixed,
874bbfe600a6 is causing more problems than it's fixing, it has been
decided to take the chance and officially break the guarantee by
reverting the commit.  A debug feature will be added to force foreign
CPU assignment to expose cases relying on the guarantee and fixes for
the individual cases will be backported to stable as necessary.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Fixes: 874bbfe600a6 ("workqueue: make sure delayed work run in local cpu")
Link: http://lkml.kernel.org/g/20160120211926.GJ10810@quack.suse.cz
Cc: Mike Galbraith &lt;umgwanakikbuti@gmail.com&gt;
Cc: Henrique de Moraes Holschuh &lt;hmh@hmh.eng.br&gt;
Cc: Daniel Bilik &lt;daniel.bilik@neosystem.cz&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Shaohua Li &lt;shli@fb.com&gt;
Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Daniel Bilik &lt;daniel.bilik@neosystem.cz&gt;
Cc: Jiri Slaby &lt;jslaby@suse.cz&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup</title>
<updated>2016-03-03T23:07:27+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2016-02-03T18:54:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=21b34b4574f8619907eb751b37c1831bfa3f2440'/>
<id>21b34b4574f8619907eb751b37c1831bfa3f2440</id>
<content type='text'>
commit d6e022f1d207a161cd88e08ef0371554680ffc46 upstream.

When looking up the pool_workqueue to use for an unbound workqueue,
workqueue assumes that the target CPU is always bound to a valid NUMA
node.  However, currently, when a CPU goes offline, the mapping is
destroyed and cpu_to_node() returns NUMA_NO_NODE.

This has always been broken but hasn't triggered often enough before
874bbfe600a6 ("workqueue: make sure delayed work run in local cpu").
After the commit, workqueue forcifully assigns the local CPU for
delayed work items without explicit target CPU to fix a different
issue.  This widens the window where CPU can go offline while a
delayed work item is pending causing delayed work items dispatched
with target CPU set to an already offlined CPU.  The resulting
NUMA_NO_NODE mapping makes workqueue try to queue the work item on a
NULL pool_workqueue and thus crash.

While 874bbfe600a6 has been reverted for a different reason making the
bug less visible again, it can still happen.  Fix it by mapping
NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node().
This is a temporary workaround.  The long term solution is keeping CPU
-&gt; NODE mapping stable across CPU off/online cycles which is being
worked on.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Mike Galbraith &lt;umgwanakikbuti@gmail.com&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Len Brown &lt;len.brown@intel.com&gt;
Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com
Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d6e022f1d207a161cd88e08ef0371554680ffc46 upstream.

When looking up the pool_workqueue to use for an unbound workqueue,
workqueue assumes that the target CPU is always bound to a valid NUMA
node.  However, currently, when a CPU goes offline, the mapping is
destroyed and cpu_to_node() returns NUMA_NO_NODE.

This has always been broken but hasn't triggered often enough before
874bbfe600a6 ("workqueue: make sure delayed work run in local cpu").
After the commit, workqueue forcifully assigns the local CPU for
delayed work items without explicit target CPU to fix a different
issue.  This widens the window where CPU can go offline while a
delayed work item is pending causing delayed work items dispatched
with target CPU set to an already offlined CPU.  The resulting
NUMA_NO_NODE mapping makes workqueue try to queue the work item on a
NULL pool_workqueue and thus crash.

While 874bbfe600a6 has been reverted for a different reason making the
bug less visible again, it can still happen.  Fix it by mapping
NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node().
This is a temporary workaround.  The long term solution is keeping CPU
-&gt; NODE mapping stable across CPU off/online cycles which is being
worked on.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Mike Galbraith &lt;umgwanakikbuti@gmail.com&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Len Brown &lt;len.brown@intel.com&gt;
Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com
Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq</title>
<updated>2015-11-05T22:16:27+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-11-05T22:16:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e25ac7ddaae0e798f794cdaf9109bc71246110cd'/>
<id>e25ac7ddaae0e798f794cdaf9109bc71246110cd</id>
<content type='text'>
Pull workqueue update from Tejun Heo:
 "This pull request contains one patch to make an unbound worker pool
  allocated from the NUMA node containing it if such node exists.  As
  unbound worker pools are node-affine by default, this makes most pools
  allocated on the right node"

* 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Allocate the unbound pool using local node memory
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull workqueue update from Tejun Heo:
 "This pull request contains one patch to make an unbound worker pool
  allocated from the NUMA node containing it if such node exists.  As
  unbound worker pools are node-affine by default, this makes most pools
  allocated on the right node"

* 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Allocate the unbound pool using local node memory
</pre>
</div>
</content>
</entry>
<entry>
<title>workqueue: Allocate the unbound pool using local node memory</title>
<updated>2015-10-12T16:17:31+00:00</updated>
<author>
<name>Xunlei Pang</name>
<email>pang.xunlei@linaro.org</email>
</author>
<published>2015-10-09T03:53:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e2273584d3f33f7f2cfe6d7aaade0fa2f1cb3db5'/>
<id>e2273584d3f33f7f2cfe6d7aaade0fa2f1cb3db5</id>
<content type='text'>
Currently, get_unbound_pool() uses kzalloc() to allocate the
worker pool. Actually, we can use the right node to do the
allocation, achieving local memory access.

This patch selects target node first, and uses kzalloc_node()
instead.

Signed-off-by: Xunlei Pang &lt;pang.xunlei@linaro.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, get_unbound_pool() uses kzalloc() to allocate the
worker pool. Actually, we can use the right node to do the
allocation, achieving local memory access.

This patch selects target node first, and uses kzalloc_node()
instead.

Signed-off-by: Xunlei Pang &lt;pang.xunlei@linaro.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>workqueue: make sure delayed work run in local cpu</title>
<updated>2015-09-30T17:06:46+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2015-09-30T16:05:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=874bbfe600a660cba9c776b3957b1ce393151b76'/>
<id>874bbfe600a660cba9c776b3957b1ce393151b76</id>
<content type='text'>
My system keeps crashing with below message. vmstat_update() schedules a delayed
work in current cpu and expects the work runs in the cpu.
schedule_delayed_work() is expected to make delayed work run in local cpu. The
problem is timer can be migrated with NO_HZ. __queue_work() queues work in
timer handler, which could run in a different cpu other than where the delayed
work is scheduled. The end result is the delayed work runs in different cpu.
The patch makes __queue_delayed_work records local cpu earlier. Where the timer
runs doesn't change where the work runs with the change.

[   28.010131] ------------[ cut here ]------------
[   28.010609] kernel BUG at ../mm/vmstat.c:1392!
[   28.011099] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[   28.011860] Modules linked in:
[   28.012245] CPU: 0 PID: 289 Comm: kworker/0:3 Tainted: G        W4.3.0-rc3+ #634
[   28.013065] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
[   28.014160] Workqueue: events vmstat_update
[   28.014571] task: ffff880117682580 ti: ffff8800ba428000 task.ti: ffff8800ba428000
[   28.015445] RIP: 0010:[&lt;ffffffff8115f921&gt;]  [&lt;ffffffff8115f921&gt;]vmstat_update+0x31/0x80
[   28.016282] RSP: 0018:ffff8800ba42fd80  EFLAGS: 00010297
[   28.016812] RAX: 0000000000000000 RBX: ffff88011a858dc0 RCX:0000000000000000
[   28.017585] RDX: ffff880117682580 RSI: ffffffff81f14d8c RDI:ffffffff81f4df8d
[   28.018366] RBP: ffff8800ba42fd90 R08: 0000000000000001 R09:0000000000000000
[   28.019169] R10: 0000000000000000 R11: 0000000000000121 R12:ffff8800baa9f640
[   28.019947] R13: ffff88011a81e340 R14: ffff88011a823700 R15:0000000000000000
[   28.020071] FS:  0000000000000000(0000) GS:ffff88011a800000(0000)knlGS:0000000000000000
[   28.020071] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   28.020071] CR2: 00007ff6144b01d0 CR3: 00000000b8e93000 CR4:00000000000006f0
[   28.020071] Stack:
[   28.020071]  ffff88011a858dc0 ffff8800baa9f640 ffff8800ba42fe00ffffffff8106bd88
[   28.020071]  ffffffff8106bd0b 0000000000000096 0000000000000000ffffffff82f9b1e8
[   28.020071]  ffffffff829f0b10 0000000000000000 ffffffff81f18460ffff88011a81e340
[   28.020071] Call Trace:
[   28.020071]  [&lt;ffffffff8106bd88&gt;] process_one_work+0x1c8/0x540
[   28.020071]  [&lt;ffffffff8106bd0b&gt;] ? process_one_work+0x14b/0x540
[   28.020071]  [&lt;ffffffff8106c214&gt;] worker_thread+0x114/0x460
[   28.020071]  [&lt;ffffffff8106c100&gt;] ? process_one_work+0x540/0x540
[   28.020071]  [&lt;ffffffff81071bf8&gt;] kthread+0xf8/0x110
[   28.020071]  [&lt;ffffffff81071b00&gt;] ?kthread_create_on_node+0x200/0x200
[   28.020071]  [&lt;ffffffff81a6522f&gt;] ret_from_fork+0x3f/0x70
[   28.020071]  [&lt;ffffffff81071b00&gt;] ?kthread_create_on_node+0x200/0x200

Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: stable@vger.kernel.org # v2.6.31+
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
My system keeps crashing with below message. vmstat_update() schedules a delayed
work in current cpu and expects the work runs in the cpu.
schedule_delayed_work() is expected to make delayed work run in local cpu. The
problem is timer can be migrated with NO_HZ. __queue_work() queues work in
timer handler, which could run in a different cpu other than where the delayed
work is scheduled. The end result is the delayed work runs in different cpu.
The patch makes __queue_delayed_work records local cpu earlier. Where the timer
runs doesn't change where the work runs with the change.

[   28.010131] ------------[ cut here ]------------
[   28.010609] kernel BUG at ../mm/vmstat.c:1392!
[   28.011099] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[   28.011860] Modules linked in:
[   28.012245] CPU: 0 PID: 289 Comm: kworker/0:3 Tainted: G        W4.3.0-rc3+ #634
[   28.013065] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
[   28.014160] Workqueue: events vmstat_update
[   28.014571] task: ffff880117682580 ti: ffff8800ba428000 task.ti: ffff8800ba428000
[   28.015445] RIP: 0010:[&lt;ffffffff8115f921&gt;]  [&lt;ffffffff8115f921&gt;]vmstat_update+0x31/0x80
[   28.016282] RSP: 0018:ffff8800ba42fd80  EFLAGS: 00010297
[   28.016812] RAX: 0000000000000000 RBX: ffff88011a858dc0 RCX:0000000000000000
[   28.017585] RDX: ffff880117682580 RSI: ffffffff81f14d8c RDI:ffffffff81f4df8d
[   28.018366] RBP: ffff8800ba42fd90 R08: 0000000000000001 R09:0000000000000000
[   28.019169] R10: 0000000000000000 R11: 0000000000000121 R12:ffff8800baa9f640
[   28.019947] R13: ffff88011a81e340 R14: ffff88011a823700 R15:0000000000000000
[   28.020071] FS:  0000000000000000(0000) GS:ffff88011a800000(0000)knlGS:0000000000000000
[   28.020071] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   28.020071] CR2: 00007ff6144b01d0 CR3: 00000000b8e93000 CR4:00000000000006f0
[   28.020071] Stack:
[   28.020071]  ffff88011a858dc0 ffff8800baa9f640 ffff8800ba42fe00ffffffff8106bd88
[   28.020071]  ffffffff8106bd0b 0000000000000096 0000000000000000ffffffff82f9b1e8
[   28.020071]  ffffffff829f0b10 0000000000000000 ffffffff81f18460ffff88011a81e340
[   28.020071] Call Trace:
[   28.020071]  [&lt;ffffffff8106bd88&gt;] process_one_work+0x1c8/0x540
[   28.020071]  [&lt;ffffffff8106bd0b&gt;] ? process_one_work+0x14b/0x540
[   28.020071]  [&lt;ffffffff8106c214&gt;] worker_thread+0x114/0x460
[   28.020071]  [&lt;ffffffff8106c100&gt;] ? process_one_work+0x540/0x540
[   28.020071]  [&lt;ffffffff81071bf8&gt;] kthread+0xf8/0x110
[   28.020071]  [&lt;ffffffff81071b00&gt;] ?kthread_create_on_node+0x200/0x200
[   28.020071]  [&lt;ffffffff81a6522f&gt;] ret_from_fork+0x3f/0x70
[   28.020071]  [&lt;ffffffff81071b00&gt;] ?kthread_create_on_node+0x200/0x200

Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: stable@vger.kernel.org # v2.6.31+
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq</title>
<updated>2015-09-02T15:02:20+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-09-02T15:02:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7d3e2eb1786fafa186eb8bc276edef75ba69e023'/>
<id>7d3e2eb1786fafa186eb8bc276edef75ba69e023</id>
<content type='text'>
Pull workqueue updates from Tejun Heo:
 "Only three trivial changes for workqueue this time - doc, MAINTAINERS
  and EXPORT_SYMBOL updates"

* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix some docbook warnings
  workqueue: Make flush_workqueue() available again to non GPL modules
  workqueue: add myself as a dedicated reviwer
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull workqueue updates from Tejun Heo:
 "Only three trivial changes for workqueue this time - doc, MAINTAINERS
  and EXPORT_SYMBOL updates"

* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix some docbook warnings
  workqueue: Make flush_workqueue() available again to non GPL modules
  workqueue: add myself as a dedicated reviwer
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2015-09-01T03:26:22+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-09-01T03:26:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a1d8561172f369ba56d636df49a6b4d6d77e2123'/>
<id>a1d8561172f369ba56d636df49a6b4d6d77e2123</id>
<content type='text'>
Pull scheduler updates from Ingo Molnar:
 "The biggest change in this cycle is the rewrite of the main SMP load
  balancing metric: the CPU load/utilization.  The main goal was to make
  the metric more precise and more representative - see the changelog of
  this commit for the gory details:

    9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking")

  It is done in a way that significantly reduces complexity of the code:

    5 files changed, 249 insertions(+), 494 deletions(-)

  and the performance testing results are encouraging.  Nevertheless we
  need to keep an eye on potential regressions, since this potentially
  affects every SMP workload in existence.

  This work comes from Yuyang Du.

  Other changes:

   - SCHED_DL updates.  (Andrea Parri)

   - Simplify architecture callbacks by removing finish_arch_switch().
     (Peter Zijlstra et al)

   - cputime accounting: guarantee stime + utime == rtime.  (Peter
     Zijlstra)

   - optimize idle CPU wakeups some more - inspired by Facebook server
     loads.  (Mike Galbraith)

   - stop_machine fixes and updates.  (Oleg Nesterov)

   - Introduce the 'trace_sched_waking' tracepoint.  (Peter Zijlstra)

   - sched/numa tweaks.  (Srikar Dronamraju)

   - misc fixes and small cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  sched/deadline: Fix comment in enqueue_task_dl()
  sched/deadline: Fix comment in push_dl_tasks()
  sched: Change the sched_class::set_cpus_allowed() calling context
  sched: Make sched_class::set_cpus_allowed() unconditional
  sched: Fix a race between __kthread_bind() and sched_setaffinity()
  sched: Ensure a task has a non-normalized vruntime when returning back to CFS
  sched/numa: Fix NUMA_DIRECT topology identification
  tile: Reorganize _switch_to()
  sched, sparc32: Update scheduler comments in copy_thread()
  sched: Remove finish_arch_switch()
  sched, tile: Remove finish_arch_switch
  sched, sh: Fold finish_arch_switch() into switch_to()
  sched, score: Remove finish_arch_switch()
  sched, avr32: Remove finish_arch_switch()
  sched, MIPS: Get rid of finish_arch_switch()
  sched, arm: Remove finish_arch_switch()
  sched/fair: Clean up load average references
  sched/fair: Provide runnable_load_avg back to cfs_rq
  sched/fair: Remove task and group entity load when they are dead
  sched/fair: Init cfs_rq's sched_entity load average
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull scheduler updates from Ingo Molnar:
 "The biggest change in this cycle is the rewrite of the main SMP load
  balancing metric: the CPU load/utilization.  The main goal was to make
  the metric more precise and more representative - see the changelog of
  this commit for the gory details:

    9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking")

  It is done in a way that significantly reduces complexity of the code:

    5 files changed, 249 insertions(+), 494 deletions(-)

  and the performance testing results are encouraging.  Nevertheless we
  need to keep an eye on potential regressions, since this potentially
  affects every SMP workload in existence.

  This work comes from Yuyang Du.

  Other changes:

   - SCHED_DL updates.  (Andrea Parri)

   - Simplify architecture callbacks by removing finish_arch_switch().
     (Peter Zijlstra et al)

   - cputime accounting: guarantee stime + utime == rtime.  (Peter
     Zijlstra)

   - optimize idle CPU wakeups some more - inspired by Facebook server
     loads.  (Mike Galbraith)

   - stop_machine fixes and updates.  (Oleg Nesterov)

   - Introduce the 'trace_sched_waking' tracepoint.  (Peter Zijlstra)

   - sched/numa tweaks.  (Srikar Dronamraju)

   - misc fixes and small cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  sched/deadline: Fix comment in enqueue_task_dl()
  sched/deadline: Fix comment in push_dl_tasks()
  sched: Change the sched_class::set_cpus_allowed() calling context
  sched: Make sched_class::set_cpus_allowed() unconditional
  sched: Fix a race between __kthread_bind() and sched_setaffinity()
  sched: Ensure a task has a non-normalized vruntime when returning back to CFS
  sched/numa: Fix NUMA_DIRECT topology identification
  tile: Reorganize _switch_to()
  sched, sparc32: Update scheduler comments in copy_thread()
  sched: Remove finish_arch_switch()
  sched, tile: Remove finish_arch_switch
  sched, sh: Fold finish_arch_switch() into switch_to()
  sched, score: Remove finish_arch_switch()
  sched, avr32: Remove finish_arch_switch()
  sched, MIPS: Get rid of finish_arch_switch()
  sched, arm: Remove finish_arch_switch()
  sched/fair: Clean up load average references
  sched/fair: Provide runnable_load_avg back to cfs_rq
  sched/fair: Remove task and group entity load when they are dead
  sched/fair: Init cfs_rq's sched_entity load average
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Fix a race between __kthread_bind() and sched_setaffinity()</title>
<updated>2015-08-12T10:06:09+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2015-05-15T15:43:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=25834c73f93af7f0712c98ca4593691592e6b360'/>
<id>25834c73f93af7f0712c98ca4593691592e6b360</id>
<content type='text'>
Because sched_setscheduler() checks p-&gt;flags &amp; PF_NO_SETAFFINITY
without locks, a caller might observe an old value and race with the
set_cpus_allowed_ptr() call from __kthread_bind() and effectively undo
it:

	__kthread_bind()
	  do_set_cpus_allowed()
						&lt;SYSCALL&gt;
						  sched_setaffinity()
						    if (p-&gt;flags &amp; PF_NO_SETAFFINITIY)
						    set_cpus_allowed_ptr()
	  p-&gt;flags |= PF_NO_SETAFFINITY

Fix the bug by putting everything under the regular scheduler locks.

This also closes a hole in the serialization of task_struct::{nr_,}cpus_allowed.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: dedekind1@gmail.com
Cc: juri.lelli@arm.com
Cc: mgorman@suse.de
Cc: riel@redhat.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20150515154833.545640346@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Because sched_setscheduler() checks p-&gt;flags &amp; PF_NO_SETAFFINITY
without locks, a caller might observe an old value and race with the
set_cpus_allowed_ptr() call from __kthread_bind() and effectively undo
it:

	__kthread_bind()
	  do_set_cpus_allowed()
						&lt;SYSCALL&gt;
						  sched_setaffinity()
						    if (p-&gt;flags &amp; PF_NO_SETAFFINITIY)
						    set_cpus_allowed_ptr()
	  p-&gt;flags |= PF_NO_SETAFFINITY

Fix the bug by putting everything under the regular scheduler locks.

This also closes a hole in the serialization of task_struct::{nr_,}cpus_allowed.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: dedekind1@gmail.com
Cc: juri.lelli@arm.com
Cc: mgorman@suse.de
Cc: riel@redhat.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20150515154833.545640346@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>workqueue: Make flush_workqueue() available again to non GPL modules</title>
<updated>2015-08-04T18:04:54+00:00</updated>
<author>
<name>Tim Gardner</name>
<email>tim.gardner@canonical.com</email>
</author>
<published>2015-08-04T17:26:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1dadafa86a779884f14a6e7a3ddde1a57b0a0a65'/>
<id>1dadafa86a779884f14a6e7a3ddde1a57b0a0a65</id>
<content type='text'>
Commit 37b1ef31a568fc02e53587620226e5f3c66454c8 ("workqueue: move
flush_scheduled_work() to workqueue.h") moved the exported non GPL
flush_scheduled_work() from a function to an inline wrapper.
Unfortunately, it directly calls flush_workqueue() which is a GPL function.
This has the effect of changing the licensing requirement for this function
and makes it unavailable to non GPL modules.

See commit ad7b1f841f8a54c6d61ff181451f55b68175e15a ("workqueue: Make
schedule_work() available again to non GPL modules") for precedent.

Signed-off-by: Tim Gardner &lt;tim.gardner@canonical.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 37b1ef31a568fc02e53587620226e5f3c66454c8 ("workqueue: move
flush_scheduled_work() to workqueue.h") moved the exported non GPL
flush_scheduled_work() from a function to an inline wrapper.
Unfortunately, it directly calls flush_workqueue() which is a GPL function.
This has the effect of changing the licensing requirement for this function
and makes it unavailable to non GPL modules.

See commit ad7b1f841f8a54c6d61ff181451f55b68175e15a ("workqueue: Make
schedule_work() available again to non GPL modules") for precedent.

Signed-off-by: Tim Gardner &lt;tim.gardner@canonical.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()</title>
<updated>2015-07-22T22:27:32+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@linux.vnet.ibm.com</email>
</author>
<published>2015-06-18T22:50:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f78f5b90c4ffa559e400c3919a02236101f29f3f'/>
<id>f78f5b90c4ffa559e400c3919a02236101f29f3f</id>
<content type='text'>
This commit renames rcu_lockdep_assert() to RCU_LOCKDEP_WARN() for
consistency with the WARN() series of macros.  This also requires
inverting the sense of the conditional, which this commit also does.

Reported-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Reviewed-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit renames rcu_lockdep_assert() to RCU_LOCKDEP_WARN() for
consistency with the WARN() series of macros.  This also requires
inverting the sense of the conditional, which this commit also does.

Reported-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Reviewed-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
