<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel, branch v3.2.23</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>tracing: change CPU ring buffer state from tracing_cpumask</title>
<updated>2012-07-12T03:32:09+00:00</updated>
<author>
<name>Vaibhav Nagarnaik</name>
<email>vnagarnaik@google.com</email>
</author>
<published>2012-05-04T01:59:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=907e461a4f50ca31c36575b03c14d2c07ba20df0'/>
<id>907e461a4f50ca31c36575b03c14d2c07ba20df0</id>
<content type='text'>
commit 71babb2705e2203a64c27ede13ae3508a0d2c16c upstream.

According to Documentation/trace/ftrace.txt:

tracing_cpumask:

        This is a mask that lets the user only trace
        on specified CPUS. The format is a hex string
        representing the CPUS.

The tracing_cpumask currently doesn't affect the tracing state of
per-CPU ring buffers.

This patch enables/disables CPU recording as its corresponding bit in
tracing_cpumask is set/unset.

Link: http://lkml.kernel.org/r/1336096792-25373-3-git-send-email-vnagarnaik@google.com

Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Laurent Chavey &lt;chavey@google.com&gt;
Cc: Justin Teravest &lt;teravest@google.com&gt;
Cc: David Sharp &lt;dhsharp@google.com&gt;
Signed-off-by: Vaibhav Nagarnaik &lt;vnagarnaik@google.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 71babb2705e2203a64c27ede13ae3508a0d2c16c upstream.

According to Documentation/trace/ftrace.txt:

tracing_cpumask:

        This is a mask that lets the user only trace
        on specified CPUS. The format is a hex string
        representing the CPUS.

The tracing_cpumask currently doesn't affect the tracing state of
per-CPU ring buffers.

This patch enables/disables CPU recording as its corresponding bit in
tracing_cpumask is set/unset.

Link: http://lkml.kernel.org/r/1336096792-25373-3-git-send-email-vnagarnaik@google.com

Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Laurent Chavey &lt;chavey@google.com&gt;
Cc: Justin Teravest &lt;teravest@google.com&gt;
Cc: David Sharp &lt;dhsharp@google.com&gt;
Signed-off-by: Vaibhav Nagarnaik &lt;vnagarnaik@google.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>splice: fix racy pipe-&gt;buffers uses</title>
<updated>2012-07-12T03:31:59+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-06-12T13:24:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9558b2ab1db5e94fcb7d5ab111a32e423a016c09'/>
<id>9558b2ab1db5e94fcb7d5ab111a32e423a016c09</id>
<content type='text'>
commit 047fe3605235888f3ebcda0c728cb31937eadfe6 upstream.

Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
by splice_shrink_spd() called from vmsplice_to_pipe()

commit 35f3d14dbbc5 (pipe: add support for shrinking and growing pipes)
added capability to adjust pipe-&gt;buffers.

Problem is some paths don't hold pipe mutex and assume pipe-&gt;buffers
doesn't change for their duration.

Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
use it in place of pipe-&gt;buffers where appropriate.

splice_shrink_spd() loses its struct pipe_inode_info argument.

Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Tom Herbert &lt;therbert@google.com&gt;
Tested-by: Dave Jones &lt;davej@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
[bwh: Backported to 3.2:
 - Adjust context in vmsplice_to_pipe()
 - Update one more call to splice_shrink_spd(), from skb_splice_bits()]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 047fe3605235888f3ebcda0c728cb31937eadfe6 upstream.

Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
by splice_shrink_spd() called from vmsplice_to_pipe()

commit 35f3d14dbbc5 (pipe: add support for shrinking and growing pipes)
added capability to adjust pipe-&gt;buffers.

Problem is some paths don't hold pipe mutex and assume pipe-&gt;buffers
doesn't change for their duration.

Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
use it in place of pipe-&gt;buffers where appropriate.

splice_shrink_spd() loses its struct pipe_inode_info argument.

Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Tom Herbert &lt;therbert@google.com&gt;
Tested-by: Dave Jones &lt;davej@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
[bwh: Backported to 3.2:
 - Adjust context in vmsplice_to_pipe()
 - Update one more call to splice_shrink_spd(), from skb_splice_bits()]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Fix the relax_domain_level boot parameter</title>
<updated>2012-06-19T22:18:06+00:00</updated>
<author>
<name>Dimitri Sivanich</name>
<email>sivanich@sgi.com</email>
</author>
<published>2012-06-05T18:44:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dcbf047fc060c9cfea0c4bb16ac322e4e6a08500'/>
<id>dcbf047fc060c9cfea0c4bb16ac322e4e6a08500</id>
<content type='text'>
commit a841f8cef4bb124f0f5563314d0beaf2e1249d72 upstream.

It does not get processed because sched_domain_level_max is 0 at the
time that setup_relax_domain_level() is run.

Simply accept the value as it is, as we don't know the value of
sched_domain_level_max until sched domain construction is completed.

Fix sched_relax_domain_level in cpuset.  The build_sched_domain() routine calls
the set_domain_attribute() routine prior to setting the sd-&gt;level, however,
the set_domain_attribute() routine relies on the sd-&gt;level to decide whether
idle load balancing will be off/on.

Signed-off-by: Dimitri Sivanich &lt;sivanich@sgi.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/20120605184436.GA15668@sgi.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bwh: Backported to 3.2: adjust the filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit a841f8cef4bb124f0f5563314d0beaf2e1249d72 upstream.

It does not get processed because sched_domain_level_max is 0 at the
time that setup_relax_domain_level() is run.

Simply accept the value as it is, as we don't know the value of
sched_domain_level_max until sched domain construction is completed.

Fix sched_relax_domain_level in cpuset.  The build_sched_domain() routine calls
the set_domain_attribute() routine prior to setting the sd-&gt;level, however,
the set_domain_attribute() routine relies on the sd-&gt;level to decide whether
idle load balancing will be off/on.

Signed-off-by: Dimitri Sivanich &lt;sivanich@sgi.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/20120605184436.GA15668@sgi.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bwh: Backported to 3.2: adjust the filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/fork: fix overflow in vma length when copying mmap on clone</title>
<updated>2012-06-10T13:41:44+00:00</updated>
<author>
<name>Siddhesh Poyarekar</name>
<email>siddhesh.poyarekar@gmail.com</email>
</author>
<published>2012-05-29T22:06:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ff8445f52037e30411691d0d1ced9650c780c3c2'/>
<id>ff8445f52037e30411691d0d1ced9650c780c3c2</id>
<content type='text'>
commit 7edc8b0ac16cbaed7cb4ea4c6b95ce98d2997e84 upstream.

The vma length in dup_mmap is calculated and stored in a unsigned int,
which is insufficient and hence overflows for very large maps (beyond
16TB). The following program demonstrates this:

#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/mman.h&gt;

#define GIG 1024 * 1024 * 1024L
#define EXTENT 16393

int main(void)
{
        int i, r;
        void *m;
        char buf[1024];

        for (i = 0; i &lt; EXTENT; i++) {
                m = mmap(NULL, (size_t) 1 * 1024 * 1024 * 1024L,
                         PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);

                if (m == (void *)-1)
                        printf("MMAP Failed: %d\n", m);
                else
                        printf("%d : MMAP returned %p\n", i, m);

                r = fork();

                if (r == 0) {
                        printf("%d: successed\n", i);
                        return 0;
                } else if (r &lt; 0)
                        printf("FORK Failed: %d\n", r);
                else if (r &gt; 0)
                        wait(NULL);
        }
        return 0;
}

Increase the storage size of the result to unsigned long, which is
sufficient for storing the difference between addresses.

Signed-off-by: Siddhesh Poyarekar &lt;siddhesh.poyarekar@gmail.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7edc8b0ac16cbaed7cb4ea4c6b95ce98d2997e84 upstream.

The vma length in dup_mmap is calculated and stored in a unsigned int,
which is insufficient and hence overflows for very large maps (beyond
16TB). The following program demonstrates this:

#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/mman.h&gt;

#define GIG 1024 * 1024 * 1024L
#define EXTENT 16393

int main(void)
{
        int i, r;
        void *m;
        char buf[1024];

        for (i = 0; i &lt; EXTENT; i++) {
                m = mmap(NULL, (size_t) 1 * 1024 * 1024 * 1024L,
                         PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);

                if (m == (void *)-1)
                        printf("MMAP Failed: %d\n", m);
                else
                        printf("%d : MMAP returned %p\n", i, m);

                r = fork();

                if (r == 0) {
                        printf("%d: successed\n", i);
                        return 0;
                } else if (r &lt; 0)
                        printf("FORK Failed: %d\n", r);
                else if (r &gt; 0)
                        wait(NULL);
        }
        return 0;
}

Increase the storage size of the result to unsigned long, which is
sufficient for storing the difference between addresses.

Signed-off-by: Siddhesh Poyarekar &lt;siddhesh.poyarekar@gmail.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>compat: Fix RT signal mask corruption via sigprocmask</title>
<updated>2012-05-30T23:43:51+00:00</updated>
<author>
<name>Jan Kiszka</name>
<email>jan.kiszka@siemens.com</email>
</author>
<published>2012-05-10T13:04:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=13918f54a0987a9bcec568dd1c6300f1e8b1d05a'/>
<id>13918f54a0987a9bcec568dd1c6300f1e8b1d05a</id>
<content type='text'>
commit b7dafa0ef3145c31d7753be0a08b3cbda51f0209 upstream.

compat_sys_sigprocmask reads a smaller signal mask from userspace than
sigprogmask accepts for setting.  So the high word of blocked.sig[0]
will be cleared, releasing any potentially blocked RT signal.

This was discovered via userspace code that relies on get/setcontext.
glibc's i386 versions of those functions use sigprogmask instead of
rt_sigprogmask to save/restore signal mask and caused RT signal
unblocking this way.

As suggested by Linus, this replaces the sys_sigprocmask based compat
version with one that open-codes the required logic, including the merge
of the existing blocked set with the new one provided on SIG_SETMASK.

Signed-off-by: Jan Kiszka &lt;jan.kiszka@siemens.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b7dafa0ef3145c31d7753be0a08b3cbda51f0209 upstream.

compat_sys_sigprocmask reads a smaller signal mask from userspace than
sigprogmask accepts for setting.  So the high word of blocked.sig[0]
will be cleared, releasing any potentially blocked RT signal.

This was discovered via userspace code that relies on get/setcontext.
glibc's i386 versions of those functions use sigprogmask instead of
rt_sigprogmask to save/restore signal mask and caused RT signal
unblocking this way.

As suggested by Linus, this replaces the sys_sigprocmask based compat
version with one that open-codes the required logic, including the merge
of the existing blocked set with the new one provided on SIG_SETMASK.

Signed-off-by: Jan Kiszka &lt;jan.kiszka@siemens.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>workqueue: skip nr_running sanity check in worker_enter_idle() if trustee is active</title>
<updated>2012-05-30T23:43:42+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-05-14T22:04:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5d79c6f64a904afc92a329f80abe693e3ae105fe'/>
<id>5d79c6f64a904afc92a329f80abe693e3ae105fe</id>
<content type='text'>
commit 544ecf310f0e7f51fa057ac2a295fc1b3b35a9d3 upstream.

worker_enter_idle() has WARN_ON_ONCE() which triggers if nr_running
isn't zero when every worker is idle.  This can trigger spuriously
while a cpu is going down due to the way trustee sets %WORKER_ROGUE
and zaps nr_running.

It first sets %WORKER_ROGUE on all workers without updating
nr_running, releases gcwq-&gt;lock, schedules, regrabs gcwq-&gt;lock and
then zaps nr_running.  If the last running worker enters idle
inbetween, it would see stale nr_running which hasn't been zapped yet
and trigger the WARN_ON_ONCE().

Fix it by performing the sanity check iff the trustee is idle.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 544ecf310f0e7f51fa057ac2a295fc1b3b35a9d3 upstream.

worker_enter_idle() has WARN_ON_ONCE() which triggers if nr_running
isn't zero when every worker is idle.  This can trigger spuriously
while a cpu is going down due to the way trustee sets %WORKER_ROGUE
and zaps nr_running.

It first sets %WORKER_ROGUE on all workers without updating
nr_running, releases gcwq-&gt;lock, schedules, regrabs gcwq-&gt;lock and
then zaps nr_running.  If the last running worker enters idle
inbetween, it would see stale nr_running which hasn't been zapped yet
and trigger the WARN_ON_ONCE().

Fix it by performing the sanity check iff the trustee is idle.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>namespaces, pid_ns: fix leakage on fork() failure</title>
<updated>2012-05-20T21:56:32+00:00</updated>
<author>
<name>Mike Galbraith</name>
<email>efault@gmx.de</email>
</author>
<published>2012-05-10T20:01:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=97a2373e32df28a6e59de89b725d4dbe58076a8d'/>
<id>97a2373e32df28a6e59de89b725d4dbe58076a8d</id>
<content type='text'>
commit 5e2bf0142231194d36fdc9596b36a261ed2b9fe7 upstream.

Fork() failure post namespace creation for a child cloned with
CLONE_NEWPID leaks pid_namespace/mnt_cache due to proc being mounted
during creation, but not unmounted during cleanup.  Call
pid_ns_release_proc() during cleanup.

Signed-off-by: Mike Galbraith &lt;efault@gmx.de&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Louis Rilling &lt;louis.rilling@kerlabs.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 5e2bf0142231194d36fdc9596b36a261ed2b9fe7 upstream.

Fork() failure post namespace creation for a child cloned with
CLONE_NEWPID leaks pid_namespace/mnt_cache due to proc being mounted
during creation, but not unmounted during cleanup.  Call
pid_ns_release_proc() during cleanup.

Signed-off-by: Mike Galbraith &lt;efault@gmx.de&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Louis Rilling &lt;louis.rilling@kerlabs.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>exit_signal: fix the "parent has changed security domain" logic</title>
<updated>2012-05-11T12:15:04+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-03-19T16:03:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=82b8227228169ca4c3b138f6ad1171913c14f784'/>
<id>82b8227228169ca4c3b138f6ad1171913c14f784</id>
<content type='text'>
commit b6e238dceed36891cc633167afe7151f1f3d83c5 upstream.

exit_notify() changes -&gt;exit_signal if the parent already did exec.
This doesn't really work, we are not going to send the signal now
if there is another live thread or the exiting task is traced. The
parent can exec before the last dies or the tracer detaches.

Move this check into do_notify_parent() which actually sends the
signal.

The user-visible change is that we do not change -&gt;exit_signal,
and thus the exiting task is still "clone children" for
do_wait()-&gt;eligible_child(__WCLONE). Hopefully this is fine, the
current logic is racy anyway.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b6e238dceed36891cc633167afe7151f1f3d83c5 upstream.

exit_notify() changes -&gt;exit_signal if the parent already did exec.
This doesn't really work, we are not going to send the signal now
if there is another live thread or the exiting task is traced. The
parent can exec before the last dies or the tracer detaches.

Move this check into do_notify_parent() which actually sends the
signal.

The user-visible change is that we do not change -&gt;exit_signal,
and thus the exiting task is still "clone children" for
do_wait()-&gt;eligible_child(__WCLONE). Hopefully this is fine, the
current logic is racy anyway.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>exit_signal: simplify the "we have changed execution domain" logic</title>
<updated>2012-05-11T12:15:03+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-03-19T16:03:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3dc731904fb3bad1d3dbd8870cf4461fb2b31680'/>
<id>3dc731904fb3bad1d3dbd8870cf4461fb2b31680</id>
<content type='text'>
commit e636825346b36a07ccfc8e30946d52855e21f681 upstream.

exit_notify() checks "tsk-&gt;self_exec_id != tsk-&gt;parent_exec_id"
to handle the "we have changed execution domain" case.

We can change do_thread() to always set -&gt;exit_signal = SIGCHLD
and remove this check to simplify the code.

We could change setup_new_exec() instead, this looks more logical
because it increments -&gt;self_exec_id. But note that de_thread()
already resets -&gt;exit_signal if it changes the leader, let's keep
both changes close to each other.

Note that we change -&gt;exit_signal lockless, this changes the rules.
Thereafter -&gt;exit_signal is not stable under tasklist but this is
fine, the only possible change is OLDSIG -&gt; SIGCHLD. This can race
with eligible_child() but the race is harmless. We can race with
reparent_leader() which changes our -&gt;exit_signal in parallel, but
it does the same change to SIGCHLD.

The noticeable user-visible change is that the execing task is not
"visible" to do_wait()-&gt;eligible_child(__WCLONE) right after exec.
To me this looks more logical, and this is consistent with mt case.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e636825346b36a07ccfc8e30946d52855e21f681 upstream.

exit_notify() checks "tsk-&gt;self_exec_id != tsk-&gt;parent_exec_id"
to handle the "we have changed execution domain" case.

We can change do_thread() to always set -&gt;exit_signal = SIGCHLD
and remove this check to simplify the code.

We could change setup_new_exec() instead, this looks more logical
because it increments -&gt;self_exec_id. But note that de_thread()
already resets -&gt;exit_signal if it changes the leader, let's keep
both changes close to each other.

Note that we change -&gt;exit_signal lockless, this changes the rules.
Thereafter -&gt;exit_signal is not stable under tasklist but this is
fine, the only possible change is OLDSIG -&gt; SIGCHLD. This can race
with eligible_child() but the race is harmless. We can race with
reparent_leader() which changes our -&gt;exit_signal in parallel, but
it does the same change to SIGCHLD.

The noticeable user-visible change is that the execing task is not
"visible" to do_wait()-&gt;eligible_child(__WCLONE) right after exec.
To me this looks more logical, and this is consistent with mt case.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Fix nohz load accounting -- again!</title>
<updated>2012-05-11T12:14:49+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2012-03-01T14:04:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5e2d50da11f0e6ec3ce8fe658d7c83b0b4346c68'/>
<id>5e2d50da11f0e6ec3ce8fe658d7c83b0b4346c68</id>
<content type='text'>
commit c308b56b5398779cd3da0f62ab26b0453494c3d4 upstream.

Various people reported nohz load tracking still being wrecked, but Doug
spotted the actual problem. We fold the nohz remainder in too soon,
causing us to loose samples and under-account.

So instead of playing catch-up up-front, always do a single load-fold
with whatever state we encounter and only then fold the nohz remainder
and play catch-up.

Reported-by: Doug Smythies &lt;dsmythies@telus.net&gt;
Reported-by: LesÅ=82aw Kope=C4=87 &lt;leslaw.kopec@nasza-klasa.pl&gt;
Reported-by: Aman Gupta &lt;aman@tmm1.net&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/n/tip-4v31etnhgg9kwd6ocgx3rxl8@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
[bwh: Backported to 3.2: change filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c308b56b5398779cd3da0f62ab26b0453494c3d4 upstream.

Various people reported nohz load tracking still being wrecked, but Doug
spotted the actual problem. We fold the nohz remainder in too soon,
causing us to loose samples and under-account.

So instead of playing catch-up up-front, always do a single load-fold
with whatever state we encounter and only then fold the nohz remainder
and play catch-up.

Reported-by: Doug Smythies &lt;dsmythies@telus.net&gt;
Reported-by: LesÅ=82aw Kope=C4=87 &lt;leslaw.kopec@nasza-klasa.pl&gt;
Reported-by: Aman Gupta &lt;aman@tmm1.net&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/n/tip-4v31etnhgg9kwd6ocgx3rxl8@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
[bwh: Backported to 3.2: change filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
</feed>
