<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/sched.h, branch v2.6.29.3</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>sched: do not count frozen tasks toward load</title>
<updated>2009-04-27T17:37:00+00:00</updated>
<author>
<name>Nathan Lynch</name>
<email>ntl@pobox.com</email>
</author>
<published>2009-04-09T18:20:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3f22ebd2f4d191631a7b867addc8bd99e948873f'/>
<id>3f22ebd2f4d191631a7b867addc8bd99e948873f</id>
<content type='text'>
upstream commit: e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd

Freezing tasks via the cgroup freezer causes the load average to climb
because the freezer's current implementation puts frozen tasks in
uninterruptible sleep (D state).

Some applications which perform job-scheduling functions consult the
load average when making decisions.  If a cgroup is frozen, the load
average does not provide a useful measure of the system's utilization
to such applications.  This is especially inconvenient if the job
scheduler employs the cgroup freezer as a mechanism for preempting low
priority jobs.  Contrast this with using SIGSTOP for the same purpose:
the stopped tasks do not count toward system load.

Change task_contributes_to_load() to return false if the task is
frozen.  This results in /proc/loadavg behavior that better meets
users' expectations.

Signed-off-by: Nathan Lynch &lt;ntl@pobox.com&gt;
Acked-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Nigel Cunningham &lt;nigel@tuxonice.net&gt;
Tested-by: Nigel Cunningham &lt;nigel@tuxonice.net&gt;
Cc: &lt;stable@kernel.org&gt;
Cc: containers@lists.linux-foundation.org
Cc: linux-pm@lists.linux-foundation.org
Cc: Matt Helsley &lt;matthltc@us.ibm.com&gt;
LKML-Reference: &lt;20090408194512.47a99b95@manatee.lan&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
upstream commit: e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd

Freezing tasks via the cgroup freezer causes the load average to climb
because the freezer's current implementation puts frozen tasks in
uninterruptible sleep (D state).

Some applications which perform job-scheduling functions consult the
load average when making decisions.  If a cgroup is frozen, the load
average does not provide a useful measure of the system's utilization
to such applications.  This is especially inconvenient if the job
scheduler employs the cgroup freezer as a mechanism for preempting low
priority jobs.  Contrast this with using SIGSTOP for the same purpose:
the stopped tasks do not count toward system load.

Change task_contributes_to_load() to return false if the task is
frozen.  This results in /proc/loadavg behavior that better meets
users' expectations.

Signed-off-by: Nathan Lynch &lt;ntl@pobox.com&gt;
Acked-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Nigel Cunningham &lt;nigel@tuxonice.net&gt;
Tested-by: Nigel Cunningham &lt;nigel@tuxonice.net&gt;
Cc: &lt;stable@kernel.org&gt;
Cc: containers@lists.linux-foundation.org
Cc: linux-pm@lists.linux-foundation.org
Cc: Matt Helsley &lt;matthltc@us.ibm.com&gt;
LKML-Reference: &lt;20090408194512.47a99b95@manatee.lan&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpumask: tsk_cpumask for accessing the struct task_struct's cpus_allowed.</title>
<updated>2009-03-12T04:05:44+00:00</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2009-03-12T20:35:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=76e6eee03353f01bfca707d4dbb1f10a4ee27dc0'/>
<id>76e6eee03353f01bfca707d4dbb1f10a4ee27dc0</id>
<content type='text'>
This allows us to change the representation (to a dangling bitmap or
cpumask_var_t) without breaking all the callers: they can use
tsk_cpumask() now and won't see a difference as the changes roll into
linux-next.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This allows us to change the representation (to a dangling bitmap or
cpumask_var_t) without breaking all the callers: they can use
tsk_cpumask() now and won't see a difference as the changes roll into
linux-next.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: don't allow setuid to succeed if the user does not have rt bandwidth</title>
<updated>2009-02-27T10:11:53+00:00</updated>
<author>
<name>Dhaval Giani</name>
<email>dhaval@linux.vnet.ibm.com</email>
</author>
<published>2009-02-27T09:43:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=54e991242850edc8c53f71fa5aa3ba7a93ce38f5'/>
<id>54e991242850edc8c53f71fa5aa3ba7a93ce38f5</id>
<content type='text'>
Impact: fix hung task with certain (non-default) rt-limit settings

Corey Hickey reported that on using setuid to change the uid of a
rt process, the process would be unkillable and not be running.
This is because there was no rt runtime for that user group. Add
in a check to see if a user can attach an rt task to its task group.
On failure, return EINVAL, which is also returned in
CONFIG_CGROUP_SCHED.

Reported-by: Corey Hickey &lt;bugfood-ml@fatooh.org&gt;
Signed-off-by: Dhaval Giani &lt;dhaval@linux.vnet.ibm.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Impact: fix hung task with certain (non-default) rt-limit settings

Corey Hickey reported that on using setuid to change the uid of a
rt process, the process would be unkillable and not be running.
This is because there was no rt runtime for that user group. Add
in a check to see if a user can attach an rt task to its task group.
On failure, return EINVAL, which is also returned in
CONFIG_CGROUP_SCHED.

Reported-by: Corey Hickey &lt;bugfood-ml@fatooh.org&gt;
Signed-off-by: Dhaval Giani &lt;dhaval@linux.vnet.ibm.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>timers: fix TIMER_ABSTIME for process wide cpu timers</title>
<updated>2009-02-11T13:04:21+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2009-02-11T10:30:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4da94d49b2ecb0a26e716a8811c3ecc542c2a65d'/>
<id>4da94d49b2ecb0a26e716a8811c3ecc542c2a65d</id>
<content type='text'>
The POSIX timer interface allows for absolute time expiry values through the
TIMER_ABSTIME flag, therefore we have to synchronize the timer to the clock
every time we start it.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The POSIX timer interface allows for absolute time expiry values through the
TIMER_ABSTIME flag, therefore we have to synchronize the timer to the clock
every time we start it.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>timers: split process wide cpu clocks/timers, fix</title>
<updated>2009-02-11T13:04:19+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2009-02-10T15:37:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3fccfd67df79c6351a156eb25a7a514e5f39c4d9'/>
<id>3fccfd67df79c6351a156eb25a7a514e5f39c4d9</id>
<content type='text'>
To decrease the chance of a missed enable, always enable the timer when we
sample it, we'll always disable it when we find that there are no active timers
in the jiffy tick.

This fixes a flood of warnings reported by Mike Galbraith.

Reported-by: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To decrease the chance of a missed enable, always enable the timer when we
sample it, we'll always disable it when we find that there are no active timers
in the jiffy tick.

This fixes a flood of warnings reported by Mike Galbraith.

Reported-by: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>timers: split process wide cpu clocks/timers, remove spurious warning</title>
<updated>2009-02-06T13:57:51+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2009-02-06T13:57:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7d8e23df69820e6be42bcc41d441f4860e8c76f7'/>
<id>7d8e23df69820e6be42bcc41d441f4860e8c76f7</id>
<content type='text'>
Mike Galbraith reported that the new warning in thread_group_cputimer()
triggers en masse with Amarok running.

Oleg Nesterov observed:

  Can't fastpath_timer_check()-&gt;thread_group_cputimer() have the
  false warning too? Suppose we had the timer, then posix_cpu_timer_del()
  removes this timer, but task_cputime_zero(&amp;sig-&gt;cputime_expires) still
  not true.

Remove the spurious debug warning.

Reported-by: Mike Galbraith &lt;efault@gmx.de&gt;
Explained-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mike Galbraith reported that the new warning in thread_group_cputimer()
triggers en masse with Amarok running.

Oleg Nesterov observed:

  Can't fastpath_timer_check()-&gt;thread_group_cputimer() have the
  false warning too? Suppose we had the timer, then posix_cpu_timer_del()
  removes this timer, but task_cputime_zero(&amp;sig-&gt;cputime_expires) still
  not true.

Remove the spurious debug warning.

Reported-by: Mike Galbraith &lt;efault@gmx.de&gt;
Explained-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>timers: split process wide cpu clocks/timers</title>
<updated>2009-02-05T12:04:33+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2009-02-05T11:24:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4cd4c1b40d40447fb5e7ba80746c6d7ba91d7a53'/>
<id>4cd4c1b40d40447fb5e7ba80746c6d7ba91d7a53</id>
<content type='text'>
Change the process wide cpu timers/clocks so that we:

 1) don't mess up the kernel with too many threads,
 2) don't have a per-cpu allocation for each process,
 3) have no impact when not used.

In order to accomplish this we're going to split it into two parts:

 - clocks; which can take all the time they want since they run
           from user context -- ie. sys_clock_gettime(CLOCK_PROCESS_CPUTIME_ID)

 - timers; which need constant time sampling but since they're
           explicity used, the user can pay the overhead.

The clock readout will go back to a full sum of the thread group, while the
timers will run of a global 'clock' that only runs when needed, so only
programs that make use of the facility pay the price.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Reviewed-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change the process wide cpu timers/clocks so that we:

 1) don't mess up the kernel with too many threads,
 2) don't have a per-cpu allocation for each process,
 3) have no impact when not used.

In order to accomplish this we're going to split it into two parts:

 - clocks; which can take all the time they want since they run
           from user context -- ie. sys_clock_gettime(CLOCK_PROCESS_CPUTIME_ID)

 - timers; which need constant time sampling but since they're
           explicity used, the user can pay the overhead.

The clock readout will go back to a full sum of the thread group, while the
timers will run of a global 'clock' that only runs when needed, so only
programs that make use of the facility pay the price.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Reviewed-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>signal: re-add dead task accumulation stats.</title>
<updated>2009-02-05T12:04:33+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2009-02-05T11:24:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=32bd671d6cbeda60dc73be77fa2b9037d9a9bfa0'/>
<id>32bd671d6cbeda60dc73be77fa2b9037d9a9bfa0</id>
<content type='text'>
We're going to split the process wide cpu accounting into two parts:

 - clocks; which can take all the time they want since they run
           from user context.

 - timers; which need constant time tracing but can affort the overhead
           because they're default off -- and rare.

The clock readout will go back to a full sum of the thread group, for this
we need to re-add the exit stats that were removed in the initial itimer
rework (f06febc9: timers: fix itimer/many thread hang).

Furthermore, since that full sum can be rather slow for large thread groups
and we have the complete dead task stats, revert the do_notify_parent time
computation.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Reviewed-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We're going to split the process wide cpu accounting into two parts:

 - clocks; which can take all the time they want since they run
           from user context.

 - timers; which need constant time tracing but can affort the overhead
           because they're default off -- and rare.

The clock readout will go back to a full sum of the thread group, for this
we need to re-add the exit stats that were removed in the initial itimer
rework (f06febc9: timers: fix itimer/many thread hang).

Furthermore, since that full sum can be rather slow for large thread groups
and we have the complete dead task stats, revert the do_notify_parent time
computation.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Reviewed-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: add missing kernel-doc in sched.h</title>
<updated>2009-02-03T05:32:10+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>randy.dunlap@oracle.com</email>
</author>
<published>2009-02-03T00:00:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=35626129abcd6a7547e84c817ef5b6eff7a8758b'/>
<id>35626129abcd6a7547e84c817ef5b6eff7a8758b</id>
<content type='text'>
Add kernel-doc notation for @lock:

include/linux/sched.h:457: No description found for parameter 'lock'

Signed-off-by: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add kernel-doc notation for @lock:

include/linux/sched.h:457: No description found for parameter 'lock'

Signed-off-by: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>epoll: drop max_user_instances and rely only on max_user_watches</title>
<updated>2009-01-30T02:04:45+00:00</updated>
<author>
<name>Davide Libenzi</name>
<email>davidel@xmailserver.org</email>
</author>
<published>2009-01-29T22:25:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9df04e1f25effde823a600e755b51475d438f56b'/>
<id>9df04e1f25effde823a600e755b51475d438f56b</id>
<content type='text'>
Linus suggested to put limits where the money is, and max_user_watches
already does that w/out the need of max_user_instances.  That has the
advantage to mitigate the potential DoS while allowing pretty generous
default behavior.

Allowing top 4% of low memory (per user) to be allocated in epoll watches,
we have:

LOMEM    MAX_WATCHES (per user)
512MB    ~178000
1GB      ~356000
2GB      ~712000

A box with 512MB of lomem, will meet some challenge in hitting 180K
watches, socket buffers math teaches us.  No more max_user_instances
limits then.

Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@googlemail.com&gt;
Cc: Bron Gondwana &lt;brong@fastmail.fm&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Linus suggested to put limits where the money is, and max_user_watches
already does that w/out the need of max_user_instances.  That has the
advantage to mitigate the potential DoS while allowing pretty generous
default behavior.

Allowing top 4% of low memory (per user) to be allocated in epoll watches,
we have:

LOMEM    MAX_WATCHES (per user)
512MB    ~178000
1GB      ~356000
2GB      ~712000

A box with 512MB of lomem, will meet some challenge in hitting 180K
watches, socket buffers math teaches us.  No more max_user_instances
limits then.

Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@googlemail.com&gt;
Cc: Bron Gondwana &lt;brong@fastmail.fm&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
