<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/sched.c, branch v2.6.26-rc3</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>cgroups: fix compile warning</title>
<updated>2008-05-15T02:11:14+00:00</updated>
<author>
<name>Mirco Tischler</name>
<email>mt-ml@gmx.de</email>
</author>
<published>2008-05-14T23:05:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0c70814c311581a6c86198db4f982aa683c68fb8'/>
<id>0c70814c311581a6c86198db4f982aa683c68fb8</id>
<content type='text'>
Return type of cpu_rt_runtime_write() should be int instead of ssize_t.

Signed-off-by: Mirco Tischler &lt;mt-ml@gmx.de&gt;
Acked-by: Paul Menage &lt;menage@google.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Return type of cpu_rt_runtime_write() should be int instead of ssize_t.

Signed-off-by: Mirco Tischler &lt;mt-ml@gmx.de&gt;
Acked-by: Paul Menage &lt;menage@google.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add new 'cond_resched_bkl()' helper function</title>
<updated>2008-05-11T23:04:48+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2008-05-11T23:04:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c3921ab71507b108d51a0f1ee960f80cd668a93d'/>
<id>c3921ab71507b108d51a0f1ee960f80cd668a93d</id>
<content type='text'>
It acts exactly like a regular 'cond_resched()', but will not get
optimized away when CONFIG_PREEMPT is set.

Normal kernel code is already preemptable in the presense of
CONFIG_PREEMPT, so cond_resched() is optimized away (see commit
02b67cc3ba36bdba351d6c3a00593f4ec550d9d3 "sched: do not do
cond_resched() when CONFIG_PREEMPT").

But when wanting to conditionally reschedule while holding a lock, you
need to use "cond_sched_lock(lock)", and the new function is the BKL
equivalent of that.

Also make fs/locks.c use it.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It acts exactly like a regular 'cond_resched()', but will not get
optimized away when CONFIG_PREEMPT is set.

Normal kernel code is already preemptable in the presense of
CONFIG_PREEMPT, so cond_resched() is optimized away (see commit
02b67cc3ba36bdba351d6c3a00593f4ec550d9d3 "sched: do not do
cond_resched() when CONFIG_PREEMPT").

But when wanting to conditionally reschedule while holding a lock, you
need to use "cond_sched_lock(lock)", and the new function is the BKL
equivalent of that.

Also make fs/locks.c use it.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>BKL: revert back to the old spinlock implementation</title>
<updated>2008-05-11T03:58:02+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2008-05-11T03:58:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8e3e076c5a78519a9f64cd384e8f18bc21882ce0'/>
<id>8e3e076c5a78519a9f64cd384e8f18bc21882ce0</id>
<content type='text'>
The generic semaphore rewrite had a huge performance regression on AIM7
(and potentially other BKL-heavy benchmarks) because the generic
semaphores had been rewritten to be simple to understand and fair.  The
latter, in particular, turns a semaphore-based BKL implementation into a
mess of scheduling.

The attempt to fix the performance regression failed miserably (see the
previous commit 00b41ec2611dc98f87f30753ee00a53db648d662 'Revert
"semaphore: fix"'), and so for now the simple and sane approach is to
instead just go back to the old spinlock-based BKL implementation that
never had any issues like this.

This patch also has the advantage of being reported to fix the
regression completely according to Yanmin Zhang, unlike the semaphore
hack which still left a couple percentage point regression.

As a spinlock, the BKL obviously has the potential to be a latency
issue, but it's not really any different from any other spinlock in that
respect.  We do want to get rid of the BKL asap, but that has been the
plan for several years.

These days, the biggest users are in the tty layer (open/release in
particular) and Alan holds out some hope:

  "tty release is probably a few months away from getting cured - I'm
   afraid it will almost certainly be the very last user of the BKL in
   tty to get fixed as it depends on everything else being sanely locked."

so while we're not there yet, we do have a plan of action.

Tested-by: Yanmin Zhang &lt;yanmin_zhang@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Matthew Wilcox &lt;matthew@wil.cx&gt;
Cc: Alexander Viro &lt;viro@ftp.linux.org.uk&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The generic semaphore rewrite had a huge performance regression on AIM7
(and potentially other BKL-heavy benchmarks) because the generic
semaphores had been rewritten to be simple to understand and fair.  The
latter, in particular, turns a semaphore-based BKL implementation into a
mess of scheduling.

The attempt to fix the performance regression failed miserably (see the
previous commit 00b41ec2611dc98f87f30753ee00a53db648d662 'Revert
"semaphore: fix"'), and so for now the simple and sane approach is to
instead just go back to the old spinlock-based BKL implementation that
never had any issues like this.

This patch also has the advantage of being reported to fix the
regression completely according to Yanmin Zhang, unlike the semaphore
hack which still left a couple percentage point regression.

As a spinlock, the BKL obviously has the potential to be a latency
issue, but it's not really any different from any other spinlock in that
respect.  We do want to get rid of the BKL asap, but that has been the
plan for several years.

These days, the biggest users are in the tty layer (open/release in
particular) and Alan holds out some hope:

  "tty release is probably a few months away from getting cured - I'm
   afraid it will almost certainly be the very last user of the BKL in
   tty to get fixed as it depends on everything else being sanely locked."

so while we're not there yet, we do have a plan of action.

Tested-by: Yanmin Zhang &lt;yanmin_zhang@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Matthew Wilcox &lt;matthew@wil.cx&gt;
Cc: Alexander Viro &lt;viro@ftp.linux.org.uk&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK</title>
<updated>2008-05-05T21:56:18+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2008-05-03T16:29:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3e51f33fcc7f55e6df25d15b55ed10c8b4da84cd'/>
<id>3e51f33fcc7f55e6df25d15b55ed10c8b4da84cd</id>
<content type='text'>
this replaces the rq-&gt;clock stuff (and possibly cpu_clock()).

 - architectures that have an 'imperfect' hardware clock can set
   CONFIG_HAVE_UNSTABLE_SCHED_CLOCK

 - the 'jiffie' window might be superfulous when we update tick_gtod
   before the __update_sched_clock() call in sched_clock_tick()

 - cpu_clock() might be implemented as:

     sched_clock_cpu(smp_processor_id())

   if the accuracy proves good enough - how far can TSC drift in a
   single jiffie when considering the filtering and idle hooks?

[ mingo@elte.hu: various fixes and cleanups ]

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
this replaces the rq-&gt;clock stuff (and possibly cpu_clock()).

 - architectures that have an 'imperfect' hardware clock can set
   CONFIG_HAVE_UNSTABLE_SCHED_CLOCK

 - the 'jiffie' window might be superfulous when we update tick_gtod
   before the __update_sched_clock() call in sched_clock_tick()

 - cpu_clock() might be implemented as:

     sched_clock_cpu(smp_processor_id())

   if the accuracy proves good enough - how far can TSC drift in a
   single jiffie when considering the filtering and idle hooks?

[ mingo@elte.hu: various fixes and cleanups ]

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: fix cpu clock</title>
<updated>2008-05-05T21:56:18+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-04-23T07:24:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dfbf4a1bc319f0f9a31e39b2da1fa5c55e85af89'/>
<id>dfbf4a1bc319f0f9a31e39b2da1fa5c55e85af89</id>
<content type='text'>
David Miller pointed it out that nothing in cpu_clock() sets
prev_cpu_time. This caused __sync_cpu_clock() to be called
all the time - against the intention of this code.

The result was that in practice we hit a global spinlock every
time cpu_clock() is called - which - even though cpu_clock()
is used for tracing and debugging, is suboptimal.

While at it, also:

- move the irq disabling to the outest layer,
  this should make cpu_clock() warp-free when called with irqs
  enabled.

- use long long instead of cycles_t - for platforms where cycles_t
  is 32-bit.

Reported-by: David Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
David Miller pointed it out that nothing in cpu_clock() sets
prev_cpu_time. This caused __sync_cpu_clock() to be called
all the time - against the intention of this code.

The result was that in practice we hit a global spinlock every
time cpu_clock() is called - which - even though cpu_clock()
is used for tracing and debugging, is suboptimal.

While at it, also:

- move the irq disabling to the outest layer,
  this should make cpu_clock() warp-free when called with irqs
  enabled.

- use long long instead of cycles_t - for platforms where cycles_t
  is 32-bit.

Reported-by: David Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: fair-group: fix a Div0 error of the fair group scheduler</title>
<updated>2008-05-05T21:56:18+00:00</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2008-04-28T04:54:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cb4ad1ffc7c0d8ea7dc8cd8ba303d83551716d46'/>
<id>cb4ad1ffc7c0d8ea7dc8cd8ba303d83551716d46</id>
<content type='text'>
When I echoed 0 into the "cpu.shares" file, a Div0 error occured.

We found it is caused by the following calling.

   sched_group_set_shares(tg, shares)
       set_se_shares(tg-&gt;se[i], shares/nr_cpu_ids)
           __set_se_shares(se, shares)
               div64_64((1ULL&lt;&lt;32), shares)

When the echoed value was less than the number of processores, the result of the
sentence "shares/nr_cpu_ids" was 0, and then the system called div64() to divide
the result, the Div0 error occured.

It is unnecessary that the shares value is divided by nr_cpu_ids, I think.
Because in the function  __update_group_shares_cpu() and init_tg_cfs_entry(),
the shares value isn't divided by nr_cpu_ids when setting shares of the sched
entity.

This patch fixes this bug. And echoing ULONG_MAX value into cpu.shares also
causes Div0 error, so we set a macro MAX_SHARES to limit the max value of
shares.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When I echoed 0 into the "cpu.shares" file, a Div0 error occured.

We found it is caused by the following calling.

   sched_group_set_shares(tg, shares)
       set_se_shares(tg-&gt;se[i], shares/nr_cpu_ids)
           __set_se_shares(se, shares)
               div64_64((1ULL&lt;&lt;32), shares)

When the echoed value was less than the number of processores, the result of the
sentence "shares/nr_cpu_ids" was 0, and then the system called div64() to divide
the result, the Div0 error occured.

It is unnecessary that the shares value is divided by nr_cpu_ids, I think.
Because in the function  __update_group_shares_cpu() and init_tg_cfs_entry(),
the shares value isn't divided by nr_cpu_ids when setting shares of the sched
entity.

This patch fixes this bug. And echoing ULONG_MAX value into cpu.shares also
causes Div0 error, so we set a macro MAX_SHARES to limit the max value of
shares.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: fix missing locking in sched_domains code</title>
<updated>2008-05-05T21:56:18+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2008-04-28T09:33:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=712555ee4f873515612f89554ad1a3fda5fa887e'/>
<id>712555ee4f873515612f89554ad1a3fda5fa887e</id>
<content type='text'>
Concurrent calls to detach_destroy_domains and arch_init_sched_domains
were prevented by the old scheduler subsystem cpu hotplug mutex. When
this got converted to get_online_cpus() the locking got broken.
Unlike before now several processes can concurrently enter the critical
sections that were protected by the old lock.

So use the already present doms_cur_mutex to protect these sections again.

Cc: Gautham R Shenoy &lt;ego@in.ibm.com&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Concurrent calls to detach_destroy_domains and arch_init_sched_domains
were prevented by the old scheduler subsystem cpu hotplug mutex. When
this got converted to get_online_cpus() the locking got broken.
Unlike before now several processes can concurrently enter the critical
sections that were protected by the old lock.

So use the already present doms_cur_mutex to protect these sections again.

Cc: Gautham R Shenoy &lt;ego@in.ibm.com&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: make clock sync tunable by architecture code</title>
<updated>2008-05-05T21:56:18+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-04-23T07:31:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=690229a0912ca2fef8b542fe4d8b73acfcdc6e24'/>
<id>690229a0912ca2fef8b542fe4d8b73acfcdc6e24</id>
<content type='text'>
make time_sync_thresh tunable to architecture code.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
make time_sync_thresh tunable to architecture code.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: fix sched_info_switch not being called according to documentation</title>
<updated>2008-05-05T21:56:18+00:00</updated>
<author>
<name>David Simner</name>
<email>djs203@srcf.ucam.org</email>
</author>
<published>2008-04-29T09:08:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=673a90a1e05c8127886f7659d1a457169378371f'/>
<id>673a90a1e05c8127886f7659d1a457169378371f</id>
<content type='text'>
http://bugzilla.kernel.org/show_bug.cgi?id=10545

sched_stats.h says that __sched_info_switch is "called when prev !=
next" in the comment.  sched.c should therefore do that.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
http://bugzilla.kernel.org/show_bug.cgi?id=10545

sched_stats.h says that __sched_info_switch is "called when prev !=
next" in the comment.  sched.c should therefore do that.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: fix hrtick_start_fair and CPU-Hotplug</title>
<updated>2008-05-05T21:56:18+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2008-04-29T08:02:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b328ca182f01c2a04b85e0ee8a410720b104fbcc'/>
<id>b328ca182f01c2a04b85e0ee8a410720b104fbcc</id>
<content type='text'>
Gautham R Shenoy reported:

 &gt; While running the usual CPU-Hotplug stress tests on linux-2.6.25,
 &gt; I noticed the following in the console logs.
 &gt;
 &gt; This is a wee bit difficult to reproduce. In the past 10 runs I hit this
 &gt; only once.
 &gt;
 &gt; ------------[ cut here ]------------
 &gt;
 &gt; WARNING: at kernel/sched.c:962 hrtick+0x2e/0x65()
 &gt;
 &gt; Just wondering if we are doing a good job at handling the cancellation
 &gt; of any per-cpu scheduler timers during CPU-Hotplug.

This looks like its indeed not cancelled at all and migrates the it to
another cpu. Fix it via a proper hotplug notifier mechanism.

Reported-by: Gautham R Shenoy &lt;ego@in.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: stable@kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Gautham R Shenoy reported:

 &gt; While running the usual CPU-Hotplug stress tests on linux-2.6.25,
 &gt; I noticed the following in the console logs.
 &gt;
 &gt; This is a wee bit difficult to reproduce. In the past 10 runs I hit this
 &gt; only once.
 &gt;
 &gt; ------------[ cut here ]------------
 &gt;
 &gt; WARNING: at kernel/sched.c:962 hrtick+0x2e/0x65()
 &gt;
 &gt; Just wondering if we are doing a good job at handling the cancellation
 &gt; of any per-cpu scheduler timers during CPU-Hotplug.

This looks like its indeed not cancelled at all and migrates the it to
another cpu. Fix it via a proper hotplug notifier mechanism.

Reported-by: Gautham R Shenoy &lt;ego@in.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: stable@kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
</feed>
