<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/cpu.c, branch v2.6.27-rc1</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge branch 'linus' into cpus4096</title>
<updated>2008-07-28T21:32:00+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-07-28T21:32:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9e3ee1c39c0cc71222f9980ccbf87fe072897eef'/>
<id>9e3ee1c39c0cc71222f9980ccbf87fe072897eef</id>
<content type='text'>
Conflicts:

	kernel/stop_machine.c

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:

	kernel/stop_machine.c

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpu masks: optimize and clean up cpumask_of_cpu()</title>
<updated>2008-07-28T20:20:41+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2008-07-28T18:32:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e56b3bc7942982ac2589c942fb345e38bc7a341a'/>
<id>e56b3bc7942982ac2589c942fb345e38bc7a341a</id>
<content type='text'>
Clean up and optimize cpumask_of_cpu(), by sharing all the zero words.

Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns
creating a huge array of constant bitmasks, realize that the zero words
can be shared.

In other words, on a 64-bit architecture, we only ever need 64 of these
arrays - with a different bit set in one single world (with enough zero
words around it so that we can create any bitmask by just offsetting in
that big array). And then we just put enough zeroes around it that we
can point every single cpumask to be one of those things.

So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each,
with one bit set in each array - 2MB memory total), we have exactly 64
arrays instead, each 8k bits in size (64kB total).

And then we just point cpumask(n) to the right position (which we can
calculate dynamically). Once we have the right arrays, getting
"cpumask(n)" ends up being:

  static inline const cpumask_t *get_cpu_mask(unsigned int cpu)
  {
          const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG];
          p -= cpu / BITS_PER_LONG;
          return (const cpumask_t *)p;
  }

This brings other advantages and simplifications as well:

 - we are not wasting memory that is just filled with a single bit in
   various different places

 - we don't need all those games to re-create the arrays in some dense
   format, because they're already going to be dense enough.

if we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory
is a non-issue (especially since by doing this "overlapping" trick we
probably get better cache behaviour anyway).

[ mingo@elte.hu:

  Converted Linus's mails into a commit. See:

     http://lkml.org/lkml/2008/7/27/156
     http://lkml.org/lkml/2008/7/28/320

  Also applied a family filter - which also has the side-effect of leaving
  out the bits where Linus calls me an idio... Oh, never mind ;-)
]

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: Mike Travis &lt;travis@sgi.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Clean up and optimize cpumask_of_cpu(), by sharing all the zero words.

Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns
creating a huge array of constant bitmasks, realize that the zero words
can be shared.

In other words, on a 64-bit architecture, we only ever need 64 of these
arrays - with a different bit set in one single world (with enough zero
words around it so that we can create any bitmask by just offsetting in
that big array). And then we just put enough zeroes around it that we
can point every single cpumask to be one of those things.

So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each,
with one bit set in each array - 2MB memory total), we have exactly 64
arrays instead, each 8k bits in size (64kB total).

And then we just point cpumask(n) to the right position (which we can
calculate dynamically). Once we have the right arrays, getting
"cpumask(n)" ends up being:

  static inline const cpumask_t *get_cpu_mask(unsigned int cpu)
  {
          const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG];
          p -= cpu / BITS_PER_LONG;
          return (const cpumask_t *)p;
  }

This brings other advantages and simplifications as well:

 - we are not wasting memory that is just filled with a single bit in
   various different places

 - we don't need all those games to re-create the arrays in some dense
   format, because they're already going to be dense enough.

if we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory
is a non-issue (especially since by doing this "overlapping" trick we
probably get better cache behaviour anyway).

[ mingo@elte.hu:

  Converted Linus's mails into a commit. See:

     http://lkml.org/lkml/2008/7/27/156
     http://lkml.org/lkml/2008/7/28/320

  Also applied a family filter - which also has the side-effect of leaving
  out the bits where Linus calls me an idio... Oh, never mind ;-)
]

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: Mike Travis &lt;travis@sgi.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>stop_machine(): stop_machine_run() changed to use cpu mask</title>
<updated>2008-07-28T02:16:30+00:00</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2008-07-28T17:16:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=eeec4fad963490821348a331cca6102ae1c4a7a3'/>
<id>eeec4fad963490821348a331cca6102ae1c4a7a3</id>
<content type='text'>
Instead of a "cpu" arg with magic values NR_CPUS (any cpu) and ~0 (all
cpus), pass a cpumask_t.  Allow NULL for the common case (where we
don't care which CPU the function is run on): temporary cpumask_t's
are usually considered bad for stack space.

This deprecates stop_machine_run, to be removed soon when all the
callers are dead.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of a "cpu" arg with magic values NR_CPUS (any cpu) and ~0 (all
cpus), pass a cpumask_t.  Allow NULL for the common case (where we
don't care which CPU the function is run on): temporary cpumask_t's
are usually considered bad for stack space.

This deprecates stop_machine_run, to be removed soon when all the
callers are dead.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Hotplug CPU: don't check cpu_online after take_cpu_down</title>
<updated>2008-07-28T02:16:29+00:00</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2008-07-28T17:16:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=04321587584272f4e8b9818f319f40caf8eeee13'/>
<id>04321587584272f4e8b9818f319f40caf8eeee13</id>
<content type='text'>
Akinobu points out that if take_cpu_down() succeeds, the cpu must be offline.
Remove the cpu_online() check, and put a BUG_ON().

Quoting Akinobu Mita:
   Actually the cpu_online() check was necessary before appling this
   stop_machine: simplify patch.

   With old __stop_machine_run(), __stop_machine_run() could succeed
   (return !IS_ERR(p) value) even if take_cpu_down() returned non-zero value.
   The return value of take_cpu_down() was obtained through kthread_stop()..

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: "Akinobu Mita" &lt;akinobu.mita@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Akinobu points out that if take_cpu_down() succeeds, the cpu must be offline.
Remove the cpu_online() check, and put a BUG_ON().

Quoting Akinobu Mita:
   Actually the cpu_online() check was necessary before appling this
   stop_machine: simplify patch.

   With old __stop_machine_run(), __stop_machine_run() could succeed
   (return !IS_ERR(p) value) even if take_cpu_down() returned non-zero value.
   The return value of take_cpu_down() was obtained through kthread_stop()..

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: "Akinobu Mita" &lt;akinobu.mita@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Simplify stop_machine</title>
<updated>2008-07-28T02:16:29+00:00</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2008-07-28T17:16:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ffdb5976c47609c862917d4c186ecbb5706d2dda'/>
<id>ffdb5976c47609c862917d4c186ecbb5706d2dda</id>
<content type='text'>
stop_machine creates a kthread which creates kernel threads.  We can
create those threads directly and simplify things a little.  Some care
must be taken with CPU hotunplug, which has special needs, but that code
seems more robust than it was in the past.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
stop_machine creates a kthread which creates kernel threads.  We can
create those threads directly and simplify things a little.  Some care
must be taken with CPU hotunplug, which has special needs, but that code
seems more robust than it was in the past.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpumask: export cpumask_of_cpu_map</title>
<updated>2008-07-26T14:48:59+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-07-26T14:50:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5a7a201c51c324876d00a54e7208af6af12d1ca4'/>
<id>5a7a201c51c324876d00a54e7208af6af12d1ca4</id>
<content type='text'>
fix:

 ERROR: "cpumask_of_cpu_map" [drivers/acpi/processor.ko] undefined!
 ERROR: "cpumask_of_cpu_map" [arch/x86/kernel/microcode.ko] undefined!
 ERROR: "cpumask_of_cpu_map" [arch/x86/kernel/cpu/cpufreq/speedstep-ich.ko] undefined!
 ERROR: "cpumask_of_cpu_map" [arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.ko] undefined!

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
fix:

 ERROR: "cpumask_of_cpu_map" [drivers/acpi/processor.ko] undefined!
 ERROR: "cpumask_of_cpu_map" [arch/x86/kernel/microcode.ko] undefined!
 ERROR: "cpumask_of_cpu_map" [arch/x86/kernel/cpu/cpufreq/speedstep-ich.ko] undefined!
 ERROR: "cpumask_of_cpu_map" [arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.ko] undefined!

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpumask: put cpumask_of_cpu_map in the initdata section</title>
<updated>2008-07-26T14:40:33+00:00</updated>
<author>
<name>Mike Travis</name>
<email>travis@sgi.com</email>
</author>
<published>2008-07-25T01:21:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6524d938b3360504b43a1278b5a8403e85383d1a'/>
<id>6524d938b3360504b43a1278b5a8403e85383d1a</id>
<content type='text'>
  * Create the cpumask_of_cpu_map statically in the init data section
    using NR_CPUS but replace it during boot up with one sized by
    nr_cpu_ids (num possible cpus).

Signed-off-by: Mike Travis &lt;travis@sgi.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Jack Steiner &lt;steiner@sgi.com&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
  * Create the cpumask_of_cpu_map statically in the init data section
    using NR_CPUS but replace it during boot up with one sized by
    nr_cpu_ids (num possible cpus).

Signed-off-by: Mike Travis &lt;travis@sgi.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Jack Steiner &lt;steiner@sgi.com&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpumask: make cpumask_of_cpu_map generic</title>
<updated>2008-07-26T14:40:32+00:00</updated>
<author>
<name>Mike Travis</name>
<email>travis@sgi.com</email>
</author>
<published>2008-07-25T01:21:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b8d317d10cca76cabe6b03ebfeb23cc99118b731'/>
<id>b8d317d10cca76cabe6b03ebfeb23cc99118b731</id>
<content type='text'>
If an arch doesn't define cpumask_of_cpu_map, create a generic
statically-initialized one for them.  This allows removal of the buggy
cpumask_of_cpu() macro (&amp;cpumask_of_cpu() gives address of
out-of-scope var).

An arch with NR_CPUS of 4096 probably wants to allocate this itself
based on the actual number of CPUs, since otherwise they're using 2MB
of rodata (1024 cpus means 128k).  That's what
CONFIG_HAVE_CPUMASK_OF_CPU_MAP is for (only x86/64 does so at the
moment).

In future as we support more CPUs, we'll need to resort to a
get_cpu_map()/put_cpu_map() allocation scheme.

Signed-off-by: Mike Travis &lt;travis@sgi.com&gt;
Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Jack Steiner &lt;steiner@sgi.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If an arch doesn't define cpumask_of_cpu_map, create a generic
statically-initialized one for them.  This allows removal of the buggy
cpumask_of_cpu() macro (&amp;cpumask_of_cpu() gives address of
out-of-scope var).

An arch with NR_CPUS of 4096 probably wants to allocate this itself
based on the actual number of CPUs, since otherwise they're using 2MB
of rodata (1024 cpus means 128k).  That's what
CONFIG_HAVE_CPUMASK_OF_CPU_MAP is for (only x86/64 does so at the
moment).

In future as we support more CPUs, we'll need to resort to a
get_cpu_map()/put_cpu_map() allocation scheme.

Signed-off-by: Mike Travis &lt;travis@sgi.com&gt;
Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Jack Steiner &lt;steiner@sgi.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>workqueues: make get_online_cpus() useable for work-&gt;func()</title>
<updated>2008-07-25T17:53:40+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@tv-sign.ru</email>
</author>
<published>2008-07-25T08:47:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3da1c84c00c7e5fa8348336bd8c342f9128b0f14'/>
<id>3da1c84c00c7e5fa8348336bd8c342f9128b0f14</id>
<content type='text'>
workqueue_cpu_callback(CPU_DEAD) flushes cwq-&gt;thread under
cpu_maps_update_begin().  This means that the multithreaded workqueues
can't use get_online_cpus() due to the possible deadlock, very bad and
very old problem.

Introduce the new state, CPU_POST_DEAD, which is called after
cpu_hotplug_done() but before cpu_maps_update_done().

Change workqueue_cpu_callback() to use CPU_POST_DEAD instead of CPU_DEAD.
This means that create/destroy functions can't rely on get_online_cpus()
any longer and should take cpu_add_remove_lock instead.

[akpm@linux-foundation.org: fix CONFIG_SMP=n]
Signed-off-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Acked-by: Gautham R Shenoy &lt;ego@in.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Vegard Nossum &lt;vegard.nossum@gmail.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
workqueue_cpu_callback(CPU_DEAD) flushes cwq-&gt;thread under
cpu_maps_update_begin().  This means that the multithreaded workqueues
can't use get_online_cpus() due to the possible deadlock, very bad and
very old problem.

Introduce the new state, CPU_POST_DEAD, which is called after
cpu_hotplug_done() but before cpu_maps_update_done().

Change workqueue_cpu_callback() to use CPU_POST_DEAD instead of CPU_DEAD.
This means that create/destroy functions can't rely on get_online_cpus()
any longer and should take cpu_add_remove_lock instead.

[akpm@linux-foundation.org: fix CONFIG_SMP=n]
Signed-off-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Acked-by: Gautham R Shenoy &lt;ego@in.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Vegard Nossum &lt;vegard.nossum@gmail.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip</title>
<updated>2008-07-24T02:36:53+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2008-07-24T02:36:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7f9dce38378f0a4a298e885553d6bb7121376376'/>
<id>7f9dce38378f0a4a298e885553d6bb7121376376</id>
<content type='text'>
* 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: hrtick_enabled() should use cpu_active()
  sched, x86: clean up hrtick implementation
  sched: fix build error, provide partition_sched_domains() unconditionally
  sched: fix warning in inc_rt_tasks() to not declare variable 'rq' if it's not needed
  cpu hotplug: Make cpu_active_map synchronization dependency clear
  cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
  sched: rework of "prioritize non-migratable tasks over migratable ones"
  sched: reduce stack size in isolated_cpu_setup()
  Revert parts of "ftrace: do not trace scheduler functions"

Fixed up conflicts in include/asm-x86/thread_info.h (due to the
TIF_SINGLESTEP unification vs TIF_HRTICK_RESCHED removal) and
kernel/sched_fair.c (due to cpu_active_map vs for_each_cpu_mask_nr()
introduction).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: hrtick_enabled() should use cpu_active()
  sched, x86: clean up hrtick implementation
  sched: fix build error, provide partition_sched_domains() unconditionally
  sched: fix warning in inc_rt_tasks() to not declare variable 'rq' if it's not needed
  cpu hotplug: Make cpu_active_map synchronization dependency clear
  cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
  sched: rework of "prioritize non-migratable tasks over migratable ones"
  sched: reduce stack size in isolated_cpu_setup()
  Revert parts of "ftrace: do not trace scheduler functions"

Fixed up conflicts in include/asm-x86/thread_info.h (due to the
TIF_SINGLESTEP unification vs TIF_HRTICK_RESCHED removal) and
kernel/sched_fair.c (due to cpu_active_map vs for_each_cpu_mask_nr()
introduction).
</pre>
</div>
</content>
</entry>
</feed>
