<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/cpuset.c, branch v2.6.22</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>cpuset: zero malloc - fix for old cpusets</title>
<updated>2007-06-16T20:16:15+00:00</updated>
<author>
<name>Paul Jackson</name>
<email>pj@sgi.com</email>
</author>
<published>2007-06-16T17:16:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3e903e7b1605aff88d7f89a96fab5e43081b914f'/>
<id>3e903e7b1605aff88d7f89a96fab5e43081b914f</id>
<content type='text'>
The cpuset code to present a list of tasks using a cpuset to user space could
write to an array that it had kmalloc'd, after a kmalloc request of zero size.

The problem was that the code didn't check for writes past the allocated end
of the array until -after- the first write.

This is a race condition that is likely rare -- it would only show up if a
cpuset went from being empty to having a task in it, during the brief time
between the allocation and the first write.

Prior to roughly 2.6.22 kernels, this was also a benign problem, because a
zero kmalloc returned a few usable bytes anyway, and no harm was done with the
bogus write.

With the 2.6.22 kernel changes to make issue a warning if code tries to write
to the location returned from a zero size allocation, this problem is no
longer benign.  This cpuset code would occassionally trigger that warning.

The fix is trivial -- check before storing into the array, not after, whether
the array is big enough to hold the store.

Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: "Serge E. Hallyn" &lt;serue@us.ibm.com&gt;
Cc: Balbir Singh &lt;balbir@in.ibm.com&gt;
Cc: Dave Hansen &lt;haveblue@us.ibm.com&gt;
Cc: Herbert Poetzl &lt;herbert@13thfloor.at&gt;
Cc: Kirill Korotaev &lt;dev@openvz.org&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Srivatsa Vaddagiri &lt;vatsa@in.ibm.com&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The cpuset code to present a list of tasks using a cpuset to user space could
write to an array that it had kmalloc'd, after a kmalloc request of zero size.

The problem was that the code didn't check for writes past the allocated end
of the array until -after- the first write.

This is a race condition that is likely rare -- it would only show up if a
cpuset went from being empty to having a task in it, during the brief time
between the allocation and the first write.

Prior to roughly 2.6.22 kernels, this was also a benign problem, because a
zero kmalloc returned a few usable bytes anyway, and no harm was done with the
bogus write.

With the 2.6.22 kernel changes to make issue a warning if code tries to write
to the location returned from a zero size allocation, this problem is no
longer benign.  This cpuset code would occassionally trigger that warning.

The fix is trivial -- check before storing into the array, not after, whether
the array is big enough to hold the store.

Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: "Serge E. Hallyn" &lt;serue@us.ibm.com&gt;
Cc: Balbir Singh &lt;balbir@in.ibm.com&gt;
Cc: Dave Hansen &lt;haveblue@us.ibm.com&gt;
Cc: Herbert Poetzl &lt;herbert@13thfloor.at&gt;
Cc: Kirill Korotaev &lt;dev@openvz.org&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Srivatsa Vaddagiri &lt;vatsa@in.ibm.com&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>use simple_read_from_buffer in kernel/</title>
<updated>2007-05-09T19:30:49+00:00</updated>
<author>
<name>Akinobu Mita</name>
<email>akinobu.mita@gmail.com</email>
</author>
<published>2007-05-09T09:33:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=85badbdf5120d246ce2bb3f1a7689a805f9c9006'/>
<id>85badbdf5120d246ce2bb3f1a7689a805f9c9006</id>
<content type='text'>
Cleanup using simple_read_from_buffer() for /dev/cpuset/tasks and
/proc/config.gz.

Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Randy Dunlap &lt;rdunlap@xenotime.net&gt;
Signed-off-by: Akinobu Mita &lt;akinobu.mita@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cleanup using simple_read_from_buffer() for /dev/cpuset/tasks and
/proc/config.gz.

Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Randy Dunlap &lt;rdunlap@xenotime.net&gt;
Signed-off-by: Akinobu Mita &lt;akinobu.mita@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpusets: allow empty {cpus,mems}_allowed to be set for unpopulated cpuset</title>
<updated>2007-05-08T18:15:14+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2007-05-08T07:31:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6f7f02e78a75a09195d963e0392b195bc2d55c5c'/>
<id>6f7f02e78a75a09195d963e0392b195bc2d55c5c</id>
<content type='text'>
You currently cannot remove all cpus or mems from cpus_allowed or
mems_allowed of a cpuset.  We now allow both if there are no attached
tasks.

Acked-by: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Christoph Lameter &lt;clameter@engr.sgi.com&gt;
Signed-off-by: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
You currently cannot remove all cpus or mems from cpus_allowed or
mems_allowed of a cpuset.  We now allow both if there are no attached
tasks.

Acked-by: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Christoph Lameter &lt;clameter@engr.sgi.com&gt;
Signed-off-by: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>header cleaning: don't include smp_lock.h when not used</title>
<updated>2007-05-08T18:15:07+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>randy.dunlap@oracle.com</email>
</author>
<published>2007-05-08T07:28:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e63340ae6b6205fef26b40a75673d1c9c0c8bb90'/>
<id>e63340ae6b6205fef26b40a75673d1c9c0c8bb90</id>
<content type='text'>
Remove includes of &lt;linux/smp_lock.h&gt; where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove includes of &lt;linux/smp_lock.h&gt; where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix race between attach_task and cpuset_exit</title>
<updated>2007-05-08T18:15:05+00:00</updated>
<author>
<name>Srivatsa Vaddagiri</name>
<email>vatsa@in.ibm.com</email>
</author>
<published>2007-05-08T07:27:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dd9037a26a1e6ebec9121b4681c414dc77189a90'/>
<id>dd9037a26a1e6ebec9121b4681c414dc77189a90</id>
<content type='text'>
Currently cpuset_exit() changes the exiting task's -&gt;cpuset pointer w/o
taking task_lock().  This can lead to ugly races between attach_task and
cpuset_exit.  Details of the races are described at
http://lkml.org/lkml/2007/3/24/132.

Patch below closes those races.

Signed-off-by: Srivatsa Vaddagiri &lt;vatsa@in.ibm.com&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Balbir Singh &lt;balbir@in.ibm.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently cpuset_exit() changes the exiting task's -&gt;cpuset pointer w/o
taking task_lock().  This can lead to ugly races between attach_task and
cpuset_exit.  Details of the races are described at
http://lkml.org/lkml/2007/3/24/132.

Patch below closes those races.

Signed-off-by: Srivatsa Vaddagiri &lt;vatsa@in.ibm.com&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Balbir Singh &lt;balbir@in.ibm.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpusets: allow TIF_MEMDIE threads to allocate anywhere</title>
<updated>2007-05-07T19:12:53+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2007-05-06T21:49:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c596d9f320aaf30d28c1d793ff3a976dee1db8f5'/>
<id>c596d9f320aaf30d28c1d793ff3a976dee1db8f5</id>
<content type='text'>
OOM killed tasks have access to memory reserves as specified by the
TIF_MEMDIE flag in the hopes that it will quickly exit.  If such a task has
memory allocations constrained by cpusets, we may encounter a deadlock if a
blocking task cannot exit because it cannot allocate the necessary memory.

We allow tasks that have the TIF_MEMDIE flag to allocate memory anywhere,
including outside its cpuset restriction, so that it can quickly die
regardless of whether it is __GFP_HARDWALL.

Cc: Andi Kleen &lt;ak@suse.de&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Christoph Lameter &lt;clameter@engr.sgi.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
OOM killed tasks have access to memory reserves as specified by the
TIF_MEMDIE flag in the hopes that it will quickly exit.  If such a task has
memory allocations constrained by cpusets, we may encounter a deadlock if a
blocking task cannot exit because it cannot allocate the necessary memory.

We allow tasks that have the TIF_MEMDIE flag to allocate memory anywhere,
including outside its cpuset restriction, so that it can quickly die
regardless of whether it is __GFP_HARDWALL.

Cc: Andi Kleen &lt;ak@suse.de&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Christoph Lameter &lt;clameter@engr.sgi.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] mark struct inode_operations const 2</title>
<updated>2007-02-12T17:48:46+00:00</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2007-02-12T08:55:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=92e1d5be91a0e3ffa5c4697eeb09b2aa22792122'/>
<id>92e1d5be91a0e3ffa5c4697eeb09b2aa22792122</id>
<content type='text'>
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] mark struct file_operations const 7</title>
<updated>2007-02-12T17:48:46+00:00</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2007-02-12T08:55:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9a32144e9d7b4e21341174b1a83b82a82353be86'/>
<id>9a32144e9d7b4e21341174b1a83b82a82353be86</id>
<content type='text'>
Many struct file_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Many struct file_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] cpuset procfs warning fix</title>
<updated>2006-12-30T18:56:43+00:00</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@osdl.org</email>
</author>
<published>2006-12-30T00:49:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=089e34b60033863549fbe561d31ac8c778a20e7f'/>
<id>089e34b60033863549fbe561d31ac8c778a20e7f</id>
<content type='text'>
fs/proc/base.c:1869: warning: initialization discards qualifiers from pointer target type
fs/proc/base.c:2150: warning: initialization discards qualifiers from pointer target type

Cc: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
fs/proc/base.c:1869: warning: initialization discards qualifiers from pointer target type
fs/proc/base.c:2150: warning: initialization discards qualifiers from pointer target type

Cc: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] cpuset: rework cpuset_zone_allowed api</title>
<updated>2006-12-13T17:05:49+00:00</updated>
<author>
<name>Paul Jackson</name>
<email>pj@sgi.com</email>
</author>
<published>2006-12-13T08:34:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=02a0e53d8227aff5e62e0433f82c12c1c2805fd6'/>
<id>02a0e53d8227aff5e62e0433f82c12c1c2805fd6</id>
<content type='text'>
Elaborate the API for calling cpuset_zone_allowed(), so that users have to
explicitly choose between the two variants:

  cpuset_zone_allowed_hardwall()
  cpuset_zone_allowed_softwall()

Until now, whether or not you got the hardwall flavor depended solely on
whether or not you or'd in the __GFP_HARDWALL gfp flag to the gfp_mask
argument.

If you didn't specify __GFP_HARDWALL, you implicitly got the softwall
version.

Unfortunately, this meant that users would end up with the softwall version
without thinking about it.  Since only the softwall version might sleep,
this led to bugs with possible sleeping in interrupt context on more than
one occassion.

The hardwall version requires that the current tasks mems_allowed allows
the node of the specified zone (or that you're in interrupt or that
__GFP_THISNODE is set or that you're on a one cpuset system.)

The softwall version, depending on the gfp_mask, might allow a node if it
was allowed in the nearest enclusing cpuset marked mem_exclusive (which
requires taking the cpuset lock 'callback_mutex' to evaluate.)

This patch removes the cpuset_zone_allowed() call, and forces the caller to
explicitly choose between the hardwall and the softwall case.

If the caller wants the gfp_mask to determine this choice, they should (1)
be sure they can sleep or that __GFP_HARDWALL is set, and (2) invoke the
cpuset_zone_allowed_softwall() routine.

This adds another 100 or 200 bytes to the kernel text space, due to the few
lines of nearly duplicate code at the top of both cpuset_zone_allowed_*
routines.  It should save a few instructions executed for the calls that
turned into calls of cpuset_zone_allowed_hardwall, thanks to not having to
set (before the call) then check (within the call) the __GFP_HARDWALL flag.

For the most critical call, from get_page_from_freelist(), the same
instructions are executed as before -- the old cpuset_zone_allowed()
routine it used to call is the same code as the
cpuset_zone_allowed_softwall() routine that it calls now.

Not a perfect win, but seems worth it, to reduce this chance of hitting a
sleeping with irq off complaint again.

Signed-off-by: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Elaborate the API for calling cpuset_zone_allowed(), so that users have to
explicitly choose between the two variants:

  cpuset_zone_allowed_hardwall()
  cpuset_zone_allowed_softwall()

Until now, whether or not you got the hardwall flavor depended solely on
whether or not you or'd in the __GFP_HARDWALL gfp flag to the gfp_mask
argument.

If you didn't specify __GFP_HARDWALL, you implicitly got the softwall
version.

Unfortunately, this meant that users would end up with the softwall version
without thinking about it.  Since only the softwall version might sleep,
this led to bugs with possible sleeping in interrupt context on more than
one occassion.

The hardwall version requires that the current tasks mems_allowed allows
the node of the specified zone (or that you're in interrupt or that
__GFP_THISNODE is set or that you're on a one cpuset system.)

The softwall version, depending on the gfp_mask, might allow a node if it
was allowed in the nearest enclusing cpuset marked mem_exclusive (which
requires taking the cpuset lock 'callback_mutex' to evaluate.)

This patch removes the cpuset_zone_allowed() call, and forces the caller to
explicitly choose between the hardwall and the softwall case.

If the caller wants the gfp_mask to determine this choice, they should (1)
be sure they can sleep or that __GFP_HARDWALL is set, and (2) invoke the
cpuset_zone_allowed_softwall() routine.

This adds another 100 or 200 bytes to the kernel text space, due to the few
lines of nearly duplicate code at the top of both cpuset_zone_allowed_*
routines.  It should save a few instructions executed for the calls that
turned into calls of cpuset_zone_allowed_hardwall, thanks to not having to
set (before the call) then check (within the call) the __GFP_HARDWALL flag.

For the most critical call, from get_page_from_freelist(), the same
instructions are executed as before -- the old cpuset_zone_allowed()
routine it used to call is the same code as the
cpuset_zone_allowed_softwall() routine that it calls now.

Not a perfect win, but seems worth it, to reduce this chance of hitting a
sleeping with irq off complaint again.

Signed-off-by: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
