linux-toradex.git/kernel/cpuset.c, branch v2.6.22

cpuset: zero malloc - fix for old cpusets

2007-06-16T20:16:15+00:00

The cpuset code to present a list of tasks using a cpuset to user space could
write to an array that it had kmalloc'd, after a kmalloc request of zero size.

The problem was that the code didn't check for writes past the allocated end
of the array until -after- the first write.

This is a race condition that is likely rare -- it would only show up if a
cpuset went from being empty to having a task in it, during the brief time
between the allocation and the first write.

Prior to roughly 2.6.22 kernels, this was also a benign problem, because a
zero kmalloc returned a few usable bytes anyway, and no harm was done with the
bogus write.

With the 2.6.22 kernel changes that issue a warning if code tries to write
to the location returned from a zero size allocation, this problem is no
longer benign.  This cpuset code would occasionally trigger that warning.

The fix is trivial -- check before storing into the array, not after, whether
the array is big enough to hold the store.
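
The ordering described above can be sketched in userspace C. This is only an
illustration, not the kernel patch: the function name, the int arrays and the
parameters are hypothetical stand-ins for the task walk and the kmalloc'd
array.

```c
#include <stddef.h>

/*
 * Hypothetical analogue of filling the pid array.  `dst` and `cap` stand in
 * for the allocated array and its capacity in entries (possibly zero);
 * `src` and `nsrc` stand in for the tasks found while walking the cpuset.
 *
 * The capacity test comes BEFORE the store, so a zero-capacity array is
 * never written to, even if tasks appeared after the allocation was sized.
 */
static size_t load_check_before(int *dst, size_t cap,
                                const int *src, size_t nsrc)
{
    size_t n = 0;

    for (size_t i = 0; i < nsrc; i++) {
        if (n >= cap)           /* check first ... */
            break;
        dst[n++] = src[i];      /* ... then store */
    }
    return n;                   /* number of entries actually stored */
}
```

With the test placed after the store instead, the first iteration would write
dst[0] even when cap is zero, which is the case the warning catches.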

Cc: "Eric W. Biederman" 
Cc: "Serge E. Hallyn" 
Cc: Balbir Singh 
Cc: Dave Hansen 
Cc: Herbert Poetzl 
Cc: Kirill Korotaev 
Cc: Paul Menage 
Cc: Srivatsa Vaddagiri 
Cc: Christoph Lameter 
Signed-off-by: Paul Jackson 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

use simple_read_from_buffer in kernel/

2007-05-09T19:30:49+00:00

Cleanup using simple_read_from_buffer() for /dev/cpuset/tasks and
/proc/config.gz.
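
For readers unfamiliar with the helper, the following userspace sketch shows
the kind of bounds handling that simple_read_from_buffer() centralizes.  It is
an approximation under stated assumptions: memcpy() replaces the copy to user
space, plain long replaces ssize_t/loff_t, -1 replaces the kernel's errno
value, and the name read_from_buffer_sketch is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy up to `count` bytes from `from` (which holds `available` bytes),
 * starting at offset *ppos, into `to`.  Advances *ppos by the number of
 * bytes copied and returns that number; returns 0 at or past the end.
 */
static long read_from_buffer_sketch(void *to, size_t count, long *ppos,
                                    const void *from, size_t available)
{
    long pos = *ppos;

    if (pos < 0)
        return -1;                       /* invalid offset */
    if ((size_t)pos >= available || count == 0)
        return 0;                        /* nothing left to read */
    if (count > available - (size_t)pos)
        count = available - (size_t)pos; /* clamp to remaining bytes */

    memcpy(to, (const char *)from + pos, count);
    *ppos = pos + (long)count;
    return (long)count;
}
```

A read handler that already has its output formatted in a buffer can then
delegate the offset and length arithmetic to one call of this shape instead
of open-coding it.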

Cc: Paul Jackson 
Cc: Randy Dunlap 
Signed-off-by: Akinobu Mita 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

cpusets: allow empty {cpus,mems}_allowed to be set for unpopulated cpuset

2007-05-08T18:15:14+00:00

You currently cannot remove all cpus or mems from cpus_allowed or
mems_allowed of a cpuset.  We now allow both if there are no attached
tasks.
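
The rule can be pictured as a small validation check.  This is a hedged
sketch, not the kernel's code: the struct, its bitmask fields, the task
counter and the -1 error value are all assumptions made for illustration.

```c
/* Hypothetical, simplified view of a cpuset: bitmasks plus a task count. */
struct cs_sketch {
    unsigned long cpus_allowed;   /* bit n set: CPU n allowed */
    unsigned long mems_allowed;   /* bit n set: memory node n allowed */
    int ntasks;                   /* number of attached tasks */
};

/*
 * Return 0 if `trial` is an acceptable new setting for `cur`, -1 otherwise.
 * Empty masks are rejected only while tasks are attached.
 */
static int validate_change_sketch(const struct cs_sketch *cur,
                                  const struct cs_sketch *trial)
{
    if (cur->ntasks > 0 &&
        (trial->cpus_allowed == 0 || trial->mems_allowed == 0))
        return -1;
    return 0;
}
```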

Acked-by: Paul Jackson 
Cc: Christoph Lameter 
Signed-off-by: Paul Menage 
Signed-off-by: David Rientjes 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

header cleaning: don't include smp_lock.h when not used

2007-05-08T18:15:07+00:00

Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

Fix race between attach_task and cpuset_exit

2007-05-08T18:15:05+00:00

Currently cpuset_exit() changes the exiting task's ->cpuset pointer w/o
taking task_lock().  This can lead to ugly races between attach_task and
cpuset_exit.  Details of the races are described at
http://lkml.org/lkml/2007/3/24/132.

Patch below closes those races.

Signed-off-by: Srivatsa Vaddagiri 
Cc: Paul Jackson 
Cc: Balbir Singh 
Cc: Paul Menage 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

cpusets: allow TIF_MEMDIE threads to allocate anywhere

2007-05-07T19:12:53+00:00

OOM killed tasks have access to memory reserves, as specified by the
TIF_MEMDIE flag, in the hope that they will quickly exit.  If such a task has
memory allocations constrained by cpusets, we may encounter a deadlock if a
blocking task cannot exit because it cannot allocate the necessary memory.

We allow tasks that have the TIF_MEMDIE flag to allocate memory anywhere,
including outside their cpuset restriction, so that they can quickly die
regardless of whether the allocation is __GFP_HARDWALL.
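
In outline, the change amounts to an early exit in the node-permission
check.  The sketch below is a userspace illustration under assumptions: the
struct, the bool standing in for TIF_MEMDIE, and the function name are
hypothetical, and the real check has more cases than shown here.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, simplified task state. */
struct task_sketch {
    bool memdie;                  /* stand-in for the TIF_MEMDIE flag */
    unsigned long mems_allowed;   /* bit n set: memory node n allowed */
};

/* May task `t` allocate from memory node `node`? */
static bool node_allowed_sketch(const struct task_sketch *t, unsigned node)
{
    /* An OOM-killed task may allocate anywhere so that it can exit. */
    if (t->memdie)
        return true;

    if (node >= sizeof(unsigned long) * CHAR_BIT)
        return false;
    return (t->mems_allowed >> node) & 1UL;
}
```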

Cc: Andi Kleen 
Cc: Paul Jackson 
Cc: Christoph Lameter 
Signed-off-by: David Rientjes 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] mark struct inode_operations const 2

2007-02-12T17:48:46+00:00

Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] mark struct file_operations const 7

2007-02-12T17:48:46+00:00

Many struct file_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.
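
A small standalone illustration of the pattern follows.  The struct and
function names are hypothetical stand-ins for an operations table; the point
is only that a const-qualified table of function pointers is still callable,
while assignments to its members are rejected by the compiler.

```c
#include <stddef.h>

/* Hypothetical operations table, standing in for struct file_operations. */
struct ops_sketch {
    int (*open)(void);
    int (*release)(void);
};

static int sketch_open(void)    { return 0; }
static int sketch_release(void) { return 0; }

/* const: the table can be placed in read-only data by the toolchain. */
static const struct ops_sketch sketch_fops = {
    .open    = sketch_open,
    .release = sketch_release,
};

/*
 * sketch_fops.open = NULL;
 *   would not compile: assignment to a member of a read-only object.
 */

/* Callers take a pointer-to-const, so they cannot modify the table. */
static int call_open_sketch(const struct ops_sketch *ops)
{
    return ops->open ? ops->open() : -1;
}
```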

Signed-off-by: Arjan van de Ven 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] cpuset procfs warning fix

2006-12-30T18:56:43+00:00

fs/proc/base.c:1869: warning: initialization discards qualifiers from pointer target type
fs/proc/base.c:2150: warning: initialization discards qualifiers from pointer target type

Cc: Paul Jackson 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] cpuset: rework cpuset_zone_allowed api

2006-12-13T17:05:49+00:00

Elaborate the API for calling cpuset_zone_allowed(), so that users have to
explicitly choose between the two variants:

  cpuset_zone_allowed_hardwall()
  cpuset_zone_allowed_softwall()

Until now, whether or not you got the hardwall flavor depended solely on
whether or not you or'd in the __GFP_HARDWALL gfp flag to the gfp_mask
argument.

If you didn't specify __GFP_HARDWALL, you implicitly got the softwall
version.

Unfortunately, this meant that users would end up with the softwall version
without thinking about it.  Since only the softwall version might sleep,
this led to bugs with possible sleeping in interrupt context on more than
one occasion.

The hardwall version requires that the current task's mems_allowed allows
the node of the specified zone (or that you're in interrupt or that
__GFP_THISNODE is set or that you're on a one cpuset system.)

The softwall version, depending on the gfp_mask, might allow a node if it
was allowed in the nearest enclosing cpuset marked mem_exclusive (which
requires taking the cpuset lock 'callback_mutex' to evaluate.)

This patch removes the cpuset_zone_allowed() call, and forces the caller to
explicitly choose between the hardwall and the softwall case.

If the caller wants the gfp_mask to determine this choice, they should (1)
be sure they can sleep or that __GFP_HARDWALL is set, and (2) invoke the
cpuset_zone_allowed_softwall() routine.
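
The difference between the two variants can be sketched in userspace C.
Everything here is a simplification under assumptions: the flag values, the
context struct and the function names are hypothetical, nodes are modeled as
bits in an unsigned long, and the lock that the real softwall path takes is
only indicated by a comment.

```c
#include <limits.h>
#include <stdbool.h>

#define GFP_HARDWALL_SKETCH 0x1u   /* stand-in for __GFP_HARDWALL */
#define GFP_THISNODE_SKETCH 0x2u   /* stand-in for __GFP_THISNODE */

/* Hypothetical snapshot of the calling context. */
struct ctx_sketch {
    bool in_interrupt;
    unsigned long mems_allowed;    /* current task's allowed nodes */
    unsigned long ancestor_mems;   /* nearest mem_exclusive ancestor's nodes */
};

static bool node_in(unsigned long mask, unsigned node)
{
    return node < sizeof(mask) * CHAR_BIT && ((mask >> node) & 1UL);
}

/* Checks duplicated at the top of both variants. */
static bool always_allowed(const struct ctx_sketch *c, unsigned gfp)
{
    return c->in_interrupt || (gfp & GFP_THISNODE_SKETCH);
}

/* Never sleeps: only consults the current task's own mask. */
static bool zone_allowed_hardwall_sketch(const struct ctx_sketch *c,
                                         unsigned node, unsigned gfp)
{
    return always_allowed(c, gfp) || node_in(c->mems_allowed, node);
}

/* May sleep unless the hardwall flag is set in gfp. */
static bool zone_allowed_softwall_sketch(const struct ctx_sketch *c,
                                         unsigned node, unsigned gfp)
{
    if (always_allowed(c, gfp) || node_in(c->mems_allowed, node))
        return true;
    if (gfp & GFP_HARDWALL_SKETCH)
        return false;
    /* The real routine takes callback_mutex here, which can sleep. */
    return node_in(c->ancestor_mems, node);
}
```

A caller that cannot sleep would pick the hardwall function outright; a
caller that wants the gfp mask to decide would call the softwall function and
must either be able to sleep or pass the hardwall flag.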

This adds another 100 or 200 bytes to the kernel text space, due to the few
lines of nearly duplicate code at the top of both cpuset_zone_allowed_*
routines.  It should save a few instructions executed for the calls that
turned into calls of cpuset_zone_allowed_hardwall, thanks to not having to
set (before the call) then check (within the call) the __GFP_HARDWALL flag.

For the most critical call, from get_page_from_freelist(), the same
instructions are executed as before -- the old cpuset_zone_allowed()
routine it used to call is the same code as the
cpuset_zone_allowed_softwall() routine that it calls now.

Not a perfect win, but seems worth it, to reduce this chance of hitting a
sleeping with irq off complaint again.

Signed-off-by: Paul Jackson 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds