<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/cgroup.c, branch v3.2-rc5</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>memcg: replace ss-&gt;id_lock with a rwlock</title>
<updated>2011-11-02T23:07:03+00:00</updated>
<author>
<name>Andrew Bresticker</name>
<email>abrestic@google.com</email>
</author>
<published>2011-11-02T20:40:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c1e2ee2dc436574880758b3836fc96935b774c32'/>
<id>c1e2ee2dc436574880758b3836fc96935b774c32</id>
<content type='text'>
While back-porting Johannes Weiner's patch "mm: memcg-aware global
reclaim" for an internal effort, we noticed a significant performance
regression during page-reclaim heavy workloads due to high contention of
the ss-&gt;id_lock.  This lock protects idr map, and serializes calls to
idr_get_next() in css_get_next() (which is used during the memcg hierarchy
walk).

Since idr_get_next() is just doing a look up, we need only serialize it
with respect to idr_remove()/idr_get_new().  By making the ss-&gt;id_lock a
rwlock, contention is greatly reduced and performance improves.

Tested: cat a 256m file from a ramdisk in a 128m container 50 times on
each core (one file + container per core) in parallel on a NUMA machine.
Result is the time for the test to complete in 1 of the containers.
Both kernels included Johannes' memcg-aware global reclaim patches.

Before rwlock patch: 1710.778s
After rwlock patch: 152.227s

Signed-off-by: Andrew Bresticker &lt;abrestic@google.com&gt;
Cc: Paul Menage &lt;menage@gmail.com&gt;
Cc: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Ying Han &lt;yinghan@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While back-porting Johannes Weiner's patch "mm: memcg-aware global
reclaim" for an internal effort, we noticed a significant performance
regression during page-reclaim heavy workloads due to high contention of
the ss-&gt;id_lock.  This lock protects idr map, and serializes calls to
idr_get_next() in css_get_next() (which is used during the memcg hierarchy
walk).

Since idr_get_next() is just doing a look up, we need only serialize it
with respect to idr_remove()/idr_get_new().  By making the ss-&gt;id_lock a
rwlock, contention is greatly reduced and performance improves.

Tested: cat a 256m file from a ramdisk in a 128m container 50 times on
each core (one file + container per core) in parallel on a NUMA machine.
Result is the time for the test to complete in 1 of the containers.
Both kernels included Johannes' memcg-aware global reclaim patches.

Before rwlock patch: 1710.778s
After rwlock patch: 152.227s

Signed-off-by: Andrew Bresticker &lt;abrestic@google.com&gt;
Cc: Paul Menage &lt;menage@gmail.com&gt;
Cc: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Ying Han &lt;yinghan@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroups: don't attach task to subsystem if migration failed</title>
<updated>2011-11-02T23:06:59+00:00</updated>
<author>
<name>Ben Blum</name>
<email>bblum@andrew.cmu.edu</email>
</author>
<published>2011-11-02T20:38:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=77ceab8ea590d7dc6c8f055ce43dfebd74428107'/>
<id>77ceab8ea590d7dc6c8f055ce43dfebd74428107</id>
<content type='text'>
If a task has exited to the point it has called cgroup_exit() already,
then we can't migrate it to another cgroup anymore.

This can happen when we are attaching a task to a new cgroup between the
call to -&gt;can_attach_task() on subsystems and the migration that is
eventually tried in cgroup_task_migrate().

In this case cgroup_task_migrate() returns -ESRCH and we don't want to
attach the task to the subsystems because the attachment to the new cgroup
itself failed.

Fix this by only calling -&gt;attach_task() on the subsystems if the cgroup
migration succeeded.

Reported-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Ben Blum &lt;bblum@andrew.cmu.edu&gt;
Acked-by: Paul Menage &lt;paul@paulmenage.org&gt;
Cc: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If a task has exited to the point it has called cgroup_exit() already,
then we can't migrate it to another cgroup anymore.

This can happen when we are attaching a task to a new cgroup between the
call to -&gt;can_attach_task() on subsystems and the migration that is
eventually tried in cgroup_task_migrate().

In this case cgroup_task_migrate() returns -ESRCH and we don't want to
attach the task to the subsystems because the attachment to the new cgroup
itself failed.

Fix this by only calling -&gt;attach_task() on the subsystems if the cgroup
migration succeeded.

Reported-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Ben Blum &lt;bblum@andrew.cmu.edu&gt;
Acked-by: Paul Menage &lt;paul@paulmenage.org&gt;
Cc: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroups: more safe tasklist locking in cgroup_attach_proc</title>
<updated>2011-11-02T23:06:59+00:00</updated>
<author>
<name>Ben Blum</name>
<email>bblum@andrew.cmu.edu</email>
</author>
<published>2011-11-02T20:38:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=33ef6b6984403a688189317ef46bb3caab3b70e0'/>
<id>33ef6b6984403a688189317ef46bb3caab3b70e0</id>
<content type='text'>
Fix unstable tasklist locking in cgroup_attach_proc.

According to this thread - https://lkml.org/lkml/2011/7/27/243 - RCU is
not sufficient to guarantee the tasklist is stable w.r.t.  de_thread and
exit.  Taking tasklist_lock for reading, instead of rcu_read_lock, ensures
proper exclusion.

Signed-off-by: Ben Blum &lt;bblum@andrew.cmu.edu&gt;
Acked-by: Paul Menage &lt;paul@paulmenage.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix unstable tasklist locking in cgroup_attach_proc.

According to this thread - https://lkml.org/lkml/2011/7/27/243 - RCU is
not sufficient to guarantee the tasklist is stable w.r.t.  de_thread and
exit.  Taking tasklist_lock for reading, instead of rcu_read_lock, ensures
proper exclusion.

Signed-off-by: Ben Blum &lt;bblum@andrew.cmu.edu&gt;
Acked-by: Paul Menage &lt;paul@paulmenage.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>locking, sched, cgroups: Annotate release_list_lock as raw</title>
<updated>2011-09-13T09:11:49+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2009-07-25T14:47:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cdcc136ffd264849a943acb42c36ffe9b458f811'/>
<id>cdcc136ffd264849a943acb42c36ffe9b458f811</id>
<content type='text'>
The release_list_lock can be taken in atomic context and therefore
cannot be preempted on -rt - annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The release_list_lock can be taken in atomic context and therefore
cannot be preempted on -rt - annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6</title>
<updated>2011-07-28T02:26:38+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-07-28T02:26:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=95b6886526bb510b8370b625a49bc0ab3b8ff10f'/>
<id>95b6886526bb510b8370b625a49bc0ab3b8ff10f</id>
<content type='text'>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (54 commits)
  tpm_nsc: Fix bug when loading multiple TPM drivers
  tpm: Move tpm_tis_reenable_interrupts out of CONFIG_PNP block
  tpm: Fix compilation warning when CONFIG_PNP is not defined
  TOMOYO: Update kernel-doc.
  tpm: Fix a typo
  tpm_tis: Probing function for Intel iTPM bug
  tpm_tis: Fix the probing for interrupts
  tpm_tis: Delay ACPI S3 suspend while the TPM is busy
  tpm_tis: Re-enable interrupts upon (S3) resume
  tpm: Fix display of data in pubek sysfs entry
  tpm_tis: Add timeouts sysfs entry
  tpm: Adjust interface timeouts if they are too small
  tpm: Use interface timeouts returned from the TPM
  tpm_tis: Introduce durations sysfs entry
  tpm: Adjust the durations if they are too small
  tpm: Use durations returned from TPM
  TOMOYO: Enable conditional ACL.
  TOMOYO: Allow using argv[]/envp[] of execve() as conditions.
  TOMOYO: Allow using executable's realpath and symlink's target as conditions.
  TOMOYO: Allow using owner/group etc. of file objects as conditions.
  ...

Fix up trivial conflict in security/tomoyo/realpath.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (54 commits)
  tpm_nsc: Fix bug when loading multiple TPM drivers
  tpm: Move tpm_tis_reenable_interrupts out of CONFIG_PNP block
  tpm: Fix compilation warning when CONFIG_PNP is not defined
  TOMOYO: Update kernel-doc.
  tpm: Fix a typo
  tpm_tis: Probing function for Intel iTPM bug
  tpm_tis: Fix the probing for interrupts
  tpm_tis: Delay ACPI S3 suspend while the TPM is busy
  tpm_tis: Re-enable interrupts upon (S3) resume
  tpm: Fix display of data in pubek sysfs entry
  tpm_tis: Add timeouts sysfs entry
  tpm: Adjust interface timeouts if they are too small
  tpm: Use interface timeouts returned from the TPM
  tpm_tis: Introduce durations sysfs entry
  tpm: Adjust the durations if they are too small
  tpm: Use durations returned from TPM
  TOMOYO: Enable conditional ACL.
  TOMOYO: Allow using argv[]/envp[] of execve() as conditions.
  TOMOYO: Allow using executable's realpath and symlink's target as conditions.
  TOMOYO: Allow using owner/group etc. of file objects as conditions.
  ...

Fix up trivial conflict in security/tomoyo/realpath.c
</pre>
</div>
</content>
</entry>
<entry>
<title>atomic: use &lt;linux/atomic.h&gt;</title>
<updated>2011-07-26T23:49:47+00:00</updated>
<author>
<name>Arun Sharma</name>
<email>asharma@fb.com</email>
</author>
<published>2011-07-26T23:09:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=60063497a95e716c9a689af3be2687d261f115b4'/>
<id>60063497a95e716c9a689af3be2687d261f115b4</id>
<content type='text'>
This allows us to move duplicated code in &lt;asm/atomic.h&gt;
(atomic_inc_not_zero() for now) to &lt;linux/atomic.h&gt;

Signed-off-by: Arun Sharma &lt;asharma@fb.com&gt;
Reviewed-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Acked-by: Mike Frysinger &lt;vapier@gentoo.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This allows us to move duplicated code in &lt;asm/atomic.h&gt;
(atomic_inc_not_zero() for now) to &lt;linux/atomic.h&gt;

Signed-off-by: Arun Sharma &lt;asharma@fb.com&gt;
Reviewed-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Acked-by: Mike Frysinger &lt;vapier@gentoo.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial</title>
<updated>2011-07-25T20:56:39+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-07-25T20:56:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d3ec4844d449cf7af9e749f73ba2052fb7b72fc2'/>
<id>d3ec4844d449cf7af9e749f73ba2052fb7b72fc2</id>
<content type='text'>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  fs: Merge split strings
  treewide: fix potentially dangerous trailing ';' in #defined values/expressions
  uwb: Fix misspelling of neighbourhood in comment
  net, netfilter: Remove redundant goto in ebt_ulog_packet
  trivial: don't touch files that are removed in the staging tree
  lib/vsprintf: replace link to Draft by final RFC number
  doc: Kconfig: `to be' -&gt; `be'
  doc: Kconfig: Typo: square -&gt; squared
  doc: Konfig: Documentation/power/{pm =&gt; apm-acpi}.txt
  drivers/net: static should be at beginning of declaration
  drivers/media: static should be at beginning of declaration
  drivers/i2c: static should be at beginning of declaration
  XTENSA: static should be at beginning of declaration
  SH: static should be at beginning of declaration
  MIPS: static should be at beginning of declaration
  ARM: static should be at beginning of declaration
  rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
  Update my e-mail address
  PCIe ASPM: forcedly -&gt; forcibly
  gma500: push through device driver tree
  ...

Fix up trivial conflicts:
 - arch/arm/mach-ep93xx/dma-m2p.c (deleted)
 - drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
 - drivers/net/r8169.c (just context changes)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  fs: Merge split strings
  treewide: fix potentially dangerous trailing ';' in #defined values/expressions
  uwb: Fix misspelling of neighbourhood in comment
  net, netfilter: Remove redundant goto in ebt_ulog_packet
  trivial: don't touch files that are removed in the staging tree
  lib/vsprintf: replace link to Draft by final RFC number
  doc: Kconfig: `to be' -&gt; `be'
  doc: Kconfig: Typo: square -&gt; squared
  doc: Konfig: Documentation/power/{pm =&gt; apm-acpi}.txt
  drivers/net: static should be at beginning of declaration
  drivers/media: static should be at beginning of declaration
  drivers/i2c: static should be at beginning of declaration
  XTENSA: static should be at beginning of declaration
  SH: static should be at beginning of declaration
  MIPS: static should be at beginning of declaration
  ARM: static should be at beginning of declaration
  rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
  Update my e-mail address
  PCIe ASPM: forcedly -&gt; forcibly
  gma500: push through device driver tree
  ...

Fix up trivial conflicts:
 - arch/arm/mach-ep93xx/dma-m2p.c (deleted)
 - drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
 - drivers/net/r8169.c (just context changes)
</pre>
</div>
</content>
</entry>
<entry>
<title>kill file_permission() completely</title>
<updated>2011-07-20T05:43:11+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2011-06-19T16:55:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3bfa784a6539f91a27d7ffdd408efdb638e3bebd'/>
<id>3bfa784a6539f91a27d7ffdd408efdb638e3bebd</id>
<content type='text'>
convert the last remaining caller to inode_permission()

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
convert the last remaining caller to inode_permission()

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check</title>
<updated>2011-07-08T20:21:58+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.cz</email>
</author>
<published>2011-07-08T12:39:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d8bf4ca9ca9576548628344c9725edd3786e90b1'/>
<id>d8bf4ca9ca9576548628344c9725edd3786e90b1</id>
<content type='text'>
Since ca5ecddf (rcu: define __rcu address space modifier for sparse)
rcu_dereference_check use rcu_read_lock_held as a part of condition
automatically so callers do not have to do that as well.

Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since ca5ecddf (rcu: define __rcu address space modifier for sparse)
rcu_dereference_check use rcu_read_lock_held as a part of condition
automatically so callers do not have to do that as well.

Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroupfs: use init_cred when populating new cgroupfs mount</title>
<updated>2011-06-09T01:59:53+00:00</updated>
<author>
<name>eparis@redhat</name>
<email>eparis@redhat</email>
</author>
<published>2011-06-02T11:20:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2ce9738bac1b386f46e8478fd2c263460e7c2b09'/>
<id>2ce9738bac1b386f46e8478fd2c263460e7c2b09</id>
<content type='text'>
We recently found that in some configurations SELinux was blocking the ability
for cgroupfs to be mounted.  The reason for this is because cgroupfs creates
files and directories during the get_sb() call and also uses lookup_one_len()
during that same get_sb() call.  This is a problem since the security
subsystem cannot initialize the superblock and the inodes in that filesystem
until after the get_sb() call returns.  Thus we leave the inodes in
an unitialized state during get_sb().  For the vast majority of filesystems
this is not an issue, but since cgroupfs uses lookup_on_len() it does
search permission checks on the directories in the path it walks.  Since the
inode security state is not set up SELinux does these checks as if the inodes
were 'unlabeled.'

Many 'normal' userspace process do not have permission to interact with
unlabeled inodes.  The solution presented here is to do the permission checks
of path walk and inode creation as the kernel rather than as the task that
called mount.  Since the kernel has permission to read/write/create
unlabeled inodes the get_sb() call will complete successfully and the SELinux
code will be able to initialize the superblock and those inodes created during
the get_sb() call.

This appears to be the same solution used by other filesystems such as devtmpfs
to solve the same issue and should thus have no negative impact on other LSMs
which currently work.

Signed-off-by: Eric Paris &lt;eparis@redhat.com&gt;
Acked-by: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: James Morris &lt;jmorris@namei.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We recently found that in some configurations SELinux was blocking the ability
for cgroupfs to be mounted.  The reason for this is because cgroupfs creates
files and directories during the get_sb() call and also uses lookup_one_len()
during that same get_sb() call.  This is a problem since the security
subsystem cannot initialize the superblock and the inodes in that filesystem
until after the get_sb() call returns.  Thus we leave the inodes in
an unitialized state during get_sb().  For the vast majority of filesystems
this is not an issue, but since cgroupfs uses lookup_on_len() it does
search permission checks on the directories in the path it walks.  Since the
inode security state is not set up SELinux does these checks as if the inodes
were 'unlabeled.'

Many 'normal' userspace process do not have permission to interact with
unlabeled inodes.  The solution presented here is to do the permission checks
of path walk and inode creation as the kernel rather than as the task that
called mount.  Since the kernel has permission to read/write/create
unlabeled inodes the get_sb() call will complete successfully and the SELinux
code will be able to initialize the superblock and those inodes created during
the get_sb() call.

This appears to be the same solution used by other filesystems such as devtmpfs
to solve the same issue and should thus have no negative impact on other LSMs
which currently work.

Signed-off-by: Eric Paris &lt;eparis@redhat.com&gt;
Acked-by: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: James Morris &lt;jmorris@namei.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
