<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/cgroup, branch v4.13-rc2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>mm, cpuset: always use seqlock when changing task's nodemask</title>
<updated>2017-07-06T23:24:34+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2017-07-06T22:40:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5f155f27cb7f0670429e2b8bb954094fa4110df9'/>
<id>5f155f27cb7f0670429e2b8bb954094fa4110df9</id>
<content type='text'>
When updating task's mems_allowed and rebinding its mempolicy due to
cpuset's mems being changed, we currently only take the seqlock for
writing when either the task has a mempolicy, or the new mems has no
intersection with the old mems.

This should be enough to prevent a parallel allocation seeing no
available nodes, but the optimization is IMHO unnecessary (cpuset
updates should not be frequent), and we still potentially risk issues if
the intersection of new and old nodes has limited amount of
free/reclaimable memory.

Let's just use the seqlock for all tasks.

Link: http://lkml.kernel.org/r/20170517081140.30654-6-vbabka@suse.cz
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Anshuman Khandual &lt;khandual@linux.vnet.ibm.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Dimitri Sivanich &lt;sivanich@sgi.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When updating task's mems_allowed and rebinding its mempolicy due to
cpuset's mems being changed, we currently only take the seqlock for
writing when either the task has a mempolicy, or the new mems has no
intersection with the old mems.

This should be enough to prevent a parallel allocation seeing no
available nodes, but the optimization is IMHO unnecessary (cpuset
updates should not be frequent), and we still potentially risk issues if
the intersection of new and old nodes has limited amount of
free/reclaimable memory.

Let's just use the seqlock for all tasks.

Link: http://lkml.kernel.org/r/20170517081140.30654-6-vbabka@suse.cz
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Anshuman Khandual &lt;khandual@linux.vnet.ibm.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Dimitri Sivanich &lt;sivanich@sgi.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, mempolicy: simplify rebinding mempolicies when updating cpusets</title>
<updated>2017-07-06T23:24:34+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2017-07-06T22:40:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=213980c0f23b6c4932fd5516da7e8443b2a615ea'/>
<id>213980c0f23b6c4932fd5516da7e8443b2a615ea</id>
<content type='text'>
Commit c0ff7453bb5c ("cpuset,mm: fix no node to alloc memory when
changing cpuset's mems") has introduced a two-step protocol when
rebinding task's mempolicy due to cpuset update, in order to avoid a
parallel allocation seeing an empty effective nodemask and failing.

Later, commit cc9a6c877661 ("cpuset: mm: reduce large amounts of memory
barrier related damage v3") introduced a seqlock protection and removed
the synchronization point between the two update steps.  At that point
(or perhaps later), the two-step rebinding became unnecessary.

Currently it only makes sure that the update first adds new nodes in
step 1 and then removes nodes in step 2.  Without memory barriers the
effects are questionable, and even then this cannot prevent a parallel
zonelist iteration checking the nodemask at each step to observe all
nodes as unusable for allocation.  We now fully rely on the seqlock to
prevent premature OOMs and allocation failures.

We can thus remove the two-step update parts and simplify the code.

Link: http://lkml.kernel.org/r/20170517081140.30654-5-vbabka@suse.cz
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Anshuman Khandual &lt;khandual@linux.vnet.ibm.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Dimitri Sivanich &lt;sivanich@sgi.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit c0ff7453bb5c ("cpuset,mm: fix no node to alloc memory when
changing cpuset's mems") has introduced a two-step protocol when
rebinding task's mempolicy due to cpuset update, in order to avoid a
parallel allocation seeing an empty effective nodemask and failing.

Later, commit cc9a6c877661 ("cpuset: mm: reduce large amounts of memory
barrier related damage v3") introduced a seqlock protection and removed
the synchronization point between the two update steps.  At that point
(or perhaps later), the two-step rebinding became unnecessary.

Currently it only makes sure that the update first adds new nodes in
step 1 and then removes nodes in step 2.  Without memory barriers the
effects are questionable, and even then this cannot prevent a parallel
zonelist iteration checking the nodemask at each step to observe all
nodes as unusable for allocation.  We now fully rely on the seqlock to
prevent premature OOMs and allocation failures.

We can thus remove the two-step update parts and simplify the code.

Link: http://lkml.kernel.org/r/20170517081140.30654-5-vbabka@suse.cz
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Anshuman Khandual &lt;khandual@linux.vnet.ibm.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Dimitri Sivanich &lt;sivanich@sgi.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: implement "nsdelegate" mount option</title>
<updated>2017-06-28T18:45:21+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-06-27T18:30:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5136f6365ce3eace5a926e10f16ed2a233db5ba9'/>
<id>5136f6365ce3eace5a926e10f16ed2a233db5ba9</id>
<content type='text'>
Currently, cgroup only supports delegation to !root users and cgroup
namespaces don't get any special treatments.  This limits the
usefulness of cgroup namespaces as they by themselves can't be safe
delegation boundaries.  A process inside a cgroup can change the
resource control knobs of the parent in the namespace root and may
move processes in and out of the namespace if cgroups outside its
namespace are visible somehow.

This patch adds a new mount option "nsdelegate" which makes cgroup
namespaces delegation boundaries.  If set, cgroup behaves as if write
permission based delegation took place at namespace boundaries -
writes to the resource control knobs from the namespace root are
denied and migration crossing the namespace boundary aren't allowed
from inside the namespace.

This allows cgroup namespace to function as a delegation boundary by
itself.

v2: Silently ignore nsdelegate specified on !init mounts.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Aravind Anbudurai &lt;aru7@fb.com&gt;
Cc: Serge Hallyn &lt;serge@hallyn.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, cgroup only supports delegation to !root users and cgroup
namespaces don't get any special treatments.  This limits the
usefulness of cgroup namespaces as they by themselves can't be safe
delegation boundaries.  A process inside a cgroup can change the
resource control knobs of the parent in the namespace root and may
move processes in and out of the namespace if cgroups outside its
namespace are visible somehow.

This patch adds a new mount option "nsdelegate" which makes cgroup
namespaces delegation boundaries.  If set, cgroup behaves as if write
permission based delegation took place at namespace boundaries -
writes to the resource control knobs from the namespace root are
denied and migration crossing the namespace boundary aren't allowed
from inside the namespace.

This allows cgroup namespace to function as a delegation boundary by
itself.

v2: Silently ignore nsdelegate specified on !init mounts.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Aravind Anbudurai &lt;aru7@fb.com&gt;
Cc: Serge Hallyn &lt;serge@hallyn.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: restructure cgroup_procs_write_permission()</title>
<updated>2017-06-28T18:45:02+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-06-25T04:27:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=824ecbe01c5d833b8c8a371c209e3ac3a76cd18a'/>
<id>824ecbe01c5d833b8c8a371c209e3ac3a76cd18a</id>
<content type='text'>
Restructure cgroup_procs_write_permission() to make extending
permission logic easier.

This patch doesn't cause any functional changes.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Restructure cgroup_procs_write_permission() to make extending
permission logic easier.

This patch doesn't cause any functional changes.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: fix lockdep warning in debug controller</title>
<updated>2017-06-14T20:01:41+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-06-14T20:01:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b6053d40e3387c916d56d319ba6bfdb2c42a67c7'/>
<id>b6053d40e3387c916d56d319ba6bfdb2c42a67c7</id>
<content type='text'>
The debug controller grabs cgroup_mutex from interface file show
functions which can deadlock and triggers lockdep warnings.  Fix it by
using cgroup_kn_lock_live()/cgroup_kn_unlock() instead.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The debug controller grabs cgroup_mutex from interface file show
functions which can deadlock and triggers lockdep warnings.  Fix it by
using cgroup_kn_lock_live()/cgroup_kn_unlock() instead.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: refactor cgroup_masks_read() in the debug controller</title>
<updated>2017-06-14T20:01:36+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-06-14T20:01:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2866c0b4cf25274040f0b4cc045ad1f44b885dd8'/>
<id>2866c0b4cf25274040f0b4cc045ad1f44b885dd8</id>
<content type='text'>
Factor out cgroup_masks_read_one() out of cgroup_masks_read() for
simplicity.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Factor out cgroup_masks_read_one() out of cgroup_masks_read() for
simplicity.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: make debug an implicit controller on cgroup2</title>
<updated>2017-06-14T20:01:32+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-06-14T20:01:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8cc38fa7fa317d44710f24475576b1f9ee205da9'/>
<id>8cc38fa7fa317d44710f24475576b1f9ee205da9</id>
<content type='text'>
Make debug an implicit controller on cgroup2 which is enabled by
"cgroup_debug" boot param.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make debug an implicit controller on cgroup2 which is enabled by
"cgroup_debug" boot param.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: Make debug cgroup support v2 and thread mode</title>
<updated>2017-06-14T20:01:21+00:00</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2017-06-13T21:18:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=575313f40ff33d0c2aff2701dfb2ccfcd6211d55'/>
<id>575313f40ff33d0c2aff2701dfb2ccfcd6211d55</id>
<content type='text'>
Besides supporting cgroup v2 and thread mode, the following changes
are also made:
 1) current_* cgroup files now resides only at the root as we don't
    need duplicated files of the same function all over the cgroup
    hierarchy.
 2) The cgroup_css_links_read() function is modified to report
    the number of tasks that are skipped because of overflow.
 3) The number of extra unaccounted references are displayed.
 4) The current_css_set_read() function now prints out the addresses of
    the css'es associated with the current css_set.
 5) A new cgroup_subsys_states file is added to display the css objects
    associated with a cgroup.
 6) A new cgroup_masks file is added to display the various controller
    bit masks in the cgroup.

tj: Dropped thread mode related information for now so that debug
    controller changes aren't blocked on the thread mode.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Besides supporting cgroup v2 and thread mode, the following changes
are also made:
 1) current_* cgroup files now resides only at the root as we don't
    need duplicated files of the same function all over the cgroup
    hierarchy.
 2) The cgroup_css_links_read() function is modified to report
    the number of tasks that are skipped because of overflow.
 3) The number of extra unaccounted references are displayed.
 4) The current_css_set_read() function now prints out the addresses of
    the css'es associated with the current css_set.
 5) A new cgroup_subsys_states file is added to display the css objects
    associated with a cgroup.
 6) A new cgroup_masks file is added to display the various controller
    bit masks in the cgroup.

tj: Dropped thread mode related information for now so that debug
    controller changes aren't blocked on the thread mode.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: Make Kconfig prompt of debug cgroup more accurate</title>
<updated>2017-06-14T20:01:21+00:00</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2017-06-13T21:18:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=23b0be480f341db26ce0dee7d3f6e67f8e0e166f'/>
<id>23b0be480f341db26ce0dee7d3f6e67f8e0e166f</id>
<content type='text'>
The Kconfig prompt and description of the debug cgroup controller
more accurate by saying that it is for debug purpose only and its
interfaces are unstable.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The Kconfig prompt and description of the debug cgroup controller
more accurate by saying that it is for debug purpose only and its
interfaces are unstable.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: Move debug cgroup to its own file</title>
<updated>2017-06-14T20:01:21+00:00</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2017-06-13T21:18:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a28f8f5e995fe5964ae304444913536058f26e37'/>
<id>a28f8f5e995fe5964ae304444913536058f26e37</id>
<content type='text'>
The debug cgroup currently resides within cgroup-v1.c and is enabled
only for v1 cgroup. To enable the debug cgroup also for v2, it makes
sense to put the code into its own file as it will no longer be v1
specific. There is no change to the debug cgroup specific code.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The debug cgroup currently resides within cgroup-v1.c and is enabled
only for v1 cgroup. To enable the debug cgroup also for v2, it makes
sense to put the code into its own file as it will no longer be v1
specific. There is no change to the debug cgroup specific code.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
