diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2015-06-26 19:50:04 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2015-06-26 19:50:04 -0700 |
commit | bbe179f88d39274630823a0dc07d2714fd19a103 (patch) | |
tree | f70181a660e0f859f230233643faded7d44360e5 /Documentation/cgroups | |
parent | 4b703b1d4c46ca4a00109ca1a391943ec21991b3 (diff) | |
parent | 8a0792ef8e01f03cb43806c6a87738bde34df713 (diff) |
Merge branch 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- threadgroup_lock got reorganized so that its users can pick the
actual locking mechanism to use. Its only user - cgroups - is
updated to use a percpu_rwsem instead of per-process rwsem.
This makes things a bit lighter on hot paths and allows cgroups to
perform and fail multi-task (a process) migrations atomically.
Multi-task migrations are used in several places including the
unified hierarchy.
- Delegation rule and documentation added to unified hierarchy. This
will likely be the last interface update from the cgroup core side
for unified hierarchy before lifting the devel mask.
- Some groundwork for the pids controller which is scheduled to be
merged in the coming devel cycle.
* 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: add delegation section to unified hierarchy documentation
cgroup: require write perm on common ancestor when moving processes on the default hierarchy
cgroup: separate out cgroup_procs_write_permission() from __cgroup_procs_write()
kernfs: make kernfs_get_inode() public
MAINTAINERS: add a cgroup core co-maintainer
cgroup: fix uninitialised iterator in for_each_subsys_which
cgroup: replace explicit ss_mask checking with for_each_subsys_which
cgroup: use bitmask to filter for_each_subsys
cgroup: add seq_file forward declaration for struct cftype
cgroup: simplify threadgroup locking
sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem
sched, cgroup: reorganize threadgroup locking
cgroup: switch to unsigned long for bitmasks
cgroup: reorganize include/linux/cgroup.h
cgroup: separate out include/linux/cgroup-defs.h
cgroup: fix some comment typos
Diffstat (limited to 'Documentation/cgroups')
-rw-r--r-- | Documentation/cgroups/unified-hierarchy.txt | 102 |
1 files changed, 84 insertions, 18 deletions
diff --git a/Documentation/cgroups/unified-hierarchy.txt b/Documentation/cgroups/unified-hierarchy.txt index eb102fb72213..86847a7647ab 100644 --- a/Documentation/cgroups/unified-hierarchy.txt +++ b/Documentation/cgroups/unified-hierarchy.txt @@ -17,15 +17,18 @@ CONTENTS 3. Structural Constraints 3-1. Top-down 3-2. No internal tasks -4. Other Changes - 4-1. [Un]populated Notification - 4-2. Other Core Changes - 4-3. Per-Controller Changes - 4-3-1. blkio - 4-3-2. cpuset - 4-3-3. memory -5. Planned Changes - 5-1. CAP for resource control +4. Delegation + 4-1. Model of delegation + 4-2. Common ancestor rule +5. Other Changes + 5-1. [Un]populated Notification + 5-2. Other Core Changes + 5-3. Per-Controller Changes + 5-3-1. blkio + 5-3-2. cpuset + 5-3-3. memory +6. Planned Changes + 6-1. CAP for resource control 1. Background @@ -245,9 +248,72 @@ cgroup must create children and transfer all its tasks to the children before enabling controllers in its "cgroup.subtree_control" file. -4. Other Changes +4. Delegation -4-1. [Un]populated Notification +4-1. Model of delegation + +A cgroup can be delegated to a less privileged user by granting write +access of the directory and its "cgroup.procs" file to the user. Note +that the resource control knobs in a given directory concern the +resources of the parent and thus must not be delegated along with the +directory. + +Once delegated, the user can build sub-hierarchy under the directory, +organize processes as it sees fit and further distribute the resources +it got from the parent. The limits and other settings of all resource +controllers are hierarchical and regardless of what happens in the +delegated sub-hierarchy, nothing can escape the resource restrictions +imposed by the parent. + +Currently, cgroup doesn't impose any restrictions on the number of +cgroups in or nesting depth of a delegated sub-hierarchy; however, +this may in the future be limited explicitly. + + +4-2. Common ancestor rule + +On the unified hierarchy, to write to a "cgroup.procs" file, in +addition to the usual write permission to the file and uid match, the +writer must also have write access to the "cgroup.procs" file of the +common ancestor of the source and destination cgroups. This prevents +delegatees from smuggling processes across disjoint sub-hierarchies. + +Let's say cgroups C0 and C1 have been delegated to user U0 who created +C00, C01 under C0 and C10 under C1 as follows. + + ~~~~~~~~~~~~~ - C0 - C00 + ~ cgroup ~ \ C01 + ~ hierarchy ~ + ~~~~~~~~~~~~~ - C1 - C10 + +C0 and C1 are separate entities in terms of resource distribution +regardless of their relative positions in the hierarchy. The +resources the processes under C0 are entitled to are controlled by +C0's ancestors and may be completely different from C1. It's clear +that the intention of delegating C0 to U0 is allowing U0 to organize +the processes under C0 and further control the distribution of C0's +resources. + +On traditional hierarchies, if a task has write access to "tasks" or +"cgroup.procs" file of a cgroup and its uid agrees with the target, it +can move the target to the cgroup. In the above example, U0 will not +only be able to move processes in each sub-hierarchy but also across +the two sub-hierarchies, effectively allowing it to violate the +organizational and resource restrictions implied by the hierarchical +structure above C0 and C1. + +On the unified hierarchy, let's say U0 wants to write the pid of a +process which has a matching uid and is currently in C10 into +"C00/cgroup.procs". U0 obviously has write access to the file and +migration permission on the process; however, the common ancestor of +the source cgroup C10 and the destination cgroup C00 is above the +points of delegation and U0 would not have write access to its +"cgroup.procs" and thus be denied with -EACCES. + + +5. Other Changes + +5-1. [Un]populated Notification cgroup users often need a way to determine when a cgroup's subhierarchy becomes empty so that it can be cleaned up. cgroup @@ -289,7 +355,7 @@ supported and the interface files "release_agent" and "notify_on_release" do not exist. -4-2. Other Core Changes +5-2. Other Core Changes - None of the mount options is allowed. @@ -306,14 +372,14 @@ supported and the interface files "release_agent" and - The "cgroup.clone_children" file is removed. -4-3. Per-Controller Changes +5-3. Per-Controller Changes -4-3-1. blkio +5-3-1. blkio - blk-throttle becomes properly hierarchical. -4-3-2. cpuset +5-3-2. cpuset - Tasks are kept in empty cpusets after hotplug and take on the masks of the nearest non-empty ancestor, instead of being moved to it. @@ -322,7 +388,7 @@ supported and the interface files "release_agent" and masks of the nearest non-empty ancestor. -4-3-3. memory +5-3-3. memory - use_hierarchy is on by default and the cgroup file for the flag is not created. @@ -407,9 +473,9 @@ supported and the interface files "release_agent" and memory.low, memory.high, and memory.max will use the string "max" to indicate and set the highest possible value. -5. Planned Changes +6. Planned Changes -5-1. CAP for resource control +6-1. CAP for resource control Unified hierarchy will require one of the capabilities(7), which is yet to be decided, for all resource control related knobs. Process |