diff options
Diffstat (limited to 'Documentation/cgroup-v2.txt')
-rw-r--r-- | Documentation/cgroup-v2.txt | 103 |
1 files changed, 96 insertions, 7 deletions
diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt index 4cc07ce3b8dd..3b8449f8ac7e 100644 --- a/Documentation/cgroup-v2.txt +++ b/Documentation/cgroup-v2.txt @@ -47,6 +47,12 @@ CONTENTS 5-3. IO 5-3-1. IO Interface Files 5-3-2. Writeback + 5-4. PID + 5-4-1. PID Interface Files + 5-5. RDMA + 5-5-1. RDMA Interface Files + 5-6. Misc + 5-6-1. perf_event 6. Namespace 6-1. Basics 6-2. The Root and Views @@ -328,14 +334,12 @@ a process with a non-root euid to migrate a target process into a cgroup by writing its PID to the "cgroup.procs" file, the following conditions must be met. -- The writer's euid must match either uid or suid of the target process. - - The writer must have write access to the "cgroup.procs" file. - The writer must have write access to the "cgroup.procs" file of the common ancestor of the source and destination cgroups. -The above three constraints ensure that while a delegatee may migrate +The above two constraints ensure that while a delegatee may migrate processes around freely in the delegated sub-hierarchy it can't pull in from or push out to outside the sub-hierarchy. @@ -350,10 +354,10 @@ all processes under C0 and C1 belong to U0. Let's also say U0 wants to write the PID of a process which is currently in C10 into "C00/cgroup.procs". U0 has write access to the -file and uid match on the process; however, the common ancestor of the -source cgroup C10 and the destination cgroup C00 is above the points -of delegation and U0 would not have write access to its "cgroup.procs" -files and thus the write will be denied with -EACCES. +file; however, the common ancestor of the source cgroup C10 and the +destination cgroup C00 is above the points of delegation and U0 would +not have write access to its "cgroup.procs" files and thus the write +will be denied with -EACCES. 2-6. Guidelines @@ -1119,6 +1123,91 @@ writeback as follows. vm.dirty[_background]_ratio. +5-4. PID + +The process number controller is used to allow a cgroup to stop any +new tasks from being fork()'d or clone()'d after a specified limit is +reached. + +The number of tasks in a cgroup can be exhausted in ways which other +controllers cannot prevent, thus warranting its own controller. For +example, a fork bomb is likely to exhaust the number of tasks before +hitting memory restrictions. + +Note that PIDs used in this controller refer to TIDs, process IDs as +used by the kernel. + + +5-4-1. PID Interface Files + + pids.max + + A read-write single value file which exists on non-root cgroups. The + default is "max". + + Hard limit of number of processes. + + pids.current + + A read-only single value file which exists on all cgroups. + + The number of processes currently in the cgroup and its descendants. + +Organisational operations are not blocked by cgroup policies, so it is +possible to have pids.current > pids.max. This can be done by either +setting the limit to be smaller than pids.current, or attaching enough +processes to the cgroup such that pids.current is larger than +pids.max. However, it is not possible to violate a cgroup PID policy +through fork() or clone(). These will return -EAGAIN if the creation +of a new process would cause a cgroup policy to be violated. + + +5-5. RDMA + +The "rdma" controller regulates the distribution and accounting of +of RDMA resources. + +5-5-1. RDMA Interface Files + + rdma.max + A readwrite nested-keyed file that exists for all the cgroups + except root that describes current configured resource limit + for a RDMA/IB device. + + Lines are keyed by device name and are not ordered. + Each line contains space separated resource name and its configured + limit that can be distributed. + + The following nested keys are defined. + + hca_handle Maximum number of HCA Handles + hca_object Maximum number of HCA Objects + + An example for mlx4 and ocrdma device follows. + + mlx4_0 hca_handle=2 hca_object=2000 + ocrdma1 hca_handle=3 hca_object=max + + rdma.current + A read-only file that describes current resource usage. + It exists for all the cgroup except root. + + An example for mlx4 and ocrdma device follows. + + mlx4_0 hca_handle=1 hca_object=20 + ocrdma1 hca_handle=1 hca_object=23 + + +5-6. Misc + +5-6-1. perf_event + +perf_event controller, if not mounted on a legacy hierarchy, is +automatically enabled on the v2 hierarchy so that perf events can +always be filtered by cgroup v2 path. The controller can still be +moved to a legacy hierarchy after v2 hierarchy is populated. + + 6. Namespace 6-1. Basics |