Control groups (cgroups) are a Linux kernel feature that limits, controls, and isolates the resource usage of a collection of processes. These resources include CPU, memory, and disk I/O.
Overview
YARN is integrated with the cgroups feature. This feature allows you to control the CPU usage of each container or of all containers managed by NodeManager.
Enable the cgroups feature in YARN
- Go to the YARN page.
  - Log on to the Alibaba Cloud EMR console.
  - Click the Cluster Management tab.
  - On the Cluster Management page, find your cluster and click its ID.
  - In the left-side navigation pane, choose .
- Enable the cgroups feature.
  - On the YARN page, choose in the upper-right corner.
  - In the Cluster Activities dialog box, configure relevant parameters as required.
  - Click OK.
  - In the Confirm message, click OK.
- Restart NodeManager.
  - Choose in the upper-right corner.
  - In the Cluster Activities dialog box, configure relevant parameters as required.
  - Click OK.
  - In the Confirm message, click OK.
Modify parameters to control the CPU usage
- Go to the YARN page.
  - Log on to the Alibaba Cloud EMR console.
  - Click the Cluster Management tab.
  - On the Cluster Management page, find your cluster and click its ID.
  - In the left-side navigation pane, choose .
- Modify relevant parameters.
  - On the YARN page, click the Configure tab.
  - Modify the parameters described in the following table.
| Parameter | Description |
| --- | --- |
| yarn.nodemanager.resource.percentage-physical-cpu-limit | The percentage at which you want to limit the total CPU usage of all containers managed by NodeManager. Default value: 100. Note: The total CPU usage of all containers managed by NodeManager does not exceed the upper limit specified by this parameter. |
| yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage | Specifies whether to set a hard limit on the CPU usage for each container. If no hard limit is set, containers can use idle CPU resources in addition to the allocated resources. Default value: false, which indicates that containers can use idle CPU resources. |
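The two parameters in the table can also be written out as Hadoop-style property entries. The following Python sketch is illustrative only: it assumes a deployment where these properties are kept in a yarn-site.xml-style file rather than edited on the Configure tab, and the values 50 and true are example settings, not recommendations. The helper name render_properties is hypothetical.

```python
import xml.etree.ElementTree as ET

# Parameter names come from the table above; the values are example settings only.
CGROUP_CPU_SETTINGS = {
    "yarn.nodemanager.resource.percentage-physical-cpu-limit": "50",
    "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage": "true",
}


def render_properties(settings):
    """Return the settings as a <configuration> block of <property> elements."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent is available in Python 3.9 and later; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_properties(CGROUP_CPU_SETTINGS))
```

Running the script prints a configuration block that contains one property element per parameter.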
Control the total CPU usage of all containers
You can modify the yarn.nodemanager.resource.percentage-physical-cpu-limit parameter to control the total CPU usage of all containers managed by NodeManager.
%CPU indicates the CPU usage of a process on a single core. %Cpu(s) indicates the CPU usage of all processes across all cores.
- Set the parameter to 10.
The preceding figure shows that the value of %Cpu(s) is 10.2, which is consistent with the %CPU values of all container processes of the test user. The sum of these values is calculated by using the following formula: 7% + 5.3% + 5% + 4.7% + 4.7% + 4.3% + 4.3% + 4% + 2% = 41.3%. This indicates that all processes of the test user consume 0.413 cores, which accounts for about 10% of the four cores on the node.
- Set the parameter to 30.
The preceding figure shows that the value of %Cpu(s) is 32.5, which is consistent with the %CPU values of all container processes of the test user. The sum of these values is calculated by using the following formula: 19% + 18.3% + 18.3% + 17% + 16.7% + 16.3% + 14.7% + 12% = 132.3%. This indicates that all processes of the test user consume 1.323 cores, which accounts for about 33% of the four cores on the node and is close to the configured limit of 30.
- Set the parameter to 50.
The preceding figure shows that the value of %Cpu(s) is 48.7, which is consistent with the %CPU values of all container processes of the test user. The sum of these values is calculated by using the following formula: 65.1% + 60.1% + 43.5% + 20.3% + 3.7% + 2% = 194.7%. This indicates that all processes of the test user consume 1.947 cores, which accounts for about 50% of the four cores on the node.
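The arithmetic in the three examples can be reproduced with a few lines of Python. This is a minimal sketch: the %CPU values are copied from the formulas above, the four-core node size is taken from the text, and the function name node_usage is a hypothetical helper.

```python
NODE_CORES = 4  # each node in the examples has four cores

# Per-process %CPU values, keyed by the configured percentage limit.
SAMPLES = {
    10: [7, 5.3, 5, 4.7, 4.7, 4.3, 4.3, 4, 2],
    30: [19, 18.3, 18.3, 17, 16.7, 16.3, 14.7, 12],
    50: [65.1, 60.1, 43.5, 20.3, 3.7, 2],
}


def node_usage(per_process_cpu, node_cores=NODE_CORES):
    """Return (cores consumed, percentage of the node's total CPU capacity)."""
    cores = sum(per_process_cpu) / 100.0  # 100 %CPU equals one full core
    return cores, cores / node_cores * 100.0


if __name__ == "__main__":
    for limit, values in SAMPLES.items():
        cores, pct = node_usage(values)
        print(f"limit={limit}: {cores:.3f} cores, {pct:.1f}% of {NODE_CORES} cores")
```

Dividing the summed %CPU by the number of cores gives roughly 10.3%, 33.1%, and 48.7%, which tracks the reported %Cpu(s) values of 10.2, 32.5, and 48.7.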
Control the CPU usage of each container
As described in the Control the total CPU usage of all containers section, the total CPU usage of all containers managed by NodeManager does not exceed the upper limit specified by the yarn.nodemanager.resource.percentage-physical-cpu-limit parameter. Within that limit, NodeManager provides two modes for controlling the CPU usage of each container: share mode and strict mode. You can set the yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage parameter to specify a mode.
- Use share mode
To use share mode, set the yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage parameter to false. This is the default value of the parameter. In share mode, containers can use idle CPU resources in addition to the allocated resources.
For example, set the yarn.nodemanager.resource.percentage-physical-cpu-limit parameter to 50 and the yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage parameter to false. The cluster has two nodes on which NodeManager is deployed, and each node has four cores. All containers managed by NodeManager on a node can consume a maximum of 50% of the four cores, which is equal to two cores. The CPU resources allocated for each container are calculated by using the following formula: (1 vCore/8 vCores) × (4 cores × 50%) = 0.25 cores. In share mode, each container can also consume idle CPU resources, up to the node-wide maximum of two cores.
The preceding figure shows that the CPU resources consumed by the container processes of the test user vary widely, such as 0.65 cores, 0.61 cores, and 0.037 cores. This indicates that some containers consume much more CPU resources than the allocated 0.25 cores.
- Use strict mode
To use strict mode, set the yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage parameter to true. In strict mode, containers can consume only the allocated CPU resources, even if idle CPU resources are available.
For example, set the yarn.nodemanager.resource.percentage-physical-cpu-limit parameter to 50 and the yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage parameter to true.
The cluster has two nodes on which NodeManager is deployed, and each node has four cores. The CPU resources allocated for each container are calculated by using the following formula: (1 vCore/8 vCores) × (4 cores × 50%) = 0.25 cores. The preceding figure shows that the CPU resources consumed by the container processes of the test user stay close to the allocated 0.25 cores, such as 0.266 cores and 0.249 cores.
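The per-container allocation used in both modes can be sketched in Python. The code assumes that the formula means a container with 1 vCore on a node that offers 8 vCores to YARN; that reading is inferred from the formula and is not stated explicitly above. The function names are hypothetical. The cfs_quota_us helper shows how a hard limit in cores generally maps to a Linux CFS quota (quota divided by period equals the number of cores), using the kernel's default period of 100,000 microseconds. The exact values that YARN writes to the cgroup are not shown in this topic.

```python
def container_cpu_share(container_vcores, node_vcores, physical_cores, limit_percent):
    """Cores allocated to one container: its vCore share of the cores YARN may use."""
    usable_cores = physical_cores * limit_percent / 100.0
    return container_vcores / node_vcores * usable_cores


def cfs_quota_us(cores, period_us=100_000):
    """CFS quota in microseconds per period that caps a cgroup at `cores` CPUs."""
    return int(round(cores * period_us))


def exceeds_allocation(observed_cores, allocated_cores, tolerance=0.1):
    """True if observed usage is more than `tolerance` (relative) above the allocation."""
    return observed_cores > allocated_cores * (1 + tolerance)


if __name__ == "__main__":
    allocated = container_cpu_share(1, 8, 4, 50)
    print(f"allocated per container: {allocated} cores")
    print(f"equivalent hard cap: {cfs_quota_us(allocated)} us per 100000 us period")
    # Observed values quoted in the share mode and strict mode examples.
    for mode, observed in (("share", [0.65, 0.61, 0.037]), ("strict", [0.266, 0.249])):
        flags = [exceeds_allocation(value, allocated) for value in observed]
        print(mode, list(zip(observed, flags)))
```

With the example settings the allocation is 0.25 cores. The share mode samples of 0.65 and 0.61 cores are well above that figure, whereas both strict mode samples stay within the 10% tolerance chosen here for illustration.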