Control groups (cgroups) are a Linux kernel feature that limits, accounts for, and isolates the resource usage, such as CPU, memory, and disk I/O, of a collection of processes.

Overview

YARN integrates the cgroups feature, which allows you to control the CPU usage of each container or of all containers managed by the NodeManager.

Enable the cgroups feature in YARN

Note By default, the cgroups feature is disabled in the YARN component of an E-MapReduce cluster. You can enable the feature as needed.
  1. Log on to the Alibaba Cloud E-MapReduce console.

  2. Go to the YARN page of an E-MapReduce cluster.
    1. On the E-MapReduce homepage, click the Cluster Management tab.

      The Cluster Management tab appears.

    2. Find the target cluster and click the cluster ID.
    3. On the page that appears, choose Cluster Service > YARN in the left-side navigation pane.
  3. Enable the cgroups feature.
    1. On the YARN page, choose Actions > EnabledCGroups in the upper-right corner.

      The Cluster Activities dialog box appears.

    2. Set relevant parameters as needed.
    3. Click OK.

      The Confirm dialog box appears.

    4. Click OK.
  4. Restart the cluster.
    1. Click the Cluster Management tab.
    2. Find the target cluster and choose More > Restart in the Actions column.

Modify parameters to control the CPU usage

After you enable the cgroups feature for an E-MapReduce cluster, you can set relevant parameters in the YARN component to control the CPU usage.

  1. Log on to the Alibaba Cloud E-MapReduce console.

  2. Go to the YARN page of an E-MapReduce cluster.
    1. On the E-MapReduce homepage, click the Cluster Management tab.

      The Cluster Management tab appears.

    2. Find the target cluster and click the cluster ID.
    3. On the page that appears, choose Cluster Service > YARN in the left-side navigation pane.
  3. Modify relevant parameters.
    1. On the YARN page, click the Configure tab.
    2. Modify the parameters as described in the following table.
      Parameter: yarn.nodemanager.resource.percentage-physical-cpu-limit
      Description: The maximum percentage of the node's physical CPU resources that all containers managed by the NodeManager can use in total. Default value: 100.
      Note Theoretically, the total CPU usage of all containers managed by the NodeManager does not exceed the upper limit specified by the yarn.nodemanager.resource.percentage-physical-cpu-limit parameter.

      Parameter: yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
      Description: Specifies whether to set a hard limit on the CPU usage of each container. If no hard limit is set, containers can use idle CPU resources in addition to the allocated resources. Default value: false, which specifies that containers can use idle CPU resources.
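
      To see how the first parameter translates into CPU resources, the following Python sketch (illustrative only; the total_cpu_cap function is made up for this example) computes the node-wide CPU cap for the values used later in this topic, assuming a node with four physical cores:

      def total_cpu_cap(physical_cores, cpu_limit_percent):
          # Total cores that all containers managed by the NodeManager may
          # use, per yarn.nodemanager.resource.percentage-physical-cpu-limit.
          return physical_cores * cpu_limit_percent / 100.0

      # Values used in the examples in this topic: a 4-core node with the
      # limit set to 10, 30, and 50 in turn.
      for limit in (10, 30, 50):
          print(f"limit={limit}% -> cap={total_cpu_cap(4, limit):.1f} cores")
      # limit=10% -> cap=0.4 cores
      # limit=30% -> cap=1.2 cores
      # limit=50% -> cap=2.0 cores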

Control the total CPU usage of all containers

By modifying the yarn.nodemanager.resource.percentage-physical-cpu-limit parameter, you can control the total CPU usage of all containers managed by the NodeManager.

Prepare an E-MapReduce cluster that consists of two nodes deployed with the NodeManager and one node deployed with the ResourceManager. Each node has four cores and 16 GB of memory. Set the yarn.nodemanager.resource.percentage-physical-cpu-limit parameter to control the CPU usage of the nodes deployed with the NodeManager. After you set the parameter, run a Hadoop Pi job in the YARN component to check the CPU usage. In this example, the parameter is set to 10, 30, and 50 in turn.
Note In the following figures, the %CPU value indicates the CPU usage of a single process of the test user, expressed as a percentage of one core. The %Cpu(s) value indicates the combined CPU usage of all processes of the test user, expressed as a percentage of the total CPU resources of all cores on the node.
  • Set the parameter to 10.

    cpu_10

    As shown in the preceding figure, the value of %Cpu(s) is 10.2. The sum of the %CPU values of all processes of the test user is 7% + 5.3% + 5% + 4.7% + 4.7% + 4.3% + 4.3% + 4% + 2% = 41.3%. This indicates that all processes of the test user consume about 0.413 cores, which is about 10% of the four cores on the node and matches the %Cpu(s) value.

  • Set the parameter to 30.

    cpu_30

    As shown in the preceding figure, the value of %Cpu(s) is 32.5. The sum of the %CPU values of all processes of the test user is 19% + 18.3% + 18.3% + 17% + 16.7% + 16.3% + 14.7% + 12% = 132.3%. This indicates that all processes of the test user consume about 1.323 cores, which is about 30% of the four cores on the node and is consistent with the %Cpu(s) value.

  • Set the parameter to 50.

    cpu_50

    As shown in the preceding figure, the value of %Cpu(s) is 48.7. The sum of the %CPU values of all processes of the test user is 65.1% + 60.1% + 43.5% + 20.3% + 3.7% + 2% = 194.7%. This indicates that all processes of the test user consume about 1.947 cores, which is about 50% of the four cores on the node and matches the %Cpu(s) value. The same verification arithmetic applies to all three tests, as sketched after this list.
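
The verification in each of the three tests follows the same pattern: sum the %CPU values of the test user's processes, divide the sum by the number of cores, and compare the result with the %Cpu(s) value. The following Python sketch (illustrative only) reproduces the calculation for the first test:

  # Per-process %CPU values of the test user in the first test (parameter
  # set to 10); each value is a percentage of one core.
  per_process_cpu = [7, 5.3, 5, 4.7, 4.7, 4.3, 4.3, 4, 2]
  physical_cores = 4

  total = sum(per_process_cpu)            # 41.3, that is, about 0.413 cores
  share_of_node = total / physical_cores  # about 10.3, close to the
                                          # observed %Cpu(s) value of 10.2
  print(f"sum of %CPU: {total:.1f} ({total / 100:.3f} cores)")
  print(f"share of node: {share_of_node:.1f}% of {physical_cores} cores")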

Control the CPU usage of each container

Theoretically, the total CPU usage of all containers managed by the NodeManager does not exceed the upper limit specified by the yarn.nodemanager.resource.percentage-physical-cpu-limit parameter. The NodeManager provides the share mode and the strict mode for you to manage and control the CPU usage of each container. You can set the yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage parameter to specify a mode.

  • Use the share mode

    To use the share mode, set the yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage parameter to false. This is the default value of the parameter. In the share mode, containers can use idle CPU resources in addition to the allocated resources.

    For example, set the yarn.nodemanager.resource.percentage-physical-cpu-limit parameter to 50 and the yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage parameter to false. The cluster has two nodes deployed with the NodeManager, and each node has four cores. All containers managed by the NodeManager on a node can consume up to 50% of the four cores, which equals 2 cores. The CPU resources allocated to each container are calculated as follows: (1 vcore/8 vcores) × (4 cores × 50%) = 0.25 core. In the share mode, each container can also consume idle CPU resources, up to the 2-core total. This allocation arithmetic is sketched in code after this list.

    share

    As shown in the preceding figure, the CPU resources consumed by the processes of the test user vary widely, such as 0.65 core, 0.61 core, and 0.037 core. This indicates that some containers consume much more CPU resources than the allocated 0.25 core.

  • Use the strict mode

    To use the strict mode, set the yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage parameter to true. In the strict mode, containers can only consume the allocated CPU resources, even if idle CPU resources are available.

    For example, set the yarn.nodemanager.resource.percentage-physical-cpu-limit parameter to 50 and the yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage parameter to true.

    strict

    The cluster has two nodes deployed with the NodeManager, and each node has four cores. The CPU resources allocated to each container are calculated as follows: (1 vcore/8 vcores) × (4 cores × 50%) = 0.25 core. As shown in the preceding figure, the CPU resources consumed by the processes of the test user are all close to the allocated 0.25 core, such as 0.266 core and 0.249 core.
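
To summarize the arithmetic that this topic uses for both modes, the following Python sketch (illustrative only; the container_cpu function is made up for this example) computes the per-container allocation and the effective per-container cap, assuming the cluster described above: four physical cores and eight vcores per NodeManager node, a limit of 50, and one vcore per container:

  def container_cpu(container_vcores, node_vcores, physical_cores,
                    cpu_limit_percent, strict):
      # Node-wide cap set by yarn.nodemanager.resource.percentage-physical-cpu-limit.
      node_cap = physical_cores * cpu_limit_percent / 100.0
      # Each container is allocated its vcore share of the node-wide cap.
      allocated = (container_vcores / node_vcores) * node_cap
      # Strict mode (strict-resource-usage set to true) caps a container at
      # its allocation; share mode lets it also use idle CPU up to the node cap.
      cap = allocated if strict else node_cap
      return allocated, cap

  for strict in (False, True):
      allocated, cap = container_cpu(1, 8, 4, 50, strict)
      mode = "strict" if strict else "share"
      print(f"{mode}: allocated={allocated:.2f} cores, cap={cap:.2f} cores")
  # share: allocated=0.25 cores, cap=2.00 cores
  # strict: allocated=0.25 cores, cap=0.25 cores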