Fair Scheduler is a built-in scheduler in Apache YARN. It fairly allocates resources to the applications that run on YARN. Each queue receives resources in proportion to its configured weight.

Prerequisites

An EMR Hadoop cluster is created. For more information, see Create a cluster.

Precautions

After you turn on Enable Resource Queue, you can no longer configure cluster resources on the fair-scheduler tab in the Service Configuration section of the Configure tab on the YARN service page. Existing configurations are synchronized to the Cluster Resources page. If you want to configure cluster resources on the Configure tab of the YARN service page, turn off Enable Resource Queue on the Cluster Resources page.

Configure Fair Scheduler

  1. Go to the Cluster Resources page.
    1. Log on to the Alibaba Cloud EMR console.
    2. In the top navigation bar, select the region where your cluster resides. The region of a cluster cannot be changed after the cluster is created.
    3. Click the Cluster Management tab.
    4. On the Cluster Management page, find your cluster and click Details in the Actions column.
    5. In the left-side navigation pane of the page that appears, click Cluster Resources.
  2. Enable Fair Scheduler.
    1. On the Cluster Resources page, turn on Enable Resource Queue.
    2. Select Fair Scheduler for Queue Type.
      Note If you use the cluster resource management feature for the first time, the Queue Type parameter is set to Capacity Scheduler by default.
    3. Click Save.
  3. In the upper-right corner of the Cluster Resources page, click Queue Settings.
  4. Configure queue information on the Queue Settings tab.
    • Find your queue and click Edit in the Actions column to modify resource queue information.
    • Find your queue and choose More > Create Child Queue in the Actions column to create a child queue.

      You cannot create a child queue for the default queue.

    root is a level-1 queue. It is the parent queue of all other queues and manages all resources of YARN. By default, only the default queue is available within the root queue.

    Notice Parameters of a parent queue have a higher priority than those of its child queues because the queue configurations are nested. If the resource usage configured for a child queue is higher than that of the parent queue, the scheduler allocates resources to the child queue based on the parameter settings of the parent queue.
    • Queue Name is required. A queue name cannot contain periods (.).
    • Weight is required. Resources are considered fairly allocated when the ratio of the resources used by sibling queues matches the ratio of their weights. Weights take effect on level-2, level-3, and lower-level queues. For example, a parent queue has two child queues, one with a weight of 2 and the other with a weight of 1. If tasks are running in both child queues and their resource usage reaches a ratio of 2:1, resources are fairly allocated.
    • Maximum Resources specifies the maximum number of vCores and the maximum memory space that can be allocated to a queue. The values of this parameter must be greater than the values of Minimum Resources and should not exceed the resource scale that the YARN service can provide. If a value of the Maximum Resources parameter is greater than the resource scale, the resource scale takes effect for the queue. For example, the number of vCores that the YARN service can provide is 16, but the value of vCores for Maximum Resources is 20. In this case, a maximum of 16 vCores can be allocated to the queue.
    • If you do not specify a queue when you submit an application, the job is submitted to the default queue.
    • If you do not restart ResourceManager after you modify the name of a child queue, tasks can still be submitted to the original queue. However, the queue configuration is no longer displayed in the EMR console. After you restart ResourceManager, the original queue becomes unavailable.
    • After you delete a queue that is not a level-2 queue, click Deploy to make the modification take effect. After you delete a level-2 queue, choose Actions > RestartResourceManager in the upper-right corner of the Cluster Resources page to make the modification take effect.
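
The Weight and Maximum Resources rules described above can be checked with simple arithmetic. The following is a minimal sketch for illustration only. The function names fair_share and effective_max are hypothetical and are not part of EMR or YARN. The sketch uses integer arithmetic and ignores Minimum Resources and the actual demand of each queue, both of which the real scheduler may take into account.

```shell
# Hypothetical helpers for illustration only; not an EMR or YARN tool.

# fair_share TOTAL WEIGHT WEIGHT_SUM
# Prints the portion of TOTAL that a queue with WEIGHT receives when the
# weights of all sibling queues add up to WEIGHT_SUM (integer division).
fair_share() {
  local total=$1 weight=$2 weight_sum=$3
  echo $(( total * weight / weight_sum ))
}

# effective_max CONFIGURED AVAILABLE
# Prints the smaller of the configured Maximum Resources value and the
# amount that the YARN service can provide.
effective_max() {
  local configured=$1 available=$2
  if [ "$configured" -gt "$available" ]; then
    echo "$available"
  else
    echo "$configured"
  fi
}

# Two child queues with weights 2 and 1 share the 12 vCores of a parent queue.
fair_share 12 2 3    # prints 8
fair_share 12 1 3    # prints 4

# Maximum Resources is set to 20 vCores, but YARN can provide only 16 vCores.
effective_max 20 16  # prints 16
```

In this sketch, the 8:4 split matches the 2:1 weight ratio, and the 20-vCore setting is capped at the 16 vCores that are available.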

Switch the scheduler type

After Enable Resource Queue is turned on, you can perform the following steps to switch the scheduler type:
Notice After the switchover is complete, you must restart ResourceManager to make the configuration take effect.
  1. In the upper-right corner of the Cluster Resources page, click Select Scheduler.
  2. Select the required scheduler for Queue Type.
  3. Click Save.
  4. Restart ResourceManager.
    1. In the upper-right corner of the Cluster Resources page, choose Actions > RestartResourceManager.
    2. In the Cluster Activities dialog box, configure the parameters and click OK.
    3. In the Confirm message, click OK.
      When a success message appears, the scheduler type is switched.

Disable the cluster resource management feature

Note After you disable the cluster resource management feature, you cannot perform operations on the Cluster Resources page. If you want to use the cluster resource management feature again, turn on Enable Resource Queue on the Cluster Resources page or configure the xml-direct-to-file-content parameter on the fair-scheduler tab in the Service Configuration section of the Configure tab on the YARN service page.
  1. On the Cluster Resources page, turn off Enable Resource Queue.
  2. In the Disable Resource Queue dialog box, click OK.
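
If you return to configuring queues on the fair-scheduler tab, it can help to know what a Fair Scheduler allocation file looks like. The following sketch writes a sample file in the allocation file format of Apache Hadoop. The queue name test, the resource values, and the temporary file path are assumptions for illustration. This topic does not confirm that the xml-direct-to-file-content parameter expects exactly this content, so verify the value on your cluster before you apply it.

```shell
# Write a sample allocation file to a temporary path (the path is arbitrary).
alloc_file="$(mktemp)"

cat > "$alloc_file" <<'EOF'
<?xml version="1.0"?>
<allocations>
  <!-- Hypothetical child queue of root; values are examples only. -->
  <queue name="test">
    <weight>2.0</weight>
    <minResources>2048 mb,2 vcores</minResources>
    <maxResources>16384 mb,16 vcores</maxResources>
  </queue>
</allocations>
EOF

# Display the file so that its content can be reviewed or copied.
cat "$alloc_file"
```

The weight, minResources, and maxResources elements correspond to the Weight, Minimum Resources, and Maximum Resources settings described in the Configure Fair Scheduler section.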

Submit a job

To submit a job to a specific queue, use the mapreduce.job.queuename parameter to specify the queue. If you do not specify this parameter, the job is submitted to the default queue. The following example submits a job to the test queue:
`hadoop jar /usr/lib/hadoop-current/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.5.jar pi -Dmapreduce.job.queuename=test 2 2`
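
If you submit jobs to different queues frequently, a small wrapper can make the queue explicit. The following is a minimal sketch. The function name build_pi_cmd is hypothetical, and the JAR path is taken from the preceding example. The function only prints the command so that you can review it before you run it on a cluster node.

```shell
# build_pi_cmd QUEUE
# Prints the hadoop command that submits the pi example job to QUEUE.
# If QUEUE is empty, the -D option is omitted. As described in this topic,
# a job that does not specify a queue is submitted to the default queue.
build_pi_cmd() {
  local queue=$1
  local jar=/usr/lib/hadoop-current/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.5.jar
  if [ -n "$queue" ]; then
    echo "hadoop jar $jar pi -Dmapreduce.job.queuename=$queue 2 2"
  else
    echo "hadoop jar $jar pi 2 2"
  fi
}

# Print the command for the test queue, then the command without a queue.
build_pi_cmd test
build_pi_cmd ""
```

The -D option is placed before the positional arguments of the pi job, as in the preceding example, so that it is parsed as a generic Hadoop option.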