YARN supports service-level and queue-level authorization.

Go to the Configure tab for YARN

  1. Log on to the Alibaba Cloud EMR console.
  2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
  3. Click the Cluster Management tab.
  4. On the Cluster Management page, find your cluster and click Details in the Actions column.
  5. In the left-side navigation pane, choose Cluster Service > YARN.
  6. Click the Configure tab.

Service-level authorization

For more information, see the Hadoop documentation.

  • Service-level authorization controls the permissions of specified users on cluster services, such as job submissions.
  • Authorization policies are configured in the hadoop-policy.xml file.
  • Service-level authorization is checked before any other authorization, such as HDFS permission checks and the ACL check for submitting a job to a YARN queue.
Note: If you have enabled HDFS permission checks or configured access control on YARN queue resources, service-level authorization is optional. You can still configure it based on your business requirements.
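
As a rough illustration of the points above: in open source Hadoop, service-level authorization is switched on with hadoop.security.authorization in core-site.xml, and each protocol has its own ACL entry in hadoop-policy.xml. The sketch below restricts who can contact the ResourceManager to submit and manage applications. The users alice and bob and the group analysts are hypothetical, and your EMR version may expose these settings differently in the console.

```xml
<!-- core-site.xml: turn on service-level authorization (assumption: not already enabled). -->
<property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
</property>

<!-- hadoop-policy.xml: who may talk to the ResourceManager to submit and manage applications. -->
<property>
    <name>security.applicationclient.protocol.acl</name>
    <!-- Hypothetical users alice and bob, plus hypothetical group analysts. -->
    <value>alice,bob analysts</value>
</property>
```

A value of * allows everyone, which is the usual default for entries in hadoop-policy.xml.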

Queue-level authorization

YARN uses queues to manage permissions on resources. The following types of queue schedulers are supported: Capacity Scheduler and Fair Scheduler.

In this section, Capacity Scheduler is used as an example.

  • Configure authorization parameters

    You can grant users permission to submit jobs to a queue and permission to manage a queue.

    Note
    • An access control list (ACL) value lists the users and groups that are granted a permission. Separate multiple users or multiple groups with commas (,), and separate the user list from the group list with a single space. To grant the permission to no one, set the parameter to a single space.
    • ACL inheritance: If a user or group can submit applications to a queue, this user or group can also submit applications to all sub-queues of this queue. Queue management ACLs are inherited in the same way. Therefore, to prevent a user or group from submitting jobs to a queue, you must configure access control for this queue and all of its parent queues.
    • yarn.acl.enable

      Specifies whether to enable support for ACLs. Set this parameter to true.

    • yarn.admin.acl
      • Specifies YARN administrators. You must specify this parameter if you want to run administrative commands such as yarn rmadmin or kill applications with yarn application -kill. Otherwise, the queue administrator ACLs do not take effect.
      • You can configure users and groups, as described in the preceding note.
        user1,user2 group1,group2 # Separate a user and a group with a space.
          group1,group2 # If you only specify groups, add a space before the first group.

        Make sure that the value includes the user has. You can add other users and groups based on your business requirements.

    • yarn.scheduler.capacity.${queue-name}.acl_submit_applications
      • Specifies users and groups that can submit jobs to a queue.
      • ${queue-name} indicates the queue name. Multi-level queues are supported. Take ACL inheritance into consideration when you configure multi-level queues.
        <!-- queue-name=root -->
        <property>
            <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
            <!-- A single space: no one can submit jobs to the root queue. -->
            <value> </value>
        </property>
        <!-- queue-name=root.testqueue -->
        <property>
            <name>yarn.scheduler.capacity.root.testqueue.acl_submit_applications</name>
            <!-- Only user test and group testgrp can submit jobs to queue testqueue. -->
            <value>test testgrp</value>
        </property>
    • yarn.scheduler.capacity.${queue-name}.acl_administer_queue
      • Specifies users and groups that can manage a queue, for example, to kill jobs in the queue.
      • ${queue-name} indicates the queue name. Multi-level queues are supported. Take ACL inheritance into consideration when you configure multi-level queues.
        <!-- queue-name=root -->
        <property>
            <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
            <value> </value>
        </property>
        <!-- queue-name=root.testqueue -->
        <property>
            <name>yarn.scheduler.capacity.root.testqueue.acl_administer_queue</name>
            <value>test testgrp</value>
        </property>
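
    To make the inheritance rule concrete, here is a sketch for a two-level hierarchy. The queue names root.dev and root.dev.etl, the user etluser, and the group etlgrp are all hypothetical. Because a permission granted on a parent queue also applies to its sub-queues, root and root.dev are both set to a single space so that only the ACL on root.dev.etl decides who can submit jobs.

    ```xml
    <!-- Parent queues: a single space grants the permission to no one. -->
    <property>
        <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
        <value> </value>
    </property>
    <property>
        <name>yarn.scheduler.capacity.root.dev.acl_submit_applications</name>
        <value> </value>
    </property>
    <!-- Leaf queue: only hypothetical user etluser and hypothetical group etlgrp can submit. -->
    <property>
        <name>yarn.scheduler.capacity.root.dev.etl.acl_submit_applications</name>
        <value>etluser etlgrp</value>
    </property>
    ```

    If root.dev were instead granted to a group, that group's members could also submit jobs to root.dev.etl, regardless of the ACL on the leaf queue.
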
  • Restart YARN
    • By default, support for ACLs is enabled in EMR clusters deployed in Kerberos security mode. You can directly configure ACL permissions on queues as needed.
    • For other clusters, you must enable support for ACLs, configure permissions on queues, and then restart the YARN service.
      1. After you configure the YARN authorization parameters, choose Actions > Restart All Components in the upper-right corner of the YARN page.
      2. In the Cluster Activities dialog box, configure relevant parameters and click OK. In the Confirm message, click OK.

        Click History in the upper-right corner to view the task progress.

  • Example
    • yarn-site.xml
      Key               Value
      yarn.acl.enable   true
      yarn.admin.acl    has
    • capacity-scheduler.xml
      • Queue default: No one is allowed to submit jobs to default or manage this queue.
      • Queue q1: Only user test is allowed to submit jobs to q1 or manage this queue, for example, to kill jobs in the queue.
      • Queue q2: Only user foo is allowed to submit jobs to q2 or manage this queue.
    <configuration>
        <property>
            <name>yarn.scheduler.capacity.maximum-applications</name>
            <value>10000</value>
            <description>Maximum number of applications that can be pending and running.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
            <value>0.25</value>
            <description>Maximum percent of resources in the cluster which can be used to run application masters i.e.
                controls number of concurrent running applications.
            </description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.resource-calculator</name>
            <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.queues</name>
            <value>default,q1,q2</value>
            <!-- 3 queues -->
            <description>The queues at this level (root is the root queue).</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.capacity</name>
            <value>0</value>
            <description>Default queue target capacity.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
            <value>1</value>
            <description>Default queue user limit a percentage from 0.0 to 1.0.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
            <value>100</value>
            <description>The maximum capacity of the default queue.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.state</name>
            <value>STOPPED</value>
            <!-- Set the state of default to STOPPED. -->
            <description>The state of the default queue. State can be one of RUNNING or STOPPED.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
            <value> </value>
            <!-- No one is allowed to submit jobs to default. -->
            <description>The ACL of who can submit jobs to the default queue.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
            <value> </value>
            <!-- No one is allowed to manage default. -->
            <description>The ACL of who can administer jobs on the default queue.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.node-locality-delay</name>
            <value>40</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.queue-mappings</name>
            <value>u:test:q1,u:foo:q2</value>
            <!-- Queue mapping. User test is automatically mapped to q1, and user foo to q2. -->
            <description>A list of mappings that will be used to assign jobs to queues. The syntax for this list is
                [u|g]:[name]:[queue_name][,next mapping]* Typically this list will be used to map users to queues, for
                example, u:%user:%user maps all users to queues with the same name as the user.
            </description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
            <value>true</value>
            <!-- Specifies whether the preceding queue mapping overrides the queue specified by the client. -->
            <description>If a queue mapping is present, will it override the value specified by the user? This can be used
                by administrators to place jobs in queues that are different than the one specified by the user. The default
                is false.
            </description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
            <value> </value>
            <!-- ACL inheritance. Permissions must be configured for the parent queues. -->
            <description>
                The ACL of who can submit jobs to the root queue.
            </description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q1.acl_submit_applications</name>
            <value>test</value>
            <!-- Only user test is allowed to submit jobs to q1. -->
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q2.acl_submit_applications</name>
            <value>foo</value>
            <!-- Only user foo is allowed to submit jobs to q2. -->
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q1.maximum-capacity</name>
            <value>100</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q2.maximum-capacity</name>
            <value>100</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q1.capacity</name>
            <value>50</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q2.capacity</name>
            <value>50</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
            <value> </value>
            <!-- ACL inheritance. Permissions must be configured for the parent queues. -->
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q1.acl_administer_queue</name>
            <value>test</value>
            <!-- Only user test is allowed to manage q1, for example, to kill jobs in the queue. -->
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q2.acl_administer_queue</name>
            <value>foo</value>
            <!-- Only user foo is allowed to manage q2, for example, to kill jobs in the queue. -->
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q1.state</name>
            <value>RUNNING</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q2.state</name>
            <value>RUNNING</value>
        </property>
    </configuration>
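
    The queue-mappings syntax described in the configuration above also accepts group mappings and the %user placeholder. The following sketch is a variation on the example, not part of it. It reuses the group name testgrp from the earlier ACL example and assumes that a queue named after each remaining user exists.

    ```xml
    <property>
        <name>yarn.scheduler.capacity.queue-mappings</name>
        <!-- Evaluated left to right; the first matching entry is used.
             u:test:q1      user test goes to q1
             g:testgrp:q2   members of group testgrp go to q2
             u:%user:%user  any other user goes to a queue with the same name as the user -->
        <value>u:test:q1,g:testgrp:q2,u:%user:%user</value>
    </property>
    ```

    A mapping only selects the target queue. The user still needs permission to submit jobs to that queue through acl_submit_applications.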