YARN authorization can be divided into service-level and queue-level authorization.

Service-level authorization

For more information, see Hadoop's Service Level Authorization Guide.

  • Controls users' access to cluster services, such as job submission.
  • Is configured in hadoop-policy.xml.
  • Service-level permission checks are performed before other permission checks, such as HDFS permission checks and YARN job submission controls.
Note: If HDFS permission checks and YARN job submission controls are already set up, you can typically skip service-level authorization. Configure it only as required.
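The following fragment is a minimal sketch of what service-level authorization for job submission might look like. It assumes the standard Hadoop properties hadoop.security.authorization (in core-site.xml) and security.applicationclient.protocol.acl (in hadoop-policy.xml). The users test and foo and the group testgrp are placeholders, not values required by this topic.

```xml
<!-- core-site.xml: turn on service-level authorization (assumed to be disabled otherwise) -->
<property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
</property>

<!-- hadoop-policy.xml: who may connect to the ResourceManager as a client to submit and manage applications -->
<property>
    <name>security.applicationclient.protocol.acl</name>
    <!-- Placeholder values: a comma-separated user list, one space, then a comma-separated group list -->
    <value>test,foo testgrp</value>
</property>
```

In a setup like this, administrative users generally also need to appear in the ACL, and the ResourceManager typically has to reload the policy (for example, with yarn rmadmin -refreshServiceAcl) before the change takes effect.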

Queue-level authorization

YARN controls access to resources through queues and provides two queue schedulers: Capacity Scheduler and Fair Scheduler. This topic uses Capacity Scheduler as an example.

  • Add a configuration

    A queue has two types of permission: job submission (submitting a job to the queue) and queue management.

    Note
    • ACLs for a queue are granted to users and groups. A value consists of a comma-separated list of users, a single space, and a comma-separated list of groups, for example, user1,user2 group1,group2. A value that consists of only one space means that no one has the permission.
    • ACL inheritance: If a user or group can submit applications to a queue, the user or group can also submit applications to all of its sub-queues. The ACL for queue management is inherited in the same way. Therefore, to prevent a user or group from submitting jobs to a queue, you must restrict the job submission ACL on that queue and on all of its parent queues.
    • yarn.acl.enable

      The ACL switch. Set this parameter to true to enable ACLs.

    • yarn.admin.acl
      • Specifies the YARN administrators, who can run commands such as yarn rmadmin and yarn application -kill. This parameter must be configured. Otherwise, the queue-level administrator ACLs configured later do not take effect.
      • The value can contain users, groups, or both:
        user1,user2 group1,group2   # The user list and the group list are separated by a space.
         group1,group2              # If only groups are specified, a leading space is required.
        In an E-MapReduce cluster, the has user must be configured as an administrator.
    • yarn.scheduler.capacity.${queue-name}.acl_submit_applications
      • Specifies the users or groups that can submit jobs to this queue.
      • ${queue-name} is the name of the queue. Multi-level queue names are supported. Note that ACLs are inherited in multi-level queues. For example:
        <!-- queue-name=root -->
        <property>
            <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
            <value> </value> <!-- A single space: no one can submit jobs to the root queue -->
        </property>
        <!-- queue-name=root.testqueue -->
        <property>
            <name>yarn.scheduler.capacity.root.testqueue.acl_submit_applications</name>
            <value>test testgrp</value> <!-- Only the test user and the testgrp group can submit jobs to testqueue -->
        </property>
    • yarn.scheduler.capacity.${queue-name}.acl_administer_queue
      • Specifies the users or groups that can manage the queue, for example, kill jobs in the queue.
      • Multi-level queue names are supported. Note that ACLs are inherited in multi-level queues.
        <!-- queue-name=root -->
        <property>
            <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
            <value> </value>
        </property>
        <!-- queue-name=root.testqueue -->
        <property>
            <name>yarn.scheduler.capacity.root.testqueue.acl_administer_queue</name>
            <value>test testgrp</value>
        </property>
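The inheritance rule described in this section can be illustrated with a small sketch. The queue names root.parent and root.parent.child and the user alice are hypothetical. The fragment shows only the ACL properties and assumes that the queues themselves are defined elsewhere in capacity-scheduler.xml.

```xml
<!-- Hypothetical hierarchy: root -> parent -> child -->
<property>
    <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
    <!-- A single space: no one is granted submission at the root level.
         If this property is left unset, root defaults to *, which lets everyone submit to every queue. -->
    <value> </value>
</property>
<property>
    <name>yarn.scheduler.capacity.root.parent.acl_submit_applications</name>
    <!-- alice is granted submission on the parent queue -->
    <value>alice</value>
</property>
<property>
    <name>yarn.scheduler.capacity.root.parent.child.acl_submit_applications</name>
    <!-- A single space here does not block alice: she inherits the permission from root.parent -->
    <value> </value>
</property>
```

Under these assumptions, keeping alice out of root.parent.child would require removing her from the ACL of root.parent as well, because a permission granted on a parent queue cannot be revoked on a sub-queue.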
  • Restart the YARN service
    • Clusters with Kerberos enabled have ACLs enabled by default. Configure the queue ACLs as required.
    • For clusters without Kerberos, enable ACLs and configure the queue permissions as described above, and then restart the YARN service.
  • Configuration example
    • yarn-site.xml
      <property>
          <name>yarn.acl.enable</name>
          <value>true</value>
      </property>
      <property>
          <name>yarn.admin.acl</name>
          <value>has</value>
      </property>
    • capacity-scheduler.xml
      • default queue: Disabled. No user can submit jobs to or manage this queue.
      • q1 queue: Only the test user can submit jobs to and manage this queue (for example, kill jobs).
      • q2 queue: Only the foo user can submit jobs to and manage this queue.
    <configuration>
        <property>
            <name>yarn.scheduler.capacity.maximum-applications</name>
            <value>10000</value>
            <description>Maximum number of applications that can be pending and running.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
            <value>0.25</value>
            <description>Maximum percent of resources in the cluster which can be used to run application masters i.e.
                controls number of concurrent running applications.
            </description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.resource-calculator</name>
            <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.queues</name>
            <value>default,q1,q2</value>
            <!-- Three queues -->
            <description>The queues at this level (root is the root queue).</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.capacity</name>
            <value>0</value>
            <description>Default queue target capacity.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
            <value>1</value>
            <description>Default queue user limit a percentage from 0.0 to 1.0.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
            <value>100</value>
            <description>The maximum capacity of the default queue.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.state</name>
            <value>STOPPED</value>
            <!-- The default queue is set to STOPPED -->
            <description>The state of the default queue. State can be one of RUNNING or STOPPED.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
            <value> </value>
            <!-- The default queue does not allow job submission -->
            <description>The ACL of who can submit jobs to the default queue.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
            <value> </value>
            <!-- No user or group can manage the default queue -->
            <description>The ACL of who can administer jobs on the default queue.</description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.node-locality-delay</name>
            <value>40</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.queue-mappings</name>
            <value>u:test:q1,u:foo:q2</value>
            <!-- Queue mappings: jobs from the test user are mapped to q1, and jobs from the foo user are mapped to q2 -->
            <description>A list of mappings that will be used to assign jobs to queues. The syntax for this list is
                [u|g]:[name]:[queue_name][,next mapping]* Typically this list will be used to map users to queues, for
                example, u:%user:%user maps all users to queues with the same name as the user.
            </description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
            <value>true</value>
            <!-- Whether the queue mappings above override the queue specified by the client -->
            <description>If a queue mapping is present, will it override the value specified by the user? This can be used
                by administrators to place jobs in queues that are different than the one specified by the user. The default
                is false.
            </description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
            <value> </value>
            <!-- ACLs are inherited, so the parent (root) queue must also be restricted; otherwise the sub-queue ACLs have no effect -->
            <description>
                The ACL of who can submit jobs to the root queue.
            </description>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q1.acl_submit_applications</name>
            <value>test</value>
            <!-- Only the test user can submit jobs to q1 -->
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q2.acl_submit_applications</name>
            <value>foo</value>
            <!-- Only the foo user can submit jobs to q2 -->
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q1.maximum-capacity</name>
            <value>100</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q2.maximum-capacity</name>
            <value>100</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q1.capacity</name>
            <value>50</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q2.capacity</name>
            <value>50</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
            <value> </value>
            <!-- ACLs are inherited, so the management ACL of the parent (root) queue must also be restricted -->
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q1.acl_administer_queue</name>
            <value>test</value>
            <!-- Only the test user can manage q1, for example, kill jobs -->
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q2.acl_administer_queue</name>
            <value>foo</value>
            <!-- Only the foo user can manage q2, for example, kill jobs -->
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q1.state</name>
            <value>RUNNING</value>
        </property>
        <property>
            <name>yarn.scheduler.capacity.root.q2.state</name>
            <value>RUNNING</value>
        </property>
    </configuration>
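The example above grants permissions to individual users. As a hedged variation, the same properties could grant access by group instead. The sketch below reuses the testgrp group name from earlier in this topic as a placeholder and relies on the g: form of the queue mapping syntax given in the yarn.scheduler.capacity.queue-mappings description. Only the properties that differ are shown.

```xml
<property>
    <name>yarn.scheduler.capacity.root.q1.acl_submit_applications</name>
    <!-- Leading space: the user list is empty, so only members of testgrp can submit jobs to q1 -->
    <value> testgrp</value>
</property>
<property>
    <name>yarn.scheduler.capacity.root.q1.acl_administer_queue</name>
    <!-- Leading space: only members of testgrp can manage q1 -->
    <value> testgrp</value>
</property>
<property>
    <name>yarn.scheduler.capacity.queue-mappings</name>
    <!-- g:testgrp:q1 maps jobs from members of testgrp to q1; u:foo:q2 is unchanged -->
    <value>g:testgrp:q1,u:foo:q2</value>
</property>
```

Granting access by group avoids editing capacity-scheduler.xml each time a user joins or leaves a team, at the cost of depending on how the cluster resolves group membership.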