E-MapReduce:Using the high-security features of YARN

Last Updated: Mar 26, 2026

When you enable Kerberos authentication at cluster creation, E-MapReduce (EMR) creates a high-security cluster with three security features automatically enabled for YARN: Kerberos authentication, ACL-based authorization, and the Linux Container Executor (LCE).

Kerberos authentication

EMR automatically configures all Kerberos-related parameters for YARN in a high-security cluster. No manual configuration is required. For background on Kerberos, see Overview.

To access the remote procedure call (RPC) or HTTP services of YARN, the client must pass Kerberos authentication first. The following examples show how to authenticate and then call each service type:

# RPC service: obtain a Kerberos ticket, call the YARN RPC API, then destroy the ticket
kinit
yarn node -list
kdestroy

# HTTP service: authenticate against the ResourceManager REST API with SPNEGO (--negotiate)
kinit
curl --negotiate -u: http://master-1-1:8088/ws/v1/cluster/nodes
kdestroy

ACL-based authorization

The access control list (ACL) feature of YARN is automatically enabled in high-security clusters: the yarn.acl.enable parameter is set to true. By default, only the hadoop user group has permission to manage the YARN service and queues and to submit YARN jobs.

Management permissions

The default value of yarn.admin.acl is hadoop (with a space before hadoop), which grants the hadoop user group service administrator rights. In most cases, EMR cluster processes run as Linux users belonging to the hadoop user group. The default user-group mapping in Hadoop is based on OS group information of each node.

The yarn.admin.acl parameter uses the format <users> <user groups>, with a single space separating the user list from the user-group list. Separate multiple users or multiple user groups with commas. Example: user1,user2 group1,group2. To specify user groups only, start the value with a space. A value consisting of a single space grants no user or user group any permissions.
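As an illustration of this format, the following yarn-site.xml fragment grants administrator rights to one user plus one user group. The names user1 and group1 are placeholders, not EMR defaults:

```xml
<!-- Hypothetical example: grant YARN admin rights to user1 and the group1 user group. -->
<property>
  <name>yarn.admin.acl</name>
  <!-- Format: "<users> <user groups>"; a single space separates the two lists. -->
  <value>user1 group1</value>
</property>
```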

Queue management permissions

Queue permissions in YARN cover two operations: submitting jobs and managing queues.

The default capacity-scheduler.xml configuration for a high-security cluster sets:

  • yarn.scheduler.capacity.root.acl_submit_applications= (the value is a single space, so no user can submit jobs)

  • yarn.scheduler.capacity.root.acl_administer_queue= hadoop (a space before hadoop, so the hadoop user group can manage queues)

Users who are not in the hadoop user group cannot submit jobs to queues. To check which user groups the current user belongs to, run id.
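The membership check above can be run directly on any cluster node with standard coreutils:

```shell
# Print the current user's UID, GID, and group memberships.
id
# Print only the group names. Under the default high-security ACLs,
# job submission succeeds only if hadoop appears in this list.
id -Gn
```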

Disable ACL

If your cluster has only a few users and ACL is not needed, clear both parameters above, then refresh the queues in the EMR console:

  • On the YARN service page, click the Status tab.

  • Find the ResourceManager component, click the More icon in the Actions column, and select refresh_queues.

  • Enter an execution reason and click OK. In the Confirm message, click OK.

Use ACL with Ranger

To manage user queue permissions visually, use ACL together with Ranger. Ranger supports only Capacity Scheduler. For configuration details, see Enable YARN in Ranger and configure related permissions.

Configure ACL for queues

To authorize specific users or user groups for individual queues, add the following parameters to capacity-scheduler.xml:

  • yarn.scheduler.capacity.root.<queue-path>.acl_submit_applications

  • yarn.scheduler.capacity.root.<queue-path>.acl_administer_queue

For parameter details, see Queue Properties and the ACL configurations of queues section in the YARN schedulers topic.
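For example, the following capacity-scheduler.xml fragment authorizes a user group on one queue. The queue name dev and the group name dev_group are placeholders; only the property-name pattern comes from the Capacity Scheduler configuration:

```xml
<!-- Hypothetical example: let the dev_group user group submit jobs to the
     root.dev queue, and let the hadoop user group administer it. -->
<property>
  <name>yarn.scheduler.capacity.root.dev.acl_submit_applications</name>
  <!-- Leading space: the value names user groups only. -->
  <value> dev_group</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.dev.acl_administer_queue</name>
  <value> hadoop</value>
</property>
```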

Queue mapping

To map users or user groups to queues automatically, configure yarn.scheduler.capacity.queue-mappings in capacity-scheduler.xml. Set yarn.scheduler.capacity.queue-mappings-override.enable to true so that mapped users or user groups can only submit jobs to their assigned queues.
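A sketch of such a mapping in capacity-scheduler.xml, using the Capacity Scheduler's u:/g: mapping syntax; the user, group, and queue names are placeholders:

```xml
<!-- Hypothetical example: map user1 to queue_a and the group1 user group to queue_b. -->
<property>
  <name>yarn.scheduler.capacity.queue-mappings</name>
  <value>u:user1:queue_a,g:group1:queue_b</value>
</property>
<property>
  <!-- Force mapped users/groups onto their assigned queues. -->
  <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
  <value>true</value>
</property>
```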

Job management permissions

YARN job management permissions include VIEW_APP and MODIFY_APP.

  • VIEW_APP: View job information and YARN component logs. By default, mapred-site.xml sets mapreduce.job.acl-view-job=*, which grants VIEW_APP to all users for MapReduce jobs. Compute engines may enforce additional VIEW_APP restrictions of their own.

  • MODIFY_APP: Modify or stop a job in YARN. Users who hold the ADMINISTER_QUEUE permission on a queue can also stop jobs in that queue.
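To tighten these defaults for MapReduce jobs, the two job ACLs can be set in mapred-site.xml. The group name ops_group is a placeholder:

```xml
<!-- Hypothetical example: restrict viewing and modifying MapReduce jobs
     to the ops_group user group (leading space: groups only). -->
<property>
  <name>mapreduce.job.acl-view-job</name>
  <value> ops_group</value>
</property>
<property>
  <name>mapreduce.job.acl-modify-job</name>
  <value> ops_group</value>
</property>
```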

LCE

Why LCE is needed

In non-high-security clusters, containers use DefaultContainerExecutor, which runs all containers under the hadoop account regardless of which user submitted the job. This means jobs from different users share the same OS identity, making it impossible to isolate them through authentication. Users can potentially access or modify YARN-related files belonging to other users, and tenants may access each other's resources.

  • DefaultContainerExecutor: Always runs containers as the hadoop account, regardless of which user submitted the job.

  • Linux Container Executor (LCE): Runs containers as the account of the user who submitted the job, based on the setuid bit.

How LCE works

High-security clusters use Linux Container Executor (LCE) to run containers under the account of the user who submitted the job, based on the setuid bit. This revokes high-risk and unnecessary permissions from containers and prevents cross-user resource access.
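In Hadoop terms, the executor is selected through NodeManager settings in yarn-site.xml. A minimal sketch using stock Hadoop property names (EMR configures these automatically in high-security clusters):

```xml
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <!-- Group that owns the setuid container-executor binary. -->
  <name>yarn.nodemanager.linux-container-executor.group</name>
  <value>hadoop</value>
</property>
```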

Prerequisites for LCE

For LCE to work, a Linux account corresponding to each job-submitting user must exist on the OS of every NodeManager node.

Recommended approach: Use the user management feature in the EMR console. This adds the user to the OpenLDAP service and maps it to a Linux account on each node via the nslcd service. The OpenLDAP service must be deployed in your cluster.

Alternative approach: Manage Linux accounts manually. Add the Linux account on each existing node and add bootstrap action scripts to make sure new nodes automatically create the account on startup.
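As a sketch of the manual approach, a bootstrap action script might check each required account on node startup. The user names in USERS are placeholders, and a real script must run useradd as root; here the command is only printed:

```shell
#!/bin/bash
# Hypothetical bootstrap sketch: ensure a Linux account exists on this node
# for each job-submitting user. USERS is a placeholder list.
USERS="alice bob"
for u in $USERS; do
  if id -u "$u" >/dev/null 2>&1; then
    echo "$u: account already exists"
  else
    # A real bootstrap action would run this as root; shown here as text only.
    echo "$u: would run: useradd --groups hadoop $u"
  fi
done
```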