DataWorks: Map tenant members to EMR accounts

Last Updated: Mar 26, 2026

Map a DataWorks tenant member account to an E-MapReduce (EMR) cluster account so that tasks submitted from DataWorks run under the correct cluster identity.

How it works

When DataWorks submits a task to an EMR cluster, it authenticates using a cluster account. The account used depends on the access identity configured when you registered the EMR cluster:

  • Cluster Account Mapped to Account of Task Owner or Cluster Account Mapped to RAM User — tasks run as a RAM user

  • Cluster Account Mapped to Alibaba Cloud Account — tasks run as an Alibaba Cloud account

Without a mapping, DataWorks falls back to default behavior, which works only in limited cases:

  • RAM users: DataWorks looks for a system account in the EMR cluster with the same name as the RAM user. If LDAP or Kerberos authentication is not enabled for the EMR cluster, you must configure a mapping between the RAM user and the system account of the EMR cluster; otherwise, tasks fail.

  • Alibaba Cloud accounts: there is no default. You must always configure a mapping manually, regardless of whether LDAP or Kerberos authentication is enabled.
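The fallback rules above can be sketched as a small lookup. This is only an illustration of the documented behavior, not DataWorks code: the function name, the identity constants, and the dictionary and set inputs are all hypothetical.

```python
from typing import Mapping, Optional, Set

# Hypothetical labels for the two access identities described above.
RAM_USER = "ram_user"
ALIBABA_CLOUD_ACCOUNT = "alibaba_cloud_account"


def resolve_cluster_account(
    identity_type: str,
    member_name: str,
    mappings: Mapping[str, str],
    system_accounts: Set[str],
) -> Optional[str]:
    """Return the cluster account a task would run as, or None if it would fail."""
    # A configured mapping is used whenever one exists.
    if member_name in mappings:
        return mappings[member_name]
    # Fallback for RAM users only: a system account with the same name.
    if identity_type == RAM_USER and member_name in system_accounts:
        return member_name
    # Alibaba Cloud accounts have no fallback; unmatched RAM users fail too.
    return None
```

In this model, an unmapped RAM user succeeds only if a same-named system account exists, while an unmapped Alibaba Cloud account never resolves.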

Prerequisites

Before you begin, make sure you have:

  • An EMR cluster registered as a computing resource in your DataWorks workspace

  • One of the following roles or permissions (see Who can configure mappings)

Who can configure mappings

Your ability to configure mappings for other members depends on your role.

Role | Can configure mappings for
Alibaba Cloud account | All workspace members
RAM user or RAM role with AliyunDataWorksFullAccess and AliyunEMRFullAccess policies | All workspace members
RAM user or RAM role assigned the Workspace Administrator role and the AliyunEMRFullAccess policy | All workspace members
Any other member | Themselves only
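As a rough aid for reading the table, the rules can be written as a predicate. The function and parameter names here are hypothetical, and the check only mirrors the table rows; real permission evaluation happens in RAM and DataWorks.

```python
from typing import Set

# Policy names as they appear in the table above.
DATAWORKS_FULL_ACCESS = "AliyunDataWorksFullAccess"
EMR_FULL_ACCESS = "AliyunEMRFullAccess"


def can_configure_for_all_members(
    is_alibaba_cloud_account: bool,
    policies: Set[str],
    is_workspace_administrator: bool,
) -> bool:
    """True if the caller may configure mappings for every workspace member."""
    if is_alibaba_cloud_account:
        return True
    # Both RAM rows in the table require the EMR full-access policy.
    if EMR_FULL_ACCESS not in policies:
        return False
    # The second requirement is either the DataWorks policy or the admin role.
    return DATAWORKS_FULL_ACCESS in policies or is_workspace_administrator
```

A caller for whom this returns False falls into the last row and can configure a mapping only for themselves.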

Usage notes

Authentication constraints

Do not configure a mapping for an EMR cluster that has both LDAP authentication and Kerberos authentication enabled. Tasks will fail if you do.

Ranger authorization

If Ranger authorization is enabled for an EMR cluster, add DataWorks to the cluster whitelist before developing EMR tasks. Without this, tasks fail with the error "Cannot modify spark.yarn.queue at runtime" or "Cannot modify SKYNET_BIZDATE at runtime". See Add DataWorks to the EMR cluster whitelist.

Kerberos user management

If you use Kerberos authentication, enable the Kerberos authentication service on the EMR cluster and add the task development account to the service. For details, see Configure Kerberos authentication.

Data permissions

Service-level permissions on an EMR cluster can isolate data operation access for DataWorks users. For example, use Ranger to control which operations the mapped cluster account can perform.

If Data Lake Formation (DLF) is configured as the metadata storage service and the DLF-Auth component is used for DLF data permission management, request data permissions from Security Center in the DataWorks console. For details, see DLF data access control.

Failure scenarios

Tasks fail in the following scenarios. Use this table to diagnose misconfiguration.

Mapping type | Scenario | Why tasks fail
System account mapping | A RAM user runs tasks, but no EMR cluster system account has the same name | DataWorks cannot find a matching account
System account mapping | A RAM user is mapped, but the account name or password does not match the actual EMR cluster account | Authentication fails
System account mapping | An Alibaba Cloud account runs tasks, but no mapping exists | No default fallback for Alibaba Cloud accounts
LDAP account mapping | LDAP authentication is enabled on the EMR cluster, but the mapping is not configured or is misconfigured in DataWorks | DataWorks sends the wrong credentials
Kerberos account mapping | Kerberos authentication is enabled on the EMR cluster, but the mapping is not configured or is misconfigured in DataWorks | DataWorks sends the wrong credentials
Kerberos account mapping | Kerberos mapping is configured in DataWorks, but the Kerberos authentication service is not enabled on the EMR cluster | The Kerberos service is unavailable
LDAP account mapping | LDAP mapping is configured in DataWorks, but LDAP authentication is not enabled for the relevant component in the EMR cluster | SQL tasks (Hive, Impala, Presto, Trino) fail at authentication
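The LDAP and Kerberos rows of the table describe mismatches between what the cluster has enabled and what DataWorks has configured. A minimal sketch of that cross-check follows. The function name and the "ldap" / "kerberos" / "system" labels are hypothetical, and the sketch treats "misconfigured" only as "a different mapping type is selected"; it cannot detect wrong credentials.

```python
from typing import List, Optional


def diagnose_mapping(
    ldap_enabled: bool,
    kerberos_enabled: bool,
    mapping_type: Optional[str],
) -> List[str]:
    """List the table rows that apply; mapping_type is 'system', 'ldap', 'kerberos', or None."""
    problems = []
    if ldap_enabled and mapping_type != "ldap":
        problems.append("LDAP is enabled on the cluster but no LDAP mapping is configured")
    if kerberos_enabled and mapping_type != "kerberos":
        problems.append("Kerberos is enabled on the cluster but no Kerberos mapping is configured")
    if mapping_type == "kerberos" and not kerberos_enabled:
        problems.append("Kerberos mapping is configured but the Kerberos service is not enabled")
    if mapping_type == "ldap" and not ldap_enabled:
        problems.append("LDAP mapping is configured but LDAP is not enabled for the component")
    return problems
```

An empty list means none of these four rows applies; the system account rows still need to be checked separately.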

Open the account mapping editor

  1. Log on to the DataWorks console. In the top navigation bar, select the target region. In the left navigation pane, choose More > Management Center.

  2. On the Management Center page, select the target workspace from the drop-down list and click Go to Management Center.

  3. In the left navigation pane, click Computing Resources.

  4. In the computing resource list, find the target EMR cluster and click Account Mappings. On the page that appears, click Edit Account Mappings in the upper-right corner.

Configure a mapping

On the cluster account mapping editing page, choose one of the following mapping types based on the authentication method enabled for your EMR cluster.

A mapping applies to all workspaces that have the EMR cluster registered. Modify the configuration only when your business requires it.
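The choice among the three options below depends only on which authentication method the cluster has enabled. A small sketch of that decision, assuming two boolean flags that you would determine from the EMR cluster yourself (the function name is hypothetical; the returned strings are the Mapping Type labels used in the options):

```python
def choose_mapping_type(ldap_enabled: bool, kerberos_enabled: bool) -> str:
    """Return the Mapping Type label that matches the cluster's authentication setup."""
    # Per the usage notes, a cluster with both methods enabled must not be mapped.
    if ldap_enabled and kerberos_enabled:
        raise ValueError(
            "Do not configure a mapping when both LDAP and Kerberos are enabled"
        )
    if kerberos_enabled:
        return "Kerberos Account Mapping"
    if ldap_enabled:
        return "OPEN LDAP Account Mapping"
    return "System Account Mapping"
```

For example, `choose_mapping_type(False, False)` returns the label for Option 1.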

Option 1: System account mapping

Use this option when LDAP and Kerberos authentication are not enabled for the EMR cluster.

  1. Set Configuration Mode to either:

    • Custom — define the mapping for this cluster only

    • Reference Configurations of Another Cluster — reuse an existing cluster's mapping configuration

  2. Set Mapping Type to System Account Mapping.

  3. Click Confirm.

Option 2: LDAP account mapping

Use this option when Lightweight Directory Access Protocol (LDAP) authentication is enabled for the relevant component in the EMR cluster (such as Hive, Impala, Presto, or Trino).

Important

If LDAP authentication is not enabled for the component, SQL tasks that use the mapped account will fail at authentication.

Before you configure this mapping, enable the LDAP authentication service for the relevant component in the EMR cluster.

  1. Set Configuration Mode to either:

    • Custom — define the mapping for this cluster only

    • Reference Configurations of Another Cluster — reuse an existing cluster's mapping configuration

  2. Set Mapping Type to OPEN LDAP Account Mapping.

  3. Click Confirm.

Option 3: Kerberos account mapping

Use this option when Kerberos authentication is enabled for the EMR cluster.

Before you configure this mapping, enable Kerberos on the EMR cluster.

  1. Download the authentication credentials from the EMR cluster.

  2. Click Upload Keystore File and upload the downloaded credentials. This ensures that EMR Trino and EMR Presto tasks run correctly.

  3. Set Configuration Mode to either:

    • Custom — define the mapping for this cluster only

    • Reference Configurations of Another Cluster — reuse an existing cluster's mapping configuration

  4. Set Mapping Type to Kerberos Account Mapping.

  5. Click Confirm.

Add DataWorks to the EMR cluster whitelist

If Ranger authorization is enabled for an EMR cluster, add DataWorks to the cluster whitelist and restart the Hive service before running EMR tasks. Without this step, tasks fail with "Cannot modify spark.yarn.queue at runtime" or "Cannot modify SKYNET_BIZDATE at runtime".

  1. Add a custom parameter to the Hive service configuration in the EMR cluster:

    ALISA.* and SKYNET.* are DataWorks-specific prefixes and are required for DataWorks tasks to run.

    hive.security.authorization.sqlstd.confwhitelist.append=tez.*|spark.*|mapred.*|mapreduce.*|ALISA.*|SKYNET.*

  2. Restart the Hive service for the configuration to take effect.
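Before restarting Hive, it can help to confirm that the custom parameter contains both DataWorks-specific entries. The helper below is a hypothetical sketch that inspects a `key=value` line as plain text. It compares the `|`-separated entries literally and does not evaluate them as regular expressions the way Hive does.

```python
from typing import List

WHITELIST_KEY = "hive.security.authorization.sqlstd.confwhitelist.append"
# Entries this page states are required for DataWorks tasks.
REQUIRED_ENTRIES = ("ALISA.*", "SKYNET.*")


def missing_whitelist_entries(config_line: str) -> List[str]:
    """Return the required entries that are absent from the given key=value line."""
    key, _, value = config_line.partition("=")
    if key.strip() != WHITELIST_KEY:
        # Wrong or missing key: nothing is whitelisted by this line.
        return list(REQUIRED_ENTRIES)
    present = {entry.strip() for entry in value.split("|")}
    return [entry for entry in REQUIRED_ENTRIES if entry not in present]
```

Passing the parameter line from step 1 returns an empty list; a line that omits `SKYNET.*` returns that entry.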
