DataWorks: Register a cross-account EMR cluster

Last Updated: Mar 26, 2026

DataWorks supports associating EMR clusters that belong to a different Alibaba Cloud account. To set this up, Account B (the cluster owner) creates a RAM (Resource Access Management) role and grants Account A permission to assume it. Account A then registers Account B's EMR cluster in its DataWorks workspace.

Account | Role
Account B | Owns the EMR cluster. Creates the RAM role and grants access.
Account A | Owns the DataWorks workspace. Registers Account B's cluster.

Prerequisites

Before you begin, make sure you have:

  • Two Alibaba Cloud accounts (Account A and Account B). For setup instructions, see Create an Alibaba Cloud account

  • An EMR Hadoop cluster created under Account B. For setup instructions, see Create a cluster

  • The UID of Account B. Account A needs this UID when registering the cluster in DataWorks. Obtain it from Account B before starting

Limitations

Cross-account EMR cluster registration supports only EMR Hadoop clusters. The following constraints apply:

  • The Metadata parameter of the EMR cluster must not be set to DLF Unified Metadata

  • Only clusters of version V3.38.3 or V3.38.2 are available for selection

  • Kerberos authentication is not supported

  • For Spark, lineage of SQL nodes is available at the table level only; field-level lineage is not supported
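The constraints above can be checked before attempting registration. The following is a minimal sketch, not part of DataWorks or EMR: the `ClusterInfo` fields and the `registration_blockers` helper are hypothetical names, and the values are assumed to be copied by hand from the cluster's details page in the EMR console.

```python
from dataclasses import dataclass
from typing import List

# Versions the limitations list names as selectable.
SUPPORTED_VERSIONS = {"V3.38.3", "V3.38.2"}


@dataclass
class ClusterInfo:
    """Hypothetical record of the cluster properties that matter here."""
    cluster_type: str       # for example "Hadoop"
    version: str            # for example "V3.38.3"
    metadata: str           # value of the Metadata parameter
    kerberos_enabled: bool  # whether Kerberos authentication is turned on


def registration_blockers(cluster: ClusterInfo) -> List[str]:
    """Return the documented constraints that this cluster violates."""
    problems = []
    if cluster.cluster_type != "Hadoop":
        problems.append("only EMR Hadoop clusters are supported")
    if cluster.version not in SUPPORTED_VERSIONS:
        problems.append("version must be V3.38.3 or V3.38.2")
    if cluster.metadata == "DLF Unified Metadata":
        problems.append("Metadata must not be DLF Unified Metadata")
    if cluster.kerberos_enabled:
        problems.append("Kerberos authentication is not supported")
    return problems
```

An empty list means none of the listed constraints is violated; it does not guarantee that the cluster appears in the DataWorks selection list.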

Step 1: Account B — Set up a RAM role for cross-account access

Account B creates a RAM role, configures it to trust Account A, and attaches the required policy.

1.1 Create a RAM role

  1. Log on to the RAM console using Account B.

  2. Create a RAM role and set Account A as the trusted Alibaba Cloud account. For instructions, see Create a RAM role for a trusted Alibaba Cloud account.

    Key settings:

    Parameter | Value
    RAM Role Name | EMRRole
    Select Trusted Alibaba Cloud Account | Other Alibaba Cloud Account
    Account ID field | Enter the UID of Account A. To find it, log on to the RAM console with Account A and hover over the profile picture in the top navigation bar.

    After you save the role, Account A can assume EMRRole and access the resources that the role is authorized to use.

1.2 Update the trust policy

Go to the details page of EMRRole and update its trust policy so that Account A can access EMR clusters owned by Account B. For instructions, see Modify the trust policy of a RAM role.

Replace <ACCOUNT_A_SERVICE_ID> with the service identifier of Account A (format: san******@emr.dataworks.aliyuncs.com):

{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "<ACCOUNT_A_SERVICE_ID>"
        ]
      }
    }
  ],
  "Version": "1"
}
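If you prefer to generate the policy document rather than edit it by hand, the substitution can be scripted. This is a minimal sketch under stated assumptions: `render_trust_policy` is a hypothetical helper, and the suffix check only mirrors the format shown above (an identifier ending in `@emr.dataworks.aliyuncs.com`). It does not call any Alibaba Cloud API; paste the output into the trust policy editor.

```python
import json

# Suffix taken from the service identifier format shown above.
SERVICE_SUFFIX = "@emr.dataworks.aliyuncs.com"


def render_trust_policy(service_id: str) -> str:
    """Return the trust policy JSON with the service identifier filled in."""
    # Reject an unreplaced placeholder or a value without the expected suffix.
    if "<" in service_id or ">" in service_id:
        raise ValueError("replace the placeholder with the real identifier")
    if not service_id.endswith(SERVICE_SUFFIX) or service_id == SERVICE_SUFFIX:
        raise ValueError("identifier must end with " + SERVICE_SUFFIX)
    policy = {
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {"Service": [service_id]},
            }
        ],
        "Version": "1",
    }
    return json.dumps(policy, indent=2)
```

Calling `render_trust_policy` with Account A's actual service identifier yields a document with the same structure as the policy above.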

1.3 Attach the required policy

On the EMRRole details page, attach the AliyunDataWorksAccessingEMRReadOnlyPolicy policy to the role.

Step 2: Account A — Register the EMR cluster in DataWorks

2.1 Open Management Center

  1. Log on to the DataWorks console. In the top navigation bar, select the target region.

  2. In the left-side navigation pane, choose More > Management Center.

  3. Select the target workspace from the drop-down list, then click Go to Management Center.

  4. In the left-side navigation pane of the Management Center page, click Computing Resources.

2.2 Configure basic cluster information

Follow the on-screen prompts to fill in the cluster details.

Note

For a standard-mode workspace, configure computing resources separately for the development and production environments. For details, see Basic mode vs. standard mode workspaces.

Key parameters:

Parameter | Description
Alibaba Cloud Primary Account UID | The UID of the account that owns the EMR cluster. Set this to Account B's UID.
Opposite RAM Role | The RAM role that Account A assumes to access Account B's EMR resources. Set this to EMRRole.
Peer EMR Cluster | The cluster to register. Only EMR Hadoop clusters of version V3.38.3 or V3.38.2 with Metadata not set to DLF Unified Metadata are available.

For the full list of configuration options, see Data Studio: Bind EMR computing resources.
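It can help to collect the three key values in one place and sanity-check them before you fill in the form. The sketch below is illustrative only: `CrossAccountBinding` and its field names are hypothetical, the assumption that a UID consists solely of digits is ours, and nothing here talks to DataWorks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrossAccountBinding:
    """Hypothetical holder for the key parameters entered in the console."""
    owner_uid: str   # "Alibaba Cloud Primary Account UID": Account B's UID
    ram_role: str    # "Opposite RAM Role": the role created in Step 1
    cluster_id: str  # "Peer EMR Cluster": the cluster picked from the list

    def validate(self) -> None:
        """Raise ValueError if a value is obviously unusable."""
        # Assumption: account UIDs are purely numeric strings.
        if not self.owner_uid.isdigit():
            raise ValueError("owner_uid should contain digits only")
        if not self.ram_role.strip():
            raise ValueError("ram_role must not be empty")
        if not self.cluster_id.strip():
            raise ValueError("cluster_id must not be empty")
```

For example, `CrossAccountBinding("1234567890123456", "EMRRole", "my-cluster").validate()` passes silently; the UID and cluster identifier in that call are made-up values.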

2.3 Initialize the resource group

Initialize the resource group in any of the following cases: you are associating EMR computing resources for the first time, the cluster's service configuration has changed (for example, core-site.xml was modified), or component versions have been upgraded. Initialization sets up network connectivity so that the resource group can reach the EMR cluster.

Note

Initializing a resource group may cause running tasks to fail. Schedule the initialization during off-peak hours unless immediate reinitialization is necessary (for example, to prevent tasks from failing after cluster configuration changes). If initialization fails, use the connectivity diagnosis tool to troubleshoot.
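One way to notice that the service configuration has changed is to fingerprint the relevant files and compare against the value recorded at the last initialization. This is a rough sketch, assuming you export files such as core-site.xml yourself; `config_fingerprint` and `needs_initialization` are hypothetical helpers and are not part of DataWorks.

```python
import hashlib
from typing import Dict, Optional


def config_fingerprint(files: Dict[str, bytes]) -> str:
    """Hash file names and contents in a stable (sorted) order."""
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")  # separator between name and content
        digest.update(files[name])
        digest.update(b"\0")  # separator between entries
    return digest.hexdigest()


def needs_initialization(
    first_binding: bool,
    previous_fingerprint: Optional[str],
    current_fingerprint: str,
    components_upgraded: bool,
) -> bool:
    """Apply the three conditions described above."""
    config_changed = previous_fingerprint != current_fingerprint
    return first_binding or components_upgraded or config_changed
```

Because initialization may cause running tasks to fail, a `True` result is a prompt to schedule the operation, not to trigger it immediately.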

What's next

Account A's DataWorks workspace can now access Account B's EMR cluster. From here: