To use a compute engine such as MaxCompute or E-MapReduce (EMR) in DataWorks, you must authorize DataWorks to access the Alibaba Cloud service to which the compute engine belongs. After the authorization is complete, the system creates a service-linked role for that service. This topic describes the service-linked roles that are created automatically during this authorization and the policies that are attached to the roles.

Background information

If you want to perform compute engine-related operations in the DataWorks console, such as associating a compute engine with a workspace or modifying an existing compute engine instance, the system prompts you to perform authorization operations for DataWorks. After the authorization is complete, the system creates a service-linked role for the Alibaba Cloud service to which the related compute engine belongs.
Note
  • Only an Alibaba Cloud account or a RAM user to which the AliyunDataWorksFullAccess policy is attached can authorize DataWorks to perform operations related to compute engines. To perform these operations as a RAM user, make sure that the policy is attached to that RAM user. For information about how to grant permissions to a RAM user, see Grant permissions to the RAM user.
  • You must authorize DataWorks to access the Alibaba Cloud service to which a compute engine belongs before you can associate the compute engine with a workspace or add and manage data sources.
  • You can log on to the RAM console and go to the Roles page to search for a service-linked role that is created for the Alibaba Cloud service to which a compute engine belongs and view information about the service-linked role. For more information about service-linked roles, see Service-linked roles.
The following table describes the service-linked roles that can be automatically created based on authorization.
Role name | Role permission | References
AliyunServiceRoleForDataworksEngine | Allows DataWorks to access MaxCompute. | Role 1: AliyunServiceRoleForDataworksEngine
AliyunServiceRoleForDataworksOnEmr | Obtains metadata information of an EMR DataLake cluster and previews related data records in Data Map. | Role 2: AliyunServiceRoleForDataworksOnEmr
AliyunServiceRoleForDataWorks | Obtains and modifies the network configurations of virtual private clouds (VPCs) and the configurations of security groups, and establishes network connections between DataWorks exclusive resource groups and data sources. | DataWorks service-linked role
AliyunServiceRoleForDataWorksDI | Allows Data Integration to obtain RAM roles and assume a custom RAM role to access a data source. | Description of the AliyunServiceRoleForDataWorksDI role
AliyunDIDefaultRole | Allows DataWorks to access resources of other Alibaba Cloud services activated by the current account during data source configuration, node configuration, and data synchronization. The services include ApsaraDB RDS, ApsaraDB for Redis, ApsaraDB for MongoDB, PolarDB-X, HybridDB for MySQL, AnalyticDB for PostgreSQL, PolarDB, Data Management (DMS), and Data Lake Formation (DLF). | Description of the AliyunDIDefaultRole role
AliyunServiceRoleForDataWorksOpenPlatform | Accesses and modifies events in EventBridge and supports message event capabilities in DataWorks Open Platform. | Appendix: DataWorks service-linked role
AliyunServiceRoleForDataWorksAccessDLF | Allows DataWorks to access metadata information of DLF, grants permissions on metadata to users, and revokes permissions on metadata from users. This role is used to implement application and request processing for permissions on DLF metadata in Security Center. | Appendix: Service-linked role used by DataWorks to access DLF
The following sections describe the service-linked roles related to MaxCompute compute engines and EMR DataLake clusters.

Role 1: AliyunServiceRoleForDataworksEngine

  • Role name: AliyunServiceRoleForDataworksEngine
  • Role permissions: Authorizes DataWorks to access MaxCompute.
  • Policy attached to the role: AliyunServiceRolePolicyForDataworksEngine
  • Policy document:
    {
      "Version": "1",
      "Statement": [
        {
          "Action": "odps:*",
          "Effect": "Allow",
          "Resource": "*"
        },
        {
          "Action": [
            "pai:*",
            "paiplugin:*",
            "eas:*"
          ],
          "Resource": "*",
          "Effect": "Allow"
        },
        {
          "Action": "ram:DeleteServiceLinkedRole",
          "Resource": "*",
          "Effect": "Allow",
          "Condition": {
            "StringEquals": {
              "ram:ServiceName": "engine.dataworks.aliyuncs.com"
            }
          }
        }
      ]
    }
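To make the structure of this policy document easier to read, the following Python sketch loads it and reports which Allow statements match a given action string. It is a simplified illustration only: it does not evaluate Resource, Deny statements, or Condition blocks the way RAM does, the helper name matching_statements is hypothetical, and the action strings in the usage loop (other than ram:DeleteServiceLinkedRole) are placeholders rather than real API actions.

```python
import json
from fnmatch import fnmatchcase

# Policy document of AliyunServiceRolePolicyForDataworksEngine, copied from above.
POLICY = json.loads("""
{
  "Version": "1",
  "Statement": [
    {"Action": "odps:*", "Effect": "Allow", "Resource": "*"},
    {"Action": ["pai:*", "paiplugin:*", "eas:*"], "Resource": "*", "Effect": "Allow"},
    {
      "Action": "ram:DeleteServiceLinkedRole",
      "Resource": "*",
      "Effect": "Allow",
      "Condition": {
        "StringEquals": {"ram:ServiceName": "engine.dataworks.aliyuncs.com"}
      }
    }
  ]
}
""")


def matching_statements(policy, action):
    """Return the Allow statements whose Action patterns match `action`.

    Hypothetical helper: wildcard matching only, no Resource or Condition
    evaluation, so a match here is not a full authorization decision.
    """
    matches = []
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        patterns = statement["Action"]
        if isinstance(patterns, str):
            # "Action" may be a single string or a list of strings.
            patterns = [patterns]
        if any(fnmatchcase(action, pattern) for pattern in patterns):
            matches.append(statement)
    return matches


# Placeholder action names, used only to exercise the wildcard patterns.
for action in ("odps:ExampleAction", "ram:DeleteServiceLinkedRole", "ecs:ExampleAction"):
    hits = matching_statements(POLICY, action)
    conditional = any("Condition" in s for s in hits)
    print(action, "matched:", bool(hits), "has condition:", conditional)
```

The third statement is the only one that carries a Condition: the permission to delete a service-linked role applies only when the service name is engine.dataworks.aliyuncs.com.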

Role 2: AliyunServiceRoleForDataworksOnEmr

Important
Do not modify or delete the service-linked role that is automatically created based on authorization or the policy that is attached to the role. Otherwise, you cannot use EMR features in DataWorks.
  • Role name: AliyunServiceRoleForDataworksOnEmr
  • Role permissions: Allows DataWorks to preview data records in Data Map and to obtain the configurations of an EMR DataLake cluster and, if the cluster uses DLF for metadata management, its metadata information.
  • Policy attached to the role: AliyunServiceRolePolicyForDataworksOnEmr
  • Policy document:
    • Permissions to access EMR
      {
          "Version": "1",
          "Statement": [
              {
                  "Action": [
                      "emr:GetCluster",
                      "emr:GetOnKubeCluster",
                      "emr:GetClusterClientMeta",
                      "emr:GetApplicationConfigFile",
                      "emr:ListClusters",
                      "emr:ListNodes",
                      "emr:ListNodeGroups",
                      "emr:ListApplications",
                      "emr:ListApplicationConfigs",
                      "emr:ListApplicationConfigFiles",
                      "emr:ListApplicationLinks",
                      "emr:ListComponentInstances"
                  ],
                  "Resource": "*",
                  "Effect": "Allow"
              }
          ]
      }
    • Permissions to access DLF
      If the EMR DataLake cluster that you want to access uses DLF to manage metadata, the policy attached to the service-linked role also contains the following access permissions on DLF. The permissions allow DataWorks to obtain metadata information of the EMR DataLake cluster.
      {
        "Action": [
          "dlf:SubmitQuery",
          "dlf:GetQueryResult",
          "dlf:GetTable",
          "dlf:ListDatabases",
          "dlf:GetTableProfile",
          "dlf:GetCatalogSettings",
          "dlf:BatchGrantPermissions",
          "dlf:ListPartitionsByFilter",
          "dlf:ListPartitions"
        ],
        "Resource": "*",
        "Effect": "Allow"
      }
    • Permissions to access Container Service for Kubernetes (ACK)
      If you want to access an EMR on ACK cluster, the policy attached to the role also contains the following access permissions on ACK:
      {
        "Action": [
          "cs:DescribeUserPermission",
          "cs:DescribeClusterDetail",
          "cs:DescribeClusterUserKubeconfig",
          "cs:GetClusters",
          "cs:GrantPermissions",
          "cs:RevokeK8sClusterKubeConfig"
        ],
        "Resource": "*",
        "Effect": "Allow"
      }
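The DLF and ACK blocks above are single statements rather than complete policy documents. The following Python sketch shows one way they could sit alongside the EMR statement in a single document. It assumes that each fragment becomes an additional entry in the Statement array, which the text above implies but does not show. The function build_policy and its parameters are hypothetical names for illustration. The sketch only assembles a document for reading; the actual policy is managed by the system and must not be modified.

```python
import json

# Statements copied from the three policy blocks above.
EMR_STATEMENT = {
    "Action": [
        "emr:GetCluster", "emr:GetOnKubeCluster", "emr:GetClusterClientMeta",
        "emr:GetApplicationConfigFile", "emr:ListClusters", "emr:ListNodes",
        "emr:ListNodeGroups", "emr:ListApplications", "emr:ListApplicationConfigs",
        "emr:ListApplicationConfigFiles", "emr:ListApplicationLinks",
        "emr:ListComponentInstances",
    ],
    "Resource": "*",
    "Effect": "Allow",
}
DLF_STATEMENT = {
    "Action": [
        "dlf:SubmitQuery", "dlf:GetQueryResult", "dlf:GetTable",
        "dlf:ListDatabases", "dlf:GetTableProfile", "dlf:GetCatalogSettings",
        "dlf:BatchGrantPermissions", "dlf:ListPartitionsByFilter",
        "dlf:ListPartitions",
    ],
    "Resource": "*",
    "Effect": "Allow",
}
ACK_STATEMENT = {
    "Action": [
        "cs:DescribeUserPermission", "cs:DescribeClusterDetail",
        "cs:DescribeClusterUserKubeconfig", "cs:GetClusters",
        "cs:GrantPermissions", "cs:RevokeK8sClusterKubeConfig",
    ],
    "Resource": "*",
    "Effect": "Allow",
}


def build_policy(uses_dlf, on_ack):
    """Assemble an illustrative policy document (hypothetical helper).

    Assumption: the DLF and ACK fragments are appended to the Statement
    array after the EMR statement.
    """
    statements = [EMR_STATEMENT]
    if uses_dlf:
        statements.append(DLF_STATEMENT)
    if on_ack:
        statements.append(ACK_STATEMENT)
    return {"Version": "1", "Statement": statements}


# Example: an EMR DataLake cluster that uses DLF for metadata management.
print(json.dumps(build_policy(uses_dlf=True, on_ack=False), indent=2))
```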