To use other compute engines, such as MaxCompute and E-MapReduce (EMR), in DataWorks, you must first grant DataWorks the required access permissions. After you grant the permissions, the system automatically creates a service-linked role for the compute engine. This topic describes the roles and access policies that are automatically created when you authorize DataWorks to use these compute engines.
Background information
When you perform operations related to compute engines in the DataWorks console, such as adding or editing a compute engine instance, you are prompted to grant permissions. After you grant the permissions, the system automatically creates a service-linked role.
Only an Alibaba Cloud account or a Resource Access Management (RAM) user that has the AliyunDataWorksFullAccess policy can grant DataWorks permissions to use other compute engines. If a RAM user does not have the AliyunDataWorksFullAccess policy, you must attach the policy to the RAM user. For more information, see Grant permissions to a RAM user. The authorization prompt is triggered by DataWorks operations such as Data Source Management.
You can search for and view the details of each role on the Roles page in the Resource Access Management (RAM) console. For more information about service-linked roles, see Service-linked roles.
The following table lists the roles that are automatically created after you grant the required permissions and provides links to the details of each role.
| Role | Purpose | Details |
| --- | --- | --- |
| AliyunServiceRoleForDataworksEngine | Grants DataWorks permissions to access MaxCompute. | See Role 1: AliyunServiceRoleForDataworksEngine. |
| AliyunServiceRoleForDataworksOnEmr | Obtains metadata from EMR (new data lake) to preview data records in Data Map. | See Role 2: AliyunServiceRoleForDataworksOnEmr. |
| | Obtains and modifies VPC network configurations and security group configurations to establish network connections between exclusive resource groups for DataWorks and data sources. | |
| | Obtains the list of RAM roles so that you can select a role when you configure a role to access a data source. | |
| | Allows DataWorks to access resources of other cloud services under the current Alibaba Cloud account when you configure data sources, configure tasks, and synchronize data. This includes some management permissions for cloud resources such as RDS, Redis, MongoDB, PolarDB-X, HybridDB for MySQL, AnalyticDB for PostgreSQL, PolarDB, DMS, and DLF. | |
| AliyunServiceRoleForDataWorksOpenPlatform | Obtains and modifies events in EventBridge to support the product message and event features of DataWorks Open Platform. | AliyunServiceRoleForDataWorksOpenPlatform service-linked role |
| | Obtains metadata from Data Lake Formation (DLF) and performs operations such as granting and revoking metadata permissions. This allows Security Center to manage requests and approvals for DLF metadata. | |
| AliyunServiceRoleForDataWorksScheduler | Manages resources in EventBridge and accesses resources of other cloud services such as OSS. | AliyunServiceRoleForDataWorksScheduler service-linked role |
The following sections describe the roles related to the MaxCompute and EMR (new data lake) compute engines.
Role 1: AliyunServiceRoleForDataworksEngine
Role name: AliyunServiceRoleForDataworksEngine
Purpose: A service-linked role for DataWorks to access compute engines (dataworks-engine). The dataworks-engine service uses this role to access your resources in other cloud services.
Attached policy: AliyunServiceRolePolicyForDataworksEngine
Policy details:
```json
{
  "Version": "1",
  "Statement": [
    {
      "Action": "odps:*",
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": [
        "stream:ActOnBehalfOfAnotherUser",
        "stream:CreateDeployment",
        "stream:StartJobWithParams",
        "stream:ListDeployments",
        "stream:GetDeployment",
        "stream:GetJob",
        "stream:StopJob",
        "stream:DeleteDeployment"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": "dlf-auth:ActOnBehalfOfAnotherUser",
      "Resource": "*",
      "Effect": "Allow"
    },
    {
      "Action": [
        "pai:*",
        "paiplugin:*",
        "eas:*",
        "featurestore:*"
      ],
      "Resource": "*",
      "Effect": "Allow"
    },
    {
      "Effect": "Allow",
      "Action": [
        "emr-serverless-spark:StartSessionCluster",
        "emr-serverless-spark:CreateSqlStatement",
        "emr-serverless-spark:GetSqlStatement",
        "emr-serverless-spark:TerminateSqlStatement",
        "emr-serverless-spark:ListSessionClusters",
        "emr-serverless-spark:ListWorkspaces",
        "emr-serverless-spark:ListWorkspaceQueues",
        "emr-serverless-spark:ListReleaseVersions",
        "emr-serverless-spark:CancelJobRun",
        "emr-serverless-spark:ListJobRuns",
        "emr-serverless-spark:GetJobRun",
        "emr-serverless-spark:StartJobRun",
        "emr-serverless-spark:AddMembers",
        "emr-serverless-spark:GrantRoleToUsers",
        "emr-serverless-spark:ListLogContents",
        "emr-serverless-spark:GetTemplate",
        "emr-serverless-spark:ListKyuubiServices",
        "emr-serverless-spark:GetLivyCompute",
        "emr-serverless-spark:CreateLivyCompute",
        "emr-serverless-spark:UpdateLivyCompute",
        "emr-serverless-spark:ListLivyCompute",
        "emr-serverless-spark:DeleteLivyCompute",
        "emr-serverless-spark:StartLivyCompute",
        "emr-serverless-spark:StopLivyCompute",
        "emr-serverless-spark:CreateLivyComputeToken",
        "emr-serverless-spark:GetLivyComputeToken",
        "emr-serverless-spark:ListLivyComputeToken",
        "emr-serverless-spark:DeleteLivyComputeToken",
        "emr-serverless-spark:RefreshLivyComputeToken"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "adb:SubmitSparkApp",
        "adb:GetSparkAppState",
        "adb:GetSparkAppLog",
        "adb:GetSparkAppWebUiAddress",
        "adb:ListSparkApps",
        "adb:GetSparkAppInfo",
        "adb:KillSparkApp",
        "adb:DescribeAdbMySqlTables",
        "adb:getDatabaseObjectsByFilter",
        "adb:getTable"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "lindorm:GetLindormInstanceList",
        "lindorm:GetLindormInstance",
        "lindorm:GetLindormInstanceEngineList",
        "lindorm:GetLindormV2InstanceEngineList",
        "lindorm:ListLdpsComputeGroups",
        "lindorm:RestartLdpsComputeGroup"
      ],
      "Resource": "*"
    },
    {
      "Action": "ram:DeleteServiceLinkedRole",
      "Resource": "*",
      "Effect": "Allow",
      "Condition": {
        "StringEquals": {
          "ram:ServiceName": "engine.dataworks.aliyuncs.com"
        }
      }
    },
    {
      "Action": [
        "searchengine:GetInstance",
        "searchengine:ListInstances",
        "searchengine:GetTable",
        "searchengine:ListTables"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
```
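A long policy like the one above is easier to audit programmatically. The following is a minimal sketch, not part of the official policy tooling: the `allows` helper is hypothetical, the inline JSON is a small excerpt of the policy above, and `fnmatch` only approximates RAM's wildcard matching (Condition blocks are ignored, so a match means "potentially allowed", not "allowed").

```python
import fnmatch
import json

# Excerpt of the AliyunServiceRolePolicyForDataworksEngine policy above.
policy = json.loads("""
{
  "Version": "1",
  "Statement": [
    { "Action": "odps:*", "Effect": "Allow", "Resource": "*" },
    { "Action": "ram:DeleteServiceLinkedRole", "Resource": "*", "Effect": "Allow",
      "Condition": { "StringEquals": { "ram:ServiceName": "engine.dataworks.aliyuncs.com" } } }
  ]
}
""")

def allows(policy, action):
    """Return True if any Allow statement's Action pattern matches `action`.

    Hypothetical helper: fnmatch approximates RAM wildcard semantics, and
    Condition blocks are ignored, so True means only 'potentially allowed'.
    """
    for stmt in policy["Statement"]:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if any(fnmatch.fnmatch(action, pattern) for pattern in actions):
            return True
    return False

print(allows(policy, "odps:CreateTable"))  # covered by the odps:* wildcard
print(allows(policy, "emr:ListClusters"))  # not granted in this excerpt
```

The same check can be run against the full policy JSON to confirm, for example, that `ram:DeleteServiceLinkedRole` is only conditionally granted.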
Role 2: AliyunServiceRoleForDataworksOnEmr
Do not modify or delete the automatically created role and its access policy. Otherwise, the DataWorks on EMR feature may not work as expected.
Role name: AliyunServiceRoleForDataworksOnEmr
Purpose: Allows DataWorks to preview data records in Data Map, retrieve metadata from EMR clusters of the DLF type, and retrieve configuration information from EMR clusters.
Attached policy: AliyunServiceRolePolicyForDataworksOnEmr
Policy details:
EMR access permissions
```json
{
  "Version": "1",
  "Statement": [
    {
      "Action": [
        "emr:GetCluster",
        "emr:GetOnKubeCluster",
        "emr:GetClusterClientMeta",
        "emr:GetApplicationConfigFile",
        "emr:ListClusters",
        "emr:ListNodes",
        "emr:ListNodeGroups",
        "emr:ListApplications",
        "emr:ListApplicationConfigs",
        "emr:ListApplicationConfigFiles",
        "emr:ListApplicationLinks",
        "emr:ListComponentInstances",
        "emr:DescribeClusterV2",
        "emr:DescribeCluster",
        "emr:DescribeClusterServiceConfig",
        "emr:DescribeFlowAgentToken",
        "emr:DescribeClusterBasicInfo",
        "emr:ListClusterHostComponent"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
```
Data Lake Formation (DLF) access permissions
If an EMR cluster uses DLF to centrally manage metadata, the access policy of the automatically created role also includes the following DLF access permissions. These permissions allow DataWorks to retrieve metadata from EMR.
```json
{
  "Action": [
    "dlf:SubmitQuery",
    "dlf:GetQueryResult",
    "dlf:GetTable",
    "dlf:ListDatabases",
    "dlf:GetTableProfile",
    "dlf:GetCatalogSettings",
    "dlf:BatchGrantPermissions",
    "dlf:ListPartitionsByFilter",
    "dlf:ListPartitions",
    "dlf:GetHudiProperties",
    "dlf:ListCatalogs",
    "dlf:GetDatabase",
    "dlf:GetLifecycleRule",
    "dlf:GetCatalog",
    "dlf:GetIcebergNamespace",
    "dlf:GetIcebergTable"
  ],
  "Resource": "*",
  "Effect": "Allow"
}
```
Container Service for Kubernetes (ACK) access permissions
If the EMR cluster is an EMR on ACK cluster, the access policy of the automatically created role also includes the following ACK access permissions.
```json
{
  "Action": [
    "cs:DescribeUserPermission",
    "cs:DescribeClusterDetail",
    "cs:DescribeClusterUserKubeconfig",
    "cs:GetClusters",
    "cs:GrantPermissions",
    "cs:RevokeK8sClusterKubeConfig"
  ],
  "Resource": "*",
  "Effect": "Allow"
}
```
Serverless Spark access permissions
If the EMR cluster is an EMR Serverless Spark cluster, the access policy of the automatically created role also includes the following Serverless Spark access permissions.
```json
{
  "Effect": "Allow",
  "Action": [
    "emr-serverless-spark:StartSessionCluster",
    "emr-serverless-spark:CreateSqlStatement",
    "emr-serverless-spark:GetSqlStatement",
    "emr-serverless-spark:TerminateSqlStatement",
    "emr-serverless-spark:ListSessionClusters",
    "emr-serverless-spark:ListWorkspaces",
    "emr-serverless-spark:ListWorkspaceQueues",
    "emr-serverless-spark:ListReleaseVersions",
    "emr-serverless-spark:CancelJobRun",
    "emr-serverless-spark:ListJobRuns",
    "emr-serverless-spark:GetJobRun",
    "emr-serverless-spark:StartJobRun",
    "emr-serverless-spark:AddMembers",
    "emr-serverless-spark:GrantRoleToUsers",
    "emr-serverless-spark:ListLogContents",
    "emr-serverless-spark:GetTemplate",
    "emr-serverless-spark:ListKyuubiServices",
    "emr-serverless-spark:GetLivyCompute",
    "emr-serverless-spark:CreateLivyCompute",
    "emr-serverless-spark:UpdateLivyCompute",
    "emr-serverless-spark:ListLivyCompute",
    "emr-serverless-spark:DeleteLivyCompute",
    "emr-serverless-spark:StartLivyCompute",
    "emr-serverless-spark:StopLivyCompute",
    "emr-serverless-spark:CreateLivyComputeToken",
    "emr-serverless-spark:GetLivyComputeToken",
    "emr-serverless-spark:ListLivyComputeToken",
    "emr-serverless-spark:DeleteLivyComputeToken",
    "emr-serverless-spark:RefreshLivyComputeToken"
  ],
  "Resource": "*"
}
```
The following OSS permissions are also included. These permissions allow you to upload SQL files and JAR packages or save temporary query results.
```json
{
  "Action": [
    "oss:PutObject",
    "oss:GetObject",
    "oss:DeleteObject",
    "oss:DeleteObjectVersion"
  ],
  "Resource": [
    "acs:oss:*:*:*/.dataworks/*",
    "acs:oss:*:*:*/.dlsdata/*"
  ],
  "Effect": "Allow"
},
{
  "Action": "oss:PostDataLakeStorageFileOperation",
  "Resource": "*",
  "Effect": "Allow"
}
```
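Unlike the wide-open statements earlier in this topic, the first OSS statement is scoped by Resource patterns to the `.dataworks/` and `.dlsdata/` prefixes. A minimal sketch of which object ARNs those patterns reach, where `fnmatch` approximates but does not replicate RAM's wildcard matching and the bucket, account, and object names are made up for illustration:

```python
import fnmatch

# Resource patterns from the scoped OSS statement above.
patterns = ["acs:oss:*:*:*/.dataworks/*", "acs:oss:*:*:*/.dlsdata/*"]

def covered(arn):
    """True if the object ARN matches any of the scoped Resource patterns."""
    return any(fnmatch.fnmatch(arn, pattern) for pattern in patterns)

# Hypothetical object ARNs: only paths under .dataworks/ or .dlsdata/ match.
print(covered("acs:oss:cn-hangzhou:1234:my-bucket/.dataworks/query.sql"))  # True
print(covered("acs:oss:cn-hangzhou:1234:my-bucket/data/raw.csv"))          # False
```

This is why uploaded SQL files and JAR packages land under those prefixes: objects elsewhere in the bucket are outside the statement's scope.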