Before you run E-MapReduce (EMR) nodes in DataWorks, you must complete authentication and authorization configurations at the EMR and DataWorks sides to ensure that the nodes can be run as expected. This topic describes how to manage permissions on DataWorks and EMR.
Background information
In DataWorks, you can configure mappings between the members in a workspace and the accounts of the EMR cluster associated with the workspace to obtain the permissions on the EMR cluster. This way, Alibaba Cloud accounts, node owners, or RAM users have different permissions on data when they run EMR nodes in DataWorks, and data permissions are isolated. For more information about permission configurations that are required to run EMR nodes in DataWorks, see Permission management at the EMR side and Permission management at the DataWorks side.
Limits
DataWorks allows you to use only a system account or an OpenLDAP account to configure mappings between members in a workspace and the accounts of the EMR cluster associated with the workspace. When you configure the mappings, take note of the following items:
- You can configure mappings only at the cluster level. Only one authentication method can be used for a cluster.
- The EMR cluster accounts and passwords in the mappings must be the same as the actual accounts and passwords of the EMR cluster associated with the workspace.
Value of the Mapping Type parameter | Description |
---|---|
System Account | If the accounts or passwords are inconsistent, EMR nodes fail to be run in DataWorks. |
OpenLDAP Account | In the following scenarios, EMR nodes fail to be run in DataWorks: |
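The two limits above (one authentication method per cluster, and mapped accounts and passwords that match the cluster) can be pictured as a pre-flight check. The following Python sketch is illustrative only: `Mapping`, `check_mappings`, and the `cluster_credentials` dictionary are hypothetical names, not part of any DataWorks or EMR API, and a real check would not hold plaintext passwords.

```python
from dataclasses import dataclass

# Hypothetical model for illustration; this is not a DataWorks or EMR API.
ALLOWED_MAPPING_TYPES = {"System Account", "OpenLDAP Account"}


@dataclass(frozen=True)
class Mapping:
    member: str           # DataWorks workspace member
    mapping_type: str     # value of the Mapping Type parameter
    cluster_account: str  # account name in the EMR cluster
    password: str         # password entered in the mapping


def check_mappings(mappings, cluster_credentials):
    """Return a list of problems; an empty list means the mappings pass."""
    problems = []
    types = {m.mapping_type for m in mappings}

    # Only system accounts and OpenLDAP accounts are supported.
    for t in sorted(types - ALLOWED_MAPPING_TYPES):
        problems.append(f"unsupported mapping type: {t}")

    # Mappings are configured at the cluster level with one method only.
    if len(types) > 1:
        problems.append("more than one authentication method is used for the cluster")

    # Accounts and passwords must match the actual cluster accounts.
    for m in mappings:
        actual = cluster_credentials.get(m.cluster_account)
        if actual is None:
            problems.append(f"{m.member}: account {m.cluster_account} does not exist in the cluster")
        elif actual != m.password:
            problems.append(f"{m.member}: password for {m.cluster_account} does not match the cluster")
    return problems


if __name__ == "__main__":
    cluster = {"emr_alice": "s3cret", "emr_bob": "right"}  # assumed sample data
    mappings = [
        Mapping("alice", "OpenLDAP Account", "emr_alice", "s3cret"),
        Mapping("bob", "OpenLDAP Account", "emr_bob", "wrong"),
    ]
    for problem in check_mappings(mappings, cluster):
        print(problem)
```

Any problem reported by such a check corresponds to a case in which EMR nodes would fail to be run in DataWorks.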
Permission management at the EMR side
- Enable LDAP authentication
If you want to use a non-system account for identity authentication in an EMR cluster, you must enable LDAP authentication for the cluster and add the account that is used to develop EMR nodes in DataWorks to LDAP users. In this case, you must perform the following steps:
- Enable LDAP authentication for the cluster.
To use LDAP for identity authentication, you must enable LDAP authentication for the cluster. For more information, see Enable LDAP authentication.
- Prepare the account that is used to run EMR nodes and add the account to LDAP users and the related DataWorks workspace.
We recommend that you add users who need to create, test, commit, and deploy EMR nodes in DataStudio to LDAP users and the related DataWorks workspace. For more information about how to add an account to a DataWorks workspace, see Users, roles, and permissions.
- Manage data permissions
You can manage the services in an EMR cluster to isolate data permissions. For example, you can use EMR Ranger to manage the permissions of users in an EMR cluster.
Permission management at the DataWorks side
- Associate an EMR cluster with a DataWorks workspace
Before you run EMR nodes in DataWorks, you must associate an EMR cluster with a DataWorks workspace. This way, the cluster can be used as a compute engine instance in DataWorks. Only accounts to which the AliyunEMRFullAccess policy is attached can be used to perform this operation. For more information about how to attach the AliyunEMRFullAccess policy to an account, see Overview of users, roles, and permissions.
- Grant permissions on DataWorks service modules to an account
If you want to run EMR nodes in DataWorks, you must be granted the permissions on DataWorks service modules such as DataStudio, Data Map, Data Quality, and intelligent monitoring. After you obtain the permissions, you can develop EMR nodes, perform O&M operations on the nodes, and monitor the data quality of the nodes. For more information about the permissions on service modules, see Users, roles, and permissions.
- Configure account mappings
After you associate an EMR cluster with a workspace by using the security mode, go to the EMR Cluster Configure page of DataWorks. On this page, configure mappings between the members in a DataWorks workspace and the accounts of the EMR cluster associated with the workspace. This way, the members in the DataWorks workspace have the same permissions as the mapped accounts.
Note: For more information about how to associate an EMR cluster with a DataWorks workspace and configure the mappings, see Configure DataWorks.
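The effect of an account mapping can be pictured as a two-step lookup: a workspace member resolves to a cluster account, and the member then holds whatever permissions that account holds. The following Python sketch only illustrates this idea. The function name, the dictionaries, and the permission strings are hypothetical and do not reflect how DataWorks stores or evaluates mappings.

```python
def effective_permissions(member, member_to_account, account_permissions):
    """Return the permissions a workspace member inherits from the mapped account.

    All arguments are hypothetical in-memory structures used for illustration:
    member_to_account maps a DataWorks member to an EMR cluster account, and
    account_permissions maps a cluster account to a set of permission labels.
    """
    account = member_to_account.get(member)
    if account is None:
        # Without a mapping there is no cluster identity to run the node as.
        raise KeyError(f"no EMR cluster account is mapped to workspace member {member!r}")
    return frozenset(account_permissions.get(account, ()))


if __name__ == "__main__":
    # Assumed sample data: two members mapped to accounts with different data permissions.
    member_to_account = {"ram_user_a": "emr_analyst", "node_owner_b": "emr_etl"}
    account_permissions = {
        "emr_analyst": {"select:sales_db"},
        "emr_etl": {"select:sales_db", "insert:sales_db"},
    }
    print(sorted(effective_permissions("ram_user_a", member_to_account, account_permissions)))
    print(sorted(effective_permissions("node_owner_b", member_to_account, account_permissions)))
```

Because each member resolves to its own cluster account, two members who run the same EMR node can end up with different data permissions, which is the isolation described in this topic.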