
Configure mappings between tenant member accounts and CDH or CDP cluster accounts

Last Updated: Apr 30, 2024

After you register a CDH (Cloudera's Distribution Including Apache Hadoop) cluster or a CDP (Cloudera Data Platform) cluster in DataWorks, you can map the Alibaba Cloud account or RAM user of a DataWorks tenant member to the account of a specific identity in the cluster. The tenant member then accesses the cluster by using the mapped identity. The procedure for a CDP cluster is similar to the procedure for a CDH cluster, so this topic uses a CDH cluster as the example.

Mapping types

The account that is used to access a CDH cluster and execute the code of CDH tasks in DataWorks varies based on the default access identity that you specify when you register the CDH cluster. For more information, see the Configure the default access identity for the cluster section in the "Register a CDH or CDP cluster to DataWorks" topic. The following table describes the accounts that you can specify as default access identities and the supported mapping types.

Account type and description

Mapping type and description

Cluster account

The cluster account that you specify as the default access identity is used to execute the code of CDH tasks regardless of who runs CDH tasks in DataWorks.

For example, if you specify a cluster account as the default access identity, the cluster account is used to run CDH tasks regardless of whether the CDH tasks are submitted by an Alibaba Cloud account, a RAM user that is assigned the Workspace Administrator role, or a RAM user that is assigned only the Development role.

No Authentication

By default, the Mapping Type parameter is set to No Authentication.

Important

If you specify a mapping account as the default access identity, you cannot set the Mapping Type parameter to No Authentication. Otherwise, CDH tasks will fail because no access identity is configured for the Alibaba Cloud account or RAM user. You can set the Mapping Type parameter to System account mapping, OPEN LDAP account mapping, or Kerberos account mapping based on your business requirements.

Mapping account

When a workspace member uses an Alibaba Cloud account or a RAM user to run CDH tasks in DataWorks, the code of the tasks is executed by the CDH system account, Kerberos account, or OpenLDAP account that is mapped to that Alibaba Cloud account or RAM user.

If you specify a mapping account as the default access identity, you must go to the Account Mappings tab to configure a mapping between a CDH cluster account and an Alibaba Cloud account or a RAM user.

Important

CDH tasks that are run by an Alibaba Cloud account or a RAM user may fail in the following scenarios:

  • The Alibaba Cloud account or RAM user is not mapped to a CDH cluster account.

  • The Alibaba Cloud account or RAM user is mapped to a CDH cluster account, but the Mapping Type parameter is set to No Authentication.

System account mapping

  • You can configure a mapping between an Alibaba Cloud account or a RAM user and a system account of a specific CDH cluster based on your business requirements. The system account can be the administrator account of Cloudera Manager or a Hadoop account. After you configure the mapping, the CDH tasks that are submitted by the Alibaba Cloud account or the RAM user are run by the mapped system account.

  • If you want to isolate permissions on the data that can be accessed by using different Alibaba Cloud accounts or RAM users in a specific CDH cluster, you can use this mapping type.

OPEN LDAP account mapping

  • You can configure a mapping between an Alibaba Cloud account or a RAM user and an OpenLDAP account of a specific CDH cluster based on your business requirements. After you configure the mapping, the CDH tasks that are submitted by the Alibaba Cloud account or the RAM user are run by the mapped OpenLDAP account.

  • If you use Presto and select the OPEN LDAP account mapping type, configure the Config.Properties and Presto.Jks files on the Basic Information tab of the CDH cluster.

    Note

    After you enable LDAP authentication for a CDH cluster, you must configure the username and password that are used to access the CDH cluster. This improves service security.

Kerberos account mapping

  • You can configure a mapping between an Alibaba Cloud account or a RAM user and a Kerberos account of a specific CDH cluster based on your business requirements. After you configure the mapping, the CDH tasks that are submitted by the Alibaba Cloud account or the RAM user are run by the mapped Kerberos account.

  • If Kerberos authentication is enabled for Hive MetaStore of the CDH cluster, set the Mapping Type parameter to Kerberos account mapping. Otherwise, metadata collection is affected.

  • If you use Presto and select the Kerberos account mapping type, configure the Config.Properties and Presto.Jks files on the Basic Information tab of the CDH cluster.

  • If you want to isolate permissions on the data that can be accessed by using different Alibaba Cloud accounts or RAM users in a specific CDH cluster, you can use this mapping type.

    Note

    The Kerberos account is used to access the CDH cluster based on identity authentication and authorization. This ensures secure communication between users and the services that are deployed in the CDH cluster. You can use the Sentry or Ranger service to grant different permissions to different Kerberos accounts in the CDH cluster to isolate permissions on data. If you set the Mapping Type parameter to Kerberos account mapping, the mapped Alibaba Cloud account or RAM user has the same permissions as the Kerberos account and can access the data of the CDH cluster.
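The rules in the preceding table can be summarized as a small decision procedure. The following Python sketch is only an illustration of that logic, not DataWorks code: the class names, the `resolve_run_as` function, and the `MappingError` exception are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class DefaultIdentity(Enum):
    CLUSTER_ACCOUNT = "Cluster account"
    MAPPING_ACCOUNT = "Mapping account"


class MappingType(Enum):
    NO_AUTHENTICATION = "No Authentication"
    SYSTEM = "System account mapping"
    OPEN_LDAP = "OPEN LDAP account mapping"
    KERBEROS = "Kerberos account mapping"


class MappingError(Exception):
    """Hypothetical error: no usable access identity for the submitter."""


@dataclass
class ClusterConfig:
    default_identity: DefaultIdentity
    cluster_account: str
    mapping_type: MappingType = MappingType.NO_AUTHENTICATION
    # Alibaba Cloud account or RAM user -> CDH cluster account
    mappings: Dict[str, str] = field(default_factory=dict)


def resolve_run_as(config: ClusterConfig, submitter: str) -> str:
    """Return the cluster account that would execute the submitter's CDH tasks."""
    # Cluster account as default identity: used regardless of who submits.
    if config.default_identity is DefaultIdentity.CLUSTER_ACCOUNT:
        return config.cluster_account
    # Mapping account as default identity: No Authentication is not allowed.
    if config.mapping_type is MappingType.NO_AUTHENTICATION:
        raise MappingError("Mapping Type must not be No Authentication")
    # The submitter must be mapped to a cluster account.
    if submitter not in config.mappings:
        raise MappingError(f"{submitter} is not mapped to a cluster account")
    return config.mappings[submitter]
```

With a configuration whose default access identity is a mapping account, a call such as `resolve_run_as(config, "ram_user_1@xxx.onaliyun.com")` returns the mapped account, and raises `MappingError` in the two failure scenarios listed in the table.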

Prerequisites

  • A CDH cluster account is created.

  • If you want to select Kerberos account mapping, make sure that Kerberos authentication is enabled for the CDH cluster.

  • Before you use an OpenLDAP account, make sure that the OpenLDAP service is enabled for the CDH cluster.

  • A CDH cluster is registered in DataWorks. For more information, see Register a CDH or CDP cluster to DataWorks.

Step 1: Go to the Account Mappings tab

  1. Go to the Management Center page.

    Log on to the DataWorks console. In the left-side navigation pane, click Management Center. On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.

  2. In the left-side navigation pane of the SettingCenter page, click Open Source Clusters.

  3. On the Open Source Clusters page, find the CDH cluster that you want to manage and click the Account Mappings tab. In the upper-right corner of the Account Mappings tab, click Edit Account Mappings.

    On the page that appears, you can configure a mapping between an Alibaba Cloud account or a RAM user of a DataWorks tenant member and the CDH cluster account. After you configure the mapping, the tenant member can use the specified identity of the mapped cluster account to run CDH tasks.

Step 2: Configure a mapping between a tenant member account and the CDH cluster account

In this step, you can perform the following operations to specify the cluster account that is used to execute the code of CDH tasks in DataWorks:

  1. Configure the Mapping Type parameter.

    You can set the Mapping Type parameter to No Authentication, System account mapping, OPEN LDAP account mapping, or Kerberos account mapping based on your business requirements. For more information, see Mapping types.

  2. Configure a mapping.

    Configure a mapping based on the mapping type that you specify.

    Note

    If you set the Mapping Type parameter to No Authentication, you do not need to configure a mapping. By default, the cluster account that is configured when you register the CDH or CDP cluster is used to run CDH tasks. For more information, see the Step 2: Register a CDH or CDP cluster section in the "Register a CDH or CDP cluster to DataWorks" topic.

    System account mapping

    You can configure a mapping between an Alibaba Cloud account or a RAM user and a system account of a CDH cluster based on the on-screen instructions.

    • Use Alibaba Cloud accounts to run tasks: Select an Alibaba Cloud account and configure a mapping between the Alibaba Cloud account and a system account.

    • Use RAM users to run tasks: Select a RAM user and configure a mapping between the RAM user and a system account. The following types of mappings are supported:

      • Mapping between accounts with the same name: The CDH cluster account that has the same account name as the RAM user is mapped to the RAM user. Example:

        • RAM user: ram_user_1@xxx.onaliyun.com

        • CDH cluster account that has the same account name as the RAM user: ram_user_1

        If you use a RAM user named ram_user_1@xxx.onaliyun.com to run CDH tasks, the tasks are actually run by the CDH cluster account ram_user_1.

        Note
        • If you use a RAM user to run CDH tasks in DataWorks, a CDH cluster account that has the same account name as the RAM user is actually used to run CDH tasks in the CDH cluster. You can also use a CDH cluster account that has a different account name from the RAM user to run CDH tasks.

        • To prevent tasks from failing to run, make sure that an account that has the same account name as the RAM user exists in the CDH cluster. If no such account exists, create an account on the user management page of the CDH cluster in the Cloudera Manager Admin console.

      • Mapping between accounts with different names: A CDH cluster account that has a different account name from the RAM user is mapped to the RAM user.
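The same-name mapping described above can be illustrated with a short Python sketch. It assumes, based only on the ram_user_1 example in this topic, that the cluster account name is the part of the RAM user name before the `@` sign. The function names and the `existing_accounts` set are hypothetical and are not part of DataWorks or Cloudera Manager.

```python
from typing import Set, Tuple


def same_name_cluster_account(ram_user: str) -> str:
    """Return the part of a RAM user name before the first '@'."""
    name, _sep, _domain = ram_user.partition("@")
    if not name:
        raise ValueError(f"cannot derive an account name from {ram_user!r}")
    return name


def check_same_name_mapping(ram_user: str, existing_accounts: Set[str]) -> Tuple[str, bool]:
    """Return the derived account name and whether it exists in the cluster.

    existing_accounts stands in for the accounts that are already created
    in the CDH cluster; how you obtain that list is outside this sketch.
    """
    account = same_name_cluster_account(ram_user)
    return account, account in existing_accounts


account, exists = check_same_name_mapping(
    "ram_user_1@xxx.onaliyun.com", {"hadoop", "ram_user_1"}
)
print(account, exists)
```

If the second value is `False`, the account must be created in the CDH cluster before tasks are run, as the preceding note explains.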

    Kerberos account mapping

    You can configure a mapping between an Alibaba Cloud account or a RAM user and a Kerberos account of the CDH cluster. A Kerberos account is specified in the Instance name@Domain name format. Example: cdn_test@HADOOP.COM.

    During Kerberos authentication, the keytab and krb5.conf files are required.

    • The krb5.conf file is used to store the configurations of the Key Distribution Center (KDC) server.

    • The keytab file is used to store the authentication credentials of the resource principal. The file name must be in the Kerberos account.keytab format.

    You must add the required account and upload the required files based on the on-screen instructions.

    Note
    • If Kerberos authentication is enabled for Hive MetaStore of the CDH cluster, set the Mapping Type parameter to Kerberos account mapping. Otherwise, metadata collection is affected.

    • If you use Presto and select the Kerberos account mapping type, configure the Config.Properties and Presto.Jks files on the Basic Information tab of the CDH cluster.

    • Make sure that Kerberos authentication is enabled for the CDH cluster.
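Before you enter a Kerberos account, it can help to check that it follows the Instance name@Domain name format described above. The following Python sketch shows one possible check. The `KerberosAccount` type and the `parse_kerberos_account` function are hypothetical names for this example only, and the check covers nothing beyond the format stated in this topic.

```python
from typing import NamedTuple


class KerberosAccount(NamedTuple):
    instance_name: str
    domain_name: str


def parse_kerberos_account(account: str) -> KerberosAccount:
    """Split an account in the 'Instance name@Domain name' format."""
    instance, sep, domain = account.partition("@")
    # Require exactly one '@' with non-empty text on both sides.
    if not sep or not instance or not domain or "@" in domain:
        raise ValueError(
            f"{account!r} is not in the 'Instance name@Domain name' format"
        )
    return KerberosAccount(instance, domain)


parsed = parse_kerberos_account("cdn_test@HADOOP.COM")
print(parsed.instance_name, parsed.domain_name)
```

The sketch validates only the shape of the account string. It does not contact the KDC server or read the keytab and krb5.conf files.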

    OPEN LDAP account mapping

    You can configure a mapping between an Alibaba Cloud account or a RAM user and an OpenLDAP account based on the on-screen instructions.

    Note
    • If you use Presto and select the OPEN LDAP account mapping type, configure the Config.Properties and Presto.Jks files on the Basic Information tab of the CDH cluster.

    • Make sure that the OpenLDAP service is enabled for the cluster.

  3. Click Complete. The mapping is configured. The tasks that are run by an Alibaba Cloud account or a RAM user are actually run by the cluster account to which the Alibaba Cloud account or the RAM user is mapped.