DataWorks: Scenario: Add a data source across accounts

Last Updated: Apr 01, 2024

If you want to use the current Alibaba Cloud account to perform operations on or access data of a data source, such as a MaxCompute or Hologres data source, that belongs to another Alibaba Cloud account, you must add the data source to a DataWorks workspace within the current Alibaba Cloud account. This way, you can access the data across Alibaba Cloud accounts.

Precautions

  • You cannot use data sources that are added across accounts for data development or task scheduling. If you want to use such a data source to perform data development operations, you must add the data source to a DataWorks workspace within the current Alibaba Cloud account. For more information, see Add a MaxCompute data source and Add a Hologres data source.

  • When you add a MaxCompute or Hologres data source across accounts, you can use only a RAM role to access the related MaxCompute project or Hologres instance.

  • The instructions provided in this topic are suitable only for adding a MaxCompute or Hologres data source across accounts. In this topic, a MaxCompute data source is used to describe the related operations. The operations required to add a Hologres data source across accounts are similar to the operations required to add a MaxCompute data source across accounts. For information about how to add other types of data sources across accounts, see the following topics:

Prerequisites

Note

This topic describes how to add a MaxCompute data source across accounts. In this example, a MaxCompute project named xc_project_20 within Alibaba Cloud Account B is added as a data source of Alibaba Cloud Account A to allow Alibaba Cloud Account A to access data of the MaxCompute project within Alibaba Cloud Account B.

  • Alibaba Cloud Account A and Alibaba Cloud Account B are created. For more information, see Create an account.

    • Alibaba Cloud Account A: the account that adds the data source across accounts. Alibaba Cloud Account A uses the user information of Alibaba Cloud Account B to add the data source.

    • Alibaba Cloud Account B: the account that owns the data source. Alibaba Cloud Account B provides its user information to Alibaba Cloud Account A so that Alibaba Cloud Account A can add the data source across accounts.

  • A MaxCompute project based on which you want to add a data source is created within Alibaba Cloud Account B. For information about how to create a MaxCompute project, see Create a MaxCompute project. In this example, a MaxCompute project named xc_project_20 is used.

Alibaba Cloud Account B: Create a RAM role and allow access from Alibaba Cloud Account A

  1. Create a RAM role.

    Log on to the RAM console with Alibaba Cloud Account B. Create a RAM role and add Alibaba Cloud Account A as a trusted account for the RAM role. Then, Alibaba Cloud Account A can assume the role to access the resources whose permissions are granted to the RAM role. For information about how to create a RAM role, see Create a RAM role for a trusted Alibaba Cloud account.

    Note

    The created RAM role can be assumed by Alibaba Cloud Account A to access DataWorks activated within Alibaba Cloud Account B. If the created RAM role needs to be used by Alibaba Cloud Account A to access DataWorks activated within Alibaba Cloud Account A, the policy that is attached to the RAM role must be redefined. For more information, see Define a policy for a RAM role.

    Sample configurations of a RAM role:

    • Set the RAM Role Name parameter to McRole.

    • Set the Select Trusted Alibaba Cloud account parameter to Other Alibaba Cloud account and enter the ID of Alibaba Cloud Account A in the field that appears. You can log on to the RAM console with Alibaba Cloud Account A, and move the pointer over the profile picture in the top navigation bar to obtain the ID of Alibaba Cloud Account A.

    After the configuration is complete, Alibaba Cloud Account A can assume the McRole role and access the resources whose permissions are granted to the role.

  2. Modify the trust policy of the RAM role.

    Go to the details page of the McRole role and modify its trust policy to authorize Alibaba Cloud Account A to access DataWorks within Alibaba Cloud Account B. For information about how to modify the trust policy of a RAM role, see Edit the trust policy of a RAM role. The following code shows the trust policy document. Replace the placeholder in the Service element with the ID of Alibaba Cloud Account A:

    {
      "Statement": [
        {
          "Action": "sts:AssumeRole",
          "Effect": "Allow",
          "Principal": {
            "Service": [
              "<ID of Alibaba Cloud Account A>@engine.dataworks.aliyuncs.com"
            ]
          }
        }
      ],
      "Version": "1"
    }
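The trust policy above can also be generated instead of being edited by hand. The following Python sketch builds the same document for a given account ID and prints it as JSON that you can paste into the RAM console. The function name build_trust_policy and the sample ID are hypothetical, the numeric check assumes that account IDs consist only of digits, and the script does not call any Alibaba Cloud API.

```python
import json


def build_trust_policy(account_a_id: str) -> dict:
    """Return the trust policy document shown above for the given Account A ID."""
    # Assumption: an Alibaba Cloud account ID consists only of digits.
    if not account_a_id.isdigit():
        raise ValueError("expected a numeric Alibaba Cloud account ID")
    return {
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {
                    # Same service principal format as in the policy above.
                    "Service": [f"{account_a_id}@engine.dataworks.aliyuncs.com"]
                },
            }
        ],
        "Version": "1",
    }


if __name__ == "__main__":
    # "1234567890123456" is a placeholder, not a real account ID.
    print(json.dumps(build_trust_policy("1234567890123456"), indent=2))
```

Generating the document this way keeps the account ID in one place, which helps if you create roles for several accounts.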

Alibaba Cloud Account B: Add the RAM role to the MaxCompute project

  1. Use Alibaba Cloud Account B to access the MaxCompute project.

    You can use the SQL analysis feature in the MaxCompute console to quickly access a MaxCompute project.

    For information about other connection tools, see Select a connection tool.

  2. Add the RAM role to the MaxCompute project.

    1. Add the McRole role created in Step 1 in the preceding section to the MaxCompute project. Sample SQL statements:

      -- Add the RAM role to the MaxCompute project.
      add user `RAM$<account_name>:role/<RAM role name>`;
      -- View all users in the project.
      list users;
      -- View the permissions granted to the RAM role.
      show grants for `RAM$<account_name>:role/<RAM role name>`;

      Parameter descriptions:

      • <account_name>: Replace it with the name of Alibaba Cloud Account B.

      • <RAM role name>: Replace it with McRole.

    2. Grant permissions to the RAM role based on your business requirements. For information about how to grant permissions to a RAM role, see Authorization operations.

      Note

      You can grant permissions to the RAM role in advance based on how you plan to use the cross-account data source in the workspace of Alibaba Cloud Account A. For example, if you want to query tables of the MaxCompute project within Alibaba Cloud Account B from the workspace of Alibaba Cloud Account A, make sure that the RAM role configured for the data source has the SELECT permission on the tables.
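The statements used to add the RAM role differ only in the account name and the role name, so they can be rendered from those two values. The following Python sketch is one way to do that. The function names role_principal and add_role_statements are hypothetical, and the script only prints the SQL text for you to run with your connection tool. It does not connect to MaxCompute.

```python
from typing import List


def role_principal(account_name: str, role_name: str) -> str:
    """Build the backquoted MaxCompute user name for a RAM role."""
    for value in (account_name, role_name):
        # A backquote inside a name would break the quoting of the statement.
        if not value or "`" in value:
            raise ValueError(f"invalid name: {value!r}")
    return f"`RAM${account_name}:role/{role_name}`"


def add_role_statements(account_name: str, role_name: str) -> List[str]:
    """Return the three statements from the sample above, in order."""
    principal = role_principal(account_name, role_name)
    return [
        f"add user {principal};",
        "list users;",
        f"show grants for {principal};",
    ]


if __name__ == "__main__":
    # Replace the first argument with the name of Alibaba Cloud Account B.
    for statement in add_role_statements("account_b_name", "McRole"):
        print(statement)
```

The permissions that you grant afterward depend on your scenario and are not covered by this sketch.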

Alibaba Cloud Account A: Use the user information of Alibaba Cloud Account B to add a data source

Note

You must log on to the DataWorks console with Alibaba Cloud Account A and use the user information of Alibaba Cloud Account B to add a MaxCompute data source. Before you perform the following operations, you must obtain the ID of Alibaba Cloud Account B.

  1. Go to the Data Source page.

    1. Log on to the DataWorks console. In the left-side navigation pane, click Management Center. On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.

    2. In the left-side navigation pane of the page that appears, click Data Source. The Data Source page appears.

  2. On the Data Source page, click Add Data Source. In the Add Data Source dialog box, select MaxCompute to add a MaxCompute data source.

  3. Configure information about the data source.

    1. Configure basic information about the data source.

      Configure the information shown in the following figure as prompted. If you use a workspace in standard mode, you must separately add data sources in the development environment and production environment. For information about the workspace modes, see Differences between workspaces in basic mode and workspaces in standard mode.

      image.png

      Configuration descriptions of key parameters:

      • UID Of Alibaba Cloud Account: Set the value to the ID of another Alibaba Cloud account. In this example, set the value to the ID of Alibaba Cloud Account B. You must obtain the ID of Alibaba Cloud Account B in advance.

      • RAM Role: Set the value to the RAM role that can be assumed by Alibaba Cloud Account A to access resources of Alibaba Cloud Account B. In this example, set the value to McRole.

      • MaxCompute Project Name: Set the value to the name of the MaxCompute project within Alibaba Cloud Account B that you want to add as a data source. In this example, set the value to xc_project_20.

      For more information about how to add a MaxCompute data source, see Add a MaxCompute data source.

    2. Establish a network connection between a resource group and the data source.

      In the Connection Configuration section, find a resource group based on your business requirements and click Test Network Connectivity in the Connection Status (Production Environment) column to test the connectivity between the resource group and data source. For more information about resource groups, see Overview.

    3. Click Complete Creation.
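Before you open the Add Data Source dialog box, it can help to collect the three key parameters in one place and confirm that none of them is empty. The following Python sketch shows one possible checklist. The class name CrossAccountSourceSettings and its field names are hypothetical and are not part of DataWorks. The sketch performs local checks only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CrossAccountSourceSettings:
    """Hypothetical container for the key parameters described above."""

    account_uid: str   # UID Of Alibaba Cloud Account (ID of Account B)
    ram_role: str      # RAM Role (McRole in this example)
    project_name: str  # MaxCompute Project Name (xc_project_20 in this example)

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are empty or contain only whitespace."""
        return [name for name, value in vars(self).items() if not value.strip()]


if __name__ == "__main__":
    # "<ID of Alibaba Cloud Account B>" is a placeholder for the real account ID.
    settings = CrossAccountSourceSettings(
        account_uid="<ID of Alibaba Cloud Account B>",
        ram_role="McRole",
        project_name="xc_project_20",
    )
    print(settings.missing_fields() or "all key parameters are set")
```

In a workspace in standard mode, you would prepare one such set of values for the development environment and another for the production environment.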

What to do next

After the data source is added, you can perform the following operations: