DataWorks provides MaxCompute Reader and MaxCompute Writer for you to read data from and write data to MaxCompute data sources.

Background information

Workspaces in standard mode support the data source isolation feature. You can add data sources separately for the development and production environments to isolate the data sources. This helps keep your data secure. For more information, see Isolate connections between the development and production environments.
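
The isolation described above can be pictured as one lookup per environment. The following is a minimal sketch, not DataWorks code: the registry `DATA_SOURCES`, the data source name `odps_sales`, the project names, and the function `resolve_project` are all hypothetical and serve only to illustrate that the development and production environments resolve to separate data sources.

```python
# Hypothetical registry: one entry per (data source name, environment).
# The names and projects below are illustrative, not real DataWorks values.
DATA_SOURCES = {
    ("odps_sales", "Development"): {"project": "sales_dev"},
    ("odps_sales", "Production"): {"project": "sales"},
}


def resolve_project(name: str, environment: str) -> str:
    """Return the MaxCompute project configured for the given environment."""
    try:
        return DATA_SOURCES[(name, environment)]["project"]
    except KeyError:
        raise LookupError(
            f"No data source {name!r} is defined for environment {environment!r}"
        ) from None


if __name__ == "__main__":
    print(resolve_project("odps_sales", "Development"))
    print(resolve_project("odps_sales", "Production"))
```

Because each environment has its own entry, a node that runs in the development environment cannot reach the production project by accident.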

MaxCompute provides a comprehensive data import scheme that supports fast computing on large amounts of data. The first time you associate a MaxCompute compute engine instance with a workspace, DataWorks generates the default data source odps_first for the workspace. Each time you associate a new MaxCompute compute engine instance with the workspace, DataWorks generates a compute engine data source whose name is in the format 0_Region ID_Compute engine instance name.
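
The naming format for generated compute engine data sources can be sketched as a small helper. This is an illustration only: the function name is hypothetical, and the sketch assumes that the region ID and the instance name are inserted verbatim, which the text above does not spell out.

```python
DEFAULT_DATA_SOURCE = "odps_first"  # default data source name stated in this topic


def compute_engine_data_source_name(region_id: str, instance_name: str) -> str:
    """Build a name in the format 0_<Region ID>_<Compute engine instance name>.

    Assumption: both parts are used as-is, with no escaping or case changes.
    """
    return f"0_{region_id}_{instance_name}"


if __name__ == "__main__":
    # "cn-shanghai" and "my_engine" are example values chosen for illustration.
    print(compute_engine_data_source_name("cn-shanghai", "my_engine"))
```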

The MaxCompute project names of the default data source and the default compute engine data sources are the same as the names of the MaxCompute projects that are associated with the workspace.

You can change the AccessKey pair of the default data source. To change the AccessKey pair, perform the following steps:
  1. In the DataWorks console, move the pointer over the profile picture in the upper-right corner and select AccessKey Management.
  2. On the AccessKey Management page, find the AccessKey ID that you want to enable or disable and click Enable or Disable in the Actions column.

Take note of the following rules when you change the AccessKey pair:
  • You can replace the AccessKey pair of an Alibaba Cloud account only with the AccessKey pair of another Alibaba Cloud account.
  • Before you change the AccessKey pair, make sure that no Data Integration or DataStudio nodes are running in DataWorks.

You can use a RAM user to access the MaxCompute data sources that you add.

Procedure

  1. Go to the Data Source page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. Select the region where the workspace resides, find the workspace, and click Data Integration in the Actions column.
    4. In the left-side navigation pane of the Data Integration page, choose Data Source > Data Sources to go to the Data Source page.
  2. On the Data Source page, click Add data source in the upper-right corner.
  3. In the Add data source dialog box, click MaxCompute in the Big Data Storage section.
  4. In the Add MaxCompute data source dialog box, configure the parameters.
    MaxCompute data source

    | Parameter | Description |
    | --- | --- |
    | Data Source Name | The name of the data source. The name can contain letters, digits, and underscores (_) and must start with a letter. |
    | Data source description | The description of the data source. The description can be a maximum of 80 characters in length. |
    | Environment | The environment in which the data source is used. Valid values: Development and Production. Note: This parameter is displayed only when the workspace is in standard mode. |
    | ODPS Endpoint | The endpoint of the MaxCompute project. The endpoint is automatically generated by MaxCompute based on system configurations. |
    | Tunnel Endpoint | The endpoint of the MaxCompute Tunnel service. For more information, see Endpoints. |
    | ODPS project name | The name of the MaxCompute project. |
    | AccessKey ID | The AccessKey ID of the account that you use to connect to the MaxCompute project. You can view the AccessKey ID on the Security Management page. |
    | AccessKey Secret | The AccessKey secret that corresponds to the AccessKey ID. The AccessKey secret is equivalent to a logon password. |
  5. Test the connectivity of the data source. After the data source passes the connectivity test, click Complete.
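
The constraints on the parameters above can be checked before you fill in the dialog box. The following is a minimal sketch, not part of DataWorks: the function `validate_data_source` is hypothetical, and the sketch assumes that "letters" means ASCII letters, which this topic does not state.

```python
import re
from typing import List, Optional

# Letters, digits, and underscores; must start with a letter.
# Assumption: only ASCII letters are accepted.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
MAX_DESCRIPTION_LENGTH = 80
VALID_ENVIRONMENTS = {"Development", "Production"}


def validate_data_source(
    name: str, description: str, environment: Optional[str] = None
) -> List[str]:
    """Return a list of rule violations; an empty list means the input is valid.

    Pass environment=None for workspaces that are not in standard mode,
    where the Environment parameter is not displayed.
    """
    errors = []
    if not NAME_PATTERN.fullmatch(name):
        errors.append(
            "name must start with a letter and contain only letters, "
            "digits, and underscores"
        )
    if len(description) > MAX_DESCRIPTION_LENGTH:
        errors.append(
            f"description must be at most {MAX_DESCRIPTION_LENGTH} characters"
        )
    if environment is not None and environment not in VALID_ENVIRONMENTS:
        errors.append("environment must be Development or Production")
    return errors


if __name__ == "__main__":
    print(validate_data_source("odps_sales", "Sales project", "Production"))
    print(validate_data_source("1_bad_name", "x" * 81, "Staging"))
```

The sketch deliberately leaves out the AccessKey ID and AccessKey secret. Because the AccessKey secret is equivalent to a logon password, do not hard-code it in scripts.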

What to do next

You have now added a MaxCompute data source. The subsequent tutorials describe how to configure MaxCompute Reader and MaxCompute Writer. For more information, see MaxCompute Reader and MaxCompute Writer.