This topic describes how to build a data warehouse for an enterprise based on AnalyticDB for MySQL, and use the data warehouse to perform O&M and manage metadata.

Before you perform operations in this topic, create a workspace. For more information, see Create a workspace.

Configure an AnalyticDB for MySQL connection

  1. Log on to the DataWorks console. In the left-side navigation pane, click Workspaces. On the Workspaces page, find the target workspace and click Data Integration in the Actions column.
  2. On the Data Integration page, click Connection in the left-side navigation pane.
  3. On the Data Source page that appears, click Add a Connection in the upper-right corner.
  4. In the Add Connection dialog box that appears, click AnalyticDB for MySQL2.0 or AnalyticDB for MySQL3.0 in the Relational Database section.
  5. In the dialog box that appears, set the required parameters.
    Note
    • Only nodes that run on exclusive resource groups for scheduling can access AnalyticDB for MySQL data stores. For more information about how to create and use an exclusive resource group, see Add and use exclusive resource groups for scheduling.
    • When you create an AnalyticDB for MySQL 2.0 connection, you must enter the AccessKey ID and AccessKey secret of the current Alibaba Cloud account or RAM user for identity authentication.
    • When you create an AnalyticDB for MySQL 3.0 connection, you must enter the username and password used to connect to the AnalyticDB for MySQL 3.0 database for identity authentication. After you create an AnalyticDB for MySQL 3.0 database, you must log on to Cloud Native Data Warehouse Console and create the username and password for connecting to the database first.
  6. Click Test Connection.
  7. After the connection passes the connectivity test, click Complete.

Add information about your exclusive resource group for scheduling to the whitelist of the AnalyticDB for MySQL 3.0 database

An AnalyticDB for MySQL 3.0 database uses a whitelist to control client access. It allows access requests from a client only when the client information is included in the whitelist.

  1. Obtain information about the exclusive resource group for scheduling.

    To guarantee that an exclusive resource group for scheduling can properly access an AnalyticDB for MySQL 3.0 database, you must add information, such as the Elastic Network Interface (ENI) IP address, about the exclusive resource group to the whitelist of the AnalyticDB for MySQL 3.0 database. For more information, see Add an exclusive resource group for scheduling to the whitelist of the data store to be accessed. Skip the operations in this section if you use an AnalyticDB for MySQL 2.0 database.

  2. Add information about the exclusive resource group for scheduling to the whitelist of the AnalyticDB for MySQL 3.0 database.
    1. Log on to Cloud Native Data Warehouse Console. In the left-side navigation pane, click Clusters. On the page that appears, find the target cluster on the V3.0 Clusters tab and click the cluster ID. The details page of the cluster appears. Then, click Data Security.
    2. On the page that appears, click Create Whitelist in the upper-right corner. In the dialog box that appears, add information about the exclusive resource group to the whitelist.

Create a workflow

  1. In the DataWorks console, click the DataWorks icon in the upper-left corner and choose All Products > DataStudio.
  2. On the Data Analytics tab, right-click Business Flow and select Create Workflow. In the Create Workflow dialog box that appears, set Workflow Name and Description.
  3. Click Create.

Create a batch sync node

  1. Click the created workflow, right-click Data Integration, and then choose Create > Batch Synchronization.
  2. In the Create Node dialog box that appears, set Node Name and click Commit. A batch sync node is created.
  3. On the configuration tab of the batch sync node, set the parameters under Source and Target in the Connections section.
  4. Configure the mappings between the fields in the source and destination tables in the Mappings section.

    Fields in the source table on the left have a one-to-one mapping with fields in the destination table on the right. You can click Add to add a field, or move the pointer over a field and click the Delete icon to delete the field.

  5. Configure channel control policies.
    Configure the maximum transmission rate and dirty data check rules.
    Parameter Description
    Expected Maximum Concurrency The maximum number of concurrent threads to read data from or write data to data stores within the batch sync node. You can configure the concurrency for the node on the codeless user interface (UI).
    Bandwidth Throttling Specifies whether to enable bandwidth throttling. You can enable bandwidth throttling and set a maximum transmission rate to avoid heavy read workload of the source. We recommend that you enable bandwidth throttling and set the maximum transmission rate to a proper value.
    Dirty Data Records Allowed The maximum number of dirty data records allowed.
  6. Click the Properties tab in the right-side navigation pane to configure recurrence for the batch sync node.
  7. After the configuration is completed, click the Save and Run icons in sequence.

Create a data analytics node

  1. Click the target workflow, right-click UserDefined, and then choose Create > AnalyticDB for MySQL.
  2. In the Create Node dialog box that appears, set Node Name and click Commit.
  3. On the configuration tab that appears, select a connection and enter SQL statements based on the syntax supported by AnalyticDB for MySQL. You can enter Data Manipulation Language (DML) or Data Definition Language (DDL) statements in the SQL code editor.
  4. Click the Properties tab in the right-side navigation pane to configure recurrence for the node.
  5. After the configuration is completed, click the Save icon to save the node settings. Then, click the Run icon to run the SQL statements you entered.

Perform O&M

After you commit and deploy a created node, click the DataWorks icon in the upper-left corner and choose All Products > Operation Center. You can perform O&M on the page that appears. For more information, see Operation Center.

Manage metadata

Click the DataWorks icon in the upper-left corner and choose All Products > DataMap. You can manage metadata on the page that appears. For more information, see Data Map.