Data Integration allows you to synchronize data from tables in multiple databases to the same destination table at a time. Specifically, you can create a batch sync node, specify tables in multiple databases as the source tables, and then specify the destination table. After that, you can run the batch sync node to synchronize the data from the source tables to the destination table.

Background information

When you configure a batch sync node for synchronizing data from tables in multiple databases to the same destination table, you must specify source tables. Make sure that all the source tables have the same schema.

You can specify source tables in various types of databases such as MySQL, SQL Server, Oracle, PostgreSQL, PolarDB, and AnalyticDB databases. You can use the codeless user interface (UI) to configure a batch sync node for synchronizing data from tables in multiple MySQL databases to the same destination table.

Procedure

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where the target workspace resides. Find the target workspace and click Data Analytics in the Actions column.
  2. Create a batch sync node.
    1. On the Data Development tab, move the pointer over Create icon and choose Data integration > Offline synchronization.
      Alternatively, you can click the target workflow in the Business process section, right-click Data Integration, and then choose New > Offline synchronization.
    2. In the New node dialog box, set the Node name and Destination folder parameters.
      Note The node name must be 1 to 128 characters in length and can contain letters, digits, underscores (_), and periods (.). It is not case-sensitive.
    3. Click Submit.
  3. Configure the batch sync node.
    You can configure the batch sync node on the codeless UI or in the code editor.
    • If you configure the batch sync node on the codeless UI, specify the source data stores and tables in the Data source section. If you need to add connections to more data stores, click New data source. For more information, see Create a sync node by using the codeless UI.Configuration tab
    • If you configure the batch sync node in the code editor, write the code for synchronizing data from the source tables in multiple databases to the destination table. In this example, enter the following code. For more information, see Create a sync node by using the code editor.
      {
          "type": "job",
          "version": "1.0",
          "configuration": {
              "reader": {
                  "plugin": "mysql",
                  "parameter": {
                      "connection": [
                          {
                              "table": [ // The source tables in the first database.
                                  "tbl1",
                                  "tbl2",
                                  "tbl3"
                              ],
                              "datasource": "datasourceName1" // The name of the first database.
                          },
                          {
                              "table": [ // The source tables in the second database.
                                  "tbl4",
                                  "tbl5",
                                  "tbl6"
                              ],
                              "datasource": "datasourceName2" // The name of the second database.
                          }
                      ],
                      "singleOrMulti": "multi",
                      "splitPk": "db_id",
                      "column": [
                          "id", "name", "age"
                      ],
                      "where": "1 < id and id < 100"
                  }
              },
              "writer": {
              }
          }
      }
  4. Commit the node.
    After the node properties are configured, click the Save icon in the upper-left corner. Then, commit or commit and unlock the node to the development environment.
  5. Test the node in the production environment.