After you configure data sources, network environments, and resource groups, you can create and run data synchronization solutions. This topic describes how to create a data synchronization solution and view the status of the nodes that are generated by the data synchronization solution.

Create a data synchronization solution

  1. Go to the Data Integration page. In the left-side navigation pane of the Data Integration page, click Data Synchronization Node to go to the Tasks page.
    For more information, see Select a synchronization solution.
  2. On the Tasks page, click New task in the upper-right corner. In the Select Source and Destination section, specify the source type and destination type.
  3. In the Select Synchronization Solution section, click One-click real-time synchronization to Hologres, and then click Next. In the Configure Network Connection for Synchronization step, specify the source, the destination, and the exclusive resource group for Data Integration that you want to use. Click Test Connectivity to test the network connectivity between the resource group and the data sources, and then click Next.
  4. In the Configure Sources and Synchronization Rules step, configure basic information such as the solution name for the data synchronization solution.
    In the Basic Configuration section, configure the following parameters.
    • Solution Name: The name of the data synchronization solution. The name can be a maximum of 50 characters in length.
    • Description: The description of the data synchronization solution. The description can be a maximum of 50 characters in length.
    • Location: The location of the synchronization nodes that are generated by the data synchronization solution.
      • If you select Automatic Workflow Creation, DataWorks automatically creates a workflow. All synchronization nodes generated by the data synchronization solution are named in the format of clone_database_Source name+to+Destination name and placed in the Data Integration folder of this workflow.
      • If you clear Automatic Workflow Creation, you must select a directory from the Select Location drop-down list. All synchronization nodes generated by the data synchronization solution are placed in the specified directory.

  5. Select the tables from which you want to read data and configure mapping rules.
    1. In the Data Source section, configure the Type and Data source parameters.
      Note You can set the Type parameter only to MySQL, Oracle, or PolarDB.
    2. In the Source Table section, select the tables from which you want to read data in the Source Table list. Then, click the add icon to move the tables to the Selected Tables list.
      The Source Table list displays all the tables in the source. You can select all or some of the tables in a database for synchronization.
      Notice If a selected source table does not have a primary key, the table cannot be synchronized in real time.
    3. In the Mapping Rules for Table Names section, click Add Rule, select a rule type, and then configure a mapping rule of the selected type.
      The supported rule types are Conversion Rule for Table Name and Rule for Destination Table Name.
      • Conversion Rule for Table Name: This type of rule is used to convert the names of the source tables to the names of the destination tables. For example, if the source table is named my_table and you set Source to "my" and Target to "your" when you configure a rule of the Conversion Rule for Table Name type, my_table is converted to your_table.
      • Rule for Destination Table Name: This type of rule is used to add a prefix or a suffix to the converted names of the destination tables. In a rule of this type, ${db_table_name_src_transed} represents the table name that is obtained after a rule of the Conversion Rule for Table Name type is applied. For example, if the converted table name is your_table and you enter pre_${db_table_name_src_transed}_post, the destination table name is pre_your_table_post.
    4. Click Next.
  6. Select the tables to which you want to write data and configure formats for the destination tables.
    1. In the Basic Configurations of Destination Table section of the Set Destination Table step, configure the Destination and Policy for Writing to Hologres parameters.
    2. Click Refresh source table and Hologres Table mapping to configure the mappings between the source tables and destination Hologres tables.
    3. View the mapping progress, source tables, and mapped destination tables.
      Configuration of the destination Hologres tables
      No.  Description
      1    The progress of mapping the source tables to the destination tables.
           Note The mapping may require a long period of time if data is synchronized from a large number of tables.
      2    A source table that does not have a primary key cannot be synchronized. To specify a primary key for such a table, click the edit icon in the column that corresponds to the source table.
      3    The method that is used to obtain the destination table, which is specified in the Table creation method column. Valid values: Create Table and Use Existing Table.
           • If you select Use Existing Table, the names of existing Hologres tables are automatically displayed in the drop-down list of the Table name column. Select the table that you want to use.
           • If you select Create Table, the name of the destination table that is automatically created appears in the Table name column. You can click the table name to view and modify the table creation statement.
    4. Click Next.
  7. Configure the resources required to run the data synchronization solution.
    In the Configure Resources step, configure the following parameters.
    • Select an exclusive resource group for real-time tasks: The exclusive resource group used to run the real-time synchronization node that is generated by the data synchronization solution. Select an exclusive resource group for Data Integration that you created from the drop-down list. Only exclusive resource groups for Data Integration can be used to run data synchronization solutions. For more information, see Plan and configure resources.
    • Exclusive Resource Groups for Full Batch Sync Nodes
    • Resource Group for Scheduling: The exclusive resource group for scheduling used to schedule the batch synchronization node that is generated by the data synchronization solution.
    • Maximum number of connections supported by source read: The maximum number of Java Database Connectivity (JDBC) connections that are allowed for the source. Configure this parameter based on the resources of the source database.
    • Offline task name rules: The name of the batch synchronization node that is used to synchronize the full data of the source. After a data synchronization solution is created, DataWorks first generates a batch synchronization node to synchronize full data and then generates a real-time synchronization node to continuously synchronize incremental data.
  8. Click Complete Configuration. The data synchronization solution is configured.
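
The two table-name mapping rule types in step 5 can be previewed offline before you enter them in the console. The following Python sketch is illustrative only and is not part of DataWorks: the function names are hypothetical, and it assumes that a Conversion Rule for Table Name performs a plain substring replacement of every occurrence of the Source string, which the console may implement differently. Only the ${db_table_name_src_transed} placeholder and the example values come from this topic.

```python
# Hypothetical helpers for previewing destination table names.
# Assumption: the conversion rule is a plain substring replacement.

PLACEHOLDER = "${db_table_name_src_transed}"


def convert_table_name(source_table: str, source: str, target: str) -> str:
    """Conversion Rule for Table Name: replace `source` with `target`."""
    return source_table.replace(source, target)


def apply_destination_rule(converted_name: str, template: str) -> str:
    """Rule for Destination Table Name: fill the placeholder in `template`."""
    return template.replace(PLACEHOLDER, converted_name)


def map_table_name(source_table: str, source: str, target: str, template: str) -> str:
    """Apply the conversion rule first, then the destination-name rule."""
    converted = convert_table_name(source_table, source, target)
    return apply_destination_rule(converted, template)


if __name__ == "__main__":
    name = map_table_name(
        "my_table", "my", "your", "pre_${db_table_name_src_transed}_post"
    )
    print(name)  # pre_your_table_post
```

Running the sketch over the full list of selected source tables is a quick way to spot two source tables that would map to the same destination name.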

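Because a source table without a primary key cannot be synchronized in real time, it can help to list such tables before you select them in step 5. The following Python sketch assumes a MySQL source and relies on the COLUMN_KEY column of information_schema.COLUMNS, which is set to PRI for primary-key columns. The function name, the sample rows, and the %s parameter style (used by common MySQL drivers for Python) are assumptions for illustration. The sketch does not open a database connection, and the query also returns views.

```python
from typing import Dict, Iterable, List, Tuple

# Query to run against the MySQL source with a driver of your choice.
# The %s placeholder stands for the database (schema) name.
COLUMNS_QUERY = """
SELECT TABLE_NAME, COLUMN_NAME, COLUMN_KEY
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = %s
"""


def tables_without_primary_key(rows: Iterable[Tuple[str, str, str]]) -> List[str]:
    """Return the sorted names of tables that have no column with COLUMN_KEY = 'PRI'.

    `rows` holds (table_name, column_name, column_key) tuples, for example
    the result set of COLUMNS_QUERY.
    """
    has_primary_key: Dict[str, bool] = {}
    for table_name, _column_name, column_key in rows:
        found = column_key == "PRI"
        has_primary_key[table_name] = has_primary_key.get(table_name, False) or found
    return sorted(name for name, ok in has_primary_key.items() if not ok)


if __name__ == "__main__":
    # Hypothetical sample rows in place of a real query result.
    sample_rows = [
        ("orders", "id", "PRI"),
        ("orders", "amount", ""),
        ("audit_log", "message", ""),
        ("audit_log", "created_at", "MUL"),
    ]
    print(tables_without_primary_key(sample_rows))  # ['audit_log']
```

For each table that the sketch reports, you can either add a primary key in the source database or specify one in the Set Destination Table step, as described in step 6.
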
Run the data synchronization solution

On the Tasks page, find the created data synchronization solution and click Submit and Run in the Operation column to run the solution.

View the status of nodes that are generated by the data synchronization solution

  • On the Tasks page, find the solution that you ran and click Execution details in the Operation column to view the execution details of all nodes generated by the solution.
  • Find a node whose execution details you want to view and click Execution details in the Status column. In the message that appears, click the provided link to go to the DataStudio page.

Manage the data synchronization solution

  • View or edit the data synchronization solution.
    On the Tasks page, find the data synchronization solution that you want to view or edit, and choose More > View Configuration in the Operation column.
    Note You can click Modify Configuration to modify the data synchronization solution only if the solution is in the Not Running state. If the solution is in another state, clicking Modify Configuration in the Operation column only displays information about the solution.
  • Delete the data synchronization solution.
    Find the data synchronization solution that you want to delete and choose More > Delete in the Operation column. In the Delete message, click OK.
    Note After you click OK, only the configuration record of the data synchronization solution is deleted. The generated synchronization nodes and tables are not affected.