After you configure data sources, network environments, and resource groups, you can create and run a sync solution. This topic describes how to create a sync solution and view the status of the nodes that are generated by the sync solution.
Configure a sync solution
- Go to the Data Integration page and open the Tasks page. For more information, see Select a data synchronization solution.
- On the Tasks page, click New task in the upper-right corner.
- In the Create Data Synchronization Solution dialog box, click One-click real-time synchronization to Hologres.
- In the Set Synchronization Sources and Rules step, configure basic information such as the name of the data sync solution. In the Basic Configuration section, set the parameters that are described in the following table.
| Parameter | Description |
| --- | --- |
| Solution Name | The name of the sync solution. The name can be a maximum of 50 characters in length. |
| Description | The description of the sync solution. The description can be a maximum of 50 characters in length. |
| Location | If you select Automatic Workflow Creation, DataWorks automatically creates a workflow named in the format of clone_database_Source name+to+Destination name. All sync nodes generated by the sync solution are placed in the Data Integration folder of this workflow. If you clear Automatic Workflow Creation, select a directory from the Select Location drop-down list. All sync nodes generated by the data sync solution are placed in the specified directory. |
- Select a source data source and configure sync rules.
- In the Data Source section, specify the Type and Data source parameters. Note: You can set the Type parameter only to MySQL, Oracle, or PolarDB.
- In the Source Table section, select the tables whose data you want to synchronize from the Source Table list. Then, click the icon to add the tables to the Selected Source Table list. The Source Table section displays all the tables in the source. You can select all or specific tables. Notice: If a selected table does not have a primary key, the table cannot be synchronized in real time.
- In the Mapping Rules for Table Names section, click Add rule to select a rule. Supported options are Conversion Rule for Table Name and Rule for Destination Table name.
- Conversion Rule for Table Name: the rule that is used to convert the names of source tables to those of destination tables.
- Rule for Destination Table name: the rule that is used to add a prefix or a suffix to the converted names of destination tables.
- Click Next Step.
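The two mapping rules described in this step can be pictured as a two-stage rename: the conversion rule is applied first, and the prefix or suffix is then added to the converted name. The following Python sketch is only an illustration of that order. The class names, the `map_table_name` function, and the use of a regular expression for the conversion rule are assumptions made for this example and are not part of DataWorks.

```python
import re
from dataclasses import dataclass


@dataclass
class ConversionRule:
    """Hypothetical stand-in for 'Conversion Rule for Table Name'."""
    pattern: str      # regular expression matched against the source table name (assumption)
    replacement: str  # text that replaces each match


@dataclass
class DestinationNameRule:
    """Hypothetical stand-in for 'Rule for Destination Table name'."""
    prefix: str = ""
    suffix: str = ""


def map_table_name(source_name, conversion=None, destination=None):
    """Return the destination table name for one source table."""
    name = source_name
    # Stage 1: convert the source table name.
    if conversion is not None:
        name = re.sub(conversion.pattern, conversion.replacement, name)
    # Stage 2: add a prefix or suffix to the converted name.
    if destination is not None:
        name = f"{destination.prefix}{name}{destination.suffix}"
    return name


if __name__ == "__main__":
    conv = ConversionRule(pattern=r"^tbl_", replacement="")
    dest = DestinationNameRule(prefix="ods_", suffix="_rt")
    for table in ["tbl_orders", "tbl_users", "audit_log"]:
        print(table, "->", map_table_name(table, conv, dest))
```

Previewing the resulting names in this way before you configure the rules can help you spot collisions, for example two source tables that map to the same destination name.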
- Select the destination data source and configure the formats for the destination tables.
- In the Set Destination Table step, specify Destination and Schema, and specify whether to enable Table name case sensitive.
- Click Refresh source table and Hologres Table mapping to configure the mappings between the source tables and destination Hologres tables.
- View the mapping progress, source tables, and mapped destination tables.
| Serial number | Description |
| --- | --- |
| 1 | The progress of mapping the source tables to destination tables. Note: The mapping may require an extended period of time if you want to synchronize data from a large number of tables. |
| 2 | If a source table does not have a primary key, an error message indicates that the table has no primary key and cannot be synchronized. The synchronization can still be performed if at least one of the selected source tables has a primary key. Source tables without primary keys are ignored during the synchronization. |
| 3 | The source of the destination table. Valid values: Create Table and Use Existing Table. |
| 4 | The name of the destination table. The information that appears here varies based on the value that you selected from the drop-down list in the Table creation method column. If you set Table creation method to Use Existing Table, the names of existing Hologres tables are automatically displayed in the drop-down list of the Hologres Table name column, and you can select the table that you want to use. If you set Table creation method to Create Table, the name of the destination table that is automatically created appears. To view and modify the SQL statements that are used to create the table, click the name of the table. |
- Click Next Step.
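The primary key behavior described in this step (tables without a primary key are ignored, and the solution can proceed as long as at least one selected table has a primary key) can be sketched as a simple pre-check. The function names and the input shape, a dictionary from table name to its primary key columns, are hypothetical and chosen only for illustration.

```python
def split_by_primary_key(tables):
    """Split selected source tables into syncable and ignored tables.

    tables: dict that maps a table name to a list of its primary key
    columns. An empty list means the table has no primary key.
    """
    syncable = [name for name, pk_columns in tables.items() if pk_columns]
    ignored = [name for name, pk_columns in tables.items() if not pk_columns]
    return syncable, ignored


def can_synchronize(tables):
    """Return True if at least one selected table has a primary key."""
    syncable, _ = split_by_primary_key(tables)
    return len(syncable) > 0


if __name__ == "__main__":
    selected = {"orders": ["order_id"], "users": ["user_id"], "raw_events": []}
    syncable, ignored = split_by_primary_key(selected)
    print("synchronized:", syncable)
    print("ignored (no primary key):", ignored)
    print("solution can proceed:", can_synchronize(selected))
```

Running a check like this against your own table metadata before you select source tables shows in advance which tables would be skipped.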
- Configure the resources required by the data sync solution. In the Set Resources for Solution Running step, set the parameters that are described in the following table.
| Parameter | Description |
| --- | --- |
| Select an exclusive resource group for real-time tasks; Resource Groups for Full Batch Sync Nodes | The exclusive resource group that is used to run the real-time sync node and batch sync node generated by the data sync solution. Only exclusive resource groups for Data Integration can be used to run solutions. You can set this parameter to the name of the exclusive resource group for Data Integration that you purchased. For more information, see Plan and configure resources. |
| Select scheduling Resource Group | The resource group for scheduling that is used to run the nodes generated by the batch sync solution. |
| Maximum number of connections supported by source read | The maximum number of Java Database Connectivity (JDBC) connections that are allowed for the source. Specify an appropriate number based on the resources of the source. |
| Offline task name rules | The name of the batch sync node that is used to synchronize the full data of the source database. After a sync solution is configured, DataWorks first runs a batch sync node to synchronize full data, and then runs a real-time sync node to synchronize incremental data. |
- Click Complete Configuration. The sync solution is configured.
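The run order described for the solution, a batch sync node that copies full data followed by a real-time sync node that applies incremental changes, can be illustrated with a small in-memory model. This is a conceptual sketch only and does not reflect how DataWorks or Hologres apply changes internally. The `run_solution` function, the `id` key, and the change log format are assumptions made for this example.

```python
def run_solution(snapshot, change_log):
    """Simulate a full load followed by incremental changes.

    snapshot: list of row dicts, each with an "id" key used as primary key.
    change_log: list of (operation, row) tuples, where operation is
    "insert", "update", or "delete".
    """
    destination = {}

    # Phase 1: the batch sync node copies the full data.
    for row in snapshot:
        destination[row["id"]] = dict(row)

    # Phase 2: the real-time sync node applies incremental changes,
    # using the primary key to locate the affected row.
    for operation, row in change_log:
        if operation in ("insert", "update"):
            destination[row["id"]] = dict(row)
        elif operation == "delete":
            destination.pop(row["id"], None)
        else:
            raise ValueError(f"unknown operation: {operation}")

    return destination


if __name__ == "__main__":
    full_data = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
    changes = [
        ("update", {"id": 1, "status": "paid"}),
        ("delete", {"id": 2}),
        ("insert", {"id": 3, "status": "new"}),
    ]
    print(run_solution(full_data, changes))
```

The sketch also shows why a primary key matters for real-time synchronization: without a key, the incremental phase has no way to identify which destination row a change refers to.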
Run the sync solution
On the Tasks page, find the created data sync solution and click Submit and Run in the Operation column to run the data sync solution.
- The system displays the following error message for a real-time synchronization node: "com.alibaba.otter.canal.parse.exception.PositionNotFoundException: can't find start position for XXX." What do I do?
- The system displays the following error message for a real-time data synchronization node: "com.alibaba.otter.canal.parse.exception.CanalParseException: command : 'show master status' has an error! pls check. you need (at least one of) the SUPER,REPLICATION CLIENT privilege(s) for this operation." What do I do?
- The system displays the following error message for a real-time data synchronization node: "com.alibaba.datax.plugin.reader.mysqlbinlogreader.MysqlBinlogReaderException: The mysql server does not enable the binlog write function. Please enable the mysql binlog write function first." What do I do?
- The system displays the following error message for the batch synchronization node: "com.alibaba.datax.common.exception.DataXException: Code:[HoloWriter-02], Description:[Invalid config parameter in your configuration.]. - Field _log_file_name_offset_ not allow null but not present in user configured columns." What do I do?
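When you scan node logs for the errors listed above, a small lookup that matches a log line against the distinctive part of each message can speed up triage. The following sketch is a hypothetical helper, not a DataWorks feature. The hints only restate what each error message itself says, and the marker strings are copied from the messages above.

```python
# Each entry pairs a distinctive substring of an error message with a
# short hint that restates the message. The hints are not official guidance.
KNOWN_ERRORS = [
    ("PositionNotFoundException",
     "The real-time node cannot find a start position for the source."),
    ("REPLICATION CLIENT",
     "The source account needs the SUPER or REPLICATION CLIENT privilege "
     "to run 'show master status'."),
    ("does not enable the binlog write function",
     "Binary logging is not enabled on the MySQL server."),
    ("HoloWriter-02",
     "A required field is missing from the configured columns of the "
     "batch sync node."),
]


def classify_error(log_line):
    """Return a hint for a known error message, or None if it is unknown."""
    for marker, hint in KNOWN_ERRORS:
        if marker in log_line:
            return hint
    return None


if __name__ == "__main__":
    line = ("com.alibaba.otter.canal.parse.exception.PositionNotFoundException: "
            "can't find start position for XXX.")
    print(classify_error(line))
```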
View the status and result of the data sync nodes
- On the Tasks page, find the solution that is run and choose More > Execution details in the Operation column. Then, you can view the execution details of all nodes.
- Find a node whose execution details you want to view and click Execution details in the Status column. In the message that appears, click the provided link to go to the DataStudio page.
Manage the data sync solution
- View or edit the data sync solution.
On the Tasks page, find the solution that you want to view or edit, and choose More > View Configuration in the Operation column. Note: You can click View Configuration to modify the sync solution only if the solution is in the Not Running state. If you click View Configuration in the Operation column that corresponds to a data sync solution in another state, you can view only the information about that data sync solution.
- Delete the data sync solution.
Find the solution that you want to delete and choose More > Delete in the Operation column. In the Delete message, click OK. Note: After you click OK, only the configuration record of the data sync solution is deleted. The generated sync nodes and tables are not affected.
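The two management rules in this section can be summarized in a small model: a solution is editable only in the Not Running state, and deleting a solution removes its configuration record while leaving the generated nodes in place. The `Workspace` class and the "Running" state name below are hypothetical and serve only to illustrate the rules.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Hypothetical model of sync solutions and the nodes they generated."""
    solutions: dict = field(default_factory=dict)  # solution name -> state
    nodes: dict = field(default_factory=dict)      # solution name -> node names

    def can_edit(self, name):
        # Only a solution in the Not Running state can be modified.
        # In any other state, the configuration is view-only.
        return self.solutions[name] == "Not Running"

    def delete_solution(self, name):
        # Only the configuration record is removed.
        # Generated sync nodes are intentionally left untouched.
        del self.solutions[name]


if __name__ == "__main__":
    ws = Workspace(
        solutions={"mysql_to_holo": "Not Running", "oracle_to_holo": "Running"},
        nodes={"mysql_to_holo": ["full_sync", "realtime_sync"]},
    )
    print(ws.can_edit("mysql_to_holo"))
    print(ws.can_edit("oracle_to_holo"))
    ws.delete_solution("mysql_to_holo")
    print(ws.nodes["mysql_to_holo"])
```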