DataWorks: Full and incremental synchronization

Last Updated: Oct 13, 2023

This topic provides answers to some frequently asked questions about full and incremental synchronization.

Overview

Why is decimal(7,4) converted to numeric(38,18) when I run a synchronization task to synchronize data from MySQL to Hologres?

To prevent the loss of data precision, Data Integration automatically increases the data precision during data synchronization. Data Integration allows you to modify table creation statements. You can also configure default data-type conversion rules for data synchronization based on your business requirements when you configure a synchronization task. For more information, see Create a real-time synchronization solution to synchronize data to Hologres.
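The reasoning behind the widening can be checked with simple arithmetic: a conversion loses nothing when the destination type has at least as many fractional digits and at least as many integer digits as the source type. The following is a minimal sketch of that check in Python. The function names widening_is_lossless and fits are hypothetical helpers written for this illustration and are not part of Data Integration.

```python
from decimal import Decimal


def widening_is_lossless(src_precision, src_scale, dst_precision, dst_scale):
    """True if every numeric(src) value can be stored in numeric(dst) unchanged."""
    src_integer_digits = src_precision - src_scale
    dst_integer_digits = dst_precision - dst_scale
    return src_scale <= dst_scale and src_integer_digits <= dst_integer_digits


def fits(value, precision, scale):
    """True if `value`, with its digits as written, fits numeric(precision, scale)."""
    if not value.is_finite():
        return False
    if value.is_zero():
        return True
    _, digits, exponent = value.as_tuple()
    # Digits after the decimal point, as written in the value.
    fraction_digits = max(0, -exponent)
    # Digits in front of the decimal point.
    integer_digits = max(0, len(digits) + exponent)
    return fraction_digits <= scale and integer_digits <= precision - scale


largest = Decimal("999.9999")  # largest value a decimal(7,4) column can hold
print(widening_is_lossless(7, 4, 38, 18))          # True
print(fits(largest, 7, 4), fits(largest, 38, 18))  # True True
print(widening_is_lossless(38, 18, 7, 4))          # False: narrowing can lose digits
```

decimal(7,4) has 3 integer digits and 4 fractional digits, and numeric(38,18) has 20 and 18, so every source value is stored without rounding.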

Can I run a one-click real-time synchronization task to synchronize data from tables in sharded databases to the same MaxCompute table?

No. A one-click real-time full and incremental synchronization task cannot synchronize data from tables in sharded databases to the same MaxCompute table. To do this, use a synchronization task that is designed for synchronizing data from tables in sharded databases to MaxCompute. For more information, see Synchronize data from tables in sharded MySQL databases to MaxCompute.

How do I prevent an error from being reported after fields in a source table specified in a one-click real-time full and incremental synchronization task are changed?

When you configure the synchronization task, you can configure processing rules for DDL messages generated for operations performed on the source. The processing rules include normal processing, ignoring, and error reporting. For more information, see the Configure rules to process DDL or DML messages and configure synchronization rules section in the Configure a synchronization task in Data Integration topic.
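The three processing rules can be pictured as a simple dispatch on each incoming DDL message. The sketch below is only an illustration of that idea, under the assumption that normal processing forwards the message to the destination, ignoring discards it, and error reporting stops the task. The names DdlRule, DdlRejectedError, and handle_ddl are hypothetical and do not correspond to any Data Integration API.

```python
from enum import Enum


class DdlRule(Enum):
    NORMAL = "normal processing"
    IGNORE = "ignoring"
    ERROR = "error reporting"


class DdlRejectedError(RuntimeError):
    """Raised when a DDL message arrives and the rule is error reporting."""


def handle_ddl(message, rule, forwarded):
    """Apply one rule to one DDL message; `forwarded` collects what is passed on."""
    if rule is DdlRule.NORMAL:
        # Assumption: the message is passed on to the destination.
        forwarded.append(message)
    elif rule is DdlRule.IGNORE:
        # Assumption: the message is dropped and the task keeps running.
        return
    else:
        # Assumption: the task fails so that an operator can intervene.
        raise DdlRejectedError(f"DDL message rejected by rule: {message}")


forwarded = []
handle_ddl("ALTER TABLE orders ADD COLUMN note VARCHAR(64)", DdlRule.NORMAL, forwarded)
handle_ddl("ALTER TABLE orders DROP COLUMN note", DdlRule.IGNORE, forwarded)
print(forwarded)
```

With this model, a field change in the source table raises an error only when the rule for that kind of DDL message is error reporting.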

What do I do if my PolarDB data source fails the network connectivity test?

  • Problem description: I add a PolarDB data source and test the network connectivity of the data source. The data source fails the test.

  • Solution: Set the Data Source Type parameter to Connection String Mode and check the whitelist configuration of the data source and the virtual private cloud (VPC) configuration of your exclusive resource group.

What do I do if my Oracle data source fails the network connectivity test?

  • Problem description: I add an Oracle data source and test the network connectivity of the data source. The data source fails the test.

  • Solution: Set the Data Source Type parameter to Connection String Mode and check the whitelist configuration of the data source and the VPC configuration of your exclusive resource group.

What do I do if my OceanBase data source fails the network connectivity test?

  • Problem description: I add an OceanBase data source and test the network connectivity of the data source. The data source fails the test.

  • Solution: Set the Data Source Type parameter to Connection String Mode and check the whitelist configuration of the data source and the VPC configuration of your exclusive resource group.

What do I do if my MySQL data source fails the network connectivity test?

  • Problem description: I add a MySQL data source and test the network connectivity of the data source. The data source fails the test.

  • Solution: Set the Data Source Type parameter to Connection String Mode and check the whitelist configuration of the data source and the VPC configuration of your exclusive resource group.
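When a connectivity test fails for any of the data sources above, it can help to first confirm that the database endpoint accepts TCP connections at all. The following is a minimal sketch of such a check in Python. It assumes that you run it from a machine in the same VPC as the exclusive resource group, and the function name can_connect is a hypothetical helper, not a DataWorks tool. A successful connection does not prove that the whitelist admits the resource group itself, because the whitelist is evaluated against the IP address of the connecting machine.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_connect("your-database-endpoint", 3306) checks a MySQL-compatible endpoint, where the host name is a placeholder and 3306 is the port that MySQL commonly uses. A timeout usually points to a whitelist or routing problem, and an immediate refusal usually points to a wrong port or a service that is not listening.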

When I configure a real-time full and incremental synchronization task used to synchronize data to MaxCompute, the MaxCompute data source that I want to use is dimmed. What do I do?

The MaxCompute data source is dimmed because you added it to DataWorks manually. It was not generated when a MaxCompute project was associated with the workspace as a compute engine instance. For a real-time full and incremental synchronization task that synchronizes data to MaxCompute, you must select a MaxCompute data source that was generated when you associated a MaxCompute compute engine with the workspace.

When the real-time synchronization subtask generated by my synchronization task used to synchronize data from a PolarDB data source is run, the following error message is returned: com.alibaba.otter.canal.parse.exception.PositionNotFoundException: can't find start position for XXX. What do I do?

  • Problem description: The real-time synchronization subtask generated by my synchronization task used to synchronize data from a PolarDB data source fails, and the following error message is returned: com.alibaba.otter.canal.parse.exception.PositionNotFoundException: can't find start position for XXX.

  • Cause: The binary logging feature is disabled for the PolarDB data source.

  • Solution: Enable the binary logging feature for the PolarDB data source. For more information, see PolarDB data source. After you enable the feature, change at least one data record in the source and set the start offset of the real-time synchronization subtask to the current time.

When the real-time synchronization subtask generated by my synchronization task used to synchronize data from a PolarDB data source is run, the following error message is returned: com.alibaba.otter.canal.parse.exception.CanalParseException: command : 'show master status' has an error! pls check. you need (at least one of) the SUPER,REPLICATION CLIENT privilege(s) for this operation. What do I do?

  • Problem description: The real-time synchronization subtask generated by my synchronization task used to synchronize data from a PolarDB data source fails, and the following error message is returned: com.alibaba.otter.canal.parse.exception.CanalParseException: command : 'show master status' has an error! pls check. you need (at least one of) the SUPER,REPLICATION CLIENT privilege(s) for this operation.

  • Cause: The account used to synchronize data is not authorized to access the PolarDB data source, or the connected PolarDB database is not deployed on the primary node.

  • Solution: Authorize the account to access the PolarDB data source, or check whether the connected PolarDB database is deployed on the primary node. A real-time synchronization task cannot synchronize data from read-only nodes in a PolarDB data source. For information about how to authorize an account to access a PolarDB data source, see PolarDB data source.
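The error message names the privileges that the account lacks: SUPER or REPLICATION CLIENT. One way to confirm whether an account has either of them is to inspect the output of the MySQL statement SHOW GRANTS for that account. The following is a minimal Python sketch that scans such output. The function name can_show_master_status is a hypothetical helper, the sketch assumes the grant lines follow the standard MySQL format, and it considers only global (ON *.*) grants.

```python
import re

# Privileges named in the error message; either one is sufficient.
REQUIRED = {"SUPER", "REPLICATION CLIENT"}

# Matches global grants such as: GRANT SELECT, REPLICATION CLIENT ON *.* TO `sync`@`%`
GLOBAL_GRANT = re.compile(r"GRANT\s+(.+?)\s+ON\s+\*\.\*\s+TO\s", re.IGNORECASE)


def can_show_master_status(grant_lines):
    """Return True if any global grant includes SUPER, REPLICATION CLIENT, or ALL PRIVILEGES."""
    for line in grant_lines:
        match = GLOBAL_GRANT.match(line.strip())
        if not match:
            continue  # not a global grant, for example ON `db`.*
        privileges = {p.strip().upper() for p in match.group(1).split(",")}
        if privileges & REQUIRED or "ALL PRIVILEGES" in privileges:
            return True
    return False


grants = [
    "GRANT SELECT, REPLICATION SLAVE ON *.* TO `sync`@`%`",
    "GRANT ALL PRIVILEGES ON `shop`.* TO `sync`@`%`",
]
print(can_show_master_status(grants))  # False: no SUPER or REPLICATION CLIENT on *.*
```

In standard MySQL, the missing privilege is granted with a statement such as GRANT REPLICATION CLIENT ON *.* TO 'sync'@'%', where sync is a placeholder account name. PolarDB may require that you manage account privileges in a different way, so treat the statement as an assumption and follow the PolarDB data source topic.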

When the real-time synchronization subtask generated by my synchronization task used to synchronize data from a PolarDB data source is run, the following error message is returned: com.alibaba.datax.plugin.reader.mysqlbinlogreader.MysqlBinlogReaderException: The mysql server does not enable the binlog write function. Please enable the mysql binlog write function first. What do I do?

  • Problem description: The real-time synchronization subtask generated by my synchronization task used to synchronize data from a PolarDB data source fails, and the following error message is returned: com.alibaba.datax.plugin.reader.mysqlbinlogreader.MysqlBinlogReaderException: The mysql server does not enable the binlog write function. Please enable the mysql binlog write function first.

  • Cause: The loose_polar_log_bin parameter is set to off for the PolarDB data source.

  • Solution: Set the loose_polar_log_bin parameter to on. For more information, see Enable binary logging.

When the batch synchronization subtask generated by my synchronization task is run, the following error message is returned: com.alibaba.datax.common.exception.DataXException: Code:[HoloWriter-02], Description:[Invalid config parameter in your configuration.]. - Field _log_file_name_offset_ not allow null but not present in user configured columns. What do I do?

  • Problem description: The batch synchronization subtask generated by my synchronization task fails, and the following error message is returned: com.alibaba.datax.common.exception.DataXException: Code:[HoloWriter-02], Description:[Invalid config parameter in your configuration.]. - Field _log_file_name_offset_ not allow null but not present in user configured columns.

  • Cause: The DataWorks engine plug-in that is used for batch synchronization is not updated to the latest version.

  • Solution: Submit a ticket to ask technical support to update the plug-in.

When the batch synchronization subtask generated by my synchronization task used to synchronize data to Hologres is run, the following error message is returned: errorCode:NoSuchTopic, errorMessage:The specified topic name does not exist. What do I do?

  • Problem description: The batch synchronization subtask generated by my synchronization task used to synchronize data to Hologres fails, and the following error message is returned: errorCode:NoSuchTopic, errorMessage:The specified topic name does not exist.

  • Causes:

    • A destination Hologres table used for data synchronization does not exist.

    • The batch synchronization subtask generated by the synchronization task synchronizes data from a source table to a Hologres foreign table. Hologres Writer cannot be used to synchronize data to Hologres foreign tables.

  • Solution: Use a Hologres internal table as a destination table for data synchronization. If the destination Hologres table does not exist, modify the configurations of the synchronization task and set Table Generation Method to Create Table to enable the system to automatically create a destination Hologres table. For more information, see the Step 4: Configure a destination table section in the Create a real-time synchronization solution to synchronize data to Hologres topic.

Does the system retain information about a source table, such as not-null properties and the default values of fields, in the mapped destination table that is automatically created?

No. When the system creates a destination table, it retains only the field names, data types, and remarks of the mapped source table. It does not retain the default values of fields or constraints such as not-null constraints and constraints on indexes.
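The effect on a single field can be sketched as a projection that keeps three attributes and drops the rest. The Python sketch below is only an illustration: the classes SourceColumn and DestinationColumn and the function to_destination are hypothetical, and the sketch leaves out any data type conversion that the synchronization task applies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceColumn:
    name: str
    data_type: str
    comment: str = ""
    default: Optional[str] = None  # not carried over
    not_null: bool = False         # not carried over


@dataclass(frozen=True)
class DestinationColumn:
    name: str
    data_type: str
    comment: str = ""


def to_destination(column):
    """Keep only the field name, data type, and remark of a source column."""
    return DestinationColumn(column.name, column.data_type, column.comment)


source = SourceColumn("price", "decimal(7,4)", "unit price", default="0", not_null=True)
print(to_destination(source))
```

If the destination table needs default values or not-null constraints, one option is to add them to the table creation statement yourself.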