When you configure a task to synchronize data to a Kafka cluster, you can specify the policy for synchronizing data to Kafka partitions. Choosing an appropriate policy can improve synchronization performance. For example, you can synchronize data to different partitions based on hash values.
Hash algorithm
Data Transmission Service (DTS) uses the hashCode() method in Java to calculate hash values.
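The following is a minimal sketch of what hashCode() computes, assuming the partition key is handled as a Java String (the source does not state the key type). The class name HashCodeSketch and the sample key are hypothetical and are used only for illustration.

```java
// Sketch only: reproduces the value of String.hashCode() by hand.
// Assumption: the partition key is a String; DTS internals are not shown here.
class HashCodeSketch {

    // String.hashCode() is defined as
    // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], using 32-bit int arithmetic.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // int overflow wraps around, as in the JDK
        }
        return h;
    }

    public static void main(String[] args) {
        String key = "mydb.orders"; // hypothetical key value
        System.out.println(manualHash(key) == key.hashCode()); // prints true
        System.out.println("a".hashCode());                    // prints 97
    }
}
```

Because the result is a signed 32-bit integer, the hash value can be negative for longer keys.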
Configuration method
In the Select Objects to Synchronize step of the task creation wizard, you can specify the policy for synchronizing data to Kafka partitions. For more information, see Synchronize data from an ApsaraDB RDS for MySQL instance to a user-created Kafka cluster and Overview of data synchronization scenarios.
Synchronization policies
Policy | Description | Advantage and disadvantage |
---|---|---|
Synchronize All Data to Partition 0 | DTS synchronizes all data and DDL statements to Partition 0 of the destination topic. | |
Synchronize Data to Separate Partitions Based on Hash Values of Database and Table Names | DTS uses the database and table names as the partition key to calculate the hash value. Then, DTS synchronizes the data and DDL statements of each table to the corresponding partition of the destination topic. | |
Synchronize Data to Separate Partitions Based on Hash Values of Primary Keys | DTS uses a table column as the partition key to calculate the hash value. The table column is the primary key by default. If a table does not have a primary key, the unique key is used as the partition key. DTS synchronizes each row to the corresponding partition of the destination topic. You can specify one or more columns as partition keys to calculate the hash value. | |
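The three policies can be sketched as follows. This is an illustration under stated assumptions, not the DTS implementation: the source does not say how the database and table names are combined, how multiple key columns are combined, or how a hash value is reduced to a partition number. The sketch assumes a "." separator, a "|" separator, and a non-negative modulo over the partition count. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: one possible mapping from partition key to partition number.
class PartitionPolicySketch {

    // Assumption: the hash is reduced with a non-negative modulo.
    // Math.floorMod keeps the result in [0, partitionCount) even for negative hashes.
    static int toPartition(int hash, int partitionCount) {
        return Math.floorMod(hash, partitionCount);
    }

    // Synchronize All Data to Partition 0.
    static int allToPartitionZero() {
        return 0;
    }

    // Hash of database and table names.
    // Assumption: the key is "<database>.<table>".
    static int byDatabaseAndTable(String database, String table, int partitionCount) {
        String key = database + "." + table;
        return toPartition(key.hashCode(), partitionCount);
    }

    // Hash of primary key (or other selected key columns).
    // Assumption: column values are joined with "|" before hashing.
    static int byKeyColumns(List<String> keyValues, int partitionCount) {
        String key = String.join("|", keyValues);
        return toPartition(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        int partitions = 4; // hypothetical partition count of the destination topic

        // Every row of mydb.orders maps to the same partition under this policy.
        System.out.println(byDatabaseAndTable("mydb", "orders", partitions));

        // Rows of one table can map to different partitions, depending on the key value.
        System.out.println(byKeyColumns(Arrays.asList("1001"), partitions));
        System.out.println(byKeyColumns(Arrays.asList("1002"), partitions));
    }
}
```

In this sketch, the name-based policy sends all rows of a table to one partition, whereas the key-based policy sends rows with the same key value to one partition and can spread the rows of a single table across several partitions.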