When you configure a task to migrate data to a Kafka cluster, you can specify the policy that determines which Kafka partition the data is migrated to. Choosing a suitable policy can improve migration performance. For example, you can distribute data across partitions based on hash values.

Hash algorithm

Data Transmission Service (DTS) uses the hashCode() method in Java to calculate hash values.
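
As a rough illustration of hash-based routing, the following Java sketch maps a partition key to a partition index. This document states only that DTS uses hashCode(). The modulo mapping, the class name HashPartitionSketch, and the method partitionFor are assumptions made for illustration and are not the DTS implementation.

```java
class HashPartitionSketch {

    // Hypothetical helper: maps a key to a partition index in [0, numPartitions).
    // Assumption: the hash value is reduced with a non-negative modulo.
    static int partitionFor(String partitionKey, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // String.hashCode() can be negative, so floorMod is used instead of %.
        return Math.floorMod(partitionKey.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        // Example keys are made up; they stand in for database and table names.
        String[] keys = {"sales.orders", "sales.customers", "hr.employees"};
        for (String key : keys) {
            System.out.println(key + " -> partition " + partitionFor(key, 4));
        }
    }
}
```

Because hashCode() is deterministic for a given string, the same key always lands in the same partition as long as the number of partitions stays the same.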

Configuration method

In the Configure Migration Types and Objects step of the task creation wizard, you can specify the policy for migrating data to Kafka partitions. For more information, see Migrate data from a self-managed Oracle database to a Message Queue for Apache Kafka instance and Overview of data migration scenarios.

Warning: After a data migration task is started, do not change the number of partitions in the destination topic. Otherwise, the data migration fails.
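
The following Java sketch suggests why the partition count matters to hash-based policies. It assumes, hypothetically, that the partition index is the hash value modulo the number of partitions, so the same key can map to a different partition once the count changes. The class and method names are illustrative, and the sketch does not reproduce how DTS detects or reports the failure.

```java
class PartitionCountChangeSketch {

    // Assumption: partition index = non-negative (hash mod partition count).
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Returns true if the key would be routed to a different partition
    // after the topic's partition count changes.
    static boolean isRemapped(String key, int oldCount, int newCount) {
        return partitionFor(key, oldCount) != partitionFor(key, newCount);
    }

    public static void main(String[] args) {
        // Example keys are made up for illustration.
        String[] keys = {"sales.orders", "sales.customers", "hr.employees"};
        for (String key : keys) {
            System.out.println(key
                    + ": 4 partitions -> " + partitionFor(key, 4)
                    + ", 5 partitions -> " + partitionFor(key, 5)
                    + ", remapped = " + isRemapped(key, 4, 5));
        }
    }
}
```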

Policies

Policy: Ship All Data to Partition 0
Description: DTS migrates all data and DDL statements to Partition 0 of the destination topic.
Advantage and disadvantage:
  • Advantage: The order in which all objects are created and changed is the same as that in the source database.
  • Disadvantage: This policy provides only ordinary migration performance because all data is written to a single partition.

Policy: Ship Data to Separate Partitions Based on Hash Values of Database and Table Names
Description: DTS uses the database and table names as the partition key to calculate the hash value. Then, DTS migrates the data and DDL statements of each table to the corresponding partition of the destination topic.
Note:
  • The data and DDL statements of the same table are migrated to the same partition.
  • If a DDL statement is irrelevant to a table, for example, CREATE DATABASE, the statement is migrated to Partition 0.
Advantage and disadvantage:
  • Advantage: The order in which a destination table is created and changed is the same as that of the source table. This policy provides good migration performance.
  • Disadvantage: Tables are migrated to different partitions. After data migration, the order of data changes across different tables may become inconsistent.

Policy: Ship Data to Separate Partitions Based on Hash Values of Primary Keys
Description: DTS uses a table column as the partition key to calculate the hash value. By default, this column is the primary key. If a table does not have a primary key, the unique key is used as the partition key. You can also specify one or more columns as partition keys. DTS migrates each row to the corresponding partition of the destination topic.
Note:
  • If you use this policy, DDL statements are migrated to Partition 0 of the destination topic by default.
  • If a table does not have a primary key or unique key, DTS migrates the data and DDL statements of the table to Partition 0 of the destination topic.
Advantage and disadvantage:
  • Advantage: This policy provides the best migration performance.
  • Disadvantage: After data migration, the order of data changes on each data record remains the same. However, the order of data changes on different tables or on tables without a primary key may become inconsistent.
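
The routing rules of the three policies can be summarized in a small Java sketch. It is a simplified model under stated assumptions, not DTS code: the Change class, the Policy enum, the "database.table" key format, the use of List.hashCode() for key column values, and the modulo mapping are all hypothetical choices made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PartitionPolicySketch {

    // Hypothetical names for the three policies described above.
    enum Policy { ALL_TO_PARTITION_0, HASH_DB_AND_TABLE, HASH_PRIMARY_KEY }

    // Hypothetical change event; not a DTS type.
    static final class Change {
        final String database;
        final String table;           // null for DDL not tied to a table, e.g. CREATE DATABASE
        final boolean ddl;            // true for DDL statements, false for row changes
        final List<String> keyValues; // partition key column values; empty if the table has no key

        Change(String database, String table, boolean ddl, List<String> keyValues) {
            this.database = database;
            this.table = table;
            this.ddl = ddl;
            this.keyValues = keyValues;
        }
    }

    // Assumption: the hash value is reduced with a non-negative modulo.
    static int bucket(int hash, int numPartitions) {
        return Math.floorMod(hash, numPartitions);
    }

    static int partitionFor(Policy policy, Change c, int numPartitions) {
        switch (policy) {
            case ALL_TO_PARTITION_0:
                return 0;
            case HASH_DB_AND_TABLE:
                // DDL that is irrelevant to a table goes to Partition 0.
                if (c.table == null) {
                    return 0;
                }
                // Data and DDL of the same table share one key, hence one partition.
                return bucket((c.database + "." + c.table).hashCode(), numPartitions);
            case HASH_PRIMARY_KEY:
                // DDL, and everything from tables without a key, goes to Partition 0.
                if (c.ddl || c.keyValues.isEmpty()) {
                    return 0;
                }
                return bucket(c.keyValues.hashCode(), numPartitions);
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }

    public static void main(String[] args) {
        // Example data is made up for illustration.
        Change row = new Change("sales", "orders", false, Arrays.asList("1001"));
        Change alter = new Change("sales", "orders", true, Collections.<String>emptyList());
        for (Policy policy : Policy.values()) {
            System.out.println(policy + ": row -> " + partitionFor(policy, row, 4)
                    + ", ALTER TABLE -> " + partitionFor(policy, alter, 4));
        }
    }
}
```

In this model, only the primary key policy can spread rows of one table over several partitions, which matches the trade-off described above: higher parallelism in exchange for weaker ordering across records and tables.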