Kafka is a distributed message queue service that provides high throughput and high scalability. It is widely used in big data scenarios such as log collection, data aggregation, stream processing, and online and offline analysis, which makes it an important part of the big data ecosystem. This topic describes how to synchronize data from a PolarDB for MySQL cluster to a self-managed Kafka cluster by using Data Transmission Service (DTS). Synchronizing data to Kafka allows you to extend your message processing capabilities.
Prerequisites
- A self-managed Kafka cluster is created, and its version is in the range 0.10.1.0 to 2.7.0.
- The binary logging feature is enabled for the PolarDB for MySQL cluster. For more information, see Enable binary logging.
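The version prerequisite above can be checked with a small helper before a task is configured. The following is a minimal sketch, not part of DTS: the function names are hypothetical, and it assumes that both ends of the 0.10.1.0 to 2.7.0 range are inclusive and that version strings contain only dot-separated integers.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string into a 4-part tuple of ints.

    Kafka versions have three or four components ("2.7.0", "0.10.1.0"),
    so shorter versions are padded with zeros for a fair comparison.
    """
    parts = [int(p) for p in version.strip().split(".")]
    parts += [0] * (4 - len(parts))
    return tuple(parts)


# Range taken from the prerequisites; inclusive bounds are an assumption.
MIN_VERSION = parse_version("0.10.1.0")
MAX_VERSION = parse_version("2.7.0")


def is_supported_kafka_version(version: str) -> bool:
    """Return True if the version lies within the assumed supported range."""
    return MIN_VERSION <= parse_version(version) <= MAX_VERSION


if __name__ == "__main__":
    for candidate in ("0.9.0.1", "0.10.1.0", "1.1.1", "2.7.0", "2.8.0"):
        print(candidate, is_supported_kafka_version(candidate))
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would rank "0.9" above "0.10".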
Precautions
The tables to be synchronized from the source database must have PRIMARY KEY or UNIQUE constraints, and all fields must be unique. Otherwise, the destination Kafka cluster may contain duplicate data records.
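One way to find tables that lack a PRIMARY KEY or UNIQUE constraint is to query the standard MySQL `information_schema` views and inspect the result. The sketch below is illustrative only: `CONSTRAINT_QUERY` and `tables_without_key` are hypothetical names, the `%s` placeholder assumes a driver that uses that parameter style (such as PyMySQL), and running the query against a live cluster is left to the reader.

```python
# Lists every base table in one schema together with any PRIMARY KEY or
# UNIQUE constraint it has. Tables without such a constraint still appear,
# with NULL as the constraint type, because of the LEFT JOIN.
CONSTRAINT_QUERY = """
SELECT t.TABLE_NAME, c.CONSTRAINT_TYPE
FROM information_schema.TABLES AS t
LEFT JOIN information_schema.TABLE_CONSTRAINTS AS c
  ON c.TABLE_SCHEMA = t.TABLE_SCHEMA
 AND c.TABLE_NAME = t.TABLE_NAME
 AND c.CONSTRAINT_TYPE IN ('PRIMARY KEY', 'UNIQUE')
WHERE t.TABLE_SCHEMA = %s
  AND t.TABLE_TYPE = 'BASE TABLE'
"""


def tables_without_key(rows):
    """Return sorted names of tables with no PRIMARY KEY or UNIQUE constraint.

    `rows` is an iterable of (table_name, constraint_type) pairs, as the
    query above would return; constraint_type is None when there is none.
    """
    seen = set()
    keyed = set()
    for table_name, constraint_type in rows:
        seen.add(table_name)
        if constraint_type in ("PRIMARY KEY", "UNIQUE"):
            keyed.add(table_name)
    return sorted(seen - keyed)


if __name__ == "__main__":
    # Sample rows standing in for a real query result.
    sample = [
        ("orders", "PRIMARY KEY"),
        ("users", "UNIQUE"),
        ("users", "PRIMARY KEY"),
        ("audit_log", None),
    ]
    print(tables_without_key(sample))
```

Any table that this check reports is a candidate for duplicate records in the destination and should be given a key before it is selected for synchronization.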
Limits
- You can select only tables as the objects to be synchronized.
- DTS does not automatically update the objects of the data synchronization task based on their names.
Note: If a source table is renamed during data synchronization and the new table name is not included in the selected objects, DTS does not synchronize the data of that table to the destination Kafka cluster. To synchronize the data of the renamed table, you must add the table to the selected objects of the task. For more information, see Add an object to a data synchronization task.
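The rename behavior described in the note can be modeled as a simple name lookup. The sketch below is a simplified illustration of that rule, not DTS's actual implementation: the function name and the shape of the rename list are hypothetical.

```python
def renamed_tables_to_add(selected_objects, renames):
    """Return new table names that must be added to the task by hand.

    selected_objects: table names currently selected in the task.
    renames: (old_name, new_name) pairs observed on the source.

    A renamed table drops out of synchronization when its old name was
    selected but its new name is not, which is the case the note describes.
    """
    selected = set(selected_objects)
    to_add = []
    for old_name, new_name in renames:
        if old_name in selected and new_name not in selected:
            to_add.append(new_name)
    return to_add


if __name__ == "__main__":
    selected = ["orders", "users"]
    renames = [("orders", "orders_v2"), ("scratch", "scratch_old")]
    print(renamed_tables_to_add(selected, renames))
```

In this model, `orders_v2` is reported because its old name was selected, while `scratch_old` is ignored because the table was never part of the task.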