Data Transmission Service: Synchronize data from an ApsaraDB RDS for MySQL instance to an ApsaraMQ for Kafka instance

Last Updated: Mar 28, 2026

Data Transmission Service (DTS) can stream change data from ApsaraDB RDS for MySQL into ApsaraMQ for Kafka in real time. This lets downstream consumers — analytics pipelines, event-driven services, or data warehouses — react to row-level changes without querying the source database directly.

Prerequisites

Before you begin, make sure that you have:

Billing

Synchronization type | Fee
Schema synchronization and full data synchronization | Free
Incremental data synchronization | Charged. See Billing overview.

Limitations

Source database requirements

  • Tables to be synchronized must have a PRIMARY KEY or UNIQUE constraint with no duplicate field values. Otherwise, the destination may contain duplicate records.

  • If you select individual tables as the synchronization objects and rename tables or columns as part of the task, a single task supports up to 1,000 tables. To synchronize more tables, split them across multiple tasks or synchronize the entire database instead.

  • Do not execute DDL statements that change database or table schemas during schema synchronization or full data synchronization. Doing so causes the task to fail.

  • DTS does not synchronize foreign keys. Cascade and delete operations on the source database are not reflected in the destination.

  • Data generated by physical backup restores or cascade operations is not captured or synchronized while the task is running. If this data is missing from the destination, remove and re-add the affected databases and tables in the synchronization objects. See Modify the objects to be synchronized.
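
The 1,000-table limit above can be planned for before any tasks are created. The following is a minimal sketch, assuming you already have the list of table names. The function name and the sample table names are hypothetical, and how the resulting batches map onto tasks remains your decision.

```python
from typing import Iterable, List

# Limit stated above for table-level selection with renamed tables or columns.
MAX_TABLES_PER_TASK = 1000


def split_into_tasks(tables: Iterable[str], limit: int = MAX_TABLES_PER_TASK) -> List[List[str]]:
    """Group table names into batches of at most `limit`, one batch per task."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    ordered = sorted(set(tables))  # de-duplicate and keep the plan deterministic
    return [ordered[i:i + limit] for i in range(0, len(ordered), limit)]


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    names = [f"orders_{n:04d}" for n in range(2500)]
    for index, batch in enumerate(split_into_tasks(names), start=1):
        print(f"task {index}: {len(batch)} tables ({batch[0]} .. {batch[-1]})")
```

Sorting before batching keeps the plan stable between runs, so a table stays in the same batch as long as the overall table list does not change.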

Binary logging requirements:

Source type | Requirements
ApsaraDB RDS for MySQL | Binary logging is enabled by default. Set binlog_row_image to full. See Modify instance parameters. Retain binary logs for at least 3 days (7 days recommended).
Self-managed MySQL | Enable binary logging. Set binlog_format to row and binlog_row_image to full. For dual-primary clusters, also set log_slave_updates to ON. See Create an account for a self-managed MySQL database and configure binary logging. Retain binary logs for at least 7 days.
Important

If DTS cannot read the binary logs, the task fails and data inconsistency may occur. To set the retention period for RDS MySQL, see the Delete binary log files section.
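
The binary logging requirements in the table above can be checked before the task is created. The sketch below assumes you have already collected the relevant server variables into a dictionary, for example from the output of SHOW VARIABLES. The function name, the dictionary shape, and the use of the log_bin variable to detect whether binary logging is enabled are assumptions of this example. Retention periods are not checked here.

```python
from typing import Dict, List


def check_binlog_settings(variables: Dict[str, str], self_managed: bool = False,
                          dual_primary: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the checked settings look right."""
    # Normalize names and values so "full", "FULL", and "Full" compare equal.
    v = {name.lower(): str(value).upper() for name, value in variables.items()}
    problems = []
    if v.get("binlog_row_image") != "FULL":
        problems.append("binlog_row_image must be full")
    if self_managed:
        # ApsaraDB RDS for MySQL enables binary logging by default, so these
        # checks apply to self-managed sources only.
        if v.get("log_bin") not in ("ON", "1"):
            problems.append("binary logging must be enabled")
        if v.get("binlog_format") != "ROW":
            problems.append("binlog_format must be row")
        if dual_primary and v.get("log_slave_updates") not in ("ON", "1"):
            problems.append("log_slave_updates must be ON for dual-primary clusters")
    return problems


if __name__ == "__main__":
    # Hypothetical values for a self-managed source.
    sample = {"log_bin": "ON", "binlog_format": "MIXED", "binlog_row_image": "FULL"}
    for problem in check_binlog_settings(sample, self_managed=True):
        print(problem)
```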

MySQL 8.0.23 and later — invisible columns:

Invisible columns cannot be synchronized and their data is lost. To make a column visible, run:

ALTER TABLE <table_name> ALTER COLUMN <column_name> SET VISIBLE;

Tables without explicit primary keys automatically get invisible primary keys. Make these visible before synchronizing. See Invisible Columns and Generated Invisible Primary Keys.
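
When many tables are involved, the ALTER TABLE statements can be generated rather than typed by hand. The sketch below is one possible approach, not a DTS feature. It assumes that invisible columns can be listed from INFORMATION_SCHEMA.COLUMNS, where MySQL 8.0.23 and later report INVISIBLE in the EXTRA column. The helper names and the sample row are hypothetical, and the generated statements should be reviewed before they are run.

```python
from typing import Iterable, List, Tuple

# Query to run on the source database to list invisible columns.
FIND_INVISIBLE_COLUMNS = """
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE EXTRA LIKE '%INVISIBLE%'
"""


def quote(identifier: str) -> str:
    """Quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def make_visible_statements(rows: Iterable[Tuple[str, str, str]]) -> List[str]:
    """Build one ALTER TABLE statement per (schema, table, column) row."""
    return [
        f"ALTER TABLE {quote(schema)}.{quote(table)} "
        f"ALTER COLUMN {quote(column)} SET VISIBLE;"
        for schema, table, column in rows
    ]


if __name__ == "__main__":
    # Hypothetical query result, for illustration only.
    for statement in make_visible_statements([("shop", "orders", "my_row_id")]):
        print(statement)
```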

Other limitations

  • Evaluate the performance impact before starting. Full data synchronization reads and writes both databases heavily. Run synchronization during off-peak hours to reduce load.

  • Full data synchronization with concurrent INSERT operations causes table fragmentation in the destination. After full synchronization completes, the destination tablespace is larger than the source.

  • If you synchronize individual tables (not the entire database), do not use pt-online-schema-change for online DDL operations. Use Data Management (DMS) instead.

  • Do not write data from other sources to the destination Kafka instance during synchronization. Doing so causes data inconsistency.

  • If you scale the destination Kafka instance or cluster during synchronization, restart it afterward.

  • If a DTS task fails, DTS technical support attempts to restore it within 8 hours. The task may be restarted and task parameters may be modified during restoration.

ApsaraDB RDS for MySQL — instance-specific limitations:

Instance type | Limitation
EncDB enabled | Full data synchronization is not supported.
Transparent Data Encryption (TDE) enabled | Schema synchronization, full data synchronization, and incremental data synchronization are all supported.
Read-only RDS MySQL 5.6 (no transaction logs) | Cannot be used as the source database.

Special cases for self-managed MySQL

  • Performing a primary/secondary switchover while the task is running causes the task to fail.

  • If no DML operations are performed on the source database for a long time, synchronization latency reporting may be inaccurate. Perform a DML operation on the source database to reset the latency value. If you synchronize an entire database, create a heartbeat table that updates every second.

  • DTS executes CREATE DATABASE IF NOT EXISTS `test` in the source database on a schedule to advance the binary log file position.
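
The heartbeat table mentioned above can be kept fresh by a small writer process. The following is a minimal sketch under stated assumptions: the table name dts_heartbeat, its two columns, and the execute callable (any function that runs one SQL statement on the source database) are all hypothetical and not prescribed by DTS.

```python
import time
from typing import Callable, Tuple

HEARTBEAT_TABLE = "dts_heartbeat"  # hypothetical table name


def heartbeat_statements(database: str, table: str = HEARTBEAT_TABLE) -> Tuple[str, str]:
    """Return the one-time CREATE statement and the per-beat write statement."""
    target = f"`{database}`.`{table}`"
    create = (f"CREATE TABLE IF NOT EXISTS {target} "
              "(id INT PRIMARY KEY, beat_at TIMESTAMP(3) NOT NULL);")
    # REPLACE rewrites the single row, so every beat produces a binary log event.
    beat = f"REPLACE INTO {target} (id, beat_at) VALUES (1, NOW(3));"
    return create, beat


def run_heartbeat(execute: Callable[[str], None], database: str, beats: int,
                  interval_seconds: float = 1.0) -> None:
    """Create the table, then write one heartbeat per interval."""
    create, beat = heartbeat_statements(database)
    execute(create)
    for _ in range(beats):
        execute(beat)
        time.sleep(interval_seconds)


if __name__ == "__main__":
    # Print the statements instead of sending them to a database.
    run_heartbeat(print, "shop", beats=3, interval_seconds=0)
```

Passing the executor as a callable keeps the sketch independent of any particular MySQL client library. In practice it would wrap a cursor from the driver you already use.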

Special cases for ApsaraDB RDS for MySQL

  • DTS executes CREATE DATABASE IF NOT EXISTS `test` in the source database on a schedule to advance the binary log file position.

Single-record size limit

The maximum size of a single record written to Kafka is 10 MB. If a source row exceeds this limit, the DTS task stops.

To work around this, exclude large-field tables from the synchronization objects, or use filter conditions to exclude the oversized fields. If the tables are already included, remove them, re-add them, and specify filter conditions that exclude the large fields.
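
To find candidate tables and fields before the task stops on them, row sizes can be estimated on the source side. The sketch below is a rough estimate only: it assumes that 10 MB means 10 × 1024 × 1024 bytes and it ignores the overhead that the chosen Kafka data format adds, so treat the result as a lower bound. All function names and the sample row are hypothetical.

```python
from typing import Any, Dict, List

# Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes.
MAX_RECORD_BYTES = 10 * 1024 * 1024


def value_bytes(value: Any) -> int:
    """Approximate the encoded size of one column value."""
    if value is None:
        return 0
    if isinstance(value, (bytes, bytearray)):
        return len(value)
    return len(str(value).encode("utf-8"))


def estimated_row_bytes(row: Dict[str, Any]) -> int:
    """Sum column-name and value sizes; the real record is larger than this."""
    return sum(len(name.encode("utf-8")) + value_bytes(value)
               for name, value in row.items())


def exceeds_limit(row: Dict[str, Any], limit: int = MAX_RECORD_BYTES) -> bool:
    """True if the estimated row size is already over the limit."""
    return estimated_row_bytes(row) > limit


def large_fields(row: Dict[str, Any], threshold: int) -> List[str]:
    """Column names whose value alone exceeds `threshold` bytes, largest first."""
    sizes = {name: value_bytes(value) for name, value in row.items()}
    return sorted((n for n, s in sizes.items() if s > threshold),
                  key=lambda n: -sizes[n])


if __name__ == "__main__":
    # Hypothetical row with an 11 MiB binary column.
    row = {"id": 1, "payload": b"x" * (11 * 1024 * 1024)}
    print(exceeds_limit(row), large_fields(row, threshold=1024 * 1024))
```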

Supported synchronization topologies

  • One-way one-to-one synchronization

  • One-way one-to-many synchronization

  • One-way many-to-one synchronization

For details, see Synchronization topologies.

SQL operations that can be synchronized

Type | Operations
DML | INSERT, UPDATE, DELETE
DDL | CREATE TABLE, ALTER TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE; CREATE VIEW, ALTER VIEW, DROP VIEW; CREATE PROCEDURE, ALTER PROCEDURE, DROP PROCEDURE; CREATE FUNCTION, DROP FUNCTION, CREATE TRIGGER, DROP TRIGGER; CREATE INDEX, DROP INDEX

Create a data synchronization task

Step 1: Go to the data synchronization page

Use one of the following methods:

DTS console

  1. Log on to the DTS console.

  2. In the left-side navigation pane, click Data Synchronization.

  3. In the upper-left corner, select the region where the synchronization instance resides.

DMS console

Note

The steps below may vary based on your DMS console mode and layout. See Simple mode and Customize the layout and style of the DMS console.

  1. Log on to the DMS console.

  2. In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.

  3. From the drop-down list to the right of Data Synchronization Tasks, select the region where the synchronization instance resides.

Step 2: Configure source and destination databases

  1. Click Create Task to go to the task configuration page.

  2. Configure the source and destination database parameters.

    Warning

    After you configure the source and destination databases, read the Limits displayed on the page. Skipping this step may cause the task to fail or lead to data inconsistency.

    Source database parameters

    Parameter | Description
    Task Name | Enter a descriptive name. DTS generates a name automatically, but a meaningful name helps identify the task. Task names do not need to be unique.
    Select Existing Connection | Select a registered database instance to auto-populate the connection fields. If the instance is not registered, configure the fields manually. For registration instructions, see Manage database connections.
    Database Type | Select MySQL.
    Access Method | Select Alibaba Cloud Instance.
    Instance Region | Select the region where the source RDS MySQL instance resides.
    Replicate Data Across Alibaba Cloud Accounts | Select No for same-account synchronization.
    RDS Instance ID | Select the source RDS MySQL instance.
    Database Account | Enter the account with read permissions on the objects to be synchronized.
    Database Password | Enter the password for the database account.
    Encryption | Select Non-encrypted or SSL-encrypted. To use SSL encryption, enable it on the RDS instance first. See Use a cloud certificate to enable SSL encryption.

    Destination database parameters

    Parameter | Description
    Select Existing Connection | Select a registered database instance to auto-populate the connection fields. If the instance is not registered, configure the fields manually.
    Database Type | Select Kafka.
    Access Method | Select Alibaba Cloud Instance.
    Instance Region | Select the region where the destination Kafka instance resides.
    Kafka Instance ID | Select the destination Kafka instance.
    Encryption | Select Non-encrypted or SCRAM-SHA-256 based on your security requirements.
    Topic | Select the topic to receive synchronized data.
    Topic That Stores DDL Information | (Optional) Select a topic to store DDL information separately. If left blank, DDL information is stored in the topic set by Topic.
    Use Kafka Schema Registry | Select No or Yes. If you select Yes, enter the URL or IP address registered in Kafka Schema Registry for your Avro schemas. Kafka Schema Registry provides a RESTful API to store and retrieve Avro schemas.
  3. Click Test Connectivity and Proceed.

    DTS server CIDR blocks must be added to the security settings of the source and destination databases. DTS adds them automatically for Alibaba Cloud instances. For self-managed databases, see Add the CIDR blocks of DTS servers. If the access method is not Alibaba Cloud Instance, click Test Connectivity in the CIDR Blocks of DTS Servers dialog box first.

Step 3: Configure synchronization objects and options

  1. In the Configure Objects step, set the following parameters.

    Parameter | Description
    Synchronization Types | Incremental Data Synchronization is selected by default. Also select Schema Synchronization and Full Data Synchronization so that DTS synchronizes historical data first, which serves as the baseline for incremental synchronization. Note: If the destination is an ApsaraMQ for Kafka instance, Schema Synchronization is unavailable.
    Processing Mode of Conflicting Tables | Precheck and Report Errors: the precheck fails if tables with identical names exist in both databases. Use object name mapping to rename the conflicting tables. Ignore Errors and Proceed: skips the check. If the source and destination databases have the same schema and a record in the destination has the same primary key value or unique key value as a record in the source, the existing destination record is kept during full synchronization and overwritten during incremental synchronization. Schema mismatches may cause initialization failures.
    Data Format in Kafka | DTS Avro: data is parsed using the DTS Avro schema. See the schema definition on GitHub. Canal JSON: data is stored in Canal JSON format. See the Canal JSON section.
    Kafka Data Compression Format | Choose based on your workload. LZ4 (default): low compression ratio, fast speed. GZIP: high compression ratio, slow speed, high CPU usage. Snappy: balanced ratio and speed.
    Policy for Shipping Data to Kafka Partitions | Select a partition routing policy. See Specify the policy for migrating data to Kafka partitions.
    Message acknowledgement mechanism | Configure based on your reliability requirements. See Message acknowledgement mechanism.
    Capitalization of Object Names in Destination Instance | Select DTS default policy or choose another option to match the capitalization of the source or destination database. See Specify the capitalization of object names in the destination instance.
    Source Objects | Select one or more objects and click the right-arrow icon to add them to Selected Objects. Only tables can be selected as synchronization objects.
    Selected Objects | Use the object name mapping feature to set the destination topic, number of partitions, and partition keys per table. See Use the object name mapping feature. To filter specific SQL operations for a table, right-click the object in Selected Objects and select the operations. Note: Renaming an object may break dependent objects.
  2. Click Next: Advanced Settings and configure the following parameters.

    Parameter | Description
    Dedicated Cluster for Task Scheduling | By default, DTS schedules the task to a shared cluster. Purchase a dedicated cluster to improve stability. See What is a DTS dedicated cluster.
    Retry Time for Failed Connections | The time range during which DTS retries failed connections. Valid values: 10–1440 minutes. Default: 720 minutes. Set this to more than 30 minutes. If multiple tasks share a source or destination database, the shortest retry time applies. DTS charges for the instance during retries.
    Retry Time for Other Issues | The time range during which DTS retries failed DDL or DML operations. Valid values: 1–1440 minutes. Default: 10 minutes. Set this to more than 10 minutes. This value must be less than Retry Time for Failed Connections.
    Enable Throttling for Full Data Synchronization | Limit read QPS (queries per second) and write throughput during full synchronization to reduce load on the destination. Configure the Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s) parameters. Available only when Full Data Synchronization is selected.
    Enable Throttling for Incremental Data Synchronization | Limit write throughput for incremental synchronization by configuring the RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s) parameters.
    Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Yes: DTS does not write heartbeat SQL to the source database. Synchronization latency may appear in the task. No: DTS writes heartbeat SQL to the source database. Physical backup and cloning operations on the source database may be affected.
    Environment Tag | (Optional) Assign an environment tag to identify this DTS instance.
    Configure ETL | Yes: configure extract, transform, and load (ETL) processing by entering data processing statements. See Configure ETL in a data migration or data synchronization task. No: skip ETL.
    Monitoring and Alerting | Yes: configure alert thresholds and notification contacts. DTS sends alerts when the task fails or synchronization latency exceeds the threshold. See Configure monitoring and alerting when you create a DTS task. No: no alerting.
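
The two retry parameters above have interlocking constraints that are easy to get wrong when tasks are configured in bulk. The following sketch restates those rules as a check. The function name and return shape are hypothetical, and the check only mirrors the ranges and recommendations listed in the table.

```python
from typing import List, Tuple


def validate_retry_settings(connection_minutes: int = 720,
                            other_minutes: int = 10) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for the two retry-time parameters."""
    errors, warnings = [], []
    if not 10 <= connection_minutes <= 1440:
        errors.append("Retry Time for Failed Connections must be 10-1440 minutes")
    elif connection_minutes <= 30:
        warnings.append("Retry Time for Failed Connections: more than 30 minutes is recommended")
    if not 1 <= other_minutes <= 1440:
        errors.append("Retry Time for Other Issues must be 1-1440 minutes")
    elif other_minutes <= 10:
        warnings.append("Retry Time for Other Issues: more than 10 minutes is recommended")
    if other_minutes >= connection_minutes:
        errors.append("Retry Time for Other Issues must be less than "
                      "Retry Time for Failed Connections")
    return errors, warnings


if __name__ == "__main__":
    errors, warnings = validate_retry_settings(connection_minutes=720, other_minutes=10)
    print("errors:", errors)
    print("warnings:", warnings)
```

With the defaults, the check reports no errors and one warning, because the default of 10 minutes for other issues sits at the edge of the recommendation.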

Step 4: Run a precheck

  1. Click Next: Save Task Settings and Precheck. To preview the API parameters for this configuration, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters before proceeding.

    DTS runs a precheck before the synchronization task starts. The task only starts after all precheck items pass.
  2. If any precheck item fails, click View Details to see the cause, fix the issue, and click Precheck Again. If a precheck item generates an alert:

    • If the alert cannot be ignored, fix the issue and rerun the precheck.

    • If the alert can be ignored, click Confirm Alert Details, then click Ignore in the dialog box, click OK, and then click Precheck Again. Ignoring an alert may cause data inconsistency.

Step 5: Purchase and start the instance

  1. Wait until Success Rate reaches 100%, then click Next: Purchase Instance.

  2. On the buy page, configure the following parameters.

    Parameter | Description
    Billing Method | Subscription: pay upfront. More cost-effective for long-term use. Pay-as-you-go: billed hourly. Suitable for short-term use. Release the instance when no longer needed to avoid ongoing charges.
    Resource Group Settings | Select the resource group for this instance. Default: default resource group. See What is Resource Management?
    Instance Class | Select an instance class based on the required synchronization throughput. See Instance classes of data synchronization instances.
    Subscription Duration | (Subscription only) Set the duration: 1–9 months, 1 year, 2 years, 3 years, or 5 years.
  3. Read and select Data Transmission Service (Pay-as-you-go) Service Terms.

  4. Click Buy and Start, then click OK in the dialog box.

The task appears in the task list. Monitor its progress from there.
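
Once the task is running, consumers of the destination topic receive records in the format chosen under Data Format in Kafka. The sketch below shows how a consumer might interpret one Canal JSON record after it has been read from the topic. The field names (data, old, pkNames, isDdl, type, and so on) follow the message layout of the open-source Canal project and are an assumption here, as is the sample record. Check them against the Canal JSON section before relying on them.

```python
import json
from typing import Any, Dict, List

# Hypothetical record, modeled on the open-source Canal JSON layout.
SAMPLE = (
    '{"data":[{"id":"1","name":"alice"}],"database":"shop","es":1700000000000,'
    '"id":1,"isDdl":false,"mysqlType":{"id":"int","name":"varchar(32)"},'
    '"old":[{"name":"al"}],"pkNames":["id"],"sql":"","sqlType":{"id":4,"name":12},'
    '"table":"customers","ts":1700000000500,"type":"UPDATE"}'
)


def row_changes(raw: str) -> List[Dict[str, Any]]:
    """Turn one Canal JSON record into a list of per-row change summaries."""
    msg = json.loads(raw)
    if msg.get("isDdl"):
        # DDL records carry the statement text instead of row images.
        return [{"kind": "DDL", "table": msg.get("table"), "sql": msg.get("sql")}]
    rows = msg.get("data") or []
    # "old" holds the previous values of changed columns (assumed UPDATE only).
    olds = msg.get("old") or [None] * len(rows)
    pk_names = msg.get("pkNames") or []
    changes = []
    for row, old in zip(rows, olds):
        changes.append({
            "kind": msg.get("type"),
            "table": f'{msg.get("database")}.{msg.get("table")}',
            "key": {name: row.get(name) for name in pk_names},
            "changed": sorted(old) if old else [],
        })
    return changes


if __name__ == "__main__":
    for change in row_changes(SAMPLE):
        print(change)
```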

Use the object name mapping feature

The object name mapping feature lets you route data from each source table to a specific Kafka topic, set the number of partitions, and define partition keys.

  1. In the Selected Objects section, hover over a table name.

  2. Right-click and select Edit.

  3. In the Edit Table dialog box, configure the following parameters.

    Parameter | Description
    Table Name | Enter the name of the destination topic. By default, this is the topic set in the Destination Database section. If the destination is an ApsaraMQ for Kafka instance, the topic must already exist because DTS does not create it. If the destination is a self-managed Kafka cluster and schema synchronization is included, DTS attempts to create the topic.
    Filter Conditions | Specify SQL conditions to filter which rows are synchronized. See Specify filter conditions.
    Number of Partitions | Set the number of partitions in the destination topic.
    Partition Key | Available when Policy for Shipping Data to Kafka Partitions is set to Ship Data to Separate Partitions Based on Hash Values of Primary Keys. Specify one or more columns as partition keys. DTS routes rows to partitions based on the hash values of these columns. To select columns as partition keys, first clear Synchronize All Tables.
  4. Click OK.
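
To illustrate what routing by the hash values of partition keys means for consumers, the sketch below maps a row to a partition from its key columns. It is a conceptual illustration only: the hash function that DTS actually uses is not described here, and CRC32 is a stand-in chosen because it is stable across runs. The function name and sample rows are hypothetical.

```python
import zlib
from typing import Any, Dict, Sequence


def partition_for(row: Dict[str, Any], key_columns: Sequence[str], partitions: int) -> int:
    """Pick a partition from the hash of the key column values."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    # Join the key values with a separator that is unlikely to appear in data.
    key = "\x1f".join(str(row[column]) for column in key_columns)
    # CRC32 is a stand-in hash, not the one DTS uses.
    return zlib.crc32(key.encode("utf-8")) % partitions


if __name__ == "__main__":
    # Two versions of the same row share a key, so they land in the same partition.
    before = {"id": 42, "region": "eu"}
    after = {"id": 42, "region": "us"}
    print(partition_for(before, ["id"], 6), partition_for(after, ["id"], 6))
```

The property that matters is that all changes to rows with the same key values reach the same partition, which preserves their relative order for a consumer of that partition.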

FAQ

Can I change the Kafka Data Compression Format or Message acknowledgement mechanism after the task is created?

Yes. Modify these settings through the object modification feature. See Modify the objects to be synchronized.