ApsaraDB RDS: Synchronize data from an ApsaraDB RDS for MySQL instance to an ApsaraMQ for Kafka instance

Last Updated: Mar 28, 2026

Use Data Transmission Service (DTS) to stream change data from an ApsaraDB RDS for MySQL instance into an ApsaraMQ for Kafka instance. DTS captures row-level changes via binary logs and delivers them to Kafka topics in real time, enabling downstream analytics, event-driven architectures, and data pipeline integrations.

Prerequisites

Before you begin, make sure you have:

Limitations

Source database

  • Tables must have a PRIMARY KEY or UNIQUE constraint, and all fields must be unique. Otherwise, the destination may contain duplicate records.

  • If you select individual tables (not the entire database) and want to rename tables or columns during synchronization, a single task supports up to 1,000 tables. For more than 1,000 tables, configure multiple tasks or synchronize the entire database instead.

  • Binary log requirements:

    • Set binlog_row_image to full. If this parameter is not set correctly, the precheck fails and the task cannot start.

    • Retain binary logs for at least 3 days on ApsaraDB RDS for MySQL (7 days recommended). For self-managed MySQL, retain logs for at least 7 days. Shorter retention periods may cause task failures or data loss, and may affect DTS service reliability under its Service Level Agreement (SLA). For details, see the Delete binary log files section.

    • For self-managed MySQL, also set binlog_format to row. In a dual-primary cluster, set log_slave_updates to ON so DTS can obtain all binary logs. See Create an account for a self-managed MySQL database and configure binary logging.

  • Do not run DDL statements that change database or table schemas during schema synchronization or full data synchronization — this causes the task to fail.

  • Changes that are not recorded in binary logs, such as data restored from a physical backup or data produced by cascade operations, are not captured or synchronized. If needed, remove the affected databases and tables from the synchronization objects and re-add them. See Modify the objects to be synchronized.

  • For MySQL 8.0.23 and later, invisible columns cannot be synchronized and their data is lost. To make a column visible, run the following statement: ALTER TABLE <table_name> ALTER COLUMN <column_name> SET VISIBLE; Tables without explicit primary keys may auto-generate invisible primary keys, which must also be made visible. See Invisible Columns and Generated Invisible Primary Keys.

  • A read-only ApsaraDB RDS for MySQL 5.6 instance cannot be used as the source because it does not record transaction logs.
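
The binary log requirements in the list above can be checked before the DTS precheck runs. The following is a minimal sketch, not part of DTS: it assumes you have already read the relevant MySQL system variables (for example with SHOW VARIABLES) into a dictionary, and the function name check_binlog_settings is hypothetical. Binary log retention is not checked here because the way it is configured differs between ApsaraDB RDS and self-managed MySQL.

```python
from typing import Dict, List


def check_binlog_settings(variables: Dict[str, str],
                          self_managed: bool = False,
                          dual_primary: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the settings look fine.

    `variables` is assumed to hold MySQL system variables, for example the
    rows returned by SHOW VARIABLES, keyed by variable name.
    """
    problems: List[str] = []
    # Normalise names and values so that 'FULL' and 'full' compare equal.
    norm = {k.lower(): str(v).strip().lower() for k, v in variables.items()}

    # Required for every source instance.
    if norm.get("binlog_row_image") != "full":
        problems.append("binlog_row_image must be full")

    # Additional requirements that apply to self-managed MySQL only.
    if self_managed:
        if norm.get("binlog_format") != "row":
            problems.append("binlog_format must be row")
        if dual_primary and norm.get("log_slave_updates") not in ("on", "1"):
            problems.append("log_slave_updates must be ON in a dual-primary cluster")
    return problems


print(check_binlog_settings({"binlog_row_image": "FULL"}))  # prints []
```

An empty result only means these specific parameters match the requirements listed above; the DTS precheck remains the authoritative check.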

Other limits

  • DTS does not synchronize foreign keys. Cascade and delete operations triggered in the source are not propagated to the destination.

  • Full data synchronization uses read and write resources of both source and destination instances, increasing database load. Run synchronization during off-peak hours when possible.

  • During full data synchronization, concurrent INSERT operations cause table fragmentation in the destination. After full synchronization, the destination tablespace is typically larger than the source.

  • If you select one or more tables instead of an entire database as the objects to be synchronized, do not use tools such as pt-online-schema-change for online DDL operations on those tables during synchronization — this may cause synchronization to fail. Use Data Management (DMS) for online DDL instead. See Perform lock-free DDL operations.

  • Do not write data from other sources to the destination during synchronization. External writes cause data inconsistency and may result in data loss.

  • If the destination Kafka instance is scaled during synchronization, restart the instance to resume the task.

  • ApsaraDB RDS for MySQL instances with the EncDB feature enabled do not support full data synchronization. Instances with Transparent Data Encryption (TDE) enabled support schema synchronization, full data synchronization, and incremental data synchronization.

  • If a DTS task fails, DTS support will attempt to restore it within 8 hours. During restoration, the task may be restarted and task parameters may be modified. Database parameters are not changed.

  • If you perform a primary/secondary switchover on a self-managed MySQL source while the task is running, the task fails.

  • DTS calculates synchronization latency based on the timestamp of the latest synchronized data in the destination database and the current timestamp in the source database. If no DML operation is performed on the source database for a long time, the synchronization latency may be inaccurate. If the latency appears too high, you can perform a DML operation on the source database to update the latency. If you select an entire database as the synchronization object, you can also create a heartbeat table — the heartbeat table is updated or receives data every second.

  • DTS periodically executes CREATE DATABASE IF NOT EXISTS `test` in the source database to advance the binary log file position. This is expected behavior.
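
The latency calculation described in the list above can be illustrated with a small sketch. It only mirrors the stated definition (current source timestamp minus the timestamp of the latest synchronized data) and is not DTS code; the function name sync_latency and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def sync_latency(latest_synced_at: datetime, source_now: datetime) -> timedelta:
    """Latency as described above: the source clock minus the timestamp of
    the newest record that reached the destination (never negative)."""
    return max(source_now - latest_synced_at, timedelta(0))


source_now = datetime(2026, 3, 28, 12, 0, 0)

# Busy source: the last change was written one second ago.
busy = sync_latency(datetime(2026, 3, 28, 11, 59, 59), source_now)

# Idle source: nothing is pending, but the last change is two hours old,
# so the reported latency grows even though the destination is up to date.
idle = sync_latency(datetime(2026, 3, 28, 10, 0, 0), source_now)

print(busy.total_seconds(), idle.total_seconds())  # prints 1.0 7200.0
```

This is why a DML operation on the source, or a heartbeat table that is updated every second, brings the reported latency back down: either one refreshes the timestamp of the latest synchronized data.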

Record size limit

The maximum size of a single record written to Kafka is 10 MB. If a source row exceeds 10 MB, the task is interrupted. To avoid this, exclude large-field columns using filter conditions when configuring the task. If a table with large fields is already included in the task objects, remove the table, re-add it, and configure filter conditions to exclude the large fields.
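
To find out which columns to exclude, it can help to estimate row sizes ahead of time. The sketch below is a rough approximation under stated assumptions: it treats 10 MB as 10 × 1024 × 1024 bytes, measures only the raw column values, and ignores the overhead that the chosen Kafka data format adds, so the real record is somewhat larger. The helper names are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# 10 MB limit stated above; binary megabytes are an assumption.
MAX_RECORD_BYTES = 10 * 1024 * 1024


def value_size(value: Any) -> int:
    """Approximate size in bytes of one column value."""
    if value is None:
        return 0
    if isinstance(value, (bytes, bytearray)):
        return len(value)
    return len(str(value).encode("utf-8"))


def row_size(row: Dict[str, Any]) -> int:
    """Lower-bound estimate of the row size: the sum of its column values."""
    return sum(value_size(v) for v in row.values())


def largest_columns(row: Dict[str, Any], top: int = 3) -> List[Tuple[str, int]]:
    """Columns sorted by size, largest first; candidates for exclusion."""
    sizes = [(name, value_size(v)) for name, v in row.items()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]
```

A row whose row_size already exceeds MAX_RECORD_BYTES will not fit in a Kafka record; largest_columns then points at the columns to exclude from the task.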

Billing

Synchronization type | Fee
Schema synchronization and full data synchronization | Free of charge
Incremental data synchronization | Charged. See Billing overview.

Supported synchronization topologies

  • One-way one-to-one synchronization

  • One-way one-to-many synchronization

  • One-way many-to-one synchronization

For all supported topologies, see Synchronization topologies.

SQL operations that can be synchronized

Operation type | SQL statements
DML | INSERT, UPDATE, DELETE
DDL | CREATE TABLE, ALTER TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE; CREATE VIEW, ALTER VIEW, DROP VIEW; CREATE PROCEDURE, ALTER PROCEDURE, DROP PROCEDURE; CREATE FUNCTION, DROP FUNCTION, CREATE TRIGGER, DROP TRIGGER; CREATE INDEX, DROP INDEX

DTS does not synchronize foreign keys from the source to the destination. Cascade and delete operations on the source are not replicated.

Create a synchronization task

Step 1: Go to the Data Synchronization page

Use either the DTS console or the DMS console.

DTS console

  1. Log on to the DTS console.

  2. In the left-side navigation pane, click Data Synchronization.

  3. In the upper-left corner, select the region where the synchronization instance will reside.

DMS console

Exact steps may vary depending on the DMS console mode and layout. See Simple mode and Customize the layout and style of the DMS console.
  1. Log on to the DMS console.

  2. In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.

  3. From the drop-down list to the right of Data Synchronization Tasks, select the region where the synchronization instance will reside.

Step 2: Configure source and destination databases

  1. Click Create Task.

  2. Configure the source and destination databases using the parameters in the following table.

    Warning

    After configuring the source and destination databases, review the Limits shown on the page to avoid task failures or data inconsistency.

    Section | Parameter | Description
    N/A | Task Name | A name for the DTS task. DTS generates a name automatically. Specify a descriptive name to make the task easy to identify. The name does not need to be unique.
    Source Database | Select Existing Connection | If the instance is registered with DTS, select it from the drop-down list and DTS fills in the remaining parameters automatically. Otherwise, configure the database parameters manually. In the DMS console, select from the Select a DMS database instance drop-down list.
    Source Database | Database Type | Select MySQL.
    Source Database | Access Method | Select Alibaba Cloud Instance.
    Source Database | Instance Region | The region where the source ApsaraDB RDS for MySQL instance resides.
    Source Database | Replicate Data Across Alibaba Cloud Accounts | Select No for same-account synchronization.
    Source Database | RDS Instance ID | The ID of the source ApsaraDB RDS for MySQL instance.
    Source Database | Database Account | A database account with read permissions on the objects to be synchronized.
    Source Database | Database Password | The password for the database account.
    Source Database | Encryption | Select Non-encrypted or SSL-encrypted. To use SSL encryption, enable SSL on the RDS instance before configuring the DTS task. See Use a cloud certificate to enable SSL encryption.
    Destination Database | Select Existing Connection | If the instance is registered with DTS, select it from the drop-down list. Otherwise, configure the database parameters manually.
    Destination Database | Database Type | Select Kafka.
    Destination Database | Access Method | Select Alibaba Cloud Instance.
    Destination Database | Instance Region | The region where the destination ApsaraMQ for Kafka instance resides.
    Destination Database | Kafka Instance ID | The ID of the destination ApsaraMQ for Kafka instance.
    Destination Database | Encryption | Select Non-encrypted or SCRAM-SHA-256 based on your security requirements.
    Destination Database | Topic | The topic that receives the synchronized data. Select from the drop-down list.
    Destination Database | Topic That Stores DDL Information | The topic that stores DDL information. If left blank, DDL information is stored in the topic specified by Topic.
    Destination Database | Use Kafka Schema Registry | Whether to use Kafka Schema Registry for Avro schema storage and retrieval. Select No or Yes. If Yes, enter the URL or IP address registered in Kafka Schema Registry for your Avro schemas.

  3. Click Test Connectivity and Proceed.

    Make sure DTS server CIDR blocks are added to the security settings of both source and destination databases. See Add the CIDR blocks of DTS servers. For self-managed databases not using Alibaba Cloud Instance as the access method, click Test Connectivity in the CIDR Blocks of DTS Servers dialog box.

Step 3: Configure synchronization objects

  1. In the Configure Objects step, set the synchronization parameters.

    Parameter | Description
    Synchronization Types | Incremental Data Synchronization is selected by default. Also select Full Data Synchronization to synchronize historical data as the baseline for incremental synchronization. Note: Schema synchronization is not available when the destination is an ApsaraMQ for Kafka instance.
    Processing Mode of Conflicting Tables | Precheck and Report Errors: Fails the precheck if the destination has tables with the same names as source tables. Use object name mapping to rename conflicting tables. See Database, table, and column name mapping. Ignore Errors and Proceed: Skips the precheck for duplicate table names. During full synchronization, conflicting records are not overwritten and existing destination records are retained. During incremental synchronization, conflicting records overwrite existing destination records. If schemas differ, synchronization may fail or only some columns are synchronized. Proceed with caution.
    Data Format in Kafka | The message format written to Kafka. DTS Avro: Data is structured per the DTS Avro schema definition. See the schema on GitHub. Canal Json: Data is stored in Canal JSON format. See Canal Json.
    Kafka Data Compression Format | Compression algorithm for Kafka messages. LZ4 (default): low compression ratio, high speed. GZIP: high compression ratio, low speed, and significant CPU consumption. Snappy: medium compression ratio and speed.
    Policy for Shipping Data to Kafka Partitions | How records are distributed across Kafka partitions. See Specify the policy for migrating data to Kafka partitions.
    Message acknowledgement mechanism | Kafka producer acknowledgement settings. See Message acknowledgement mechanism.
    Capitalization of object names in destination instance | Controls the case of database, table, and column names in the destination. Default is DTS default policy. See Specify the capitalization of object names in the destination instance.
    Source Objects | Select one or more objects and click the icon to add them to Selected Objects. You can select tables as the objects to be synchronized.
    Selected Objects | Lists the selected objects. Use the object name mapping feature to specify the destination topic name, number of partitions, and partition keys for each source table. See Use the object name mapping feature. To filter SQL operations per object, right-click an object in the Selected Objects section and select the operations to synchronize.

  2. Click Next: Advanced Settings and configure the advanced parameters.

    Parameter | Description
    Dedicated Cluster for Task Scheduling | By default, DTS schedules the task to the shared cluster. For improved stability, purchase and select a dedicated cluster. See What is a DTS dedicated cluster.
    Retry Time for Failed Connections | How long DTS retries failed connections after the task starts. Valid values: 10–1,440 minutes. Default: 720 minutes. Set to at least 30 minutes. If DTS reconnects within this period, the task resumes; otherwise the task fails. If multiple tasks share the same source or destination, the shortest configured retry time applies. Note: You are charged for the DTS instance during the retry period. We recommend that you specify the retry time based on your business requirements, and release the DTS instance promptly after the source and destination instances are released.
    Retry Time for Other Issues | How long DTS retries failed DDL or DML operations. Valid values: 1–1,440 minutes. Default: 10 minutes. Set to at least 10 minutes. This value must be less than Retry Time for Failed Connections.
    Enable Throttling for Full Data Synchronization | Limits read/write throughput during full synchronization to reduce load on source and destination servers. Configure QPS (queries per second) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Synchronization is selected.
    Enable Throttling for Incremental Data Synchronization | Limits throughput during incremental synchronization. Configure RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s).
    Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Controls whether DTS writes heartbeat SQL operations to the source database. Yes: Does not write heartbeat operations; a latency indicator may appear on the task. No: Writes heartbeat operations; this may affect physical backup and cloning of the source database.
    Environment Tag | An optional tag to identify the DTS instance.
    Configure ETL | Whether to enable extract, transform, and load (ETL). Yes: Opens a code editor to enter data processing statements. See Configure ETL in a data migration or data synchronization task. No: ETL is disabled.
    Monitoring and Alerting | Whether to configure alerts for the task. Yes: Set an alert threshold and notification contacts. See Configure monitoring and alerting when you create a DTS task. No: Alerts are disabled.
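
The constraints on the two retry parameters above interact with each other, so it can be useful to check planned values before entering them in the console. The following sketch only encodes the ranges, defaults, and recommendations stated in the table; the function names are hypothetical and the code is not part of any DTS API.

```python
from typing import List


def validate_retry_settings(connection_retry_min: int = 720,
                            other_retry_min: int = 10) -> List[str]:
    """Check the two retry windows against the ranges given in the table above."""
    messages: List[str] = []
    if not 10 <= connection_retry_min <= 1440:
        messages.append("error: Retry Time for Failed Connections must be 10-1440 minutes")
    elif connection_retry_min < 30:
        messages.append("warning: at least 30 minutes is recommended for failed connections")
    if not 1 <= other_retry_min <= 1440:
        messages.append("error: Retry Time for Other Issues must be 1-1440 minutes")
    elif other_retry_min < 10:
        messages.append("warning: at least 10 minutes is recommended for other issues")
    if other_retry_min >= connection_retry_min:
        messages.append("error: Retry Time for Other Issues must be less than "
                        "Retry Time for Failed Connections")
    return messages


def effective_connection_retry(retry_minutes_per_task: List[int]) -> int:
    """Tasks that share a source or destination use the shortest configured value."""
    return min(retry_minutes_per_task)


print(validate_retry_settings())                   # defaults pass: prints []
print(effective_connection_retry([720, 30, 120]))  # prints 30
```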

Step 4: Run the precheck

  1. Click Next: Save Task Settings and Precheck.

    To preview the API parameters for this task configuration, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
  2. Review the precheck results:

    • If all items pass, proceed to the next step.

    • If an item fails, click View Details next to the failed item, resolve the issue, and click Precheck Again.

    • If an alert is triggered for an item that cannot be ignored, resolve the issue and rerun the precheck. For ignorable alerts, click View Details next to the alert item, then click Ignore > OK, and click Precheck Again.

    Warning

    Ignoring precheck alerts may cause data inconsistency.

Step 5: Purchase an instance and start the task

  1. Wait for Success Rate to reach 100%, then click Next: Purchase Instance.

  2. On the buy page, configure the billing and instance parameters.

    Parameter | Description
    Billing Method | Subscription: Pay upfront for a fixed term. More cost-effective for long-term use. Subscription duration options: 1–9 months, or 1, 2, 3, or 5 years. Pay-as-you-go: Billed hourly. Suitable for short-term use. Release the instance when it is no longer needed to stop billing.
    Resource Group Settings | The resource group for the synchronization instance. Default: default resource group. See What is Resource Management?.
    Instance Class | Instance classes vary in synchronization speed. See Instance classes of data synchronization instances.

  3. Read and select Data Transmission Service (Pay-as-you-go) Service Terms.

  4. Click Buy and Start, then click OK in the dialog box.

The task appears in the task list. Track its progress from there.
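
Once the task is running, a downstream consumer has to decode the messages that arrive in the topic. The sketch below assumes that Canal Json was selected as the Data Format in Kafka and that messages follow the flat-message layout of the open source Canal project (fields such as data, old, type, pkNames, and isDdl). That layout, the sample payload, and the database and table names are assumptions for illustration; check the Canal Json reference for the authoritative field list. Fetching messages from Kafka is omitted and would be done with a Kafka client of your choice.

```python
import json
from typing import Any, Dict, List

# Hypothetical sample message; real payloads may carry additional fields.
SAMPLE = """{
  "data": [{"id": "1", "name": "alice"}],
  "old": [{"name": "al"}],
  "database": "shop",
  "table": "customers",
  "type": "UPDATE",
  "isDdl": false,
  "es": 1774699200000,
  "ts": 1774699200123,
  "pkNames": ["id"]
}"""


def decode_canal_json(raw: str) -> List[Dict[str, Any]]:
    """Turn one Canal-style JSON message into a list of per-row changes.

    DDL messages carry no row data in this layout, so they yield no changes.
    """
    msg = json.loads(raw)
    if msg.get("isDdl"):
        return []
    rows = msg.get("data") or []
    olds = msg.get("old") or []
    pk_names = msg.get("pkNames") or []
    changes = []
    for i, row in enumerate(rows):
        changes.append({
            "database": msg.get("database"),
            "table": msg.get("table"),
            "op": msg.get("type"),  # INSERT, UPDATE, or DELETE
            "key": {k: row.get(k) for k in pk_names},
            "after": row,
            # Previous values of the changed columns (updates only).
            "changed_from": olds[i] if i < len(olds) else None,
        })
    return changes


for change in decode_canal_json(SAMPLE):
    print(change["op"], change["table"], change["key"])
```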

Use the object name mapping feature

Use this feature to route source table data to a specific Kafka topic, control the partition count, and set partition keys.

  1. In the Selected Objects section, hover over the topic name.

  2. Right-click and select Edit.

  3. In the Edit Table dialog box, configure the parameters.

    Parameter | Description
    Table Name | The topic that receives data from this source table. Defaults to the topic set in the Destination Database section. For ApsaraMQ for Kafka destinations, the topic must already exist; DTS does not create it automatically. For self-managed Kafka with schema synchronization, DTS attempts to create the topic. Changing this value routes the source table's data to the specified topic.
    Filter Conditions | SQL-based row filter for this table. See Specify filter conditions.
    Number of Partitions | The number of partitions in the destination topic.
    Partition Key | One or more columns used to compute partition hash values, applicable when Policy for Shipping Data to Kafka Partitions is set to Ship Data to Separate Partitions Based on Hash Values of Primary Keys. To configure partition keys, first clear Synchronize All Tables.

  4. Click OK.
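
The effect of a partition key can be pictured as hashing the key columns and taking the result modulo the number of partitions. The sketch below illustrates that general idea only: this page does not specify the hash function DTS uses, so zlib.crc32 is a stand-in, and the function name and sample rows are hypothetical. Partition numbers produced here will not match those chosen by DTS.

```python
import zlib
from typing import Any, Dict, List


def choose_partition(row: Dict[str, Any],
                     partition_key_columns: List[str],
                     num_partitions: int) -> int:
    """Map a row to a partition from the values of its partition key columns.

    crc32 is an assumed stand-in for whatever hash DTS applies internally.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Join key values with a separator that is unlikely to occur in the data.
    key = "\x1f".join(str(row[c]) for c in partition_key_columns)
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Two changes to the same key land in the same partition, which preserves
# their relative order for consumers of that partition.
first = choose_partition({"id": 42, "status": "new"}, ["id"], 12)
second = choose_partition({"id": 42, "status": "paid"}, ["id"], 12)
print(first == second)  # prints True
```

The practical consequence is that the partition key should be a column whose values identify the entity that needs ordered delivery, and whose values are spread widely enough to avoid concentrating traffic on a few partitions.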

FAQ

Can I modify the Kafka Data Compression Format after the task starts?

Yes. See Modify the objects to be synchronized.

Can I modify the Message acknowledgement mechanism after the task starts?

Yes. See Modify the objects to be synchronized.