Data Transmission Service (DTS) continuously replicates data from an ApsaraDB RDS for MySQL instance to an ApsaraDB for SelectDB instance. The sync pipeline covers schema synchronization, an initial full data load, and ongoing incremental change-data capture (CDC), so your SelectDB analytics workload always queries up-to-date production data.
What this route supports
| Capability | Details |
|---|---|
| Sync types | Schema synchronization, full data synchronization, incremental data synchronization |
| Supported DML | INSERT, UPDATE, DELETE |
| Supported DDL | ADD COLUMN, MODIFY COLUMN, CHANGE COLUMN, DROP COLUMN, DROP TABLE, TRUNCATE TABLE, RENAME TABLE |
| Destination engines | Unique engine, Duplicate engine |
| Max tables (with name mapping) | 1,000 per task |
| Incremental latency (normal) | Less than 10 seconds |
| Billing | Schema sync and full sync: free. Incremental sync: charged |
Prerequisites
Before you begin, make sure you have:
- A destination ApsaraDB for SelectDB instance whose storage space is larger than the storage used by the source RDS for MySQL instance. See Create an instance.
- The required permissions on both databases. See Required permissions.
- Binary logging enabled on the source, with `binlog_row_image` set to `full`. RDS for MySQL enables binary logging by default. Run `SHOW VARIABLES LIKE 'binlog_row_image';` to confirm the parameter value. The result must show `full`. If it does not, update the parameter in the RDS console. See Set instance parameters.
- Binary logs retained for the required minimum period. A shorter retention period can cause the DTS task to fail. In extreme cases, data inconsistency or data loss may occur. Issues caused by insufficient binary log retention are not covered by the DTS Service-Level Agreement (SLA):
  - RDS for MySQL: at least 3 days (7 days recommended). See Automatically delete local logs.
  - Self-managed MySQL: at least 7 days.
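The binary log checks above can be combined into one short SQL session. This is a sketch: `binlog_expire_logs_seconds` exists on MySQL 8.0, while MySQL 5.7 and earlier expose `expire_logs_days` instead.

```sql
-- Row image and format must be FULL and ROW for DTS to capture changes
SHOW VARIABLES LIKE 'binlog_row_image';
SHOW VARIABLES LIKE 'binlog_format';

-- Check the retention window (MySQL 8.0; value is in seconds,
-- so 604800 = 7 days). On 5.7 query expire_logs_days instead.
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
```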
Choose a destination engine
Before you configure the task, decide which SelectDB engine to use for each destination table. The engine determines how DTS handles duplicate data and UPDATE/DELETE operations.
| Dimension | Unique engine | Duplicate engine |
|---|---|---|
| Deduplication | Automatic (based on unique keys) | Manual (use _is_deleted, _version, _record_id) |
| UPDATE/DELETE handling | Native UPDATE/DELETE in destination | Converted to INSERT; use additional columns to filter |
| Duplicate data risk | Low — if source and destination unique keys match | Higher — can occur after task retry, restart, or concurrent DML |
| When to use | Source tables have a primary key or unique key | Source tables have no primary key or unique key |
| Required setup | All unique keys in the destination must also exist in the source and be included in sync objects | Select Schema Synchronization and set Engine to duplicate for the table |
Additional columns for the Duplicate engine
DTS automatically adds the following columns to destination tables that use the Duplicate engine. Use these columns to filter duplicate rows in queries.
| Column | Data type | Default | Value |
|---|---|---|---|
| _is_deleted | INT | 0 | 0 for INSERT and UPDATE; 1 for DELETE |
| _version | BIGINT | 0 | 0 during full sync; timestamp (seconds) from source binary log during incremental sync |
| _record_id | BIGINT | 0 | 0 during full sync; unique, incrementing record ID from the incremental log during incremental sync |
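For example, a query that resolves the latest visible version of each row in a Duplicate-engine table could use the three columns together. The table name `orders` and business key `order_id` are placeholders for illustration; SelectDB supports standard window functions.

```sql
-- Keep only the newest change record per business key, then drop deleted rows.
-- 'orders' and 'order_id' are hypothetical names.
SELECT *
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY order_id
               ORDER BY _version DESC, _record_id DESC
           ) AS rn
    FROM orders
) latest
WHERE rn = 1
  AND _is_deleted = 0;
```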
Required permissions for the database accounts
| Database | Required permissions | How to grant |
|---|---|---|
| Source RDS for MySQL | Read and write permissions on the synchronization objects | Create an account and Modify account permissions |
| Destination ApsaraDB for SelectDB | Usage_priv (cluster access), Select_priv, Load_priv, Alter_priv, Create_priv, Drop_priv | Permission management and Basic permission management |
If the source account was not created in the RDS for MySQL console, the account must also have the REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions.
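If you manage the source account yourself rather than creating it in the RDS console, the grants might look like the following sketch. The account name `dts_sync` is a placeholder, and the REPLICATION privileges can only be granted globally on `*.*`.

```sql
-- Placeholder account for illustration
CREATE USER 'dts_sync'@'%' IDENTIFIED BY 'your_password';

-- Read access to the objects to be synchronized
GRANT SELECT, SHOW VIEW ON *.* TO 'dts_sync'@'%';

-- Binlog access for incremental capture; these privileges are global only
GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'dts_sync'@'%';
```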
Billing
| Synchronization type | Fee |
|---|---|
| Schema synchronization and full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview |
Limitations
Source database
All tables to be synchronized must have a primary key or a UNIQUE constraint. For tables with neither, select Schema Synchronization for Synchronization Types and set Engine to duplicate for those tables during configuration. Otherwise, the task may fail or data may be lost.
DTS adds extra columns to destination tables that use the Duplicate engine. See Additional columns for the Duplicate engine.
During initial schema synchronization and initial full data synchronization, do not run DDL operations that change database or table schemas. DTS queries the source during full sync, which creates metadata locks that may block DDL operations on the source.
Data changes that are not recorded in binary logs — such as data restored from a physical backup or data generated by cascade operations — are not synchronized.
If the source database is MySQL 8.0.23 or later and the data to be synchronized contains invisible columns, those columns cannot be read and data loss may occur. To make invisible columns visible, run:
`ALTER TABLE <table_name> ALTER COLUMN <column_name> SET VISIBLE;` See Invisible Columns for details.
If you need to map table or column names during table-level sync, the task supports a maximum of 1,000 tables. Exceeding this limit causes a post-submission error. Split the tables across multiple tasks, or configure the task to synchronize the entire database instead.
If the always-encrypted (EncDB) feature is enabled for the source RDS for MySQL instance, initial full data synchronization is not supported. For RDS for MySQL instances with Transparent Data Encryption (TDE) enabled, schema sync, full sync, and incremental sync are all supported.
RDS for MySQL instances that do not record transaction logs, such as read-only instances of RDS for MySQL 5.6, are not supported as a source.
DTS periodically runs the ``CREATE DATABASE IF NOT EXISTS `test` `` command on the source RDS for MySQL instance to advance the binary log offset.

For self-managed MySQL sources:
- Enable binary logging, set `binlog_format` to `row`, and set `binlog_row_image` to `full`.
- For a primary/primary cluster (both nodes acting as primary and secondary to each other), enable `log_slave_updates` to make sure DTS can read all binary logs.
- Retain binary logs for at least 7 days.
- If a primary/secondary switchover occurs during synchronization, the task fails.
- If the source is an Amazon Aurora MySQL instance or another cluster-mode MySQL instance, the domain name or IP address configured for the task must always resolve to the read/write (RW) node address.

DTS latency is calculated by comparing the timestamp of the last synchronized data record in the destination with the current timestamp. If no DML operations are performed on the source database for a long time, the displayed latency may be inaccurate. To update the latency reading, perform a DML operation on the source. If you synchronize the entire database, you can also create a heartbeat table; DTS updates the heartbeat table every second.
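For a self-managed source, the binary log requirements above translate into a my.cnf fragment along these lines. Variable names match MySQL 5.7; MySQL 8.0 replaces `expire_logs_days` with `binlog_expire_logs_seconds`.

```ini
[mysqld]
log_bin           = mysql-bin    # enable binary logging
binlog_format     = ROW
binlog_row_image  = FULL
expire_logs_days  = 7            # retain binlogs for at least 7 days
log_slave_updates = ON           # required for primary/primary clusters
```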
Destination
Only Unique or Duplicate engine tables are supported in the SelectDB destination. Synchronization of INDEX, PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, and FK objects is not supported.
Do not create a new cluster in the destination SelectDB instance during synchronization. If the task fails, restart the synchronization instance to resume.
Do not add backend (BE) nodes to the SelectDB instance during synchronization. If the task fails, restart the synchronization instance to resume.
Database and table names must start with a letter. Use the mapping feature to rename objects that do not comply.
Object names (databases, tables, or columns) that contain Chinese characters must be mapped to compliant names. Otherwise, the task may fail.
DDL operations that modify multiple columns at once, and consecutive DDL operations on the same table, are not supported.
In a multi-table merge scenario (multiple source tables synchronizing to a single destination table), all source tables must have the same schema to avoid data inconsistency or task failure.
Online DDL changes made with pt-online-schema-change on the source are not supported. Data loss or task failure may occur.
Online DDL changes made with DMS or the gh-ost tool are synchronized as the original DDL. No temporary table data is replicated, but the destination table may be locked temporarily.
Do not write data to the destination from other sources during synchronization. This causes data inconsistency between source and destination.
Before you synchronize data, evaluate the performance of the source and destination databases. We recommend that you perform data synchronization during off-peak hours to reduce the impact on source and destination database load.
During incremental synchronization, DTS uses a batch synchronization policy. By default, DTS writes to a single sync object at most once every 5 seconds, which may result in a normal synchronization latency of less than 10 seconds. To reduce latency, adjust the `selectdb.reservoir.timeout.milliseconds` parameter in the DTS console. The valid range is 1,000–10,000 milliseconds. A shorter batching interval increases write frequency, which may increase load and write response time on the destination and, in turn, increase DTS synchronization latency. Adjust based on actual destination load.
MySQL `VARCHAR(M)` uses character length. SelectDB `VARCHAR(N)` uses byte length. If you are not using schema synchronization, set the VARCHAR field length in SelectDB to four times the MySQL length. The maximum VARCHAR length in SelectDB is 65,533 bytes; fields that exceed this limit are converted to STRING.
Only the `bucket_count` parameter can be set in the Selected Objects box. The value must be a positive integer. The default is `auto`.
Initial full data synchronization uses concurrent INSERT operations, which causes fragmentation in destination tables. After full sync completes, the tablespace of the destination instance will be larger than that of the source.
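To illustrate the 4x length rule when you create destination tables manually (schema synchronization disabled), a MySQL `VARCHAR(255)` column maps to `VARCHAR(1020)` in SelectDB. Table, column, and bucket values below are hypothetical:

```sql
-- MySQL source table: length is measured in characters
CREATE TABLE customer (
    id   BIGINT PRIMARY KEY,
    name VARCHAR(255)
);

-- SelectDB destination table: length is measured in bytes, 4 x 255 = 1020
CREATE TABLE customer (
    id   BIGINT,
    name VARCHAR(1020)
)
UNIQUE KEY (id)
DISTRIBUTED BY HASH (id) BUCKETS 10;
```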
Create a synchronization task
Step 1: Go to the Data Synchronization page
Use one of the following consoles:
DTS console
Log on to the DTS console.
In the left-side navigation pane, click Data Synchronization.
In the upper-left corner, select the region where the data synchronization instance resides.
DMS console
The actual operations may vary based on the mode and layout of the DMS console. See Simple mode and Customize the layout and style of the DMS console.
Log on to the DMS console.
In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.
From the drop-down list to the right of Data Synchronization Tasks, select the region.
Step 2: Configure source and destination connections
Click Create Task and configure the following parameters.
| Section | Parameter | Value |
|---|---|---|
| — | Task Name | A descriptive name to identify the task. Names do not need to be unique |
| Source Database | Select Existing Connection | Select a registered instance, or configure manually if not registered |
| | Database Type | MySQL |
| | Access Method | Alibaba Cloud Instance |
| | Instance Region | The region of the source RDS for MySQL instance |
| | Replicate Data Across Alibaba Cloud Accounts | No (for same-account sync) |
| | RDS Instance ID | The ID of the source RDS for MySQL instance |
| | Database Account | The account with the required source permissions |
| | Database Password | The account password |
| | Encryption | Non-encrypted or SSL-encrypted. To use SSL encryption, enable SSL on the RDS for MySQL instance first. See Use a cloud certificate to enable SSL encryption |
| Destination Database | Select Existing Connection | Select a registered instance, or configure manually if not registered |
| | Database Type | SelectDB |
| | Access Method | Alibaba Cloud Instance |
| | Instance Region | The region of the destination SelectDB instance |
| | Replicate Data Across Alibaba Cloud Accounts | No (for same-account sync) |
| | Instance ID | The ID of the destination SelectDB instance |
| | Database Account | The account with the required destination permissions |
| | Database Password | The account password |
Click Test Connectivity and Proceed.
Make sure the CIDR blocks of DTS servers are added to the security settings of both the source and destination databases. See Add DTS server IP addresses to a whitelist.
Step 3: Configure sync objects
In the Configure Objects step, set the following parameters.
| Parameter | Description |
|---|---|
| Synchronization Types | Select Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization. Full sync provides the baseline data for incremental sync. If you skip schema synchronization, create destination tables with the Unique or Duplicate model manually beforehand. See Data type mapping and Data model |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (default): fails the precheck if a same-named table exists in the destination. Use this to prevent unintended overwrites. Ignore Errors and Proceed: skips the check. If schemas match, source records overwrite destination records with the same primary or unique key. If schemas differ, initialization may fail or only partial data may be synchronized |
| Capitalization of Object Names in Destination Instance | Controls how DTS capitalizes database, table, and column names in the destination. Default: DTS default policy. See Specify the capitalization of object names in the destination instance |
| Source Objects | Select objects at the database or table level and click the arrow icon to add them to Selected Objects |
| Selected Objects | Right-click an object to: rename it in the destination (see Map schemas, tables, and columns); set the bucket_count parameter if schema synchronization is enabled; select specific DML/DDL operations; or set a WHERE clause filter (see Set filter conditions) |
Using the object name mapping feature may cause other objects that depend on the mapped object to fail to synchronize.
Click Next: Advanced Settings and configure the following parameters.
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | Leave blank to use the shared cluster. Purchase a dedicated cluster for higher stability. See What is a DTS dedicated cluster |
| Retry Time for Failed Connections | Time range (in minutes) DTS retries after a connection failure. Valid range: 10–1,440. Default: 720. Set to more than 30 minutes. If reconnection succeeds within this window, DTS resumes the task |
| Retry Time for Other Issues | Time range (in minutes) DTS retries after DDL or DML failures. Valid range: 1–1,440. Default: 10. Set to more than 10 minutes. Must be less than Retry Time for Failed Connections |
| Enable Throttling for Full Data Synchronization | Limits QPS to source and data transfer rate during full sync to reduce source and destination load. Available only when Full Data Synchronization is selected |
| Enable Throttling for Incremental Data Synchronization | Limits RPS and transfer rate during incremental sync to reduce destination load |
| Environment Tag | Optional. Tags the instance with an environment label |
| Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Yes: DTS does not write heartbeat table operations to the source. A latency indicator may appear on the instance. No: DTS writes heartbeat operations to the source. This may affect physical backup and cloning of the source |
| Configure ETL | Yes: enables the extract, transform, and load (ETL) feature and opens a code editor for data processing statements. See Configure ETL in a data migration or data synchronization task. No: disables ETL |
| Monitoring and Alerting | Yes: sends alerts when the task fails or synchronization latency exceeds the threshold. Configure alert thresholds and contacts. No: no alerts |
Optional: Click Next: Configure Database and Table Fields to set the Primary Key Column, Distribution Key, and Engine for destination tables. This step is available only when Schema Synchronization is selected.
Set Definition Status to All to view and modify all tables.
For Primary Key Column, you can select multiple columns to form a composite primary key. Select one or more primary key columns as the Distribution Key.
For tables with neither a primary key nor a unique key, set Engine to duplicate.
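For a table synchronized with the Duplicate engine, the resulting destination table might look like the following sketch (placeholder names; the Duplicate model keeps every ingested row rather than deduplicating):

```sql
-- Hypothetical log table with no primary key on the source side
CREATE TABLE access_log (
    log_time DATETIME,
    user_id  BIGINT,
    action   VARCHAR(256)
)
DUPLICATE KEY (log_time, user_id)
DISTRIBUTED BY HASH (user_id) BUCKETS 10;
```

DTS additionally maintains the `_is_deleted`, `_version`, and `_record_id` columns on Duplicate-engine tables so that queries can filter out superseded or deleted rows.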
Step 4: Run the precheck and purchase the instance
Click Next: Save Task Settings and Precheck.
To preview the API parameters for this task, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
Wait for the precheck to complete. If the precheck fails:
Click View Details next to each failed item, address the issue, and click Precheck Again.
For alert items that can be ignored, click Confirm Alert Details, then Ignore, then OK, then Precheck Again. Ignoring alerts may result in data inconsistency.
After the precheck reaches a Success Rate of 100%, click Next: Purchase Instance.
On the buy page, configure the following parameters:
| Parameter | Description |
|---|---|
| Billing Method | Subscription: pay upfront; more cost-effective for long-term use. Subscription durations: 1–9 months, or 1, 2, 3, or 5 years. Pay-as-you-go: billed per hour; suitable for short-term use |
| Resource Group Settings | The resource group for the instance. Default: default resource group |
| Instance Class | The sync speed varies by instance class. See Instance classes of data synchronization instances |

Read and select Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start. In the dialog box, click OK.
The task appears in the task list. DTS first runs schema synchronization and full data synchronization, then switches to incremental data synchronization.
If a DTS instance fails, DTS support will attempt to recover it within 8 hours. Recovery may involve restarting the instance or adjusting DTS instance parameters (not database parameters). For the parameters that may be modified, see Modify instance parameters.
Data type mapping
MySQL and SelectDB use different type systems. DTS converts types automatically during schema synchronization.
| Category | MySQL data type | SelectDB data type | Notes |
|---|---|---|---|
| Integer | TINYINT | TINYINT | |
| | TINYINT UNSIGNED | SMALLINT | |
| | SMALLINT | SMALLINT | |
| | SMALLINT UNSIGNED | INT | |
| | MEDIUMINT | INT | |
| | MEDIUMINT UNSIGNED | BIGINT | |
| | INT | INT | |
| | INT UNSIGNED | BIGINT | |
| | BIGINT | BIGINT | |
| | BIGINT UNSIGNED | LARGEINT | |
| | BIT(M) | INT | |
| Decimal | DECIMAL | DECIMAL | zerofill is not supported |
| | NUMERIC | DECIMAL | |
| | FLOAT | FLOAT | |
| | DOUBLE | DOUBLE | |
| | BOOL / BOOLEAN | BOOLEAN | |
| Date and time | DATE | DATEV2 | |
| | DATETIME[(fsp)] | DATETIMEV2 | |
| | TIMESTAMP[(fsp)] | DATETIMEV2 | |
| | TIME[(fsp)] | VARCHAR | |
| | YEAR[(4)] | INT | |
| String | CHAR / VARCHAR | VARCHAR | Converted to VARCHAR(4×n). If length exceeds 65,533 bytes, converted to STRING. Default length when unspecified: VARCHAR(65533) |
| | BINARY / VARBINARY | STRING | |
| | TINYTEXT / TEXT / MEDIUMTEXT / LONGTEXT | STRING | |
| | TINYBLOB / BLOB / MEDIUMBLOB / LONGBLOB | STRING | |
| | ENUM | STRING | |
| | SET | STRING | |
| | JSON | STRING | |
CHAR and VARCHAR(n) are converted to VARCHAR(4×n) to prevent data loss, because MySQL VARCHAR(M) measures character length while SelectDB VARCHAR(N) measures byte length.