Alibaba Cloud Data Transmission Service (DTS) lets you synchronize data from MySQL and PostgreSQL databases to ApsaraDB for SelectDB in real time. This topic walks you through synchronizing schema, full data, and incremental data from an ApsaraDB RDS for MySQL instance to an ApsaraDB for SelectDB instance.
Supported source databases:
MySQL: ApsaraDB RDS for MySQL, self-managed MySQL, PolarDB for MySQL
PostgreSQL: ApsaraDB RDS for PostgreSQL, self-managed PostgreSQL, PolarDB for PostgreSQL
Prerequisites
Before you begin, ensure that you have:
An ApsaraDB for SelectDB instance
An ApsaraDB RDS for MySQL source instance
The required database account permissions (see Permissions required)
Binary logging configured on the source database (see Binary log requirements)
Limitations
Source database requirements
Tables must have primary keys or UNIQUE constraints with all unique fields. If a table lacks a primary key or UNIQUE constraint, select Schema Synchronization in the Configure Objects and Advanced Settings step, then use the Unique key model and manually configure the primary key columns in the Configure Database and Table Fields step.
Binary logging must be enabled and
binlog_row_imagemust be set tofull. If either condition is not met, the precheck fails and the task cannot start.Binary logs must be retained for at least 7 days. If logs are purged earlier, DTS may fail to read them, which can cause the task to fail or result in data inconsistency or loss. If this requirement is not met, the DTS SLA does not guarantee service reliability or performance.
Binary log requirements
For ApsaraDB RDS for MySQL:
Binary logging is enabled by default. Make sure
binlog_row_imageis set tofull. For more information, see Modify the parameters of an ApsaraDB RDS for MySQL instance.Read-only instances (for example, ApsaraDB RDS for MySQL 5.6 read-only instances) that do not record transaction logs cannot be used as source databases.
To manage binary log retention, see Manage binary log files.
For self-managed MySQL:
Enable binary logging, set
binlog_formattorow, and setbinlog_row_imagetofull.If the source is deployed in a dual-primary cluster, set
log_slave_updatestoONso DTS can read all binary logs. For setup details, see Create an account for a self-managed MySQL database and configure binary logging.If the source is Amazon Aurora MySQL or another clustered MySQL instance, make sure the domain name or IP address used in the task always resolves to a read/write (RW) node.
Other limitations
DTS can only synchronize data to tables that use the Unique key model in ApsaraDB for SelectDB.
Do not use tools such as pt-online-schema-change to perform online DDL operations on objects being synchronized. Doing so may cause the task to fail.
Write data to the destination database only through DTS during synchronization. Writing through other means can cause data inconsistency.
Run the synchronization during off-peak hours to reduce the impact on source and destination database performance.
Special cases
For self-managed MySQL:
If a primary/secondary switchover occurs during synchronization, the task fails.
If no DML operations run for a long time, latency reporting may become inaccurate. Run a DML operation in the source database to refresh the latency value. If you select a full database for synchronization, create a heartbeat table and update or write to it every second.
DTS periodically runs
CREATE DATABASE IF NOT EXISTS \test\`` to advance the binary log offset.
For ApsaraDB RDS for MySQL:
DTS periodically runs
CREATE DATABASE IF NOT EXISTS \test\`` to advance the binary log offset.
Permissions required
| Database | Required permissions |
|---|---|
| Source ApsaraDB RDS for MySQL | Read and write permissions on objects to be synchronized |
| Destination ApsaraDB for SelectDB | USAGE_PRIV on the instance; SELECT_PRIV, LOAD_PRIV, ALTER_PRIV, CREATE_PRIV, and DROP_PRIV on the destination database |
If the source database account was not created in the ApsaraDB RDS console, it must also have theREPLICATION CLIENT,REPLICATION SLAVE,SHOW VIEW, andSELECTpermissions. To check or modify account permissions, see the SHOW GRANTS Statement and Modify the permissions of a standard account on an ApsaraDB RDS for MySQL instance.
Set up a data synchronization task
Log on to the ApsaraDB for SelectDB console.
In the top navigation bar, select the region where your instance resides.
On the Instances page, click the instance ID to go to the Instance Details page.
In the left-side pane, click Data Pipeline, then click the Data Synchronization tab.
A data synchronization task in DTS typically combines full data loading with ongoing incremental change capture. Data migration, by contrast, is designed for one-time historical data transfer. For a comparison, see FAQ.
Click Create Data Synchronization Task and configure the source and destination databases.
Section Parameter Description (none) Task Name DTS generates a name automatically. Use a descriptive name for easy identification. Names do not need to be unique. Source Database Database Type Select MySQL. Access Method Select Alibaba Cloud Instance. Instance Region Select the region of the source ApsaraDB RDS for MySQL instance. RDS Instance ID Select the ID of the source instance. Database Account Enter the account with the required permissions (see Permissions required). Database Password Enter the password for the account. Encryption Select Non-encrypted or SSL-encrypted. If you select SSL-encrypted, enable SSL encryption on the RDS for MySQL instance first. See Use a cloud certificate to quickly enable SSL link encryption. Destination Database Database Type Select SelectDB. Access Method Select Alibaba Cloud Instance. Instance Region Select the region of the destination ApsaraDB for SelectDB instance. Instance ID Select the ID of the destination instance. Database Account Enter the account with the required permissions (see Permissions required). Database Password Enter the password for the account. Click Test Connectivity and Proceed. DTS automatically adds the CIDR blocks of DTS servers to the IP address whitelist of Alibaba Cloud database instances.
WarningAdding DTS server CIDR blocks to a database whitelist or ECS security group exposes your environment to security risks. Before proceeding, take preventive measures such as strengthening credentials, restricting exposed ports, authenticating API calls, regularly reviewing whitelist rules, and connecting through Express Connect, VPN Gateway, or Smart Access Gateway.
Configure the objects to synchronize and advanced settings. For more information, see Specify the capitalization of object names in the destination instance. Selecting objects In the Source Objects section, select the databases, tables, or columns to synchronize, then click the arrow icon to move them to the Selected Objects section. Keep the following in mind:
If you select databases, new tables added to the source after the task starts are not automatically synchronized. To sync new tables, select tables instead of databases. To modify the sync scope, click Reselect Objects in the DTS console.
If you select tables and need to rename them, a single task supports up to 1,000 tables. For larger volumes, split the work into multiple tasks or synchronize entire databases.
To rename an object in the destination, right-click it in the Selected Objects box. To rename multiple objects at once, click Batch Edit. See Map object names.
To filter rows with a WHERE condition, right-click a table in the Selected Objects section and specify the condition. See Set filter conditions.
To control which SQL operations are synchronized for a specific object, right-click it and select the operations.
ImportantWhen synchronizing from MySQL to ApsaraDB for SelectDB, type conversion occurs. If you do not select Schema Synchronization, create tables using the Unique key model with matching schemas in the destination instance in advance. See Data type mappings and Data models.
Synchronization types
Parameter Description Synchronization Types DTS always enables Incremental Data Synchronization. Schema Synchronization and Full Data Synchronization are also selected by default. After the precheck, DTS initializes the destination with full data as the baseline for incremental synchronization. Conflict handling
Option Behavior Precheck and Report Errors (default) Checks for tables with the same name in the destination. If any are found, the precheck fails and the task does not start. To work around this, rename the object in the destination mapping. See Database Table Column Name Mapping. Ignore Errors and Proceed Skips the same-name check. During full synchronization, DTS skips source records that conflict with existing destination records. During incremental synchronization, DTS overwrites destination records with source records. If table schemas are inconsistent, initialization may fail or synchronization may be incomplete. Use with caution. Object name capitalization
Option Behavior DTS default policy (default) Database and table names are lowercase; column names match the source. Consistent with the source database Object names match the source. Consistent with the default policy of the destination database Object names follow the destination's default rules. (Optional) Click Next: Advanced Settings to configure additional options.
Parameter Description Dedicated Cluster for Task Scheduling By default, DTS uses a shared cluster. For higher task stability, purchase a dedicated cluster. See What is a DTS dedicated cluster?. Retry Time for Failed Connections Duration DTS retries after a connection failure. Default: 720 minutes. Range: 10–1,440 minutes. Recommended: 30 minutes or more. If multiple DTS instances share a source or destination, the shortest configured retry duration applies to all. DTS charges for runtime during retries. Retry Time for Other Issues Duration DTS retries after non-connection errors (for example, DDL or DML execution failures). Default: 10 minutes. Range: 1–1,440 minutes. Must be less than Retry Time for Failed Connections. Enable Throttling for Full Data Synchronization Limit full synchronization throughput by setting QPS to the source, RPS of full data migration, and data migration speed (MB/s). Available only when Full Data Synchronization is selected. You can also adjust throttling after the task starts. Enable Throttling for Incremental Data Synchronization Limit incremental synchronization throughput by setting RPS and data synchronization speed (MB/s). Environment Tag Optional tag to identify the DTS instance. Whether to delete SQL operations on heartbeat tables of forward and reverse tasks Yes: DTS does not write heartbeat SQL to the source database. The instance may show latency. No: DTS writes heartbeat SQL to the source database. This may interfere with physical backups and cloning. Configure ETL Yes: Enable the extract, transform, and load (ETL) feature and enter data processing statements. See Configure ETL in a data migration or data synchronization task. No: Disable ETL. Monitoring and Alerting Yes: Configure alerts for task failures or latency exceeding a threshold. Set the alert threshold and notification contacts. See Configure monitoring and alerting during task configuration. No: No alerts. (Optional) Click Next: Configure Database and Table Fields to set the Primary Key Column, Distribution Key, and Engine for each destination table.
This step is only available when Schema Synchronization is selected. Set Definition Status to All to view and modify all tables. You can select multiple columns for Primary Key Column. Distribution Key must be a subset of the primary key columns. Only unique is available for Engine.
Click Next: Save Task Settings and Precheck. To preview the API parameters for this configuration, hover over the button and click Preview OpenAPI parameters. The precheck runs automatically. The task cannot start until the precheck passes.
If an item fails, click View Details, fix the issue, and click Precheck Again.
If an item shows an alert that can be ignored, click Confirm Alert Details > View Details > Ignore > OK, then click Precheck Again. Ignoring alerts may cause data inconsistency.
Wait for the success rate to reach 100%, then click Next: Purchase Instance.
Configure billing for the synchronization instance.
Parameter Description Billing Method Subscription: Pay upfront for a fixed term. More cost-effective for long-running tasks. Pay-as-you-go: Billed hourly. Suitable for short-term use. Release the instance when no longer needed to stop charges. Resource Group The resource group for the instance. Default: default resource group. Instance Class The synchronization specification, which determines performance. See Instance classes of data synchronization instances. Subscription Duration Available only for Subscription billing. Options: 1–9 months, 1 year, 2 years, 3 years, or 5 years. Read and accept the Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start. Monitor the task progress in the task list.
SQL operations supported for incremental synchronization
| Type | Supported operations |
|---|---|
| DML | INSERT, UPDATE, DELETE |
| DDL | ADD COLUMN; DROP COLUMN, DROP TABLE; TRUNCATE TABLE |
Data type mappings
When DTS synchronizes data from MySQL to ApsaraDB for SelectDB, the following type conversions apply.
Numeric types
| MySQL type | ApsaraDB for SelectDB type | Notes |
|---|---|---|
TINYINT | TINYINT | |
TINYINT UNSIGNED | SMALLINT | |
SMALLINT | SMALLINT | |
SMALLINT UNSIGNED | INT | |
MEDIUMINT | INT | |
MEDIUMINT UNSIGNED | BIGINT | |
INT | INT | |
INT UNSIGNED | BIGINT | |
BIGINT | BIGINT | |
BIGINT UNSIGNED | LARGEINT | |
BIT(M) | INT | |
DECIMAL | DECIMAL | ZEROFILL is not supported. |
NUMERIC | DECIMAL | |
FLOAT | FLOAT | |
DOUBLE | DOUBLE | |
BOOL, BOOLEAN | BOOLEAN |
Date and time types
| MySQL type | ApsaraDB for SelectDB type | Notes |
|---|---|---|
DATE | DATEV2 | |
DATETIME[(fsp)] | DATETIMEV2 | |
TIMESTAMP[(fsp)] | DATETIMEV2 | |
TIME[(fsp)] | VARCHAR | |
YEAR[(4)] | INT |
String types
| MySQL type | ApsaraDB for SelectDB type | Notes |
|---|---|---|
CHAR, VARCHAR | VARCHAR | CHAR and VARCHAR(n) are converted to VARCHAR(3*n) to prevent data loss. If no length is specified, VARCHAR(65533) is used. Data longer than 65,533 characters is converted to STRING. |
BINARY, VARBINARY | STRING | |
TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT | STRING | |
TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB | STRING | |
ENUM | STRING | |
SET | STRING | |
JSON | STRING |
FAQ
What is the difference between data synchronization and data migration in DTS?
Data migration is for one-time transfer of historical data from on-premises, ECS-hosted, or third-party cloud databases to Alibaba Cloud. It uses only the pay-as-you-go billing method, and charges apply only during the incremental migration phase (not schema or full data migration).
Data synchronization is for ongoing real-time replication between two data sources. It supports both pay-as-you-go and subscription billing. With pay-as-you-go, charges apply whenever incremental synchronization is in progress—including when it is paused. Synchronization is suited for active geo-redundancy, disaster recovery, cross-border data synchronization, real-time data warehousing, and cloud business intelligence (BI) pipelines.
For more information, see FAQ.
Can I add or remove columns from a source table that is being synchronized to ApsaraDB for SelectDB?
Yes. DTS automatically synchronizes ADD COLUMN and DROP COLUMN operations. For all supported incremental DDL operations, see SQL operations supported for incremental synchronization.
Does DTS synchronize new tables created after the task starts?
It depends on how you scoped the objects:
If you selected databases: New tables are not automatically synchronized. To sync their data, manually create the corresponding tables in the destination ApsaraDB for SelectDB instance. DTS then automatically synchronizes the data in those tables.
If you selected tables: New tables are not included by default. To add new tables to the task, click Reselect Objects in the Actions column of the DTS console's Data Synchronization page.
What do I do if a table to be synchronized does not have a primary key or UNIQUE constraint?
DTS requires all source tables to have primary keys or UNIQUE constraints. If a table does not meet this requirement:
In the Configure Objects and Advanced Settings step, select Schema Synchronization.
In the Configure Database and Table Fields step, select the Unique key model for the table and manually configure the primary key columns.
What's next
Synchronize data from RDS for MySQL to ApsaraDB for SelectDB
Migrate data from PolarDB for MySQL to ApsaraDB for SelectDB
Synchronize data from a PolarDB for MySQL cluster to a SelectDB instance
Migrate data from ApsaraDB RDS for PostgreSQL to ApsaraDB for SelectDB
Synchronize data from an RDS for PostgreSQL instance to a SelectDB instance