When you need real-time access to MySQL data through a high-throughput NoSQL store—for query offloading, real-time analytics, or multi-model access—replicating data manually is error-prone and hard to maintain. Data Transmission Service (DTS) automates this synchronization: it performs a one-time full data load and then continuously replicates changes from an ApsaraDB RDS for MySQL instance (or a self-managed MySQL database) to a Tablestore instance, keeping both stores in sync without custom code.
Prerequisites
Before you begin, ensure that you have:
- A Tablestore instance. For setup instructions, see Use Tablestore.
- The AccessKey ID and AccessKey secret of the Alibaba Cloud account that owns the Tablestore instance. For instructions, see Create an AccessKey pair for an Alibaba Cloud account.
Limitations
Understanding these limitations before you configure the task helps you avoid precheck failures and mid-sync errors.
Source database limitations
| Limitation | Trigger and consequence | Workaround |
|---|---|---|
| Primary key or unique key required | Tables without PRIMARY KEY or UNIQUE constraints produce duplicate records in Tablestore. | Add a primary key or unique key to each table before starting the task. |
| Max 1,000 tables per task (when selecting tables as objects) | Selecting more than 1,000 tables causes a request error and the task cannot start. | Create multiple tasks, or synchronize the entire database instead of selecting tables individually. |
| binlog_row_image must be full | If not set to full, the precheck fails and the task cannot start. | Set binlog_row_image to full. For ApsaraDB RDS for MySQL, see Modify instance parameters. |
| Binary log retention of at least 7 days | Shorter retention causes the task to fail mid-sync. In some cases, data loss may occur. | Extend the binary log retention period before starting the task. See Manage binary log files. |
| No DDL during schema or full data synchronization | Running DDL statements while these phases are in progress causes the task to fail. | Pause schema changes until both phases complete. |
| Invisible columns (MySQL 8.0.23 and later) | DTS cannot read data in invisible columns, resulting in data loss. Tables without explicit primary keys auto-generate invisible primary keys, which DTS also cannot read. | Run ALTER TABLE <table_name> ALTER COLUMN <column_name> SET VISIBLE; to make columns visible before starting. See Invisible Columns and Generated Invisible Primary Keys. |
| EncDB feature | ApsaraDB RDS for MySQL instances with the EncDB feature enabled do not support full data synchronization. | Disable EncDB, or use an instance without EncDB. Instances with Transparent Data Encryption (TDE) enabled support all three synchronization types: schema, full, and incremental. |
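The 1,000-table cap does not have to block a large instance: you can partition the table selection across several tasks. A minimal sketch of that batching (table names here are illustrative):

```python
# Split a large table list into batches that each respect the
# 1,000-tables-per-task limit, so every batch can become its own DTS task.
MAX_TABLES_PER_TASK = 1000

def split_into_tasks(tables, limit=MAX_TABLES_PER_TASK):
    """Return a list of batches, each holding at most `limit` tables."""
    return [tables[i:i + limit] for i in range(0, len(tables), limit)]

tables = [f"orders_{n}" for n in range(2500)]   # hypothetical table names
batches = split_into_tasks(tables)
print(len(batches))       # 3 batches: 1000 + 1000 + 500
print(len(batches[-1]))   # 500
```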
Additional requirements for self-managed MySQL:
- Enable binary logging, and set binlog_format to row and binlog_row_image to full. See Create an account for a self-managed MySQL database and configure binary logging.
- In a dual-primary cluster, set log_slave_updates to ON so DTS can capture all binary logs.
- A primary/secondary switchover while the task is running causes the task to fail.
- DTS executes the CREATE DATABASE IF NOT EXISTS 'test' statement in the source database as scheduled to advance the binary log file position.
Additional requirements for ApsaraDB RDS for MySQL:
- An ApsaraDB RDS for MySQL instance that does not record transaction logs, such as a read-only ApsaraDB RDS for MySQL 5.6 instance, cannot be used as the source database.
- DTS executes the CREATE DATABASE IF NOT EXISTS 'test' statement in the source database as scheduled to advance the binary log file position.
Destination database limitations
| Limitation | Trigger and consequence | Workaround |
|---|---|---|
| Max 64 tables per Tablestore instance | Synchronizing more than 64 tables causes a request error. | Contact Tablestore technical support to raise the limit. |
| Tablestore naming conventions | Table and column names that violate naming rules cause the task to fail. Names must use letters, digits, and underscores (_) only; must start with a letter or underscore; and must be 1–255 characters in length. | Rename non-compliant tables and columns before starting the task. |
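Checking names against the naming rules before you start the task avoids a mid-task failure. A small sketch of such a pre-flight check, based on the rules stated in the table (letters, digits, underscores; starts with a letter or underscore; 1–255 characters):

```python
import re

# One regex encodes all three rules: allowed characters, first character,
# and total length (1 + up to 254 more = 255 max).
NAME_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]{0,254}$")

def is_valid_tablestore_name(name: str) -> bool:
    """Return True if `name` satisfies the Tablestore naming rules above."""
    return bool(NAME_PATTERN.match(name))

print(is_valid_tablestore_name("user_profile"))  # True
print(is_valid_tablestore_name("2024_orders"))   # False: starts with a digit
print(is_valid_tablestore_name("order-items"))   # False: hyphen not allowed
```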
General limitations
| Limitation | Consequence | Recommendation |
|---|---|---|
| Data written from other sources to the destination during sync | Data inconsistency between source and destination. | Write to the destination only through DTS while the task is running. |
| Online DDL tools (such as pt-online-schema-change) on synchronized objects | The task fails. | Avoid running online DDL on objects being synchronized. |
| Concurrent INSERT operations during initial full sync | Table fragmentation occurs; the destination tablespace will be larger than the source after this phase. | Plan storage for the destination accordingly. |
| Data from binary log change operations (restored from physical backup or cascaded operations) | This data is not synchronized to the destination. | Remove the affected tables from the synchronization objects, then re-add them. See Modify the objects to be synchronized. |
Run synchronization during off-peak hours to reduce the performance impact on your source and destination databases.
Billing
| Synchronization type | Cost |
|---|---|
| Schema synchronization and full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview. |
SQL operations supported for incremental synchronization
| Operation type | SQL statements |
|---|---|
| DML | INSERT, UPDATE, DELETE |
Required permissions
The database account for the source ApsaraDB RDS for MySQL instance needs read and write permissions on the objects to be synchronized.
- To create an account, see Create an account.
- To adjust permissions for an existing account, see Modify the permissions of an account.

If the account was not created through the ApsaraDB RDS for MySQL console, grant it these permissions manually:

- REPLICATION CLIENT
- REPLICATION SLAVE
- SHOW VIEW
- SELECT
Create a data synchronization task
The task configuration involves nine steps: navigating to the Data Synchronization page, creating the task, configuring source and destination databases, testing connectivity, selecting synchronization objects, configuring advanced settings, mapping primary key columns, running a precheck, and purchasing the instance.
Step 1: Go to the Data Synchronization page
Open the DTS console or DMS console and navigate to the Data Synchronization page.
DTS console
- Log on to the DTS console.
- In the left-side navigation pane, click Data Synchronization.
- In the upper-left corner of the page, select the region in which the data synchronization instance resides.
DMS console
The actual operations may vary based on the mode and layout of the DMS console. For more information, see Simple mode and Customize the layout and style of the DMS console.
- Log on to the DMS console.
- In the top navigation bar, move the pointer over Data + AI and choose .
- From the drop-down list to the right of Data Synchronization Tasks, select the region in which the data synchronization instance resides.
Step 2: Create the task
- Click Create Task to open the task configuration page.
- (Optional) Click New Configuration Page in the upper-right corner if prompted. Skip this step if the Back to Previous Version button is displayed. Specific parameters may differ between the new and previous versions of the configuration page. We recommend using the new version.
Step 3: Configure source and destination databases
Configure the connection details for both the source ApsaraDB RDS for MySQL instance and the destination Tablestore instance.
| Section | Parameter | Description |
|---|---|---|
| N/A | Task Name | DTS generates a name automatically. Specify a descriptive name to make the task easy to identify. Task names do not need to be unique. |
| Source Database | Select a DMS database instance | Select an existing registered database, or configure connection details manually. If you select an existing database, DTS populates the parameters automatically. To register an Alibaba Cloud database instance, see Register an Alibaba Cloud database instance. To register a self-managed or third-party cloud database, see Register a database hosted on a third-party cloud service or a self-managed database. |
| | Database Type | Select MySQL. |
| | Access Method | Select Alibaba Cloud Instance. |
| | Instance Region | The region where the source ApsaraDB RDS for MySQL instance resides. |
| | Replicate Data Across Alibaba Cloud Accounts | Select No to use a database in the current Alibaba Cloud account. |
| | RDS Instance ID | The ID of the source ApsaraDB RDS for MySQL instance. |
| | Database Account | The database account. See Required permissions for what permissions this account needs. |
| | Database Password | The password for the database account. |
| | Encryption | Select Non-encrypted or SSL-encrypted. To use SSL encryption, enable it on the ApsaraDB RDS for MySQL instance first. See Use a cloud certificate to enable SSL encryption. |
| Destination Database | Select a DMS database instance | Select an existing registered database, or configure connection details manually. To register a database, see the links above under Source Database. |
| | Database Type | Select Tablestore. |
| | Access Method | Select Alibaba Cloud Instance. |
| | Instance Region | The region where the destination Tablestore instance resides. |
| | Instance ID | The ID of the destination Tablestore instance. |
| | AccessKey ID of Alibaba Cloud Account | The AccessKey ID of the Alibaba Cloud account that owns the Tablestore instance. Using a Resource Access Management (RAM) user's AccessKey ID may cause the task to fail if the RAM user lacks the required permissions. |
| | AccessKey Secret of Alibaba Cloud Account | The AccessKey secret of the Alibaba Cloud account that owns the Tablestore instance. |
Step 4: Test connectivity
Click Test Connectivity and Proceed.
DTS automatically adds its server CIDR blocks to the whitelist of Alibaba Cloud database instances (such as ApsaraDB RDS for MySQL) and to the security group rules of Elastic Compute Service (ECS) instances that host self-managed databases. For databases hosted in on-premises data centers or on third-party cloud providers, add the CIDR blocks manually. See CIDR blocks of DTS servers.
Adding DTS server CIDR blocks to your database whitelist or security group rules carries security risks. Take preventive measures before proceeding, including: using strong credentials, limiting exposed ports, authenticating API calls, auditing whitelist rules regularly, and restricting unauthorized CIDR blocks. Alternatively, connect through Express Connect, VPN Gateway, or Smart Access Gateway.
Step 5: Configure synchronization objects
This step determines which tables DTS replicates, how conflicts are handled, and how data is written to Tablestore.
Synchronization options:
| Parameter | Description |
|---|---|
| Synchronization Types | Select Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization. Full data synchronization loads historical data first; incremental synchronization then keeps the destination in sync with ongoing changes. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (default): fails the precheck if the destination already has tables with the same names. Use the object name mapping feature to rename conflicting tables if needed. See Map object names. Ignore Errors and Proceed: skips this check. During full sync, existing records with matching primary or unique keys are kept as-is. During incremental sync, they are overwritten. Schema mismatches between source and destination may cause partial sync or task failure. Proceed with caution. |
| Operation Types | The DML operations to synchronize. All operation types are selected by default. Adjust based on your requirements. |
| Processing Policy of Dirty Data | How to handle write errors: Skip or Block. |
| Data Write Mode | How data is written to Tablestore: Overwrite Row uses the PutRowChange operation (replaces the entire row); Update Row uses the UpdateRowChange operation (updates only the written columns). |
| Batch Write Mode | How multiple rows are written at once: BulkImportRequest (offline write) or BatchWriteRowRequest (multi-row write). BulkImportRequest is recommended for higher throughput and lower cost. |
| Capitalization of Object Names in Destination Instance | The capitalization policy for database, table, and column names in Tablestore. Default: DTS default policy. See Specify the capitalization of object names in the destination instance. |
| Source Objects | Select the databases or tables to synchronize, then move them to the Selected Objects section. |
| Selected Objects | To rename an object in the destination, right-click it. To rename multiple objects, click Batch Edit. See Map object names. To filter rows by condition, right-click a table and specify a WHERE clause. See Set filter conditions. |
Database name mapping is not supported. Object name mapping applies only when tables are selected as synchronization objects. Renaming an object may break other objects that depend on it. Column type changes can be made using the column name mapping feature.
Performance tuning (optional):
Click More to configure these Tablestore write parameters:
| Parameter | Description |
|---|---|
| Queue Size | Length of the write queue. |
| Thread Quantity | Number of callback threads for writing. |
| Concurrency | Maximum concurrent threads for Tablestore writes. |
| Buckets | Number of concurrent buckets for sequential incremental writes. A higher value improves concurrent write throughput. Must be less than or equal to Concurrency. |
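The only hard constraint among these tuning parameters is the one stated for Buckets. A hedged sketch of a pre-submission sanity check (parameter names mirror the console labels; the numeric values are illustrative):

```python
def validate_write_params(queue_size, thread_quantity, concurrency, buckets):
    """Return a list of rule violations; an empty list means the values are usable."""
    errors = []
    for name, value in [("Queue Size", queue_size),
                        ("Thread Quantity", thread_quantity),
                        ("Concurrency", concurrency),
                        ("Buckets", buckets)]:
        if value < 1:
            errors.append(f"{name} must be a positive integer")
    # The one documented hard rule: Buckets <= Concurrency.
    if buckets > concurrency:
        errors.append("Buckets must be less than or equal to Concurrency")
    return errors

print(validate_write_params(1024, 16, 8, 4))  # [] -> values are acceptable
print(validate_write_params(1024, 16, 4, 8))  # Buckets exceeds Concurrency
```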
Step 6: Configure advanced settings
Click Next: Advanced Settings to configure the following options.
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS schedules the task on the shared cluster. Purchase a dedicated cluster for higher stability. See What is a DTS dedicated cluster. |
| Retry Time for Failed Connections | How long DTS retries when the source or destination is unreachable. Valid values: 10–1440 minutes. Default: 720. We recommend setting this parameter to a value greater than 30. If DTS reconnects within this window, the task resumes; otherwise, the task fails. When multiple tasks share a source or destination, the shortest retry window applies. Note that you are charged for the DTS instance during retry. |
| Retry Time for Other Issues | How long DTS retries failed DDL or DML operations. Valid values: 1–1440 minutes. Default: 10. We recommend setting this parameter to a value greater than 10. Must be less than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Migration | Limit the queries per second (QPS) to the source database and the migration speed (RPS and MB/s) during full data synchronization to reduce the load on the destination database server. Visible only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Limit the RPS and data synchronization speed (MB/s) for incremental synchronization. |
| Environment Tag | An optional tag to identify the DTS instance in your environment. |
| Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Whether DTS writes heartbeat table SQL operations to the source database. Yes: DTS does not write heartbeat operations (synchronization latency may still be displayed in the console). No: DTS writes heartbeat operations (this may affect physical backup and cloning of the source database). |
| Configure ETL | Whether to enable extract, transform, and load (ETL). Select Yes to enter data processing statements. See Configure ETL in a data migration or data synchronization task. |
| Monitoring and Alerting | Configure alerts for task failure or high synchronization latency. Select Yes to set an alert threshold and notification contacts. See Configure monitoring and alerting when you create a DTS task. |
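The two retry settings above interact in two documented ways: the other-issue retry window must be shorter than the connection retry window, and when several tasks share a source or destination, the shortest connection retry window applies. A small sketch of both rules:

```python
def validate_retry_settings(conn_retry_min, other_retry_min):
    """Raise AssertionError if the retry windows violate the documented rules."""
    assert 10 <= conn_retry_min <= 1440, "connection retry must be 10-1440 minutes"
    assert 1 <= other_retry_min <= 1440, "other-issue retry must be 1-1440 minutes"
    assert other_retry_min < conn_retry_min, \
        "other-issue retry must be less than connection retry"

def effective_conn_retry(shared_task_windows):
    """Among tasks sharing an endpoint, the shortest window is the one applied."""
    return min(shared_task_windows)

validate_retry_settings(720, 10)               # the defaults pass
print(effective_conn_retry([720, 120, 360]))   # 120
```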
Step 7: Configure primary key columns for Tablestore
Click Next: Configure Database and Table Fields, then click OK in the confirmation dialog.
DTS maps each source table's primary key column to Tablestore by default. To use a different column, set Definition Status to All and select the column from the Primary Key Column drop-down list.
Select multiple columns to form a composite primary key.
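A composite primary key means each source row maps onto an ordered list of (column, value) pairs, which is the shape Tablestore uses for primary keys. A sketch with a hypothetical row and column choice:

```python
def build_primary_key(row, pk_columns):
    """Return an ordered list of (column, value) pairs for the chosen PK columns."""
    missing = [c for c in pk_columns if c not in row]
    if missing:
        raise ValueError(f"row is missing primary key columns: {missing}")
    return [(c, row[c]) for c in pk_columns]

# Illustrative source row; user_id + order_ts form the composite primary key.
row = {"user_id": 42, "order_ts": 1700000000, "amount": 19.99}
pk = build_primary_key(row, ["user_id", "order_ts"])
print(pk)  # [('user_id', 42), ('order_ts', 1700000000)]
```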
Step 8: Run the precheck
Click Next: Save Task Settings and Precheck to save the configuration and start a precheck.
To preview the API parameters for this task configuration, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
DTS runs a precheck before starting the task. If any items fail:
- Click View Details next to the failed item to see the cause.
- Fix the issue, then click Precheck Again.

If an alert is triggered for an item:

- If the alert can be ignored: click Confirm Alert Details, click Ignore, click OK, then click Precheck Again. Ignoring alerts may cause data inconsistency.
- If the alert cannot be ignored: fix the underlying issue before proceeding.
Step 9: Purchase the data synchronization instance
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
On the purchase page, configure the following:
| Parameter | Description |
|---|---|
| Billing Method | Subscription: pay upfront; more cost-effective for long-term use. Duration options: 1–9 months, 1 year, 2 years, 3 years, or 5 years. Pay-as-you-go: billed hourly; suitable for short-term use. Release the instance when you no longer need it to avoid ongoing charges. |
| Resource Group Settings | The resource group for the instance. Default: default resource group. See What is Resource Management? |
| Instance Class | The synchronization throughput tier. See Instance classes of data synchronization instances. |
Read and accept the Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start.
Click OK in the confirmation dialog. The task appears in the task list, where you can monitor its progress.
What to do next
- Monitor synchronization latency in the DTS console. If no DML operations are performed on the source for a long time, the displayed latency may be inaccurate. Run any DML operation on the source to refresh the latency value. If you synchronize entire databases, consider creating a heartbeat table that updates every second.
- To change which tables are synchronized, see Modify the objects to be synchronized.
- To tune write performance after the task starts, adjust the Queue Size, Thread Quantity, Concurrency, and Buckets parameters on the task's settings page.
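The heartbeat-table idea suggested above can be sketched as a tiny script that a scheduler runs every second. The table and column names here are hypothetical; any single-row table that receives a regular UPDATE serves the purpose of keeping the displayed latency fresh:

```python
import time

# Hypothetical single-row heartbeat table; run the DDL once.
HEARTBEAT_DDL = (
    "CREATE TABLE IF NOT EXISTS dts_heartbeat ("
    "  id INT PRIMARY KEY,"
    "  ts BIGINT NOT NULL"
    ")"
)

def heartbeat_statement(now=None):
    """Build an upsert that stamps the current UNIX time into the heartbeat row."""
    ts = int(now if now is not None else time.time())
    return (f"INSERT INTO dts_heartbeat (id, ts) VALUES (1, {ts}) "
            f"ON DUPLICATE KEY UPDATE ts = {ts}")

# A scheduler (cron, systemd timer, etc.) would execute this against the
# source database once per second.
print(heartbeat_statement(now=1700000000))
```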