A DuckDB-based analytical primary instance supports features such as columnar storage, just-in-time (JIT) compilation, vectorized execution, efficient memory management, and parallel processing. Compared with the InnoDB engine, it improves the performance of complex queries by a hundredfold. You can use Data Transmission Service (DTS) to synchronize the database schema and data from an RDS for MySQL instance to a DuckDB-based analytical primary instance. This lets you quickly build enterprise business intelligence (BI), interactive query, and real-time reporting systems.
Applicability
A destination DuckDB-based analytical primary instance has been created. For more information, see Create a DuckDB-based analytical primary instance.
Ensure that the storage space of the destination DuckDB-based analytical primary instance is larger than the storage space used by the source database.
Precautions
Tables to be synchronized in the RDS for MySQL instance must have primary keys.
When DTS performs initial full data synchronization, it consumes read and write resources on the source and destination databases, which may increase the database load. We recommend that you synchronize data during off-peak hours, for example, when the CPU load of both databases is below 30%.
Do not perform online DDL operations on synchronization objects in the source database using tools such as gh-ost or pt-online-schema-change during data synchronization. Otherwise, the synchronization fails.
Because of the limits of a DuckDB-based analytical primary instance, the cluster is locked if the disk space usage of a node exceeds 80%. Estimate the space required for the objects to be synchronized in advance and ensure that the destination cluster has sufficient storage space.
Prefix indexes are not supported. If the source database has prefix indexes, the data synchronization may fail.
If the destination DuckDB-based analytical primary instance is being backed up while the DTS task is running, the task fails.
A DuckDB-based analytical primary instance cannot be used as a data source for data synchronization.
If a
DMLstatement containsLIMIT, a non-deterministic function such asUUID(), or anXAtransaction, it is considered anUnsafeoperation for replication. This may cause data inconsistency. Avoid using these statements when you synchronize data to a DuckDB-based analytical primary instance.
Billing
| Synchronization type | Task configuration fee |
| Schema synchronization and full data synchronization | Free of charge. |
| Incremental data synchronization | Charged. For more information, see Billing overview. |
Supported source instance types
The following instance types are supported for the source MySQL database in a data synchronization task:
RDS for MySQL
A self-managed database on an ECS instance
An on-premises self-managed database
This topic uses RDS for MySQL as an example to describe the configuration procedure. The procedure is similar if the source database is a self-managed MySQL database.
If your source database is a self-managed MySQL database, you must make additional preparations. For more information, see Overview of preparations.
Supported SQL operations
DDL operations: CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE, ADD COLUMN, DROP COLUMN, and MODIFY COLUMN
DML operations: INSERT, UPDATE, and DELETE
You cannot specify
ORDER BYorLIMITinUPDATEorDELETEstatements.If you change the field type of a source table during data synchronization, the synchronization job reports an error and stops. You can manually fix the issue. For more information, see Fix synchronization failures caused by changing field types.
Database account permissions
Database | Required permissions |
RDS MySQL | We recommend that you use a privileged account. If you use a standard account, it must have the REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions on all synchronization objects. |
A DuckDB-based analytical primary instance | We recommend that you use a privileged account. If you use a standard account, it must have the READ and WRITE permissions. |
Procedure
Log on to the Data Transmission Service console. In the navigation pane on the left, choose Data Synchronization. In the top navigation bar, select a region.
On the Data Synchronization page, click Create Task to create a synchronization task.
On the page, configure the source and destination databases for the synchronization task.
Source instance information
Configuration
Description
Database Type
Connection Type
Cloud Instance
Instance Region
The region of the source instance that you selected when you purchased the data synchronization instance. This parameter cannot be changed.
Cross-account
No cross-account access
Instance ID
Select the ID of the source RDS instance.
Database Account
Enter the database account of the source RDS instance. For information about the required permissions, see Database account permissions.
Database Password
Enter the password that corresponds to the database account.
Connection Method
Select Unencrypted Connection or SSL Secure Connection as needed. If you select SSL Secure Connection, you must enable Secure Sockets Layer (SSL) encryption for the RDS instance in advance. For more information, see Use a cloud certificate to quickly enable SSL encryption.
ImportantThe Encryption parameter is available only within regions in the Chinese mainland and the China (Hong Kong) region.
Destination instance information
Configuration
Description
Database Type
Connection Type
Cloud Instance
Instance Region
The region of the destination instance that you selected when you purchased the data synchronization instance. This parameter cannot be changed.
Cross-account
Single account
NoteDTS synchronization tasks support cross-region data synchronization and migration. This topic uses data synchronization between instances in the same region and VPC as an example to describe how to ensure data consistency. For more information about data synchronization, see Overview of data synchronization scenarios.
RDS Instance ID
The instance ID of the DuckDB instance.
Database Account
Enter the database account of the DuckDB-based analytical primary instance. For information about the required permissions, see Database account permissions.
Database Password
Enter the password that corresponds to the database account.
Connection Method
Select Unencrypted Connection or SSL Secure Connection as needed. If you select SSL Secure Connection, you must enable SSL encryption for the RDS instance in advance. For more information, see Use a cloud certificate to quickly enable SSL encryption.
ImportantThe Encryption parameter is available only within regions in the Chinese mainland and the China (Hong Kong) region.
Click Test Connection To Proceed.
In the Authorize DTS Servers To Access DB dialog box, click Test Connection.
Configure the task objects.
On the Object Configuration page, configure the objects to be synchronized.
Configuration
Description
Synchronization Types
If you only need to perform full synchronization, we recommend that you select both Schema Synchronization and Full Synchronization.
To perform synchronization without downtime, we recommend that you select Schema Synchronization, Full Synchronization, and Incremental Synchronization.
NoteIf you do not select Schema Synchronization, make sure that the destination database contains the databases and tables that will receive the data. You can then use the table and column name mapping feature in the Selected Objects box as needed.
If you do not select Incremental Synchronization, do not write new data to the source instance during data synchronization to ensure data consistency.
Trigger Migration Method For Source Database
Select a method to migrate triggers as needed. If the objects to be migrated do not involve triggers, you do not need to configure this parameter. For more information, see Configure a method to synchronize or migrate triggers.
NoteYou can configure this parameter only when you select Schema Synchronization for Synchronization Types.
Enable Migration Evaluation
This feature evaluates whether the schemas of the source and destination databases, such as index length, stored procedures, and dependent tables, meet the requirements. You can select Yes or No as needed.
NoteYou can configure this parameter only when you select Schema Synchronization for Synchronization Types.
If you select Yes, the precheck may take longer. You can view the Evaluation Result during the precheck phase. The evaluation result does not affect the precheck result.
Synchronization Topology
Select One-way Synchronization.
Processing Mode For Tables With The Same Name In The Destination Database
Precheck And Report An Error: Checks whether a table with the same name exists in the destination database. If no table with the same name exists, the check item passes. If a table with the same name exists, an error is reported during the precheck phase and the data synchronization task does not start.
NoteIf you cannot delete or rename the table with the same name in the destination database, you can change the name of the table in the destination database. For more information, see Map table and column names.
Ignore And Proceed: Skips the check for tables with the same name in the destination database.
WarningIf you select Ignore And Proceed, data inconsistency may occur, which may pose risks to your business. For example:
If the table schemas are consistent and a record in the destination database has the same primary key value as a record in the source database:
During full synchronization, DTS retains the record in the destination cluster. The record from the source database is not migrated to the destination database.
During incremental synchronization, DTS does not retain the record in the destination cluster. The record from the source database overwrites the record in the destination database.
If the table schemas are inconsistent, only data from some columns may be synchronized, or the synchronization may fail. Proceed with caution.
Migrate Events
Select whether to migrate events from the source database as needed. If you select Yes, you must also follow the relevant requirements and perform subsequent operations. For more information, see Synchronize or migrate events.
Case-sensitivity For Destination Object Names
You can configure the case-sensitivity policy for the names of synchronized databases, tables, and columns in the destination instance. By default, DTS Default Policy is selected. You can also choose to keep the policy consistent with the source or destination database. For more information, see Case-sensitivity for destination object names.
Source Objects
In the Source Objects box, select the objects to be synchronized and click
to move them to the Selected Objects box.NoteYou can select objects at the database, table, and column levels. If you select tables or columns, other objects such as views, triggers, and stored procedures are not synchronized to the destination database.
Selected Objects
To change the name of a single synchronized object in the destination instance, right-click the object in the Selected Objects box. For information about how to set the name, see Map table and column names.
To change the names of multiple synchronized objects in the destination instance in a batch, click Batch Edit in the upper-right corner of the Selected Objects box. For information about how to set the names, see Map table and column names.
NoteTo select SQL operations to be synchronized at the database or table level, right-click the object in the Selected Objects box and select the desired SQL operations in the dialog box that appears.
To set a WHERE clause to filter data, right-click the table in the Selected Objects box and set the filter condition in the dialog box that appears. For information about how to set the filter condition, see Set a filter condition.
Click Next: Advanced Configuration to configure advanced parameters.
In this example, the default configurations are used. You can also configure the parameters as needed.
Click Next: Data Validation to configure a data validation task.
To use the data validation feature, select the data validation methods as needed. You can keep the default configurations for other parameters or configure them as needed. For information about how to configure the parameters, see Configure data validation.
NoteIf you select schema validation, the validation process may report a "data inconsistency" error even if the schema has been synchronized. This is because of historical compatibility issues. This alert does not indicate a critical issue and does not affect the actual synchronization result. To avoid false positives, we recommend that you do not enable schema validation.
Click Next: Save Task And Precheck to perform a precheck on the synchronization task.
After the Precheck Passes With A Score Of 100%, click Next: Purchase.
NoteA precheck is performed before the synchronization job starts. The job can start only after it passes the precheck.
If the precheck fails, click the
icon next to the failed check item to view the details.You can fix the issue based on the cause and run the precheck again.
If you do not need to fix the alert items, you can click Confirm To Shield and then Ignore Alert Items And Rerun Precheck to skip the alert items and run the precheck again.
On the Purchase page, select the billing method and link specification for the data synchronization instance. For more information, see the following table.
Category
Parameter
Description
Configuration
Billing Method
Subscription: You pay when you create the instance. This is suitable for long-term needs and is more cost-effective than pay-as-you-go. The longer the subscription duration, the higher the discount.
Pay-as-you-go: You are charged on an hourly basis. This is suitable for short-term needs. You can release the instance immediately after use to save costs.
Resource Group
The resource group to which the instance belongs. The default value is default resource group. For more information, see What is Resource Management?.
Link Specification
DTS provides synchronization specifications with different performance levels. The link specification affects the synchronization speed. You can select a specification based on your business scenario. For more information, see Specifications of data synchronization links.
Subscription Duration
In subscription mode, select the duration and quantity for the subscription instance. You can select a monthly subscription from 1 to 9 months, or a yearly subscription of 1, 2, 3, or 5 years.
NoteThis option is available only when the billing method is Subscription.
After the configuration is complete, read and select Data Transmission Service (Pay-As-You-Go) Terms Of Service. Click Purchase And Start, and in the Confirm dialog box, click OK.
After the purchase is successful, you are returned to the Data Synchronization page. Click the ID of the task that you created to go to the task details page. In the navigation pane on the left, click Task Management to view the progress. When the task status changes to Synchronizing, the historical data has been migrated and incremental data is being synchronized in real time.
Fix synchronization failures caused by changing field types
If you change the field type of a source table during data synchronization, the synchronization job reports an error and stops. You can follow these steps to manually fix the issue.
In the destination instance, create a new table B (for example, customer_new) based on the failed table A (for example, customer). Adjust the field types of the new table B as needed.
Run the INSERT INTO SELECT command to copy the data from table A to the new table B. Ensure that the data in both tables is consistent.
Rename or delete the failed table A, and then rename table B to customer.
In the DTS console, restart the data synchronization job.