Data Transmission Service (DTS) synchronizes data from an ApsaraDB for MongoDB replica set to a PolarDB for MySQL cluster. Use this topic to create and run a synchronization task from start to finish.
Before you begin
Before creating a synchronization task, complete the following preparation steps.
Set up the destination cluster
Create a PolarDB for MySQL cluster with available storage larger than the total size of source data. (Recommended: at least 10% larger.) See Custom purchase and Purchase a subscription cluster.
Create a database and a table with a primary key column in the destination cluster. See Manage databases.
When designing the destination table schema:
- Use varchar for any column that maps to the MongoDB ObjectId _id field.
- Do not name any column _id or _value.
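The schema rules above can be checked with a short script before you create the destination table. The following is a minimal sketch; the helper name and the sample column types are illustrative, not part of any DTS API:

```python
# Hypothetical helper that checks a proposed destination-table schema
# against the rules above: columns mapped from a MongoDB ObjectId should
# be varchar, and no column may be named "_id" or "_value".
RESERVED_NAMES = {"_id", "_value"}

def validate_destination_schema(columns, objectid_columns):
    """columns: {name: sql_type}; objectid_columns: names mapped from ObjectId."""
    errors = []
    for name in columns:
        if name in RESERVED_NAMES:
            errors.append(f"column name '{name}' is reserved")
    for name in objectid_columns:
        if not columns.get(name, "").lower().startswith("varchar"):
            errors.append(f"column '{name}' maps an ObjectId and should be varchar")
    return errors

# Example: "mongo_id" stores the MongoDB _id, so it must be varchar.
schema = {"mongo_id": "varchar(24)", "person_name": "varchar(64)"}
print(validate_destination_schema(schema, objectid_columns=["mongo_id"]))  # []
```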
Configure accounts with the required permissions
| Database | Required permissions | Reference |
|---|---|---|
| Source ApsaraDB for MongoDB | Read on the source database, the admin database, and the local database | Account management |
| Destination PolarDB for MySQL | Read and write on the destination database | Create and manage a database account |
(Sharded cluster only) Apply for shard endpoints
If the source is a sharded cluster, apply for an endpoint for each shard node. All shard nodes must use the same account and password. See Apply for an endpoint for a shard.
Billing
| Synchronization type | Fee |
|---|---|
| Full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview. |
Synchronization types
| Type | Description |
|---|---|
| Full data synchronization | Synchronizes historical data from the source ApsaraDB for MongoDB instance to the destination PolarDB for MySQL cluster. |
| Incremental data synchronization | After full data synchronization completes, continuously synchronizes insert, update, and delete operations. Only documents updated using the $set command are included. |
Limitations
Source database limitations
| Limitation | Details |
|---|---|
| Outbound bandwidth | The source server must have sufficient outbound bandwidth. Insufficient bandwidth reduces synchronization speed. |
| Collection limit | A single task supports up to 1,000 collections when object renaming is required. For more than 1,000 collections, configure multiple tasks. |
| Unsupported databases | DTS cannot synchronize data from the admin, config, or local databases. |
| Unsupported source types | Standalone ApsaraDB for MongoDB instances, Azure Cosmos DB for MongoDB clusters, and Amazon DocumentDB elastic clusters are not supported. |
| Oplog or change stream | The oplog feature must be enabled with operation logs retained for at least 7 days, OR change streams must be enabled and DTS must be able to subscribe to changes within the last 7 days. If neither condition is met, DTS may fail to obtain logs, causing task failure or data inconsistency. |
| Change stream version | Change streams require MongoDB V4.0 or later. |
| Amazon DocumentDB inelastic clusters | Set Migration Method to ChangeStream and Architecture to Sharded Cluster. |
| Operations during full synchronization | Do not change the schemas of databases or collections, or modify data of the ARRAY type. If running full synchronization only (no incremental), do not write to the source database. |
Sharded cluster additional limitations:
- The _id field in each collection must be unique. Duplicate _id values cause data inconsistency.
- The number of mongos nodes cannot exceed 10.
- The instance must not contain orphaned documents. See the MongoDB documentation and the FAQ topic.
- If the ApsaraDB for MongoDB balancer is enabled, the instance may experience synchronization delays.
Destination database and task limitations
| Limitation | Details |
|---|---|
| Sync object type | Only collections can be selected as synchronization objects. |
| Primary key requirement | The destination table must have a unique single-column primary key (composite primary keys are not supported). Assign bson_value("_id") to the primary key column. |
| Reserved column names | The destination table cannot have columns named _id or _value. |
| Transactions | Transactions are not retained. Synchronized transactions are converted to single records. |
| Character set | If the data includes rare characters or emojis (4-byte characters), the destination database and tables must use the UTF8mb4 character set. If you use DTS schema synchronization, set the character_set_server parameter to UTF8mb4. |
| FLOAT/DOUBLE precision | DTS uses ROUND(COLUMN,PRECISION) to handle FLOAT and DOUBLE values. If no precision is specified, DTS defaults to 38 digits for FLOAT and 308 digits for DOUBLE. Verify these defaults before starting synchronization. |
| Off-peak hours | Run synchronization during off-peak hours. Full data synchronization uses read and write resources of both databases, which may increase server load. |
| Post-synchronization storage | After full synchronization, concurrent INSERT operations may cause fragmentation in destination collections, resulting in higher storage usage than in the source. |
| Failed task resume | DTS attempts to resume failed tasks for up to 7 days. Before switching workloads to the destination, stop or release any failed tasks, or revoke DTS write permissions using REVOKE. Otherwise, source data may overwrite destination data when the task resumes. |
| Latency calculation | DTS calculates incremental synchronization latency based on the timestamp of the latest synced data in the destination and the current timestamp in the source. Extended periods without source updates may cause inaccurate latency readings. Perform an update on the source to refresh the latency. |
| Task failure recovery | If a DTS task fails, DTS technical support attempts to restore it within 8 hours. The task may be restarted and task parameters (not database parameters) may be modified. |
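The FLOAT/DOUBLE limitation above means values pass through a ROUND(COLUMN,PRECISION) step. The behavior can be previewed locally; this sketch uses Python's decimal module with half-up rounding as a stand-in for the SQL ROUND() function (an approximation for positive values, not the DTS implementation):

```python
from decimal import Decimal, ROUND_HALF_UP

def sql_round(value, precision):
    # Approximates ROUND(COLUMN, PRECISION): quantize to the requested
    # number of decimal places, rounding halves up.
    q = Decimal(10) ** -precision
    return Decimal(str(value)).quantize(q, rounding=ROUND_HALF_UP)

print(sql_round(3.14159, 2))  # 3.14
print(sql_round(2.675, 2))    # 2.68
```

If you rely on a specific precision, set it explicitly rather than depending on the 38-digit (FLOAT) and 308-digit (DOUBLE) defaults.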
Create a data synchronization task
The task configuration consists of five steps:
Go to the data synchronization page.
Configure source and destination databases.
Configure objects to synchronize.
Run a precheck.
Purchase an instance.
Step 1: Go to the data synchronization page
Use one of the following methods.
DTS console
Log on to the DTS console.
In the left-side navigation pane, click Data Synchronization.
In the upper-left corner, select the region where the synchronization task resides.
DMS console
The actual steps may vary based on the mode and layout of the DMS console. See Simple mode and Customize the layout and style of the DMS console.
Log on to the DMS console.
In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.
From the drop-down list to the right of Data Synchronization Tasks, select the region where the task resides.
Step 2: Configure source and destination databases
Click Create Task.
(Optional) Click New Configuration Page in the upper-right corner.
- If Back to Previous Version is displayed, the page is already on the new version. Skip this step.
- Use the new version of the configuration page when possible.
Configure the source and destination databases using the following parameters.
General
| Parameter | Description |
|---|---|
| Task Name | The name of the DTS task. DTS generates a name automatically. Specify a descriptive name to identify the task. The name does not need to be unique. |
Source database
| Parameter | Description |
|---|---|
| Select Existing Connection | If the source instance is already registered with DTS, select it from the drop-down list. DTS auto-populates the remaining parameters. Otherwise, configure the parameters below manually. In the DMS console, select the instance from the Select a DMS database instance drop-down list. |
| Database Type | Select MongoDB. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region where the source ApsaraDB for MongoDB instance resides. |
| Replicate Data Across Alibaba Cloud Accounts | Select No if the source database belongs to the current Alibaba Cloud account. |
| Architecture | The architecture of the source instance. Select Replica Set for this example. If the source is a Sharded Cluster, also specify Shard account and Shard password. |
| Migration Method | The method used to synchronize incremental data. Options: Oplog (recommended) or ChangeStream. <br>- Oplog: Requires the oplog feature to be enabled. Oplog is enabled by default for ApsaraDB for MongoDB instances and delivers low synchronization latency due to fast log-pulling speed.<br>- ChangeStream: Requires change streams to be enabled. Available for MongoDB V4.0 or later. For inelastic Amazon DocumentDB clusters, use ChangeStream only. If Architecture is Sharded Cluster, the Shard account and Shard password parameters are not required. See Change Streams. |
| Instance ID | The ID of the source ApsaraDB for MongoDB instance. |
| Authentication Database | The database that stores the account credentials. Default: admin. |
| Database Account | The account with the required permissions. |
| Database Password | The password for the database account. |
| Encryption | The connection encryption method: Non-encrypted, SSL-encrypted, or Mongo Atlas SSL. Available options depend on the Access Method and Architecture settings — the options displayed in the console apply. Note SSL-encrypted is unavailable when Architecture is Sharded Cluster and Migration Method is Oplog. For self-managed MongoDB using Replica Set architecture with a non-Alibaba Cloud access method and SSL encryption, upload a CA certificate to verify the connection. |
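The source-database parameters above amount to a standard MongoDB connection string. The following sketch shows how the pieces fit together; the endpoint, port, and account are hypothetical placeholders, and this is not required by DTS itself (DTS connects using the console parameters):

```python
from urllib.parse import quote_plus

def build_mongo_uri(host, port, user, password, auth_db="admin"):
    # Credentials are percent-encoded so special characters survive the URI.
    # auth_db corresponds to the Authentication Database parameter.
    return (f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
            f"@{host}:{port}/?authSource={auth_db}")

# Hypothetical endpoint and account, for illustration only.
uri = build_mongo_uri("dds-bp1example.mongodb.rds.aliyuncs.com", 3717,
                      "dtsuser", "p@ss w0rd")
print(uri)
```

A URI built this way can be used with a driver such as pymongo to verify the account's read permissions on the source, admin, and local databases before configuring the task.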
Destination database
| Parameter | Description |
|---|---|
| Select Existing Connection | If the destination instance is already registered with DTS, select it from the drop-down list. DTS auto-populates the remaining parameters. Otherwise, configure the parameters below manually. |
| Database Type | Select PolarDB for MySQL. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region where the destination PolarDB for MySQL cluster resides. |
| Replicate Data Across Alibaba Cloud Accounts | Select No if the destination database belongs to the current Alibaba Cloud account. |
| PolarDB Cluster ID | The ID of the destination PolarDB for MySQL cluster. |
| Database Account | The account with the required permissions. |
| Database Password | The password for the database account. |
| Encryption | The connection encryption method. See Configure SSL encryption. |
Click Test Connectivity and Proceed.
- The CIDR blocks of DTS servers must be added to the security settings of both databases. DTS adds them automatically; you can also add them manually. See Add the CIDR blocks of DTS servers.
- For self-managed databases whose Access Method is not Alibaba Cloud Instance, click Test Connectivity in the CIDR Blocks of DTS Servers dialog box.
Step 3: Configure objects to synchronize
In the Configure Objects step, set the following parameters.
| Parameter | Description |
|---|---|
| Synchronization Types | Incremental Data Synchronization is selected by default. Select Full Data Synchronization if needed. Schema Synchronization cannot be selected. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (default): Checks for table name conflicts before starting. If identical table names exist, the precheck fails and the task cannot start. Use the object name mapping feature to rename conflicting tables if they cannot be deleted or renamed. <br>Ignore Errors and Proceed: Skips the conflict check. Warning This option risks data inconsistency. During full synchronization, existing destination records with matching primary or unique keys are retained. During incremental synchronization, they are overwritten. If schemas differ, initialization may fail or only partial columns are synchronized. |
| Capitalization of Object Names in Destination Instance | The capitalization policy for database names, table names, and column names in the destination. Default: DTS default policy. See Specify the capitalization of object names. |
| Source Objects | Select one or more collections in the Source Objects section, then click the icon to move them to the Selected Objects section. |
In the Selected Objects section, configure object mapping.
(Optional) To remove fields that do not need to be synchronized, click the icon after the corresponding row.
Rename the database:
- Right-click the database in the Selected Objects section.
- Change Schema Name to the name of the target database in PolarDB for MySQL.
- Click OK.
Rename collections:
- Right-click a collection in the Selected Objects section.
- Change Table Name to the name of the target table in PolarDB for MySQL.
(Optional) Specify filter conditions. See Specify filter conditions.
(Optional) In the Select DDL and DML Operations to Be Synchronized section, select the incremental operations to synchronize.
Map fields: DTS automatically maps collection data and generates bson_value() expressions in the Assign Value column. Verify that the expressions meet your requirements, then configure Column Name, Type, Length, and Precision for each field.
Important:
- Assign bson_value("_id") to the primary key column of the destination table.
- Specify both the field and its subfields in each bson_value() expression, following the document hierarchy. Specifying only a parent field (for example, bson_value("person")) does not synchronize its subfields to the destination.
Fields with correct expressions:
- Set Column Name to the corresponding column name in the destination PolarDB for MySQL table.
- Select a Type compatible with the source data. For data type mappings, see the Data type mapping section.
- (Optional) Set Length and Precision.
- Repeat for each field.
Fields with incorrect expressions:
- Click the icon in the Actions column of the row.
- Click + Add Column.
- Set Column Name, Type, Length, and Precision.
- Enter the correct bson_value() expression in Assign Value. For examples, see the Field mapping examples section.
- Repeat for each field.
Click OK.
Click Next: Advanced Settings and configure the following parameters.
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS schedules tasks to a shared cluster. Purchase a dedicated cluster to improve synchronization stability. See What is a DTS dedicated cluster. |
| Select the engine type of the destination database | The storage engine of the destination database. Options: InnoDB (default) or X-Engine (for OLTP workloads). |
| Retry Time for Failed Connections | The time range in which DTS retries failed connections. Valid values: 10–1440 minutes. Default: 720. Set to a value greater than 30. If DTS reconnects within this range, the task resumes. Otherwise, the task fails. If multiple tasks share the same source or destination database with different retry ranges, the shortest range takes precedence. DTS continues to charge during retries. |
| Retry Time for Other Issues | The time range in which DTS retries failed DDL or DML operations. Valid values: 1–1440 minutes. Default: 10. Set to a value greater than 10 and less than the Retry Time for Failed Connections value. |
| Enable Throttling for Full Data Synchronization | Limits read/write resource usage during full synchronization to reduce database server load. Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Synchronization is selected. |
| Only one data type for primary key _id in a table of the data to be synchronized | Controls whether DTS scans the _id data type during full synchronization. Yes: Skip the scan. No: Scan the type. Displayed only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Limits resource usage during incremental synchronization. Configure RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s). |
| Environment Tag | An optional tag for categorizing the task. |
| Configure ETL | Specifies whether to enable the extract, transform, and load (ETL) feature. Yes: Enter data processing statements in the code editor. See Configure ETL. No: Disable ETL. |
| Monitoring and Alerting | Specifies whether to configure alerting. Yes: Set alert thresholds and notification contacts. No: No alerts. See Configure monitoring and alerting. |
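The two retry parameters above have interdependent ranges that are easy to get wrong. A small sketch of the documented constraints (the function is illustrative, not a DTS API; values are in minutes):

```python
def validate_retry_settings(conn_retry_min, other_retry_min):
    """Check retry settings against the documented ranges:
    connection retry 10-1440 (recommended > 30); other-issue retry
    1-1440, greater than 10 and below the connection retry value."""
    problems = []
    if not 10 <= conn_retry_min <= 1440:
        problems.append("connection retry must be 10-1440 minutes")
    elif conn_retry_min <= 30:
        problems.append("connection retry should be greater than 30 minutes")
    if not 1 <= other_retry_min <= 1440:
        problems.append("other-issue retry must be 1-1440 minutes")
    elif not (10 < other_retry_min < conn_retry_min):
        problems.append("other-issue retry should be >10 and below the connection retry value")
    return problems

print(validate_retry_settings(720, 30))  # []
```

Note that the default of 10 minutes for Retry Time for Other Issues sits at the boundary of the recommendation, so raising it slightly is usually advisable.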
Step 4: Save settings and run a precheck
To preview the API parameters for this task, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
To proceed, click Next: Save Task Settings and Precheck.
DTS runs a precheck before starting the task. The task starts only after passing the precheck.
If the precheck fails, click View Details next to each failed item, fix the issue, and click Precheck Again.
If an alert is triggered: for items that cannot be ignored, fix the issue and rerun the precheck. For ignorable items, click Confirm Alert Details > Ignore > OK > Precheck Again. Ignoring alerts may cause data inconsistency.
Step 5: Purchase an instance
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
On the buy page, configure the following parameters.
| Parameter | Description |
|---|---|
| Billing Method | Subscription: Pay upfront. More cost-effective for long-term use. Pay-as-you-go: Billed hourly. Suitable for short-term use. Release the instance when no longer needed to stop charges. |
| Resource Group Settings | The resource group for the synchronization instance. Default: default resource group. See What is Resource Management? |
| Instance Class | The instance class determines synchronization speed. See Instance classes. |
| Subscription Duration | Available only for the Subscription billing method. Options: 1–9 months, or 1, 2, 3, or 5 years. |
Read and select Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start, then click OK in the dialog box.
After the task starts, monitor its progress in the task list.
Data type mapping
The following table shows how MongoDB data types map to PolarDB for MySQL data types.
| MongoDB data type | PolarDB for MySQL data type | Notes |
|---|---|---|
| ObjectId | VARCHAR | Stored as a string representation. |
| String | VARCHAR | |
| Document | VARCHAR | |
| DbPointer | VARCHAR | |
| Array | VARCHAR | |
| Date | DATETIME | |
| TimeStamp | DATETIME | |
| Double | DOUBLE | Precision defaults to 308 digits via ROUND(COLUMN,PRECISION) if not specified. |
| 32-bit integer (BsonInt32) | INTEGER | |
| 64-bit integer (BsonInt64) | BIGINT | |
| Decimal128 | DECIMAL | |
| Boolean | BOOLEAN | |
| Null | VARCHAR |
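For scripting against the mapping, the table above can be expressed as a simple lookup. This dict is just a convenience restatement of the table, not a DTS API:

```python
# MongoDB (BSON) type name -> PolarDB for MySQL column type,
# transcribed from the data type mapping table above.
MONGO_TO_MYSQL = {
    "ObjectId": "VARCHAR",   # stored as a string representation
    "String": "VARCHAR",
    "Document": "VARCHAR",
    "DbPointer": "VARCHAR",
    "Array": "VARCHAR",
    "Date": "DATETIME",
    "TimeStamp": "DATETIME",
    "Double": "DOUBLE",      # default precision 308 digits if unspecified
    "BsonInt32": "INTEGER",
    "BsonInt64": "BIGINT",
    "Decimal128": "DECIMAL",
    "Boolean": "BOOLEAN",
    "Null": "VARCHAR",
}

print(MONGO_TO_MYSQL["ObjectId"])  # VARCHAR
```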
Field mapping examples
The examples below use the following source document structure and destination table schema.
Data structure of the source ApsaraDB for MongoDB instance:

```json
{
  "_id": "62cd344c85c1ea6a2a9f****",
  "person": {
    "name": "neo",
    "age": 26,
    "sex": "male"
  }
}
```

Table schema of the destination PolarDB for MySQL cluster
| Column name | Type | Notes |
|---|---|---|
| mongo_id | varchar | Primary key |
| person_name | varchar | |
| person_age | decimal |
Configuration of new columns
The person_name and person_age columns require nested field expressions because person is a parent field containing subfields; mongo_id maps the top-level _id field directly.
| Column name | Type | Assign value |
|---|---|---|
| mongo_id | STRING | bson_value("_id") |
| person_name | STRING | bson_value("person","name") |
| person_age | DECIMAL | bson_value("person","age") |
Specifying only bson_value("person") does not synchronize the subfields name, age, or sex to individual columns. Always use the full hierarchical path in the bson_value() expression.
After this configuration, the destination table receives data in the following structure:
| mongo_id | person_name | person_age |
|---|---|---|
| 62cd344c85c1ea6a2a9f**** | neo | 26 |
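The path-resolution behavior of the expressions above can be modeled in a few lines. This is a rough Python illustration of how hierarchical bson_value() paths resolve against the sample document, not the DTS implementation:

```python
# Sample source document from the example above.
doc = {
    "_id": "62cd344c85c1ea6a2a9f****",
    "person": {"name": "neo", "age": 26, "sex": "male"},
}

def bson_value(document, *path):
    # Walk the document hierarchy one key at a time.
    value = document
    for key in path:
        value = value[key]
    return value

# The three Assign Value expressions from the configuration table.
row = {
    "mongo_id": bson_value(doc, "_id"),
    "person_name": bson_value(doc, "person", "name"),
    "person_age": bson_value(doc, "person", "age"),
}
print(row)

# A parent-only path returns the whole subdocument as a single value,
# which is why subfields are not flattened into separate columns:
print(bson_value(doc, "person"))  # {'name': 'neo', 'age': 26, 'sex': 'male'}
```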