Data Transmission Service (DTS) continuously synchronizes data from an ApsaraDB for MongoDB replica set or sharded cluster instance to an AnalyticDB for PostgreSQL instance. This topic covers full data synchronization and incremental data synchronization.
Prerequisites
Before you begin, make sure that you have:
- An AnalyticDB for PostgreSQL instance with available storage space larger than the total size of data in the source ApsaraDB for MongoDB instance. Provision destination storage at 110% of the source data size to maintain 10% headroom. To create an instance, see Create an instance.
- A database, a schema, and a table with a primary key column created in the destination instance to receive data. For the SQL syntax, see SQL syntax.
- A data type for each destination column that is compatible with the corresponding MongoDB field. For example, if the _id field is of the ObjectId type, the destination column type must be varchar.
- No destination column named _id or _value.
- (For sharded cluster sources) The endpoints, usernames, and passwords of all shard nodes. All shard accounts must use the same credentials. To get shard endpoints, see Apply for an endpoint for a shard.
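The 110% provisioning guideline above can be expressed as a small calculation. This is an illustrative helper, not a DTS API; the function name and units are assumptions:

```python
def required_destination_storage_gb(source_data_gb: float, headroom: float = 0.10) -> float:
    """Return the minimum destination storage to provision, keeping
    `headroom` (10% by default) free beyond the source data size."""
    return round(source_data_gb * (1.0 + headroom), 2)

# Example: a 500 GB MongoDB source calls for at least 550 GB
# of available storage in AnalyticDB for PostgreSQL.
print(required_destination_storage_gb(500))  # 550.0
```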
Billing
| Synchronization type | Fee |
|---|---|
| Full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview. |
Synchronization types
| Synchronization type | Description |
|---|---|
| Full data synchronization | DTS copies all existing data from the selected collections in the source instance to the destination instance. |
| Incremental data synchronization | After full data synchronization completes, DTS continuously applies insert, update, and delete operations from the source to the destination. For incremental update operations, only the $set operator is supported. |
Required permissions
The incremental synchronization method available to DTS depends on the permissions granted to the database account.
| Database | Required permissions | Incremental sync method | References |
|---|---|---|---|
| Source ApsaraDB for MongoDB instance | Read permissions on the source, admin, and local databases | Oplog (recommended for low-latency sync) or ChangeStream | Manage the permissions of MongoDB database users |
| Destination AnalyticDB for PostgreSQL instance | Read and write permissions on the destination database. Use the initial account or an account with the RDS_SUPERUSER permission. | N/A | Create and manage a database account and Manage users and permissions |
Choosing between Oplog and ChangeStream:
| Method | Requirements | Notes |
|---|---|---|
| Oplog (recommended) | Oplog enabled and retaining at least 7 days of data | Default on both self-managed MongoDB and ApsaraDB for MongoDB. Lower latency due to fast log pulling. |
| ChangeStream | MongoDB 4.0 or later; change streams enabled | Does not support two-way synchronization. Required for non-elastic Amazon DocumentDB clusters (set Migration Method to ChangeStream and Architecture to Sharded Cluster). |
Limitations
Source database limits
| Limit | Details |
|---|---|
| Bandwidth | The source database server must have sufficient outbound bandwidth; otherwise, synchronization speed is affected. |
| Collection name mapping | When you configure name mapping for collections, a single task supports up to 1,000 collections. For more collections, create multiple tasks or synchronize the entire database. |
| Sharded cluster: _id uniqueness | The _id field in each collection must be unique; otherwise, data inconsistency may occur. |
| Sharded cluster: Mongos node limit | The number of Mongos nodes cannot exceed 10. |
| Sharded cluster: orphaned documents | The source instance must not contain orphaned documents. Orphaned documents cause data inconsistency and task failure. See the MongoDB documentation and How do I delete orphaned documents? |
| Unsupported source types | Standalone ApsaraDB for MongoDB instances, Azure Cosmos DB for MongoDB clusters, and Amazon DocumentDB elastic clusters are not supported as sources. DTS cannot connect to a MongoDB database over an SRV endpoint. |
| Oplog or change stream retention | The oplog must be enabled and retain at least seven days of log data, or change streams must be enabled. If DTS cannot access the last seven days of changes, synchronization fails and data inconsistency or loss may occur. This is not covered by the DTS SLA. |
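The seven-day retention requirement above can be checked before you configure the task. A minimal sketch, assuming you have already read the timestamps of the oldest and newest oplog entries (for example from db.oplog.rs via mongosh or pymongo; the function name is illustrative):

```python
from datetime import datetime, timedelta

def oplog_window_ok(first_op: datetime, last_op: datetime,
                    required: timedelta = timedelta(days=7)) -> bool:
    """Check whether the oplog currently spans at least `required` time.
    `first_op` and `last_op` are the timestamps of the oldest and
    newest oplog entries."""
    return (last_op - first_op) >= required

# A 10-day window satisfies the 7-day requirement; a 3-day window does not.
print(oplog_window_ok(datetime(2024, 1, 1), datetime(2024, 1, 11)))  # True
print(oplog_window_ok(datetime(2024, 1, 8), datetime(2024, 1, 11)))  # False
```

Note that the window shrinks as write volume grows, so re-check it under peak load, not just once.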
Restrictions during synchronization:
- During full data synchronization, do not modify database or collection schemas, or change data of the ARRAY type. Doing so causes task failure or data inconsistency.
- If the source is a sharded cluster, do not run commands that change data distribution (shardCollection, reshardCollection, unshardCollection, moveCollection, or movePrimary) while the task is running. Doing so causes data inconsistency.
- If the balancer is active on a sharded cluster source, latency may increase during synchronization.
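The sharded-cluster _id uniqueness requirement listed above can be verified by scanning the _id values collected from each shard. An illustrative check (not a DTS tool):

```python
from collections import Counter

def duplicate_ids(ids):
    """Return _id values that appear more than once across shards.
    Duplicates violate the sharded-cluster uniqueness requirement
    and can cause data inconsistency during synchronization."""
    return sorted(v for v, n in Counter(ids).items() if n > 1)

# _id values gathered from two shards (sample data):
shard_a = ["a1", "a2", "a3"]
shard_b = ["b1", "a2"]
print(duplicate_ids(shard_a + shard_b))  # ['a2']
```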
Other limits
| Limit | Details |
|---|---|
| Object type | Only collections can be selected as synchronization objects. |
| Primary key requirement | The destination table must have a single-column unique primary key (not a composite primary key). The bson_value("_id") expression must be assigned to the primary key column. |
| Reserved column names | The destination table cannot contain columns named _id or _value. |
| Append-optimized tables | The destination table cannot be an append-optimized (AO) table. |
| Excluded databases | DTS cannot synchronize data from the admin, config, or local databases. |
| Transactions | Transaction information is not retained. Transactions are written to the destination as individual records. |
| FLOAT/DOUBLE precision | DTS uses ROUND(COLUMN,PRECISION) to retrieve values from FLOAT and DOUBLE columns. The default precision is 38 digits for FLOAT and 308 digits for DOUBLE. Verify that this meets your requirements. |
| Task auto-resume | DTS attempts to resume failed tasks for up to seven days. Before switching workloads to the destination, stop or release any failed tasks, or run REVOKE to remove DTS write permissions on the destination. Otherwise, source data may overwrite destination data after auto-resume. |
| Latency calculation | DTS calculates incremental synchronization latency based on the latest synchronized data timestamp. If no writes occur on the source for an extended period, the reported latency may be inaccurate. Perform an update on the source to refresh the latency reading. |
| DTS task failure | If a DTS task fails, DTS support will try to restore it within 8 hours. During restoration, the task may be restarted and task parameters may be modified. Database parameters are not modified. |
Create a synchronization task
Step 1: Open the Data Synchronization page
Use one of the following methods:
DTS console
- Log on to the DTS console.
- In the left-side navigation pane, click Data Synchronization.
- In the upper-left corner, select the region where the synchronization task will run.
DMS console
The steps may vary depending on the DMS console mode and layout. See Simple mode and Customize the layout and style of the DMS console.
- Log on to the DMS console.
- In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.
- From the drop-down list next to Data Synchronization Tasks, select the region where the synchronization instance resides.
Step 2: Configure source and destination databases
Click Create Task, then configure the parameters below.
| Section | Parameter | Description |
|---|---|---|
| N/A | Task Name | Enter a descriptive name to identify the task. The name does not need to be unique. DTS generates a name automatically if you skip this field. |
| Source Database | Select Existing Connection | If the source instance is already registered with DTS, select it from the drop-down list. DTS auto-fills the remaining parameters. Otherwise, configure the following parameters manually. In the DMS console, select the instance from Select a DMS database instance. |
| | Database Type | Select MongoDB. |
| | Access Method | Select Alibaba Cloud Instance. |
| | Instance Region | Select the region where the source ApsaraDB for MongoDB instance resides. |
| | Replicate Data Across Alibaba Cloud Accounts | Select No to use an instance under the current Alibaba Cloud account. |
| | Architecture | Select the architecture of the source instance: Replica Set or Sharded Cluster. If you select Sharded Cluster, also configure Shard account and Shard password. |
| | Migration Method | Select how DTS reads incremental data from the source: Oplog (recommended when oplog is enabled) or ChangeStream (for MongoDB 4.0+ or Amazon DocumentDB non-elastic clusters). If Sharded Cluster is selected for Architecture and ChangeStream for Migration Method, shard credentials are not required. |
| | Instance ID | Select the ID of the source ApsaraDB for MongoDB instance. |
| | Authentication Database | Enter the name of the authentication database. The default is admin. |
| | Database Account | Enter the database account. See Required permissions. |
| | Database Password | Enter the password for the database account. |
| | Encryption | Select Non-encrypted, SSL-encrypted, or Mongo Atlas SSL based on your requirements. Available options depend on Access Method and Architecture; the options displayed in the DTS console take precedence. Note: SSL-encrypted is unavailable if Architecture is Sharded Cluster and Migration Method is Oplog. For self-managed MongoDB replica sets with SSL-encrypted selected, upload a CA certificate to verify the connection. |
| Destination Database | Select Existing Connection | If the destination instance is already registered with DTS, select it from the drop-down list. DTS auto-fills the remaining parameters. Otherwise, configure the following parameters manually. In the DMS console, select the instance from Select a DMS database instance. |
| | Database Type | Select AnalyticDB for PostgreSQL. |
| | Access Method | Select Alibaba Cloud Instance. |
| | Instance Region | Select the region where the destination AnalyticDB for PostgreSQL instance resides. |
| | Instance ID | Select the ID of the destination AnalyticDB for PostgreSQL instance. |
| | Database Name | Enter the name of the database in the destination instance that will receive the synchronized data. |
| | Database Account | Enter the database account. See Required permissions. |
| | Database Password | Enter the password for the database account. |
Step 3: Test connectivity
Click Test Connectivity and Proceed at the bottom of the page.
DTS automatically adds its CIDR blocks to the security settings of the source and destination databases. For more information, see Add the CIDR blocks of DTS servers.
If the source or destination database is self-managed and Access Method is not Alibaba Cloud Instance, click Test Connectivity in the CIDR Blocks of DTS Servers dialog box.
Step 4: Configure objects to synchronize
In the Configure Objects step, set the following parameters.
| Parameter | Description |
|---|---|
| Synchronization Types | By default, Incremental Data Synchronization is selected. Optionally select Full Data Synchronization. Schema Synchronization is not supported. |
| DDL and DML Operations to Be Synchronized | Select the DDL and DML operations to synchronize during incremental data synchronization at the instance level. To configure at the collection level, right-click a collection in the Selected Objects section. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (default): The precheck verifies that no destination tables share names with source collections. If duplicate names exist, the task cannot start. Use object name mapping to rename objects. See Map object names. Ignore Errors and Proceed: Skips the duplicate-name precheck. Warning: This may cause data inconsistency. During full data synchronization, if a record with the same primary key exists in the destination, DTS keeps the existing destination record. During incremental data synchronization, DTS overwrites the existing destination record. |
| Source Objects | Select the collections to synchronize in the Source Objects section and move them to the Selected Objects section. |
| Selected Objects | Configure database name mapping, table name mapping, and field mapping as described in the steps below. |
Configure database name mapping:
- In Selected Objects, right-click the database that contains the collections to synchronize.
- Change Database Name to the name of the schema in the destination AnalyticDB for PostgreSQL instance.
- (Optional) In Select DDL and DML Operations to Be Synchronized, select the operations to synchronize during incremental data synchronization.
- Click OK.
Configure table name mapping:
- In Selected Objects, right-click a collection.
- Change Table Name to the name of the table in the destination instance.
- (Optional) Specify filter conditions to synchronize a subset of the data. See Specify filter conditions.
- (Optional) In Select DDL and DML Operations to Be Synchronized, select the operations to synchronize during incremental data synchronization.
Configure field mapping:
DTS generates a default bson_value() expression for each field in the collection. Review and update the mapping as follows.
- In the Assign Value column, the bson_value() expression shows the source field name in quotes. For example, bson_value("age") maps the age field from MongoDB.
- (Optional) Click the remove icon next to a row to delete fields that do not need to be synchronized.
- Configure the mapping based on whether the default expression fits:
  Fields with compliant expressions
  - Set Column Name to the name of the destination column.
  - Select the Type for each column. Make sure the type is compatible with the source field.
  - (Optional) Set Length and Precision.
  - Repeat for all columns.
  Fields with non-compliant expressions (for example, nested fields with parent-child relationships)
  - Click the edit icon next to the row.
  - Click + Add Column.
  - Set Column Name, Type, Length, and Precision.
  - Enter the bson_value() expression in Assign Value. For nested fields, specify each level of the hierarchy (see Field mapping example). Important: Assign bson_value("_id") to the primary key column of the destination table, and specify the full field path in the bson_value() expression for each column. Specifying only the top-level field (for example, bson_value("person")) prevents DTS from writing incremental data to the subfields.
  - Repeat for all columns.
- Click OK.
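The conflict-handling behavior of Ignore Errors and Proceed described in this step (full sync keeps the existing destination row, incremental sync overwrites it) can be modeled with a tiny sketch. This is an illustration of the documented semantics, not DTS code:

```python
def apply_record(dest: dict, key, record: dict, phase: str) -> None:
    """Model of DTS behavior under Ignore Errors and Proceed when the
    destination already holds a row with the same primary key:
    full sync keeps the existing destination row; incremental sync
    overwrites it."""
    if key in dest and phase == "full":
        return  # existing destination record wins during full sync
    dest[key] = record  # incremental change (or new key) is written

dest = {1: {"name": "existing"}}
apply_record(dest, 1, {"name": "from_source"}, phase="full")
print(dest[1]["name"])  # existing
apply_record(dest, 1, {"name": "from_source"}, phase="incremental")
print(dest[1]["name"])  # from_source
```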
Step 5: Configure advanced settings
Click Next: Advanced Settings to configure the following parameters.
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS uses the shared cluster. To improve stability, purchase a dedicated cluster. See What is a DTS dedicated cluster. |
| Retry Time for Failed Connections | The time range during which DTS retries failed connections. Valid values: 10–1,440 minutes. Default: 720 minutes. Set this to at least 30 minutes. If the same source or destination is shared across multiple tasks, the shortest configured retry time takes precedence. DTS charges for the instance during the retry period. |
| Retry Time for Other Issues | The time range during which DTS retries failed DDL or DML operations. Valid values: 1–1,440 minutes. Default: 10 minutes. Set this to at least 10 minutes. This value must be smaller than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Synchronization | Limit resource usage during full data synchronization by setting Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Synchronization is selected. |
| Only one data type for primary key _id in a table of the data to be synchronized | Specify whether the _id field has a single data type across all documents in a collection. Yes: DTS skips the data type scan and synchronizes only documents of one _id type per collection. No: DTS scans all _id data types and synchronizes all documents. Available only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Limit resource usage during incremental data synchronization by setting RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s). |
| Environment Tag | Tag the DTS instance with an environment label. Optional. |
| Configure ETL | Enable the extract, transform, and load (ETL) feature to apply data transformations during synchronization. See What is ETL? and Configure ETL in a data migration or data synchronization task. |
| Monitoring and Alerting | Configure alerts to notify contacts when a task fails or synchronization latency exceeds a threshold. See Configure monitoring and alerting when you create a DTS task. |
Step 6: Run the precheck
Click Next: Save Task Settings and Precheck.
To view the API parameters for this task configuration, move the pointer over the button and click Preview OpenAPI parameters.
DTS runs a precheck before starting the task. The task starts only after the precheck passes.
If the precheck fails, click View Details next to each failed item, resolve the issue, and run the precheck again.
If the precheck triggers an alert: if the alert cannot be ignored, resolve it and rerun. If it can be ignored, click Confirm Alert Details > Ignore > OK, then click Precheck Again.
Step 7: Purchase and start the instance
- Wait until the Success Rate reaches 100%, then click Next: Purchase Instance.
- On the buy page, configure the following parameters.
| Section | Parameter | Description |
|---|---|---|
| New Instance Class | Billing Method | Subscription: Pay upfront for a fixed period. More cost-effective for long-term use. Pay-as-you-go: Billed hourly. Suitable for short-term use. Release the instance when no longer needed to stop charges. |
| | Resource Group Settings | Select the resource group for the instance. Default: default resource group. See What is Resource Management? |
| | Instance Class | Select an instance class based on the required synchronization speed. See Instance classes of data synchronization instances. |
| | Subscription Duration | Specify the subscription duration and number of instances to create. Available for 1–9 months, or 1, 2, 3, or 5 years. Available only for the Subscription billing method. |
- Read and select Data Transmission Service (Pay-as-you-go) Service Terms.
- Click Buy and Start. In the dialog box, click OK.
The task appears in the task list. You can monitor synchronization progress there.
Field mapping example
This example shows how to map fields from a nested MongoDB document to flat columns in AnalyticDB for PostgreSQL.
Source document structure:
{
"_id": "62cd344c85c1ea6a2a9f****",
"person": {
"name": "neo",
"age": 26,
"sex": "male"
}
}
Destination table schema:
| Column name | Type | Notes |
|---|---|---|
| mongo_id | varchar | Primary key column |
| person_name | varchar | |
| person_age | decimal |
Field mapping configuration in DTS:
| Column name | Type | Assign Value |
|---|---|---|
| mongo_id | STRING | bson_value("_id") |
| person_name | STRING | bson_value("person","name") |
| person_age | DECIMAL | bson_value("person","age") |
After synchronization, the destination table contains:
| mongo_id | person_name | person_age |
|---|---|---|
| 62cd344c85c1ea6a2a9f**** | neo | 26 |
Always specify the full field path in bson_value() for nested fields. Using bson_value("person") instead of bson_value("person","name") maps the entire person object as a single value and prevents DTS from writing incremental updates to individual subfields.
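The path semantics of bson_value() can be modeled as successive key lookups on the document. The helper below is an illustration of that behavior, not a DTS API:

```python
doc = {
    "_id": "62cd344c85c1ea6a2a9f****",
    "person": {"name": "neo", "age": 26, "sex": "male"},
}

def bson_value(document: dict, *path: str):
    """Walk one key per path element, mirroring how
    bson_value("person","name") addresses a nested field."""
    value = document
    for key in path:
        value = value[key]
    return value

# Flatten the nested document into the destination row from the example.
row = {
    "mongo_id": bson_value(doc, "_id"),
    "person_name": bson_value(doc, "person", "name"),
    "person_age": bson_value(doc, "person", "age"),
}
print(row)
# {'mongo_id': '62cd344c85c1ea6a2a9f****', 'person_name': 'neo', 'person_age': 26}
```

By contrast, bson_value(doc, "person") returns the whole nested object as one value, which is why a full path is required for each destination column.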