Data Transmission Service (DTS) supports two-way data synchronization between two ApsaraDB for MongoDB sharded cluster instances. This is typically used for active geo-redundancy (unit-based) and geo-disaster recovery architectures, where both instances serve live traffic and must stay in sync.
A two-way synchronization instance consists of a forward task and a reverse task. If you include the same object in both tasks when you configure or reset the instance, the following rules apply: only one of the tasks can synchronize both the full data and the incremental data of that object; the other task synchronizes only incremental data. Data from the source of a task is synchronized only to the destination of that same task, which prevents synchronization loops.
DTS supports two-way synchronization only between two ApsaraDB for MongoDB sharded cluster instances. It does not support synchronization among three or more instances, or between instances with different architectures.
Disable the MongoDB balancer on the source instance before full data synchronization begins. Do not re-enable it until full data synchronization is complete and incremental synchronization has started. An active balancer during full sync can cause data inconsistency. See Manage the ApsaraDB for MongoDB balancer.
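As a rough sketch of the balancer step above, the following helper stops the balancer and confirms that it is off before full data synchronization starts. It assumes a pymongo-style handle for the admin database of a mongos endpoint (any object that exposes `command(name)`). `balancerStop` and `balancerStatus` are standard MongoDB administrative commands, but whether your ApsaraDB for MongoDB account may run them directly is an assumption, so treat Manage the ApsaraDB for MongoDB balancer as the authoritative procedure.

```python
def stop_balancer_and_verify(admin_db):
    """Stop the balancer and return True if balancerStatus then reports it off.

    `admin_db` is assumed to behave like a pymongo Database for the `admin`
    database on a mongos node; it only needs a `command(name)` method.
    """
    admin_db.command("balancerStop")
    status = admin_db.command("balancerStatus")
    # balancerStatus reports mode "off" when the balancer is disabled.
    return status.get("mode") == "off"
```

Run the same check again before you re-enable the balancer, and re-enable it only after incremental synchronization has started.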
Prerequisites
Before you begin, ensure that you have:
Instance setup:
Created both the source and destination ApsaraDB for MongoDB sharded cluster instances
Assigned endpoints to all shards in both instances (including the instance that acts as source in the reverse direction), with the same account and password across shards. See Apply for an endpoint for a shard
Confirmed that the destination instance has at least 10% more storage than the total data size of the source instance (recommended)
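The storage recommendation above is simple arithmetic. A minimal sketch, assuming you already know the source data size and the destination capacity in bytes (the function and parameter names are illustrative, not part of DTS):

```python
def has_storage_headroom(source_data_bytes, destination_capacity_bytes, headroom_percent=10):
    """Return True if the destination capacity is at least `headroom_percent`
    (10 by default) larger than the source data size.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    return destination_capacity_bytes * 100 >= source_data_bytes * (100 + headroom_percent)
```

For example, a source holding 500 GB of data calls for a destination with at least 550 GB of storage.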
Parameter configuration (required for both instances):
Set replication.oplogGlobalIdEnabled to true on all shard and config server nodes of both the source and destination instances. See Configure database parameters.
If replication.oplogGlobalIdEnabled is not set to true, the precheck fails with the error two-way mongo must have gid.
Sharding and balancer setup (plan before starting the forward task):
How you prepare the destination instance depends on whether you use DTS schema synchronization:
Without schema synchronization (you deselect Schema Synchronization in the Configure Objects step): Manually create the target databases and collections, configure data sharding, enable the balancer, and perform pre-sharding in the destination instance before starting the task. See Configure sharding to maximize the performance of shards.
With schema synchronization (you select Schema Synchronization in the Configure Objects step): Enable the balancer and perform pre-sharding after schema synchronization completes.
Pre-sharding the destination instance ensures that synchronized data is distributed evenly across shards, maximizing sharded cluster performance and preventing data skew.
For information about supported database versions, see Overview of data synchronization scenarios.
Before configuring the reverse task, plan the following:
The following constraints apply specifically to the reverse synchronization task. Review them before you start the forward task to avoid having to reconfigure later:
The source instance in the reverse direction is the destination in the forward direction, and vice versa. Confirm you have the correct instance IDs, accounts, and passwords for both directions.
Do not use the object name mapping feature for the reverse task. Using it can cause data inconsistency.
The objects selected for the reverse task cannot overlap with objects already in the Selected Objects list of the forward task.
The reverse task ignores DDL operations.
The Instance Region values for source and destination cannot be changed in the reverse task.
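The no-overlap rule for reverse task objects can be checked before you open the console. This is only a sketch: the object names are hypothetical, and it assumes you track the selected objects of each task as lists of `database.collection` strings.

```python
def overlapping_objects(forward_selected, reverse_selected):
    """Return the sorted names that appear in both the forward and reverse selections."""
    return sorted(set(forward_selected) & set(reverse_selected))


# Hypothetical selections for illustration only.
forward = ["shop.orders", "shop.customers"]
reverse = ["shop.refunds", "shop.orders"]
conflicts = overlapping_objects(forward, reverse)
```

An empty result means the planned reverse selection does not collide with the forward task.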
Limitations
Source and destination database constraints
| Constraint | Details |
|---|---|
| Bandwidth | The source server must have enough outbound bandwidth. Low bandwidth reduces synchronization speed. |
| Collection uniqueness | Collections must have PRIMARY KEY or UNIQUE constraints with all fields unique. Without this, the destination may contain duplicate records. |
| _id field | The _id field must be unique across all documents in a collection. Duplicate _id values cause data inconsistency. |
| Collection count limit | When synchronizing collections (not entire databases) with object renaming enabled, a single task supports up to 1,000 collections. For more collections, configure multiple tasks or synchronize at the database level. |
| Document size | A single document cannot exceed 16 MB. Larger documents cause the task to fail. |
| Orphaned documents | Both source and destination instances must be free of orphaned documents before synchronization. Orphaned documents can cause inconsistency or task failure. See Glossary of MongoDB and How do I delete orphaned documents? |
| Architecture | Both instances must be ApsaraDB for MongoDB instances with identical architecture. Two-way synchronization is not supported for self-managed MongoDB or instances with mismatched architectures. |
| Scaling | Do not scale MongoDB sharded cluster instances involved in a running task. Scaling causes the task to fail. |
| Mongos node count | The source instance cannot have more than 10 mongos nodes. |
| TTL indexes | If the source has TTL indexes, data inconsistency may occur between source and destination after synchronization. |
| SRV endpoints | DTS cannot connect to a MongoDB instance over an SRV endpoint. |
| Balancer impact | If the source balancer is enabled during synchronization, the DTS task may experience delays. |
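When the collection count limit in the table above applies, one option is to split the collection list across several tasks. A minimal sketch, assuming collection names are plain strings (the helper name is illustrative):

```python
def plan_tasks(collections, max_per_task=1000):
    """Split collection names into consecutive groups of at most `max_per_task`."""
    return [
        collections[i:i + max_per_task]
        for i in range(0, len(collections), max_per_task)
    ]
```

With 2,500 collections, this yields three groups of 1,000, 1,000, and 500 names, one group per task. Synchronizing at the database level avoids the split altogether.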
Unsupported source databases:
Azure Cosmos DB for MongoDB clusters
Amazon DocumentDB elastic clusters
Oplog and change stream requirements:
The source database must have the oplog enabled, retaining at least 7 days of log data. Alternatively, change streams must be enabled so that DTS can subscribe to data changes from the last 7 days. If neither condition is met, DTS may fail to obtain data changes, which can result in task failure, data inconsistency, or data loss—issues not covered by the DTS service level agreement (SLA).
Use the oplog (not change streams) to record data changes when possible.
Change streams require MongoDB 4.0 or later and do not support two-way synchronization.
For non-elastic Amazon DocumentDB clusters, enable change streams and set Migration Method to ChangeStream and Architecture to Sharded Cluster.
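To check the 7-day oplog requirement, compare the oldest and newest oplog entry times on each shard. The sketch below assumes you have already obtained both timestamps (for example, from the first and last event times that `rs.printReplicationInfo()` reports in the MongoDB shell); the function name is illustrative.

```python
from datetime import datetime, timedelta


def oplog_window_ok(oldest_entry, newest_entry, required=timedelta(days=7)):
    """Return True if the oplog spans at least `required` (7 days by default)."""
    return newest_entry - oldest_entry >= required
```

Check every shard: the shard with the shortest window determines whether the requirement is met.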
Restricted operations on the source database during synchronization:
During schema synchronization or full data synchronization: do not alter database or collection schemas, including updating array types. Schema changes cause task failure or data inconsistency.
During full data synchronization only: do not write to the source database. Writes during full sync cause data inconsistency.
While the task is running: do not run shardCollection, reshardCollection, unshardCollection, moveCollection, or movePrimary on synchronized objects. These commands change data distribution and can cause inconsistency.
Other constraints
| Constraint | Details |
|---|---|
| Shard keys | Add shard keys to source data before starting the task. INSERT operations on synchronized data must include shard keys; UPDATE operations cannot modify shard keys. |
| Database version | The destination MongoDB version must be the same as or later than the source version. Older destination versions may cause compatibility issues. |
| Capped collections and unique indexes | Collections with unique indexes or the capped attribute support only single-thread writes and do not support concurrent replay during incremental sync. This may increase synchronization latency. |
| Excluded databases | DTS does not synchronize data from the admin, config, or local databases. |
| Transactions | Transaction information is not retained. Transactions are converted to individual records in the destination. |
| Conflict handling | If a primary key or unique key conflict occurs, DTS skips the conflicting write and retains the existing data in the destination collection. |
| Destination writes from other sources | Writing to the destination from sources outside DTS during synchronization causes inconsistency. For example, running online DDL statements via DMS while other sources write to the destination can cause data loss. |
| Destination storage overhead | Because DTS writes data concurrently to the destination, destination storage usage may be 5%–10% larger than the source data size. Full data synchronization also causes collection fragmentation in the destination, further increasing storage use. |
| Destination primary key | Make sure the destination does not already contain documents with the same _id values as the source. If such documents exist, delete them from the destination before starting the task—without interrupting the DTS service. |
| Count queries | Use db.$table_name.aggregate([{ $count:"myCount"}]) to query document counts on the destination MongoDB database. |
| Schema synchronization conflict with manual sharding | If data sharding is already configured on the destination and you do not need DTS to synchronize schemas, do not select Schema Synchronization. Doing so may cause inconsistency or task failure due to shard key conflicts. |
| Off-peak synchronization | Full data synchronization uses read and write resources from both source and destination. Run synchronization during off-peak hours to minimize performance impact. |
| Task failure recovery | If a task fails, DTS technical support attempts to restore it within 8 hours. During recovery, the task may restart and task parameters may be modified. Database parameters are not modified. |
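The count query from the Count queries row can also be issued from a driver. A minimal sketch, assuming `collection` behaves like a pymongo Collection whose `aggregate` method returns an iterable of result documents (the helper name is illustrative):

```python
def count_documents_via_aggregate(collection):
    """Count documents with the $count aggregation stage."""
    results = list(collection.aggregate([{"$count": "myCount"}]))
    # $count emits no result document when the collection is empty.
    return results[0]["myCount"] if results else 0
```

The empty-result branch matters: on an empty collection the pipeline returns nothing rather than a document with a zero count.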
Billing
| Synchronization type | Cost |
|---|---|
| Schema synchronization | Free |
| Full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview. |
How conflict detection works
To maintain data consistency, update records with the same primary key, business primary key, or unique key on only one instance at a time.
DTS automatically detects and resolves the following conflict types to maximize task stability:
| Operation | Conflict scenario | DTS behavior |
|---|---|---|
| INSERT | The record to insert conflicts with an existing record in the destination | Ignores the INSERT operation |
| UPDATE | The record to update does not exist in the destination, or conflicts with another record | Ignores the UPDATE operation |
| DELETE | The record to delete does not exist in the destination | Ignores the DELETE operation |
The conflict resolution policy is set to Ignore by default and cannot be changed.
System clock differences and synchronization latency between instances mean the conflict detection mechanism cannot prevent all conflicts. Always ensure that records with the same primary key, business primary key, or unique key are updated on only one instance.
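The Ignore policy in the table above can be illustrated with a small simulation. This is a simplified model rather than the DTS implementation: the destination is a plain dict keyed by primary key, and unique-key conflicts between different records are not modeled.

```python
def apply_with_ignore_policy(destination, op, key, value=None):
    """Apply one replicated operation to `destination` (a dict keyed by primary key).

    Returns True if the operation was applied and False if it was ignored.
    """
    if op == "INSERT":
        if key in destination:
            return False  # conflicting record exists: keep the destination data
        destination[key] = value
        return True
    if op == "UPDATE":
        if key not in destination:
            return False  # nothing to update
        destination[key] = value
        return True
    if op == "DELETE":
        if key not in destination:
            return False  # nothing to delete
        del destination[key]
        return True
    raise ValueError(f"unsupported operation: {op}")
```

Because an ignored INSERT leaves the existing destination record in place, the two instances can silently diverge, which is why the same key should be written on only one instance.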
Synchronization types
| Type | Description |
|---|---|
| Schema synchronization | Copies collection schemas from the source instance to the destination instance. |
| Full data synchronization | Copies all existing data from the source to the destination. Supported objects: databases and collections. |
| Incremental data synchronization | Continuously copies changes from the source to the destination after full sync completes. DTS does not synchronize incremental data from databases created after the task starts. |
Incremental data synchronization covers the following operations:
CREATE COLLECTION and CREATE INDEX
DROP COLLECTION and DROP INDEX
RENAME COLLECTION
Document inserts, updates, and deletes within a collection
File synchronization: only $set commands are supported
Configure two-way data synchronization
This procedure configures the DTS task before purchasing the DTS instance. In this flow, you do not need to specify the number of shards upfront. If you purchase the instance first, you must specify the shard count at purchase time.
Step 1: Open the Data Synchronization page
Use one of the following consoles to start:
DTS console:
Log in to the DTS console.
In the left navigation pane, click Data Synchronization.
In the upper-left corner, select the region where the synchronization instance will reside.
DMS console:
The exact navigation may vary based on your DMS console mode and layout. See Simple mode and Customize the layout and style of the DMS console.
Log in to the DMS console.
In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.
From the drop-down list next to Data Synchronization Tasks, select the target region.
Step 2: Create the task
Click Create Task to open the task configuration page.
Step 3: Configure source and destination databases
After configuring the source and destination databases, review the Limits displayed on the page before proceeding. Skipping this step can cause task failure or data inconsistency.
Configure the following parameters:
| Section | Parameter | Description |
|---|---|---|
| N/A | Task Name | A descriptive name for the task. DTS generates a default name. The name does not need to be unique. |
| Source Database | Select Existing Connection | If the source instance is registered with DTS, select it from the list and DTS populates the parameters automatically. Otherwise, configure the parameters below. In the DMS console, select from Select a DMS database instance. |
| Source Database | Database Type | Select MongoDB. |
| Source Database | Access Method | Select Alibaba Cloud Instance. |
| Source Database | Instance Region | The region where the source ApsaraDB for MongoDB instance resides. |
| Source Database | Replicate Data Across Alibaba Cloud Accounts | Select No (this example uses a single account). |
| Source Database | Architecture Type | Select Sharded Cluster. |
| Source Database | Migration Method | Select Oplog. |
| Source Database | Instance ID | The ID of the source ApsaraDB for MongoDB instance. |
| Source Database | Authentication Database | The authentication database for the source instance. The default is admin. |
| Source Database | Database Account | The source database account. Must have read permissions on the source, config, admin, and local databases. |
| Source Database | Database Password | The password for the database account. |
| Source Database | Shard account | The account for accessing shards in the source instance. |
| Source Database | Shard password | The password for the shard account. |
| Source Database | Encryption | Connection encryption for the source database: Non-encrypted, SSL-encrypted, or Mongo Atlas SSL. Available options depend on the Access Method and Architecture Type settings. |
| Destination Database | Select Existing Connection | If the destination instance is registered with DTS, select it from the list. Otherwise, configure the parameters below. |
| Destination Database | Database Type | Select MongoDB. |
| Destination Database | Access Method | Select Alibaba Cloud Instance. |
| Destination Database | Instance Region | The region where the destination ApsaraDB for MongoDB instance resides. |
| Destination Database | Replicate Data Across Alibaba Cloud Accounts | Select No. |
| Destination Database | Architecture Type | Select Sharded Cluster. |
| Destination Database | Instance ID | The ID of the destination ApsaraDB for MongoDB instance. |
| Destination Database | Authentication Database | The authentication database for the destination instance. The default is admin. |
| Destination Database | Database Account | The destination database account. Must have dbAdminAnyDatabase permission, read/write permissions on the destination database, and read permissions on the local database. |
| Destination Database | Database Password | The password for the database account. |
| Destination Database | Encryption | Connection encryption for the destination database: Non-encrypted, SSL-encrypted, or Mongo Atlas SSL. |
Note on encryption:
- SSL-encrypted is not available when Architecture Type is Sharded Cluster and Migration Method is Oplog for an ApsaraDB for MongoDB source.
- SSL-encrypted is not available when Architecture Type is Sharded Cluster for an ApsaraDB for MongoDB destination.
- For self-managed MongoDB with Replica Set architecture and SSL-encrypted selected, you can upload a CA certificate to verify the connection.
Step 4: Test connectivity
Click Test Connectivity and Proceed at the bottom of the page.
DTS server CIDR blocks must be added to the security settings of both source and destination databases—either automatically or manually—before the test. See Add the CIDR blocks of DTS servers.
Step 5: Configure objects to synchronize
In the Configure Objects step, set the following parameters:
| Parameter | Description |
|---|---|
| Synchronization Types | Select Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization. After the precheck, DTS synchronizes existing data first, then continuously synchronizes incremental changes. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors: checks whether the destination contains collections with the same names as the source. If name conflicts exist, the precheck fails and the task cannot start. To resolve name conflicts without deleting destination collections, use object name mapping. Ignore Errors and Proceed: skips the name conflict check. If records in the destination share the same primary key or unique key as source records, DTS retains the destination records and does not overwrite them. Use this option with caution because it can cause data inconsistency: data may fail to initialize, only specific columns may be synchronized, or the data synchronization instance may fail. |
| Synchronization Topology | Select Two-way Synchronization. |
| Exclude DDL Operations | Yes: excludes DDL operations from synchronization. No: includes DDL operations. To maintain two-way stability, DTS synchronizes DDL operations only in the forward direction. |
| Conflict Resolution Policy | In this scenario, only Ignore is supported. DTS automatically ignores conflicting statements and retains existing destination data. |
| Source Objects | Select the databases or collections to synchronize, then click the icon that adds them to Selected Objects. |
| Selected Objects | To rename an object in the destination or configure a filter, right-click the object. To remove an object, click it and then click the icon that removes it from Selected Objects. |
If you use object name mapping to rename a database or collection in the destination, other objects that depend on it may fail to synchronize.
To filter data in a collection, right-click it in Selected Objects and configure filter conditions. Filters apply only during full data synchronization, not incremental synchronization. See Specify filter conditions.
Step 6: Configure advanced settings
Click Next: Advanced Settings and configure the following:
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS uses a shared cluster. For higher stability, purchase a dedicated cluster. See What is a DTS dedicated cluster. |
| Retry Time for Failed Connections | How long DTS retries failed connections after the task starts. Range: 10–1,440 minutes. Default: 720 minutes. We recommend that you set this parameter to a value greater than 30 minutes. If multiple tasks share the same source or destination, the shortest retry window takes precedence. DTS charges apply during retries. |
| Retry Time for Other Issues | How long DTS retries failed DDL or DML operations. Range: 1–1,440 minutes. Default: 10 minutes. We recommend that you set this parameter to a value greater than 10 minutes. Must be less than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Synchronization | Limits read/write load during full sync by configuring QPS (queries per second) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Synchronization is selected. |
| Only one data type for primary key `_id` in a table of the data to be synchronized | Yes: DTS assumes _id has a single data type per collection and skips scanning. Synchronizes only data of that type within each collection. No: DTS scans all _id data types and synchronizes everything. Set based on your actual data. Incorrect settings may cause data loss. Available only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Limits load during incremental sync by configuring RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s). |
| Environment Tag | An optional tag to identify the DTS instance. |
| Configure ETL | Yes: enables the extract, transform, and load (ETL) feature. Enter data processing statements in the code editor. See Configure ETL. No: skips ETL. |
| Monitoring and Alerting | No: no alerts. Yes: configures alerts for task failure or synchronization latency exceeding a threshold. See Configure monitoring and alerting. |
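The two retry settings in the table above have documented ranges and an ordering constraint, which can be validated before you enter them. A minimal sketch (the function name is illustrative, and the console performs its own validation):

```python
def validate_retry_settings(connection_retry_min, other_retry_min):
    """Return a list of problems; an empty list means both values fit the documented rules."""
    problems = []
    if not 10 <= connection_retry_min <= 1440:
        problems.append("Retry Time for Failed Connections must be 10-1440 minutes")
    if not 1 <= other_retry_min <= 1440:
        problems.append("Retry Time for Other Issues must be 1-1440 minutes")
    if other_retry_min >= connection_retry_min:
        problems.append(
            "Retry Time for Other Issues must be less than Retry Time for Failed Connections"
        )
    return problems
```

The defaults (720 and 10 minutes) pass all three checks.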
Step 7: Configure data verification (optional)
Click Next Step: Data Verification to configure data verification for the task. See Configure a data verification task.
Step 8: Run the precheck and purchase an instance
Click Next: Save Task Settings and Precheck. To preview the API parameters for this configuration, move the pointer over the button and click Preview OpenAPI parameters before proceeding.
Wait for the precheck to complete.
- The task cannot start until it passes the precheck.
- If the precheck fails, click View Details next to each failed item, fix the issues, and rerun the precheck.
- If an alert item appears and can be safely ignored, click Confirm Alert Details > Ignore > OK, then click Precheck Again. Ignoring alerts may increase the risk of data inconsistency.
When Success Rate reaches 100%, click Next: Purchase Instance.
On the purchase page, configure the following:
| Section | Parameter | Description |
|---|---|---|
| New Instance Class | Billing Method | Subscription: billed upfront for a fixed term; more cost-effective for long-term use. Pay-as-you-go: billed hourly; suitable for short-term use. Release the instance when no longer needed to stop charges. |
| New Instance Class | Resource Group Settings | The resource group for the instance. Default: default resource group. See What is Resource Management? |
| New Instance Class | Instance Class | The synchronization speed tier. See Instance classes of data synchronization instances. |
| New Instance Class | Subscription Duration | Available for Subscription billing only. Choose 1–9 months, 1 year, 2 years, 3 years, or 5 years. |
Read and accept the Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start, then click OK in the confirmation dialog.
The forward synchronization task appears in the task list. Wait until its status changes to Running before proceeding.
Step 9: Configure the reverse synchronization task
Before configuring the reverse task, confirm that the forward task status is Running. Also review the planning notes in the Prerequisites section above—some constraints require decisions made before you start the forward task.
In the task list, find the reverse synchronization task and click Configure Task.
Follow steps 3 through 8 to configure source/destination databases, objects, advanced settings, and the precheck for the reverse task.
When DTS checks for conflicting tables in the reverse direction, it ignores tables already synchronized by the forward task.
When Success Rate reaches 100%, click Back.
Verify the configuration
After both tasks are configured, wait until both the forward and reverse tasks enter the Running state. Two-way data synchronization is now active.
What's next
Monitor synchronization latency and task health in the DTS console.
To adjust task parameters after the task starts, see Modify the parameters of a DTS instance.
To manage the MongoDB balancer during ongoing synchronization, see Manage the ApsaraDB for MongoDB balancer.