Data Transmission Service: Synchronize data from a MongoDB instance (without a shard key) to a MongoDB instance (sharded cluster architecture)

Last Updated: Mar 30, 2026

Data Transmission Service (DTS) supports synchronizing data from a MongoDB instance without shard keys to a sharded cluster instance. During task configuration, you can assign default shard key values to collections that lack shard key fields.

This topic uses ApsaraDB for MongoDB (replica set) as the source and ApsaraDB for MongoDB (sharded cluster) as the destination.

Prerequisites

Before you begin, make sure that you have:

Pre-configuring data sharding prevents all data from landing on a single shard, which would degrade cluster performance. Enabling the Balancer and performing pre-sharding prevents data skew.
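
As a rough illustration of pre-sharding, the sketch below enables sharding on a destination database and shards one collection before the task starts. It assumes the `sh` helper that mongosh provides when connected to a mongos. The database, collection, and field names are placeholders, and the hashed key is only one possible choice, not a recommendation from this topic.

```javascript
// Sketch only: pre-shard a destination collection so that incoming data
// is spread across shards instead of landing on a single one.
// `sh` is the sharding helper exposed by mongosh; all names are placeholders.
function preShardCollection(sh, dbName, collName, shardKeyField) {
  const namespace = `${dbName}.${collName}`;
  // Assumption: a hashed shard key is used here for illustration only.
  const key = { [shardKeyField]: "hashed" };
  sh.enableSharding(dbName);
  sh.shardCollection(namespace, key);
  return { namespace, key };
}

// In mongosh, connected to a mongos of the destination:
// preShardCollection(sh, "mydb", "orders", "customerId");
// sh.getBalancerState(); // true when the Balancer is enabled
```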

If the source is a sharded cluster ApsaraDB for MongoDB instance, also make sure that:

For supported source and destination versions, see Synchronization solution overview.

Limitations

Important

DTS does not automatically check whether your configuration complies with the following limitations. Review all applicable constraints before you start the task. Running a task in violation of these limitations may cause data inconsistency, task failure, or data loss — none of which are covered by the DTS Service-Level Agreement (SLA).

Source database

| Constraint | Details |
| --- | --- |
| Bandwidth | The server hosting the source database must have sufficient outbound bandwidth, or synchronization performance will be affected. |
| Primary keys | Collections to be synchronized must have primary keys or UNIQUE constraints, with all fields unique. Without this, duplicate data may appear in the destination. |
| Collection limit | When the synchronization object is a collection and requires name mapping, a single task supports up to 1,000 collections. If your collection count exceeds this limit, split collections across multiple tasks or synchronize at the database level instead. |
| Single document size | A single document being synchronized cannot exceed 16 MB. Exceeding this limit causes the task to fail. |
| Mongos nodes | If the source is a sharded cluster, the number of Mongos nodes cannot exceed 10. |
| Unsupported sources | Azure Cosmos DB for MongoDB clusters and Amazon DocumentDB elastic clusters are not supported as sources. |
| TTL indexes | Collections with time-to-live (TTL) indexes cannot be synchronized. Synchronizing such collections may cause data inconsistency in the destination. |
| Orphaned documents | Make sure no orphaned documents exist in a source sharded cluster. These can cause data inconsistency or task failure. See Orphaned documents and How to clean orphaned documents in MongoDB sharded cluster architecture. |
| Balancer activity | If the source is a sharded cluster with an active Balancer performing data migration, instance latency may increase. |
| oplog / change streams | The source database must have oplog enabled with at least 7 days of log retention, or have change streams enabled. This lets DTS subscribe to data changes. If this requirement is not met, DTS may fail to obtain changes, leading to synchronization failure, data inconsistency, or data loss. |

Important

Use oplog to record data changes in the source database. oplog provides faster log pulling and lower latency than change streams. Change streams are available only on MongoDB 4.0 and later, and two-way synchronization is not supported when using change streams. If the source is a non-elastic Amazon DocumentDB cluster, you must use change streams and set Migration Method to ChangeStream and Architecture to Sharded Cluster.
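
Two of the source constraints above can be checked from mongosh before you configure the task. The helpers below are a minimal sketch, not part of DTS: one filters the output of `getIndexes()` for TTL indexes, and the other compares the oplog window reported by `db.getReplicationInfo()` with a required number of days. The oplog window is the time span currently held in the oplog, which is an approximation of retention rather than a configured value. The collection name in the usage comment is a placeholder.

```javascript
// Sketch only: pre-flight checks for two source constraints.

// Returns the indexes that carry a TTL, given the array that
// db.<collection>.getIndexes() returns in mongosh.
function findTtlIndexes(indexes) {
  return indexes.filter((ix) => typeof ix.expireAfterSeconds === "number");
}

// Returns true when the oplog window covers at least `days` days.
// `replInfo` is the object returned by db.getReplicationInfo() in mongosh;
// its timeDiffHours field is the span between the first and last oplog entries.
function oplogCoversDays(replInfo, days) {
  return replInfo.timeDiffHours >= days * 24;
}

// In mongosh, connected to the source:
// findTtlIndexes(db.getCollection("sessions").getIndexes());
// oplogCoversDays(db.getReplicationInfo(), 7);
```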

Operations during synchronization

Do not perform the following operations during synchronization, as they may cause task failure or data inconsistency:

  • Changing database or collection schemas (including array type updates) during schema synchronization or full data synchronization.

  • Writing new data to the source when running full data synchronization only. To maintain real-time data consistency, select schema synchronization, full data synchronization, and incremental data synchronization together.

  • Writing data to the destination from any source other than DTS.

Other limitations

  • Newly added collections on the source cannot have shard key default values configured during synchronization.

  • When the source is a non-sharded cluster, the task automatically enters the Configure Database and Table Fields stage to set shard key default values.

  • Shard key default values set in the Configure Database and Table Fields stage take effect only when the destination version is lower than 4.4. For destination instances running version 4.4 or later, DTS writes the original data as-is.

  • Synchronize from a lower version to an equal or higher version to avoid compatibility issues.

  • Synchronization of data in the admin and local databases is not supported.

  • Collections with a unique index or with capped: true support only single-thread writes and cannot use concurrent replay during incremental synchronization. This may increase synchronization latency.

  • Transaction information is not retained. Transactions in the source are converted to individual records in the destination.

  • Before you synchronize data, evaluate the performance of the source and destination databases. Run the synchronization during off-peak hours to reduce the load on both databases.

  • Because full data synchronization performs INSERT operations concurrently, it causes fragmentation in destination collections. After full synchronization completes, the destination storage space will be 5–10% larger than the source.

  • DTS automatically attempts to recover failed tasks for up to 7 days. Before switching traffic to the destination, end or release the DTS task, or revoke DTS write access to the destination — otherwise the auto-recovery may overwrite destination data.

  • To query the document count in the destination MongoDB instance, run db.$table_name.aggregate([{ $count: "myCount" }]), replacing $table_name with the collection name.

  • Make sure the destination has no documents sharing the same primary key (_id) with the source. If duplicates exist, delete the conflicting documents from the destination before starting the task.

  • If a DTS task fails, DTS support will attempt to restore it within 8 hours and may restart the task or modify task parameters (not database parameters). For parameters that may be modified, see Modify instance parameters.
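
The document-count query from the list above can be wrapped in a small helper. This is a sketch that assumes `db` is a mongosh database object connected to the destination; the collection name in the usage comment is a placeholder. The `$count` stage emits no document when the collection is empty, so the helper returns 0 in that case.

```javascript
// Sketch only: count documents with the aggregation shown in this topic.
// `db` is assumed to be a mongosh database object for the destination.
function countDocuments(db, collectionName) {
  const rows = db
    .getCollection(collectionName)
    .aggregate([{ $count: "myCount" }])
    .toArray();
  // $count returns no result for an empty input, so fall back to 0.
  return rows.length === 0 ? 0 : rows[0].myCount;
}

// In mongosh: countDocuments(db, "orders");
```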

Self-managed MongoDB

If the source is a self-managed MongoDB instance:

  • A primary-secondary switch during synchronization will cause the task to fail.

  • If the source database has not had any write operations for an extended period, the displayed synchronization latency may be inaccurate. Perform an update operation on the source to refresh the latency. Alternatively, if the synchronization object is the entire database, configure a heartbeat that writes data to the source every second.
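
One way to perform the update operation mentioned above is a small upsert against a dedicated collection. The sketch below assumes a mongosh `db` object connected to the source. The collection name `dts_heartbeat` and the document shape are hypothetical, and how often you run it is up to you.

```javascript
// Sketch only: a single heartbeat write against the source.
// The collection name "dts_heartbeat" and the document shape are hypothetical.
function writeHeartbeat(db, now) {
  return db.getCollection("dts_heartbeat").updateOne(
    { _id: "heartbeat" },
    { $set: { ts: now } },
    { upsert: true } // create the document on the first call
  );
}

// In mongosh, connected to the source: writeHeartbeat(db, new Date());
```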

Billing

| Synchronization type | Fee |
| --- | --- |
| Schema synchronization and full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview. |

Synchronization types

| Type | What DTS synchronizes |
| --- | --- |
| Schema synchronization | Schemas of selected objects from source to destination |
| Full data synchronization | Historical data of selected objects (databases and collections) |
| Incremental data synchronization | Ongoing data changes from source to destination |

Choose between oplog and change streams

DTS supports two methods for capturing incremental changes. Use oplog unless your source requires change streams.

| Method | Supported operations | When to use |
| --- | --- | --- |
| oplog (recommended) | CREATE COLLECTION, CREATE INDEX, DROP DATABASE, DROP COLLECTION, DROP INDEX, RENAME COLLECTION, INSERT, UPDATE, DELETE | Default for self-managed MongoDB and ApsaraDB for MongoDB. Lower latency due to faster log pulling. |
| change streams | DROP DATABASE, DROP COLLECTION, RENAME COLLECTION, INSERT, UPDATE, DELETE | Required for non-elastic Amazon DocumentDB clusters. Available on MongoDB 4.0+. Does not support two-way synchronization. |

For incremental updates to documents, DTS synchronizes only updates made with the $set command. DTS does not synchronize incremental data from databases created after the task starts.
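
The difference between the two methods can be expressed as a simple lookup. The sketch below copies the operation lists from this topic into a table and reports which operations a workload relies on that the chosen method does not synchronize. The function and key names are illustrative and are not DTS parameters.

```javascript
// Sketch only: operation lists copied from the comparison in this topic.
const SUPPORTED_OPERATIONS = {
  oplog: [
    "CREATE COLLECTION", "CREATE INDEX", "DROP DATABASE", "DROP COLLECTION",
    "DROP INDEX", "RENAME COLLECTION", "INSERT", "UPDATE", "DELETE",
  ],
  changeStream: [
    "DROP DATABASE", "DROP COLLECTION", "RENAME COLLECTION",
    "INSERT", "UPDATE", "DELETE",
  ],
};

// Returns the operations in `operations` that `method` does not synchronize.
function unsupportedOperations(method, operations) {
  const list = SUPPORTED_OPERATIONS[method];
  if (!list) throw new Error(`unknown method: ${method}`);
  const supported = new Set(list);
  return operations.filter((op) => !supported.has(op));
}

// Example: a workload that creates indexes while the task is running.
// unsupportedOperations("changeStream", ["INSERT", "CREATE INDEX"]);
```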

Required permissions

| Database | Required permissions | How to grant |
| --- | --- | --- |
| Source ApsaraDB for MongoDB | Read permissions for the databases to be synchronized, and for the config, admin, and local databases | Manage MongoDB database users using DMS |
| Destination ApsaraDB for MongoDB | dbAdminAnyDatabase permission, readWrite permission for the destination database, read permission for the local database, read permission for the config database | Manage MongoDB database users using DMS |
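
For reference, the permissions above map onto MongoDB built-in roles roughly as follows. This is a sketch for instances where you manage users directly in mongosh; for ApsaraDB for MongoDB, use DMS as described above. The user name and database names are placeholders.

```javascript
// Sketch only: role documents that correspond to the permissions listed above.

// Read access to every synchronized database plus config, admin, and local.
function sourceRoles(syncDatabases) {
  const names = [...syncDatabases, "config", "admin", "local"];
  return [...new Set(names)].map((db) => ({ role: "read", db }));
}

// Roles for the destination account.
function destinationRoles(destinationDatabase) {
  return [
    { role: "dbAdminAnyDatabase", db: "admin" },
    { role: "readWrite", db: destinationDatabase },
    { role: "read", db: "local" },
    { role: "read", db: "config" },
  ];
}

// In mongosh, on an instance where you create users yourself:
// db.getSiblingDB("admin").createUser({
//   user: "dts_sync",            // placeholder user name
//   pwd: passwordPrompt(),
//   roles: sourceRoles(["mydb"]),
// });
```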

Configure and start the synchronization task

Step 1: Go to the Data Synchronization page

Use one of the following methods:

DTS console

  1. Log on to the DTS console.

  2. In the left-side navigation pane, click Data Synchronization.

  3. In the upper-left corner, select the region where the synchronization instance will reside.

DMS console

The actual operations may vary based on the mode and layout of the DMS console. For details, see Simple mode and Customize the layout and style of the DMS console.

  1. Log on to the DMS console.

  2. In the top navigation bar, move the pointer over Data + AI and choose DTS > Data Synchronization.

  3. From the drop-down list next to Data Synchronization Tasks, select the region where the synchronization instance will reside.

Step 2: Create the task and configure connections

  1. Click Create Task.

  2. If prompted, click New Configuration Page in the upper-right corner.

    Skip this step if Back to Previous Version is displayed instead. The new configuration page is recommended.
  3. Configure the source and destination databases using the following parameters.

Task settings

| Parameter | Description |
| --- | --- |
| Task Name | A descriptive name for the DTS task. DTS generates a name automatically, but a descriptive name helps with identification. The name does not need to be unique. |

Source database

| Parameter | Value for this example |
| --- | --- |
| Select Existing Connection | Optional. If you select an existing connection, DTS fills in the parameters automatically. To register a new database, see Manage database connections (DTS console) or Register an Alibaba Cloud database instance (DMS console). |
| Database Type | MongoDB |
| Access Method | Alibaba Cloud Instance |
| Instance Region | Region of the source ApsaraDB for MongoDB instance |
| Replicate Data Across Alibaba Cloud Accounts | No (same account) |
| Architecture | Replica Set. Select Sharded Cluster if the source uses a sharded cluster architecture. |
| Migration Method | Method for capturing incremental data. Select Oplog (recommended) or ChangeStream. See Choose between oplog and change streams. If Architecture is set to Sharded Cluster and Migration Method is set to Oplog, SSL encryption is unavailable for the source. |
| Instance ID | Instance ID of the source ApsaraDB for MongoDB |
| Authentication Database | The database the account belongs to. Default: admin |
| Database Account | Account with the required permissions |
| Database Password | Password for the account |
| Encryption | Select Non-encrypted, SSL-encrypted, or Mongo Atlas SSL. Available options depend on Access Method and Architecture. If the source uses a Replica Set architecture and is a self-managed database with SSL-encrypted selected, you can upload a certificate authority (CA) certificate to verify the connection. |

Destination database

| Parameter | Value for this example |
| --- | --- |
| Select Existing Connection | Optional. Same behavior as the source connection. |
| Database Type | MongoDB |
| Access Method | Alibaba Cloud Instance |
| Instance Region | Region of the destination ApsaraDB for MongoDB instance |
| Replicate Data Across Alibaba Cloud Accounts | No (same account) |
| Architecture | Sharded Cluster |
| Instance ID | Instance ID of the destination ApsaraDB for MongoDB |
| Authentication Database | The database the account belongs to. Default: admin |
| Database Account | Account with the required permissions |
| Database Password | Password for the account |
| Encryption | Select Non-encrypted, SSL-encrypted, or Mongo Atlas SSL. If the destination is an ApsaraDB for MongoDB sharded cluster, SSL-encrypted is unavailable. |

  4. Click Test Connectivity and Proceed.

    DTS server CIDR blocks must be added to the security settings of the source and destination databases. For details, see Add the CIDR blocks of DTS servers. If the source or destination is a self-managed database with an access method other than Alibaba Cloud Instance, click Test Connectivity in the CIDR Blocks of DTS Servers dialog box.

Step 3: Configure synchronization objects

  1. In the Configure Objects step, set the following parameters.

| Parameter | Description |
| --- | --- |
| Synchronization Types | Select Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization. Full data synchronization provides the historical baseline for incremental synchronization. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (recommended): fails the precheck if the destination has collections with names identical to source collections. Use the object name mapping feature to rename conflicting collections if they cannot be deleted or renamed. Ignore Errors and Proceed: skips the conflict check. If a destination record shares the same primary key or unique key as a source record, DTS retains the destination record and skips the source record. This may result in data inconsistency. |
| Synchronization Topology | One-way Synchronization |
| Capitalization of Object Names in Destination Instance | Default: DTS default policy. Change this if you need to match a specific capitalization convention. See Specify the capitalization of object names in the destination instance. |
| Source Objects | Select databases or collections to synchronize, then click the arrow icon to move them to Selected Objects. The synchronization granularity is database or collection. |
| Selected Objects | To rename a database in the destination: right-click a database under Selected Objects and update the Schema Name in the Edit Schema dialog box. To rename a collection: right-click the collection and update Table Name in the Edit Table dialog box. Object name mapping is available only at the collection level. If you use object name mapping, dependent objects may fail to synchronize. To add filter conditions (full synchronization phase only): right-click the collection in Selected Objects and define filter conditions. See Set filter conditions. |

  2. Click Next: Advanced Settings.

Step 4: Configure advanced settings

| Parameter | Description |
| --- | --- |
| Dedicated Cluster for Task Scheduling | By default, DTS schedules to the shared cluster. Purchase a dedicated cluster for higher task stability. See What is a DTS dedicated cluster. |
| Retry Time for Failed Connections | How long DTS retries failed connections before marking the task as failed. Range: 10–1440 minutes. Default: 720 minutes. Set to at least 30 minutes. If multiple tasks share the same source or destination, the shortest retry time applies. DTS charges for the instance during the retry period. |
| Retry Time for Other Issues | How long DTS retries failed DDL or DML operations. Range: 1–1440 minutes. Default: 10 minutes. Set to at least 10 minutes. Must be smaller than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Migration | Limits the queries per second (QPS) to the source database, records per second (RPS), and data migration speed (MB/s) during full data synchronization to reduce database load. Displayed only when Full Data Synchronization is selected. |
| Only one data type for primary key _id in a single table | Yes: DTS does not scan the _id data type during full synchronization (faster). No: DTS scans the _id data type for mixed-type collections. Displayed only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Limits RPS and data synchronization speed (MB/s) for incremental synchronization to reduce destination database load. |
| Environment Tag | Optional. Tag the instance by environment (for example, production or staging). |
| Configure ETL | Enable to apply extract, transform, and load (ETL) transformations. See What is ETL? and Configure ETL in a data migration or data synchronization task. |
| Monitoring and Alerting | Enable to receive notifications when the task fails or synchronization latency exceeds a threshold. See Configure monitoring and alerting when you create a DTS task. |
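
The ranges and the ordering rule for the two retry settings can be checked before you enter them in the console. The helper below is a minimal sketch based only on the values stated above; the function and field names are illustrative and are not DTS API parameters.

```javascript
// Sketch only: validate the two retry settings against the rules stated above.
// Defaults mirror the documented defaults (720 and 10 minutes).
function validateRetrySettings({ connectionRetryMinutes = 720, otherRetryMinutes = 10 } = {}) {
  const problems = [];
  if (connectionRetryMinutes < 10 || connectionRetryMinutes > 1440) {
    problems.push("Retry Time for Failed Connections must be 10 to 1440 minutes");
  } else if (connectionRetryMinutes < 30) {
    problems.push("Retry Time for Failed Connections should be at least 30 minutes");
  }
  if (otherRetryMinutes < 1 || otherRetryMinutes > 1440) {
    problems.push("Retry Time for Other Issues must be 1 to 1440 minutes");
  } else if (otherRetryMinutes < 10) {
    problems.push("Retry Time for Other Issues should be at least 10 minutes");
  }
  if (otherRetryMinutes >= connectionRetryMinutes) {
    problems.push("Retry Time for Other Issues must be smaller than Retry Time for Failed Connections");
  }
  return problems;
}

// Example: validateRetrySettings({ connectionRetryMinutes: 60, otherRetryMinutes: 5 });
```

An empty array means the values satisfy every rule listed in the table.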

Step 5: Configure data verification

Click Next Step: Data Verification to configure data verification. For more information about how to use the data verification feature, see Configure a data verification task.

Step 6: Set shard key default values

This step applies only when the source is a non-sharded cluster and the destination is a sharded cluster. DTS automatically enters the Configure Database and Table Fields stage. Shard key default values take effect only when the destination version is lower than 4.4. For destination instances running version 4.4 or later, DTS writes the original data as-is.

For each collection in the synchronization scope that requires a shard key:

  1. Click Set Default Value in the corresponding row.

    If Number of Shard Keys shows 0 for a collection, that collection has no shard key in the destination and does not need a default value.
  2. Select a Shard key default value type. Supported types: string and int.

  3. Enter the Default Value for the shard key.

    Important

    Assign default values to all shard keys for all objects in the synchronization scope. Missing values trigger an alert during the precheck and may cause the task to fail.
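
To make the effect of a shard key default value concrete, the sketch below fills in missing top-level shard key fields on a document and checks whether a destination version is lower than 4.4. It illustrates the behavior described in this step and is not how DTS implements it. The field names and values in the usage comment are placeholders.

```javascript
// Sketch only: illustrates the effect of shard key default values.

// Defaults take effect only when the destination version is lower than 4.4.
function defaultsApply(destinationVersion) {
  const [major, minor] = destinationVersion.split(".").map(Number);
  return major < 4 || (major === 4 && minor < 4);
}

// Coerce a configured default into one of the two supported types.
function parseDefault(type, rawValue) {
  if (type === "string") return String(rawValue);
  if (type === "int") {
    const n = Number(rawValue);
    if (!Number.isInteger(n)) throw new Error(`not an integer: ${rawValue}`);
    return n;
  }
  throw new Error(`unsupported type: ${type}`);
}

// Return a copy of `doc` with missing shard key fields set to their defaults.
function applyShardKeyDefaults(doc, defaults) {
  const result = { ...doc };
  for (const [field, { type, value }] of Object.entries(defaults)) {
    if (!(field in result)) result[field] = parseDefault(type, value);
  }
  return result;
}

// Example:
// applyShardKeyDefaults({ _id: 1 }, { tenantId: { type: "int", value: "0" } });
```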

Step 7: Save the task and run a precheck

Click Next: Save Task Settings and Precheck.

To preview the API parameters for this task configuration, hover over the button and click Preview OpenAPI parameters before proceeding.

DTS runs a precheck before starting the task. If the precheck fails:

  • For failed items: click View Details, fix the underlying issue, then rerun the precheck.

  • For alert items that can be ignored: click Confirm Alert Details, then click Ignore in the View Details dialog box, confirm with OK, and click Precheck Again. Ignoring alerts may result in data inconsistency.

Step 8: Purchase the instance and start the task

  1. Wait until the Success Rate reaches 100%, then click Next: Purchase Instance.

  2. On the buy page, configure the following parameters.

| Parameter | Description |
| --- | --- |
| Billing Method | Subscription: Pay upfront for the full duration. More cost-effective for long-term use. Duration options: 1–9 months, 1 year, 2 years, 3 years, or 5 years. Pay-as-you-go: Billed hourly. Suitable for short-term use. Release the instance when no longer needed to stop charges. |
| Resource Group Settings | The resource group for this instance. Default: default resource group. See What is Resource Management? |
| Instance Class | Determines synchronization speed. See Instance classes of data synchronization instances. |

  3. Read and select Data Transmission Service (Pay-as-you-go) Service Terms.

  4. Click Buy and Start, then click OK in the confirmation dialog box.

The task appears in the task list. Monitor its progress from there.

What's next

After the task is complete, switch your application traffic to the destination instance. To verify data consistency before switching, see Configure a data verification task.
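
As a coarse sanity check before switching traffic, you can compare per-collection document counts gathered from both instances. The helper below is a sketch that takes two plain objects mapping collection names to counts and lists the collections that differ. It does not replace the DTS data verification feature, and how you collect the counts is left open.

```javascript
// Sketch only: compare per-collection document counts from source and destination.
// Each argument maps a collection name to its document count.
function diffCounts(sourceCounts, destinationCounts) {
  const names = new Set([
    ...Object.keys(sourceCounts),
    ...Object.keys(destinationCounts),
  ]);
  const mismatches = [];
  for (const name of names) {
    // A collection missing on one side is treated as having 0 documents.
    const source = sourceCounts[name] ?? 0;
    const destination = destinationCounts[name] ?? 0;
    if (source !== destination) mismatches.push({ name, source, destination });
  }
  return mismatches;
}

// Example: diffCounts({ orders: 100, users: 5 }, { orders: 100, users: 4 });
```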