Data Transmission Service (DTS) migrates data from a source MongoDB database without shard keys to a destination MongoDB sharded cluster. During migration, you can assign a default shard key value to collections that lack one. This guide uses an ApsaraDB for MongoDB replica set instance as the source and an ApsaraDB for MongoDB sharded cluster instance as the destination, but the procedure also applies to other source architectures.
Prerequisites
Before you begin, make sure that you have:
A destination ApsaraDB for MongoDB sharded cluster instance. See Create a sharded cluster instance
Enough storage on the destination instance -- at least 10% more than the source instance uses
(Conditional) If the source is a sharded cluster instance, an endpoint for each shard node with identical database accounts and passwords across all shards. See Request an endpoint for a shard node
Supported MongoDB versions for both source and destination. See Migration solutions
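The storage prerequisite above can be checked with simple arithmetic. The following is a minimal sketch, not a DTS feature: the function names are hypothetical, and you supply the byte counts yourself (for example, from the instance monitoring pages).

```javascript
// Minimal sketch of the "at least 10% more storage" prerequisite.
// Function and parameter names are hypothetical, not part of DTS.

// Returns the minimum destination capacity in bytes: source usage plus 10%.
function minDestinationBytes(sourceUsedBytes) {
  if (!Number.isFinite(sourceUsedBytes) || sourceUsedBytes < 0) {
    throw new RangeError("sourceUsedBytes must be a non-negative number");
  }
  // Multiply before dividing so whole-number inputs stay exact.
  return Math.ceil((sourceUsedBytes * 11) / 10);
}

// True when the destination capacity meets the 10% headroom rule.
function hasEnoughStorage(sourceUsedBytes, destinationCapacityBytes) {
  return destinationCapacityBytes >= minDestinationBytes(sourceUsedBytes);
}
```

For example, a source that uses 1,000 bytes needs a destination with at least 1,100 bytes under this rule.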
Plan the migration
Before you start configuring the migration task, review the following planning considerations. These decisions affect migration behavior and cannot easily be changed later.
Shard key behavior depends on the destination MongoDB version
How DTS handles default shard key values depends on the destination MongoDB version:
Destination MongoDB earlier than 4.4: The default shard key value you set in the Configure Database and Table Fields step takes effect. DTS fills the original data with this default value and writes it to the destination.
Destination MongoDB 4.4 or later: The default shard key value does not take effect. DTS writes the original data to the destination as-is.
Evaluate which behavior applies to your destination instance before proceeding.
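The version rule above can be expressed as a small check. This is an illustrative sketch only: the function names are hypothetical, and modelWrittenDocument assumes that the default fills only shard key fields that are missing from a document, which this guide does not state explicitly.

```javascript
// Illustrative model of the version rule; not DTS's actual implementation.

// Parse "major.minor[.patch]" into [major, minor].
function parseMajorMinor(version) {
  const match = /^(\d+)\.(\d+)/.exec(String(version));
  if (!match) throw new Error(`Unrecognized MongoDB version: ${version}`);
  return [Number(match[1]), Number(match[2])];
}

// True when the destination is earlier than 4.4, where the default value applies.
function defaultShardKeyTakesEffect(destinationVersion) {
  const [major, minor] = parseMajorMinor(destinationVersion);
  return major < 4 || (major === 4 && minor < 4);
}

// Assumption: defaults fill only the shard key fields a document lacks.
function modelWrittenDocument(doc, shardKeyDefaults, destinationVersion) {
  if (!defaultShardKeyTakesEffect(destinationVersion)) {
    return { ...doc }; // 4.4 or later: written as-is
  }
  return { ...shardKeyDefaults, ...doc }; // existing fields win over defaults
}
```

With a destination on 4.2, a document without the shard key field receives the default; with 4.4 or later, the same document is returned unchanged.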
Data sharding and pre-sharding
If your destination requires sharding, create the databases and collections that need sharding in the destination instance before migration. Configure data sharding, enable the balancer, and perform pre-sharding. This prevents all data from landing on a single shard and avoids data skew after migration.
For details, see Configure data sharding to maximize shard performance and How do I handle uneven data distribution in a MongoDB sharded cluster?.
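As a rough sketch of the preparation described above, the following helper builds the MongoDB enableSharding and shardCollection admin commands for one collection. The helper name, the database and collection names, and the choice of a hashed shard key are assumptions for illustration; choose the shard key that fits your workload. The optional numInitialChunks setting applies to hashed keys on empty collections, so check that your destination version supports it before relying on it.

```javascript
// Hypothetical helper: builds admin commands for preparing a sharded collection.
// A hashed shard key is assumed here; it is one option, not a requirement.
function buildPreShardingCommands(dbName, collName, shardKeyField, numInitialChunks) {
  const shardCollection = {
    shardCollection: `${dbName}.${collName}`,
    key: { [shardKeyField]: "hashed" },
  };
  // Only include the optional setting when the caller provides it.
  if (numInitialChunks !== undefined) {
    shardCollection.numInitialChunks = numInitialChunks;
  }
  return [{ enableSharding: dbName }, shardCollection];
}

// Usage in mongosh connected to a mongos of the destination (not run here):
//   for (const cmd of buildPreShardingCommands("appdb", "orders", "customerId", 8)) {
//     printjson(db.adminCommand(cmd));
//   }
//   sh.startBalancer();
```

Running these commands while the collection is still empty lets the chunks be distributed before DTS starts writing data.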
Version compatibility
Keep the source and destination MongoDB versions the same, or migrate from an earlier version to a later version. Migrating from a later version to an earlier version may cause compatibility issues.
Migration types
DTS supports three migration types. Select the combination that fits your scenario.
| Migration type | What it migrates | Scope |
|---|---|---|
| Schema migration | Database schemas, collections, and indexes | Databases, collections, indexes |
| Full data migration | All existing data at the time the task starts | Databases, collections |
| Incremental data migration | Ongoing changes after full migration completes | See supported operations below |
To perform a one-time migration, select Schema Migration and Full Data Migration.
To keep the destination in sync during migration (minimizing downtime), select all three: Schema Migration, Full Data Migration, and Incremental Data Migration.
Incremental migration with oplog
DTS captures incremental data through the following operations:
CREATE COLLECTION, CREATE INDEX
DROP DATABASE, DROP COLLECTION, DROP INDEX
RENAME COLLECTION
Insert, update, and delete operations on documents
For updated documents, DTS migrates only updates from $set. Incremental migration does not capture data from databases created after the task starts. Use oplog for incremental migration whenever possible -- it provides faster log pulling and lower latency than change streams.
Incremental migration with change streams
DTS captures incremental data through the following operations:
DROP DATABASE, DROP COLLECTION
RENAME COLLECTION
Insert, update, and delete operations on documents
For updated documents, DTS migrates only updates from $set.
Billing
| Migration type | Instance fee | Internet traffic fee |
|---|---|---|
| Schema migration + full data migration | Free | Charged when the destination Access Method is Public IP Address. See Billing overview. |
| Incremental data migration | Charged. See Billing overview. | -- |
Required permissions
| Database | Schema migration and full migration | Incremental migration |
|---|---|---|
| Source ApsaraDB for MongoDB | Read permission on the databases to migrate and the config database | Read permission on the databases to migrate, the admin database, and the local database |
| Destination ApsaraDB for MongoDB | dbAdminAnyDatabase, readWrite on the destination database, read on the local database, and read on the config database | Same as schema/full migration |
To create and grant permissions, see Use DMS to manage MongoDB database users.
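If you manage users from the shell instead (for example, on a self-managed database), the permissions in the table above map to MongoDB createUser command documents roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the users are created in the admin database, and the incremental variant assumes that the task also runs schema and full migration.

```javascript
// Hypothetical helpers that build createUser commands matching the table above.

function buildDestinationUserCommand(username, password, destinationDb) {
  return {
    createUser: username,
    pwd: password,
    roles: [
      { role: "dbAdminAnyDatabase", db: "admin" },
      { role: "readWrite", db: destinationDb },
      { role: "read", db: "local" },
      { role: "read", db: "config" },
    ],
  };
}

function buildSourceUserCommand(username, password, databases, incremental) {
  // config is needed for schema/full migration; admin and local for incremental.
  const systemDbs = incremental ? ["config", "admin", "local"] : ["config"];
  return {
    createUser: username,
    pwd: password,
    roles: [...databases, ...systemDbs].map((db) => ({ role: "read", db })),
  };
}

// Usage in mongosh (not run here):
//   db.getSiblingDB("admin").runCommand(
//     buildSourceUserCommand("dts_reader", "<password>", ["appdb"], true));
```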
Limitations
Source database
Bandwidth: The source server must have enough outbound bandwidth. Insufficient bandwidth slows migration.
Primary key or unique constraint required: Collections to migrate must have a primary key or UNIQUE constraint with unique fields. Otherwise, duplicate data may appear in the destination.
Collection limit per task: When migrating at the collection level with object editing (such as name mapping), a single task supports up to 1,000 collections. If you exceed this limit, split the collections into multiple tasks or migrate the entire database.
Document size limit: Each document in the source must be 16 MB or smaller. Larger documents cause the task to fail.
Mongos node limit: If the source is a sharded cluster, it must have 10 or fewer mongos nodes.
Self-managed sharded cluster access methods: Only the following access methods are supported: Public IP Address; Express Connect, VPN Gateway, or Smart Access Gateway; and Cloud Enterprise Network (CEN).
MongoDB 8.0+ with oplog: If the source is a self-managed sharded cluster running MongoDB 8.0 or later and Migration Method is Oplog, grant the directShardOperations permission to the shard account. Replace username with the shard account used for the migration task:
db.adminCommand({ grantRolesToUser: "username", roles: [{ role: "directShardOperations", db: "admin" }] })
Azure Cosmos DB / Amazon DocumentDB elastic clusters: Only full data migration is supported.
Oplog retention for incremental migration: The oplog feature must be enabled with at least seven days of operation log retention. Alternatively, enable change streams with a seven-day subscription window. If DTS cannot access the logs, the task may fail, and data inconsistency or loss may occur. The DTS SLA does not cover these issues.
Change streams version requirement: Change streams require MongoDB 4.0 or later.
Inelastic Amazon DocumentDB clusters: Enable change streams, set Migration Method to ChangeStream, and set Architecture to Sharded Cluster.
TTL indexes: TTL indexes in the source may cause data inconsistency between source and destination after migration.
Orphaned documents: Make sure no orphaned documents exist in a source sharded cluster. Otherwise, data inconsistency or task failure may occur. See Orphaned Documents and How do I clean up orphaned documents in a MongoDB sharded cluster?.
Balancer activity: If the source sharded cluster balancer is actively rebalancing data, instance latency may increase.
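For the seven-day oplog requirement listed above, one rough check is to look at the current oplog window on a replica set member. The sketch below assumes that you pass in the object returned by db.getReplicationInfo() in mongosh, which includes a timeDiffHours field. The function name is hypothetical, and the window reflects only the entries currently in the oplog, so treat it as an indicator rather than a guarantee of retention.

```javascript
// Seven days expressed in hours.
const REQUIRED_OPLOG_HOURS = 7 * 24;

// replicationInfo: the object returned by db.getReplicationInfo() in mongosh.
// Returns true when the current oplog window spans at least seven days.
function oplogWindowCoversSevenDays(replicationInfo) {
  const hours = Number(replicationInfo.timeDiffHours);
  return Number.isFinite(hours) && hours >= REQUIRED_OPLOG_HOURS;
}

// Usage in mongosh on a source replica set member (not run here):
//   oplogWindowCoversSevenDays(db.getReplicationInfo());
```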
Restrictions during migration
Schema migration and full data migration: Do not perform schema changes on databases or collections, including updates to array types. Schema changes may cause the task to fail or result in data inconsistency.
Sharded cluster sources: Do not run shardCollection, reshardCollection, unshardCollection, moveCollection, or movePrimary to change data distribution while the migration instance is running. Otherwise, data inconsistency may occur.
Full migration only (no incremental): Do not write new data to the source. To maintain real-time consistency, select all three migration types: Schema Migration, Full Data Migration, and Incremental Data Migration.
General limitations
New collections during migration: Default shard key values cannot be set for collections added to the source after the task starts.
SRV addresses: DTS does not support connecting to MongoDB through an SRV address.
Non-sharded source to sharded destination: When the source is a non-sharded cluster and the destination is an Alibaba Cloud sharded cluster, the task configuration includes the Configure Database and Table Fields step, where you set default shard key values.
Excluded databases: DTS cannot migrate data from the admin, config, or local databases.
Transactions: Transaction information is not retained. Transactions in the source are converted to individual records in the destination.
Primary key / unique key conflicts: When a conflict occurs, DTS skips the conflicting write and retains the existing data in the destination.
Field order (source < 3.6, destination >= 3.6): If the source runs MongoDB earlier than 3.6 and the destination runs 3.6 or later, field order may be inconsistent after migration due to differences in database engine execution plans. Evaluate the impact if your business logic involves match queries on nested structures.
Storage overhead from concurrent writes: Full migration runs concurrent INSERT operations, which cause collection fragmentation. The destination occupies 5% to 10% more storage than the source.
Automatic task retry: DTS automatically retries tasks that failed within the last seven days. Before switching workloads to the destination, stop or release the task, or revoke write permissions on the destination account. This prevents an automatically resumed task from overwriting destination data.
Unique index or capped collections: Collections with a unique index or the capped attribute set to true support only single-thread writes during incremental migration (no concurrent replay), which may increase latency.
Document count query: To query the number of documents in the destination, use db.$table_name.aggregate([{ $count: "myCount" }]).
Duplicate primary keys (_id): Make sure the destination does not have documents with the same _id values as the source. Duplicate _id values cause data loss. Clear conflicting documents in the destination before migration if doing so does not affect your business.
DTS task failure restoration: If a task fails, DTS technical support attempts restoration within 8 hours. During restoration, the task may be restarted and DTS task parameters may be modified. Database parameters are not modified. Modified parameters include but are not limited to those described in Modify instance parameters.
Sharded collection compliance: After switching your business to the destination, all business operations must comply with the requirements for sharded collections in that MongoDB database.
Self-managed MongoDB sources
Failover during migration: If a primary/secondary failover occurs during migration, the task fails.
Latency calculation: DTS calculates latency by comparing the timestamp of the last migrated entry with the current time. If the source has not been updated for a long time, the displayed latency may be inaccurate. Run an update on the source to refresh latency.
Heartbeat for whole-database migration: Create a heartbeat collection that periodically updates or writes data (for example, every second) to keep latency accurate.
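A heartbeat like the one described above could be written as a single upsert that runs once per second. This is a minimal sketch: the collection name dts_heartbeat, the database name appdb, and the helper name are hypothetical, and the collection must live in a database that the task migrates. The update uses $set, which matches the update operator that DTS migrates incrementally.

```javascript
// Hypothetical helper: builds the pieces of a heartbeat upsert.
function buildHeartbeatUpdate(now = new Date()) {
  return {
    filter: { _id: "dts_heartbeat" },
    update: { $set: { updatedAt: now } }, // $set updates are captured by DTS
    options: { upsert: true },
  };
}

// Usage in mongosh against the source (not run here):
//   const coll = db.getSiblingDB("appdb").dts_heartbeat;
//   while (true) {
//     const hb = buildHeartbeatUpdate();
//     coll.updateOne(hb.filter, hb.update, hb.options);
//     sleep(1000); // one write per second
//   }
```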
Procedure
Step 1: Open the data migration page
Use one of the following methods to navigate to the Data Migration page.
DTS console:
Log on to the DTS console.
In the left-side navigation pane, click Data Migration.
In the upper-left corner, select the region where the data migration instance resides.
DMS console:
The actual operation may vary based on the mode and layout of the DMS console. For more information, see Simple mode and Customize the layout and style of the DMS console.
Log on to the DMS console.
In the top navigation bar, hover over Data + AI and choose DTS (DTS) > Data Migration.
From the drop-down list next to Data Migration Tasks, select the region where the data migration instance resides.
Step 2: Create a migration task
Click Create Task to open the task configuration page.
Step 3: Configure the source and destination databases
Configure the following parameters for the source and destination databases.
Task settings
| Parameter | Description |
|---|---|
| Task Name | DTS generates a name automatically. Specify a descriptive name to identify the task. The name does not need to be unique. |
Source database
| Parameter | Description |
|---|---|
| Select Existing Connection | If the database instance is registered with DTS, select it from the drop-down list. DTS populates the connection parameters automatically. See Manage database connections. In the DMS console, select the instance from the Select a DMS database instance drop-down list. If the instance is not registered, configure the parameters below manually. |
| Database Type | Select MongoDB. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | Select the region where the source instance resides. |
| Replicate Data Across Alibaba Cloud Accounts | Select No (this example uses the same account). |
| Architecture | Select Replica Set for this example. Options: Replica Set -- deploys multiple nodes for high availability and read/write splitting. See Replica set architecture. Sharded Cluster -- provides mongos, shard, and config server components. See Sharded cluster architecture. If you select Sharded Cluster, also specify Shard account and Shard password. |
| Migration Method | Select how to capture incremental data: Oplog (recommended) -- available when oplog is enabled on the source. The default for both self-managed and ApsaraDB for MongoDB instances. Provides low-latency incremental migration. ChangeStream -- available when change streams are enabled. See Change Streams. Required for inelastic Amazon DocumentDB clusters. If you select Sharded Cluster for Architecture with ChangeStream, you do not need to configure Shard account and Shard password. |
| Instance ID | Select the source ApsaraDB for MongoDB instance. |
| Authentication Database | The database that the source account belongs to. Default: admin. |
| Database Account | The source database account. See Required permissions. |
| Database Password | The password for the source database account. |
| Encryption | Select the connection encryption mode: Non-encrypted, SSL-encrypted, or Mongo Atlas SSL. Available options depend on the Access Method and Architecture settings. Restrictions: If Architecture is Sharded Cluster and Migration Method is Oplog, SSL-encrypted is unavailable. If the source is a self-managed MongoDB database with Replica Set architecture, Access Method is not Alibaba Cloud Instance, and Encryption is SSL-encrypted, you can upload a CA certificate. |
Destination database
| Parameter | Description |
|---|---|
| Select Existing Connection | Same as source -- select a registered instance or configure manually. |
| Database Type | Select MongoDB. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | Select the region where the destination instance resides. |
| Replicate Data Across Alibaba Cloud Accounts | Select No (this example uses the same account). |
| Architecture | Select Sharded Cluster. |
| Instance ID | Select the destination ApsaraDB for MongoDB sharded cluster instance. |
| Authentication Database | The database that the destination account belongs to. Default: admin. |
| Database Account | The destination database account. See Required permissions. |
| Database Password | The password for the destination database account. |
| Encryption | Select the connection encryption mode: Non-encrypted, SSL-encrypted, or Mongo Atlas SSL. Restrictions: If the destination is an ApsaraDB for MongoDB instance with Sharded Cluster architecture, SSL-encrypted is unavailable. If the destination is a self-managed MongoDB with Replica Set architecture, Access Method is not Alibaba Cloud Instance, and Encryption is SSL-encrypted, you can upload a CA certificate. |
Step 4: Test connectivity
Click Test Connectivity and Proceed at the bottom of the page.
Make sure the CIDR blocks of DTS servers are added to the security settings of both the source and destination databases. For details, see Add the CIDR blocks of DTS servers. If the source or destination is a self-managed database and its Access Method is not Alibaba Cloud Instance, click Test Connectivity in the CIDR Blocks of DTS Servers dialog box.
Step 5: Configure migration objects
On the Configure Objects page, configure the following settings.
| Parameter | Description |
|---|---|
| Migration Types | Select the migration types for this task. For a one-time migration, select Schema Migration and Full Data Migration. For continuous sync, also select Incremental Data Migration. If you do not select Schema Migration, create the target database and collections in the destination first and enable object name mapping in Selected Objects. If you do not select Incremental Data Migration, do not write data to the source during migration. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors -- checks for duplicate collection names. If duplicates exist, the precheck fails and you must rename objects. See Map object names. Ignore Errors and Proceed -- skips the duplicate name check. DTS skips records with duplicate primary keys and retains existing destination data. Data consistency is not guaranteed. |
| Capitalization of Object Names in Destination Instance | Controls the capitalization of database, table, and column names. Default: DTS default policy. See Specify the capitalization of object names. |
| Source Objects / Selected Objects | Select objects from the Source Objects panel and move them to Selected Objects. You can select at the database or collection level. To rename a destination database: right-click the database under Selected Objects, then change Schema Name in the Edit Schema dialog box. See Map a single database or collection. To rename a destination collection: right-click the collection under Selected Objects, then change Table Name in the Edit Table dialog box (collection-level migration only). To filter data: right-click the table in Selected Objects and set filter conditions. Data filtering is supported during full migration but not incremental migration. See Set filter conditions. |
If you use object name mapping, migration of dependent objects may fail.
Step 6: Configure advanced settings
Click Next: Advanced Settings and configure the following parameters.
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, tasks run on a shared cluster. To improve stability, purchase a dedicated cluster. See What is a DTS dedicated cluster. |
| Retry Time for Failed Connections | How long DTS retries when the source or destination connection fails. Range: 10--1,440 minutes. Default: 720 minutes. Set this to at least 30 minutes. If DTS reconnects within this window, the task resumes. Otherwise, the task fails. If multiple tasks share a database, the most recently configured value applies. DTS charges for the instance during retries. |
| Retry Time for Other Issues | How long DTS retries when DDL or DML operations fail. Range: 1--1,440 minutes. Default: 10 minutes. Set this to at least 10 minutes. This value must be smaller than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Migration | Limits the read/write load during full migration. Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Migration is selected. |
| Only one data type for primary key _id in a table of the data to be synchronized | Whether the _id field has a single data type in each collection. Yes: DTS skips scanning the _id data type in the source and migrates only one type. No: DTS scans all _id data types and migrates all of them. Enable based on your data. Incorrect settings may cause data loss. Available only when Full Data Migration is selected. |
| Enable Throttling for Incremental Data Migration | Limits the write load during incremental migration. Configure RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s). Available only when Incremental Data Migration is selected. |
| Environment Tag | (Optional) Select a tag to identify the instance. |
| Configure ETL | Whether to enable the extract, transform, and load (ETL) feature. Yes: enter data processing statements in the code editor. See Configure ETL in a data migration or data synchronization task. No: skip ETL configuration. See What is ETL?. |
| Monitoring and Alerting | Whether to configure alerts for the task. Yes: set alert thresholds and notification contacts. See Configure monitoring and alerting. No: skip alert configuration. |
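The two retry settings in the table above have ranges and a relative constraint that are easy to get wrong. The following sketch encodes those rules as stated in the table; the function name and message wording are hypothetical, and the DTS console performs its own validation.

```javascript
// Hypothetical validator for the two retry settings (values in minutes).
function validateRetryTimes(failedConnectionsMin, otherIssuesMin) {
  const errors = [];
  const warnings = [];
  const inRange = (v, lo, hi) => Number.isInteger(v) && v >= lo && v <= hi;

  if (!inRange(failedConnectionsMin, 10, 1440)) {
    errors.push("Retry Time for Failed Connections must be 10 to 1,440 minutes");
  }
  if (!inRange(otherIssuesMin, 1, 1440)) {
    errors.push("Retry Time for Other Issues must be 1 to 1,440 minutes");
  }
  if (otherIssuesMin >= failedConnectionsMin) {
    errors.push("Retry Time for Other Issues must be smaller than Retry Time for Failed Connections");
  }
  // Recommended minimums from the table: 30 and 10 minutes.
  if (failedConnectionsMin < 30) warnings.push("Set Retry Time for Failed Connections to at least 30 minutes");
  if (otherIssuesMin < 10) warnings.push("Set Retry Time for Other Issues to at least 10 minutes");

  return { errors, warnings };
}
```

The defaults (720 and 10 minutes) pass with no errors or warnings, while a pair such as 20 and 5 minutes is within range but falls below both recommended minimums.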
Step 7: Configure data verification
Click Next Step: Data Verification and configure the data verification task. For details, see Configure a data verification task.
Step 8: Set default shard key values
Click Next: Configure Database and Table Fields to set the default shard key value for collections without shard keys.
In the row of the destination collection, click Set Default Value.
If the Number of Shard Keys for a collection is 0, the collection has no shard key and does not need a default value.
Select a Shard key default value type.
Only string and int types are supported.
Enter the Default Value for the shard key.
The default shard key value takes effect only when the destination MongoDB version is earlier than 4.4.
Set default values for all shard keys in the objects to migrate. Missing values trigger a precheck warning and may cause task failure.
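Before running the precheck, you can review your planned defaults against the rules above: every shard key needs a default value, and the value must be a string or an integer. The sketch below is a hypothetical review helper; the input shape (name, shardKeys, defaults) is an assumption for illustration and is not a DTS data format.

```javascript
// collections: [{ name, shardKeys: ["field", ...], defaults: { field: value } }]
// Returns "collection.field" entries whose default is missing or not a string/int.
function findInvalidShardKeyDefaults(collections) {
  const problems = [];
  for (const { name, shardKeys, defaults = {} } of collections) {
    for (const key of shardKeys) {
      const value = defaults[key];
      const isString = typeof value === "string";
      const isInt = Number.isInteger(value);
      if (!isString && !isInt) {
        problems.push(`${name}.${key}`);
      }
    }
  }
  return problems;
}
```

A collection with an empty shardKeys list produces no entries, which matches the case where Number of Shard Keys is 0.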
Step 9: Run the precheck
Click Next: Save Task Settings and Precheck.
To preview the API parameters for this task, hover over the button and click Preview OpenAPI parameters.
DTS runs a precheck before starting the migration. The task starts only after the precheck passes.
If the precheck fails, click View Details next to each failed item, troubleshoot the issue, and run the precheck again.
If a precheck item triggers a warning:
If the warning cannot be ignored, click View Details, fix the issue, and rerun the precheck.
If the warning can be ignored, click Confirm Alert Details > Ignore > OK, then click Precheck Again. Ignoring warnings may cause data inconsistency.
Step 10: Purchase the instance and start migration
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
On the Purchase Instance page, configure the instance:
| Parameter | Description |
|---|---|
| Resource Group | The resource group for the migration instance. Default: default resource group. See What is Resource Management?. |
| Instance Class | Select a class based on your required migration speed. See Instance classes of data migration instances. |
Read and accept the Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start, then click OK in the confirmation dialog.
The task appears on the Data Migration page where you can monitor progress.
If the task does not include incremental migration, it stops automatically and the Status shows Completed.
If the task includes incremental migration, it runs continuously and the Status shows Running. The task does not stop automatically.
Recommendations for production workloads
Migrate during off-peak hours. DTS consumes read and write resources on both the source and destination during full migration, which increases server load.
Run a precheck before each migration. The precheck validates connectivity, permissions, and schema compatibility before DTS starts moving data.
Stop or release the task before switching workloads. DTS retries failed tasks within seven days. If the task is still active when you switch to the destination, automatic retries may overwrite destination data.
Select all three migration types for production workloads. Selecting Schema Migration, Full Data Migration, and Incremental Data Migration maintains real-time data consistency and minimizes switchover downtime.
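If you choose to revoke write permissions on the destination account before switchover instead of stopping the task, the MongoDB revokeRolesFromUser command is one way to do it from the shell. This sketch assumes that the account holds the readWrite role on the destination database and is defined in the admin database; the helper name is hypothetical, and other roles on the account may also need review.

```javascript
// Hypothetical helper: builds a command that removes readWrite on one database.
function buildRevokeWriteCommand(username, destinationDb) {
  return {
    revokeRolesFromUser: username,
    roles: [{ role: "readWrite", db: destinationDb }],
  };
}

// Usage in mongosh on the destination (not run here):
//   db.getSiblingDB("admin").runCommand(buildRevokeWriteCommand("dts_writer", "appdb"));
```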