Data Transmission Service: Precautions and limits for migrating data from a MongoDB database

Last Updated: Jan 29, 2024

This topic describes the precautions and limits when you use Data Transmission Service (DTS) to migrate data from a MongoDB database, such as a self-managed MongoDB database or an ApsaraDB for MongoDB instance. To ensure that your data migration task runs as expected, read the precautions and limits before you configure the task.

Scenarios of migrating data from a MongoDB database

The following list provides the scenarios of migrating data from a MongoDB database. The precautions and limits in the scenarios may vary. You can go to the related section to view the precautions and limits in a specific scenario.

Migrate data from a MongoDB database (standalone architecture) to another MongoDB database (standalone, replica set, or sharded cluster architecture)

Category

Description

Limits on the source database

  • Bandwidth requirements: The server on which the source database is deployed must have sufficient outbound bandwidth. Otherwise, the data migration speed decreases.

  • The collections to be migrated must have PRIMARY KEY or UNIQUE constraints, and the values of the constrained fields must be unique. Otherwise, the destination database may contain duplicate data records.

  • If you select collections as the objects to be migrated and you need to edit collections in the destination database, such as renaming collections, up to 1,000 collections can be migrated in a single data migration task. If you run a task to migrate more than 1,000 collections, a request error occurs. In this case, we recommend that you configure multiple tasks to migrate the collections in batches or configure a task to migrate the entire database.

  • Limits on operations to be performed on the source database:

    • During schema migration and full data migration, do not change the schema of databases or collections. Otherwise, the data migration task fails.

    • Incremental data migration is not supported in this scenario. To ensure data consistency, we recommend that you do not write data to the source MongoDB database during full data migration.

  • You cannot migrate collections that contain time to live (TTL) indexes. If the database to be migrated contains TTL indexes, data inconsistency may occur between the source and destination databases due to inconsistent time zones and clocks of the source and destination databases.
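
The 1,000-collection limit above can be handled by splitting the collection list into batches before you configure the tasks. The following is a minimal sketch of that idea, not part of DTS: the function name planMigrationTasks and the sample collection names are hypothetical.

```javascript
// Hypothetical helper: split a list of collection names into batches that
// each stay within the 1,000-collection limit of a single migration task.
const MAX_COLLECTIONS_PER_TASK = 1000;

function planMigrationTasks(collectionNames, limit = MAX_COLLECTIONS_PER_TASK) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < collectionNames.length; i += limit) {
    batches.push(collectionNames.slice(i, i + limit));
  }
  return batches;
}

// Example: 2,500 collections are split into three tasks
// that hold 1000, 1000, and 500 collections.
const names = Array.from({ length: 2500 }, (_, i) => `coll_${i}`);
const tasks = planMigrationTasks(names);
console.log(tasks.map((batch) => batch.length));
```

Each batch can then be used as the object list of one data migration task. If the batches cover every collection in a database, migrating the entire database in one task may be simpler.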

Other limits

  • If the destination database is a sharded cluster database, take note of the following limits:

    • Orphaned documents must be deleted. Otherwise, the migration performance is compromised. During data migration, if an _id conflict exists between documents in the source and destination databases, data inconsistency may occur, or the data migration task may fail.

    • Before you start the data migration task, you must add shard keys to the data to be migrated in the source database. If you cannot add shard keys to the data in the source database, you can migrate data from a MongoDB database without shard keys. For more information, see Migrate data from a MongoDB instance without a sharding key to a MongoDB sharded cluster instance.

    • During data migration, if you run an INSERT operation on the collections to be migrated, the inserted documents must contain shard keys. If you run an UPDATE operation on the collections to be migrated, you cannot modify shard keys.

  • Only schema migration and full data migration are supported in this scenario. You cannot use DTS to migrate incremental data from a standalone MongoDB database because the oplog feature is disabled for the database.

  • If a collection of the destination database has a unique index or the capped attribute of a collection of the destination database is true, the collection supports only single-thread data writing and does not support concurrent replay during incremental data migration. This may increase migration latency.

  • DTS cannot migrate data from the admin or local database.

  • Transaction information is not retained. When transactions are migrated to the destination database, the transactions are converted into a single record.

  • To ensure compatibility, the version of the destination MongoDB database must be the same as or later than the version of the source MongoDB database. If the version of the destination database is earlier than the version of the source database, database compatibility issues may occur.

  • Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours. During full data migration, DTS uses the read and write resources of the source and destination databases. This may increase the loads on the database servers.

  • During full data migration, concurrent INSERT operations cause fragmentation in the collections of the destination database. After full data migration is complete, the storage space for collections of the destination database is larger than that of the source database.

  • Make sure that the precision settings for columns of the FLOAT or DOUBLE data type meet your business requirements. DTS uses the ROUND(COLUMN,PRECISION) function to retrieve values from columns of the FLOAT or DOUBLE data type. If you do not specify a precision, DTS sets the precision for columns of the FLOAT data type to 38 digits and the precision for columns of the DOUBLE data type to 308 digits.

  • DTS attempts to resume data migration tasks that failed within the last seven days. Before you switch workloads to the destination database, you must stop or release the failed tasks. You can also execute the REVOKE statement to revoke the write permissions from the accounts that are used by DTS to access the destination database. Otherwise, the data in the source database overwrites the data in the destination database after a failed task is resumed.

  • The data is concurrently written to the destination database. Therefore, the storage space occupied in the destination database is 5% to 10% larger than the size of the data in the source database.

  • To query the number of documents in a collection of the destination MongoDB database, you must use the db.$table_name.aggregate([{ $count:"myCount"}]) syntax.

  • Make sure that the destination MongoDB database does not contain documents that have the same primary key values as documents in the source database. The default primary key is _id. Otherwise, data may be lost. If such documents exist in the destination database, delete them without interrupting the services of DTS. For example, if the primary key is _id, delete the documents in the destination database whose _id values also exist in the source database.
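
The count syntax above returns an array of result documents rather than a bare number. The following sketch shows one way to build the pipeline and read the result. The helper names buildCountPipeline and readCount and the collection name orders are hypothetical. Note that the $count stage returns no document when the collection is empty, so the helper treats an empty result as 0.

```javascript
// Build the pipeline named in the text. "myCount" is the output field name.
function buildCountPipeline(fieldName = "myCount") {
  return [{ $count: fieldName }];
}

// Read the count from the array of result documents.
// $count emits no document for empty input, so an empty array means 0.
function readCount(resultDocs, fieldName = "myCount") {
  return resultDocs.length === 0 ? 0 : resultDocs[0][fieldName];
}

// In mongosh, this could be used roughly as follows (hypothetical collection):
//   readCount(db.orders.aggregate(buildCountPipeline()).toArray())
console.log(readCount([{ myCount: 42 }])); // 42
console.log(readCount([]));                // 0
```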

Special cases

If the source database is a self-managed MongoDB database, we recommend that you do not perform a primary/secondary switchover on the source database when the data migration task is running. Otherwise, the task fails.

Migrate data from a MongoDB database (replica set architecture) to another MongoDB database (replica set or sharded cluster architecture)

Category

Description

Limits on the source database

  • Bandwidth requirements: The server on which the source database is deployed must have sufficient outbound bandwidth. Otherwise, the data migration speed decreases.

  • The collections to be migrated must have PRIMARY KEY or UNIQUE constraints, and the values of the constrained fields must be unique. Otherwise, the destination database may contain duplicate data records.

  • If you select collections as the objects to be migrated and you need to edit collections in the destination database, such as renaming collections, up to 1,000 collections can be migrated in a single data migration task. If you run a task to migrate more than 1,000 collections, a request error occurs. In this case, we recommend that you configure multiple tasks to migrate the collections in batches or configure a task to migrate the entire database.

  • If you want to migrate incremental data, make sure that the following requirements are met:

    • The oplog feature is enabled. Otherwise, error messages are returned during the precheck, and the data migration task cannot be started.

    • The oplogs of the source database must be retained for at least seven days. Otherwise, DTS may fail to obtain the oplogs, which causes the task to fail and may even cause data inconsistency or data loss. Make sure that you set the retention period of oplogs based on the preceding requirements. Otherwise, the service level agreement (SLA) of DTS does not guarantee service reliability or performance.

  • Limits on operations to be performed on the source database:

    • During schema migration and full data migration, do not change the schema of databases or collections. Otherwise, the data migration task fails.

    • If you perform only full data migration, do not write data to the source database during data migration. Otherwise, data inconsistency between the source and destination databases occurs. To ensure data consistency, we recommend that you select Schema Migration, Full Data Migration, and Incremental Data Migration as the migration types.

  • You cannot migrate collections that contain time to live (TTL) indexes. If the database to be migrated contains TTL indexes, data inconsistency may occur between the source and destination databases due to inconsistent time zones and clocks of the source and destination databases.
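
Before you select the objects to be migrated, it can help to find the collections that have TTL indexes. The following sketch assumes index definitions shaped like the output of getIndexes() in mongosh, in which a TTL index carries an expireAfterSeconds field. The function name findTtlIndexes and the sample data are hypothetical.

```javascript
// indexes: array of index definitions for one collection,
// shaped like the output of db.<collection>.getIndexes() in mongosh.
function findTtlIndexes(indexes) {
  return indexes
    .filter((idx) => Object.prototype.hasOwnProperty.call(idx, "expireAfterSeconds"))
    .map((idx) => idx.name);
}

// Sample data: the default _id index plus one TTL index on createdAt.
const sampleIndexes = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  { v: 2, key: { createdAt: 1 }, name: "createdAt_1", expireAfterSeconds: 3600 },
];

// Logs the name of the only TTL index: createdAt_1
console.log(findTtlIndexes(sampleIndexes));
```

A collection for which this helper returns a non-empty list falls under the TTL limit described above.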

Other limits

  • If the destination database is a sharded cluster database, take note of the following limits:

    • Orphaned documents must be deleted. Otherwise, the migration performance is compromised. During data migration, if an _id conflict exists between documents in the source and destination databases, data inconsistency may occur, or the data migration task may fail.

    • Before you start the data migration task, you must add shard keys to the data to be migrated in the source database. If you cannot add shard keys to the data in the source database, you can migrate data from a MongoDB database without shard keys. For more information, see Migrate data from a MongoDB instance without a sharding key to a MongoDB sharded cluster instance.

    • During data migration, if you run an INSERT operation on the collections to be migrated, the inserted documents must contain shard keys. If you run an UPDATE operation on the collections to be migrated, you cannot modify shard keys.

  • To ensure compatibility, the version of the destination MongoDB database must be the same as or later than the version of the source MongoDB database. If the version of the destination database is earlier than the version of the source database, database compatibility issues may occur.

  • DTS cannot migrate data from the admin or local database.

  • If a collection of the destination database has a unique index or the capped attribute of a collection of the destination database is true, the collection supports only single-thread data writing and does not support concurrent replay during incremental data migration. This may increase migration latency.

  • Transaction information is not retained. When transactions are migrated to the destination database, the transactions are converted into a single record.

  • Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours. During full data migration, DTS uses the read and write resources of the source and destination databases. This may increase the loads on the database servers.

  • During full data migration, concurrent INSERT operations cause fragmentation in the collections of the destination database. After full data migration is complete, the storage space for collections of the destination database is larger than that of the source database.

  • Make sure that the precision settings for columns of the FLOAT or DOUBLE data type meet your business requirements. DTS uses the ROUND(COLUMN,PRECISION) function to retrieve values from columns of the FLOAT or DOUBLE data type. If you do not specify a precision, DTS sets the precision for columns of the FLOAT data type to 38 digits and the precision for columns of the DOUBLE data type to 308 digits.

  • DTS attempts to resume data migration tasks that failed within the last seven days. Before you switch workloads to the destination database, you must stop or release the failed tasks. You can also execute the REVOKE statement to revoke the write permissions from the accounts that are used by DTS to access the destination database. Otherwise, the data in the source database overwrites the data in the destination database after a failed task is resumed.

  • The data is concurrently written to the destination database. Therefore, the storage space occupied in the destination database is 5% to 10% larger than the size of the data in the source database.

  • To query the number of documents in a collection of the destination MongoDB database, you must use the db.$table_name.aggregate([{ $count:"myCount"}]) syntax.

  • Make sure that the destination MongoDB database does not contain documents that have the same primary key values as documents in the source database. The default primary key is _id. Otherwise, data may be lost. If such documents exist in the destination database, delete them without interrupting the services of DTS. For example, if the primary key is _id, delete the documents in the destination database whose _id values also exist in the source database.
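
The primary key check above amounts to finding _id values that exist in both databases. The following sketch illustrates the comparison on two in-memory lists. The function name findConflictingIds is hypothetical, and comparing values by their string form is an assumption that suits ObjectId hex strings and plain scalar keys. For large collections, the lists would need to be read and compared in batches.

```javascript
// Return the source _id values that also exist in the destination.
// Assumption: _id values can be compared by their string form.
function findConflictingIds(sourceIds, destinationIds) {
  const seen = new Set(destinationIds.map(String));
  return sourceIds.filter((id) => seen.has(String(id)));
}

// Example: "a1" and "c3" exist on both sides.
const conflicts = findConflictingIds(["a1", "b2", "c3"], ["c3", "d4", "a1"]);
console.log(conflicts);
```

The returned values identify the documents to delete from the destination database before the task starts.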

Special cases

If the source database is a self-managed MongoDB database, take note of the following limits:

  • If you perform a primary/secondary switchover on the source database when the data migration task is running, the task fails.

  • DTS calculates migration latency based on the timestamp of the latest migrated data in the destination database and the current timestamp in the source database. If no update operation is performed on the source database for an extended period of time, the migration latency may be inaccurate. If the latency of the migration task is excessively high, you can perform an update operation on the source database to update the latency.

Note

If you select an entire database as the object to be migrated, you can create a heartbeat table. The heartbeat table is updated or receives data every second.
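
The latency calculation described above can be read as a simple timestamp difference. The following sketch is an interpretation of that description, not the actual DTS implementation, and the function name migrationLatencySeconds is hypothetical. It also shows why an idle source database can report a high latency even though no data is waiting to be migrated.

```javascript
// Latency in seconds between the current time of the source database and
// the timestamp of the latest migrated data in the destination database.
function migrationLatencySeconds(sourceNow, latestMigratedAt) {
  const diffMs = sourceNow.getTime() - latestMigratedAt.getTime();
  return Math.max(0, diffMs / 1000);
}

// Active source: the latest migrated write is 30 seconds old.
const active = migrationLatencySeconds(
  new Date("2024-01-29T10:00:30Z"),
  new Date("2024-01-29T10:00:00Z")
);

// Idle source: the last write was one hour ago, so the computed latency
// is 3600 seconds even though nothing is pending.
const idle = migrationLatencySeconds(
  new Date("2024-01-29T11:00:00Z"),
  new Date("2024-01-29T10:00:00Z")
);

console.log(active, idle); // 30 3600
```

A heartbeat write every second keeps the latest migrated timestamp close to the current time, which is why the computed latency stays accurate.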

Migrate data between MongoDB databases (sharded cluster architecture)

Category

Description

Limits on the source database

  • Bandwidth requirements: The server on which the source database is deployed must have sufficient outbound bandwidth. Otherwise, the data migration speed decreases.

  • The collections to be migrated must have PRIMARY KEY or UNIQUE constraints, and the values of the constrained fields must be unique. Otherwise, the destination database may contain duplicate data records.

  • DTS uses the resources of the source and destination databases during full data migration. This may increase the loads on the database servers. If you migrate a large volume of data or if the server specifications do not meet your requirements, database services may become unavailable. Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours.

  • If the source and destination MongoDB databases run different MongoDB versions or use different storage engines, make sure that the MongoDB versions or storage engines are compatible. For more information, see MongoDB versions and storage engines.

  • If you want to migrate incremental data, you must enable the oplog feature. Otherwise, error messages are returned during the precheck and the data migration task cannot be started.

    If you perform only incremental data migration, the oplogs of the source database must be retained for more than 24 hours. If you perform both full data migration and incremental data migration, the oplogs of the source database must be retained for at least seven days. Otherwise, DTS may fail to obtain the oplogs and the task may fail. In extreme circumstances, data inconsistency or loss may occur. After full data migration is complete, you can set the retention period to more than 24 hours. Make sure that you set the retention period of oplogs in accordance with the preceding requirements. Otherwise, the Service Level Agreement (SLA) of DTS does not guarantee service reliability or performance.

  • If you select collections as the objects to be migrated and you want to edit the collections in the destination database, such as renaming the collections, up to 1,000 collections can be migrated in a single data migration task. If you run a task to migrate more than 1,000 collections, a request error occurs. In this case, we recommend that you split the collections to be migrated, configure multiple tasks to migrate the collections, or configure a task to migrate the entire database.

  • The admin or local database cannot be used as the source or destination database.

  • The number of mongos nodes in the source self-managed MongoDB database cannot exceed 10.
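
The oplog retention rules above depend on the selected migration types. The following sketch encodes them as a check. The function name isRetentionSufficient and the shape of its options are hypothetical. In mongosh, db.getReplicationInfo().timeDiffHours reports the time span that the oplog currently covers, which can serve as an approximate input. Treat that mapping as an assumption rather than the value that DTS itself checks.

```javascript
// retentionHours: how long oplog entries are kept on the source, in hours.
// types: which migration types the task uses.
function isRetentionSufficient(retentionHours, { full, incremental }) {
  if (!incremental) {
    return true; // oplogs are required only for incremental data migration
  }
  if (full) {
    return retentionHours >= 7 * 24; // at least seven days
  }
  return retentionHours > 24; // more than 24 hours
}

// Incremental only: exactly 24 hours is not enough, 25 hours is.
console.log(isRetentionSufficient(24, { full: false, incremental: true })); // false
console.log(isRetentionSufficient(25, { full: false, incremental: true })); // true

// Full plus incremental: seven days (168 hours) is the minimum.
console.log(isRetentionSufficient(168, { full: true, incremental: true })); // true
```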

Other limits

  • If you want to configure the DTS task after you purchase it, make sure to specify the correct number of shards during the purchase.

  • Before you start the task, you must add the shard key that corresponds to the destination shards to the data to be migrated in the source database. After the task starts, each INSERT operation on the data to be migrated must include the shard key, and UPDATE operations cannot modify the shard key.

  • Make sure that the destination ApsaraDB for MongoDB instance does not contain documents that have the same primary key values as documents in the source database. The default primary key is _id. Otherwise, data may be lost. If such documents exist in the destination instance, delete them from the destination instance in a way that does not affect your business.

  • Transaction information is not retained. When transactions are migrated to the destination database, they are converted into a single record.

  • During a data migration task, ApsaraDB for MongoDB sharded cluster instances involved in the task cannot be scaled. Otherwise, the task fails.

  • To query the number of documents in a collection of the destination ApsaraDB for MongoDB database, you must use the db.$table_name.aggregate([{ $count:"myCount"}]) syntax.

  • The data is concurrently written to the destination database. Therefore, the storage space occupied in the destination database is 5% to 10% larger than the size of the data in the source database.

  • Before you migrate data, evaluate the impact of data migration on the performance of the source instance and destination cluster. We recommend that you migrate data during off-peak hours. During full data migration, DTS uses read and write resources of the source and destination databases. This may increase the loads on the database servers.

  • During full data migration, concurrent INSERT operations cause fragmentation in the collections of the destination database. After full data migration is complete, the storage space for collections of the destination database is larger than that of the source database.

  • DTS attempts to resume data migration tasks that failed within the last seven days. Before you switch workloads to the destination instance, stop or release the failed tasks. You can also execute the REVOKE statement to revoke the write permissions from the accounts that are used by DTS to access the destination instance. Otherwise, the data in the source database overwrites the data in the destination database after the data migration task is resumed.
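
The shard key rules above for INSERT and UPDATE operations can be checked in application code before a write is sent to the source database. The following is a minimal sketch under stated assumptions: the shard key consists of top-level fields, updates use operator documents such as { $set: { ... } }, and the field name customerId is hypothetical. Replacement documents and pipeline-style updates are not handled.

```javascript
// True if the document to insert contains every shard key field.
function insertHasShardKey(doc, shardKeyFields) {
  return shardKeyFields.every((field) => doc[field] !== undefined);
}

// True if an operator-style update (for example { $set: {...} })
// touches a shard key field or one of its subfields.
function updateTouchesShardKey(update, shardKeyFields) {
  return Object.values(update).some(
    (fields) =>
      fields !== null &&
      typeof fields === "object" &&
      Object.keys(fields).some((path) =>
        shardKeyFields.some((key) => path === key || path.startsWith(key + "."))
      )
  );
}

const shardKey = ["customerId"]; // hypothetical shard key

console.log(insertHasShardKey({ _id: 1, customerId: 7 }, shardKey)); // true
console.log(insertHasShardKey({ _id: 2 }, shardKey)); // false
console.log(updateTouchesShardKey({ $set: { status: "paid" } }, shardKey)); // false
console.log(updateTouchesShardKey({ $set: { customerId: 9 } }, shardKey)); // true
```

Writes for which insertHasShardKey returns false or updateTouchesShardKey returns true would violate the limits described above and should be rejected or corrected before they reach the source database.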