Type | Description |
Source and destination database limits |
- Bandwidth requirements: The server that hosts the source database must have sufficient outbound bandwidth. Otherwise, the data synchronization speed is affected.
- The collections to be synchronized must have a primary key or a UNIQUE constraint, and the constrained fields must contain unique values. Otherwise, duplicate data may appear in the destination database.
- The _id field in the collections to be synchronized must be unique. Otherwise, data inconsistency may occur.
- If you synchronize data at the collection level and need to edit the collections, for example by mapping collection names, a single data synchronization task supports a maximum of 1,000 collections. If you exceed this limit, an error is reported after you submit the task. In this case, split the collections into multiple batches and configure a separate task for each batch, or configure a task to synchronize the entire database.
- A single document to be synchronized from the source database cannot exceed 16 MB. Otherwise, the task fails.
- Azure Cosmos DB for MongoDB and Amazon DocumentDB elastic clusters are not supported as the source database.
- The source database must have Oplog enabled, and Oplog entries must be retained for at least seven days. Alternatively, enable Change Streams and ensure that Data Transmission Service (DTS) can use Change Streams to subscribe to the data changes made in the source database over the last seven days. Otherwise, the task may fail because it cannot obtain data changes from the source database. In extreme cases, data inconsistency or data loss may occur. Such issues are not covered by the DTS Service-Level Agreement (SLA).
Important: We recommend that you use Oplog to obtain data changes from the source database. Change Streams can be used to obtain data changes only in MongoDB 4.0 and later. Two-way synchronization is not supported when you use Change Streams to obtain data changes from the source database. If the source database is an Amazon DocumentDB (non-elastic) cluster, you must manually enable Change Streams. When you configure the task, set Migration Method to ChangeStream and Architecture to Sharded Cluster.
- While DTS is synchronizing data, do not scale the number of shards in a MongoDB sharded cluster in or out. Otherwise, the DTS task fails.
- If the source instance is a self-managed MongoDB database with a sharded cluster architecture: Access Method supports only Express Connect, VPN Gateway, or Smart Access Gateway and Cloud Enterprise Network (CEN). If the MongoDB version is 8.0 or later and Migration Method is Oplog, the shard account used by the synchronization task must have the directShardOperations permission. To grant the permission, run the db.adminCommand({ grantRolesToUser: "username", roles: [{ role: "directShardOperations", db: "admin"}]}) command.
Note: In the command, replace username with the shard account used by the synchronization task.
- The number of Mongos nodes in the source MongoDB sharded cluster instance cannot exceed 10.
- If a collection to be synchronized contains a Time To Live (TTL) index, data inconsistency or instance latency may occur.
- Ensure that there are no orphaned documents in the source and destination instances. Otherwise, data inconsistency or even task failure may occur. For more information, see Orphaned Document and How to purge orphaned documents from a MongoDB sharded cluster instance.
- Source database operation limits:
  - During schema synchronization and full data synchronization, do not change the schema of databases or collections, including updating data of the array type. Otherwise, the data synchronization task may fail, or data inconsistency may occur between the source and destination databases.
  - If you perform only full data synchronization, do not write new data to the source instance. Otherwise, data inconsistency occurs between the source and destination databases.
  - While the synchronization instance is running, do not run commands that change the data distribution of the objects to be synchronized in the source database, such as shardCollection, reshardCollection, unshardCollection, moveCollection, and movePrimary. Otherwise, data inconsistency may occur.
- If the Balancer of the source database is balancing data, instance latency may occur.
- Connecting to a MongoDB database by using an SRV record is not supported.
- If the source database runs MongoDB 5.0 or later and the destination database runs a version earlier than 5.0, capped collections cannot be synchronized. Attempting to synchronize them may cause the task to fail or lead to data inconsistency between the source and destination databases. Starting from MongoDB 5.0, capped collections allow operations such as explicit deletion and updates that increase document size, and earlier versions of the database kernel are not compatible with these features.
|
Other limits |
- Before the task starts, add the corresponding sharding key of the destination to the data to be synchronized in the source. After the task starts, data written with the INSERT command must include the sharding key, and the UPDATE command cannot change the sharding key.
- We recommend that you keep the MongoDB versions of the source and destination databases the same, or synchronize data from an earlier version to a later version to ensure compatibility. If you synchronize data from a later version to an earlier version, database compatibility issues may occur.
- Data synchronization for the admin, config, and local databases is not supported.
- Transaction information is not retained. When transactions from the source database are synchronized to the destination database, they are converted into individual records.
- When DTS writes data to the destination collection, if a primary key or unique key conflict occurs, DTS skips the corresponding write statement and retains the existing data in the destination collection.
- If the source is a MongoDB database of a version earlier than 3.6 and the destination is a MongoDB database of version 3.6 or later, the order of fields in the data may be inconsistent after synchronization, although the field-value mappings remain consistent. This is due to differences in the execution plans of the database engines. If your business logic involves text match queries on nested structures, assess the potential impact of this inconsistency on your business.
- Before you synchronize data, evaluate the performance of the source and destination databases. Initial full data synchronization consumes read and write resources on both the source and destination databases, which may increase the database load, so we recommend that you synchronize data during off-peak hours.
- Initial full data synchronization runs concurrent INSERT operations, which causes fragmentation in the destination database collections. As a result, the collection space in the destination instance is larger than that in the source instance after initial full data synchronization is complete.
- If the destination collection has a unique index or its capped property is set to true, concurrent replay is not supported for the collection during incremental synchronization. Only single-threaded writes are supported, which may increase task latency.
- Because DTS writes data concurrently, the storage space used by the destination is 5% to 10% larger than that of the source.
- To query the count of documents in the destination MongoDB database, use the db.$table_name.aggregate([{ $count:"myCount"}]) syntax.
- Ensure that the destination MongoDB database does not contain documents with the same primary key (_id by default) as documents in the source. Otherwise, data loss occurs. If such documents exist, delete them from the destination (that is, delete the documents whose _id matches a document in the source) without affecting your business.
- During full data synchronization, you must disable the Balancer of the source MongoDB database until each subtask enters the incremental synchronization phase. Otherwise, data inconsistency may occur. For more information about how to manage the Balancer, see Manage the MongoDB Balancer.
- If you do not need the schema synchronization feature provided by DTS (for example, if data sharding is already configured on the destination side), do not select Schema Synchronization under Synchronization Types on the Configure Objects page. Otherwise, data inconsistency or task failure may occur due to sharding conflicts.
- After you switch your business to the destination MongoDB database, you must ensure that your business operations comply with the requirements for sharded collections in that MongoDB database.
- Time series collections, which were introduced in MongoDB 5.0, cannot be synchronized.
- If an instance fails, the DTS helpdesk tries to recover the instance within 8 hours. During the recovery process, operations such as restarting the instance and adjusting parameters may be performed.
Note: When parameters are adjusted, only the parameters of the DTS instance are modified. The parameters of the database are not modified. The parameters that may be modified include but are not limited to those described in Modify instance parameters.
|
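
Two of the limits in the table above lend themselves to a quick check before a task is configured: the 1,000-collection cap per task when collections are edited, and the seven-day Oplog retention requirement. The following is a minimal sketch in plain JavaScript, not part of DTS. The function and constant names (splitIntoBatches, hasSufficientOplogWindow, MAX_COLLECTIONS_PER_TASK, MIN_OPLOG_RETENTION_HOURS) are hypothetical and chosen only for illustration.

```javascript
// Hypothetical pre-flight helpers. These names are illustrative only;
// they are not provided by DTS or by MongoDB.

// Limit from the table: at most 1,000 collections per task when collections are edited.
const MAX_COLLECTIONS_PER_TASK = 1000;

// Limit from the table: Oplog entries must be retained for at least seven days.
const MIN_OPLOG_RETENTION_HOURS = 7 * 24;

// Split a list of collection names into batches that each fit in one task.
function splitIntoBatches(collections, batchSize = MAX_COLLECTIONS_PER_TASK) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < collections.length; i += batchSize) {
    batches.push(collections.slice(i, i + batchSize));
  }
  return batches;
}

// Return true if the reported oplog window (in hours) covers at least seven days.
function hasSufficientOplogWindow(timeDiffHours) {
  return timeDiffHours >= MIN_OPLOG_RETENTION_HOURS;
}
```

Each batch returned by splitIntoBatches would correspond to one separately configured synchronization task. For the retention check, the oplog window in hours is typically available in mongosh as db.getReplicationInfo().timeDiffHours on a replica set member. Treat that as an assumption to verify against your MongoDB version, and note that it reports the window currently covered by the oplog, not a configured retention setting.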