Type | Description |
Source and destination database limits |
- Bandwidth requirements: The server that hosts the source database must have sufficient outbound bandwidth. Otherwise, the data synchronization speed is affected.
- The collections to be synchronized must have a primary key or a UNIQUE constraint, and the fields must be unique. Otherwise, duplicate data may appear in the destination database.
- The `_id` field in the collections to be synchronized must be unique. Otherwise, data inconsistency may occur.
- If you synchronize data at the collection level and need to edit objects, such as mapping collection names, a single synchronization task supports a maximum of 1,000 collections. If you exceed this limit, an error is reported after you submit the task. In this case, split the collections into multiple synchronization tasks or configure a task to synchronize the entire database.
- A single piece of data to be synchronized from the source database cannot exceed 16 MB. Otherwise, the task fails.
- Azure Cosmos DB for MongoDB and Amazon DocumentDB elastic clusters are not supported as the source database.
- The source database must have Oplog enabled, and the Oplog must be retained for at least seven days. Alternatively, enable Change Streams and ensure that Data Transmission Service (DTS) can subscribe to data changes in the source database from the last seven days using Change Streams. Otherwise, the task may fail because it cannot obtain data changes from the source database. In extreme cases, data inconsistency or data loss may occur. Issues caused by this are not covered by the DTS Service-Level Agreement (SLA).
  Important:
  - Where possible, obtain data changes from the source database using Oplog.
  - Only MongoDB 4.0 and later support obtaining data changes using Change Streams.
  - Two-way synchronization is not supported when you use Change Streams to obtain data changes from the source database.
  - If the source database is a non-elastic cluster of Amazon DocumentDB, you must manually enable Change Streams. When you configure the task, set Migration Method to ChangeStream and set Architecture to Sharded Cluster.
- During synchronization, if you use the Oplog method for incremental synchronization, do not add or remove shards in the source MongoDB sharded cluster. Otherwise, the DTS task fails and data becomes inconsistent.
- If the source instance is a self-managed MongoDB sharded cluster:
  - Access Method supports only two options: "Express Connect, VPN Gateway, or Smart Access Gateway" and "Cloud Enterprise Network (CEN)".
  - If the MongoDB version is 8.0 or later and the Migration Method is Oplog, ensure that the shard account used by the synchronization task has the directShardOperations permission. You can grant the permission by running the following command: `db.adminCommand({ grantRolesToUser: "username", roles: [{ role: "directShardOperations", db: "admin"}]})`
    Note: In the command, replace `username` with the shard account used by the synchronization task.
- The number of mongos nodes in the source MongoDB sharded cluster instance cannot exceed 10.
- If a collection to be synchronized contains a Time To Live (TTL) index, data inconsistency or instance latency may occur.
- Ensure that there are no orphaned documents in the source and destination instances. Otherwise, data inconsistency or even task failure may occur. For more information, see Orphaned Document and How to clear orphaned documents from a MongoDB sharded cluster.
- Source database operation limits:
  - During schema synchronization and initial full data synchronization, do not perform schema changes on databases or collections, including updating data of the array type. Otherwise, the data synchronization task fails, or data becomes inconsistent between the source and destination databases.
  - If you perform only initial full data synchronization, do not write new data to the source instance. Otherwise, data becomes inconsistent between the source and destination databases.
  - While the synchronization instance is running, do not run commands that change data distribution on the objects to be synchronized in the source database. Examples include `shardCollection`, `reshardCollection`, `unshardCollection`, `moveCollection`, and `movePrimary`. Otherwise, data inconsistency may occur.
- If the balancer of the source database is balancing data, instance latency may occur.
- Connecting to a MongoDB database using an SRV record is not supported.
- If the source database is MongoDB 5.0 or later and the destination database is earlier than 5.0, synchronization of capped collections is not supported, because it can cause task failure or data inconsistency between the source and destination databases. The behavior of capped collections changed starting from MongoDB 5.0: the new behavior allows explicit deletion and allows document size to increase during updates. Earlier database kernels are not compatible with these new features.
|
Other limits |
- Before the task starts, add a sharding key to the source data that corresponds to the sharding key in the destination. After the task starts, the data to be synchronized must include the sharding key when you use the INSERT command. You cannot change the sharding key when you use the UPDATE command.
- Keep the database versions of the source and destination MongoDB instances the same, or synchronize data from an earlier version to a later version to ensure compatibility. If you synchronize data from a later version to an earlier version, compatibility issues may occur.
- Synchronization of data in the admin, config, and local databases is not supported.
- Transaction information is not retained. Transactions in the source database are converted into individual records when synchronized to the destination database.
- If a primary key or unique key conflict occurs when DTS writes data to the destination collection, DTS skips the corresponding write statement and retains the existing data in the destination collection.
- If the source is a MongoDB instance earlier than version 3.6 and the destination is a MongoDB instance of version 3.6 or later, the order of fields in the data may be inconsistent after synchronization. This is due to differences in the execution plans of the database engines. The field-value mappings remain consistent. If your business logic involves text matching queries on nested structures, evaluate the potential impact of the inconsistent field order.
- Before you synchronize data, evaluate the performance of the source and destination databases. Initial full data synchronization consumes read and write resources on both the source and destination databases, which may increase the database load, so we recommend that you synchronize data during off-peak hours.
- Initial full data synchronization runs INSERT operations concurrently, which causes fragmentation in the destination collections. As a result, the collection space in the destination instance is larger than that in the source instance after initialization.
- If the destination collection has a unique index or its capped property is set to true, concurrent replay is not supported for the collection during incremental synchronization. Only single-threaded writes are supported, which may increase task latency.
- Because DTS writes data concurrently, the storage space used by the destination is 5% to 10% larger than that of the source.
- To query the count of documents in the destination MongoDB instance, use the `db.$table_name.aggregate([{ $count:"myCount"}])` syntax.
- Ensure that the destination MongoDB instance does not have the same primary keys (`_id` by default) as the source instance. Otherwise, data loss occurs. If the destination has the same primary keys, clear the relevant data from the destination instance without affecting your business, that is, delete the documents in the destination that have the same `_id` as documents in the source.
- During initial full data synchronization, you must disable the balancer of the source MongoDB database until each subtask enters the incremental synchronization phase. Otherwise, data inconsistency may occur. For information about balancer operations, see Manage the MongoDB balancer.
- If you do not need to use the schema synchronization feature provided by DTS, for example, if data sharding is already configured on the destination, do not select Schema Synchronization for Synchronization Types on the Configure Objects page. Otherwise, sharding conflicts may cause data inconsistency or task failure.
- After you switch your business to the destination MongoDB database, you must ensure that your business operations comply with the requirements for sharded collections of that MongoDB database.
- Synchronization of time series collections introduced in MongoDB 5.0 and later is not supported.
- If an instance fails, DTS helpdesk will try to recover the instance within 8 hours. During the recovery process, operations such as restarting the instance or adjusting its parameters may be performed.
  Note: When parameters are adjusted, only the parameters of the DTS instance are modified. The parameters in the database are not modified. The parameters that may be modified include but are not limited to those described in Modify instance parameters.
|
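
Some of the limits in the table are mechanical enough to check before a task is submitted. The following is a minimal sketch of such pre-flight checks, not something DTS provides: every function name and input shape here is hypothetical, and only the thresholds (1,000 collections per task, a seven-day Oplog window, 16 MB per document, and no shared `_id` values between source and destination) are taken from the table.

```javascript
// Hypothetical pre-flight helpers. None of these names are DTS or MongoDB APIs;
// only the thresholds below come from the limits table.
const MAX_COLLECTIONS_PER_TASK = 1000;
const MIN_OPLOG_WINDOW_DAYS = 7;
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024; // 16 MB

// Split a list of collection names into groups that each fit in one task.
function splitIntoTasks(collections, limit = MAX_COLLECTIONS_PER_TASK) {
  const tasks = [];
  for (let i = 0; i < collections.length; i += limit) {
    tasks.push(collections.slice(i, i + limit));
  }
  return tasks;
}

// Length of the Oplog window in days, given the times of the oldest and
// newest Oplog entries as Date objects.
function oplogWindowDays(oldest, newest) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return (newest.getTime() - oldest.getTime()) / msPerDay;
}

// True if the Oplog covers at least the seven days required by the table.
function oplogWindowIsSufficient(oldest, newest) {
  return oplogWindowDays(oldest, newest) >= MIN_OPLOG_WINDOW_DAYS;
}

// True if a document's serialized size in bytes is over the 16 MB limit.
function exceedsDocumentLimit(sizeInBytes) {
  return sizeInBytes > MAX_DOCUMENT_BYTES;
}

// Return the source _id values that already exist in the destination.
// Values are compared by their string form so that object-like ids
// with a meaningful toString() can be compared as well.
function conflictingIds(sourceIds, destinationIds) {
  const existing = new Set(destinationIds.map(String));
  return sourceIds.filter((id) => existing.has(String(id)));
}
```

How the inputs are gathered (collection names, Oplog entry times, document sizes, and `_id` lists) is left to whatever tooling is already in place. The helpers only encode the comparisons against the documented thresholds.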