MongoDB sharding migration (2)
Created#More Posted time:Oct 24, 2016 9:47 AM
To learn about MongoDB sharded cluster, read these articles:
• Principles of MongoDB sharded cluster architecture
• What You Should Know about MongoDB Sharding
In this series of articles, we introduce sharding migration in three parts. This is the second part.
1. Load balancing and migration policies
2. Chunk migration process
3. Balancer O&M management
In "MongoDB sharding migration (1)", we learned that chunk migration can be triggered by the balancer. Users can also trigger it by sending the moveChunk command manually to mongos.
First, let's look at mongos' moveChunk command. If migration is triggered by the balancer, the logic is similar to that of moveChunk.
The moveChunk command has the following fields:
Ultimately, we need to specify [the chunk to migrate and its collection] and [the shard to which the chunk is migrated]. You can use either the find or bound field (but not both) to specify the chunk. Mongos can calculate the chunk to migrate and identify the chunk's source shard.
Mongos will next construct a moveChunk command to send to the source shard (mongos and the shard both support the moveChunk command, but have different internal implementation logic). Then, the source shard will undertake all the migration tasks, wait for migration to be complete, and send the execution result back to mongos.
Step 1: Mongos sends the moveChunk command to the source shard.
Mongos receives a move chunk command from the user or must move the chunk due to the load balancer policy. It constructs a moveChunk command and sends it to the source shard.
Step 2: The source shard notifies that target shard to start syncing chunk data.
After the source shard receives the moveChunk command from mongos, it sends the _recvChunkStart command to the target shard, notifying the target shard to start data migration (in fact, data migration is initiated by the target shard). Next, the source shard records all incremental modification operations for the chunk during the migration process.
Step 3: The target shard syncs chunk data to the local device.
After the target shard receives the _recvChunkStart command, it will start a separate thread to read the chunk data and write it to the local device. This process includes the following steps:
1. The target shard creates a collection and index (if necessary)
o If the migration collection does not have any chunks on the target shard, the collection must first be created on the target shard and an index identical with that of the collection on the source shard must also be created.
2. The target shard cleans dirty data (if necessary)
o If data in the chunk range exists on the target shard, this indicates a failed migration has produced dirty data. This data must be erased before continuing.
3. The target shard sends the _migrateClone command to the source shard, to read all documents in the chunk range and write them to the local device. This migrates all chunk data. After migration is completed, it updates its status to STEADY, indicating all data have been migrated.
4. The source shard will continuously call and query the migration status on the target shard to see if the status is STEADY. If it is STEADY, the source shard ceases write operation (by adding a mutually exclusive write lock to the collection). Then, it sends the _recvChunkCommit command to tell the target shard that it will not write new data.
5. The target shard's migration thread continuously sends the _transferMods command to the source shard to read the incremental modifications during the migration process and apply them locally. After the completion of incremental migration, the target shard confirms the _recvChunkCommit result with the source shard.
6. Finally, the source shard receives the _recvChunkCommit result, completing the migration process.
Step 4: The source shard updates config server metadata.
Upon the completion of data migration, the source shard updates the shard corresponding to the chunk to the config server and the chunk version. In this way, mongos discovers that the local version is lower and proactively reloads the metadata.
Step 5: The source shard deletes the chunk data.
After the chunk is migrated to the target shard, the source chunk no longer needs to be retained, so the source shard deletes the chunk data. Normally, the source shard adds the delete operation to the queue for asynchronous deletion. However, during the moveChunk process, if the _waitForDelete parameter is set to true, it performs synchronous deletion and then returns the result.
2016-09-28 11:35:47 update
Once the source shard queries the target shard and sees it is in the STEADY status, the source shard will go to the critical area and write operations are queuing on the source shard. After the migration completes, the queued write operations will continue to be executed. However, since the chunk version has been updated, this tells the client that its version is too low. Then, the client will read the configuration from the config server again. At this time, the chunk in the obtained route information will be on the target shard, so the write operations will be sent to the target shard.
This article gives a brief introduction to the chunk migration process. Chunk migration is affected by the balancer policy, chunkSize, and other factors. The next part of this series will introduce O&M management related to migration for better management of MongoDB sharded clusters.