
MongoDB sharding migration (3)

Posted time: Oct 25, 2016 9:27 AM

If you are not familiar with how a MongoDB sharded cluster works, please read the following articles first:
Principle of MongoDB sharded cluster architecture
What You Should Know about MongoDB Sharding
The sharding migration will be introduced in three parts. This article is the third part.
1. Load balancing and migration policies
2. Chunk migration process
3. Balancer O&M
In the previous two articles, I introduced the MongoDB sharding migration policies and the steps of chunk migration. In this article, I will focus on how to manage the balancer so that it serves the business better.
Disable the balancer
Scenarios that require disabling the balancer include:
• Before a sharded cluster is backed up, the balancer should be disabled so that the backed-up data on the shards and the metadata on the config servers stay consistent.
• The balancer should be disabled when chunk migration would otherwise affect online services.
To view the current state of the balancer

To disable the balancer

To enable the balancer

Note: All commands mentioned in this article are executed on a mongos of the sharded cluster.
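For reference, the mongo shell provides sh helpers for viewing and switching the balancer state. The following is a minimal sketch using those standard helpers; it is not necessarily the exact set of commands the author used.

```javascript
// Run in a mongo shell connected to a mongos.

// View the current state: true means the balancer is enabled.
sh.getBalancerState()

// Check whether a balancing round is in progress right now.
sh.isBalancerRunning()

// Disable the balancer.
sh.stopBalancer()

// Enable the balancer again.
sh.startBalancer()
```

sh.setBalancerState(false) and sh.setBalancerState(true) can be used in place of the stop and start helpers.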
Disable the balancer for a specific collection
By default, the balancer performs load balancing for every sharded collection. If you don't want the balancer to automatically migrate data for a specific collection, you can disable balancing for that collection only.
To disable the balancer for the students.grades collection

To enable the balancer for the students.grades collection
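Per-collection balancing can be switched with the sh.disableBalancing and sh.enableBalancing helpers of the mongo shell. The snippet below is a minimal sketch using the students.grades namespace from the text; the final lookup of the noBalance flag is added here only as one way to check the result.

```javascript
// Run in a mongo shell connected to a mongos.

// Stop the balancer from migrating chunks of students.grades.
sh.disableBalancing("students.grades")

// Allow the balancer to migrate chunks of students.grades again.
sh.enableBalancing("students.grades")

// Check the flag: noBalance is true while balancing is disabled for the collection.
db.getSiblingDB("config").collections.findOne({ _id: "students.grades" }).noBalance
```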

Set the balancer time window
To keep chunk migration from affecting the business, you can restrict the balancer to a specified time window that avoids peak business hours. The following commands set the balancer to work only between 2 a.m. and 6 a.m. (the start and stop times are given in HH:MM format).
use config
db.settings.update(
   { _id: "balancer" },
   { $set: { activeWindow : { start : "02:00", stop : "06:00" } } },
   { upsert: true }
)
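To make the window semantics concrete, the sketch below checks whether a given time of day falls inside a daily HH:MM window, including windows that cross midnight (for example 23:00 to 06:00). It is an illustration only, not MongoDB's implementation; the function names toMinutes and inActiveWindow, and the choice to treat the stop time as exclusive, are assumptions.

```javascript
// Illustrative only: evaluates a daily HH:MM window.
// This is NOT MongoDB's balancer code.

// Convert "HH:MM" to minutes since midnight; reject anything else.
function toMinutes(hhmm) {
  const match = /^([01]\d|2[0-3]):([0-5]\d)$/.exec(hhmm);
  if (!match) {
    throw new Error("expected HH:MM, got " + hhmm);
  }
  return Number(match[1]) * 60 + Number(match[2]);
}

// Return true if `now` lies in [start, stop). Assumption: stop is exclusive.
function inActiveWindow(now, start, stop) {
  const n = toMinutes(now);
  const s = toMinutes(start);
  const e = toMinutes(stop);
  if (s <= e) {
    // Window within a single day, e.g. 02:00 to 06:00.
    return n >= s && n < e;
  }
  // Window crossing midnight, e.g. 23:00 to 06:00.
  return n >= s || n < e;
}
```

For example, inActiveWindow("03:30", "02:00", "06:00") is true, while inActiveWindow("12:00", "23:00", "06:00") is false.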

Set migration options
The moveChunk command lets users customize how data is migrated: through the writeConcern option, you can specify the durability level of the data written to the target shard, which is a trade-off between reliability and migration efficiency.
You can modify the _secondaryThrottle and writeConcern parameters, which are used in combination. If _secondaryThrottle is set to true, the writeConcern option specifies the write policy used during migration; if _secondaryThrottle is set to false, writeConcern is not applied and the migration behaves as with {w: 1}. The following commands set the writeConcern option to {w: "majority"}.
use config
db.settings.update(
   { "_id" : "balancer" },
   { $set : { "_secondaryThrottle" : true ,
              "writeConcern": { "w": "majority" } } },
   { upsert : true }
)

If nothing is configured, the default applies, that is, {w: 2}: the data must be written to at least two nodes of the target shard (if the target shard has only a single node, this falls back to {w: 1}).
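For a manual migration, the same two options can be passed directly to the moveChunk command. This is a minimal sketch, assuming the students.grades collection is sharded on a hypothetical student_id field and that a target shard with the hypothetical name shard0001 exists:

```javascript
// Run in a mongo shell connected to a mongos.
// "student_id", the value 1000, and "shard0001" are hypothetical placeholders.
db.adminCommand({
  moveChunk: "students.grades",     // namespace of the sharded collection
  find: { student_id: 1000 },       // selects the chunk containing this shard key value
  to: "shard0001",                  // target shard
  _secondaryThrottle: true,         // wait for replication during migration
  writeConcern: { w: "majority" }   // only meaningful when _secondaryThrottle is true
})
```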
After a migration completes, the source shard removes the migrated chunk. By default, the source shard adds the deletion task to a background queue and deletes the chunk data asynchronously, so the balancer can start the next chunk migration right away. You can set _waitForDelete to true (false by default) so that the source shard finishes deleting the chunk data before the next chunk migration starts.
use config
db.settings.update(
   { "_id" : "balancer" },
   { $set : { "_waitForDelete" : true } },
   { upsert : true }
)

Set chunkSize
The default chunkSize of MongoDB sharding is 64MB. The default settings are okay in most scenarios. But you may need to modify the chunkSize configuration in some cases. For details, refer to the Jumbo Chunk and Chunk Size section in What You Should Know about MongoDB Sharding. I will not repeat it here.
The following commands modify the chunkSize to 100MB.
use config
db.settings.save( { _id:"chunksize", value: 100 } )

1. If you reduce the chunkSize, it takes some time for the existing chunks to be split down to the newly configured size. (A jumbo chunk cannot be split.)
2. If you increase the chunkSize, the existing small chunks won't be merged automatically. Chunks only grow gradually toward the new size through subsequent insert and update operations.
3. The chunkSize value must be in the range [1MB, 1024MB].
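The range rule in point 3 can be checked before the setting is written. The following is a small sketch, assuming a hypothetical helper named chunkSizeSetting that builds the document to store in config.settings; the helper is not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): validates a chunk size in MB
// and returns the document to store in config.settings.
function chunkSizeSetting(sizeMB) {
  if (!Number.isInteger(sizeMB) || sizeMB < 1 || sizeMB > 1024) {
    throw new RangeError("chunkSize must be an integer between 1 and 1024 (MB)");
  }
  return { _id: "chunksize", value: sizeMB };
}
```

In the mongo shell, after use config, the result could be passed to the same call used above, for example db.settings.save(chunkSizeSetting(100)).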