Introduction to the Usage and Principles of MongoDB Sharded Cluster

By Zhucha

Basic Architecture of the Sharded Cluster

Why Use a Sharded Cluster?

Problems encountered by ReplicaSet

ReplicaSet helps address issues such as read request scaling and high availability. However, as business workloads grow, the following issues may arise.

The data volume exceeds the disk capacity of a single machine.
The active data set exceeds the memory capacity of a single instance, so many read requests must be served from disk.
The number of write operations exceeds the maximum input/output operations per second (IOPS) of a single instance.

Scale Up Compared to Scale Out

Scale up increases the number of CPU cores, memory, bandwidth, and other resources by using more powerful servers.
Scale out distributes the workload across multiple machines.

What Is MongoDB Sharded Cluster?

A MongoDB sharded cluster is a scale-out solution.
MongoDB uses sharded clusters to support large datasets and business scenarios with high throughput.

Basic Architecture of the Sharded Cluster

Mongos
- The access entry point of the sharded cluster
- Routes and distributes requests, and merges the results
- Multiple mongos instances are deployed to ensure high availability
ConfigServer
- Stores metadata and cluster configuration
- Deployed as a ReplicaSet to ensure high availability
Shard
- Stores user data, with different shards storing different user data
- Deployed as a ReplicaSet to ensure high availability

How to Connect to a Sharded Cluster

With a sharded cluster, drivers connect to mongos instances to interact with the entire cluster. mongos then sends requests to the appropriate shards at the backend based on the request from the client. For example, to read and write collection one, mongos interacts with shard A and shard B. To read and write collection two, mongos interacts with shard A only.

The following figure shows a sharded cluster instance created on Alibaba Cloud ApsaraDB for MongoDB. The connection address of each mongos instance is listed, together with the assembled ConnectionStringURI. Connecting through a single mongos instance introduces a single point of failure, so accessing the cluster through the ConnectionStringURI is recommended.

The ConnectionStringURI consists of the following parts:

mongodb://[username:password@]host1[:port1][,host2[:port2],...[,hostN[:portN]]][/[database][?options]]

mongodb://: The prefix, indicating a connection string URI.
username:password@: The username and password used to connect to the MongoDB instance, separated by a colon.
hostX:portX: The connection address and port number of the instance.
/database: The authentication database, that is, the database to which the database account belongs.
?options: Additional connection options.

The following is an example.

Example: mongodb://user:password@mongos1:3717,mongos2:3717/admin

In the above ConnectionStringURI, the username is user and the password is password. The connection is made to mongos1 and mongos2, both on port 3717, and the authentication database is admin.
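The format described above can be assembled programmatically. The following is a minimal sketch in plain JavaScript; the helper name buildMongoUri and its parameter names are hypothetical and not part of any MongoDB driver. Percent-encoding the credentials is an added precaution for passwords that contain characters such as "@" or ":".

```javascript
// Hypothetical helper: assembles a connection string URI from its parts.
function buildMongoUri({ username, password, hosts, database = "", options = {} }) {
  // Percent-encode credentials so characters like "@" or ":" do not break the URI.
  const auth = username
    ? `${encodeURIComponent(username)}:${encodeURIComponent(password)}@`
    : "";
  // List every mongos so the driver is not tied to a single instance.
  const hostList = hosts.map(({ host, port }) => `${host}:${port}`).join(",");
  const query = new URLSearchParams(options).toString();
  return `mongodb://${auth}${hostList}/${database}${query ? `?${query}` : ""}`;
}

const uri = buildMongoUri({
  username: "user",
  password: "password",
  hosts: [
    { host: "mongos1", port: 3717 },
    { host: "mongos2", port: 3717 },
  ],
  database: "admin",
});
console.log(uri);
```

Listing both mongos addresses in one URI is what removes the single point of failure mentioned above.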

Primary Shard Definition

By default, the collections of a database are not sharded and are stored on a fixed shard, which is called the primary shard.

Primary Shard Selection

When a new database is created, the system selects the shard that currently stores the least data as the primary shard of the new database.

How to Split a Collection

ApsaraDB for MongoDB supports collection-based data sharding. A sharded collection is split into multiple parts that are stored on different shards.

sh.enableSharding("<database>")
// eg: "records"

Example: sh.enableSharding("records")

sh.shardCollection("<database>.<collection>", { <key> : <direction>, ... } )

<key>: The name of the shard key field.
<direction>: {1 | "hashed"}. 1: Ranged shard key. "hashed": Hashed shard key.

The following is an example.

Example: sh.shardCollection("records.people", { zipcode: 1 } ) shards the records.people collection, using ranged sharding based on zipcode.

Shard Key

Ranged Sharding Compared to Hashed Sharding

Ranged sharding: The data is divided according to the value of the shard key.
Advantage: It can easily meet the requirements of range queries.
Disadvantage: If shard key values are written monotonically, the write capacity cannot be scaled out.
Ranged sharding supports multiple fields: {x : 1} {x : 1 , y : 1}

The preceding figure shows x-based ranged sharding. The data is divided into four parts, and the cut points are x:-75, x:25, and x:175. Data with similar values is stored adjacently, which suits range queries well. However, if data is written monotonically based on the shard key, the write capacity cannot be scaled out because all writes are handled by the last chunk.

Hashed sharding: The hash value is calculated from the shard key, and the data is sharded based on the hash value.
Advantage: Even if shard key values are written monotonically, the write capacity can be fully scaled out.
Disadvantage: Range queries cannot be performed efficiently.

As shown in the above figure, x:25, x:26, and x:27 are scattered into different chunks after hash calculation. Because hashing breaks up the ordering of adjacent values, the write capacity can be fully scaled out even when the shard key is written monotonically, but range queries are inefficient.

Hashed sharding supports only a single hashed field:

{ x : "hashed" } {x : 1 , y : "hashed"} // 4.4 new

Starting in version 4.4, ApsaraDB for MongoDB can combine a single hashed field with one or more ranged shard key fields, for example { x : 1 , y : "hashed" }, where x is ranged and y is hashed.
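The difference between the two strategies under monotonic writes can be illustrated with a small simulation. This is only a sketch: the cut points come from the ranged-sharding figure, while toyHash is a stand-in multiplicative hash chosen for illustration and is not the function MongoDB uses for hashed indexes.

```javascript
// Cut points from the ranged-sharding figure: four chunks, indexed 0..3.
const cutPoints = [-75, 25, 175];

// Ranged sharding: the chunk index is the number of cut points at or below x.
function rangedChunk(x) {
  return cutPoints.filter((p) => x >= p).length;
}

// Toy hash for illustration only; MongoDB's hashed index uses a different function.
function toyHash(x) {
  return Math.imul(x, 2654435761) >>> 0; // unsigned 32-bit result
}

// Hashed sharding: split the 32-bit hash space into four equal chunks.
function hashedChunk(x) {
  return Math.floor(toyHash(x) / 2 ** 30);
}

// Count how many keys land in each of the four chunks.
function distribution(keys, chunkOf) {
  const counts = [0, 0, 0, 0];
  for (const k of keys) counts[chunkOf(k)] += 1;
  return counts;
}

// Monotonically increasing keys, such as timestamps or auto-increment IDs.
const keys = Array.from({ length: 100 }, (_, i) => 1000 + i);
console.log("ranged:", distribution(keys, rangedChunk));
console.log("hashed:", distribution(keys, hashedChunk));
```

With ranged sharding every key falls into the last chunk, whereas the hashed variant spreads the same keys over all four chunks.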

How to Choose a Reasonable Shard Key

Cardinality: The larger the better
- If gender is the shard key, the data can be split into at most 2 parts.
- If month is the shard key, the data can be split into at most 12 parts.

Frequency, that is, how often a given value occurs in the documents: The lower the better. For example, if the current city is the shard key of a collection that records the national population, most of the data is concentrated in the chunks where the first-tier cities are located.

Monotonically changing keys: Use hashed sharding.

For example, if log generation time is the shard key of a collection of logs and ranged sharding is used, data is written only to the last shard.
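Cardinality and frequency can be estimated from a sample of documents before a shard key is chosen. The sketch below is illustrative only: the function name shardKeyStats and the sample documents are made up for this example.

```javascript
// Returns the number of distinct values of a field (cardinality) and the
// share of documents held by its most common value (maximum frequency).
function shardKeyStats(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const value = doc[field];
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  const maxCount = Math.max(...counts.values());
  return { cardinality: counts.size, maxFrequency: maxCount / docs.length };
}

// Made-up sample of a population collection.
const people = [
  { id: 1, gender: "F", city: "Beijing" },
  { id: 2, gender: "M", city: "Beijing" },
  { id: 3, gender: "F", city: "Beijing" },
  { id: 4, gender: "M", city: "Lhasa" },
];

for (const field of ["gender", "city", "id"]) {
  console.log(field, shardKeyStats(people, field));
}
```

A candidate with higher cardinality and lower maximum frequency can be split into more chunks of more even size.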

Shard Key Constraints

The shard key must be backed by an index. For a non-empty collection, the index must be created before shardCollection is run. For an empty collection, shardCollection creates the index automatically.

Before version 4.4:

The size of a shard key cannot exceed 512 bytes;
Only single-field hashed shard keys are supported;
Every document must contain the shard key;
The fields in a shard key cannot be modified.

Version 4.4 and later:

The size of a shard key has no limit;
Compound hashed shard keys are supported;
A document may omit shard key fields, which are treated as null during insertion;
The refineCollectionShardKey command is added, which modifies the fields contained in the shard key by appending suffix fields.

Before version 4.2, the value of a shard key cannot be modified. In version 4.2 and later, the value of a shard key can be modified unless the shard key field is the immutable _id field.

RefineCollectionShardKey

This command is new in version 4.4. It modifies the shard key by adding one or more suffix fields to it:

db.adminCommand( { 
refineCollectionShardKey: "<database>.<collection>", 
key: { <existing key specification>, 
<suffix1>: <1|"hashed">, ... } 
} )

The parameters in the example above are described as follows.

<existing key specification>: The current shard key. The new shard key must be prefixed with the current shard key.
<suffix1>: The new shard key field.
<1|"hashed">: 1 specifies a ranged shard key; "hashed" specifies a hashed shard key.

Notes on RefineCollectionShardKey

The index corresponding to the new shard key must have been created before execution.
refineCollectionShardKey only modifies the metadata on the config server and does not migrate any data. The data is subsequently redistributed by normal chunk splitting and migration.
Version 4.4 supports missing shard key fields, which are treated as null, so not every document needs to contain all fields of the new shard key.
Version 4.4 supports compound hashed shard keys, while earlier versions support single-field hashed shard keys only.
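The prefix rule can be checked on the client side before the command is issued. The following sketch covers only the two rules stated above (the current key must be a prefix of the new key, and each suffix is 1 or "hashed"); the function name isValidRefinement is hypothetical, and the server may enforce further checks that are not modelled here.

```javascript
// Checks that newKey extends currentKey with at least one suffix field.
// Relies on JavaScript preserving insertion order for string object keys.
function isValidRefinement(currentKey, newKey) {
  const current = Object.entries(currentKey);
  const proposed = Object.entries(newKey);
  // At least one suffix field must be added.
  if (proposed.length <= current.length) return false;
  // Every existing field must appear first, in the same order, unchanged.
  const prefixMatches = current.every(
    ([field, kind], i) => proposed[i][0] === field && proposed[i][1] === kind
  );
  // Each suffix field must be ranged (1) or hashed.
  const suffixValid = proposed
    .slice(current.length)
    .every(([, kind]) => kind === 1 || kind === "hashed");
  return prefixMatches && suffixValid;
}

console.log(isValidRefinement({ zipcode: 1 }, { zipcode: 1, name: 1 }));
console.log(isValidRefinement({ zipcode: 1 }, { name: 1, zipcode: 1 }));
```

The first call extends the existing key and passes; the second reorders the fields, so the current key is no longer a prefix and the check fails.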

Targeted Operations Compared to Broadcast Operations

mongos instances forward requests in one of two ways, depending on the shard key information in the request: targeted operations and broadcast operations.

Targeted operations: mongos calculates the target shard or shards based on the shard key, sends the request to them, and returns the result.
These include operations such as query, update, delete, and insert that contain the shard key.

As shown in the figure above, a is the shard key. If the request contains the field a, the mongos instance can identify the target shard. If the target is shard B, mongos interacts directly with shard B, gets the result, and returns it to the client.

Broadcast operations: mongos sends the request to all shards, merges the query results, and returns them to the client.
These include operations such as query, update, or delete that do not contain the shard key, for example, operations that specify only the _id field when _id is not the shard key.

The following figure shows the broadcast operation procedure.
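The routing decision can be sketched as follows. The two-shard layout, the locateShard function, and the route function are hypothetical simplifications; a real mongos consults chunk metadata from the ConfigServer and also handles range predicates, which this sketch does not.

```javascript
// Hypothetical cluster: two shards, with shard key "a" split at the value 100.
const allShards = ["shardA", "shardB"];
const locateShard = (a) => (a < 100 ? "shardA" : "shardB");

// Decide where a request would be sent, given a filter that is a plain object
// with equality conditions only.
function route(filter, shardKey = "a") {
  if (Object.prototype.hasOwnProperty.call(filter, shardKey)) {
    // The shard key is present: only the owning shard is contacted.
    return { type: "targeted", shards: [locateShard(filter[shardKey])] };
  }
  // No shard key: every shard must be asked and the results merged.
  return { type: "broadcast", shards: [...allShards] };
}

console.log(route({ a: 150 }));
console.log(route({ name: "x" }));
```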

Chunk and Balancer

What Is the Chunk?

ApsaraDB for MongoDB splits a collection into multiple data subsets based on the shard key. Each subset is called a chunk;
The data of a sharded collection is divided by shard key over the key space from MinKey to MaxKey;
Each chunk has its own interval, which is left-closed and right-open;
Each shard that stores a sharded collection holds one or more chunks of the collection.

The preceding figure shows x-based ranged sharding. The data is divided into four chunks with left-closed, right-open intervals: Chunk1: [minKey, -75); Chunk2: [-75, 25); Chunk3: [25, 175); Chunk4: [175, maxKey). ShardA holds Chunk1 and Chunk2, while ShardB and ShardC hold Chunk3 and Chunk4 respectively.
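The left-closed, right-open intervals from the figure can be modelled directly. In this sketch, -Infinity and Infinity stand in for minKey and maxKey, which is an assumption made for illustration; the function name findChunk is also hypothetical.

```javascript
// Chunk layout from the figure; -Infinity and Infinity stand in for minKey and maxKey.
const chunks = [
  { name: "Chunk1", min: -Infinity, max: -75, shard: "ShardA" },
  { name: "Chunk2", min: -75, max: 25, shard: "ShardA" },
  { name: "Chunk3", min: 25, max: 175, shard: "ShardB" },
  { name: "Chunk4", min: 175, max: Infinity, shard: "ShardC" },
];

// A value belongs to the chunk whose interval [min, max) contains it.
function findChunk(x) {
  return chunks.find((c) => x >= c.min && x < c.max);
}

// A boundary value belongs to the chunk on its right.
console.log(findChunk(25).name, findChunk(25).shard);
console.log(findChunk(24).name, findChunk(24).shard);
```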

Chunk Splits

Chunk split definition

As data is written, when a chunk grows to the specified size, 64 MB by default, ApsaraDB for MongoDB splits the chunk.

Chunk split methods

Manual trigger

sh.splitAt(namespace, query)
sh.splitFind(namespace, query)

Automatic trigger: An automatic chunk split can be triggered only by operations such as insert and update. When the chunk size setting is reduced, existing chunks are not split immediately.

JumboChunk: In the extreme case, a chunk contains only one unique shard key value. Such a chunk cannot be split further.

This is shown in the following figure.

Chunk Split Management

Chunk split management includes manual chunk splits and chunk size adjustment.

Manual chunk splits

Scenario: The business needs to insert a massive amount of data into a collection whose data is distributed across only a small number of chunks.

If the data is inserted directly, the writes cannot be spread concurrently across multiple shards, and the chunks are split only after insertion. The splits then trigger chunk migration, which generates a large amount of inefficient I/O.

sh.splitAt(namespace, query): Splits a chunk at the specified split point.

Example: x was [0, 100), sh.splitAt(ns, {x: 70}). After splitting, x becomes [0, 70), [70, 100).

sh.splitFind(namespace, query): Splits the chunk that contains the matching document at its middle point.

Example: x was [0, 100), sh.splitFind(ns, {x: 70}). After splitting, x becomes [0, 50), [50, 100).
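The difference between the two split methods can be simulated on an in-memory chunk. This is a sketch under simplifying assumptions: a chunk is modelled as { min, max, keys }, the functions share only their names with the shell helpers, and splitFind here receives the chunk directly instead of locating it with a query.

```javascript
// Split a chunk [min, max) at an explicit point, like sh.splitAt.
function splitAt(chunk, point) {
  if (point <= chunk.min || point >= chunk.max) {
    throw new RangeError("split point must lie strictly inside the chunk");
  }
  return [
    { min: chunk.min, max: point, keys: chunk.keys.filter((k) => k < point) },
    { min: point, max: chunk.max, keys: chunk.keys.filter((k) => k >= point) },
  ];
}

// Split a chunk at the median of the keys it holds, like sh.splitFind.
function splitFind(chunk) {
  const sorted = [...chunk.keys].sort((a, b) => a - b);
  const median = sorted[Math.floor(sorted.length / 2)];
  return splitAt(chunk, median);
}

// A chunk covering x in [0, 100) with one document per integer value.
const chunk = { min: 0, max: 100, keys: Array.from({ length: 100 }, (_, i) => i) };

const describe = (parts) => parts.map((p) => `[${p.min}, ${p.max})`).join(" ");
console.log(describe(splitAt(chunk, 70)));
console.log(describe(splitFind(chunk)));
```

If every key in the chunk has the same value equal to the chunk minimum, the median is not strictly inside the chunk and the sketch throws, which mirrors a chunk that cannot be split.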

Adjust the ChunkSize

Example: use config; db.settings.save( { _id:"chunksize", value: <sizeInMB> } );

The chunk size is adjusted by adding a document to the settings collection of the config database. The _id of this document is chunksize.

Notes:

Chunk splits are triggered only by insert and update operations. Reducing the chunk size does not immediately split all existing chunks to the new size;
The chunk size value ranges from 1 MB to 1024 MB;
Reducing the chunk size allows chunks to be distributed more evenly, but increases the number of chunk migrations;
Increasing the chunk size reduces the number of chunk migrations, but may result in an uneven distribution of data.

Chunk Migration

Chunk migration definition

To ensure data load balancing, ApsaraDB for MongoDB supports data migration between shards, which is called chunk migration.

Chunk migration methods:
- Automatic trigger: When chunks are unevenly distributed across shards, the balancer process automatically migrates chunks;
- Manual trigger: sh.moveChunk(namespace, query, destination)
- Example: sh.moveChunk("records.people", { zipcode: "53187" }, "shard0019").
Chunk migration impact:
- Disk usage on the shards is affected;
- System performance is affected by the increase in network bandwidth usage and system load.
Chunk migration constraints:
- Each shard can take part in only one chunk migration at a time;
- A chunk is not migrated if its number of documents exceeds 1.3 times the average number of documents per chunk. // Version 4.4 provides an option for this.

Balancer

The balancer is a background process of ApsaraDB for MongoDB that keeps the chunks of each collection balanced across all shards.
The balancer runs on the primary node of the ConfigServer and is enabled by default.
When the chunks in a sharded cluster are unbalanced, the balancer migrates chunks from the shard with the most chunks to the shard with the fewest chunks.

Chunk quantity	Migration threshold
Less than 20	2
20-79	4
Greater than or equal to 80	8

As shown in the table above, when the number of chunks is less than 20, the migration threshold is 2. As the number of chunks increases, the migration threshold rises to 4 and then to 8.
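The threshold table translates into a small decision function. The sketch below assumes that the threshold is compared against the difference in chunk count between the fullest and the emptiest shard, and that balancing starts once that difference reaches the threshold; both the comparison and the function names are assumptions made for illustration.

```javascript
// Migration threshold from the table, keyed by the total number of chunks.
function migrationThreshold(totalChunks) {
  if (totalChunks < 20) return 2;
  if (totalChunks < 80) return 4;
  return 8;
}

// chunksPerShard maps a shard name to the number of chunks it holds.
function needsBalancing(chunksPerShard) {
  const counts = Object.values(chunksPerShard);
  const total = counts.reduce((sum, n) => sum + n, 0);
  const gap = Math.max(...counts) - Math.min(...counts);
  // Assumption: balancing starts once the gap reaches the threshold.
  return gap >= migrationThreshold(total);
}

console.log(needsBalancing({ shardA: 6, shardB: 4 }));
console.log(needsBalancing({ shardA: 42, shardB: 40 }));
```

The same gap of 2 triggers balancing for a small collection but not for one with 82 chunks, where the threshold is 8.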

AutoSplit and Balancer Management Commands

Enable automatic chunk splitting: sh.enableAutoSplit().
Disable automatic chunk splitting: sh.disableAutoSplit().
Check whether the balancer is enabled: sh.getBalancerState().
Check whether the balancer is running: sh.isBalancerRunning().
Enable the balancer: sh.startBalancer() / sh.setBalancerState(true).
Starting in version 4.2, this also enables AutoSplit.
Disable the balancer: sh.stopBalancer() / sh.setBalancerState(false).
Starting in version 4.2, this also disables AutoSplit.
Enable automatic migration for a collection: sh.enableBalancing(namespace).
Disable automatic migration for a collection: sh.disableBalancing(namespace).
Modify the balancer window:

use config;
db.settings.update(
  { _id: "balancer" },
  { $set: { activeWindow : { start : "<start-time>", stop : "<stop-time>" } } },
  { upsert: true }
);
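The effect of an active window can be sketched as a time check. The example assumes that start and stop are given as "HH:MM" strings, that the stop time is exclusive, and that a window whose start is later than its stop wraps past midnight; the function names are hypothetical and the balancer's own logic may differ in detail.

```javascript
// Convert "HH:MM" into minutes since midnight.
function toMinutes(hhmm) {
  const [hours, minutes] = hhmm.split(":").map(Number);
  return hours * 60 + minutes;
}

// Returns true if `now` falls inside the window [start, stop).
function inActiveWindow(now, start, stop) {
  const t = toMinutes(now);
  const s = toMinutes(start);
  const e = toMinutes(stop);
  if (s <= e) return t >= s && t < e; // window within a single day
  return t >= s || t < e; // window wraps past midnight
}

console.log(inActiveWindow("03:00", "02:00", "06:00"));
console.log(inActiveWindow("23:30", "23:00", "06:00"));
console.log(inActiveWindow("12:00", "23:00", "06:00"));
```

Restricting the window to off-peak hours keeps the bandwidth and load cost of chunk migration away from business traffic.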

JumboChunk

JumboChunk definition: In the extreme case, a chunk contains only one unique shard key value and cannot be split any further.

JumboChunk generation: A JumboChunk is generated only when the shard key is not properly selected. If such chunks are frequently accessed, a single shard becomes the performance bottleneck. In addition, these chunks cannot be migrated, which leaves the data unbalanced between shards.

These problems are gradually being solved as ApsaraDB for MongoDB versions iterate. For example, version 4.4 provides the refineCollectionShardKey command, which allows the shard key to be refined. This version also provides some balancer settings and some moveChunk options to support the migration of such chunks.

In newer releases of 4.2 and 4.0, a command is also provided to clear the jumbo flag on chunks in the sharded cluster.

Sharded Cluster Management

Command Review

Balancer
- sh.setBalancerState(state)
- true : sh.startBalancer()
- false : sh.stopBalancer()
- sh.getBalancerState()
- sh.isBalancerRunning()
- sh.disableBalancing(namespace)
- sh.enableBalancing(namespace)
Chunk
- sh.disableAutoSplit()
- sh.enableAutoSplit()
- sh.moveChunk( … )
- sh.splitAt( … )
- sh.splitFind( … )
Sharding
- sh.shardCollection()
- sh.enableSharding()

Cluster Status Check - sh.status()

The information is shown in the figure.

Sharding version: The version of the sharded cluster instance.

Shards: There are currently two shards in the sharded cluster, each with a name, connection information, and current status.

Most recently active mongos: There are currently two mongos instances of version 4.2.1 in the sharded cluster.

Autosplit and Balancer

Auto-Split is currently enabled.
The balancer is enabled.
The balancer is not currently running.
The balancer's successes and failures over the recent period are shown.

records Database

Primary shard: xxxx746b04
Sharding is enabled.
Related version information is displayed.

records.people Collection

The shard key is { "zipcode" : "hashed" } without a unique constraint.
The balancer can balance this collection.
The collection has four chunks, which are evenly distributed across two shards.
The range of each chunk and the shard that holds the chunk are listed.

LogicalSession

Since version 3.6, the ApsaraDB for MongoDB driver has associated all operations with a LogicalSession.

For versions 3.4 and earlier, as shown in the following figure:

For version 3.6 and later, as shown in the following figure:

LogicalSession ID

{
// The unique identifier. Generated either by the client or by the server
"id" : UUID("32415755-a156-4d1c-9b14-3c372a15abaf"),
// Current login user ID
"uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=")
}

Self-cleaning mechanism
- Persistent storage: config.system.sessions. TTL index: 30 minutes by default
- Synchronization is performed every 5 minutes by default. When a session is cleaned up, the cursors opened on the session are closed at the same time.
Operation methods
- use config; db.system.sessions.aggregate( [ { $listSessions: { allUsers: true } } ] )
- db.runCommand( { killSessions: [ { id : <UUID> }, ... ] } )
- startSession / refreshSessions / endSessions ...
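The 30-minute TTL can be pictured as a simple idle-time check. This is only a sketch: the session object and its lastUse field are assumed here for illustration, and the real cleanup is performed by the server through the TTL index rather than by client code.

```javascript
const SESSION_TTL_MINUTES = 30; // default TTL stated above

// Returns true if the session has been idle for longer than the TTL.
function isExpired(session, now = new Date()) {
  const idleMs = now.getTime() - session.lastUse.getTime();
  return idleMs > SESSION_TTL_MINUTES * 60 * 1000;
}

// Illustrative session record; lastUse is an assumed field name.
const session = {
  id: "32415755-a156-4d1c-9b14-3c372a15abaf",
  lastUse: new Date("2024-01-01T10:00:00Z"),
};

console.log(isExpired(session, new Date("2024-01-01T10:20:00Z")));
console.log(isExpired(session, new Date("2024-01-01T10:45:00Z")));
```

Refreshing a session, for example through refreshSessions, corresponds to moving lastUse forward so that the session stays within the TTL.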
