MongoDB Space Usage Issues


The space usage of Alibaba Cloud MongoDB is a very important monitoring metric. If the storage space of an instance fills up completely, the instance becomes unavailable. As a general rule, once the storage usage of a MongoDB instance exceeds 80-85%, it should be handled promptly, either by reducing the actual size of the data or by expanding the storage space, to avoid the risk of running out of space.
However, analyzing space usage in Alibaba Cloud MongoDB is not simple. This article will help you view, analyze, and optimize it.

1 View space usage

1.1 Replica Set Mode
When the deployment architecture is a replica set, there are several ways to view space usage, from a high-level overview to a detailed breakdown, depending on your needs.
1.2 General overview
The "Basic Information" page of the MongoDB console displays the total storage usage of the instance, but only the current overall usage rate. It gives no breakdown of the space occupied by each type of data and no historical usage information, as shown below:

1.3 Monitoring graph analysis
A MongoDB replica set consists of multiple roles, and a role may correspond to one or more physical nodes. Alibaba Cloud MongoDB exposes the Primary and Secondary nodes to users and also provides read-only instances. Click "Monitoring Information" and select a role to view its space-related monitoring data, as shown in the following figure:
The space usage of a MongoDB physical node consists of two parts, data_size and log_size. data_size is the space used on the data disk (excluding the local database). It mainly includes the physical data files whose names start with collection, the physical index files whose names start with index, and a few physical metadata files such as WiredTiger.wt.
log_size includes the physical size of the local database, the MongoDB runtime logs, and a small amount of audit logs.
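To get a rough feel for this split from the shell, the sketch below separates the local database from everything else. It assumes its input is the document returned by db.adminCommand({listDatabases: 1}), which lists each database with a sizeOnDisk in bytes. The function name is hypothetical, and the result only approximates the console metrics: runtime and audit logs are not visible through this command.

```javascript
// Split a listDatabases result into business data and the local database.
// Input: the document returned by db.adminCommand({ listDatabases: 1 }).
// Output sizes are in bytes. This approximates data_size / log_size only:
// runtime logs and audit logs are not reported by listDatabases.
function splitDataAndLocal(listDatabasesResult) {
  let dataBytes = 0;
  let localBytes = 0;
  for (const d of listDatabasesResult.databases) {
    if (d.name === "local") {
      localBytes += d.sizeOnDisk;
    } else {
      dataBytes += d.sizeOnDisk;
    }
  }
  return { dataBytes, localBytes };
}
```

In the shell this could be called as splitDataAndLocal(db.adminCommand({listDatabases: 1})) on the node you want to inspect.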
1.4 Detailed Analysis
If you need to analyze the detailed space usage of a database or table in the replica set, in addition to the built-in db.stats() and db.$collection_name.stats() commands, we recommend "CloudDBA - Space Analysis" in the Alibaba Cloud MongoDB console. With it, you can:
- View an overview of database and table space, the average daily growth, and the predicted number of days until space runs out.
- Check the table space usage of databases with abnormal usage.
- View the detailed space usage of business tables, including logical index and data size, compression ratio, average row length, and more.
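For the built-in stats() command mentioned above, a small helper can turn the raw output into the same kind of figures (logical size, on-disk size, compression ratio, average document size). This is a minimal sketch that assumes the input is the document returned by db.$collection_name.stats() with sizes in bytes; the function name and output field names are illustrative.

```javascript
// Summarize a collStats document (as returned by db.<collection>.stats()).
// size           = uncompressed (logical) data size in bytes
// storageSize    = bytes allocated on disk for the data file
// totalIndexSize = bytes allocated on disk for all indexes
function summarizeCollStats(stats) {
  const mb = (bytes) => bytes / (1024 * 1024);
  return {
    documents: stats.count,
    avgObjBytes: stats.count > 0 ? stats.size / stats.count : 0,
    logicalDataMB: mb(stats.size),
    dataOnDiskMB: mb(stats.storageSize),
    indexOnDiskMB: mb(stats.totalIndexSize),
    // Apparent ratio only: storageSize also counts free space inside the file.
    compressionRatio: stats.storageSize > 0 ? stats.size / stats.storageSize : null,
  };
}
```

Because storageSize includes unused space inside the file, a heavily fragmented collection will show a lower apparent compression ratio than its data really has.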
For more on CloudDBA space analysis, see the Alibaba Cloud documentation.
For more detailed explanations of the analysis commands officially provided by MongoDB, see the official MongoDB documentation.
2 Sharded cluster
In a sharded cluster deployment, a cluster may contain multiple shards, so there is no concept of overall space usage. The console therefore does not show space usage under "Basic Information".

2.1 Monitoring graph analysis

The monitoring information of Alibaba Cloud MongoDB provides detailed space usage for each component of the cluster: the mongos routing nodes, the config server nodes, and the shard nodes. In general, however, mongos and config server nodes do not become space bottlenecks. It is recommended to ignore them and look directly at the space usage of each role in each shard, as shown in the following figure:
2.2 Detailed analysis
Viewing detailed space usage in a sharded cluster differs slightly from replica set mode. You need to log in to each shard one by one and check it with commands. On each shard, the details of space usage are exactly the same as for a replica set.
Beyond the usage rate itself, a very common space problem in sharded clusters is uneven space usage across shards, which is hard to analyze and is not currently supported by CloudDBA. The following chapters focus on understanding and analyzing in depth both "unbalanced space usage across different shards" and "unbalanced space usage across roles within the same replica set".

3 Analysis and solutions for difficult space problems

3.1 Summary analysis and general solution of space problems
After receiving a disk space alert, the general approach to analysis and resolution is as follows:
1. Confirm the current space usage and the detailed usage of each database and table.
2. Identify the main source of growth, such as log growth or a write surge on a specific business table.
3. Confirm whether the growth is expected, and analyze the application for large write volumes that are not expected.
4. Check whether the data is heavily fragmented and whether space can be reclaimed by defragmenting.
5. Decide whether to expand disk space, or to deploy scheduled deletion of historical data or a TTL index.
6. After a large amount of historical data has been deleted, reclaim the fragmented space with compact or by rebuilding the replica set nodes.
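For step 4, one way to estimate fragmentation is to read the WiredTiger block manager statistics that stats() returns for a collection. The sketch below assumes the WiredTiger storage engine and takes the stats document as input; the function name is hypothetical, and the threshold at which compact is worth running is a judgment call that this code does not make.

```javascript
// Estimate how much of a collection's data file is free space that
// WiredTiger could reuse. Input: db.<collection>.stats() on WiredTiger.
function reusableSpace(stats) {
  const blockManager = stats.wiredTiger["block-manager"];
  const fileBytes = blockManager["file size in bytes"];
  const reusableBytes = blockManager["file bytes available for reuse"];
  return {
    fileBytes,
    reusableBytes,
    // Share of the file that is currently holes left by deleted data.
    reusableRatio: fileBytes > 0 ? reusableBytes / fileBytes : 0,
  };
}
```

A high reusableRatio after a bulk delete suggests that compact may shrink the file, while a low ratio suggests the space is occupied by live data and expansion or data cleanup is the more realistic option.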
3.2 The compact method and its impact on the instance
When a collection is compacted, an exclusive write lock is taken on the database that contains it, which blocks all read and write requests on that database. Because compact can run for a long time, depending on the data volume of the collection, it is strongly recommended to run it during off-peak hours to avoid affecting the business.
The compact method itself is very simple. It is recommended to run it on the Secondary first and then switch the Primary and Secondary, which reduces the impact on the business during compaction. The command is:
db.runCommand({compact: "collectionName"})
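When several collections need compacting on the same node, the command can be wrapped in a small loop. The sketch below is illustrative only: the function name is hypothetical, and it assumes the database argument behaves like the shell's db object, whose runCommand returns a document with an ok field (some shell versions throw on failure instead).

```javascript
// Run compact on each named collection of one database, one at a time.
// "database" is expected to be a shell database object such as db or
// db.getSiblingDB("mydb"); collection names are supplied by the caller.
function compactCollections(database, collectionNames) {
  const results = [];
  for (const name of collectionNames) {
    const res = database.runCommand({ compact: name });
    results.push({ collection: name, ok: res.ok === 1 });
  }
  return results;
}
```

Following the recommendation above, this would be run while connected directly to a Secondary, for example compactCollections(db.getSiblingDB("mydb"), ["orders", "events"]) with placeholder names, followed by a Primary/Secondary switch before repeating on the former Primary.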
In addition, from official MongoDB 4.4 onward, the compact command no longer blocks business reads and writes. For more information on the usage and limitations of the compact command, see the official MongoDB documentation.
3.3 compact is invalid
As mentioned above, after a large number of remove operations, compact is needed to reclaim the fragmented space. In extreme scenarios, however, compact reports success but the disk space is not actually reclaimed. This can be regarded as a bug or a design limitation of compact in MongoDB: the existence of fragments does not guarantee that they can be reclaimed. compact does not allocate new space, store the data there, and replace the original file. Instead, it keeps moving data toward holes nearer the front of the file, so in some scenarios the internal algorithm cannot guarantee that the existing holes will be reused. If compact fails in this way and space usage is still a concern, you can resolve it by rebuilding the replica node.
In another scenario, in versions prior to MongoDB 3.4, compact may have a further bug: after a large amount of data is deleted, compact takes effect only on data files and cannot reclaim index files. You can confirm this with the command db.$table_name.stats().indexSizes or by looking at the physical index file sizes directly. In this case, it is recommended to upgrade the kernel version to 3.4 or higher.
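To compare index sizes before and after compact, the indexSizes document can be sorted so the largest indexes stand out. A minimal sketch, assuming the input is the object returned by db.$table_name.stats().indexSizes (index name mapped to bytes); the function name is hypothetical.

```javascript
// Return the largest indexes from an indexSizes document,
// e.g. { "_id_": 4096, "userId_1": 81920 }, biggest first.
function largestIndexes(indexSizes, limit = 5) {
  return Object.entries(indexSizes)
    .map(([name, bytes]) => ({ name, bytes }))
    .sort((a, b) => b.bytes - a.bytes)
    .slice(0, limit);
}
```

If the reported sizes are unchanged after a compact that followed a large delete, the index files were probably not reclaimed.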
3.4 Oversized journal logs cause a large space gap between Primary and Secondary
In extreme cases, the journal log may trigger a bug that makes its space grow without limit. The MongoDB runtime log then contains entries like the following:
2019-08-25T09:45:16.867+0800 I NETWORK [thread1] Listener: accept() returns -1 Too many open files in system
2019-08-25T09:45:17.000+0800 I - [ftdc] Assertion: 13538:couldn't open [/proc/55692/stat] Too many open files in system src/mongo/util/processinfo_linux.cpp 74
2019-08-25T09:45:17.002+0800 W FTDC [ftdc] Uncaught exception in 'Location13538: couldn't open [/proc/55692/stat] Too many open files in system' in full-time diagnostic data capture subsystem. Shutting down the full-time diagnostic data capture subsystem.
This bug is triggered when the number of open files on the host reaches its upper limit, which interrupts the log cleanup thread inside MongoDB. Official versions before 4.0 have this problem. If you encounter it, you can upgrade the kernel version to 4.0 or higher, or work around it temporarily by restarting the mongod process. For details, see the corresponding bug report.
3.5 Secondary lag and incremental backup may cause Secondary log space to keep growing
In official MongoDB, the oplog is a capped collection by default, so its size is essentially fixed and the physical file size differs little between Primary and Secondary. Because frequent oplog expiration used to leave nodes in the RECOVERING state, Alibaba Cloud MongoDB developed an adaptive oplog feature. In extreme scenarios with Primary/Secondary lag, the usable oplog size is no longer limited by the fixed collection size defined in the configuration file and can theoretically reach 20% of the disk capacity purchased by the user. This creates a problem: once the Secondary recovers from the lag, the physical space previously occupied by the oplog is not released.
In addition, for the sake of backup and recovery efficiency, Alibaba Cloud MongoDB performs physical backups on the Hidden node. The backup involves a large number of checkpoints, which occupy additional data and log space.
In the scenarios above, if the extra space is not particularly large, it is usually fine to ignore it. You can also compact the oplog separately if needed. All write operations are blocked during the compaction. The method is as follows:

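// Run in the database where the root user is defined (normally admin)
db.grantRolesToUser("root", [{db: "local", role: "dbAdmin"}])
use local
// oplog.rs is the replica set oplog collection in the local database
db.runCommand({compact: "oplog.rs", force: true})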
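Before deciding whether compacting the oplog is worthwhile, it can help to compare its configured size with what it actually occupies. The sketch below assumes its input is the stats document of the oplog collection, for example db.getSiblingDB("local").oplog.rs.stats(), on the WiredTiger engine. The function name is hypothetical, and how the adaptive oplog of Alibaba Cloud MongoDB reports these fields is an assumption rather than something confirmed here.

```javascript
// Summarize collStats for local.oplog.rs. All sizes are in bytes.
// maxSize     = configured cap of the capped collection
// size        = current logical (uncompressed) oplog size
// storageSize = bytes the oplog file occupies on disk
function oplogFootprint(oplogStats) {
  const blockManager = oplogStats.wiredTiger["block-manager"];
  return {
    configuredBytes: oplogStats.maxSize,
    logicalBytes: oplogStats.size,
    onDiskBytes: oplogStats.storageSize,
    // Free space inside the file that compact might be able to release.
    reusableBytes: blockManager["file bytes available for reuse"],
  };
}
```

If reusableBytes is small compared with onDiskBytes, compact is unlikely to free much, and ignoring the difference as suggested above is the simpler choice.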

4 Unbalanced space usage among different shards

4.1 Unsuitable shard key type
In a sharded cluster, the choice of shard key type is very important. Two types are generally used: hashed sharding and ranged sharding. In terms of disk balance, hashed sharding is usually much better than ranged sharding, because MongoDB applies an internal hash function to the key values and so distributes the data evenly across the shards, as shown in the following figure:

Ranged sharding generally distributes data by key range, so it often leads to the following: newly inserted data all lands on one hot chunk, which not only drives up disk usage on the shard holding that chunk but also leaves the data temporarily unbalanced. As shown in the figure below, all data is written to the shard holding chunk C. When chunk C fills up, a new chunk is split off on the same shard and later migrated by the cluster Balancer, but this migration is expensive. In a high-concurrency write scenario, balancing may not keep up with the write rate, resulting in uneven data volume across shards. The ranged sharding strategy is not recommended for this usage pattern.
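The difference can be illustrated with a toy simulation. The sketch below is not how MongoDB places data: the hash is a simple integer mixer chosen for illustration (MongoDB uses its own hash function for hashed shard keys), chunks are reduced to one fixed range per shard, and all names and numbers are made up. It only shows why monotonically increasing keys pile up on one shard under ranged placement but spread out under hashed placement.

```javascript
// Toy 32-bit integer mixer, for illustration only.
// It is NOT the hash function MongoDB uses for hashed shard keys.
function toyHash(n) {
  let x = n >>> 0;
  x = Math.imul(x ^ (x >>> 16), 0x45d9f3b);
  x = Math.imul(x ^ (x >>> 16), 0x45d9f3b);
  return (x ^ (x >>> 16)) >>> 0;
}

// Ranged placement: shard i owns keys in [i * rangeWidth, (i + 1) * rangeWidth),
// and the last shard also owns every key above that.
function rangeShard(key, shardCount, rangeWidth) {
  return Math.min(Math.floor(key / rangeWidth), shardCount - 1);
}

// Hashed placement: the hash of the key, not the key itself, picks the shard.
function hashedShard(key, shardCount) {
  return toyHash(key) % shardCount;
}

// Count how many keys land on each shard under a placement function.
function distribute(keys, shardCount, place) {
  const counts = new Array(shardCount).fill(0);
  for (const k of keys) counts[place(k)] += 1;
  return counts;
}

// 3000 new writes with increasing keys, all beyond the existing ranges.
const newKeys = Array.from({ length: 3000 }, (_, i) => 1000000 + i);
const byRange = distribute(newKeys, 3, (k) => rangeShard(k, 3, 1000));
const byHash = distribute(newKeys, 3, (k) => hashedShard(k, 3));
console.log("ranged:", byRange); // every write lands on the last shard
console.log("hashed:", byHash);  // writes are spread over all shards
```

In a real cluster the Balancer would eventually move chunks off the hot shard, which is exactly the costly catch-up work described above.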

For more on shard key types, see the official MongoDB documentation.
4.2 Unsuitable shard key field
With sh.status() you can see that the number of chunks on each shard is roughly the same, but most of the data actually sits in only some of the chunks, so the shards holding these hot chunks store far more data than the others. The MongoDB runtime log shows clear warnings:
2019-08-27T13:31:22.076+0800 W SHARDING [conn12681919] possible low cardinality key detected in superHotItemPool.haodanku_all - key is { batch: "201908260000" }
2019-08-27T13:31:22.076+0800 W SHARDING [conn12681919] possible low cardinality key detected in superHotItemPool.haodanku_all - key is { batch: "201908260200" }
2019-08-27T13:31:22.076+0800 W SHARDING [conn12681919] possible low cardinality key detected in superHotItemPool.haodanku_all - key is { batch: "201908260230" }
The main goal of load balancing in a sharded cluster is to keep the number of chunks on each shard equal, at which point the data is considered balanced. That is how the extreme scenario above arises: the chunk counts are equal, but the actual data is seriously skewed. Because the shard key values within a chunk are almost identical, when the 64 MB chunk split threshold is reached, the split produces an empty chunk. Over time the number of chunks grows and chunk migrations complete, but the chunks that were migrated are actually empty, so the chunk counts are balanced while the data is not. (Presumably the Balancer considers cost when migrating chunks, regards empty chunks as cheaper to migrate, and therefore prefers them.)
For more on chunk splitting, see the official MongoDB documentation.
In this situation, the only option is to redesign the architecture and choose a suitable, highly selective field as the shard key.
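Since chunk counts can look balanced while data is not, it helps to compare actual bytes per shard. The sketch below assumes its input is the document returned by db.$collection_name.stats() when run through mongos against a sharded collection, which contains a shards sub-document with per-shard size and count. The function name and output shape are hypothetical.

```javascript
// Compare the data volume of a sharded collection across shards.
// Input: collection stats obtained through mongos, containing a "shards"
// sub-document keyed by shard name with per-shard size (bytes) and count.
function shardSkew(collStats) {
  const perShard = Object.entries(collStats.shards).map(([shard, s]) => ({
    shard,
    dataBytes: s.size,
    documents: s.count,
  }));
  const sizes = perShard.map((p) => p.dataBytes);
  const largest = Math.max(...sizes);
  const smallest = Math.min(...sizes);
  return {
    perShard,
    // Ratio between the fullest and emptiest shard; 1 means perfectly even.
    skewRatio: smallest > 0 ? largest / smallest : Infinity,
  };
}
```

A skewRatio far above 1 together with even chunk counts in sh.status() points to the low-cardinality shard key problem described in this section.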
4.3 Some databases are not sharded
A MongoDB sharded cluster allows some databases to be sharded and others not. This inevitably means that the data of an unsharded database lives on a single shard. If that database is large, the shard holding it may store far more data than the other shards.
Besides that, this problem commonly arises when importing from a source sharded cluster into a new sharded cluster and one step of the logical import is skipped: setting up the sharding design on the target cluster. A logical import does not create the sharding design automatically, so this step must be done in advance.
For this problem, we recommend:
- If the target cluster is being imported for the first time, set up the sharding design before importing.
- If there are many unsharded databases of roughly the same size, use the movePrimary command provided by MongoDB to migrate a given database to a given shard.
- If a single unsharded database holds a large amount of data, either design sharding for it or split it out and run it as a standalone replica set.
- Even if this imbalance occurs, it can be ignored as long as the total disk space is sufficient.
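To see which databases sit on which shard before deciding what to move, the entries of the config.databases collection can be grouped by primary shard. This is a minimal sketch that assumes its input is an array of those documents (each has an _id with the database name and a primary with the shard name), for example from db.getSiblingDB("config").databases.find().toArray() on mongos. The function name is hypothetical.

```javascript
// Group databases by their primary shard.
// Input: documents from config.databases, e.g.
//   { _id: "reports", primary: "shard0000" }
function databasesByPrimary(configDatabases) {
  const byShard = {};
  for (const d of configDatabases) {
    if (!byShard[d.primary]) byShard[d.primary] = [];
    byShard[d.primary].push(d._id);
  }
  return byShard;
}
```

A database can then be moved with db.adminCommand({movePrimary: "reports", to: "shard0001"}), where "reports" and "shard0001" are placeholder names for the database and the target shard.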
4.4 Large-scale movechunk operations may cause uneven disk usage across shards
In essence, movechunk writes the data to the target shard and then removes it from the source. By default the remove does not release space: in the WiredTiger engine each table has its own data file and index files, and unless a file is deleted, the disk space it occupies is not returned. The operation most likely to cause this problem is enabling sharding on a cluster that has already run unsharded for some time.
In principle, the space fragmentation caused by movechunk is the same as that caused by large-scale deletes. So after a large number of movechunk or remove operations, compact can be used to reclaim the fragments. Normally, compact reorganizes the data and then reclaims the fragmented space in the files.
For more on movechunk, see the official MongoDB documentation.
5 Alibaba Cloud MongoDB product planning on space usage optimization
5.1 Separation of compute and storage
Currently, the maximum storage space supported by a single physical node of Alibaba Cloud MongoDB is 3 TB. If the data volume of a node in standalone or replica set mode exceeds 3 TB, the business must be split or upgraded to a sharded cluster, and because scaling the specification involves data migration, it takes a long time.
Alibaba Cloud MongoDB will soon support cloud disks in replica set deployments to separate compute from storage. With cloud disk storage, the data storage limit of a single node will reach tens of terabytes, and the cloud-native architecture will allow expansion within minutes.
5.2 ECS snapshot backup
In newer versions of Alibaba Cloud MongoDB, multi-threaded oplog replay is more mature, so the Secondary essentially no longer lags. Alibaba Cloud MongoDB therefore no longer customizes the oplog at the kernel level and keeps it as a capped collection, consistent with official MongoDB. The oplog growth problem caused by Secondary lag can be solved by upgrading to the latest kernel version.
In addition, Alibaba Cloud MongoDB is developing backups of MongoDB instances based on ECS snapshot technology. When a fault occurs, this bypasses the time-consuming download of the physical backup from OSS and enables recovery within minutes. With ECS snapshot backup, the growth of Hidden node space that the traditional physical backup method can cause will also be resolved.
5.3 Kernel-level optimization of the compact command
The compact command of MongoDB 4.4 does not block reads and writes. Alibaba Cloud MongoDB backports this patch at the kernel level to ApsaraDB for MongoDB 3.4 and later, and offers space defragmentation as a product feature that can be triggered with a click in the DAS console.
