Memory utilization is a key metric to monitor an ApsaraDB for MongoDB instance. This topic describes how to view the memory utilization details of an ApsaraDB for MongoDB instance and how to troubleshoot high memory utilization issues on the instance.
Background information
ApsaraDB for MongoDB processes load binary files and dependent system library files into memory. The processes also allocate and release memory for client connection management, request processing, and the WiredTiger storage engine. By default, ApsaraDB for MongoDB uses TCMalloc, a memory allocator provided by Google. Most memory is consumed by the WiredTiger storage engine, client connections, and request processing.
Access method
View memory utilization in monitoring charts
On the Monitoring Data page of an ApsaraDB for MongoDB instance in the ApsaraDB for MongoDB console, you can view the memory utilization of the instance in monitoring charts. ApsaraDB for MongoDB provides a variety of node combinations for instance architectures. You can select a node to view its memory utilization.
Replica set instances: A replica set instance consists of a primary node, one or more secondary nodes, a hidden node, and one or more optional read-only nodes.
Sharded cluster instances: A sharded cluster instance consists of one or more shard components, a ConfigServer component that stores configuration metadata, and one or more mongos components. The memory consumption of a shard component is the same as that of a replica set instance. The memory utilization of a mongos component depends on the aggregation result sets, the number of connections, and the size of metadata.
View memory utilization by running commands
To view and analyze the memory utilization of an instance, run the
db.serverStatus().mem
command in the mongo shell. For more information about how to connect to an ApsaraDB for MongoDB instance by using the mongo shell, see Instance connection. Sample result:
{ "bits" : 64, "resident" : 13116, "virtual" : 20706, "supported" : true }
// resident indicates the physical memory consumed by the current node. Unit: MB.
// virtual indicates the virtual memory consumed by the current node. Unit: MB.
For more information about the serverStatus command, see serverStatus.
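As a minimal sketch, the mem document returned above can be converted to human-readable values on the client side. The helper below is illustrative and not part of the MongoDB API; the sample document is the one shown in the output above.

```javascript
// Illustrative helper (not a MongoDB API): summarize db.serverStatus().mem.
// Values in the mem document are reported in MB.
function summarizeMem(mem) {
  return {
    residentGB: (mem.resident / 1024).toFixed(1),
    virtualGB: (mem.virtual / 1024).toFixed(1),
  };
}

// Sample document from the output above.
const mem = { bits: 64, resident: 13116, virtual: 20706, supported: true };
console.log(summarizeMem(mem)); // resident ≈ 12.8 GB, virtual ≈ 20.2 GB
```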
Common causes
High memory usage of the WiredTiger storage engine
The WiredTiger storage engine consumes the largest amount of memory of an ApsaraDB for MongoDB instance. To ensure compatibility and security, ApsaraDB for MongoDB sets the CacheSize parameter to 60% of the actual memory of an instance. For more information, see Specifications.
If the storage engine cache usage reaches 95% of the CacheSize value, the instance load is high and the threads that process user requests also evict clean pages. If dirty data exceeds 20% of the storage engine cache, user threads also evict dirty pages. In both cases, users experience noticeable request blocking. For more information, see Eviction parameters.
You can use the following methods to view the memory usage of the WiredTiger storage engine:
View the memory usage of the WiredTiger engine
Run the
db.serverStatus().wiredTiger.cache
command in the mongo shell. The value of the bytes currently in the cache parameter is the memory size. Sample result:
{
    ......
    "bytes belonging to page images in the cache":6511653424,
    "bytes belonging to the cache overflow table in the cache":65289,
    "bytes currently in the cache":8563140208,
    "bytes dirty in the cache cumulative":NumberLong("369249096605399"),
    ......
}
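As a sketch, the cache statistics above can be compared against the eviction thresholds described in this topic. The calculation below is illustrative: the cache size is a hypothetical 60% allocation for a 16 GB instance, and the dirty-data figure is an assumed value, not taken from the sample output.

```javascript
// Illustrative sketch: compare WiredTiger cache statistics against the
// eviction thresholds described in this topic.
function cacheStatus(bytesInCache, dirtyBytes, maxCacheBytes) {
  const fillRatio = bytesInCache / maxCacheBytes;  // user threads evict clean pages above 0.95
  const dirtyRatio = dirtyBytes / maxCacheBytes;   // user threads evict dirty pages above 0.20
  return {
    fillPct: Math.round(fillRatio * 100),
    dirtyPct: Math.round(dirtyRatio * 100),
    userThreadsEvicting: fillRatio > 0.95 || dirtyRatio > 0.2,
  };
}

// Assumed CacheSize: 60% of a hypothetical 16 GB instance.
const maxCacheBytes = 0.6 * 16 * 1024 * 1024 * 1024;
// "bytes currently in the cache" from the sample; dirty bytes assumed 512 MB.
console.log(cacheStatus(8563140208, 512 * 1024 * 1024, maxCacheBytes));
```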
View the percentage of dirty data in the WiredTiger cache
On the Real-time Monitoring page of an instance in the DAS console, view the percentage of dirty data in the WiredTiger cache. For more information, see Real-time performance monitoring.
Use the mongostat tool of ApsaraDB for MongoDB to view the percentage of dirty data in the WiredTiger cache. For more information, see mongostat.
High memory usage of connections and requests
If a large number of connections to the instance are established, additional memory is consumed for the following reasons:
Thread stack overheads. Each connection is served by a background thread that handles its requests. Each thread can occupy up to 1 MB of stack space, although in most cases a thread occupies only dozens to hundreds of KB.
TCP connection buffers. Each TCP connection has read and write buffers at the kernel layer. The buffer size is determined by TCP kernel parameters such as tcp_rmem and tcp_wmem, and you do not need to specify it. However, a large number of concurrent connections occupies more socket cache space and results in higher TCP memory usage.
TCMalloc memory management. Each request has a unique context, and multiple temporary buffers may be allocated for request packets, response packets, and sorting. The temporary buffers are released at the end of each request: first to the TCMalloc cache, and then gradually to the operating system. In most cases, high memory utilization occurs because TCMalloc does not promptly release the memory consumed by requests. Requests may hold up to dozens of GB of memory before it is released to the operating system.
You can use one of the following methods to troubleshoot the high memory utilization issues of connections and requests:
View the memory usage of connections
On the Monitoring Data page of an ApsaraDB for MongoDB instance in the ApsaraDB for MongoDB console, you can view the memory usage of the instance in monitoring charts.
Use the mongo shell to query the number of connections. For more information, see How do I query and limit the number of connections?
View the size of memory that TCMalloc has not released to the operating system
To query the size of memory that TCMalloc has not released to the operating system, run the
db.serverStatus().tcmalloc
command. The TCMalloc cache size is the sum of the pageheap_free_bytes and total_free_bytes values. Sample result:
{
    ......
    "tcmalloc":{
        "pageheap_free_bytes":NumberLong("3048677376"),
        "pageheap_unmapped_bytes":NumberLong("544994184"),
        "current_total_thread_cache_bytes":95717224,
        "total_free_bytes":NumberLong(1318185960),
        ......
    }
}
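The TCMalloc cache size described above can be computed from a serverStatus-style document. The sketch below is illustrative and assumes the standard TCMalloc statistic names; the sample values match the output above.

```javascript
// Illustrative sketch: compute the TCMalloc cache size
// (pageheap_free_bytes + total_free_bytes) from serverStatus-style statistics.
function tcmallocCacheBytes(tcmalloc) {
  return tcmalloc.pageheap_free_bytes + tcmalloc.total_free_bytes;
}

// Sample values from the output above.
const stats = {
  pageheap_free_bytes: 3048677376,
  pageheap_unmapped_bytes: 544994184,
  current_total_thread_cache_bytes: 95717224,
  total_free_bytes: 1318185960,
};
console.log((tcmallocCacheBytes(stats) / 1024 / 1024).toFixed(0) + " MB"); // ≈ 4165 MB
```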
Note: For more information about TCMalloc, see tcmalloc.
High memory usage of metadata
Metadata in ApsaraDB for MongoDB, such as databases, collections, and indexes, can consume a large amount of memory. The following issues may occur in instances that run earlier versions:
In an instance that runs a version earlier than MongoDB 4.0, a full logical backup may open a large number of file handles. If the file handles are not promptly returned to the operating system, memory usage rapidly increases.
In an instance that runs MongoDB 4.0 or earlier, file handles may not be released after a large number of collections are deleted. This can result in memory leaks.
High memory usage of index creation
During normal data writes, secondary nodes maintain a buffer of approximately 256 MB for data replay. When indexes are created, secondary nodes may consume more memory to replay the index creation.
For instances that run versions earlier than MongoDB 4.2, the background option is available during index creation. If you specify {background:true}, indexes are created in the background. Secondary nodes replay index creation in a serial manner, which consumes up to 500 MB of memory.
By default, the background option is deprecated for instances that run MongoDB 4.2 and later. Secondary nodes replay index creation in a parallel manner, which consumes more memory. Out-of-memory errors may occur when multiple indexes are created at the same time.
For more information, see index-build-impact-on-database-performance and index-build-process.
High memory usage of PlanCache
If a request has a large number of candidate execution plans, the PlanCache may consume a large amount of memory.
View the memory usage of PlanCache: You can run the db.serverStatus().metrics.query.planCacheTotalSizeEstimateBytes
command to view the memory usage of PlanCache in an instance that runs MongoDB 4.0 or later.
If your instance runs MongoDB 4.0 but the preceding command does not contain the relevant fields, the minor version of the instance is too low. We recommend that you update the minor version of the instance. For more information, see Update the minor version of an instance.
For more information about the memory usage of PlanCache, see Secondary node memory arise while balancer doing work.
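The check above can be sketched as a small helper that reads the field from a serverStatus-style document and guards against older minor versions that do not expose it. The helper and the sample values are illustrative, not part of the MongoDB API.

```javascript
// Illustrative sketch: read planCacheTotalSizeEstimateBytes from a
// serverStatus-style document, guarding against older minor versions
// that do not expose the field.
function planCacheSizeMB(serverStatus) {
  const bytes = serverStatus?.metrics?.query?.planCacheTotalSizeEstimateBytes;
  if (bytes === undefined) {
    return null; // field missing: consider updating the minor version
  }
  return bytes / 1024 / 1024;
}

// Hypothetical sample: a 32 MB plan cache.
console.log(planCacheSizeMB({ metrics: { query: { planCacheTotalSizeEstimateBytes: 33554432 } } })); // 32
console.log(planCacheSizeMB({ metrics: {} })); // null
```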
Optimization policies
The goal of memory optimization is to seek a balance between resource consumption and performance instead of minimizing memory usage. Ideally, the memory remains sufficient and stable and system performance is not affected.
You cannot change the value of the CacheSize parameter specified by ApsaraDB for MongoDB. You can use the following policies for memory optimization:
Control the number of concurrent connections. Based on performance test results, approximately 100 persistent connections are appropriate for a database. By default, a MongoDB driver establishes a connection pool of up to 100 connections with the backend. If you have a large number of clients, reduce the connection pool size of each client. We recommend that you establish no more than 1,000 persistent connections to a database. Otherwise, memory and multi-thread context-switching overheads increase and result in high request latency.
Reduce the memory overheads of a single request. For example, to optimize query performance, you can create indexes to minimize the need for collection scans and in-memory sorting.
Upgrade memory configurations. If the number of connections is appropriate but the memory usage continues to increase after you optimize query performance, we recommend that you upgrade the memory configurations. Otherwise, instance availability may be affected due to out-of-memory (OOM) errors and extensive cache clearing.
Accelerate the memory release of TCMalloc. If the memory utilization of your instance exceeds 80%, you can adjust TCMalloc parameters in the ApsaraDB for MongoDB console.
Preferentially enable the tcmallocAggressiveMemoryDecommit parameter. This parameter has been validated in extensive practice and is effective at resolving memory issues.
Gradually increase the value of the tcmallocReleaseRate parameter. If adjusting the tcmallocAggressiveMemoryDecommit parameter does not meet your requirements, gradually increase the value of the tcmallocReleaseRate parameter. For example, increase the value from 1 to 3, and then to 5.
Important: We recommend that you adjust the value of the tcmallocReleaseRate parameter during off-peak hours. Adjusting either of the two parameters may cause performance degradation. If your business is affected, roll back the parameter values in a timely manner.
Optimize the number of databases and collections. If your instance contains an excessive number of databases and collections, you can remove unnecessary collections and indexes, integrate data from multiple collections, split the instance, or migrate the instance data to a sharded cluster instance. For more information, see What do I do if my instance is in the stuttering state or encounters an exception due to a large number of databases and collections?
If you suspect a memory leak when you use ApsaraDB for MongoDB, contact Alibaba Cloud technical support.
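The connection-count guidance above is typically applied through the driver's pool settings. The sketch below builds a connection string with a reduced pool size; maxPoolSize is a standard MongoDB connection string option, but the host and database name are placeholders, not real endpoints.

```javascript
// Illustrative sketch: cap per-client connections via the standard
// maxPoolSize connection string option. The host below is a placeholder.
function buildUri(host, db, maxPoolSize) {
  return `mongodb://${host}/${db}?maxPoolSize=${maxPoolSize}`;
}

// With 10 application clients, a pool of 10 per client keeps the total
// around 100 persistent connections, within the recommendation above.
console.log(buildUri("example-host:3717", "admin", 10));
```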
References
Eviction parameters
Parameter | Default value | Description |
eviction_target | 80% | When the used cache size exceeds the value of the eviction_target parameter, eviction threads evict clean pages in the background. |
eviction_trigger | 95% | When the used cache size exceeds the value of the eviction_trigger parameter, user threads also evict clean pages. |
eviction_dirty_target | 5% | When the size of dirty data in the cache exceeds the value of the eviction_dirty_target parameter, eviction threads evict dirty pages in the background. |
eviction_dirty_trigger | 20% | When the size of dirty data in the cache exceeds the value of the eviction_dirty_trigger parameter, user threads also evict dirty pages. |
eviction_updates_target | 2.5% | When the percentage of updates in the cache exceeds the value of the eviction_updates_target parameter, eviction threads evict updates in the background. |
eviction_updates_trigger | 10% | When the percentage of updates in the cache exceeds the value of the eviction_updates_trigger parameter, user threads also evict updates. |
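The percentages in the table above can be converted into absolute byte thresholds for a given cache size. The sketch below is illustrative; the 10 GB cache size is a hypothetical example.

```javascript
// Illustrative sketch: convert the eviction percentages in the table above
// into absolute byte thresholds for a given cache size.
function evictionThresholds(cacheBytes) {
  return {
    eviction_target: 0.80 * cacheBytes,        // background threads evict clean pages
    eviction_trigger: 0.95 * cacheBytes,       // user threads also evict clean pages
    eviction_dirty_target: 0.05 * cacheBytes,  // background threads evict dirty pages
    eviction_dirty_trigger: 0.20 * cacheBytes, // user threads also evict dirty pages
  };
}

// Hypothetical 10 GB cache.
const t = evictionThresholds(10 * 1024 * 1024 * 1024);
console.log((t.eviction_trigger / 1024 / 1024 / 1024).toFixed(2) + " GB"); // 9.50 GB
```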
FAQ
How do I specify the upper limit of memory consumed by aggregation operations for ApsaraDB for MongoDB?
ApsaraDB for MongoDB does not allow you to directly specify the upper limit of memory consumed by aggregation operations. ApsaraDB for MongoDB has a memory limit of 100 MB for each stage of an aggregation operation. If a stage exceeds this limit, the system reports an error. To resolve this issue, you can add the {allowDiskUse:true} option to an aggregation statement. For more information, see Memory Restrictions. MongoDB 6.0 and later support the global allowDiskUseByDefault parameter. When an aggregation operation consumes excessive memory, ApsaraDB for MongoDB temporarily uses some disk space to avoid excessive memory usage. For more information about other policies to reduce memory usage, see Optimization policies.
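The option described above is passed alongside the pipeline. The sketch below is illustrative: allowDiskUse is a documented aggregation option, but the collection name and pipeline are hypothetical; in the mongo shell this would run as db.orders.aggregate(pipeline, options).

```javascript
// Illustrative sketch: pass allowDiskUse so that stages exceeding the
// 100 MB per-stage limit can spill to disk instead of failing.
// The collection and pipeline below are hypothetical.
const pipeline = [
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } }, // large sorts are a common reason to spill to disk
];
const options = { allowDiskUse: true };

// In the mongo shell: db.orders.aggregate(pipeline, options)
console.log(JSON.stringify(options));
```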