Memory usage is a critical monitoring metric for ApsaraDB for MongoDB. This topic describes how to view the memory usage of an ApsaraDB for MongoDB instance, explains common causes of high memory usage, and provides optimization strategies.
Overview
When an ApsaraDB for MongoDB process starts, it not only loads binary files and various system libraries into memory but also manages memory allocation and deallocation for client connections, request processing, and the storage engine. By default, ApsaraDB for MongoDB uses Google tcmalloc as its memory allocator. Most of this memory is consumed by the WiredTiger storage engine, client connections, and request processing.
View memory usage
Monitoring charts
On the Monitoring Data page of the ApsaraDB for MongoDB console, select the node corresponding to your database architecture to view its memory usage rate.
Replica set architecture: Includes one Primary node, one or more Secondary nodes, one Hidden node, and optionally one or more ReadOnly nodes.
Sharded cluster architecture: The memory usage of each shard follows the same pattern as a replica set. The Config Server stores configuration metadata. The memory usage of Mongos routing nodes depends on the size of aggregation result sets, the number of connections, and the size of metadata.
Command line
Connect to the instance with MongoDB Shell and run the db.serverStatus().mem command to view memory usage. The following is a sample response:
{ "bits" : 64, "resident" : 13116, "virtual" : 20706, "supported" : true }
// resident: The amount of physical memory used by the mongod process, in MB.
// virtual: The amount of virtual memory used by the mongod process, in MB.
For more information about serverStatus, see serverStatus.
Common causes
Storage engine's memory usage
The storage engine cache consumes most of the memory. For compatibility and security reasons, ApsaraDB for MongoDB sets the WiredTiger CacheSize to approximately 60% of the instance's memory specification. For details, see Product specifications.
If storage engine cache usage reaches 95% of the configured CacheSize, the instance is under high load, and threads handling user requests also participate in evicting clean pages. If dirty data in the storage engine cache exceeds 20% of the cache size, user threads also participate in evicting dirty pages. During this process, you may notice significant request blocking. For specific rules, see the eviction parameter description.
You can use the following methods to check the engine's memory usage:
View memory usage of the WiredTiger engine
In MongoDB Shell, run the following command:
db.serverStatus().wiredTiger.cache
In the response, bytes currently in the cache indicates the memory size used by the cache. The following is a sample response:
{
    ......
    "bytes belonging to page images in the cache" : 6511653424,
    "bytes belonging to the cache overflow table in the cache" : 65289,
    "bytes currently in the cache" : 8563140208,
    "bytes dirty in the cache cumulative" : NumberLong("369249096605399"),
    ......
}
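To relate these counters to the eviction thresholds described above, you can compare them against the configured CacheSize. The following is a minimal mongosh sketch; it assumes the maximum bytes configured and tracked dirty bytes in the cache statistics are present, which can vary by engine version:
var c = db.serverStatus().wiredTiger.cache;
// Percentage of the configured CacheSize that is currently in use
print("cache used: " + (100 * c["bytes currently in the cache"] / c["maximum bytes configured"]).toFixed(1) + "%");
// Percentage of the configured CacheSize that is dirty
print("cache dirty: " + (100 * c["tracked dirty bytes in the cache"] / c["maximum bytes configured"]).toFixed(1) + "%");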
View cache dirty ratio of the WiredTiger engine
In the DAS console, go to the Real-Time Monitoring Data page to view the current cache dirty ratio.
Use the mongostat tool provided with ApsaraDB for MongoDB to view the current cache dirty ratio.
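For reference, you can also run mongostat against the instance from a client host. The connection address, port, and credentials below are placeholders for your own instance:
mongostat --host dds-xxxxxxxx.mongodb.rds.aliyuncs.com --port 3717 -u root -p '***' --authenticationDatabase admin
In the mongostat output, the used and dirty columns show the percentage of the WiredTiger cache that is currently in use and dirty, respectively.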
Connection and request memory usage
A high number of concurrent connections to the instance can consume significant memory for the following reasons:
Thread stack overhead: Each connection has a corresponding backend thread to process requests on that connection. Each thread can consume up to 1 MB of stack space, though this is typically in the range of tens to hundreds of KB.
TCP connection kernel buffer: At the kernel level, each TCP connection has read and write buffers, determined by kernel parameters such as tcp_rmem and tcp_wmem. You do not need to manage this memory usage. However, more concurrent connections and a larger default socket buffer result in higher TCP memory consumption.
tcmalloc memory management: When a request is received, a request context is created and temporary buffers (such as request packets, response packets, and temporary sort buffers) are allocated. After the request is completed, these temporary buffers are released back to the tcmalloc memory allocator. tcmalloc first returns them to its own cache before gradually releasing them back to the operating system. In many cases, high memory usage occurs because tcmalloc does not promptly release memory back to the OS. This unreleased memory can accumulate to tens of gigabytes.
You can use the following methods to troubleshoot:
View connection usage
On the Monitoring Data page of the ApsaraDB for MongoDB console, view the connection usage.
Use MongoDB Shell to query the number of connections.
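For example, the connection counters are part of the standard serverStatus output; the numbers below are illustrative only:
db.serverStatus().connections
// Sample response:
{ "current" : 821, "available" : 9179, "totalCreated" : 123456 }
// current: The number of client connections currently open.
// available: The number of additional connections the instance can still accept.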
View memory that tcmalloc has not returned to the OS
Run the db.serverStatus().tcmalloc command to check the amount of memory held by tcmalloc. In this context, tcmalloc cache = pageheap_free_bytes + total_free_bytes. The following is a sample response:
{
    ......
    "tcmalloc" : {
        "pageheap_free_bytes" : NumberLong("3048677376"),
        "pageheap_unmapped_bytes" : NumberLong("544994184"),
        "current_total_thread_cache_bytes" : 95717224,
        "total_free_bytes" : NumberLong(1318185960),
        ......
    }
}
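To quantify how much memory tcmalloc is holding, you can add the two counters from the formula above. The following is a minimal mongosh sketch; it assumes the counters are nested under serverStatus().tcmalloc.tcmalloc, which is where recent versions report them:
var t = db.serverStatus().tcmalloc.tcmalloc;
// Memory cached by tcmalloc but not yet returned to the operating system.
// Number() handles counters that are reported as NumberLong.
var cachedBytes = Number(t.pageheap_free_bytes) + Number(t.total_free_bytes);
print("tcmalloc cache: " + (cachedBytes / 1024 / 1024).toFixed(0) + " MB");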
Metadata memory usage
When an ApsaraDB for MongoDB instance has a large amount of metadata for databases, collections, and indexes, it can consume a significant amount of memory. Earlier versions of ApsaraDB for MongoDB may have the following issues:
In ApsaraDB for MongoDB earlier than 4.0, a full logical backup may open a large number of file handles. If these are not promptly returned to the operating system, memory usage can increase rapidly.
In ApsaraDB for MongoDB 4.0 and earlier, deleting a large number of collections may not properly remove the corresponding file handles, which can lead to a memory leak.
Memory usage during index creation
During normal data writes, a Secondary node maintains a buffer of approximately 256 MB for oplog application. However, while indexes are being created, the oplog application process on Secondary nodes may consume more memory.
In ApsaraDB for MongoDB earlier than 4.2, index creation supports the background option. When {background:true} is specified, the index is built in the background. The oplog application for index creation is serial and can consume up to 500 MB of memory.
In ApsaraDB for MongoDB 4.2 and later, the background option is deprecated. Secondary nodes are allowed to apply index creation in parallel, which consumes more memory. Creating multiple indexes simultaneously may cause an Out of Memory (OOM) error on the instance.
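For illustration, the pre-4.2 syntax looks like the following; the collection and field names are placeholders:
// Applies only to ApsaraDB for MongoDB earlier than 4.2; the background option is deprecated in 4.2 and later.
db.orders.createIndex({ customerId: 1 }, { background: true })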
For more information about memory usage during index creation, see Index Build Impact on Database Performance and Index Build Process.
PlanCache memory usage
In some scenarios, a single request may have a large number of potential execution plans, causing the PlanCache to consume a significant amount of memory.
View PlanCache memory usage: In ApsaraDB for MongoDB 4.0 and later versions, run the db.serverStatus().metrics.query.planCacheTotalSizeEstimateBytes command to view the size.
If the preceding command does not return the relevant field on your ApsaraDB for MongoDB 4.0 instance, the instance is running an older minor version. Upgrade the minor version of the database to access this functionality.
For more information about PlanCache memory usage, see Secondary node memory arise while balancer doing work.
Optimization strategies
Memory optimization is not about minimizing memory usage at all costs. Instead, it is about ensuring that the system has sufficient and stable memory to perform normally, striking a balance between resource utilization and performance.
ApsaraDB for MongoDB specifies the CacheSize, and this value cannot be modified. You can use the following strategies to optimize memory usage:
Control the number of concurrent connections. Performance test results show that around 100 concurrent connections are enough to fully utilize the processing capacity of the database server, and the MongoDB driver's default connection pool size is also 100. When multiple clients connect to the instance, reduce the connection pool size of each client. Keep the total number of persistent connections to the instance under 1,000 to avoid increased memory usage and multi-threading context-switch overhead, which can affect request latency.
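As an illustration, most MongoDB drivers let you cap the pool size through the maxPoolSize connection string option. The URI below is a placeholder, not a real instance address:
mongodb://root:****@dds-xxxxxxxx.mongodb.rds.aliyuncs.com:3717/admin?replicaSet=mgset-xxxxxxxx&maxPoolSize=20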
To reduce the memory overhead of individual requests, optimize query performance by creating indexes to reduce collection scans and in-memory sorting.
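For example, you can run explain() to check whether a query performs a collection scan or an in-memory sort before adding an index; the collection and field names below are placeholders:
// A COLLSCAN stage, or a SORT stage without a supporting index, indicates extra memory and CPU overhead for this query.
db.orders.find({ customerId: 123 }).sort({ createdAt: -1 }).explain("executionStats")
// A compound index lets the query use an IXSCAN and avoids the in-memory sort.
db.orders.createIndex({ customerId: 1, createdAt: -1 })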
If memory usage remains high after you have optimized queries and configured an appropriate number of connections, upgrade the memory specification to prevent potential Out of Memory (OOM) errors and performance degradation caused by excessive cache eviction, and to ensure instance availability.
Accelerate memory release by tcmalloc. If your database instance's memory usage exceeds 80%, you can adjust tcmalloc-related parameters on the Parameters page in the console.
First, enable the tcmallocAggressiveMemoryDecommit parameter. This parameter has been extensively tested and is proven to be effective in resolving memory-related issues.
Gradually increase the value of the tcmallocReleaseRate parameter. If adjusting the preceding parameter does not yield the expected results, gradually increase the tcmallocReleaseRate value (for example, from 1 to 3, then to 5).
Important: Adjust these parameters during off-peak hours, because modifying the tcmallocAggressiveMemoryDecommit and tcmallocReleaseRate parameters may degrade database performance. If your business is affected, roll back the changes immediately.
Optimize the number of databases and collections. If your database instance has too many databases and collections, you can remove unnecessary collections and indexes, consolidate data from multiple tables, split the instance, or migrate to a sharded cluster. For more information, see Performance Degradation from Too Many Databases or Tables.
If you encounter other potential memory leak scenarios while using ApsaraDB for MongoDB, you can contact Alibaba Cloud technical support for assistance.
References
eviction parameter description
| Parameter | Default value | Description |
| --- | --- | --- |
| eviction_target | 80% | When cache usage exceeds the eviction_target, background threads start evicting clean pages. |
| eviction_trigger | 95% | When cache usage exceeds the eviction_trigger, user threads also start evicting clean pages. |
| eviction_dirty_target | 5% | When the dirty cache ratio exceeds the eviction_dirty_target, background threads start evicting dirty pages. |
| eviction_dirty_trigger | 20% | When the dirty cache ratio exceeds the eviction_dirty_trigger, user threads also start evicting dirty pages. |
| eviction_updates_target | 2.5% | When the cache update ratio exceeds the eviction_updates_target, background threads start evicting memory fragments related to small objects. |
| eviction_updates_trigger | 10% | When the cache update ratio exceeds the eviction_updates_trigger, user threads also start evicting memory fragments related to small objects. |
FAQ
Q: How can I increase the memory limit for aggregation operations in MongoDB?
A: ApsaraDB for MongoDB does not currently support directly increasing the memory limit for aggregation operations. MongoDB enforces a 100 MB memory limit per aggregation pipeline stage. If a stage exceeds this limit, the system returns an error. To resolve this issue, you can explicitly specify the {allowDiskUse:true} option in your aggregation pipeline. Starting with MongoDB 6.0, MongoDB introduced the global allowDiskUseByDefault parameter. When an aggregation operation requires excessive memory, MongoDB automatically uses temporary disk space to reduce memory consumption. For additional strategies to optimize memory usage, see Optimization strategies.
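For illustration, the option is passed as the second argument to aggregate(); the collection and pipeline below are placeholders:
db.orders.aggregate(
    [
        { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
        { $sort: { total: -1 } }
    ],
    { allowDiskUse: true }  // lets stages that exceed the 100 MB limit spill to temporary files on disk
)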