Memory usage is a key metric for monitoring an ApsaraDB for MongoDB instance. This topic describes how to view the memory usage details of an ApsaraDB for MongoDB instance and how to troubleshoot high memory usage on the instance.
Background information
ApsaraDB for MongoDB processes load binary files and dependent system library files into memory. The processes also allocate and release memory for client connection management, request handling, and the storage engine. By default, ApsaraDB for MongoDB uses TCMalloc from Google as its memory allocator. Most memory is consumed by the WiredTiger storage engine, client connections, and request handling.
Access method
For a sharded cluster instance, the memory usage on each shard node is the same as that on a replica set instance. The ConfigServer nodes store only configuration metadata. The memory usage on mongos nodes is affected by aggregated result sets, the number of connections, and the size of metadata.
For a replica set instance, you can use the following methods to view the memory usage:
View memory usage in monitoring charts
An ApsaraDB for MongoDB replica set instance consists of multiple node roles, and each role can correspond to one or more physical nodes. A replica set instance contains a primary node that supports read and write operations, one or more high-availability secondary nodes, a hidden node, and one or more optional read-only nodes.
On the Monitoring Data page of an ApsaraDB for MongoDB replica set instance in the ApsaraDB for MongoDB console, view the memory usage on the instance in monitoring charts.
View memory usage by running commands
To view and analyze the memory usage on an ApsaraDB for MongoDB replica set instance, run the db.serverStatus().mem command in the mongo shell. Sample output:
{
    "bits" : 64,
    "resident" : 13116,   // The physical memory consumed by the node. Unit: MB.
    "virtual" : 20706,    // The virtual memory consumed by the node. Unit: MB.
    "supported" : true
}
Note: For more information about the serverStatus command, see serverStatus.
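The mem fields can also be read programmatically. The following is a minimal sketch in Node.js; the document below is a copy of the sample output above, not live server output:

```javascript
// Sketch: interpret the mem section of a serverStatus document.
// The sample document is illustrative, copied from the sample output above.
const mem = { bits: 64, resident: 13116, virtual: 20706, supported: true };

// resident and virtual are reported in MB; convert to GB for readability.
const residentGB = mem.resident / 1024;
const virtualGB = mem.virtual / 1024;

console.log(`resident: ${residentGB.toFixed(1)} GB, virtual: ${virtualGB.toFixed(1)} GB`);
```

In a live environment, the document would come from running db.serverStatus().mem against the instance instead of being hard-coded.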
Common causes
High memory usage of the WiredTiger storage engine
The WiredTiger storage engine consumes the largest share of memory in an ApsaraDB for MongoDB instance. To ensure compatibility and security, ApsaraDB for MongoDB sets the cachesize parameter to 60% of the actual memory of an instance. For more information, see Specifications.
If the size of cached data exceeds 95% of the specified cache size, memory usage is high, and the threads that handle user requests also begin to evict pages as a protective measure. As a result, user requests are blocked. For more information, see Eviction parameters.
You can use the following methods to view the memory usage of the WiredTiger storage engine:
Run the db.serverStatus().wiredTiger.cache command in the mongo shell. The value of the "bytes currently in the cache" parameter is the memory size. Sample output:
{
    ......
    "bytes belonging to page images in the cache" : 6511653424,
    "bytes belonging to the cache overflow table in the cache" : 65289,
    "bytes currently in the cache" : 8563140208,
    "bytes dirty in the cache cumulative" : NumberLong("369249096605399"),
    ......
}
On the Dashboard page of an instance in the DAS console, view the percentage of dirty data in the WiredTiger cache. For more information, see Performance trends.
Use the mongostat tool of ApsaraDB for MongoDB to view the percentage of dirty data in the WiredTiger cache. For more information, see mongostat.
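The byte counts returned by db.serverStatus().wiredTiger.cache can be converted into the percentages that the eviction thresholds are defined over. The following is a minimal sketch; the 12 GB cache size is an illustrative assumption, and "tracked dirty bytes in the cache" is a WiredTiger statistic whose value here is also assumed, since it does not appear in the sample output above:

```javascript
// Sketch: derive cache utilization from wiredTiger.cache counters.
// "bytes currently in the cache" is taken from the sample output above;
// the dirty-bytes value and cacheSizeBytes are illustrative assumptions.
const cache = {
  "bytes currently in the cache": 8563140208,
  "tracked dirty bytes in the cache": 120000000, // hypothetical value
};
const cacheSizeBytes = 12 * 1024 * 1024 * 1024; // assumed 12 GB cachesize

const usedPct = 100 * cache["bytes currently in the cache"] / cacheSizeBytes;
const dirtyPct = 100 * cache["tracked dirty bytes in the cache"] / cacheSizeBytes;

// Compare against the default eviction thresholds listed in the table
// in the References section (80/95 for used cache, 5/20 for dirty data).
console.log(`used: ${usedPct.toFixed(1)}%, dirty: ${dirtyPct.toFixed(2)}%`);
```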
High memory usage of connections and requests
If a large number of connections to the instance are established, memory may be consumed due to the following reasons:
Each connection has a thread that handles requests in the background. Each thread can occupy up to 1 MB of stack space. In most cases, dozens to hundreds of KB of stack space is occupied by a thread.
Each TCP connection has read and write buffers at the kernel layer. The buffer size is determined by TCP kernel parameters such as tcp_rmem and tcp_wmem. You do not need to specify the buffer size. However, a large number of concurrent connections occupy a larger amount of socket cache space and result in higher memory usage of TCP.
Each request has its own context, and multiple temporary buffers may be allocated for request packets, response packets, and sorting. These buffers are gradually released at the end of each request: first to the TCMalloc cache, and then gradually to the operating system.
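To get a feel for the scale involved, the per-connection overheads above can be combined into a rough back-of-the-envelope estimate. The per-connection stack and TCP buffer figures below are assumptions chosen within the ranges described above, not measured values:

```javascript
// Sketch: rough per-connection memory estimate.
// stackKB and tcpBufferKB are illustrative assumptions: a thread can use
// up to 1 MB of stack but typically uses far less, and kernel TCP buffer
// sizes depend on tcp_rmem/tcp_wmem tuning.
function estimateConnectionMemoryMB(connections, stackKB = 256, tcpBufferKB = 128) {
  return connections * (stackKB + tcpBufferKB) / 1024;
}

// 1,000 persistent connections at these assumptions cost roughly:
console.log(`${estimateConnectionMemoryMB(1000)} MB`);
```

This ignores per-request temporary buffers, so real usage under load can be noticeably higher.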
In most cases, memory usage is high because TCMalloc does not promptly release the memory consumed by requests. Requests may consume up to dozens of GB of memory before the memory is released to the operating system. To query the size of memory that TCMalloc has not released to the operating system, run the db.serverStatus().tcmalloc command. The TCMalloc cache size is the sum of the values of the pageheap_free_bytes and total_free_bytes parameters. Sample output:
{
    "generic" : {
        "current_allocated_bytes" : NumberLong("9641570544"),
        "heap_size" : NumberLong("19458379776")
    },
    "tcmalloc" : {
        "pageheap_free_bytes" : NumberLong("3048677376"),
        "pageheap_unmapped_bytes" : NumberLong("544994184"),
        "current_total_thread_cache_bytes" : 95717224,
        "total_free_bytes" : NumberLong(1318185960),
        ......
    }
}
Note: For more information about TCMalloc, see tcmalloc.
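The TCMalloc cache size described above can be computed directly from the sample values. A minimal sketch in Node.js, using the two fields from the output above:

```javascript
// Sketch: compute the TCMalloc cache size from serverStatus().tcmalloc.
// Values are copied from the sample output above.
const tcmalloc = {
  pageheap_free_bytes: 3048677376,
  total_free_bytes: 1318185960,
};

// Memory that TCMalloc holds but has not yet returned to the OS.
const cacheBytes = tcmalloc.pageheap_free_bytes + tcmalloc.total_free_bytes;
const cacheGB = cacheBytes / (1024 ** 3);
console.log(`TCMalloc cache: ${cacheGB.toFixed(2)} GB`);
```

A persistently large value here suggests tuning the release rate as described in the Solutions section.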
High memory usage of metadata
Metadata is the memory consumed by the descriptions of databases, collections, and indexes in an ApsaraDB for MongoDB instance. Pay attention to this memory when an instance contains a large number of collections and indexes. In instances that run versions earlier than MongoDB 4.0, a full logical backup may open a large number of file handles, which may not be promptly returned to the operating system and can cause a rapid increase in memory usage. In addition, after a large number of collections are deleted, the corresponding file handles may not be removed, which can cause memory leaks.
High memory usage for creating indexes
During normal data writes, secondary nodes maintain a buffer of approximately 256 MB for data replay. After indexes are created on the primary node, secondary nodes may consume additional memory to replay the index builds. In instances that run versions earlier than MongoDB 4.2, indexes can be created in the background on the primary node, and serial replay of index creation on a secondary node may consume up to 500 MB of memory. In instances that run MongoDB 4.2 or later, indexes can no longer be created in the background, and secondary nodes replay index builds in parallel. In this case, more memory is required, and out-of-memory (OOM) errors may occur when multiple indexes are created at the same time.
For more information, see index-build-impact-on-database-performance and index-build-process.
High memory usage of PlanCache
If a request has a large number of execution plans, the PlanCache may consume a large amount of memory. In instances that run later versions of MongoDB, run the db.serverStatus().metrics.query.planCacheTotalSizeEstimateBytes command to view the memory usage of the PlanCache. For more information, see Secondary node memory arise while balancer doing work.
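As with the other counters, the PlanCache estimate can be checked against a memory budget. The following is a minimal sketch; both the sample value and the 500 MB threshold are arbitrary assumptions for illustration:

```javascript
// Sketch: flag an unusually large PlanCache.
// In practice the value would come from
// db.serverStatus().metrics.query.planCacheTotalSizeEstimateBytes;
// both the value and the threshold here are illustrative assumptions.
const planCacheTotalSizeEstimateBytes = 734003200; // assumed sample value
const thresholdBytes = 500 * 1024 * 1024;          // arbitrary 500 MB budget

const oversized = planCacheTotalSizeEstimateBytes > thresholdBytes;
console.log(oversized ? "PlanCache above budget" : "PlanCache within budget");
```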
Solutions
The goal of memory optimization is to seek a balance between resource consumption and performance instead of minimizing memory usage. Ideally, the memory remains sufficient and stable and system performance is not affected. You cannot change the value of the cachesize parameter that ApsaraDB for MongoDB specifies. We recommend that you use the following methods to optimize memory usage:
Control the number of concurrent connections. Based on performance test results, about 100 persistent connections are sufficient for a database. By default, a MongoDB driver establishes a connection pool of up to 100 connections with the backend. If a large number of clients exist, reduce the connection pool size of each client. We recommend that you establish no more than 1,000 persistent connections to a database. Otherwise, memory and multi-thread context switching overheads increase and result in high request handling latency.
Reduce the memory overhead of a single request. For example, create indexes to reduce collection scans and in-memory sorting.
If the number of connections is appropriate but the memory usage continues to increase, we recommend that you upgrade the memory configurations. Otherwise, system performance may sharply decline due to OOM errors and extensive cache clearing.
Accelerate the memory release of TCMalloc. You can modify the value of the setParameter.tcmallocReleaseRate parameter on the Parameters page in the ApsaraDB for MongoDB console. Valid values range from 1 to 10. A larger value indicates a faster memory release. After you modify the parameter, we recommend that you watch the memory monitoring data and check whether your business is affected. If memory is still not released fast enough after you set the parameter to 10, enable the setParameter.tcmallocAggressiveMemoryDecommit parameter. In this case, a large amount of memory may be released at once, which can temporarily increase the response time of your application.
In scenarios where memory leaks may occur, contact Alibaba Cloud technical support.
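For the connection-count recommendation above, the total number of persistent connections is simply the number of client instances multiplied by each client's pool size. A minimal sketch (the client count is an illustrative assumption; the pool size of 100 and the 1,000-connection ceiling are the figures mentioned above):

```javascript
// Sketch: check total persistent connections against the recommended ceiling.
// clientInstances is an illustrative assumption; poolSizePerClient is the
// default driver pool size and recommendedMax is the ceiling described above.
const clientInstances = 20;    // assumed number of application instances
const poolSizePerClient = 100; // default driver connection pool size
const recommendedMax = 1000;   // recommended ceiling on persistent connections

const totalConnections = clientInstances * poolSizePerClient;
if (totalConnections > recommendedMax) {
  // 20 clients at the default pool size already exceed the ceiling,
  // so each client's pool should shrink to about recommendedMax / clients.
  console.log(`reduce pool size to <= ${Math.floor(recommendedMax / clientInstances)}`);
}
```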
References
The following table describes the WiredTiger eviction parameters. Default values are percentages of the configured cache size.
| Parameter | Default value | Description |
| --- | --- | --- |
| eviction_target | 80 | When the percentage of used cache exceeds the value of eviction_target, eviction threads evict clean pages in the background. |
| eviction_trigger | 95 | When the percentage of used cache exceeds the value of eviction_trigger, user threads also evict clean pages. |
| eviction_dirty_target | 5 | When the percentage of dirty data in the cache exceeds the value of eviction_dirty_target, eviction threads evict dirty pages in the background. |
| eviction_dirty_trigger | 20 | When the percentage of dirty data in the cache exceeds the value of eviction_dirty_trigger, user threads also evict dirty pages. |