ApsaraDB for MongoDB lets you modify instance parameters from the console. Misconfigured parameters can cause performance degradation or application errors. This topic covers key parameters, their default values, the symptoms that indicate a problem, and concrete tuning guidance.
This topic covers kernel parameters only. Client-side driver parameters such as socketTimeout are not included.
Some parameters require an instance restart to take effect. A restart causes a brief connection interruption. Before modifying any parameter, check the Restart required field and schedule the change during off-peak hours if needed.
Parameter quick reference
The following table summarizes all parameters covered in this topic. "Applicable scope" indicates whether the parameter applies to mongod nodes (replica set members and shard nodes) or Mongos routers.
| Parameter | Default | Restart required | Applicable scope | Recommendation |
|---|---|---|---|---|
| operationProfiling.mode | off | Yes | mongod | Keep default; enable only for targeted debugging |
| operationProfiling.slowOpThresholdMs | 100 ms | No | mongod, mongos | Adjust to slightly above average latency of core queries |
| replication.oplogGlobalIdEnabled | false | Yes | mongod | Enable only for two-way sync with DTS or mongoShake |
| replication.oplogSizeMB | 10% of disk | No | mongod | Keep default; increase for high-update-rate workloads |
| setParameter.cursorTimeoutMillis | 600000 ms | No | mongod, mongos | Do not increase; consider decreasing to 300000 |
| setParameter.flowControlTargetLagSeconds | 10 | No | mongod | Increase if throttling is confirmed; investigate root cause if it persists |
| setParameter.oplogFetcherUsesExhaust | true | Yes | mongod | Do not change |
| setParameter.maxTransactionLockRequestTimeoutMillis | 5 ms | No | mongod | Increase if lock timeout errors are frequent |
| setParameter.replWriterThreadCount | 16 | Yes | mongod | Do not adjust; contact support for guidance |
| setParameter.tcmallocAggressiveMemoryDecommit | 0 | No | mongod | Enable only for confirmed OOM or fragmentation; monitor closely |
| setParameter.transactionLifetimeLimitSeconds | 60 | No | mongod | Decrease (e.g., to 30); never increase |
| storage.oplogMinRetentionHours | 0 | No | mongod | Keep default for stable workloads; use a float > 1.0 for variable workloads |
| storage.wiredTiger.collectionConfig.blockCompressor | snappy | Yes | mongod | Change based on workload; use zstd for cold data |
| setParameter.minSnapshotHistoryWindowInSeconds / maxTargetSnapshotHistoryWindowInSeconds | 300 | No | mongod | Set to 0 if atClusterTime reads are not used |
| rsconf.chainingAllowed | true | No | mongod | See guidance based on cluster size |
| setParameter.internalQueryMaxPushBytes / internalQueryMaxAddToSetBytes | 104857600 (100 MB) | No | mongod | Increase only if hitting the limit for a specific query |
| setParameter.migrateCloneInsertionBatchSize | 0 | No | mongod (shard) | Adjust if chunk migration causes latency spikes |
| setParameter.rangeDeleterBatchDelayMS | 20 ms | No | mongod (shard) | Increase (e.g., to 200) if CPU spikes during balancing |
| setParameter.rangeDeleterBatchSize | 0 (auto ~128) | No | mongod (shard) | Adjust if CPU spikes during balancing |
| setParameter.receiveChunkWaitForRangeDeleterTimeoutMS | 10000 ms | No | mongod (shard) | Increase if timeout errors appear during balancing |
| setParameter.ShardingTaskExecutorPoolMaxConnecting | 2 | Yes (≤4.0) / No (≥4.2) | mongos | Do not adjust |
| setParameter.ShardingTaskExecutorPoolMaxSize | 2^64 - 1 | Yes (≤4.0) / No (≥4.2) | mongos | No adjustment needed |
| setParameter.ShardingTaskExecutorPoolMinSize | 1 | Yes (≤4.0) / No (≥4.2) | mongos | Set to a value in [10, 50] |
Replica sets
operationProfiling.mode
| Attribute | Value |
|---|---|
| Applicable versions | 3.0 and later |
| Restart required | Yes |
| Default | off |
Controls the query profiler level.
Symptoms
Setting this to all or slowOp under high query load degrades instance performance.
A system.profile collection appears in a database, indicating the profiler was left enabled.
Some users mistakenly assume this parameter must be set to slowOp to generate slow query logs.
Recommendation
Keep the default value (off). Enabling the query profiler adds overhead, and slow query logs typically provide equivalent diagnostic information. Enable the profiler only for targeted debugging sessions, and disable it immediately after analysis.
For details on profiler overhead, see the MongoDB Database Profiler documentation.
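For a targeted debugging session, the profiler can be toggled per database from mongosh. A minimal sketch, where debugdb is a placeholder database name:

```javascript
// Enable profiling of operations slower than 100 ms on one database only.
// "debugdb" is a placeholder; run against the database you are debugging.
db.getSiblingDB("debugdb").setProfilingLevel(1, { slowms: 100 });

// ... reproduce the problem, then inspect the most recent captured operations ...
db.getSiblingDB("debugdb").system.profile.find().sort({ ts: -1 }).limit(5);

// Disable the profiler immediately after analysis and drop the capture data.
db.getSiblingDB("debugdb").setProfilingLevel(0);
db.getSiblingDB("debugdb").system.profile.drop();
```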
operationProfiling.slowOpThresholdMs
| Attribute | Value |
|---|---|
| Applicable versions | 3.0 and later |
| Restart required | No |
| Default | 100 (ms) |
Defines the threshold above which a query is classified as slow.
Symptoms
Value too small: Excessive slow query and audit log entries create noise that makes real problems harder to identify.
Value too large: Genuinely slow queries go unrecorded, obscuring performance issues.
Recommendation
Set this slightly above the average latency of your most critical queries.
| Business profile | Typical query time | Suggested value |
|---|---|---|
| Latency-sensitive | ~30 ms | 50 ms |
| Analytical workload | 300–400 ms | 500 ms |
replication.oplogGlobalIdEnabled
| Attribute | Value |
|---|---|
| Applicable versions | 4.0 and later |
| Restart required | Yes |
| Default | false |
A self-developed parameter that adds global IDs (GIDs) to oplog entries. GIDs enable two-way synchronization with DTS (Data Transmission Service) or mongoShake by breaking circular synchronization loops.
Recommendation
Enable only when two-way synchronization is required. Because a restart is needed, schedule the change during off-peak hours.
replication.oplogSizeMB
| Attribute | Value |
|---|---|
| Applicable versions | 3.0 and later |
| Restart required | No |
| Default | 10% of instance disk space (e.g., 50 GB for a 500 GB disk) |
Sets the maximum logical size of the oplog collection, which stores replication change records.
Symptoms
If the oplog is too small:
Secondary nodes fall behind and enter the RECOVERING state.
Log backups miss oplog records, creating gaps that prevent point-in-time restore.
Recommendation
Keep the default. Never decrease it. Increase it if your workload has a low data volume but a high update rate — such workloads generate oplog entries quickly and can exhaust a small oplog window. As a rule of thumb, size the oplog to cover at least one hour of writes.
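The one-hour rule of thumb can be sanity-checked with a quick estimate. This sketch assumes you know the average oplog write rate in MB per hour from monitoring; the values in the example are made up for illustration:

```javascript
// Sketch: estimate how many hours of writes an oplog of a given size covers.
// oplogSizeMB and oplogWriteRateMBPerHour come from your own monitoring.
function oplogWindowHours(oplogSizeMB, oplogWriteRateMBPerHour) {
  if (oplogWriteRateMBPerHour <= 0) throw new Error("write rate must be positive");
  return oplogSizeMB / oplogWriteRateMBPerHour;
}

// A 50 GB (51200 MB) oplog with writes arriving at 2048 MB/hour covers
// 25 hours, comfortably above the one-hour rule of thumb.
const windowHours = oplogWindowHours(51200, 2048);
```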
This parameter is not changed via the MongoDB configuration file. The Alibaba Cloud control plane resizes the oplog using the replsetResizeOplog command.
setParameter.cursorTimeoutMillis
| Attribute | Value |
|---|---|
| Applicable versions | 3.0 and later |
| Restart required | No |
| Default | 600000 (10 minutes) |
The idle timeout for server-side cursors, in milliseconds. MongoDB automatically closes cursors that exceed this threshold.
Symptom
Accessing a cursor after it has been closed returns:
Message: "cursor id xxxxxxx not found"
ErrorCode: CursorNotFound(43)
Recommendation
Do not increase this value. To reduce the resource overhead of idle cursors, lower it — for example, to 300000. Regardless of this setting, avoid holding cursors idle on the application side.
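Lowering the timeout is a single admin command; this sketch assumes you have confirmed no legitimate cursor stays idle for 5 minutes (mydb and orders are placeholder names):

```javascript
// Sketch: lower the server-side idle cursor timeout to 5 minutes (300000 ms).
db.adminCommand({ setParameter: 1, cursorTimeoutMillis: 300000 });

// Application side: exhaust cursors promptly instead of relying on a long
// timeout, e.g. iterate to completion. "mydb" and "orders" are placeholders.
db.getSiblingDB("mydb").orders.find().forEach(doc => {
  // process doc ...
});
```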
setParameter.flowControlTargetLagSeconds
| Attribute | Value |
|---|---|
| Applicable versions | 4.2 and later |
| Restart required | No |
| Default | 10 |
The replication-lag threshold, in seconds, at which MongoDB activates flow control. Flow control throttles writes on the primary to prevent secondary nodes from falling too far behind.
Symptom
Slow query logs show that durationMillis is nearly equal to flowControl.timeAcquiringMicros converted to milliseconds (959 ms vs. 959000 µs in the example below), indicating that the request spent almost all of its time being throttled, not executing:
{
"t": { "$date": "2024-04-25T13:28:45.840+08:00" },
"s": "I",
"c": "WRITE",
"id": 51803,
"ctx": "conn199253",
"msg": "Slow query",
"attr": {
"type": "update",
"ns": "xxx.xxxxx",
"command": "...",
"planSummary": "IDHACK",
"flowControl": {
"acquireCount": 1,
"acquireWaitCount": 1,
"timeAcquiringMicros": 959000
},
"durationMillis": 959
}
}
Recommendation
Use the following decision path:
| Observation | Action |
|---|---|
| durationMillis ≈ flowControl.timeAcquiringMicros / 1000 in slow query logs | Confirm flow control is the cause; increase flowControlTargetLagSeconds to reduce sensitivity |
| Throttling continues after increasing the value | The instance has a deeper primary-secondary synchronization bottleneck; investigate replication lag |
| Replication lag root cause confirmed | Options include upgrading the instance, reducing write throughput, or setting write concern to {w: majority} |
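Before changing the parameter, it helps to confirm that flow control is actively throttling. A sketch run in mongosh against the primary; the example value of 30 is illustrative:

```javascript
// Sketch: check flow control status via serverStatus before tuning.
const fc = db.serverStatus().flowControl;
printjson({
  enabled: fc.enabled,
  isLagged: fc.isLagged,           // true while replication lag exceeds the target
  sustainerRate: fc.sustainerRate, // write tickets granted per second
});

// Only if throttling is confirmed, raise the lag target (illustrative value):
db.adminCommand({ setParameter: 1, flowControlTargetLagSeconds: 30 });
```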
setParameter.oplogFetcherUsesExhaust
| Attribute | Value |
|---|---|
| Applicable versions | 4.4 and later |
| Restart required | Yes |
| Default | true |
Controls whether stream replication is used for primary-secondary oplog transfer. When disabled, secondary nodes revert to the pull model, where each batch requires a separate network round trip.
Symptom
In some environments, stream replication produces extra CPU or network bandwidth overhead.
Recommendation
Do not change this parameter. Stream replication reduces replication lag in high-load and high-latency environments, reduces the risk of data loss when a primary with {w: 1} write concern goes down unexpectedly, and lowers write latency for {w: majority} and {w: >1} write concerns.
setParameter.maxTransactionLockRequestTimeoutMillis
| Attribute | Value |
|---|---|
| Applicable versions | 4.0 and later |
| Restart required | No |
| Default | 5 (ms) |
The time a transaction waits to acquire a lock before automatically aborting, in milliseconds.
Symptom
The client or server logs contain:
Message: "Unable to acquire lock '{8442595743001781021: Database, 1525066715360699165}' within a max lock request timeout of '5ms' milliseconds."
ErrorCode: LockTimeout(24)
Drivers that support TransientTransactionError retry automatically, so the error may only appear in server logs.
Recommendation
If lock timeout errors are frequent, increase this parameter to reduce aborts from transient lock contention. If errors persist after the increase, address the root cause in application logic:
Avoid concurrent modifications to the same document within a transaction.
Audit the transaction for operations that hold locks for long periods, such as Data Definition Language (DDL) operations or unoptimized queries without index coverage.
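Raising the timeout is a runtime admin command. A sketch, where the value 100 is an illustrative starting point rather than a documented recommendation:

```javascript
// Sketch: raise the transaction lock acquisition timeout from 5 ms to 100 ms
// to ride out brief lock contention. 100 is an illustrative value.
db.adminCommand({ setParameter: 1, maxTransactionLockRequestTimeoutMillis: 100 });

// Verify the new value took effect:
db.adminCommand({ getParameter: 1, maxTransactionLockRequestTimeoutMillis: 1 });
```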
setParameter.replWriterThreadCount
| Attribute | Value |
|---|---|
| Applicable versions | 3.2 and later |
| Restart required | Yes |
| Default | 16 |
The maximum number of threads used for parallel oplog application on secondary nodes. The effective ceiling is twice the number of CPU cores of the instance type.
Symptom
In extreme cases, secondary nodes accumulate replication lag continuously because oplog application cannot keep up with write volume on the primary.
Recommendation
Do not adjust this parameter in normal operations. If replication lag persists despite tuning other parameters, contact Alibaba Cloud support for guidance specific to your workload.
setParameter.tcmallocAggressiveMemoryDecommit
| Attribute | Value |
|---|---|
| Applicable versions | 4.2 and later |
| Restart required | No |
| Default | 0 (disabled) |
Controls whether TCMalloc uses aggressive memory decommit. When enabled, MongoDB actively merges contiguous free memory blocks and returns them to the operating system.
Symptoms
An out-of-memory (OOM) error occurs on a mongod node because memory cannot be reclaimed fast enough to keep up with query load.
Heap fragmentation causes memory usage to rise slowly past 80% and continue climbing steadily.
Recommendation
Do not adjust this parameter in normal operations. If OOM errors or heap fragmentation are confirmed, enable this parameter during off-peak hours.
Enabling aggressive memory decommit may reduce throughput depending on your workload. After enabling, monitor performance and roll back promptly if business impact is observed.
setParameter.transactionLifetimeLimitSeconds
| Attribute | Value |
|---|---|
| Applicable versions | 4.0 and later |
| Restart required | No |
| Default | 60 |
The maximum duration for an open transaction, in seconds. Transactions that exceed this limit are marked as expired and aborted by a background cleanup thread.
Symptom
The client receives:
Message: "Aborting transaction with txnNumber xxx on session with lsid xxxxxxxxxx because it has been running for longer than 'transactionLifetimeLimitSeconds'"
Recommendation
Decrease this value (for example, to 30) rather than increase it. Long-running uncommitted transactions hold WiredTiger cache resources — an overloaded cache causes request latency spikes, database stalls, and full CPU utilization.
To address transaction timeouts without increasing the limit:
Break large transactions into smaller units that complete within the configured time.
Optimize queries inside the transaction to use indexes, reducing execution time.
For best practices on transactions, see Transactions and Read/Write Concern.
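One way to keep each transaction inside the limit is to commit in bounded batches rather than in one long-running transaction. A sketch in mongosh, where the database name, collection name, and batch size of 500 are placeholders:

```javascript
// Sketch: apply a large set of updates as a series of short transactions.
// "mydb", "accounts", and BATCH = 500 are illustrative placeholders.
const session = db.getMongo().startSession();
const coll = session.getDatabase("mydb").accounts;
const ids = db.getSiblingDB("mydb").accounts.find({}, { _id: 1 }).toArray();

const BATCH = 500;
for (let i = 0; i < ids.length; i += BATCH) {
  session.startTransaction();
  try {
    for (const { _id } of ids.slice(i, i + BATCH)) {
      coll.updateOne({ _id }, { $inc: { balance: 1 } });
    }
    session.commitTransaction(); // each batch commits well under the 60 s limit
  } catch (e) {
    session.abortTransaction();
    throw e;
  }
}
session.endSession();
```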
storage.oplogMinRetentionHours
| Attribute | Value |
|---|---|
| Applicable versions | 4.4 and later |
| Restart required | No |
| Default | 0 (disabled; oplog size governed entirely by replication.oplogSizeMB) |
The minimum number of hours the oplog collection is retained, regardless of the replication.oplogSizeMB limit.
Symptoms
Setting this too high causes the oplog collection to consume disk space beyond the size cap.
Forgetting about a non-zero value here can make disk usage appear to fluctuate unexpectedly.
Recommendation
For stable write workloads, keep the default (0). For workloads with highly variable write volumes, set this to a floating-point number greater than 1.0. Before setting a value, estimate peak disk usage to avoid triggering a disk-full lock.
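Before setting a non-zero retention, it helps to bound the worst-case oplog footprint. This sketch multiplies the peak write rate by the retention window; all inputs are assumptions you supply from your own monitoring:

```javascript
// Sketch: worst-case oplog disk usage when oplogMinRetentionHours is set.
// With a minimum retention window the oplog can grow past oplogSizeMB,
// so size from the PEAK write rate, not the average.
function peakOplogUsageMB(minRetentionHours, peakWriteRateMBPerHour, oplogSizeMB) {
  const retained = minRetentionHours * peakWriteRateMBPerHour;
  return Math.max(retained, oplogSizeMB); // never below the configured cap
}

// 24 h retention at a 3000 MB/h peak needs ~72000 MB, well above a 51200 MB cap.
const worstCaseMB = peakOplogUsageMB(24, 3000, 51200);
```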
storage.wiredTiger.collectionConfig.blockCompressor
| Attribute | Value |
|---|---|
| Applicable versions | 3.0 and later |
| Restart required | Yes |
| Default | snappy |
| Supported algorithms | none, snappy, zlib, zstd (zstd requires MongoDB 4.2 and later) |
Sets the compression algorithm for new collections. Existing collections are not affected by this change.
Recommendation
Change based on your workload characteristics. Higher compression ratios come with higher CPU cost for compression and decompression — measure the trade-off in your own environment. For instances used primarily to store cold data, zstd offers a significantly higher compression ratio.
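Per-collection overrides are possible without changing the instance default, as the note below mentions. A sketch, where mydb and coldlogs are placeholder names; zstd requires MongoDB 4.2 or later:

```javascript
// Sketch: create a single collection with zstd compression while the
// instance default remains snappy. "mydb" and "coldlogs" are placeholders.
db.getSiblingDB("mydb").createCollection("coldlogs", {
  storageEngine: {
    wiredTiger: { configString: "block_compressor=zstd" }
  }
});
```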
To use different compression algorithms for individual collections, use the createCollection command with explicit storage engine options. See the MongoDB documentation.
setParameter.minSnapshotHistoryWindowInSeconds / setParameter.maxTargetSnapshotHistoryWindowInSeconds
| Attribute | Value |
|---|---|
| Applicable versions | 4.4 and later |
| Restart required | No |
| Default | 300 (5 minutes) |
The duration, in seconds, for which WiredTiger retains snapshot history. Setting this to 0 disables the snapshot history window. This parameter primarily supports reads at a specific cluster time using atClusterTime.
Symptom
This parameter adds pressure to the WiredTiger cache (WT cache), particularly when the same documents are updated frequently.
Recommendation
No adjustment is needed in most cases.
If your workload does not use atClusterTime reads, set this to 0 to reduce WT cache pressure.
If you need to read snapshot data older than 5 minutes, increase this value, but account for the additional memory and CPU overhead.
If the snapshot window is smaller than the age of the snapshot you request, MongoDB returns a SnapshotTooOld error.
rsconf.chainingAllowed
| Attribute | Value |
|---|---|
| Applicable versions | 4.0 and later |
| Restart required | No |
| Default | true |
Controls whether secondary nodes in the replica set can sync from another secondary (chained replication) rather than always syncing directly from the primary.
Symptoms
Disabling chained replication increases the primary node's CPU utilization and network traffic.
Enabling chained replication makes it easier for secondary nodes to accumulate replication lag.
Recommendation
| Cluster size | Guidance |
|---|---|
| 4 or fewer nodes | Enable or disable based on your network topology and latency requirements |
| 5 or more nodes with {w: majority} | Disabling chained replication improves write performance but significantly increases primary node load; evaluate the trade-off for your workload |
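Chaining is controlled through the replica set configuration rather than setParameter. A sketch in mongosh, run against the primary; a reconfig briefly affects elections, so schedule it during off-peak hours:

```javascript
// Sketch: disable chained replication so all secondaries sync from the primary.
const cfg = rs.conf();
cfg.settings = cfg.settings || {};
cfg.settings.chainingAllowed = false;
rs.reconfig(cfg);
```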
setParameter.internalQueryMaxPushBytes / setParameter.internalQueryMaxAddToSetBytes
| Attribute | Value |
|---|---|
| Applicable versions | 4.2 and later |
| Restart required | No |
| Default | 104857600 (100 MB) |
The maximum memory that the $push and $addToSet accumulator operators can use per query.
Symptom
A query using $push or $addToSet fails with:
"errMsg": "$push used too much memory and cannot spill to disk. Memory limit: 104857600...
Recommendation
No adjustment is needed in most cases. If you consistently hit this limit for a specific query, increase the value. Setting this to a very large value risks out-of-memory (OOM) errors on the mongod node.
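If a specific query legitimately needs larger accumulators, the limit can be raised at runtime. A sketch, where 209715200 (200 MB) is an illustrative value to weigh against the node's free memory:

```javascript
// Sketch: raise the $push accumulator memory limit to 200 MB for a
// workload that legitimately aggregates large arrays. 209715200 is
// an illustrative value, not a recommendation.
db.adminCommand({ setParameter: 1, internalQueryMaxPushBytes: 209715200 });
db.adminCommand({ setParameter: 1, internalQueryMaxAddToSetBytes: 209715200 });
```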
Sharded clusters (Shard)
setParameter.migrateCloneInsertionBatchSize
| Attribute | Value |
|---|---|
| Applicable versions | 4.0 and later |
| Restart required | No |
| Default | 0 (bounded by the 16 MB BSON document size limit) |
The maximum number of documents per batch during the clone phase of a chunk migration.
Symptom
Chunk migrations during balancing cause latency spikes on the affected shard.
Recommendation
No adjustment is needed in most cases. If chunk migration consistently causes performance fluctuations during balancing, set this to a fixed batch size to control migration throughput.
setParameter.rangeDeleterBatchDelayMS
| Attribute | Value |
|---|---|
| Applicable versions | 4.0 and later |
| Restart required | No |
| Default | 20 (ms) |
The pause between consecutive batch deletions during the cleanup phase of a chunk migration. Also applies to the cleanupOrphaned command.
Symptoms
Asynchronous post-migration document deletion causes a CPU spike on the shard.
Setting this value too high delays orphaned document cleanup, and may result in a timeout:
Message: "OperationFailed: Data transfer error: ExceededTimeLimit: Failed to delete orphaned <db>.<collection> range [xxxxxx,xxxxx] :: caused by :: operation exceeded time limit"
Recommendation
No adjustment is needed in most cases. If CPU spikes during balancing are traced to orphaned document deletion, increase this value — for example, to 200 — to throttle the deletion rate.
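The delay and batch size jointly bound the deletion rate. A rough model, under the simplifying assumption that the batch delete itself takes negligible time (real throughput is lower):

```javascript
// Sketch: upper bound on orphaned-document deletion throughput implied
// by rangeDeleterBatchSize and rangeDeleterBatchDelayMS.
function maxDeletionDocsPerSec(batchSize, batchDelayMS) {
  if (batchDelayMS <= 0) throw new Error("delay must be positive");
  return batchSize * (1000 / batchDelayMS);
}

// Defaults (~128 docs every 20 ms) allow up to 6400 docs/s; raising the
// delay to 200 ms cuts the ceiling to 640 docs/s.
const ceiling = maxDeletionDocsPerSec(128, 20);
```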
This parameter works together with setParameter.rangeDeleterBatchSize. Adjust them separately or in combination to control the overall deletion throughput.
setParameter.rangeDeleterBatchSize
| Attribute | Value |
|---|---|
| Applicable versions | 4.0 and later |
| Restart required | No |
| Default | 0 (auto-selected, typically 128 documents per batch) |
The maximum number of documents per batch for asynchronous orphaned document deletion after chunk migration.
Symptom
Asynchronous post-migration deletion causes CPU utilization spikes on the shard.
Recommendation
No adjustment is needed in most cases. If CPU spikes during balancing are traced to orphaned document deletion, set this to a fixed batch size. Use this parameter together with setParameter.rangeDeleterBatchDelayMS to fine-tune deletion throughput.
setParameter.receiveChunkWaitForRangeDeleterTimeoutMS
| Attribute | Value |
|---|---|
| Applicable versions | 4.4 and later |
| Restart required | No |
| Default | 10000 (10 seconds) |
The time a moveChunk operation waits for the range deleter to finish clearing orphaned documents before a migration starts, in milliseconds.
Symptom
The balancer logs a timeout error:
ExceededTimeLimit: Failed to delete orphaned <db.collection> range [{ <shard_key>: MinKey }, { <shard_key>: -9186000910690368367 }) :: caused by :: operation exceeded time limit
Recommendation
No adjustment is needed in most cases. If this timeout error appears consistently, increase this value to give the range deleter more time to complete before the next migration begins.
setParameter.minSnapshotHistoryWindowInSeconds / setParameter.maxTargetSnapshotHistoryWindowInSeconds
Same behavior and recommendation as the replica set section above. Applies to each shard's mongod nodes.
rsconf.chainingAllowed
Same behavior and recommendation as the replica set section above. Applies to each shard's replica set.
setParameter.internalQueryMaxPushBytes / setParameter.internalQueryMaxAddToSetBytes
Same behavior and recommendation as the replica set section above. Applies to shard nodes.
Sharded clusters (Mongos)
operationProfiling.slowOpThresholdMs
| Attribute | Value |
|---|---|
| Applicable versions | 3.0 and later |
| Restart required | No |
| Default | 100 (ms) |
Same behavior and recommendation as the replica set section above. Applies to Mongos nodes.
setParameter.ShardingTaskExecutorPoolMaxConnecting
| Attribute | Value |
|---|---|
| Applicable versions | 3.6 and later |
| Restart required | Yes (3.6 and 4.0) / No (4.2 and later) |
| Default | 2 |
The maximum number of concurrent connection handshakes in the TaskExecutor connection pool on a Mongos node. This controls the rate at which Mongos establishes new connections to mongod nodes.
Symptom
When many connections are created simultaneously, the Mongos node experiences a CPU spike.
Recommendation
Do not adjust this parameter.
setParameter.ShardingTaskExecutorPoolMaxSize
| Attribute | Value |
|---|---|
| Applicable versions | 3.6 and later |
| Restart required | Yes (3.6 and 4.0) / No (4.2 and later) |
| Default | 2^64 - 1 (maximum 64-bit integer) |
The maximum number of connections per TaskExecutor connection pool on a Mongos node.
Recommendation
No adjustment is needed. If you need to cap the number of Mongos-to-shard connections, set a lower value, but avoid setting it too low. An exhausted connection pool causes requests on Mongos to queue and stall.
setParameter.ShardingTaskExecutorPoolMinSize
| Attribute | Value |
|---|---|
| Applicable versions | 3.6 and later |
| Restart required | Yes (3.6 and 4.0) / No (4.2 and later) |
| Default | 1 |
The minimum number of connections maintained per TaskExecutor connection pool on a Mongos node.
Symptom
A sudden burst of requests forces the connection pool to create many new connections at once, causing a CPU spike and request latency increase on the Mongos node.
Recommendation
Set this to a value in the range [10, 50]. The right value depends on your shard topology — the number of shards and the number of nodes per shard. Keep in mind that Mongos consumes a small amount of memory to maintain idle connections to each shard.
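Raising the minimum pre-warms the pools so a request burst does not trigger a wave of new handshakes. A sketch, where 20 is an illustrative value within the suggested [10, 50] range; on 4.2 and later this takes effect without a restart:

```javascript
// Sketch: raise the per-pool minimum connection count on a Mongos node.
// 20 is an illustrative value; pick based on shard count and node count.
db.adminCommand({ setParameter: 1, ShardingTaskExecutorPoolMinSize: 20 });

// Verify the change:
db.adminCommand({ getParameter: 1, ShardingTaskExecutorPoolMinSize: 1 });
```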
setParameter.cursorTimeoutMillis
| Attribute | Value |
|---|---|
| Applicable versions | 3.0 and later |
| Restart required | No |
| Default | 600000 (10 minutes) |
Same behavior and recommendation as the replica set section above. Applies to Mongos nodes.