The ApsaraDB for MongoDB console provides a range of performance metrics that you can use to check the running status of your instances.

Usage notes

  • When you receive an alert message from Alibaba Cloud, such as a message indicating that the CPU usage of your instance exceeds 80%, you can view the monitoring information of the instance in the ApsaraDB for MongoDB console to troubleshoot the issue. You can filter by node to check the status of each node and locate the node where the issue occurs.
  • Monitoring information is retained for up to seven days. You cannot view monitoring information that was generated more than seven days ago.
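
To illustrate the retention limit, the following minimal sketch trims a requested time range to the seven-day window described above. The names `RETENTION` and `clamp_to_retention` are hypothetical helpers written for this example; they are not part of the console or of any ApsaraDB for MongoDB SDK.

```python
from datetime import datetime, timedelta, timezone

# Retention period stated in the usage notes above.
RETENTION = timedelta(days=7)


def clamp_to_retention(start, end, now):
    """Return the part of [start, end] that is still retained.

    Returns a (start, end) tuple, or None if the whole requested range
    is older than the retention window (or otherwise empty).
    """
    earliest = now - RETENTION
    clamped_start = max(start, earliest)
    clamped_end = min(end, now)
    if clamped_start >= clamped_end:
        return None
    return clamped_start, clamped_end


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Ask for the last 10 days; only the last 7 days can be returned.
    print(clamp_to_retention(now - timedelta(days=10), now, now))
```

A request that ends before the window starts yields `None`, which corresponds to the case where the console has no data left to display.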

Procedure

  1. Log on to the ApsaraDB for MongoDB console.
  2. In the upper-left corner of the page, select the resource group and the region of the target instance.
  3. In the left-side navigation pane, click Replica Set Instances or Sharded Cluster Instances based on the instance type.
  4. Find the target instance and click its ID.
  5. In the left-side navigation pane of the page that appears, click Monitoring Info.
  6. View monitoring information based on the instance type:
    Note: By default, the Monitoring Info page displays the monitoring information of the last day. You can also select a time range to view historical monitoring information.
    • Standalone instances: You can view the monitoring information of only the primary node.
    • Replica set instances: You can view the monitoring information of the primary or secondary nodes by selecting a node from the drop-down list in the upper part of the Monitoring Info page.
    • Sharded cluster instances: You can view the monitoring information of Mongos, shard, or ConfigServer nodes by selecting a node from the drop-down list in the upper part of the Monitoring Info page.
      Note: Mongos node IDs are prefixed with s-. Shard node IDs are prefixed with d-. ConfigServer node IDs are suffixed with -cs.
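
The ID convention for sharded cluster nodes can be applied in a script, for example to label the nodes in an alert. The sketch below is illustrative only: the function name `classify_node` and the sample IDs are made up, and checking the -cs suffix before the prefixes is an assumption about how to resolve an ID that matches more than one rule.

```python
def classify_node(node_id: str) -> str:
    """Guess the role of a sharded cluster node from its ID.

    Rules taken from the note above:
      - IDs suffixed with "-cs" are ConfigServer nodes
      - IDs prefixed with "s-" are Mongos nodes
      - IDs prefixed with "d-" are shard nodes
    """
    # Assumption: the suffix rule takes precedence if several rules match.
    if node_id.endswith("-cs"):
        return "configserver"
    if node_id.startswith("s-"):
        return "mongos"
    if node_id.startswith("d-"):
        return "shard"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    for node_id in ("s-example01", "d-example02", "example03-cs", "other"):
        print(node_id, "->", classify_node(node_id))
```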

Metrics

Metric | Description
CPU Utilization Percentage | The CPU usage of the instance.
Memory Usage Percentage | The memory usage of the instance.
IOPS Usage | The input/output operations per second (IOPS) of the instance. You can view the following specific metrics:
  • data_iops: the IOPS of the data disk.
  • log_iops: the IOPS of the log disk.
IOPS Usage Percentage | The IOPS used by the instance as a percentage of the maximum IOPS allowed.
Disk Usage | The disk space used by the instance. You can view the following specific metrics:
  • ins_size: the total space used.
  • data_size: the space used on the data disk.
  • log_size: the space used on the log disk.
Disk Usage Percentage | The disk space used by the instance as a percentage of the maximum disk space available.
Opcounters | The queries per second (QPS) of the instance. You can view the following specific metrics:
  • insert: the number of insert operations.
  • query: the number of query operations.
  • delete: the number of delete operations.
  • update: the number of update operations.
  • getmore: the number of getmore operations.
  • command: the number of command operations.
Connections | The current number of connections to the instance.
Cursors | The number of cursors used by the instance. You can view the following specific metrics:
  • total_open: the number of open cursors.
  • timed_out: the number of cursors that timed out.
Network | The network traffic of the instance. You can view the following specific metrics:
  • bytes_in: the inbound network traffic.
  • bytes_out: the outbound network traffic.
  • num_requests: the number of requests that are processed.
Global Lock | The length of the queues waiting for global locks on the instance. You can view the following specific metrics:
  • gl_cq_readers: the length of the queue waiting for global read locks.
  • gl_cq_writers: the length of the queue waiting for global write locks.
  • gl_cq_total: the total length of the queues waiting for global read and write locks.
WiredTiger | The cache metrics of the WiredTiger engine used by the instance. You can view the following specific metrics:
  • bytes_read_into_cache: the volume of data that is read from the disk into the cache.
  • bytes_written_from_cache: the volume of data that is written from the cache to the disk.
  • maximum_bytes_configured: the maximum cache size that is configured.
Primary/Secondary Replication Latency | repl_lag: the latency in data synchronization between the primary node and secondary nodes of the instance.
WT Request Queues | The number of concurrent requests that are being handled and the remaining number of concurrent requests that can be handled. You can view the following specific metrics:
  • write_concurrent_trans_out: the number of concurrent write requests that are being handled.
  • read_concurrent_trans_out: the number of concurrent read requests that are being handled.
  • write_concurrent_trans_available: the remaining number of concurrent write requests that can be handled.
  • read_concurrent_trans_available: the remaining number of concurrent read requests that can be handled.
IO Latency | iocheck_cost: the latency of I/O operations, indicating the I/O response performance.
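
To relate the rate and percentage metrics to raw numbers, here is a minimal sketch. It relies on the fact that MongoDB's `serverStatus` command reports `opcounters` as cumulative counts since the process started, so a per-second rate has to be derived from two samples. Whether the console computes its values exactly this way is an assumption, and the helper names `opcounter_rates` and `usage_percent` are hypothetical.

```python
# Operation types listed under the Opcounters metric.
OPS = ("insert", "query", "update", "delete", "getmore", "command")


def opcounter_rates(prev, curr, interval_seconds):
    """Derive per-second operation rates from two cumulative snapshots.

    prev and curr are dicts shaped like the "opcounters" section of
    serverStatus, sampled interval_seconds apart.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    rates = {}
    for op in OPS:
        delta = curr[op] - prev[op]
        # Counters restart from zero when the process restarts; in this
        # sketch a negative delta is simply reported as a rate of 0.
        rates[op] = max(delta, 0) / interval_seconds
    return rates


def usage_percent(used, maximum):
    """Express used as a percentage of maximum (for example IOPS or disk)."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return 100.0 * used / maximum


if __name__ == "__main__":
    # Made-up sample values, 60 seconds apart.
    prev = {op: 1000 for op in OPS}
    curr = dict(prev, insert=1060, query=1600)
    print(opcounter_rates(prev, curr, 60))
    print(usage_percent(450, 1800))
```

The same `usage_percent` arithmetic applies to both IOPS Usage Percentage and Disk Usage Percentage: the used amount divided by the allowed maximum, multiplied by 100.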