When you use ApsaraDB for MongoDB, the CPU utilization of an instance may become excessively high or even approach 100%. High CPU utilization slows down read and write operations and affects normal business operations. This topic describes how to troubleshoot high CPU utilization on ApsaraDB for MongoDB instances.

Analyze running requests in ApsaraDB for MongoDB databases

  1. Connect to an ApsaraDB for MongoDB instance by using the mongo shell.
  2. Run the db.currentOp() command to check running operations in ApsaraDB for MongoDB databases.
    The following code provides an example of the command output:
    {
            "desc" : "conn632530",
            "threadId" : "140298196924160",
            "connectionId" : 632530,
            "client" : "11.192.159.236:57052",
            "active" : true,
            "opid" : 1008837885,
            "secs_running" : 0,
            "microsecs_running" : NumberLong(70),
            "op" : "update",
            "ns" : "mygame.players",
            "query" : {
                "uid" : NumberLong(31577677)
            },
            "numYields" : 0,
            "locks" : {
                "Global" : "w",
                "Database" : "w",
                "Collection" : "w"
            },
            ....
        }
    The following table describes the fields that require close attention.
    Parameter            Description
    client               The client that sent the request.
    opid                 The unique ID of the operation. If necessary, you can run the db.killOp(opid) command to terminate the operation.
    secs_running         The duration that the operation has been running. Unit: seconds. If a relatively large value is returned for this field, check whether the request is appropriate.
    microsecs_running    The duration that the operation has been running. Unit: microseconds. If a relatively large value is returned for this field, check whether the request is appropriate.
    ns                   The namespace, in database.collection format, that the operation targets.
    op                   The operation type. In most cases, this value is query, insert, update, or delete.
    locks                The lock-related information. For more information, see FAQ: Concurrency.
    Note For more information about the db.currentOp() command, see db.currentOp().
You can run the db.currentOp() command to check running operations and determine whether ApsaraDB for MongoDB is processing time-consuming requests. For example, CPU utilization may be low during routine business, but when O&M personnel log on to ApsaraDB for MongoDB databases and perform operations that require a collection scan, CPU utilization increases significantly and ApsaraDB for MongoDB becomes sluggish. In this case, check for time-consuming operations.
Note If you find an abnormal request, you can obtain the operation ID (opid) of this request and run the db.killOp(opid) command to terminate this request.

For more information about the db.killOp() command, see db.killOp().
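
The check described in this section can be sketched as a small mongo shell helper. This is a minimal sketch, not an official tool: the function name findLongRunningOps and the 10-second threshold are assumptions chosen for illustration. The helper only reads the inprog array that db.currentOp() returns and lists active operations that have run for at least the given number of seconds, longest first.

```javascript
// Hypothetical helper (not part of MongoDB): returns the active operations
// that have been running for at least `minSecs` seconds, longest first.
function findLongRunningOps(inprog, minSecs) {
  function secs(op) {
    return Number(op.secs_running || 0);
  }
  return inprog
    .filter(function (op) {
      return op.active === true && secs(op) >= minSecs;
    })
    .sort(function (a, b) {
      return secs(b) - secs(a);
    })
    .map(function (op) {
      // Keep only the fields that this topic asks you to review.
      return {
        opid: op.opid,
        client: op.client,
        op: op.op,
        ns: op.ns,
        secs_running: secs(op)
      };
    });
}

// Inside the mongo shell, `db` is defined and db.currentOp() returns a
// document whose `inprog` array lists the operations in progress.
if (typeof db !== "undefined") {
  // 10 seconds is an arbitrary threshold chosen for this sketch.
  findLongRunningOps(db.currentOp().inprog, 10).forEach(function (op) {
    printjson(op);
  });
  // After you review an entry, you can terminate it explicitly if necessary:
  // db.killOp(<opid>)
}
```

The helper only reports operations. Terminating an operation is left as a manual step so that each request is reviewed before db.killOp() is run.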

Analyze slow requests in ApsaraDB for MongoDB databases

If the CPU utilization of your ApsaraDB for MongoDB instance immediately increases and remains high after your application starts running and you cannot find abnormal requests in the output of the db.currentOp() command, you can analyze slow requests in the instance databases.

  1. View the slow query logs of the instance in the ApsaraDB for MongoDB console. For more information, see View slow query logs.
  2. Analyze the slow query logs to troubleshoot high CPU utilization for the instance.
    The following example shows a slow query log. For this request, ApsaraDB for MongoDB runs a collection scan on 11,000,000 documents, instead of querying data based on an index.
    {
            "op" : "query",
            "ns" : "123.testCollection",
            "command" : {
                    "find" : "testCollection",
                    "filter" : {
                            "name" : "zhangsan"
                    },
                    "$db" : "123"
            },
            "keysExamined" : 0,
            "docsExamined" : 11000000,
            "cursorExhausted" : true,
            "numYield" : 85977,
            "nreturned" : 0,
            "locks" : {
                    "Global" : {
                            "acquireCount" : {
                                    "r" : NumberLong(85978)
                            }
                    },
                    "Database" : {
                            "acquireCount" : {
                                    "r" : NumberLong(85978)
                            }
                    },
                    "Collection" : {
                            "acquireCount" : {
                                    "r" : NumberLong(85978)
                            }
                    }
            },
            "responseLength" : 232,
            "protocol" : "op_command",
            "millis" : 19428,
            "planSummary" : "COLLSCAN",
            "execStats" : {
                    "stage" : "COLLSCAN",
                    "filter" : {
                            "name" : {
                                    "$eq" : "zhangsan"
                            }
                    },
                    "nReturned" : 0,
                    "executionTimeMillisEstimate" : 18233,
                    "works" : 11000002,
                    "advanced" : 0,
                    "needTime" : 11000001,
                    "needYield" : 0,
                    "saveState" : 85977,
                    "restoreState" : 85977,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "direction" : "forward",
    ....in"
                    }
            ],
            "user" : "root@admin"
    }
For slow query logs, you must pay close attention to the following items:
  • Collection scan (keywords: COLLSCAN and docsExamined)
    • COLLSCAN indicates a collection scan.
      A collection scan for a request (such as a query, update, or delete operation) may cause a high CPU utilization. If you find a COLLSCAN keyword in slow query logs, your CPU resources may have been occupied by these requests.
      Note If such requests are frequently submitted, we recommend that you create an index on queried fields to optimize query performance.
    • The docsExamined field indicates the number of documents that ApsaraDB for MongoDB has scanned for a request. The greater the field value is, the more CPU resources this request occupies.
  • Inappropriate indexes (keywords: IXSCAN and keysExamined)
    Note
    • Excessive indexes affect the write and update performance.
    • If your application involves a large number of write operations and you use indexes, the application performance may be affected.

    The keysExamined field indicates the number of index keys that ApsaraDB for MongoDB has scanned for a request that uses an index. The greater the field value is, the more CPU resources this request occupies.

    If you create an index that is inappropriate or matches a large amount of data, the index cannot reduce CPU overheads or accelerate the execution of a request.

    For example, for the data in a collection, the x field can be set only to 1 or 2, whereas the y field has a wider value range.
    { x: 1, y: 1 }
    { x: 1, y: 2 }
    { x: 1, y: 3 }
    ......
    { x: 1, y: 100000 }
    { x: 2, y: 1 }
    { x: 2, y: 2 }
    { x: 2, y: 3 }
    ......
    { x: 2, y: 100000 }
    To query for documents that match {x: 1, y: 2}, you can create one of the following indexes on the collection.
    db.collection.createIndex( {x: 1} )         // Inappropriate: x has only two distinct values, so a large number of documents share each x value.
    db.collection.createIndex( {x: 1, y: 1} )   // Leads with the low-selectivity field x: queries that filter on x alone still scan a large number of index keys.
    db.collection.createIndex( {y: 1} )         // Appropriate: only a small number of documents share each y value.
    db.collection.createIndex( {y: 1, x: 1} )   // Appropriate: the index leads with the high-selectivity field y.
    Note For the difference between indexes {y: 1} and {y: 1, x: 1}, see Design Principles of MongoDB Indexes and Compound Indexes.
  • Sorting of a large amount of data (keywords: SORT and hasSortStage)
    The value of the hasSortStage field is true when a query cannot use the ordering of an index to return the requested sorted results. In this case, ApsaraDB for MongoDB must sort the query results itself, and sorting a large amount of data may cause high CPU utilization. To address this issue, you can create an index on frequently sorted fields to optimize sorting performance.
    Note If you find the SORT keyword in slow query logs, you can use an index to optimize sorting performance.
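
The effect of index selectivity in the x and y example above can be illustrated with plain JavaScript. This is a minimal sketch under stated assumptions: the helper name avgDocsPerValue is hypothetical, and the data set is a scaled-down copy of the example (y ranges from 1 to 1000 instead of 1 to 100000). The helper counts how many documents share one value of a field on average. The fewer documents share a value, the more an index on that field narrows a query.

```javascript
// Hypothetical helper: average number of documents that share one value of `field`.
// A lower result means that an index on the field narrows a query more.
function avgDocsPerValue(docs, field) {
  var counts = {};
  docs.forEach(function (doc) {
    var key = String(doc[field]);
    counts[key] = (counts[key] || 0) + 1;
  });
  var distinct = Object.keys(counts).length;
  return distinct === 0 ? 0 : docs.length / distinct;
}

// Build a small data set shaped like the example: x is 1 or 2, y ranges from 1 to 1000.
var docs = [];
for (var x = 1; x <= 2; x++) {
  for (var y = 1; y <= 1000; y++) {
    docs.push({ x: x, y: y });
  }
}

var perX = avgDocsPerValue(docs, "x"); // 1000 documents share each x value
var perY = avgDocsPerValue(docs, "y"); // 2 documents share each y value
```

On a real collection, you can compare candidate indexes by running the query with explain("executionStats") in the mongo shell and checking the totalKeysExamined and totalDocsExamined values in the output.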

Other operations, such as index creation and aggregation (which can combine traversal, query, update, and sort operations), may also cause high CPU utilization. You can use the preceding methods to troubleshoot them.
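
The checks described in this section can be combined into a small triage function for a single slow query log entry. This is a sketch, not a feature of ApsaraDB for MongoDB: the function name triageSlowLog and the maxScanRatio threshold are assumptions. The function uses only fields that appear in the log, namely planSummary, keysExamined, docsExamined, nreturned, and hasSortStage.

```javascript
// Hypothetical helper: lists possible CPU-related findings for one slow query log entry.
// `maxScanRatio` is an assumed threshold, not a value from the product documentation.
function triageSlowLog(entry, maxScanRatio) {
  var findings = [];
  var plan = String(entry.planSummary || "");
  var returned = Math.max(Number(entry.nreturned || 0), 1); // avoid division by zero
  var docsExamined = Number(entry.docsExamined || 0);
  var keysExamined = Number(entry.keysExamined || 0);

  if (plan.indexOf("COLLSCAN") !== -1) {
    findings.push("COLLSCAN: collection scan, consider an index on the queried fields");
  }
  if (plan.indexOf("IXSCAN") !== -1 && keysExamined / returned > maxScanRatio) {
    findings.push("IXSCAN: many index keys examined per returned document, review the index");
  }
  if (docsExamined / returned > maxScanRatio) {
    findings.push("docsExamined: many documents examined per returned document");
  }
  if (entry.hasSortStage === true) {
    findings.push("SORT: results sorted without an index, consider an index on the sorted fields");
  }
  return findings;
}

// Values taken from the slow query log example in this section.
var sampleEntry = {
  op: "query",
  ns: "123.testCollection",
  keysExamined: 0,
  docsExamined: 11000000,
  nreturned: 0,
  millis: 19428,
  planSummary: "COLLSCAN"
};

// A ratio of 100 is an arbitrary threshold chosen for this sketch.
var findings = triageSlowLog(sampleEntry, 100);
```

For the sample entry, the function reports the collection scan and the large number of examined documents. An empty result means only that none of these heuristics matched, not that the request is efficient.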

Assess service capabilities

If all requests are appropriate and use indexes efficiently after you analyze and optimize the running requests and slow requests in ApsaraDB for MongoDB databases, but CPU utilization remains at or near 100%, your instances may have reached their maximum capacity. In this case, we recommend that you perform the following steps:
  1. View the monitoring information to analyze the resource usage of instances. For more information, see View monitoring data and Basic monitoring.
  2. Check whether current instances meet the performance and capability requirements of your business scenarios.

For information about how to upgrade instances, see Overview or Change the configurations of a replica set instance.