
PolarDB: Troubleshoot high CPU utilization

Last Updated:May 12, 2025

When you use PolarDB for PostgreSQL clusters, CPU utilization may increase or even reach 100%. This topic describes the common causes of high CPU utilization, the corresponding troubleshooting methods, and solutions.

Causes and troubleshooting methods

Increasing workloads

The most common cause of soaring CPU utilization is an increasing workload, which consumes more of the database's computing resources.

Troubleshooting methods

You can check whether the current database has more active connections than usual by using the following methods:

  • If a monitoring system is provided for the database, view the changes in the number of active connections in the chart.

  • If no monitoring system is provided for the database, connect to the database and execute the following statement to obtain the number of active connections:

    SELECT COUNT(*) FROM pg_stat_activity WHERE state <> 'idle';
    Note

    • pg_stat_activity is a built-in system view of PolarDB for PostgreSQL. Each row returned by the view represents a running server process.

    • state indicates the current state of the process. Valid values:

      • active: The process is executing a query.

      • idle: The process is idle and waiting for new commands from the client.

      • idle in transaction: The process is in a transaction but is not executing a query.

      • idle in transaction (aborted): The process is in a transaction, but a statement in the transaction has caused an error.

      • fastpath function call: The process is executing a fast-path function call.

      • disabled: State tracking (track_activities) is disabled for this process.

    The preceding statement returns the number of processes that are not in the idle state, which is the number of active connections that may be consuming CPU resources. If the database has significantly more active connections than usual, the high CPU utilization is likely caused by increasing workloads.
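To see how the active connections break down, you can also group the non-idle processes by state. The following statement is a minimal sketch based on the same pg_stat_activity view:

```sql
-- Count non-idle server processes by state.
SELECT state, COUNT(*) AS connection_count
FROM pg_stat_activity
WHERE state <> 'idle'
GROUP BY state
ORDER BY connection_count DESC;
```

A large number of processes in the active state suggests genuine workload growth, whereas many processes in the idle in transaction state may instead point to transactions held open by the application.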

Slow queries

If CPU utilization increases but the number of active connections stays within the normal range, the cause may be slow queries. A slow query can consume a large amount of CPU for a long period of time and drive up CPU utilization.

PolarDB for PostgreSQL provides the slow query log feature. SQL statements whose execution time exceeds the value of the log_min_duration_statement parameter are recorded in the slow query logs. However, when CPU utilization is close to 100%, the entire system slows down and almost all SQL statements execute slowly. In this case, the slow query log contains a large number of entries, which can complicate troubleshooting.
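The log_min_duration_statement threshold can be adjusted so that only genuinely slow statements are logged. The following sketch assumes you have the privileges to change server parameters; on a managed cluster, the parameter may need to be changed in the console instead, and the 1000 ms value is only an example:

```sql
-- Log statements that run longer than 1 second (the unit is milliseconds).
-- 1000 is an example threshold; tune it for your workload.
ALTER SYSTEM SET log_min_duration_statement = 1000;
-- Reload the configuration so the change takes effect without a restart.
SELECT pg_reload_conf();
-- Verify the current value.
SHOW log_min_duration_statement;
```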

Troubleshooting methods

Discover slow queries with long execution durations

The pg_stat_statements extension records statistics for all SQL statements executed on the database server, covering both the planning and execution phases.

Because the extension uses shared memory, its name must be included in the shared_preload_libraries parameter.
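You can check whether the extension is already preloaded before you use it. This is a minimal sketch; note that a change to shared_preload_libraries takes effect only after a server restart:

```sql
-- Check whether pg_stat_statements is listed among the preloaded libraries.
SHOW shared_preload_libraries;
```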

Note

If the pg_stat_statements extension has not been created in the current database, execute the following statement to create it. This also registers the functions and views that the extension provides:

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

  1. The pg_stat_statements extension and the database continuously accumulate statistics. To troubleshoot high CPU utilization, first clear all existing statistics in both the database and the extension, and then collect new statistics from this point on.

    -- Clear all existing statistics in the current database.
    SELECT pg_stat_reset();
    -- Clear all existing statistics collected by the pg_stat_statements extension.
    SELECT pg_stat_statements_reset();
  2. Wait one or two minutes for the database and extension to collect sufficient statistics.

  3. After sufficient statistics are collected, execute the following statements to query the top five SQL statements by total planning time and by total execution time:

    SELECT * FROM pg_stat_statements ORDER BY total_plan_time DESC LIMIT 5;
    SELECT * FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 5;
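Because SELECT * returns a large number of columns, a narrower projection is often easier to read. The following sketch assumes PostgreSQL 13 or later, where the columns are named total_exec_time and mean_exec_time:

```sql
-- Show only the columns that are most useful for CPU troubleshooting.
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;
```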

Discover slow queries with high buffer reads

If a table has no index and point queries are frequently performed on it, full table scans are required. All records must be read into memory and filtered against the query conditions, which significantly increases CPU utilization.

Execute the following statement to query the top five SQL statements in terms of buffer reads based on the statistics of the pg_stat_statements extension:

SELECT * FROM pg_stat_statements
ORDER BY shared_blks_hit + shared_blks_read DESC
LIMIT 5;

You can also identify tables with frequent full scans based on the statistics in the built-in pg_stat_user_tables system view of PolarDB for PostgreSQL.

Execute the following statement to query, among tables that contain more than 100,000 live tuples, the top five tables by the number of tuples read by sequential (full) scans:

SELECT * FROM pg_stat_user_tables
WHERE n_live_tup > 100000 AND seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 5;

Discover long-running queries that do not end

You can use the built-in pg_stat_activity view to find SQL statements that have been running for a long time without finishing. Such statements may cause high CPU utilization.

Execute the following statement to query the top five longest-running SQL statements that have not finished:

SELECT
    *,
    extract(epoch FROM (NOW() - xact_start)) AS xact_stay,
    extract(epoch FROM (NOW() - query_start)) AS query_stay
FROM pg_stat_activity
WHERE state NOT LIKE 'idle%'
ORDER BY query_stay DESC
LIMIT 5;

Combined with the tables with frequent full scans identified earlier, execute the following statement to find the slow queries on a specific table that have lasted more than 10 seconds. Replace Table name with the name of the table:

SELECT * FROM pg_stat_activity
WHERE
    state NOT LIKE 'idle%' AND
    query ILIKE '%<Table name>%' AND
    NOW() - query_start > interval '10s';

Solutions

  • If only a few unexpected SQL statements are consuming high CPU resources, you can send a signal to the corresponding backend processes to interrupt those statements so that CPU utilization can return to normal. Execute one of the following statements, which take the process ID (the pid column of the pg_stat_activity view) as a parameter. pg_cancel_backend cancels only the current query, whereas pg_terminate_backend terminates the entire backend process and closes its connection.

    SELECT pg_cancel_backend(pid);
    SELECT pg_terminate_backend(pid);
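As a sketch, pg_cancel_backend can also be combined with pg_stat_activity to cancel every non-idle query that exceeds a duration threshold. The 10-minute threshold below is only an example; choose a value appropriate for your workload:

```sql
-- Cancel all non-idle queries that have been running for more than 10 minutes.
SELECT pid, query, pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE state NOT LIKE 'idle%'
  AND pid <> pg_backend_pid()  -- exclude the current session
  AND NOW() - query_start > interval '10 min';
```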
  • If the time-consuming SQL statements are necessary, you must optimize them. You can sample the tables involved in these SQL statements and update their statistics so that the optimizer can generate better execution plans.

    Note

    Sampling consumes CPU resources. We recommend that you sample tables during off-peak hours.

    ANALYZE <Table name>;
  • For tables that frequently require full scans, you can create indexes on the commonly used filter columns. Index scans can then replace full table scans, which avoids loading and filtering every record in memory and reduces CPU utilization.
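For example, assuming a hypothetical orders table whose point queries filter on a customer_id column, an index can be created as follows. The table and column names are illustrative, and CREATE INDEX CONCURRENTLY builds the index without blocking concurrent writes, at the cost of a longer build:

```sql
-- Hypothetical table and column names, for illustration only.
CREATE INDEX CONCURRENTLY idx_orders_customer_id
    ON orders (customer_id);
```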