
AnalyticDB: Slow query diagnostics

Last Updated: Apr 19, 2024

AnalyticDB for PostgreSQL provides a slow query diagnostics feature that helps you diagnose, analyze, and optimize slow queries. You must enable the feature before you can use it. After the feature is enabled, the system automatically records SQL queries that run longer than 1 second (the default threshold). You can then log on to the postgres database to query and analyze the recorded queries.

Prerequisites

The slow query diagnostics feature is available only for AnalyticDB for PostgreSQL V7.0 instances in elastic storage mode whose minor engine version is V7.0.5.0 or later. For more information about how to view and update minor engine versions, see View the minor engine version and Update the minor engine version.

Precautions

  • By default, slow query logs retain data for the last seven days and do not record failed queries. Slow query logs record all SQL statements whose execution time exceeds the value of the Grand Unified Configuration (GUC) parameter slow_query_min_duration. The default value of this parameter is 1 second, which means that all SQL statements that run longer than 1 second, as well as all data definition language (DDL) statements, are recorded in slow query logs. To ensure system stability, we recommend that you do not modify the default setting of this parameter.

  • Slow query logs record at most the first 1,024 bytes of each SQL statement. Statements that exceed this length are truncated.

Enable or disable the slow query diagnostics feature

-- Check whether the slow query diagnostics feature is enabled.
-- ON: enabled. OFF: disabled.
SHOW adbpg_feature_enable_query_monitor;
-- Enable the slow query diagnostics feature for a business database.
ALTER DATABASE {Business database name} SET adbpg_feature_enable_query_monitor TO ON;
-- Disable the slow query diagnostics feature for a business database.
ALTER DATABASE {Business database name} SET adbpg_feature_enable_query_monitor TO OFF;

Examples

By default, the system collects statistics about all SQL statements that consume more than 1 second and all DDL statements. Slow query logs are stored in the postgres database. If you want to query the slow queries of the current instance, you can switch to the postgres database and use the qmonitor.instance_slow_queries (instance level) and qmonitor.host_slow_queries (node level) views to diagnose the slow queries. The following section describes the typical scenarios for diagnosing slow queries.

Note

For more information about the related fields, see the Appendixes section in this topic.

  • Query all slow query statements that are executed within a specified time range.

    • Query all slow query statements that were executed in a specified time period, such as the past 30 minutes. You can adjust the time period to meet your business requirements.

      SELECT
             query_start AS "Start time",
             query_end AS "End time",
             query_duration_ms AS "Time consumed, in milliseconds",
             query_id AS "Query ID",
             query AS "SQL statement"
      FROM qmonitor.instance_slow_queries
      WHERE query_start >= now() - interval '30 min';
    • Query all slow queries that were executed on February 26, 2024. By default, slow query logs retain the data of the last seven days.

      SELECT
             query_start AS "Start time",
             query_end AS "End time",
             query_duration_ms AS "Time consumed, in milliseconds",
             query_id AS "Query ID",
             query AS "SQL statement"
      FROM qmonitor.instance_slow_queries
      WHERE query_start >= '2024-02-26 00:00:00'
      AND query_end <= '2024-02-27 00:00:00';
  • Query the top-N slow queries by instance-level resource consumption (CPU time, memory usage, or spilled file size) within a specified time range.

    • Query the top 20 slow queries with the highest CPU time for an instance in a specified time period, such as the past 30 minutes. You can adjust the time period to meet your business requirements.

      SELECT
             (cpu_time_ms/1000)::text || ' s' AS "CPU time",
             query_start AS "Start time",
             query_end AS "End time",
             query_duration_ms AS "Time consumed, in milliseconds",
             query_id AS "Query ID",
             query AS "SQL statement"
      FROM qmonitor.instance_slow_queries
      WHERE query_start >= now() - interval '30 min'
      ORDER BY cpu_time_ms DESC
      LIMIT 20;
    • Query the top 20 slow queries with the highest memory usage for an instance in a specified time period, such as the past 30 minutes. You can adjust the time period to meet your business requirements.

      SELECT
             pg_size_pretty(memory_bytes) AS "Memory consumed",
             query_start AS "Start time",
             query_end AS "End time",
             query_duration_ms AS "Time consumed, in milliseconds",
             query_id AS "Query ID",
             query AS "SQL statement"
      FROM qmonitor.instance_slow_queries
      WHERE query_start >= now() - interval '30 min'
      ORDER BY memory_bytes DESC
      LIMIT 20;
      Note

      Memory usage fluctuates while a query runs. The peak values that are collected therefore roughly reflect memory usage but are not exact values.

    • Query the top 10 slow queries with high CPU utilization in the period from 00:00 to 12:00 on February 26, 2024.

      SELECT
             (cpu_time_ms/1000)::text || ' s' AS "CPU time",
             query_start AS "Start time",
             query_end AS "End time",
             query_duration_ms AS "Time consumed, in milliseconds",
             query_id AS "Query ID",
             query AS "SQL statement"
      FROM qmonitor.instance_slow_queries
      WHERE query_start >= '2024-02-26 00:00:00'
      AND query_end <= '2024-02-26 12:00:00'
      ORDER BY cpu_time_ms DESC
      LIMIT 10;
  • Query the top-N slow queries by node-level resource consumption (CPU time, memory usage, or spilled file size) within a specified time range.

    • Query the top 20 slow queries with the highest CPU time for a node in a specified time period, such as the past 30 minutes. You can adjust the time period to meet your business requirements.

      SELECT
             (host_cpu_time_ms/1000)::text || ' s' AS "CPU time",
             query_start AS "Start time",
             query_end AS "End time",
             query_duration_ms AS "Time consumed, in milliseconds",
             query_id AS "Query ID",
             query AS "SQL statement"
      FROM qmonitor.host_slow_queries
      WHERE hostname = '<Node hostname>'
      AND query_start >= now() - interval '30 min'
      ORDER BY host_cpu_time_ms DESC
      LIMIT 20;
    • Query the top 20 slow queries with the highest memory usage for a node in a specified time period, such as the past 30 minutes. You can adjust the time period to meet your business requirements.

      SELECT
             pg_size_pretty(host_mem_bytes) AS "Memory consumed",
             query_start AS "Start time",
             query_end AS "End time",
             query_duration_ms AS "Time consumed, in milliseconds",
             query_id AS "Query ID",
             query AS "SQL statement"
      FROM qmonitor.host_slow_queries
      WHERE hostname = '<Node hostname>'
      AND query_start >= now() - interval '30 min'
      ORDER BY host_mem_bytes DESC
      LIMIT 20;
  • Query the specific information of a slow query log.

SELECT * FROM qmonitor.instance_slow_queries WHERE query_id = '<Query ID>';
  • Query the slow queries of each user within a specified time range.

    • Query the number of slow queries per user within a specified time range, such as the past 10 minutes. You can adjust the time range to meet your business requirements.

      SELECT user_name AS "User name",
      COUNT(1) AS "Number of queries"
      FROM qmonitor.instance_slow_queries
      WHERE query_start >= now() - interval '10 min'
      GROUP BY user_name
      ORDER BY COUNT(1) DESC;
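The node-level examples above filter by hostname. If you do not know the hostnames of the nodes in your instance, you can usually obtain them from the Greenplum-compatible gp_segment_configuration catalog table. The following statement is a sketch that assumes this catalog is accessible on your instance:

      SELECT hostname, content, role
      FROM gp_segment_configuration
      ORDER BY content;

You can then use a returned hostname in the hostname filter of the qmonitor.host_slow_queries view.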

Export slow query logs

AnalyticDB for PostgreSQL allows you to execute SELECT statements to export data from the qmonitor.instance_slow_queries or qmonitor.host_slow_queries view to custom internal tables or external tables such as Object Storage Service (OSS) buckets or MaxCompute tables. For more information, see Data Lake Analysis. To export data in slow query logs correctly and efficiently, take note of the following items:

In AnalyticDB for PostgreSQL, slow query logs are indexed and partitioned on the query_start column. If you export data by time range, include the query_start column in the query condition for better performance and lower resource consumption. For example, to export slow query logs generated from 13:00 to 16:00 on February 26, 2024, add the following query condition: query_start >= '2024-02-26 13:00:00' and query_start <= '2024-02-26 16:00:00'.
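For example, the following statement sketches how you might export a three-hour time range into a custom internal table. The table name slow_query_export_20240226 is illustrative:

      CREATE TABLE slow_query_export_20240226 AS
      SELECT *
      FROM qmonitor.instance_slow_queries
      WHERE query_start >= '2024-02-26 13:00:00'
      AND query_start <= '2024-02-26 16:00:00';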

Configuration items

You can modify the following configuration items to adjust the thresholds at which slow queries and their execution plans are collected.

Note

Only superusers can modify these configuration items.

  • slow_query_min_duration

    By default, the threshold specified by this configuration item is 1 second. You can modify it to also collect queries that consume less than 1 second. If the amount of time consumed by a query is greater than or equal to the value of this configuration item, the SQL statement, execution time, and other related information are collected. If this configuration item is set to -1, no slow queries are collected.

    Examples:

    1. To collect the slow queries that consume more than 5 seconds for a specified database, execute the following statement as a privileged user:

      ALTER DATABASE <Database name> SET slow_query_min_duration = '5s';
    2. To collect the slow queries that consume more than 5 seconds for the current session, execute the following statement as a regular user:

      SET slow_query_min_duration = '5s';
  • slow_query_plan_min_duration

    This configuration item specifies the execution time threshold at which the execution plan of a slow query is collected. By default, the system collects the execution plans of slow queries that consume 10 seconds or more. If the amount of time consumed by a slow query is greater than or equal to the value of this configuration item, its execution plan is collected. In most cases, you can execute the EXPLAIN statement to view execution plans in real time and do not need to collect this data. If this configuration item is set to -1, no execution plans of slow queries are collected.

    Examples:

    1. To collect the execution plans of slow queries that consume more than 10 seconds for a specified database, execute the following statement as a privileged user:

      ALTER DATABASE <Database name> SET slow_query_plan_min_duration = '10s';
    2. To collect the execution plans of slow queries that consume more than 10 seconds, execute the following statement as a regular user:

      SET slow_query_plan_min_duration = '10s';
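To check the current values of these configuration items in your session, you can execute standard SHOW statements:

      SHOW slow_query_min_duration;
      SHOW slow_query_plan_min_duration;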

Appendixes

  • Fields of the qmonitor.instance_slow_queries (instance level) view

    • query_id (text): The unique ID of the query.

    • session_id (integer): The ID of the query session.

    • db_name (character varying(128)): The name of the database that is queried.

    • user_name (character varying(128)): The name of the user who initiated the query.

    • application_name (character varying(128)): The type of the application that initiated the query.

    • client_hostname (character varying(128)): The hostname of the client that initiated the query.

    • client_addr (character varying(128)): The IP address of the client that initiated the query.

    • client_port (character varying(32)): The port number of the client that initiated the query.

    • rsg_name (character varying(128)): The name of the resource group to which the table accessed by the query belongs.

    • query_start (timestamptz): The start time of the query.

    • query_end (timestamptz): The end time of the query.

    • query_duration_ms (bigint): The total time consumed by the query, in milliseconds. The value is the sum of optimizer_duration_ms, lock_wait_time_ms, queue_wait_time_ms, and executor_duration_ms.

    • optimizer_duration_ms (bigint): The time consumed to generate an execution plan, in milliseconds. Complex SQL statements take longer to plan.

    • lock_wait_time_ms (bigint): The lock wait time, in milliseconds.

    • queue_wait_time_ms (bigint): The time spent waiting in a resource queue, in milliseconds.

    • executor_duration_ms (bigint): The time consumed to run the query on the execution engine, in milliseconds.

    • query (text): The text of the query.

    • is_plpgsql (boolean): Specifies whether the query is a PL/pgSQL stored procedure.

    • query_optimizer (character varying(16)): The optimizer used by the query. Valid values: ORCA and Planner.

    • access_tables (text): The names of the tables that are accessed by the query.

    • result_rows (bigint): The number of rows returned by the query. For an INSERT statement, the number of inserted rows is returned.

    • num_segments (integer): The number of compute nodes where the query runs.

    • num_slices (integer): The number of slices generated for the query execution plan.

    • cpu_time_ms (numeric): The total CPU time, in milliseconds. The value is the sum of the CPU time consumed on the coordinator node and the total CPU time consumed by tasks on all compute nodes.

    • mem_bytes (numeric): The cumulative peak memory usage across nodes. This value roughly reflects the memory usage of the query but is not exact.

    • spill_bytes (numeric): The cumulative size of the data that is spilled to disk on the compute nodes where the query runs. Data is spilled to disk when memory is insufficient during computing. This value roughly reflects the disk space consumed by the query but is not exact.

  • Fields of the qmonitor.host_slow_queries (node level) view

    • hostname (character varying(128)): The hostname of the node.

    • hostrole (text): The role of the node. Valid values: master (coordinator node) and segment (compute node).

    • query_id (text): The unique ID of the query.

    • db_name (character varying(128)): The name of the database that is queried.

    • user_name (character varying(128)): The name of the user who initiated the query.

    • query_start (timestamptz): The start time of the query.

    • query_end (timestamptz): The end time of the query.

    • query (text): The text of the query.

    • query_duration_ms (bigint): The time consumed by the query, in milliseconds.

    • optimizer_duration_ms (bigint): The time consumed to generate an execution plan, in milliseconds. Complex SQL statements take longer to plan.

    • host_cpu_time_ms (numeric): The CPU time of the query on the node, in milliseconds.

    • host_mem_bytes (numeric): The peak memory usage of the query on the node.

    • host_spill_bytes (numeric): The cumulative size of the data that is spilled to disk on the node.
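Because query_duration_ms is the sum of the optimizer, lock wait, queue wait, and executor components, you can use these fields to see where slow queries spend their time. The following statement is a sketch that ranks recent slow queries by their combined wait time:

      SELECT query_id,
             query_duration_ms,
             optimizer_duration_ms,
             lock_wait_time_ms,
             queue_wait_time_ms,
             executor_duration_ms
      FROM qmonitor.instance_slow_queries
      WHERE query_start >= now() - interval '30 min'
      ORDER BY lock_wait_time_ms + queue_wait_time_ms DESC
      LIMIT 10;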