If a Hologres instance experiences slow response or query speeds, you can view and analyze its slow query logs to quickly identify the cause. These logs help you diagnose and analyze slow or failed queries, allowing you to implement optimizations and improve overall system performance. This topic describes how to view and analyze slow query logs in Hologres.
Version guide
Hologres version | Feature | Description |
V0.10 | The feature of viewing and analyzing slow query logs is added. | You can view and analyze slow query logs in the hologres.hg_query_log system table. |
V2.2.7 | The default value of the log_min_duration_statement parameter is optimized. | Slow query logs record all data definition language (DDL) statements, all failed SQL statements, and SQL statements whose execution time is greater than the value of the log_min_duration_statement parameter. Starting from Hologres V2.2.7, the default value of this parameter is changed from 1s to 100 ms. |
Limits
The following limits apply when you query slow query logs in Hologres:
By default, slow query logs are retained for the last month. You can query data from any time period within this month.
To ensure system stability and prevent traffic overloads, a single query can return a maximum of 10,000 slow query logs. Some fields in slow query logs have length limits. For more information, see the field descriptions of the query_log table.
Slow query logs belong to the Hologres metadata warehouse system. A failure to query slow query logs does not affect business queries in the instance. Therefore, the stability of slow query logs is not covered by the Service-Level Agreement (SLA).
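For example, to stay within the 10,000-row limit when you explore the logs, filter on the indexed query_start column and cap the result set yourself. The following is a minimal sketch; the one-hour window and the LIMIT value are illustrative, not required settings.
-- Restrict the scan to a recent time window (query_start is indexed) and cap the output.
SELECT query_id, usename, status, duration, query_start
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '1 hour'
ORDER BY duration DESC
LIMIT 100;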
View the query_log table
The slow query logs of Hologres are stored in the hologres.hg_query_log system table.
After an instance is upgraded to V0.10, the system collects logs of slow Data Manipulation Language (DML) queries that take longer than 100 ms and all Data Definition Language (DDL) operations by default.
Starting from V3.0.2, the hologres.hg_query_log table supports aggregated records of DML and query operations that take less than 100 ms to execute.
The system aggregates records for successful Data Query Language (DQL) and DML queries that have fingerprints and take less than 100 ms to execute.
The main columns for aggregation are: server_addr, usename, datname, warehouse_id, application_name, and digest.
For each connection, data is summarized and reported once per minute.
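For example, the following is a minimal sketch of how you might inspect these aggregated records, assuming Hologres V3.0.2 or later. It lists fingerprints whose per-minute aggregated rows combined more than one query; in such rows, duration holds the average duration.
-- Aggregated rows record the number of combined queries in the calls field.
SELECT usename, datname, application_name, digest, calls, duration AS avg_duration_ms
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '1 hour'
  AND calls > 1
ORDER BY calls DESC
LIMIT 20;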
The following table describes the fields in the hologres.hg_query_log system table.
Field | Data type | Description for detailed records of DML and DQL queries that take more than 100 ms | Description for aggregated records of DML and DQL queries that take less than 100 ms |
usename | text | The username of the query. | The username of the query. |
status | text | The final status of the query: SUCCESS if the query succeeded, or FAILED if it failed. | The final status of the query is SUCCESS. Aggregated data includes only successful records. |
query_id | text | The unique ID of the query. A failed query always has a query_id. A successful query might not have one. | The query ID of the first query with the same primary key in an aggregation period. |
digest | text | The SQL fingerprint encoding. For SELECT, INSERT, DELETE, or UPDATE queries, the system calculates an MD5 hash value as the SQL fingerprint for each query. For more information about how to collect and calculate SQL fingerprints, see SQL fingerprint. Note In Hologres V2.2 and later, the digest column is added to this table to store SQL fingerprints of queries. If the version of your Hologres instance is V2.1 or earlier, contact the Hologres technical support to upgrade your instance. | The SQL fingerprint encoding. |
datname | text | The name of the database being queried. | The name of the database being queried. |
command_tag | text | The type of the query, for example, SELECT, INSERT, UPDATE, or DELETE, or the type of a DDL statement. | The type of the first query with the same primary key in an aggregation period. |
warehouse_id | integer | The ID of the compute group that the query uses. | The compute group ID of the first query with the same primary key in an aggregation period. |
warehouse_name | text | The name of the compute group that the query uses. | The compute group name of the first query with the same primary key in an aggregation period. |
warehouse_cluster_id | integer | This field is new in Hologres V3.0.2. It specifies the cluster ID in the compute group that the query uses. The cluster ID starts from 1 for each compute group. | The cluster ID of the first query with the same primary key in an aggregation period. |
duration | integer | The duration of the query, in milliseconds (ms). By default, the system displays only DML statements that take longer than 1s and all DDL statements. You can modify the collection threshold by using the Grand Unified Configuration (GUC) parameter log_min_duration_statement. | The average query duration of all queries with the same primary key in an aggregation period. |
message | text | The error message. | Empty. |
query_start | timestamptz | The start time of the query. | The start time of the first query with the same primary key in an aggregation period. |
query_date | text | The start date of the query. | The start date of the first query with the same primary key in an aggregation period. |
query | text | The text of the query. The maximum length of a query is 51,200 characters. Longer queries might be truncated. | The text of the first query with the same primary key in an aggregation period. |
result_rows | bigint | The number of rows returned. If it is an INSERT command, this is the number of inserted rows. | The average value of result_rows for all queries with the same primary key in an aggregation period. |
result_bytes | bigint | The number of bytes returned. | The average value of result_bytes for all queries with the same primary key in an aggregation period. |
read_rows | bigint | The number of data rows read. This is not an exact value. If a Bitmap index is used, it differs from the actual number of scanned rows. | The average value of read_rows for all queries with the same primary key in an aggregation period. |
read_bytes | bigint | The number of bytes read. | The average value of read_bytes for all queries with the same primary key in an aggregation period. |
affected_rows | bigint | The number of rows affected by the DML statement. | The average value of affected_rows for all queries with the same primary key in an aggregation period. |
affected_bytes | bigint | The number of bytes affected by the DML statement. | The average value of affected_bytes for all queries with the same primary key in an aggregation period. |
memory_bytes | bigint | The accumulated value of the peak memory usage on each node (not exact). This represents the accumulated value of the peak memory usage of the query on all running nodes. It roughly reflects the memory consumption of the query. | The average value of memory_bytes for all queries with the same primary key in an aggregation period. |
shuffle_bytes | bigint | The estimated number of bytes for data shuffle (not exact). It roughly reflects the amount of data transmitted over the network. | The average value of shuffle_bytes for all queries with the same primary key in an aggregation period. |
cpu_time_ms | bigint | The total CPU time in milliseconds (ms) (not exact). This reflects the CPU time consumed by all computing tasks. It is the sum of the computing time of multiple CPU cores and roughly reflects the complexity. | The average value of cpu_time_ms for all queries with the same primary key in an aggregation period. |
physical_reads | bigint | The number of physical reads. This reflects the number of times a record batch is read from the disk. It roughly reflects the number of memory cache misses. | The average value of physical_reads for all queries with the same primary key in an aggregation period. |
pid | integer | The process ID of the query service. | The process ID of the query service for the first query with the same primary key in an aggregation period. |
application_name | text | The type of the query application, for example, psql or a JDBC-based client. For other applications that connect to a Hologres instance, we recommend that you explicitly specify the application_name parameter in the connection configuration. | The type of the query application. |
engine_type | text[] | The engine used by the query, for example, HQE, PQE, or SDK. | The engine used by the first query with the same primary key in an aggregation period. |
client_addr | text | The source address of the query. This represents the egress IP address of the application, which is not necessarily the real application IP address. | The source address of the first query with the same primary key in an aggregation period. |
table_write | text | The table to which the SQL statement writes data. | The table to which the SQL statement of the first query with the same primary key in an aggregation period writes data. |
table_read | text[] | The table from which the SQL statement reads data. | The table from which the SQL statement of the first query with the same primary key in an aggregation period reads data. |
session_id | text | The session ID. | The session ID of the first query with the same primary key in an aggregation period. |
session_start | timestamptz | The time when the session started. This represents the time when the connection was established. | The session start time of all queries with the same primary key in an aggregation period. |
command_id | text | The command or statement ID. | The command or statement ID of all queries with the same primary key in an aggregation period. |
optimization_cost | integer | The duration for generating the query execution plan. A high value usually means the SQL statement is complex. | The duration for generating the query execution plan for all queries with the same primary key in an aggregation period. |
start_query_cost | integer | The query startup duration. A high value usually means the query is waiting for a lock or resources. | The query startup duration for all queries with the same primary key in an aggregation period. |
get_next_cost | integer | The query execution duration. A high value usually means the computation is large and the execution takes a long time. You can optimize the SQL statement as needed. | The query execution duration for all queries with the same primary key in an aggregation period. |
extended_cost | text | Other detailed costs of the query. This field records the time consumed by operations other than those covered by optimization_cost, start_query_cost, and get_next_cost. | Other detailed costs of the first query with the same primary key in an aggregation period. |
plan | text | The execution plan corresponding to the query. An execution plan can contain a maximum of 102,400 characters. Content that exceeds this limit may be truncated. You can modify the length limit by using a GUC parameter. | The execution plan of the first query with the same primary key in an aggregation period. |
statistics | text | The execution statistics corresponding to the query. The maximum length for statistics is 102,400 characters. Statistics that exceed this limit may be truncated. You can modify the length limit by using a GUC parameter. | The execution statistics of the first query with the same primary key in an aggregation period. |
visualization_info | text | The visualization information of the query plan. | The visualization information of the query plan for the first query with the same primary key in an aggregation period. |
query_detail | text | Other extended information about the query (stored in JSON format). Note The maximum length of the extended information is 10,240 characters. Longer information might be truncated. | Other extended information about the first query with the same primary key in an aggregation period. |
query_extinfo | text[] | Other extended information about the query (stored in ARRAY format), such as the AccessKey ID used for the query. Note: The AccessKey ID is not recorded for logons that use a local account, a service-linked role (SLR), or Security Token Service (STS). If you log on with a temporary account, only the temporary AccessKey ID is recorded. | Other extended information about the first query with the same primary key in an aggregation period. |
calls | INT | The value is 1 because detailed records are not aggregated. | The number of queries that have the same primary key in an aggregation period. |
agg_stats | JSONB | Empty. | Records the statistics, such as MIN, MAX, and AVG, of numeric fields for queries that have the same primary key in an aggregation period. The numeric fields include duration, memory_bytes, cpu_time_ms, physical_reads, optimization_cost, start_query_cost, and get_next_cost. |
extended_info | JSONB | Other extended information, such as Query Queue and Serverless Computing details. | Empty. |
The calls and agg_stats fields are new in Hologres V3.0.2.
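As an illustration of these two fields, the following sketch (assuming V3.0.2 or later) first sums calls per fingerprint to find the most frequently executed statements, and then returns the raw agg_stats JSONB for one fingerprint so that you can inspect the recorded MIN, MAX, and AVG statistics. The digest value is a placeholder; the internal key names of agg_stats are not listed here, so the sketch returns the whole object instead of guessing keys.
-- Most frequently executed fingerprints over the last day, based on the calls field.
SELECT digest, sum(calls) AS total_calls
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '1 day'
  AND digest IS NOT NULL
GROUP BY digest
ORDER BY total_calls DESC
LIMIT 10;
-- Inspect the raw aggregated statistics for one fingerprint (replace the placeholder digest).
SELECT query_start, calls, agg_stats
FROM hologres.hg_query_log
WHERE digest = 'md5xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
  AND query_start >= now() - interval '1 day'
ORDER BY query_start DESC
LIMIT 5;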
Grant permissions to view logs
You must have specific permissions to view slow query logs. The permission rules and authorization methods are as follows:
View the slow query logs of all databases in the instance.
Method 1: Grant the user superuser permissions. A superuser can view all slow query logs of all databases in the instance. Grant the permissions as follows:
ALTER USER "Alibaba Cloud account ID" SUPERUSER;--Replace "Alibaba Cloud account ID" with the actual username. If it is a RAM user, add "p4_" before the account ID.Add the user to the pg_read_all_stats user group.
In addition to superusers, users in the pg_read_all_stats user group can also view all slow query logs. If a regular user needs to view all logs, a superuser must add them to this user group. The authorization method is as follows:
GRANT pg_read_all_stats TO "Alibaba Cloud account ID"; -- Grant authorization by using the standard PostgreSQL authorization model
CALL spm_grant('pg_read_all_stats', 'Alibaba Cloud account ID'); -- Grant authorization by using the simple permission model (SPM)
CALL slpm_grant('pg_read_all_stats', 'Alibaba Cloud account ID'); -- Grant authorization by using the schema-level permission model (SLPM)
View the slow query logs of the current database.
Enable the simple permission model (SPM) or the schema-level permission model (SLPM) and add the user to the db_admin user group. The db_admin role can view the query logs of the current database.
CALL spm_grant('<db_name>_admin', 'Alibaba Cloud account ID'); -- Grant authorization by using the SPM
CALL slpm_grant('<db_name>.admin', 'Alibaba Cloud account ID'); -- Grant authorization by using the SLPM
A regular user can only query the slow query logs of queries that they executed in the database corresponding to the current account.
View slow queries in the HoloWeb visualization
You can view slow query logs visually in the HoloWeb console.
HoloWeb currently supports viewing historical slow query logs for up to seven days.
Only superuser accounts can view the logs. Regular users who have the required permissions must use SQL commands to view the logs.
Log on to the HoloWeb console. For more information, see Connect to HoloWeb and perform queries.
In the top navigation bar, click Diagnostics and Optimization.
In the navigation pane on the left, click Historical Slow Query.
At the top of the Historical Slow Query page, edit the query conditions.
For more information about the parameters, see Historical Slow Query.
Click Query. The query results are displayed in the Query Trend Analysis and Query List areas.
Query Trend Analysis
The Query Trend Analysis section visually displays recent trends for slow and failed queries. You can focus on monitoring periods with a high frequency of slow queries to better identify peak problem times and resolve issues.
Query List
The Query List area displays the details of slow and failed queries. For more information about the parameters in the Query List, see Query and analyze slow query logs. You can also click Customize Columns to select the columns to display in the Query List.
SQL fingerprint
Starting from Hologres V2.2, the system table hologres.hg_query_log, which stores slow query logs, includes a new digest column to display SQL fingerprints. For SELECT, INSERT, DELETE, and UPDATE queries, the system calculates an MD5 hash value as the SQL fingerprint of the query to facilitate further query classification and analysis.
The rules for collecting and calculating SQL fingerprints are as follows:
By default, fingerprints are collected only for SELECT, INSERT, DELETE, and UPDATE queries.
For INSERT statements that insert constant values into a table, the SQL fingerprint is not affected by the number of rows being inserted.
When calculating SQL fingerprints, the case sensitivity rules for SQL statements are the same as those used by Hologres during query execution.
When calculating SQL fingerprints, all whitespace characters in the query (such as spaces, line breaks, and tab characters) are ignored. Only the structural information within the SQL statement is considered.
When calculating SQL fingerprints, the specific constant values in the query do not affect the fingerprint.
For example, the queries SELECT * FROM t WHERE a > 1; and SELECT * FROM t WHERE a > 2; have the same SQL fingerprint.
For array constants in a query, the SQL fingerprint is not affected by the number of elements in the array.
For example, the SQL fingerprint of SELECT * FROM t WHERE a IN (1, 2); is the same as that of SELECT * FROM t WHERE a IN (3, 4, 5);.
When calculating SQL fingerprints, the database name is considered, and the schema of each table is automatically resolved. Tables in different queries are distinguished by both table name and schema.
For example, the queries SELECT * FROM t; and SELECT * FROM public.t; have the same SQL fingerprint only if the table t is in the public schema, so that both queries refer to the same table. A verification sketch follows this list.
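To see fingerprinting in action, you can run two queries that differ only in constants and confirm that they share a digest. The following is a minimal sketch that assumes a table t with an integer column a already exists; note that, depending on your log_min_duration_statement setting and instance version, fast queries may show up only as aggregated records.
-- Two queries that differ only in their constant values.
SELECT * FROM t WHERE a > 1;
SELECT * FROM t WHERE a > 2;
-- Expect a single digest row covering both statements.
SELECT digest, count(*) AS log_records
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '10 min'
  AND query LIKE 'SELECT * FROM t WHERE a >%'
GROUP BY digest;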
Diagnose a query
You can retrieve slow query logs by querying the hologres.hg_query_log table. The following section provides common SQL statements for diagnosing query logs. These include the following:
Query the most resource-intensive queries in the last 10 minutes
View the hourly query volume and total data read in the last 3 hours
View the data access volume over the past 3 hours compared to the same time yesterday
Query the queries with high costs at each stage in the last 10 minutes
Query the total number of queries in query_log (defaults to data from the last month).
SELECT count(*) FROM hologres.hg_query_log;
The following result indicates that a total of 44 queries with a duration greater than the specified threshold were executed in the last month.
count
-------
44
(1 row)
Query the number of slow queries for each user.
SELECT usename AS "User", count(1) as "Query Count" FROM hologres.hg_query_log GROUP BY usename order by count(1) desc;The execution result is as follows, where
count(1)indicates the number of slow queries.User | Query Count -----------------------+----- 1111111111111111 | 27 2222222222222222 | 11 3333333333333333 | 4 4444444444444444 | 2 (4 rows)Query the specific information of a slow query.
SELECT * FROM hologres.hg_query_log WHERE query_id = '13001450118416xxxx';
The execution result is as follows. For more information about the parameters, see View the query_log table.
usename | status | query_id | datname| command_tag | duration | message | query_start | query_date | query | result_rows | result_bytes | read_rows |read_bytes | affected_rows | affected_bytes | memory_bytes | shuffle_bytes | cpu_time_ms | physical_reads | pid | application_name | engine_type | client_addr | table_write | table_read | session_id | session_start | command_id | optimization_cost | start_query_cost | get_next_cost | extended_cost | plan | statistics | visualization_info | query_detail | query_extinfo
-----------------------+---------+--------------------+---------+-------------+----------+---------+------------------------+------------+---------------------------------------------------------------------------------------------------------+-------------+--------------+-----------+------------+---------------+----------------+--------------+---------------+-------------+----------------+---------+------------------+-------------+---------------+-------------+------------+-----------------+------------------------+------------+-------------------+------------------+---------------+---------------+------+------------+--------------------+--------------+---------------
p4_11111111111xxxx | SUCCESS | 13001450118416xxxx | dbname | SELECT | 149 | | 2021-03-30 23:45:01+08 | 20210330 | explain analyze SELECT * FROM tablename WHERE user_id = '20210330010xxxx' limit 1000; | 1000 | 417172 | 0 | 0 | -1 | -1 | 26731376 | 476603616 | 321626 | 0 | 1984913 | psql | {HQE} | 33.41.xxx.xxx | | | 6063475a.1e4991 | 2021-03-30 23:44:26+08 | 0 | 58 | 22 | 67 | | | | | |
(1 row)
Query the most resource-intensive queries in a recent time period, such as the last 10 minutes. You can also modify the time period as needed to query resource-intensive queries for a specific time period.
SELECT status AS "Status", duration AS "Duration (ms)", query_start AS "Start Time", (read_bytes/1048576)::text || ' MB' AS "Read Volume", (memory_bytes/1048576)::text || ' MB' AS "Memory", (shuffle_bytes/1048576)::text || ' MB' AS "Shuffle", (cpu_time_ms/1000)::text || ' s' AS "CPU Time", physical_reads as "Disk Reads", query_id as "QueryID", query::char(30) FROM hologres.hg_query_log WHERE query_start >= now() - interval '10 min' ORDER BY duration DESC, read_bytes DESC, shuffle_bytes DESC, memory_bytes DESC, cpu_time_ms DESC, physical_reads DESC LIMIT 100;The following result is returned:
Status |Duration (ms) | Start Time | Read Volume | Memory | Shuffle | CPU Time | Disk Reads | QueryID | query
---------+---------+------------------------+--------+-------+---------+---------+------+--------------------+--------------------------------
SUCCESS | 149 | 2021-03-30 23:45:01+08 | 0 MB | 25 MB | 454 MB | 321 s | 0 | 13001450118416xxxx | explain analyze SELECT * FROM
SUCCESS | 137 | 2021-03-30 23:49:18+08 | 247 MB | 21 MB | 213 MB | 803 s | 7771 | 13001491818416xxxx | explain analyze SELECT * FROM
FAILED | 53 | 2021-03-30 23:48:43+08 | 0 MB | 0 MB | 0 MB | 0 s | 0 | 13001484318416xxxx | SELECT ds::bigint / 0 FROM pub
(3 rows)
View the number of new queries from the previous day.
Count the total number of new queries from the previous day.
SELECT COUNT(1) FROM ( SELECT DISTINCT t1.digest FROM hologres.hg_query_log t1 WHERE t1.query_start >= CURRENT_DATE - INTERVAL '1 day' AND t1.query_start < CURRENT_DATE AND NOT EXISTS ( SELECT 1 FROM hologres.hg_query_log t2 WHERE t2.digest = t1.digest AND t2.query_start < CURRENT_DATE - INTERVAL '1 day') AND digest IS NOT NULL ) AS a;
The following result indicates that a total of 10 new queries were executed on the previous day.
count
-------
10
(1 row)
Count the number of new queries from the previous day, grouped by query type (command_tag).
SELECT a.command_tag, COUNT(1) FROM ( SELECT DISTINCT t1.digest, t1.command_tag FROM hologres.hg_query_log t1 WHERE t1.query_start >= CURRENT_DATE - INTERVAL '1 day' AND t1.query_start < CURRENT_DATE AND NOT EXISTS ( SELECT 1 FROM hologres.hg_query_log t2 WHERE t2.digest = t1.digest AND t2.query_start < CURRENT_DATE - INTERVAL '1 day') AND t1.digest IS NOT NULL) AS a GROUP BY 1 ORDER BY 2 DESC;
The following result indicates that eight INSERT queries and two SELECT queries were initiated on the previous day.
command_tag | count
------------+--------
INSERT | 8
SELECT | 2
(2 rows)
View the details of new queries from the previous day.
SELECT a.usename, a.status, a.query_id, a.digest, a.datname, a.command_tag, a.query, a.cpu_time_ms, a.memory_bytes FROM ( SELECT DISTINCT t1.usename, t1.status, t1.query_id, t1.digest, t1.datname, t1.command_tag, t1.query, t1.cpu_time_ms, t1.memory_bytes FROM hologres.hg_query_log t1 WHERE t1.query_start >= CURRENT_DATE - INTERVAL '1 day' AND t1.query_start < CURRENT_DATE AND NOT EXISTS ( SELECT 1 FROM hologres.hg_query_log t2 WHERE t2.digest = t1.digest AND t2.query_start < CURRENT_DATE - INTERVAL '1 day' ) AND t1.digest IS NOT NULL ) AS a;
The following result is returned:
usename |status |query_id |digest |datname|command_tag |query |cpu_time_ms |memory_bytes
-----------------+--------+--------------------+------------------------------------+-------+-------------+-----------------------------------+--------------+--------------
111111111111xxxx |SUCCESS |100100425827776xxxx |md58cf93d91c36c6bc9998add971458ba1a |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 1748| 898808596
111111111111xxxx |SUCCESS |100100425827965xxxx |md5f7e87e2c9e0b3d9eddcd6b3bc7f04b3b |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 59891| 6819529886
111111111111xxxx |SUCCESS |100100425829654xxxx |md55612dc09d2d81074fd5deed1aa3eca9b |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 3| 2100039
111111111111xxxx |SUCCESS |100100425829664xxxx |md58d3bf67fbdf2247559bc916586d40011 |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 10729| 2052861937
111111111111xxxx |SUCCESS |100100425830099xxxx |md503bd45d6b2d7701c2617d079b4d55a10 |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 2196| 897948034
111111111111xxxx |SUCCESS |100100425830186xxxx |md5c62169eaf3ea3a0c59bdc834a8141ac4 |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 5268| 1734305972
111111111111xxxx |SUCCESS |100100425830448xxxx |md59aa0c73b24c9c9eba0b34c8fdfc23bb0 |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 2| 2098402
111111111111xxxx |SUCCESS |100100425830459xxxx |md57d22c1d37b68984e9472f11a4c9fd04e |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 113| 76201984
111111111111xxxx |SUCCESS |100100525468694xxxx |md5ac7d6556fae123e9ea9527d8f1c94b1c |dbname |SELECT |SELECT * FROM xxx limit 200 | 6| 1048576
111111111111xxxx |SUCCESS |100101025538840xxxx |md547d09cdad4d5b5da74abaf08cba79ca0 |dbname |SELECT |SELECT * FROM xxx limit 200 |\N |\N
(10 rows)
View the trend of new queries from the previous day, displayed by hour.
SELECT to_char(a.query_start, 'HH24') AS query_start_hour, a.command_tag, COUNT(1) FROM ( SELECT DISTINCT t1.query_start, t1.digest, t1.command_tag FROM hologres.hg_query_log t1 WHERE t1.query_start >= CURRENT_DATE - INTERVAL '1 day' AND t1.query_start < CURRENT_DATE AND NOT EXISTS ( SELECT 1 FROM hologres.hg_query_log t2 WHERE t2.digest = t1.digest AND t2.query_start < CURRENT_DATE - INTERVAL '1 day' ) AND t1.digest IS NOT NULL ) AS a GROUP BY 1, 2 ORDER BY 3 DESC;
The following result indicates that on the previous day, 1 new SELECT query was executed at 11:00, 1 new SELECT query at 13:00, and 8 new INSERT queries at 21:00.
query_start_hour |command_tag |count
------------------+-------------+-----
21 |INSERT | 8
11 |SELECT | 1
13 |SELECT | 1
(3 rows)
Categorize slow queries by frequency.
SELECT digest, command_tag, count(1) FROM hologres.hg_query_log WHERE query_start >= CURRENT_DATE - INTERVAL '1 day' AND query_start < CURRENT_DATE GROUP BY 1,2 ORDER BY 3 DESC;
Query the top 10 queries with the highest average CPU time in the last day.
SELECT digest, avg(cpu_time_ms) FROM hologres.hg_query_log WHERE query_start >= CURRENT_DATE - INTERVAL '1 day' AND query_start < CURRENT_DATE AND digest IS NOT NULL AND usename != 'system' AND cpu_time_ms IS NOT NULL GROUP BY 1 ORDER BY 2 DESC LIMIT 10;
View the top 10 queries with the highest average memory consumption (memory_bytes) in the last week.
SELECT digest, avg(memory_bytes) FROM hologres.hg_query_log WHERE query_start >= CURRENT_DATE - INTERVAL '7 day' AND query_start < CURRENT_DATE AND digest IS NOT NULL AND memory_bytes IS NOT NULL GROUP BY 1 ORDER BY 2 DESC LIMIT 10;
View the hourly query volume and total data read in the last 3 hours to compare and determine whether there are changes in data volume each hour.
SELECT date_trunc('hour', query_start) AS query_start, count(1) AS query_count, sum(read_bytes) AS read_bytes, sum(cpu_time_ms) AS cpu_time_ms FROM hologres.hg_query_log WHERE query_start >= now() - interval '3 h' GROUP BY 1;
View the data access volume over the past 3 hours compared to the same time yesterday to determine whether there is a change in data access volume.
SELECT query_date, count(1) AS query_count, sum(read_bytes) AS read_bytes, sum(cpu_time_ms) AS cpu_time_ms FROM hologres.hg_query_log WHERE query_start >= now() - interval '3 h' GROUP BY query_date UNION ALL SELECT query_date, count(1) AS query_count, sum(read_bytes) AS read_bytes, sum(cpu_time_ms) AS cpu_time_ms FROM hologres.hg_query_log WHERE query_start >= now() - interval '1d 3h' AND query_start <= now() - interval '1d' GROUP BY query_date;
Query the queries with high costs at each stage in a recent time period, such as the last 10 minutes. You can also modify the time period as needed to query queries with high costs at each stage for a specific time period.
SELECT status AS "Status", duration AS "Duration (ms)", optimization_cost AS "Optimization Cost (ms)", start_query_cost AS "Startup Cost (ms)", get_next_cost AS "Execution Cost (ms)", duration - optimization_cost - start_query_cost - get_next_cost AS "Other Costs (ms)", query_id AS "QueryID", query::char(30) FROM hologres.hg_query_log WHERE query_start >= now() - interval '10 min' ORDER BY duration DESC, start_query_cost DESC, optimization_cost, get_next_cost DESC, duration - optimization_cost - start_query_cost - get_next_cost DESC LIMIT 100;The following result is returned:
Status | Duration (ms) | Optimization Cost (ms) | Startup Cost (ms) | Execution Cost (ms) | Other Costs (ms) | QueryID | query
---------+----------+--------------+--------------+--------------+--------------+--------------------+--------------------------------
SUCCESS | 4572 | 521 | 320 | 3726 | 5 | 6000260625679xxxx | -- /* user: wang ip: xxx.xx.x
SUCCESS | 1490 | 538 | 98 | 846 | 8 | 12000250867886xxxx | -- /* user: lisa ip: xxx.xx.x
SUCCESS | 1230 | 502 | 95 | 625 | 8 | 26000512070295xxxx | -- /* user: zhang ip: xxx.xx.
(3 rows)
Query the first failed query.
SELECT status AS "Status", regexp_replace(message, '\n', ' ')::char(150) AS "Error Message", duration AS "Duration (ms)", query_start AS "Start Time", query_id AS "QueryID", query::char(100) AS "Query" FROM hologres.hg_query_log WHERE query_start BETWEEN '2021-03-25 17:00:00'::timestamptz AND '2021-03-25 17:42:00'::timestamptz + interval '2 min' AND status = 'FAILED' ORDER BY query_start ASC LIMIT 100;The following result is returned:
Status | Error Message | Duration (ms) | Start Time | QueryID | Query
--------+--------------------------------------------------------------------------------------------------------------------------------------------------------+-------+------------------------+-------------------+-------
FAILED | Query:[1070285448673xxxx] code: kActorInvokeError msg: "[holo_query_executor.cc:330 operator()] HGERR_code XX000 HGERR_msge internal error: status { c | 1460 | 2021-03-25 17:28:54+08 | 1070285448673xxxx | S...
FAILED | Query:[1016285560553xxxx] code: kActorInvokeError msg: "[holo_query_executor.cc:330 operator()] HGERR_code XX000 HGERR_msge internal error: status { c | 131 | 2021-03-25 17:28:55+08 | 1016285560553xxxx | S...
(2 rows)
Modify the lifecycle of slow query logs
Starting from Hologres V3.0.27, you can modify the lifecycle of slow query logs. Execute the following SQL statement to apply the change at the database level:
ALTER DATABASE <db_name> SET hg_query_log_retention_time_sec = 2592000;
The parameter is in seconds. You can set the lifecycle to a value from 3 to 30 days, which corresponds to a range of 259200 to 2592000 seconds.
The modified lifecycle applies only to new logs. Existing logs retain their original lifecycle.
The modified lifecycle takes effect only for new connections.
After you set the lifecycle by using this parameter, expired logs are cleaned up asynchronously rather than immediately.
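For example, a minimal sketch that shortens the retention period to the 3-day minimum for a hypothetical database named mydb:
-- 3 days x 86400 seconds/day = 259200 seconds; valid values range from 259200 to 2592000.
ALTER DATABASE mydb SET hg_query_log_retention_time_sec = 259200;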
Export slow query logs
Hologres supports using the INSERT statement to export data from the slow query log (hg_query_log) to a custom internal table or a foreign table such as a MaxCompute or OSS table.
Notes
To export data from slow query logs correctly and efficiently, note the following:
The account that is used to execute the INSERT INTO ... SELECT ... FROM hologres.hg_query_log; statement must have access permissions on the hologres.hg_query_log table. For more information, see Grant permissions to view logs. To export all slow query logs of an instance, the account that executes the INSERT command must have the superuser or pg_read_all_stats permission. Otherwise, the data retrieved from the hologres.hg_query_log table will be incomplete.
query_start is an index column of the slow query log table. When you export data based on a time range, use the query_start column in the WHERE clause to specify the time range. This improves performance and reduces resource consumption.
When you filter on query_start, do not apply a nested expression to query_start, because doing so prevents the index from being used. For example, instead of WHERE to_char(query_start, 'yyyymmdd') = '20220101';, write WHERE query_start >= '2022-01-01' AND query_start < '2022-01-02';.
Example 1: Export to a Hologres internal table.
Execute the following command in Hologres to export slow query logs to the query_log_download internal table.
--Hologres SQL
CREATE TABLE query_log_download (
    usename text, status text, query_id text, datname text, command_tag text,
    duration integer, message text, query_start timestamp with time zone, query_date text, query text,
    result_rows bigint, result_bytes bigint, read_rows bigint, read_bytes bigint,
    affected_rows bigint, affected_bytes bigint, memory_bytes bigint, shuffle_bytes bigint,
    cpu_time_ms bigint, physical_reads bigint, pid integer, application_name text,
    engine_type text[], client_addr text, table_write text, table_read text[],
    session_id text, session_start timestamp with time zone, trans_id text, command_id text,
    optimization_cost integer, start_query_cost integer, get_next_cost integer, extended_cost text,
    plan text, statistics text, visualization_info text, query_detail text, query_extinfo text[]
);
INSERT INTO query_log_download
SELECT usename, status, query_id, datname, command_tag, duration, message, query_start, query_date, query,
       result_rows, result_bytes, read_rows, read_bytes, affected_rows, affected_bytes, memory_bytes, shuffle_bytes,
       cpu_time_ms, physical_reads, pid, application_name, engine_type, client_addr, table_write, table_read,
       session_id, session_start, trans_id, command_id, optimization_cost, start_query_cost, get_next_cost,
       extended_cost, plan, statistics, visualization_info, query_detail, query_extinfo
FROM hologres.hg_query_log
WHERE query_start >= '2022-08-03' AND query_start < '2022-08-04';
Example 2: Export to a MaxCompute foreign table.
In MaxCompute, use the following command to create the destination table.
CREATE TABLE IF NOT EXISTS mc_holo_query_log (
    username STRING COMMENT 'The username of the query'
    ,status STRING COMMENT 'The final status of the query, success or failure'
    ,query_id STRING COMMENT 'The query ID'
    ,datname STRING COMMENT 'The name of the queried database'
    ,command_tag STRING COMMENT 'The type of the query'
    ,duration BIGINT COMMENT 'The duration of the query (ms)'
    ,message STRING COMMENT 'The error message'
    ,query STRING COMMENT 'The text of the query'
    ,read_rows BIGINT COMMENT 'The number of rows read by the query'
    ,read_bytes BIGINT COMMENT 'The number of bytes read by the query'
    ,memory_bytes BIGINT COMMENT 'The peak memory consumption of a single node (not exact)'
    ,shuffle_bytes BIGINT COMMENT 'The estimated number of bytes for data shuffle (not exact)'
    ,cpu_time_ms BIGINT COMMENT 'The total CPU time (milliseconds, not exact)'
    ,physical_reads BIGINT COMMENT 'The number of physical reads'
    ,application_name STRING COMMENT 'The type of the query application'
    ,engine_type ARRAY<STRING> COMMENT 'The engine used by the query'
    ,table_write STRING COMMENT 'The table modified by the SQL statement'
    ,table_read ARRAY<STRING> COMMENT 'The table read by the SQL statement'
    ,plan STRING COMMENT 'The plan corresponding to the query'
    ,optimization_cost BIGINT COMMENT 'The duration for generating the query execution plan'
    ,start_query_cost BIGINT COMMENT 'The query startup duration'
    ,get_next_cost BIGINT COMMENT 'The query execution duration'
    ,extended_cost STRING COMMENT 'Other detailed costs of the query'
    ,query_detail STRING COMMENT 'Other extended information about the query (stored in JSON format)'
    ,query_extinfo ARRAY<STRING> COMMENT 'Other extended information about the query (stored in ARRAY format)'
    ,query_start STRING COMMENT 'The start time of the query'
    ,query_date STRING COMMENT 'The start date of the query'
)
COMMENT 'hologres instance query log daily'
PARTITIONED BY (ds STRING COMMENT 'stat date')
LIFECYCLE 365;
ALTER TABLE mc_holo_query_log ADD PARTITION (ds=20220803);
In Hologres, use the following command to export slow query logs to the MaxCompute foreign table.
IMPORT FOREIGN SCHEMA project_name LIMIT TO (mc_holo_query_log) FROM SERVER odps_server INTO public;
INSERT INTO mc_holo_query_log
SELECT usename AS username, status, query_id, datname, command_tag, duration, message, query,
       read_rows, read_bytes, memory_bytes, shuffle_bytes, cpu_time_ms, physical_reads,
       application_name, engine_type, table_write, table_read, plan,
       optimization_cost, start_query_cost, get_next_cost, extended_cost,
       query_detail, query_extinfo, query_start, query_date, '20220803'
FROM hologres.hg_query_log
WHERE query_start >= '2022-08-03' AND query_start < '2022-08-04';
Configuration items
Hologres provides the following configuration items to selectively record queries:
log_min_duration_statement
Description:
This configuration item sets the minimum runtime for a query to be logged as a slow query. By default, the system logs queries that run for longer than 100 ms. However, when you query the logs, only queries that ran for longer than 1 s are displayed by default. Use this configuration item to change the logging threshold. Note:
Changes to this configuration item affect only new queries. Queries that are already logged are displayed based on the previous setting.
Only a superuser can change this setting.
Set the value to -1 to disable logging for all queries. For positive values, the minimum supported value is 100 ms.
Example:
Log all SQL statements that run for 250 ms or longer:
-- Set at the DB level. Requires a superuser.
ALTER DATABASE dbname SET log_min_duration_statement = '250ms';
-- Set at the current session level. A regular user can run this command.
SET log_min_duration_statement = '250ms';
log_min_duration_query_stats
Description:
This configuration item sets the minimum execution time for a query's execution statistics to be recorded. By default, statistics are recorded for queries that run longer than 10s. Note:
This change affects only new queries. Statistics for existing logs are not affected.
A value of -1 disables the recording of query statistics.
This information requires a large amount of storage. Recording statistics for too many queries can slow down the querying and analysis of slow query logs. Therefore, although you can set the value to less than 10s for troubleshooting, we recommend that you do not use a smaller value for routine operations.
Examples:
To record execution statistics for queries that run longer than 20s:
-- Set at the DB level. Requires a superuser.
ALTER DATABASE dbname SET log_min_duration_query_stats = '20s';
-- Set at the current session level. A regular user can run this command.
SET log_min_duration_query_stats = '20s';
log_min_duration_query_plan
Description:
This configuration item sets the minimum execution time for a query's execution plan to be recorded. By default, execution plans are recorded for queries that run for 10s or longer. Note:
Changes to this configuration item affect only new queries. For previously recorded queries, the original default value is displayed.
If a statement's running time is greater than or equal to the specified value in milliseconds, its execution plan is recorded in the slow query log.
An execution plan can be obtained in real time by using the EXPLAIN statement, so it is not always necessary to record it in the logs.
Setting the value to -1 disables the recording of query execution plans.
Examples:
To record the execution plan for queries that run longer than 10s:
-- Set at the DB level. Requires a superuser.
ALTER DATABASE dbname SET log_min_duration_query_plan = '10s';
-- Set at the current session level. A regular user can run this command.
SET log_min_duration_query_plan = '10s';
FAQ
Problem:
In Hologres V1.1, when viewing slow query logs, information such as the number of scanned rows and returned rows is not displayed.
Possible cause:
The collection of slow query logs is incomplete.
Solution:
In Hologres versions from V1.1.36 to V1.1.49, you can use the following GUC parameter to display the complete information. In Hologres V1.1.49 and later, this information is displayed by default.
Note: If your Hologres instance version is earlier than V1.1.36, see Troubleshoot failed upgrade preparations or join the Hologres DingTalk group for technical support. For more information about how to obtain support, see Obtain online support for Hologres.
-- (Recommended) Set at the database level. You need to set this only once for each database.
ALTER DATABASE <db_name> SET hg_experimental_force_sync_collect_execution_statistics = ON;
-- Set at the session level
SET hg_experimental_force_sync_collect_execution_statistics = ON;
db_name is the name of the database.
References
For more information about how to diagnose and manage queries in an instance, see Manage queries.