This topic describes how to use the off-peak garbage collection feature of a PolarDB for PostgreSQL cluster. This topic also provides optimization examples for the feature.
Prerequisites
The PolarDB for PostgreSQL cluster runs one of the following engines:
PostgreSQL 16 (revision version 2.0.16.8.3.0 or later)
PostgreSQL 15 (revision version 2.0.15.12.4.0 or later)
PostgreSQL 14 (revision version 2.0.14.12.24.0 or later)
PostgreSQL 11 (revision version 2.0.11.15.42.0 or later)
You can view the revision version in the console or query it by executing the SHOW polardb_version; statement. If you need to upgrade the revision version, see Version management.
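As a quick sanity check, the revision string returned by SHOW polardb_version; can be compared segment by segment against the minimum for your engine. The following Python sketch is only an illustration, not an official tool; the MINIMUMS mapping simply restates the prerequisites above.

```python
def parse_version(v: str):
    """Split a PolarDB revision version such as '2.0.14.12.24.0' into integers."""
    return tuple(int(part) for part in v.split("."))

def meets_minimum(current: str, minimum: str) -> bool:
    """True if the current revision version is at or above the required minimum.
    Tuple comparison in Python compares segment by segment, left to right."""
    return parse_version(current) >= parse_version(minimum)

# Minimum revision versions per engine, taken from the prerequisites above.
MINIMUMS = {
    16: "2.0.16.8.3.0",
    15: "2.0.15.12.4.0",
    14: "2.0.14.12.24.0",
    11: "2.0.11.15.42.0",
}

print(meets_minimum("2.0.14.13.0.0", MINIMUMS[14]))  # True
print(meets_minimum("2.0.14.12.23.0", MINIMUMS[14]))  # False
```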
Background information
Like native PostgreSQL, PolarDB for PostgreSQL starts an automatic cleanup process in the background to perform garbage collection. The automatic cleanup is started after the autovacuum parameter is set to true and provides the following benefits:
1. Old data is reclaimed to reduce disk space usage.
2. Statistics are updated to ensure that the query optimizer selects the optimal execution plan.
3. Transaction ID wraparound is prevented, which effectively reduces cluster failures.
For more information about automatic cleanup, see Section 25.1.6 (the autovacuum daemon) in the PostgreSQL documentation.
Garbage collection operations consume a large amount of hardware resources. To prevent automatic cleanups from running too frequently, native PostgreSQL defines trigger conditions, and an automatic cleanup is started only when these conditions are met. For more information, see Section 20.10 (automatic vacuuming) in the PostgreSQL documentation. The conditions are related to the number of changed rows and the database age. During peak hours, a large volume of data is changed and transaction IDs are consumed quickly. Therefore, automatic cleanups are triggered frequently, which causes the following issues:
High resource usage: During peak hours, automatic cleanups are frequently performed, occupying a large number of CPU and I/O resources, and competing for hardware resources with business reads/writes. This affects the read/write performance of your databases. In the example shown in the following figure, the CPU usage and I/O throughput of an automatic cleanup during the peak daytime business hours rank first among all processes.


Read/write requests blocked by locked tables: Automatic cleanups hold exclusive locks and block business requests on a table for a short period of time when reclaiming empty pages. The temporary blocking compromises the performance of your databases during peak hours.
Plan cache invalidation: Automatic cleanups collect statistics and invalidate the existing plan cache. As a result, execution plans must be regenerated for new queries. During peak hours, these plans may be generated by many connections in parallel, which increases the response time of business requests on those connections.
Note: The global plan cache (GPC) feature provided by PolarDB for PostgreSQL can minimize the impact of plan cache invalidation on your business. For more information about the feature, see Global plan cache.
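The row-change trigger condition mentioned above can be sketched numerically. In native PostgreSQL, a table is vacuumed once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples (defaults 50 and 0.2). The following Python sketch illustrates why busy tables are vacuumed repeatedly during peak hours:

```python
def autovacuum_triggered(dead_tuples: int, reltuples: int,
                         base_threshold: int = 50,
                         scale_factor: float = 0.2) -> bool:
    """Native PostgreSQL row-change trigger: an automatic vacuum starts once
    dead tuples exceed autovacuum_vacuum_threshold +
    autovacuum_vacuum_scale_factor * reltuples (defaults shown)."""
    return dead_tuples > base_threshold + scale_factor * reltuples

# A 1,000,000-row table is vacuumed only after about 200,050 dead tuples
# accumulate; heavy daytime write traffic can cross that line several times a day.
print(autovacuum_triggered(150_000, 1_000_000))  # False
print(autovacuum_triggered(250_000, 1_000_000))  # True
```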
The reason for the preceding issues is that the native PostgreSQL does not have the concept of off-peak hours. However, business in real scenarios usually has peak hours and off-peak hours. PolarDB for PostgreSQL allows you to specify off-peak hours. This way, you can use idle hardware resources during off-peak hours to fully reclaim garbage, which reduces the frequency of automatic cleanup during peak hours. This reserves more hardware resources for read/write requests and improves read/write performance.
Benefits
Garbage collection operations during off-peak hours are expected to resolve the preceding issues. In addition, cleanup policies configured during off-peak hours work better than automatic cleanup operations in the native PostgreSQL, and provide the following benefits:
Optimized resource utilization: Garbage is collected during off-peak hours, which leads to a significant decrease in the frequency and resource utilization of automatic cleanups during peak hours.
Optimized database age: More transaction IDs are reclaimed during off-peak hours, which prevents database failures caused by transaction ID wraparound.
Optimized statistics and slow queries: More statistics are collected, which helps the optimizer select more accurate query plans and reduces slow queries caused by expired statistics.
Optimized locked tables: Locked tables caused by automatic cleanup operations during peak hours block read/write operations. Scheduling these operations during off-peak hours can reduce the probability of this issue.
Optimized plan cache invalidation: Automatic cleanup operations during peak hours invalidate the plan cache. Performing these operations during off-peak hours minimizes the likelihood of this issue.
Usage notes
You can use the following methods to specify the off-peak hours for a PolarDB for PostgreSQL cluster that runs a minor version described in the Prerequisites section. If the cluster runs an earlier minor version, upgrade the cluster to the latest minor version in the PolarDB console. For more information, see Minor version update.
To specify the off-peak hours without updating the minor version of the cluster, contact us so that we can make the necessary backend configurations. Provide the start time, end time, and time zone of the off-peak hours. This method has limitations and may fail due to O&M operations such as primary/secondary switchover, configuration changes, and zone switching. Therefore, this method is only a temporary solution. If you want new settings to permanently take effect, upgrade the cluster to the latest minor version.
Specify the off-peak hours
Create an extension.
Create the polar_advisor extension in the postgres database and in all databases on which garbage collection operations need to be performed.

CREATE EXTENSION IF NOT EXISTS polar_advisor;

For PolarDB clusters that already have the polar_advisor extension created, you can execute the following statement to update the extension:

ALTER EXTENSION polar_advisor UPDATE;

Specify the off-peak hours.
Execute the following statement to specify the off-peak hours:
-- Execute the following statement in the postgres database.
SELECT polar_advisor.set_advisor_window(start_time, end_time);

start_time: the start time of the off-peak hours.
end_time: the end time of the off-peak hours.
By default, the off-peak hours that you specify take effect on the current day. The system then automatically performs garbage collection operations during the off-peak hours every day.
Note: Only the off-peak hours that you specify in the postgres database take effect.
The time zone offset of the off-peak hours must be consistent with the time zone configuration of the PolarDB cluster. Otherwise, the off-peak hours do not take effect.
You can execute the following statement to specify the period of time from 23:00 one day to 2:00 the next day (UTC+8) as the off-peak hours. Garbage collection operations are performed on the cluster during the off-peak hours.
SELECT polar_advisor.set_advisor_window(start_time => '23:00:00+08', end_time => '02:00:00+08');

View the off-peak hours.
Execute the following statements to view the details of the specified off-peak hours:
-- Execute the following statements in the postgres database.
-- View the details of the specified off-peak hours.
SELECT * FROM polar_advisor.get_advisor_window();
-- View the duration of the specified off-peak hours. Unit: seconds.
SELECT polar_advisor.get_advisor_window_length();
-- Check whether the current time falls within the specified off-peak hours.
SELECT now(), * FROM polar_advisor.is_in_advisor_window();
-- View the length of time from the current time to the start time of the next off-peak hours. Unit: seconds.
SELECT * FROM polar_advisor.get_secs_to_advisor_window_start();
-- View the length of time from the current time to the end time of the next off-peak hours. Unit: seconds.
SELECT * FROM polar_advisor.get_secs_to_advisor_window_end();

The following example shows the execution of the preceding statements:
-- View the details of the specified off-peak hours.
postgres=# SELECT * FROM polar_advisor.get_advisor_window();
 start_time  |  end_time   | enabled | last_error_time | last_error_detail | others
-------------+-------------+---------+-----------------+-------------------+--------
 23:00:00+08 | 02:00:00+08 | t       |                 |                   |
(1 row)

-- View the duration of the specified off-peak hours.
postgres=# SELECT polar_advisor.get_advisor_window_length() / 3600.0 AS "Window duration/h";
 Window duration/h
--------------------
 3.0000000000000000
(1 row)

-- Check whether the current time falls within the specified off-peak hours.
postgres=# SELECT now(), * FROM polar_advisor.is_in_advisor_window();
              now              | is_in_advisor_window
-------------------------------+----------------------
 2024-04-01 07:40:37.733911+00 | f
(1 row)

-- View the length of time from the current time to the start time of the next off-peak hours.
postgres=# SELECT * FROM polar_advisor.get_secs_to_advisor_window_start();
 secs_to_window_start |      time_now      | window_start | window_end
----------------------+--------------------+--------------+-------------
         26362.265179 | 07:40:37.734821+00 | 15:00:00+00  | 18:00:00+00
(1 row)

-- View the length of time from the current time to the end time of the next off-peak hours.
postgres=# SELECT * FROM polar_advisor.get_secs_to_advisor_window_end();
 secs_to_window_end |      time_now      | window_start | window_end
--------------------+--------------------+--------------+-------------
       36561.870337 | 07:40:38.129663+00 | 15:00:00+00  | 18:00:00+00
(1 row)

Disable or enable the specified off-peak hours.
The off-peak hours take effect by default, and garbage collection operations are performed during the specified time period. If you do not want garbage collection to run during the off-peak hours of a particular day, for example, because you plan to manually perform other O&M operations and want to avoid conflicts, execute the following statement to disable the specified off-peak hours. After the O&M operations are complete, call the corresponding function to re-enable the window.
-- Execute the following statements in the postgres database.
-- Disable the specified off-peak hours.
SELECT polar_advisor.disable_advisor_window();
-- Enable the specified off-peak hours.
SELECT polar_advisor.enable_advisor_window();
-- Check whether the specified off-peak hours are enabled.
SELECT polar_advisor.is_advisor_window_enabled();

The following example shows the execution of the preceding statements:
-- The specified off-peak hours are enabled.
postgres=# SELECT polar_advisor.is_advisor_window_enabled();
 is_advisor_window_enabled
---------------------------
 t
(1 row)

-- Disable the specified off-peak hours.
postgres=# SELECT polar_advisor.disable_advisor_window();
 disable_advisor_window
------------------------

(1 row)

-- The specified off-peak hours are disabled.
postgres=# SELECT polar_advisor.is_advisor_window_enabled();
 is_advisor_window_enabled
---------------------------
 f
(1 row)

-- Re-enable the specified off-peak hours.
postgres=# SELECT polar_advisor.enable_advisor_window();
 enable_advisor_window
-----------------------

(1 row)

-- The specified off-peak hours are re-enabled.
postgres=# SELECT polar_advisor.is_advisor_window_enabled();
 is_advisor_window_enabled
---------------------------
 t
(1 row)
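An off-peak window such as 23:00 to 02:00 crosses midnight, which is why is_in_advisor_window() can return f in the middle of the day even though the window's start time is numerically later than its end time. The membership test can be sketched in Python as follows; this is an illustration of the wrap-around logic, not the extension's actual implementation:

```python
from datetime import time

def in_window(now: time, start: time, end: time) -> bool:
    """Check whether `now` falls in a daily window that may cross midnight,
    e.g. start=23:00, end=02:00. Illustrative only."""
    if start <= end:
        # Ordinary window within a single day, e.g. 01:00 -> 05:00.
        return start <= now < end
    # Window wraps past midnight: match late evening OR early morning.
    return now >= start or now < end

window = (time(23, 0), time(2, 0))
print(in_window(time(23, 30), *window))  # True  (late evening)
print(in_window(time(1, 15), *window))   # True  (early morning, after midnight)
print(in_window(time(12, 0), *window))   # False (midday)
```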
Other settings
Configure a blacklist.
If you have specified off-peak hours, the databases automatically select the tables on which garbage collection operations are performed during the off-peak hours, and any table may be selected. If you do not want a table to be selected for garbage collection, execute the following statements to add it to the blacklist:
-- Execute the following statements in a business database.
-- Add the table to the VACUUM & ANALYZE blacklist.
SELECT polar_advisor.add_relation_to_vacuum_analyze_blacklist(schema_name, relation_name);
-- Check whether the table is in the VACUUM & ANALYZE blacklist.
SELECT polar_advisor.is_relation_in_vacuum_analyze_blacklist(schema_name, relation_name);
-- Obtain the VACUUM & ANALYZE blacklist.
SELECT * FROM polar_advisor.get_vacuum_analyze_blacklist();

The following example shows the execution of the preceding statements:
-- Add the table public.t1 to the blacklist.
postgres=# SELECT polar_advisor.add_relation_to_vacuum_analyze_blacklist('public', 't1');
 add_relation_to_vacuum_analyze_blacklist
------------------------------------------
 t
(1 row)

-- Check whether the table is in the blacklist.
postgres=# SELECT polar_advisor.is_relation_in_vacuum_analyze_blacklist('public', 't1');
 is_relation_in_vacuum_analyze_blacklist
-----------------------------------------
 t
(1 row)

-- Obtain the complete blacklist and check whether the table is in the blacklist.
postgres=# SELECT * FROM polar_advisor.get_vacuum_analyze_blacklist();
 schema_name | relation_name |  action_type
-------------+---------------+----------------
 public      | t1            | VACUUM ANALYZE
(1 row)

Specify a threshold for active connections.
To prevent garbage collection operations during off-peak hours from affecting your business, the system automatically detects the number of active connections during off-peak hours. If the number of active connections exceeds the default threshold, the system automatically cancels the garbage collection operations. You can manually adjust the threshold to suit your business characteristics. By default, the threshold ranges from 5 to 10 based on the number of CPU cores in the cluster.
-- Execute the following statements in the postgres database.
-- Obtain the threshold of active connections allowed during off-peak hours. If the actual number of active connections exceeds the threshold, the system does not perform garbage collection.
SELECT polar_advisor.get_active_user_conn_num_limit();
-- Execute the following statement during off-peak hours to query the actual number of active connections during this time. You can also obtain this number in the PolarDB console: log on to the console, find the cluster that you want to manage, choose Performance Monitoring > Advanced Monitoring > Standard View, find the Sessions metric, and view the value of the active_session parameter.
SELECT COUNT(*) FROM pg_catalog.pg_stat_activity sa JOIN pg_catalog.pg_user u ON sa.usename = u.usename
WHERE sa.state = 'active' AND sa.backend_type = 'client backend' AND NOT u.usesuper;
-- Specify a threshold for active connections. The threshold overrides the default threshold.
SELECT polar_advisor.set_active_user_conn_num_limit(active_user_conn_limit);
-- Cancel the settings. The default threshold is then used.
SELECT polar_advisor.unset_active_user_conn_num_limit();

The following example shows the execution of the preceding statements:
-- Obtain the default threshold of active connections to the cluster. The threshold is 5. The threshold varies based on the number of CPU cores in the cluster.
postgres=# SELECT polar_advisor.get_active_user_conn_num_limit();
NOTICE:  get active user conn limit by CPU cores number
 get_active_user_conn_num_limit
--------------------------------
 5
(1 row)

-- Obtain the actual number of active connections. The actual number is 8, which is greater than the threshold 5. Therefore, the system considers the number of active connections high, and garbage collection operations cannot be performed within the off-peak hours.
postgres=# SELECT COUNT(*) FROM pg_catalog.pg_stat_activity sa JOIN pg_catalog.pg_user u ON sa.usename = u.usename
postgres-# WHERE sa.state = 'active' AND sa.backend_type = 'client backend' AND NOT u.usesuper;
 count
-------
     8
(1 row)

-- Set the threshold of active connections to 10. The specified threshold is greater than 8. Therefore, the system considers that the actual number of active connections does not exceed the threshold, and garbage collection operations can be performed within the off-peak hours.
postgres=# SELECT polar_advisor.set_active_user_conn_num_limit(10);
 set_active_user_conn_num_limit
--------------------------------

(1 row)

-- Query the threshold of active connections. The value 10 that you specified in the previous step is returned.
postgres=# SELECT polar_advisor.get_active_user_conn_num_limit();
NOTICE:  get active user conn limit from table
 get_active_user_conn_num_limit
--------------------------------
 10
(1 row)

-- Cancel the settings made in the previous step.
postgres=# SELECT polar_advisor.unset_active_user_conn_num_limit();
 unset_active_user_conn_num_limit
----------------------------------

(1 row)

-- After you cancel the settings, the threshold of active connections returns to the default value 5.
postgres=# SELECT polar_advisor.get_active_user_conn_num_limit();
NOTICE:  get active user conn limit by CPU cores number
 get_active_user_conn_num_limit
--------------------------------
 5
(1 row)
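The gating decision can be summarized as: garbage collection runs only if the count of active non-superuser client connections does not exceed the effective threshold (the user-set value if present, otherwise a default between 5 and 10 derived from CPU cores). The exact core-to-default mapping is internal, so the linear clamp below is only an assumption for illustration:

```python
def effective_conn_limit(user_limit, cpu_cores: int) -> int:
    """A limit set via set_active_user_conn_num_limit() overrides the default.
    The default grows with CPU cores within 5..10; this particular mapping
    (cores // 4 + 5, clamped) is a hypothetical stand-in, not the real formula."""
    if user_limit is not None:
        return user_limit
    return max(5, min(10, cpu_cores // 4 + 5))

def gc_allowed(active_user_conns: int, user_limit, cpu_cores: int) -> bool:
    """Garbage collection is skipped when active connections exceed the limit."""
    return active_user_conns <= effective_conn_limit(user_limit, cpu_cores)

print(gc_allowed(8, None, 2))  # False: 8 active connections exceed the default limit of 5
print(gc_allowed(8, 10, 2))    # True: a user-set limit of 10 permits garbage collection
```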
View operation results
The results and benefits of garbage collection operations performed during off-peak hours are recorded in the log tables of the postgres database. Data from the last 90 days is retained in the log tables.
Table format
The table named polar_advisor.db_level_advisor_log retains the details of each round of garbage collection operations in databases.
CREATE TABLE polar_advisor.db_level_advisor_log (
id BIGSERIAL PRIMARY KEY,
exec_id BIGINT,
start_time TIMESTAMP WITH TIME ZONE,
end_time TIMESTAMP WITH TIME ZONE,
db_name NAME,
event_type VARCHAR(100),
total_relation BIGINT,
acted_relation BIGINT,
age_before BIGINT,
age_after BIGINT,
others JSONB
);

The following table describes the parameters in the preceding statement.
Parameter | Description |
id | The primary key of the table. The id values are in ascending order. |
exec_id | The round number of the garbage collection operations. A round is generally executed every day and can involve multiple databases. Therefore, multiple entries recorded on a given day have the same exec_id value. |
start_time | The start time of the garbage collection operation. |
end_time | The end time of the garbage collection operation. |
db_name | The name of the database on which the garbage collection operation is performed. |
event_type | The operation type. The parameter is fixed to |
total_relation | The total number of subtables and indexes in the table on which the garbage collection operation can be performed. |
acted_relation | The number of subtables and indexes in the table on which the garbage collection operation is actually performed. |
age_before | The database age before the garbage collection operation. |
age_after | The database age after the garbage collection operation. |
others | The expanded statistics. Related options:
|
The table named polar_advisor.advisor_log retains the details of a garbage collection operation performed on tables or indexes. Multiple entries in the table polar_advisor.advisor_log correspond to an entry in the table polar_advisor.db_level_advisor_log.
CREATE TABLE polar_advisor.advisor_log (
id BIGSERIAL PRIMARY KEY,
exec_id BIGINT,
start_time TIMESTAMP WITH TIME ZONE,
end_time TIMESTAMP WITH TIME ZONE,
db_name NAME,
schema_name NAME,
relation_name NAME,
event_type VARCHAR(100),
sql_cmd TEXT,
detail TEXT,
tuples_deleted BIGINT,
tuples_dead_now BIGINT,
tuples_now BIGINT,
pages_scanned BIGINT,
pages_pinned BIGINT,
pages_frozen_now BIGINT,
pages_truncated BIGINT,
pages_now BIGINT,
idx_tuples_deleted BIGINT,
idx_tuples_now BIGINT,
idx_pages_now BIGINT,
idx_pages_deleted BIGINT,
idx_pages_reusable BIGINT,
size_before BIGINT,
size_now BIGINT,
age_decreased BIGINT,
others JSONB
);

The following table describes the parameters in the preceding statement.
Parameter | Description |
id | The primary key of the table. The id values are in ascending order. |
exec_id | The round number of the garbage collection operations. A round is generally executed every day and can involve multiple databases. Therefore, multiple entries recorded on a given day have the same exec_id value. |
start_time | The start time of the garbage collection operation. |
end_time | The end time of the garbage collection operation. |
db_name | The name of the database on which the garbage collection operation is performed. |
schema_name | The schema name of the database on which the garbage collection operation is performed. |
relation_name | The name of the table or index on which the garbage collection operation is performed. |
event_type | The operation type. The parameter is fixed to |
sql_cmd | The executed command. Example: |
detail | The results of the garbage collection operation. For example, the parameter can return the execution results of the |
tuples_deleted | The number of dead tuples that are reclaimed from the table in the garbage collection operation. |
tuples_dead_now | The number of dead tuples left in the table after the garbage collection operation. |
tuples_now | The number of live tuples in the table after the garbage collection operation. |
pages_scanned | The number of pages scanned in the garbage collection operation. |
pages_pinned | The number of pages that cannot be deleted due to cache references in the garbage collection operation. |
pages_frozen_now | The number of frozen pages after the garbage collection operation. |
pages_truncated | The number of empty pages deleted or truncated in the garbage collection operation. |
pages_now | The number of pages in the table after the garbage collection operation. |
idx_tuples_deleted | The number of dead tuples in the index reclaimed in the garbage collection operation. |
idx_tuples_now | The number of live tuples in the index after the garbage collection operation. |
idx_pages_now | The number of index pages after the garbage collection operation. |
idx_pages_deleted | The number of index pages deleted in the garbage collection operation. |
idx_pages_reusable | The number of index pages that are reused in the garbage collection operation. |
size_before | The size of the table or index before the garbage collection operation. |
size_now | The size of the table or index after the garbage collection operation. |
age_decreased | The table age difference before and after the garbage collection operation. |
others | The expanded statistics. |
Collect statistics
Execute the following statement to query the start time, end time, and the number of tables or indexes of recent garbage collection operations:
-- Execute the following statement in the postgres database.
SELECT COUNT(*) AS "the number of tables or indexes", MIN(start_time) AS "start time", MAX(end_time) AS "end time", exec_id AS "round" FROM polar_advisor.advisor_log GROUP BY exec_id ORDER BY exec_id DESC;

The following results are returned. The results indicate that the last three rounds of garbage collection operations are performed on about 4,390 tables and that the operations are performed between 1:00 and 4:00 in the morning.

 Number of tables or indexes |           Start time           |            End time            | Round
-----------------------------+--------------------------------+--------------------------------+-------
                        4391 | 2024-09-23 01:00:09.413901 +08 | 2024-09-23 03:25:39.029702 +08 |   139
                        4393 | 2024-09-22 01:03:07.365759 +08 | 2024-09-22 03:37:45.227067 +08 |   138
                        4393 | 2024-09-21 01:03:08.094989 +08 | 2024-09-21 03:45:20.280011 +08 |   137

Execute the following statement to query the number of tables or indexes on which recent garbage collection operations are performed and sort the results by date.
-- Execute the following statement in the postgres database.
SELECT start_time::pg_catalog.date AS "Date", count(*) AS "Number of tables or indexes" FROM polar_advisor.advisor_log GROUP BY start_time::pg_catalog.date ORDER BY start_time::pg_catalog.date DESC, count(*) DESC;

The following results are returned. The results indicate that garbage collection operations are performed on about 4,390 tables each day in the last three days.

    Date    | Number of tables or indexes
------------+-----------------------------
 2024-09-23 |                        4391
 2024-09-22 |                        4393
 2024-09-21 |                        4393

Execute the following statement to query the number of tables or indexes on which recent garbage collection operations are performed and sort the results by date and database.
-- Execute the following statement in the postgres database.
SELECT start_time::date AS "Date", db_name AS "DB", count(*) AS "Number of tables or indexes" FROM polar_advisor.advisor_log WHERE now() - start_time::date < INTERVAL '3 days' GROUP BY start_time::date, db_name ORDER BY start_time::date DESC, count(*) DESC;

The following results are returned. The results indicate that garbage collection operations are performed on the postgres, db_123, db_12345, and db_123456789 databases in the last three days. Each database has dozens to hundreds of tables or indexes on which the operations are performed.

    Date    |      DB      | Number of tables or indexes
------------+--------------+-----------------------------
 2024-03-05 | db_123456789 |                         697
 2024-03-05 | db_123       |                         277
 2024-03-04 | db_123456789 |                         695
 2024-03-04 | db_123       |                         267
 2024-03-04 | db_12345     |                         174
 2024-03-03 | postgres     |                          65
(6 rows)
Detailed data
Execute the following statement to query the benefits of databases on which recent garbage collection operations are performed:
-- Execute the following statement in the postgres database.
SELECT id, start_time AS "Start time", end_time AS "End time", db_name AS "DB", event_type AS "Operation type", total_relation AS "Total number of tables", acted_relation AS "Number of executed tables", CASE WHEN others->>'cluster_age_before' IS NOT NULL AND others->>'cluster_age_after' IS NOT NULL THEN (others->>'cluster_age_before')::BIGINT - (others->>'cluster_age_after')::BIGINT ELSE NULL END AS "Age decrease", CASE WHEN others->>'db_size_before' IS NOT NULL AND others->>'db_size_after' IS NOT NULL THEN (others->>'db_size_before')::BIGINT - (others->>'db_size_after')::BIGINT ELSE NULL END AS "Storage capacity reduction" FROM polar_advisor.db_level_advisor_log ORDER BY id DESC;

The following results are returned. The results indicate that the last three operations are of the VACUUM type.

  id  |          Start time           |           End time            |      DB      | Operation type | Total number of tables | Number of executed tables | Age decrease | Storage capacity reduction
------+-------------------------------+-------------------------------+--------------+----------------+------------------------+---------------------------+--------------+----------------------------
 1184 | 2024-03-05 00:44:26.776894+08 | 2024-03-05 00:45:56.396519+08 | db_12345     | VACUUM         |                    174 |                       164 |          694 |                          0
 1183 | 2024-03-05 00:43:30.243505+08 | 2024-03-05 00:44:26.695602+08 | db_123456789 | VACUUM         |                    100 |                        90 |          396 |                          0
 1182 | 2024-03-05 00:41:47.70952+08  | 2024-03-05 00:43:30.172527+08 | db_12345     | VACUUM         |                    163 |                       153 |          701 |                          0
(3 rows)

Execute the following statement to query the benefits of the tables on which recent garbage collection operations are performed:
-- Execute the following statement in the postgres database.
SELECT start_time AS "Start time", end_time AS "End time", db_name AS "DB", schema_name AS "Schema", relation_name AS "Table/index", event_type AS "Operation type", tuples_deleted AS "Number of reclaimed dead tuples", pages_scanned AS "Number of scanned pages", pages_truncated AS "Number of reclaimed pages", idx_tuples_deleted AS "Reclaimed dead tuples in the index", idx_pages_deleted AS "Reclaimed index pages", age_decreased AS "Table age decrease" FROM polar_advisor.advisor_log ORDER BY id DESC LIMIT 3;

The following results are returned. The results include information such as the number of dead tuples and pages reclaimed in the last three garbage collection operations and the table age decrease.

          Start time           |           End time            |    DB    | Schema | Table/index | Operation type | Number of reclaimed dead tuples | Number of scanned pages | Number of reclaimed pages | Reclaimed dead tuples in the index | Reclaimed index pages | Table age decrease
-------------------------------+-------------------------------+----------+--------+-------------+----------------+---------------------------------+-------------------------+---------------------------+------------------------------------+-----------------------+--------------------
 2024-03-05 00:45:56.204254+08 | 2024-03-05 00:45:56.357263+08 | db_12345 | public | cccc        | VACUUM         |                               0 |                      33 |                         0 |                                  0 |                     0 |            1345944
 2024-03-05 00:45:56.068499+08 | 2024-03-05 00:45:56.200036+08 | db_12345 | public | aaaa        | VACUUM         |                               0 |                      28 |                         0 |                                  0 |                     0 |            1345946
 2024-03-05 00:45:55.945677+08 | 2024-03-05 00:45:56.065316+08 | db_12345 | public | bbbb        | VACUUM         |                               0 |                       0 |                         0 |                                  0 |                     0 |            1345947
(3 rows)

View the records of the operations causing the largest database age decrease.
PolarDB for PostgreSQL has about 2.1 billion available transaction IDs. The database age measures the number of consumed transaction IDs. When the database age reaches 2.1 billion, transaction IDs wrap around and the database becomes unavailable. Therefore, the younger a database is, the safer it is.
-- Execute the following statements in the postgres database.
-- Obtain the database and operation type corresponding to the record with the largest cluster age decrease.
SELECT id, exec_id AS "Round", start_time AS "Start time", end_time AS "End time", db_name AS "DB", event_type AS "Operation type", CASE WHEN others->>'cluster_age_before' IS NOT NULL AND others->>'cluster_age_after' IS NOT NULL THEN (others->>'cluster_age_before')::BIGINT - (others->>'cluster_age_after')::BIGINT ELSE NULL END AS "Age decrease" FROM polar_advisor.db_level_advisor_log ORDER BY "Age decrease" DESC NULLS LAST LIMIT 3;
-- Obtain detailed records on the database age decrease in a round based on the information of the round obtained in the previous step.
SELECT id, start_time AS "Start time", end_time AS "End time", db_name AS "DB", schema_name AS "Schema", relation_name AS "Table name", sql_cmd AS "Command", event_type AS "Operation type", age_decreased AS "Age decrease" FROM polar_advisor.advisor_log WHERE exec_id = 91 ORDER BY "Age decrease" DESC NULLS LAST;
-- Obtain the current age of a database. You can also obtain the current database age in the PolarDB console: log on to the console, find the cluster that you want to manage, choose Performance Monitoring > Advanced Monitoring > Standard View, find the Vacuum metric, and view the value of the db_age parameter.
SELECT MAX(pg_catalog.age(datfrozenxid)) AS "Cluster age" FROM pg_catalog.pg_database;

The following sample code provides an example of the results:
-- The vacuum operation performed on the database named aaaaaaaaaaaaa on February 22, 2024 caused the database age to decrease by 9275406 (about 10 million). The operation was performed in Round 91 on that day. postgres=# SELECT id, exec_id AS "Round", start_time AS "Start time", end_time AS "End time", db_name AS "DB", event_type AS "Operation type", CASE WHEN others->>'cluster_age_before' IS NOT NULL AND others->>'cluster_age_after' IS NOT NULL THEN (others->>'cluster_age_before')::BIGINT - (others->>'cluster_age_after')::BIGINT ELSE NULL END AS "Age decrease" FROM polar_advisor.db_level_advisor_log ORDER BY "Age decrease" DESC NULLS LAST LIMIT 3; id | Round | Start time | End time | DB | Operation type | Age decrease --------+------+-------------------------------+-------------------------------+---------------+----------+---------- 259 | 91 | 2024-02-22 00:00:18.847978+08 | 2024-02-22 00:14:18.785085+08 | aaaaaaaaaaaaa | VACUUM | 9275406 256 | 90 | 2024-02-21 00:00:39.607552+08 | 2024-02-21 00:00:42.054733+08 | bbbbbbbbbbbbb | VACUUM | 7905122 262 | 92 | 2024-02-23 00:00:05.999423+08 | 2024-02-23 00:00:08.411993+08 | postgres | VACUUM | 578308 -- Obtain detailed records on the vacuum operation performed in Round 91. The records indicate that the database age decreases due to the vacuum operation performed on some pg_catalog system tables. 
postgres=# SELECT id, start_time AS "Start time", end_time AS "End time", db_name AS "DB", schema_name AS "Schema", relation_name AS "Table name", event_type AS "Operation type", age_decreased AS "Age decrease" FROM polar_advisor.advisor_log WHERE exec_id = 91 ORDER BY "Age decrease" DESC NULLS LAST;
  id   |          Start time           |           End time            |  DB  |   Schema   |     Table name     | Operation type | Age decrease
-------+-------------------------------+-------------------------------+------+------------+--------------------+----------------+--------------
 43933 | 2024-02-22 00:00:19.070493+08 | 2024-02-22 00:00:19.090822+08 | abc  | pg_catalog | pg_subscription    | VACUUM         |     27787409
 43935 | 2024-02-22 00:00:19.116292+08 | 2024-02-22 00:00:19.13875+08  | abc  | pg_catalog | pg_database        | VACUUM         |     27787408
 43936 | 2024-02-22 00:00:19.140992+08 | 2024-02-22 00:00:19.171938+08 | abc  | pg_catalog | pg_db_role_setting | VACUUM         |     27787408

-- The current cluster age is more than 20 million, which is far from the threshold of about 2.1 billion. Therefore, the cluster is in a secure state.
postgres=> SELECT MAX(pg_catalog.age(datfrozenxid)) AS "Cluster age" FROM pg_catalog.pg_database;
 Cluster age
-------------
    20874380
(1 row)
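To identify which tables hold back the age of a particular database, you can query the standard PostgreSQL catalogs directly. The following is a generic sketch that uses only `pg_class` and `pg_namespace` (it is not part of the polar_advisor extension) and must be run separately in each database:

```sql
-- List the tables with the oldest frozen transaction IDs in the current database.
-- These tables determine the database age; vacuuming (freezing) them yields the
-- largest age decrease. Run this query in each database of the cluster.
SELECT n.nspname AS "Schema",
       c.relname AS "Table name",
       pg_catalog.age(c.relfrozenxid) AS "Table age"
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r', 't', 'm')      -- ordinary tables, TOAST tables, materialized views
  AND c.relfrozenxid <> '0'::pg_catalog.xid   -- skip relations without storage
ORDER BY "Table age" DESC
LIMIT 5;
```

Tables that appear at the top of this list with ages close to the database age are the ones whose vacuum operations produce the age decreases recorded in polar_advisor.advisor_log.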
Optimization examples
This section shows the optimization results of resource usage and database age for some clusters after off-peak hours are specified.
The optimization results for issues such as read and write operations blocked by table locks and invalidated plan caches cannot be effectively displayed in charts. Therefore, this section does not cover them.
Not all clusters can achieve the optimization results displayed in the following examples. The actual optimization results vary based on specific business scenarios. Many reasons can cause unsatisfactory optimization results. For example, some clusters may be busy throughout the day and do not have obvious off-peak hours. Alternatively, some clusters may be configured for tasks such as data analysis, data import, and materialized view update during off-peak hours. As a result, the clusters do not have many idle resources for garbage collection operations.
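To check whether a cluster actually has idle off-peak hours, you can observe when autovacuum workers run relative to business traffic. The following is a sketch that uses the standard `pg_stat_activity` view (available in PostgreSQL 11 and later, as required by this feature):

```sql
-- Show autovacuum workers that are currently running and what they are processing.
-- Sampling this query at different times of day and comparing the results against
-- business traffic helps confirm whether the cluster has a usable off-peak window.
SELECT pid,
       datname,
       query_start,
       now() - query_start AS running_for,
       query
FROM pg_catalog.pg_stat_activity
WHERE backend_type = 'autovacuum worker'
ORDER BY query_start;
```

If this query frequently returns rows during peak business hours before the feature is enabled, the cluster is a good candidate for specifying off-peak hours.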
Optimization results for memory usage
After off-peak hours are specified, the peak memory usage of automatic cleanups in a cluster decreases from 2.06 GB to 37 MB, a drop of 98%, as shown in the following figure.

The total peak memory usage of all processes also decreases from 10 GB to 8 GB, a drop of 20%.

Optimization results for I/O throughput
After off-peak hours are specified, the peak PFS IOPS of automatic cleanups in a cluster decreases by about 50%, as shown in the following figure.

The peak PFS IOPS of all processes also decreases from 35,000 to about 21,000, a drop of about 40%.

The peak PFS throughput of automatic cleanups decreases from 225 MB/s to 173 MB/s, a drop of 23%. The width and number of the peaks also decrease significantly, and the average throughput decreases from 65.5 MB/s to 42.5 MB/s, a drop of 35%.

Optimization results for CPU usage
After off-peak hours are specified, the CPU usage of automatic cleanups in a cluster gradually decreases, with peak usage dropping by about 50%, as shown in the following figure.

Optimization results for the number of automatic cleanups
After off-peak hours are specified, the number of automatic cleanups in a cluster decreases from 2 to 1, as shown in the following figure.

Optimization results for database age
Within two days after off-peak hours are specified for a cluster, the cluster reclaims more than 1 billion transaction IDs, and the database age decreases from more than 1 billion to less than 100 million, as shown in the following figure. This greatly reduces the risk of transaction ID wraparound.
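The age decrease can be tracked per database with a standard catalog query (a generic sketch, not specific to the polar_advisor extension), which makes it easy to see which database still holds the oldest transaction IDs:

```sql
-- Show the age of each database; the highest value is the cluster age and
-- determines the remaining distance to the wraparound threshold.
SELECT datname AS "DB",
       pg_catalog.age(datfrozenxid) AS "DB age"
FROM pg_catalog.pg_database
ORDER BY "DB age" DESC;
```

Running this query periodically, for example once per day after the off-peak window ends, provides a simple way to verify that the feature keeps the cluster age at a safe level.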
