This topic describes how to use the off-peak garbage collection feature of PolarDB for PostgreSQL and provides usage examples.
Applicability
This feature is supported in the following versions of PolarDB for PostgreSQL:
PostgreSQL 17 (minor engine version 2.0.17.2.2.1 and later)
PostgreSQL 16 (minor engine version 2.0.16.8.3.0 and later)
PostgreSQL 15 (minor engine version 2.0.15.12.4.0 and later)
PostgreSQL 14 (minor engine version 2.0.14.12.24.0 and later)
PostgreSQL 11 (minor engine version 2.0.11.15.42.0 and later)
You can view the minor engine version in the console or by running the SHOW polar_version; statement. If the minor engine version does not meet the requirements, you must upgrade the minor engine version first.
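To check whether a cluster's minor engine version meets a requirement, the dotted version strings listed above can be compared component by component. A minimal Python sketch (the helper name is illustrative, not a PolarDB API):

```python
# Hypothetical helper: compare dotted minor engine versions such as
# "2.0.17.2.2.1" numerically, component by component.
def meets_minimum(current: str, required: str) -> bool:
    to_tuple = lambda v: tuple(int(p) for p in v.split("."))
    return to_tuple(current) >= to_tuple(required)

print(meets_minimum("2.0.17.2.2.1", "2.0.17.2.2.1"))  # True
print(meets_minimum("2.0.16.8.2.9", "2.0.16.8.3.0"))  # False
```

Plain string comparison would mis-order versions such as "2.0.9..." and "2.0.17...", which is why each component is converted to an integer first.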
Background information
Similar to native PostgreSQL, PolarDB for PostgreSQL starts an autovacuum process in the background to perform garbage collection. This provides benefits that include, but are not limited to, the following:
Reclaiming old data versions to reduce disk space usage.
Updating statistics to ensure the query optimizer can select the optimal execution plan.
Preventing transaction ID wraparound to effectively reduce the risk of cluster unavailability.
These garbage collection operations consume significant hardware resources. To avoid running autovacuum too frequently, native PostgreSQL sets specific trigger conditions. For more information, see Autovacuum Parameter Settings. The cleanup process starts only when these conditions are met. Because the trigger conditions are related to the number of changed rows and the database age, autovacuum is triggered more frequently during business peak hours when data changes are more frequent and transaction IDs are consumed faster. This leads to the following issues:
High resource usage: During peak hours, the autovacuum process frequently performs garbage collection. This consumes a large amount of CPU and I/O and competes with business read and write requests for hardware resources. This affects the read and write performance of the database. In the following example, the CPU usage and I/O throughput of the autovacuum process rank first among all processes during daytime peak hours.


Read/write requests blocked by locked tables: The autovacuum process needs to briefly hold an exclusive lock while it reclaims empty pages. This blocks business requests on a single table. Although the blocking time is usually short, even a brief interruption is unacceptable during peak business hours.
Plan cache invalidation: The autovacuum process collects statistics, which can invalidate the existing plan cache. New queries must then regenerate execution plans. During peak hours, multiple connections might generate execution plans in parallel. This affects the response time of business requests from multiple connections.
Note: The global plan cache (GPC) feature of PolarDB for PostgreSQL can help reduce the impact of this issue. For more information, see Global plan cache (GPC).
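The trigger condition mentioned above is, for native PostgreSQL's vacuum part, a simple threshold on the number of dead tuples per table. A minimal Python sketch using PostgreSQL's default parameter values:

```python
# Native PostgreSQL autovacuum vacuum trigger (defaults shown): a table is
# vacuumed once its dead tuples exceed
#   autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples.
def needs_vacuum(dead_tuples: int, live_tuples: int,
                 vacuum_threshold: int = 50,
                 vacuum_scale_factor: float = 0.2) -> bool:
    return dead_tuples > vacuum_threshold + vacuum_scale_factor * live_tuples

# A 1,000,000-row table is not vacuumed until more than 200,050 rows are
# dead, which is why heavy peak-hour write traffic triggers autovacuum
# far more often than off-peak traffic does.
print(needs_vacuum(dead_tuples=150_000, live_tuples=1_000_000))  # False
print(needs_vacuum(dead_tuples=250_000, live_tuples=1_000_000))  # True
```

This illustrates only the dead-tuple condition; autovacuum is also triggered by database age for wraparound prevention, as described above.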
The core of these issues is that native PostgreSQL has no concept of off-peak hours, but real-world business scenarios usually have distinct peak and off-peak periods. PolarDB for PostgreSQL allows you to configure a time window for off-peak hours. This feature uses idle hardware resources during off-peak hours to perform active and thorough garbage collection. As a result, the frequency of autovacuum during peak hours is reduced, which leaves more hardware resources for business read and write requests and optimizes read and write performance.
Expected benefits
Garbage collection during off-peak hours can mitigate the problems mentioned above. In addition, because the cleanup strategy during off-peak hours is more aggressive than the native PostgreSQL autovacuum, it may provide extra benefits. The overall benefits are as follows:
Optimized resource utilization: Garbage is collected during off-peak hours. This greatly reduces the probability of autovacuum being triggered during peak hours and lowers resource usage.
Optimized database age: More transaction IDs are reclaimed during off-peak hours. This prevents database unavailability caused by transaction ID wraparound.
Optimized statistics and slow SQL statements: More table statistics are collected. This helps the optimizer choose more accurate query plans and reduces the number of slow SQL statements caused by outdated statistics.
Reduced table locking issues: The probability of autovacuum operations locking tables and blocking business read/write requests during peak hours is lowered.
Reduced plan cache invalidation: The probability of autovacuum operations causing plan cache invalidation during peak hours is lowered.
Usage
You can use the following methods to configure an off-peak period for a PolarDB for PostgreSQL cluster only if it runs a supported version listed in the Applicability section. If the cluster runs an earlier minor engine version, you must first upgrade it to the latest minor engine version in the console. For more information, see Version management.
To configure an off-peak period without upgrading the minor engine version of the cluster, you can contact us for backend configuration. You must provide the start time, end time, and time zone for the off-peak period. This method has limitations. The configuration may become invalid due to operations such as primary/standby switchover, configuration changes, or zone switching. Therefore, this is only a temporary solution. For a permanent configuration, you must upgrade to the latest minor engine version.
Configure off-peak hour information
Create the extension.
You must create the polar_advisor extension on the postgres database and on all databases where you need to perform garbage collection.
CREATE EXTENSION IF NOT EXISTS polar_advisor;
For PolarDB clusters that already have the polar_advisor extension installed, run the following command to update it:
ALTER EXTENSION polar_advisor UPDATE;
Set the time window.
Run the following command to set the off-peak period.
-- Run on the postgres database.
SELECT polar_advisor.set_advisor_window(start_time, end_time);
start_time: The start time of the window.
end_time: The end time of the window.
By default, the window takes effect on the same day. Garbage collection is then automatically performed within this time window each day.
Note: Only the time window configured in the postgres database takes effect. Setting a time window on other databases has no effect.
The time zone offset in the window time must be consistent with the time zone setting of the PolarDB cluster. Otherwise, the time window does not take effect.
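Because the window recurs daily, it may wrap past midnight (for example, 23:00 to 02:00). A minimal Python sketch of such a membership check (an illustration only, not the extension's actual implementation):

```python
from datetime import time

# Membership in a daily window that may wrap past midnight.
def in_window(now: time, start: time, end: time) -> bool:
    if start <= end:                   # same-day window, e.g. 01:00-04:00
        return start <= now < end
    return now >= start or now < end   # wraps midnight, e.g. 23:00-02:00

print(in_window(time(23, 30), time(23, 0), time(2, 0)))  # True
print(in_window(time(1, 30), time(23, 0), time(2, 0)))   # True
print(in_window(time(12, 0), time(23, 0), time(2, 0)))   # False
```

The is_in_advisor_window() function described below performs this kind of check on the server side, using the cluster's time zone.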
The following example sets the off-peak period from 23:00 to 02:00 daily in the UTC+8 time zone. The cluster performs garbage collection during this period every day:
SELECT polar_advisor.set_advisor_window(start_time => '23:00:00+08', end_time => '02:00:00+08');
View the time window.
Run the following command to view information about the configured off-peak period.
-- Run on the postgres database.
-- View details of the off-peak period.
SELECT * FROM polar_advisor.get_advisor_window();
-- View the duration of the off-peak period in seconds.
SELECT polar_advisor.get_advisor_window_length();
-- Check whether the current time is within the off-peak period.
SELECT now(), * FROM polar_advisor.is_in_advisor_window();
-- View the time remaining until the next off-peak period starts, in seconds.
SELECT * FROM polar_advisor.get_secs_to_advisor_window_start();
-- View the time remaining until the current off-peak period ends, in seconds.
SELECT * FROM polar_advisor.get_secs_to_advisor_window_end();
The following is an example:
-- View details of the off-peak period.
postgres=# SELECT * FROM polar_advisor.get_advisor_window();
 start_time  |  end_time   | enabled | last_error_time | last_error_detail | others
-------------+-------------+---------+-----------------+-------------------+--------
 23:00:00+08 | 02:00:00+08 | t       |                 |                   |
(1 row)

-- View the duration of the off-peak period.
postgres=# SELECT polar_advisor.get_advisor_window_length() / 3600.0 AS "window_duration_hours";
 window_duration_hours
-----------------------
    3.0000000000000000
(1 row)

-- Check whether the current time is within the off-peak period.
postgres=# SELECT now(), * FROM polar_advisor.is_in_advisor_window();
              now              | is_in_advisor_window
-------------------------------+----------------------
 2024-04-01 07:40:37.733911+00 | f
(1 row)

-- View the time remaining until the next off-peak period starts.
postgres=# SELECT * FROM polar_advisor.get_secs_to_advisor_window_start();
 secs_to_window_start |      time_now      | window_start | window_end
----------------------+--------------------+--------------+-------------
         26362.265179 | 07:40:37.734821+00 | 15:00:00+00  | 18:00:00+00
(1 row)

-- View the time remaining until the current off-peak period ends.
postgres=# SELECT * FROM polar_advisor.get_secs_to_advisor_window_end();
 secs_to_window_end |      time_now      | window_start | window_end
--------------------+--------------------+--------------+-------------
       36561.870337 | 07:40:38.129663+00 | 15:00:00+00  | 18:00:00+00
(1 row)

Disable or enable the window.
After you set a window, it is enabled by default, and garbage collection runs within this window each day. If you do not want the cluster to perform garbage collection during the off-peak period on a particular day, for example, to perform other manual operations where you are concerned about conflicts, you can run the following command to disable the off-peak period. After the work is complete, you can run the command again to re-enable the window.
-- Run on the postgres database.
-- Disable garbage collection during the off-peak period.
SELECT polar_advisor.disable_advisor_window();
-- Enable garbage collection during the off-peak period.
SELECT polar_advisor.enable_advisor_window();
-- Check whether the window is enabled.
SELECT polar_advisor.is_advisor_window_enabled();
The following is an example:
-- Window is enabled.
postgres=# SELECT polar_advisor.is_advisor_window_enabled();
 is_advisor_window_enabled
---------------------------
 t
(1 row)

-- Disable the window.
postgres=# SELECT polar_advisor.disable_advisor_window();
 disable_advisor_window
------------------------

(1 row)

-- Window is disabled.
postgres=# SELECT polar_advisor.is_advisor_window_enabled();
 is_advisor_window_enabled
---------------------------
 f
(1 row)

-- Re-enable the window.
postgres=# SELECT polar_advisor.enable_advisor_window();
 enable_advisor_window
-----------------------

(1 row)

-- Window is enabled.
postgres=# SELECT polar_advisor.is_advisor_window_enabled();
 is_advisor_window_enabled
---------------------------
 t
(1 row)
Other configurations
Set a blacklist.
If you configure an off-peak period, the database automatically decides which tables to perform garbage collection on during the window. Any table can be selected. If you want to prevent a specific table from being selected, you can run the following command to add it to a blacklist.
-- Run in a specific business database.
-- Add a table to the VACUUM & ANALYZE blacklist.
SELECT polar_advisor.add_relation_to_vacuum_analyze_blacklist(schema_name, relation_name);
-- Check whether a table is in the VACUUM & ANALYZE blacklist.
SELECT polar_advisor.is_relation_in_vacuum_analyze_blacklist(schema_name, relation_name);
-- Get the VACUUM & ANALYZE blacklist.
SELECT * FROM polar_advisor.get_vacuum_analyze_blacklist();
The following is an example:
-- Add the public.t1 table to the blacklist.
postgres=# SELECT polar_advisor.add_relation_to_vacuum_analyze_blacklist('public', 't1');
 add_relation_to_vacuum_analyze_blacklist
------------------------------------------
 t
(1 row)

-- Check whether the table is in the blacklist.
postgres=# SELECT polar_advisor.is_relation_in_vacuum_analyze_blacklist('public', 't1');
 is_relation_in_vacuum_analyze_blacklist
-----------------------------------------
 t
(1 row)

-- Get the complete blacklist to confirm that the table is in it.
postgres=# SELECT * FROM polar_advisor.get_vacuum_analyze_blacklist();
 schema_name | relation_name |  action_type
-------------+---------------+----------------
 public      | t1            | VACUUM ANALYZE
(1 row)

Set the active connection threshold.
To prevent garbage collection during off-peak hours from affecting normal business operations, the system automatically checks the number of active connections. If the number of active connections exceeds a threshold, the garbage collection operation is automatically canceled. You can manually adjust this threshold to meet your business requirements. The default threshold is between 5 and 10 and depends on the number of CPU cores in the cluster.
-- Run on the postgres database.
-- Get the active connection threshold for the off-peak period. Garbage collection is not performed if the actual number of active connections is higher than this value.
SELECT polar_advisor.get_active_user_conn_num_limit();
-- Run the following statement during the off-peak period to get the actual number of active connections. You can also view the active_session metric in the PolarDB console by navigating to Performance Monitoring > Advanced Monitoring > Standard View > Session Connection.
SELECT COUNT(*)
FROM pg_catalog.pg_stat_activity sa
JOIN pg_catalog.pg_user u ON sa.usename = u.usename
WHERE sa.state = 'active' AND sa.backend_type = 'client backend' AND NOT u.usesuper;
-- Manually set the active connection threshold. This overrides the system default.
SELECT polar_advisor.set_active_user_conn_num_limit(active_user_conn_limit);
-- Unset the active connection threshold to restore the system default.
SELECT polar_advisor.unset_active_user_conn_num_limit();
For example:
-- Get the default active connection threshold for the instance. The threshold for this instance is 5; it may vary across instances based on the number of CPU cores.
postgres=# SELECT polar_advisor.get_active_user_conn_num_limit();
NOTICE:  get active user conn limit by CPU cores number
 get_active_user_conn_num_limit
--------------------------------
                              5
(1 row)

-- Get the current number of active connections. The result is 8, which is greater than the threshold of 5, so the system considers the number of active connections too high and does not perform garbage collection.
postgres=# SELECT COUNT(*) FROM pg_catalog.pg_stat_activity sa JOIN pg_catalog.pg_user u ON sa.usename = u.usename
postgres-# WHERE sa.state = 'active' AND sa.backend_type = 'client backend' AND NOT u.usesuper;
 count
-------
     8
(1 row)

-- Set the active connection threshold to 10, which is greater than the actual number of active connections (8). The system now considers the connection count within the threshold, so garbage collection can be performed.
postgres=# SELECT polar_advisor.set_active_user_conn_num_limit(10);
 set_active_user_conn_num_limit
--------------------------------

(1 row)

-- View the active connection threshold. The value is 10, which was set manually in the previous step.
postgres=# SELECT polar_advisor.get_active_user_conn_num_limit();
NOTICE:  get active user conn limit from table
 get_active_user_conn_num_limit
--------------------------------
                             10
(1 row)

-- Unset the active connection threshold.
postgres=# SELECT polar_advisor.unset_active_user_conn_num_limit();
 unset_active_user_conn_num_limit
----------------------------------

(1 row)

-- After unsetting, the active connection threshold reverts to the default value of 5.
postgres=# SELECT polar_advisor.get_active_user_conn_num_limit();
NOTICE:  get active user conn limit by CPU cores number
 get_active_user_conn_num_limit
--------------------------------
                              5
(1 row)
View results
The results and benefits of garbage collection operations performed during the off-peak period are recorded in log tables in the postgres database. The logs retain data from the last 90 days.
Table schema
The polar_advisor.db_level_advisor_log table stores information about each round of database-level garbage collection.
CREATE TABLE polar_advisor.db_level_advisor_log (
id BIGSERIAL PRIMARY KEY,
exec_id BIGINT,
start_time TIMESTAMP WITH TIME ZONE,
end_time TIMESTAMP WITH TIME ZONE,
db_name NAME,
event_type VARCHAR(100),
total_relation BIGINT,
acted_relation BIGINT,
age_before BIGINT,
age_after BIGINT,
others JSONB
);
Description:
Parameter | Description
id | The primary key of the table. It auto-increments.
exec_id | The execution round. One round usually runs per day. A round can operate on multiple databases, so multiple records on the same day can have the same exec_id.
start_time | The start time of the operation.
end_time | The end time of the operation.
db_name | The name of the database being operated on.
event_type | The operation type. Currently, only VACUUM is supported.
total_relation | The number of tables and indexes that can be operated on.
acted_relation | The number of tables and indexes that were actually operated on.
age_before | The database age before the operation.
age_after | The database age after the operation.
others | Extended statistical data in JSONB format, such as the cluster age and database size before and after the operation (cluster_age_before, cluster_age_after, db_size_before, db_size_after).
The polar_advisor.advisor_log table stores detailed information about each table-level or index-level garbage collection operation. One record in the polar_advisor.db_level_advisor_log table corresponds to multiple records in the polar_advisor.advisor_log table.
CREATE TABLE polar_advisor.advisor_log (
id BIGSERIAL PRIMARY KEY,
exec_id BIGINT,
start_time TIMESTAMP WITH TIME ZONE,
end_time TIMESTAMP WITH TIME ZONE,
db_name NAME,
schema_name NAME,
relation_name NAME,
event_type VARCHAR(100),
sql_cmd TEXT,
detail TEXT,
tuples_deleted BIGINT,
tuples_dead_now BIGINT,
tuples_now BIGINT,
pages_scanned BIGINT,
pages_pinned BIGINT,
pages_frozen_now BIGINT,
pages_truncated BIGINT,
pages_now BIGINT,
idx_tuples_deleted BIGINT,
idx_tuples_now BIGINT,
idx_pages_now BIGINT,
idx_pages_deleted BIGINT,
idx_pages_reusable BIGINT,
size_before BIGINT,
size_now BIGINT,
age_decreased BIGINT,
others JSONB
);
Description:
Parameter | Description
id | The primary key of the table. It auto-increments.
exec_id | The execution round. One round usually runs per day. A round can operate on multiple databases, so multiple records on the same day can have the same exec_id.
start_time | The start time of the operation.
end_time | The end time of the operation.
db_name | The name of the database being operated on.
schema_name | The name of the schema being operated on.
relation_name | The name of the table or index being operated on.
event_type | The operation type. Currently, only VACUUM is supported.
sql_cmd | The specific command that was run, such as a VACUUM statement.
detail | The detailed results of the operation, such as the output of the VACUUM command.
tuples_deleted | The number of dead tuples reclaimed from the table during this operation.
tuples_dead_now | The number of dead tuples remaining in the table after this operation.
tuples_now | The number of live tuples in the table after this operation.
pages_scanned | The number of pages scanned during this operation.
pages_pinned | The number of pages that could not be cleaned because they were pinned by other sessions.
pages_frozen_now | The number of frozen pages after this operation.
pages_truncated | The number of empty pages that were deleted or truncated during this operation.
pages_now | The number of pages in the table after this operation.
idx_tuples_deleted | The number of dead index tuples reclaimed during this operation.
idx_tuples_now | The number of live tuples in the index after this operation.
idx_pages_now | The number of pages in the index after this operation.
idx_pages_deleted | The number of index pages deleted during this operation.
idx_pages_reusable | The number of reusable index pages after this operation.
size_before | The size of the table or index before this operation.
size_now | The size of the table or index after this operation.
age_decreased | The reduction in table age from this operation.
others | Extended statistical data.
Statistical data
View the start time, end time, and number of tables and indexes operated on for each recent garbage collection round. The following is an example:
-- Run on the postgres database.
SELECT COUNT(*) AS "table_index_count", MIN(start_time) AS "start_time", MAX(end_time) AS "end_time", exec_id AS "round"
FROM polar_advisor.advisor_log
GROUP BY exec_id
ORDER BY exec_id DESC;
The results show that in each of the last three rounds, garbage collection was performed on about 4,390 tables, and the operations ran between 01:00 and 04:00.
 table_index_count |           start_time           |            end_time            | round
-------------------+--------------------------------+--------------------------------+-------
              4391 | 2024-09-23 01:00:09.413901 +08 | 2024-09-23 03:25:39.029702 +08 |   139
              4393 | 2024-09-22 01:03:07.365759 +08 | 2024-09-22 03:37:45.227067 +08 |   138
              4393 | 2024-09-21 01:03:08.094989 +08 | 2024-09-21 03:45:20.280011 +08 |   137

View the number of tables/indexes that underwent garbage collection each day recently, grouped by date. The following is an example:
-- Run on the postgres database.
SELECT start_time::pg_catalog.date AS "date", count(*) AS "table_index_count"
FROM polar_advisor.advisor_log
GROUP BY start_time::pg_catalog.date
ORDER BY start_time::pg_catalog.date DESC, count(*) DESC;
The results show that garbage collection was performed on about 4,390 tables each day for the last three days.
    date    | table_index_count
------------+-------------------
 2024-09-23 |              4391
 2024-09-22 |              4393
 2024-09-21 |              4393

View the number of tables/indexes that recently underwent garbage collection, grouped by date and database. The following is an example:
-- Run on the postgres database.
SELECT start_time::pg_catalog.date AS "date", db_name AS "db", count(*) AS "table_index_count"
FROM polar_advisor.advisor_log
GROUP BY start_time::pg_catalog.date, db_name
ORDER BY start_time::pg_catalog.date DESC, count(*) DESC;
The results show that garbage collection was performed on the postgres, db_123, db_12345, and db_123456789 databases over the last three days. The number of tables/indexes operated on per database ranged from tens to hundreds.
    date    |      db      | table_index_count
------------+--------------+-------------------
 2024-03-05 | db_123456789 |               697
 2024-03-05 | db_123       |               277
 2024-03-04 | db_123456789 |               695
 2024-03-04 | db_123       |               267
 2024-03-04 | db_12345     |               174
 2024-03-03 | postgres     |                65
(6 rows)
Detailed data
View the benefit information for databases that recently underwent garbage collection. The following is an example:
-- Run on the postgres database.
SELECT id, start_time AS "start_time", end_time AS "end_time", db_name AS "database", event_type AS "operation_type",
       total_relation AS "total_relations_in_db", acted_relation AS "operated_relations",
       CASE WHEN others->>'cluster_age_before' IS NOT NULL AND others->>'cluster_age_after' IS NOT NULL
            THEN (others->>'cluster_age_before')::BIGINT - (others->>'cluster_age_after')::BIGINT
            ELSE NULL END AS "age_reduction",
       CASE WHEN others->>'db_size_before' IS NOT NULL AND others->>'db_size_after' IS NOT NULL
            THEN (others->>'db_size_before')::BIGINT - (others->>'db_size_after')::BIGINT
            ELSE NULL END AS "storage_reduction"
FROM polar_advisor.db_level_advisor_log
ORDER BY id DESC;
The results show that the last three operations were all VACUUM operations.
  id  |          start_time           |           end_time            |   database   | operation_type | total_relations_in_db | operated_relations | age_reduction | storage_reduction
------+-------------------------------+-------------------------------+--------------+----------------+-----------------------+--------------------+---------------+-------------------
 1184 | 2024-03-05 00:44:26.776894+08 | 2024-03-05 00:45:56.396519+08 | db_12345     | VACUUM         |                   174 |                164 |           694 |                 0
 1183 | 2024-03-05 00:43:30.243505+08 | 2024-03-05 00:44:26.695602+08 | db_123456789 | VACUUM         |                   100 |                 90 |           396 |                 0
 1182 | 2024-03-05 00:41:47.70952+08  | 2024-03-05 00:43:30.172527+08 | db_12345     | VACUUM         |                   163 |                153 |           701 |                 0
(3 rows)

View the benefit information for tables that recently underwent garbage collection. The following is an example:
-- Run on the postgres database.
SELECT start_time AS "start_time", end_time AS "end_time", db_name AS "database", schema_name AS "schema", relation_name AS "table_index", event_type AS "operation_type",
       tuples_deleted AS "reclaimed_dead_tuples", pages_scanned AS "pages_scanned", pages_truncated AS "pages_truncated",
       idx_tuples_deleted AS "reclaimed_index_dead_tuples", idx_pages_deleted AS "reclaimed_index_pages", age_decreased AS "table_age_reduction"
FROM polar_advisor.advisor_log
ORDER BY id DESC;
The results show information such as the number of dead tuples reclaimed, pages reclaimed, and table age reduction for the last three operations.
          start_time           |           end_time            | database | schema | table_index | operation_type | reclaimed_dead_tuples | pages_scanned | pages_truncated | reclaimed_index_dead_tuples | reclaimed_index_pages | table_age_reduction
-------------------------------+-------------------------------+----------+--------+-------------+----------------+-----------------------+---------------+-----------------+-----------------------------+-----------------------+---------------------
 2024-03-05 00:45:56.204254+08 | 2024-03-05 00:45:56.357263+08 | db_12345 | public | cccc        | VACUUM         |                     0 |            33 |               0 |                           0 |                     0 |             1345944
 2024-03-05 00:45:56.068499+08 | 2024-03-05 00:45:56.200036+08 | db_12345 | public | aaaa        | VACUUM         |                     0 |            28 |               0 |                           0 |                     0 |             1345946
 2024-03-05 00:45:55.945677+08 | 2024-03-05 00:45:56.065316+08 | db_12345 | public | bbbb        | VACUUM         |                     0 |             0 |               0 |                           0 |                     0 |             1345947
(3 rows)

View the operation records with the largest database age reduction.
PolarDB for PostgreSQL has about 2.1 billion available transaction IDs. The database age measures the number of consumed transaction IDs. When the age reaches 2.1 billion, a transaction ID wraparound event occurs and the database becomes unavailable. Therefore, a lower database age is better.
-- Run on the postgres database.
-- Get the records for the database and operation type with the largest instance age reduction.
SELECT id, exec_id AS "round", start_time AS "start_time", end_time AS "end_time", db_name AS "database", event_type AS "operation_type",
       CASE WHEN others->>'cluster_age_before' IS NOT NULL AND others->>'cluster_age_after' IS NOT NULL
            THEN (others->>'cluster_age_before')::BIGINT - (others->>'cluster_age_after')::BIGINT
            ELSE NULL END AS "age_reduction"
FROM polar_advisor.db_level_advisor_log
ORDER BY "age_reduction" DESC NULLS LAST;
-- Based on the round information from the previous step, get the detailed records that caused the database age reduction in that round.
SELECT id, start_time AS "start_time", end_time AS "end_time", db_name AS "database", schema_name AS "schema", relation_name AS "table_name", sql_cmd AS "command", event_type AS "operation_type", age_decreased AS "age_reduction"
FROM polar_advisor.advisor_log
WHERE exec_id = 91
ORDER BY "age_reduction" DESC NULLS LAST;
-- Get the current database age. This can be run on any database. You can also view the db_age metric in the PolarDB console by navigating to Performance Monitoring > Advanced Monitoring > Standard View > Vacuum.
SELECT MAX(pg_catalog.age(datfrozenxid)) AS "instance_age" FROM pg_catalog.pg_database;
The following is an example of the results:
-- On 2024-02-22, a vacuum operation on the 'aaaaaaaaaaaaa' database reduced the database age by 9,275,406 (nearly 10 million). The execution round was 91.
postgres=# SELECT id, exec_id AS "round", start_time AS "start_time", end_time AS "end_time", db_name AS "database", event_type AS "operation_type", CASE WHEN others->>'cluster_age_before' IS NOT NULL AND others->>'cluster_age_after' IS NOT NULL THEN (others->>'cluster_age_before')::BIGINT - (others->>'cluster_age_after')::BIGINT ELSE NULL END AS "age_reduction" FROM polar_advisor.db_level_advisor_log ORDER BY "age_reduction" DESC NULLS LAST;
 id  | round |          start_time           |           end_time            |   database    | operation_type | age_reduction
-----+-------+-------------------------------+-------------------------------+---------------+----------------+---------------
 259 |    91 | 2024-02-22 00:00:18.847978+08 | 2024-02-22 00:14:18.785085+08 | aaaaaaaaaaaaa | VACUUM         |       9275406
 256 |    90 | 2024-02-21 00:00:39.607552+08 | 2024-02-21 00:00:42.054733+08 | bbbbbbbbbbbbb | VACUUM         |       7905122
 262 |    92 | 2024-02-23 00:00:05.999423+08 | 2024-02-23 00:00:08.411993+08 | postgres      | VACUUM         |        578308

-- Get the detailed vacuum records for execution round 91. The results show that the database age reduction was mainly due to vacuum operations on some pg_catalog system tables.
postgres=# SELECT id, start_time AS "start_time", end_time AS "end_time", db_name AS "database", schema_name AS "schema", relation_name AS "table_name", event_type AS "operation_type", age_decreased AS "age_reduction" FROM polar_advisor.advisor_log WHERE exec_id = 91 ORDER BY "age_reduction" DESC NULLS LAST;
  id   |          start_time           |           end_time            | database |   schema   |     table_name     | operation_type | age_reduction
-------+-------------------------------+-------------------------------+----------+------------+--------------------+----------------+---------------
 43933 | 2024-02-22 00:00:19.070493+08 | 2024-02-22 00:00:19.090822+08 | abc      | pg_catalog | pg_subscription    | VACUUM         |      27787409
 43935 | 2024-02-22 00:00:19.116292+08 | 2024-02-22 00:00:19.13875+08  | abc      | pg_catalog | pg_database        | VACUUM         |      27787408
 43936 | 2024-02-22 00:00:19.140992+08 | 2024-02-22 00:00:19.171938+08 | abc      | pg_catalog | pg_db_role_setting | VACUUM         |      27787408

-- The current instance age is over 20 million, which is far from the threshold of 2.1 billion. The instance is very safe.
postgres=> SELECT MAX(pg_catalog.age(datfrozenxid)) AS "instance_age" FROM pg_catalog.pg_database;
 instance_age
--------------
     20874380
(1 row)
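The "far from the threshold" judgment above is simple arithmetic against PostgreSQL's transaction ID space. A small Python sketch (an illustration; 2^31 is the approximate 2.1 billion limit mentioned above):

```python
# How far an instance is from transaction ID wraparound: "database age" is
# the number of transaction IDs already consumed out of roughly 2^31.
WRAPAROUND_LIMIT = 2**31  # ~2.1 billion usable transaction IDs

def wraparound_headroom(instance_age: int) -> float:
    """Fraction of the transaction ID space still available."""
    return 1 - instance_age / WRAPAROUND_LIMIT

# The example instance age of 20,874,380 leaves more than 99% headroom.
print(round(wraparound_headroom(20_874_380), 2))  # 0.99
```

An instance age near the limit would drive this headroom toward zero, which is exactly the situation that off-peak garbage collection (by freezing tuples and advancing datfrozenxid) helps avoid.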
Optimization effect examples
This section shows the optimization effects on resource usage and database age for some clusters after an off-peak period is configured.
The optimization effects for issues such as read/write operations blocked by locked tables and plan cache invalidation are not easily shown with charts, so they are not displayed here.
Not all clusters can achieve the same optimization effects as shown in the examples. The actual effect depends on the specific business scenario. The improvement may not be significant for several reasons. For example, some clusters are busy all day with no obvious off-peak period. Other clusters might have tasks such as data analytics, data import, or materialized view refreshes scheduled during their off-peak period, which leaves few idle resources for garbage collection.
Memory usage optimization effect
As shown in the following figure, after configuring garbage collection for the off-peak period, the peak memory usage of the cluster's autovacuum process dropped from 2.06 GB to 37 MB, a 98% reduction.

The total peak memory usage of all processes also dropped from 10 GB to 8 GB, a 20% reduction.

I/O usage optimization effect
As shown in the following figure, after the off-peak period was configured, the peak PFS IOPS of the cluster's autovacuum process was reduced by about 50%.

The total peak PFS IOPS of all processes also dropped from 35,000 to about 21,000, which is a reduction of about 40%.

The peak PFS I/O throughput of the autovacuum process dropped from 225 MB/s to 173 MB/s, a 23% reduction. The width and number of peaks also decreased significantly. The average throughput dropped from 65.5 MB/s to 42.5 MB/s, a 35% reduction.

CPU usage optimization effect
As shown in the following figure, after the off-peak period was configured, the CPU utilization of the cluster's autovacuum process gradually decreased, with a peak reduction of about 50%.

Autovacuum process count optimization effect
As shown in the following figure, after the off-peak period was configured, the number of autovacuum processes in the cluster decreased from 2 to 1.

Database age optimization effect
As shown in the following figure, within two days of configuring the off-peak period, the cluster reclaimed more than 1 billion transaction IDs. The database age dropped from over 1 billion to less than 100 million, significantly reducing the risk of transaction ID wraparound.
