AnalyticDB for PostgreSQL can automatically perform dirty data cleanup operations on tables in the background to prevent degradation of scanning performance and transaction ID (XID) wraparound issues caused by data bloat. This topic describes the working mechanism and optimization parameters of the auto-vacuum process.
Prerequisites
An AnalyticDB for PostgreSQL instance of a minor version released on or after May 27, 2021 is created. For information about how to update the minor version of an AnalyticDB for PostgreSQL instance, see Update the minor version of an instance.
Working mechanism
The auto-vacuum process can automatically execute the VACUUM and VACUUM FREEZE statements. The auto-vacuum process periodically checks tables on which a large number of inserts, updates, or deletes are performed, and the XID age of each table. The process performs VACUUM or VACUUM FREEZE operations on the tables as needed to clean up dirty data in the tables or reduce the XID age to speed up data queries.
Trigger formula
Number of dirty data rows > Number of table rows × Ratio threshold for triggering the auto-vacuum feature + Row number threshold for triggering the auto-vacuum feature
The number of rows that contain dirty data increases when you perform updates and deletes on a table. The number of rows of the table is the number of rows obtained from the last analysis of the table. You can use the autovacuum_vacuum_scale_factor parameter to specify the ratio threshold for triggering the auto-vacuum feature, which defaults to 0.5. You can use the autovacuum_vacuum_threshold parameter to specify the row number threshold for triggering the auto-vacuum feature, which defaults to 10000. By default, the auto-vacuum feature is automatically triggered to clean up dirty data when the ratio of the size of dirty data in a table to the total table size exceeds 50%.
Lock conflict
The auto-vacuum process holds a 4-level lock on a table. If a business process requires a lock that conflicts with the lock held by the auto-vacuum process, the auto-vacuum process gives up the lock to ensure the running of the business process.
Resource usage
The auto-vacuum process uses CPU and I/O resources during execution but does not affect the performance when the resource bottleneck is not reached.
Query information about the auto-vacuum process
In AnalyticDB for PostgreSQL V7.0, execute the following statement to query information about the running auto-vacuum process:
SELECT * FROM pg_stat_activity WHERE backend_type = 'autovacuum worker';
In AnalyticDB for PostgreSQL V6.0, execute the following statement to query information about the running auto-vacuum process:
SELECT * FROM pg_stat_activity WHERE query LIKE 'autovacuum%';
Query tables that require the auto-vacuum feature
SELECT schemaname, c.relname, c.reltuples, n_dead_tup, last_autovacuum, (n_dead_tup::float / c.reltuples::float)
AS bloat_ratio FROM pg_stat_all_tables s, pg_class c
WHERE c.oid = s.relid AND c.reltuples > 10000 AND (n_dead_tup::float > 10000 + c.reltuples::float * 0.5)
ORDER BY (n_dead_tup::float/ c.reltuples::float) DESC;
Parameters:
0.5: the value specified for the instance-level autovacuum_vacuum_scale_factor parameter.
10000: the value specified for the instance-level autovacuum_vacuum_threshold parameter.
Configure parameters
Instance-level parameters
The following table describes the instance-level auto-vacuum parameters that can be modified.
Parameter | Description | Default value |
autovacuum_naptime | The minimum scheduling interval between two executions of the auto-vacuum process. Unit: seconds. | 60 |
autovacuum_vacuum_scale_factor | The data size ratio threshold for triggering the auto-vacuum feature. If you want to clean up dirty data more frequently, set this parameter to a smaller value. | 0.5 |
autovacuum_vacuum_threshold | The minimum number of updated or deleted rows for triggering the auto-vacuum feature. | 10000 |
autovacuum_freeze_max_age | The maximum XID age for triggering the auto-vacuum feature to prevent XID wraparound. | 200,000,000 |
autovacuum_max_execute_workers | The maximum number of concurrent auto-vacuum tasks. | max(3, cpucores/2), cpucores is the number of CPU cores on a single compute node. |
In most cases, we do not recommend that you modify the instance-level auto-vacuum parameters. If you want to modify the value of a parameter, submit a ticket.
Table-level parameters
Modify auto-vacuum parameters
Execute the following statement to modify an auto-vacuum parameter of a table:
ALTER TABLE <test_table> SET (<autovaccum_options>=<expect_value>);
Parameters:
<test_table>
: the name of the table for which you want to modify an auto-vacuum parameter.<autovaccum_options>
: the table-level auto-vacuum parameter whose value you want to modify.<expect_value>
: the new value of the parameter.
Examples
Example 1: Disable the auto-vacuum feature for the test table.
ALTER TABLE test SET (autovacuum_enabled=false);
You can set the autovacuum_enabled
parameter to false to disable the auto-vacuum feature for a table. We recommend that you increase the value of the autovacuum_vacuum_scale_factor parameter to disable the auto-vacuum feature.
Example 2: Increase the ratio threshold for triggering the auto-vacuum feature to 80%.
ALTER TABLE test SET (autovacuum_vacuum_scale_factor=0.8);
Restore auto-vacuum parameters
Execute the following statement to restore the auto-vacuum parameters of a table to the default auto-vacuum parameter values of the instance:
ALTER TABLE <test_table> RESET (<autovaccum_options>);
Example
Restore the ratio threshold for triggering the auto-vacuum feature to the default value.
ALTER TABLE test RESET (autovacuum_vacuum_scale_factor);
Table-level parameters that can be modified
Parameter | Description |
autovacuum_vacuum_scale_factor | The data size ratio threshold for triggering the auto-vacuum feature. |
autovacuum_vacuum_threshold | The minimum number of updated or deleted rows for triggering the auto-vacuum feature. |