After table data is updated or deleted, it is not physically deleted but is marked as invisible. Such invisible data slows down data scans because it takes up storage. Therefore, you must reclaim the storage on a regular basis.
Reclaim the storage
VACUUM
statement to rearrange the data in a table and reclaim the storage for better data
read performance. VACUUM [FULL] [FREEZE] [VERBOSE] [table];
A VACUUM statement organizes data within a data page, whereas a VACUUM FULL statement organizes data across data pages. A VACUUM operation runs faster. A VACUUM FULL operation reclaims more storage but requests exclusive locks. We recommend that you perform a VACUUM operation on system tables once a week.
Usage notes
VACUUM:
- VACUUM statements cannot reclaim all storage of a table.
- For tables whose data is frequently updated in real time, you must execute a VACUUM statement once a day.
- If data is batch updated once a day, you can execute the statement once after the batch update is complete.
- The system does not lock tables on which VACUUM statements are being executed. You can read data from or write data to the tables while the statements are being executed. However, VACUUM statements increase CPU utilization and I/O usage, which may affect query performance.
VACUUM FULL:
- VACUUM FULL statements rearrange all data of a table to reclaim the storage occupied by invisible data. For tables that have a large number of data updates, we recommend that you execute VACUUM FULL statements at your earliest opportunity.
- VACUUM FULL statements must be executed at least once a week. If the majority of your data is updated every day, you must execute a VACUUM FULL statement once a day.
- The system locks tables on which VACUUM FULL statements are being executed. You cannot read data from or write data to the tables while the statements are being executed. VACUUM FULL statements also increase CPU utilization and I/O usage. We recommend that you execute VACUUM FULL statements in the maintenance window.
Query the tables for which VACUUM statements are required
AnalyticDB for PostgreSQL provides a gp_bloat_diag view to measure the ratio of actual pages to expected pages. After you use an ANALYZE statement to collect statistics on tables, execute the following statement to check this view:
gpadmin=# SELECT * FROM gp_toolkit.gp_bloat_diag;
bdirelid | bdinspname | bdirelname | bdirelpages | bdiexppages | bdidiag
----------+------------+------------+-------------+-------------+---------------------------------------
21488 | public | t1 | 97 | 1 | significant amount of bloat suspected
(1 row)
The results include only tables that have moderate or significant data bloat. If the ratio of actual pages to expected pages is greater than 4 and less than 10, moderate data bloat is reported. If the ratio is greater than 10, significant data bloat is reported. For these tables, we recommend that you use VACUUM FULL statements to reclaim storage.
Use VACUUM FREEZE
Each transaction executed by AnalyticDB for PostgreSQL has a unique transaction ID (XID). XIDs increase monotonically and have an upper limit of two billion.
WARNING: database "database_name" must be vacuumed within number_of_transactions transactions
In this case, you can manually execute a VACUUM FREEZE statement on the current database to decrease the XID and ensure that the XID does not exceed the limit.
FATAL: database is not accepting commands to avoid wraparound data loss in database "database_name"
In this case, you must submit a ticket to contact Alibaba Cloud technical support.