After data in a table is deleted or updated, the physical storage layer does not delete the data but marks it as invisible. In this situation, a number of holes are left in data pages. When the system reads data, these holes are loaded with the data pages. This slows down data scans. Therefore, you must reclaim space regularly.
How to reclaim space
VACUUMstatement to rearrange the data in a table and reclaim disk space for better data read performance. The statement syntax is as follows:
VACUUM [FULL] [FREEZE] [VERBOSE] [table];
A VACUUM statement organizes data on data pages, whereas a VACUUM FULL statement moves data across data pages. A VACUUM operation runs faster. A VACUUM FULL operation reclaims more space but requests exclusive locks. We recommend that you perform a VACUUM operation on system tables once a week.
- VACUUM statements cannot reclaim all space of a table.
- For tables whose data is frequently updated in real time, execute a VACUUM statement once a day.
- If data is updated in batches once a day, you can execute the statement once after the batch update is complete.
- The system does not lock tables on which VACUUM statements are being executed. You can read data from the tables or write data into them while the statements are executed. However, VACUUM statements increase CPU utilization and I/O usage, which may affect query performance.
- VACUUM FULL statements reclaim space by rearranging data in tables and can reclaim all space occupied by holes. For tables with most data updates, we recommend that you execute VACUUM FULL statements as soon as possible.
- VACUUM FULL statements must be executed at least once a week. If the majority of your data is updated every day, execute a VACUUM FULL statement once a day.
- The system locks tables on which VACUUM FULL statements are being executed. You can neither read data from the tables nor write data into them while the statements are executed. VACUUM FULL statements also increase CPU utilization and I/O usage. We recommend that you execute VACUUM FULL statements in the maintenance window.
Query the tables for which VACUUM statements are required
AnalyticDB for PostgreSQL provides a gp_bloat_diag view to measure the ratio of actual pages to expected pages. After you use an ANALYZE statement to collect statistics on tables, execute the following statement to check this view:
gpadmin=# SELECT * FROM gp_toolkit.gp_bloat_diag; bdirelid | bdinspname | bdirelname | bdirelpages | bdiexppages | bdidiag ----------+------------+------------+-------------+-------------+--------------------------------------- 21488 | public | t1 | 97 | 1 | significant amount of bloat suspected (1 row)
The results only include tables with moderate or significant data bloat. If the ratio of actual pages to expected pages is greater than 4 but less than 10, moderate data bloat is reported. If the ratio is greater than 10, significant data bloat is reported. For these tables, we recommend that you use VACUUM FULL statements to reclaim space.
When to use VACUUM FREEZE
Each transaction executed by AnalyticDB for PostgreSQL has a unique transaction ID (XID). XIDs monotonically increase, with an upper limit of two billion.
WARNING: database "database_name" must be vacuumed within number_of_transactions transactions
In this situation, you can manually execute a VACUUM FREEZE statement on the current database to reduce the XID.
FATAL: database is not accepting commands to avoid wraparound data loss in database "database_name"
In this situation, you must submit a ticket to contact Alibaba Cloud technical support and resolve the issue.