All Products
Search
Document Center

ApsaraDB RDS:Use the pg_squeeze extension to shrink bloated tables and indexes

Last Updated:Mar 28, 2026

The pg_squeeze extension reclaims disk space from bloated tables and indexes in ApsaraDB RDS for PostgreSQL without blocking read or write operations.

Prerequisites

Before you begin, ensure that you have:

  • An RDS instance running minor engine version 20240229 or later. For PostgreSQL 17, the minor engine version must be 20241030 or later. To check and update your version, see Update the minor engine version.

  • The following parameters configured on your RDS instance. For details, see Modify the parameters of an ApsaraDB RDS for PostgreSQL instance:

    • wal_level set to logical

    • pg_squeeze added to shared_preload_libraries

  • Enough free disk space. The extension requires at least twice the combined size of the table and its indexes. For example, if a table and its indexes total 1 GB, you need at least 2 GB of additional free space.

How it works

PostgreSQL uses Multi-Version Concurrency Control (MVCC) to handle concurrent transactions. When a row is updated or deleted, PostgreSQL marks the old row version as a dead tuple rather than overwriting it. Dead tuples are invisible to new transactions but continue to occupy disk space. Over time, this accumulates as table bloat.

autovacuum can clean up dead tuples, but it does not fully eliminate table bloat because the space previously occupied by the dead tuples is not immediately reclaimed. VACUUM FULL does reclaim space, but it requires an Access Exclusive Lock, making the table inaccessible for the entire duration.

pg_squeeze works differently: it rebuilds a table by creating a new copy of the original table, which eliminates the dead tuples. The table remains available for reads and writes throughout this process. You can also execute SQL statements directly without the need to use a client.

Detect table bloat

Tables with frequent UPDATE or DELETE operations are most prone to bloat. Use the following SQL to estimate the bloat rate for a specific table:

CREATE EXTENSION pgstattuple;
SELECT *, 1.0 - tuple_len::numeric / table_len AS bloat FROM pgstattuple('your_relation');

The bloat rate is the proportion of unused space in the table:

bloat rate = 1 - (live tuple space / total table space)
Note

This query performs a full table scan. Run it during off-peak hours on large tables.

Limitations

  • The table must have a unique key.

  • A large amount of I/O resources are consumed during the cleanup process. We recommend that you clean up the table during off-peak hours.

  • To clean up tables across multiple databases, create the extension separately in each database.

Set up pg_squeeze

Create the extension in the target database:

CREATE EXTENSION pg_squeeze;

To remove the extension:

DROP EXTENSION pg_squeeze;

Shrink bloated tables

One-time table squeeze

Run the following statement to immediately squeeze a table. This example cleans up the bar table in the public schema:

SELECT squeeze.squeeze_table('public', 'bar');

Scheduled automatic squeeze

For tables that accumulate bloat regularly, set up a scheduled cleanup task.

Step 1: Register the table and schedule.

The following example schedules the foo table in the public schema to be squeezed every Wednesday and Friday at 22:30:

INSERT INTO squeeze.tables (tabschema, tabname, schedule)
VALUES ('public', 'foo', ('{30}', '{22}', NULL, NULL, '{3, 5}'));

For the full schedule syntax, see the pg_squeeze documentation.

Step 2: Start the worker.

Choose one of the following methods to start the background worker:

  • Start manually with SQL:

    SELECT squeeze.start_worker();  -- Start
    SELECT squeeze.stop_worker();   -- Stop
  • Start automatically on database restart: Set the following parameters in the ApsaraDB RDS console, then restart the database. For details, see Modify the parameters of an ApsaraDB RDS for PostgreSQL instance.

    Note

    rds_superuser must be a privileged account. Create the pg_squeeze extension in each listed database before the worker starts. For details, see Set up pg_squeeze.

    squeeze.worker_autostart = 'database1 database2'
    squeeze.worker_role = rds_superuser

References