This topic describes how to clear the tablespaces of an ApsaraDB RDS for PostgreSQL instance by using the pg_repack plug-in. If you perform a large number of operations, such as UPDATE, on an entire table, a table bloat may occur. In this case, you can use this plug-in to remove the table bloat. When you use this plug-in to handle a table bloat issue, this plug-in does not need to acquire an exclusive lock on the bloated table. This plug-in is more lightweight than the CLUSTER and VACUUM FULL statements.

Prerequisites

The RDS instance must meet the following requirements:
  • The RDS instance runs PostgreSQL 10 or a later version.
  • The RDS instance runs AliPG 20210331 or a later version.

Precautions

  • The pg_repack plug-in requires additional storage. A full-table repack requires the amount of free storage to be at least twice the size of the table that you want to repack.
  • The pg_repack plug-in cannot remove bloats from temporary tables.
  • The pg_repack plug-in cannot remove bloats from GiST indexes.
  • When a table is being repacked by the pg_repack plug-in, you cannot execute DDL statements on the table. The pg_repack plug-in holds an ACCESS SHARE lock on the bloated table to prohibit the execution of DDL statements on the bloated table.
  • The pg_repack plug-in consumes a large number of disk I/O resources to create tables and indexes. Before you use the pg_repack plug-in, you must evaluate whether the repack operation can interrupt your workloads. For example, if your RDS instance is equipped with enhanced SSDs (ESSDs) of PL1 and you want to repack a 100-GB table, the IOPS of the RDS instance can reach the maximum IOPS, which is 250 MB/s.

Features

The pg_repack plug-in supports full-table repack and index repack.
  • The following procedure shows how the pg_repack plug-in repacks an entire table:
    1. Creates a logging table. The logging table is used to record the changes that are made to the original table during the repack process.
    2. Creates triggers on the original table. The triggers are used to record the INSERT, UPDATE, and DELETE statements that are executed on the original table and insert the generated log records into the logging table during the repack process.
    3. Creates a table. The new table contains the same rows and columns as the original table.
    4. Creates indexes in the new table.
    5. Applies the data changes in the logging table to the new table.
    6. Switches the original and new tables in the system catalog.
    7. Deletes the original table.
    Note The pg_repack plug-in holds an ACCESS EXCLUSIVE lock on the original table to prohibit operations on the original table in Steps i, ii, vi, and vii. In the other steps, the plug-in holds an ACCESS SHARE lock on the original table. The lock does not prohibit the execution of INSERT, UPDATE, and DELETE statements on the original table.
  • The following procedure shows how the pg_repack plug-in repacks the indexes on a table:
    1. Concurrently creates indexes.
    2. Switches the original and new indexes in the system catalog.
    3. Deletes the original indexes.

Enable or disable the pg_repack plug-in

  • Enable the pg_repack plug-in.
    CREATE EXTENSION pg_repack;
  • Disable the pg_repack plug-in.
    DROP EXTENSION pg_repack;

Install the client utility of the pg_repack plug-in

You must install the client utility of the pg_repack plug-in. If your Elastic Compute Service (ECS) instance runs Alibaba Cloud Linux 3.2104 or a later version, run the following command to install the client utility:
  1. Run the following command to install environment dependencies:
    yum install postgresql* redhat-rpm-config libpq* openssl-devel readline-devel -y
  2. Run the following command to add environment variables:
    export PATH=$PATH:/usr/lib64/pgsql/postgresql-12/bin
  3. Run the following commands to download the client utility, compile the client utility, and then install the client utility:
    wget https://github.com/reorg/pg_repack/archive/refs/tags/ver_1.4.6.tar.gz
    tar zxvf ver_1.4.6.tar.gz
    cd pg_repack-ver_1.4.6
    make && make install

Example

-- Check the pg_repack plug-in but do not perform the repack operation: --dry-run
$pg_repack --dry-run --no-superuser-check --echo --no-order -h endpoint-p port-d database1 -U user --table schema1.table1

-- Check the pg_repack plug-in and perform the repack operation:
$pg_repack --no-superuser-check --echo --no-order -h endpoint-p port-d database1 -U user --table schema1.table1

FAQ

What do I do if the "ERROR: pg_repack failed with error: You must be a superuser to use pg_repack" message is returned?

Use the -k or --no-superuser-check option to skip the superuser permission check. This way, you can prevent this type of permission error.

References

For more information about the pg_repack plug-in, see Reorganize tables in PostgreSQL databases with minimal locks.