All Products
Search
Document Center

ApsaraDB RDS:Use the TPC feature

Last Updated:Dec 15, 2023

If you want to reduce the storage usage and improve the read performance of an ApsaraDB RDS for PostgreSQL instance without the need to change the instance type, you can use the transparent page compression (TPC) feature. The feature compresses and decompresses page data in the buffer pool in real time. This effectively reduces storage costs and improves I/O throughput. However, the feature deteriorates CPU performance.

Prerequisites

The RDS instance runs PostgreSQL 14 or later. If the TPC feature is still not supported, you must update the minor engine version of the RDS instance. For more information, see Update the minor engine version.

Feature description

The TPC feature compresses and decompresses page data in the buffer pool in real time. Data is automatically compressed when the data is written to a disk and decompressed when the data is read from the disk. The compression and decompression processes are imperceptible to users.

The TPC feature is developed to reduce the storage usage and improve the read performance of your RDS instance. The feature reduces the disk I/O and the storage usage, improves cache efficiency, and accelerates data transmission by compressing data.

Scenarios

The CPU utilization is less than 50%. The IOPS or I/O throughput frequently reaches a bottleneck.

Benefits

  • Storage costs are reduced by approximately 50% on average.

  • The I/O usage is reduced by approximately 50% on average.

  • In read scenarios, transactions per second (TPS) is increased. In some read scenarios in which the I/O throughout reaches the upper limit, TPS is increased by up to 100%.

Impacts

  • The feature increases the CPU utilization of your RDS instance. The CPU resources that are used for data compression are increased by approximately 260% and the CPU resources that are used for data decompression are increased by approximately 40%.

  • In write scenarios, TPS is decreased.

  • If the transparent data encryption (TDE) feature is also enabled for your RDS instance, the compression results of the TPC feature is affected.

  • The TPC feature performs poor on TOAST data.

Procedure

  1. Create a tablespace for compression to use the TPC feature because the feature depends on tablespaces.

    CREATE TABLESPACE rds_compress LOCATION '/data/postgresql/rds_compress' WITH(COMPRESSION='zstd');
    Important

    Do not modify the name, path, and compression algorithm of the tablespace that is created in this step.

  2. Create a table or an index for compression.

    • When you create or modify a table and an index, specify the tablespace for compression to use the TPC feature.

      • Table compression

        -- When you create a table, specify the tablespace for compression to use the TPC feature.
        CREATE TABLE <tablename> ... TABLESPACE rds_compress;
        
        -- When you modify a table, change the tablespace of the table to the tablespace for compression.
        ALTER TABLE <tablename> SET TABLESPACE rds_compress;
      • Index compression

        -- When you create an index, specify the tablespace for compression to use the TPC feature.
        CREATE INDEX <indexname> ... TABLESPACE rds_compress;
        
        -- When you modify an index, change the tablespace of the index to the tablespace for compression.
        ALTER INDEX <indexname> SET TABLESPACE rds_compress;
    • Change the default tablespace to the tablespace for compression. This way, the TPC feature is used for new tables and indexes by default.

      -- Set the default tablespace to the tablespace for compression.
      SET default_tablespace TO 'rds_compress';
      
      -- When you create a table or an index, you do not need to specify a tablespace. By default, the TPC feature is used. 
      CREATE TABLE <tablename> ...;
      CREATE INDEX <indexname> ...;

Related queries

  • Check whether the table or index is created in the tablespace for compression.

    1. In the psql command line, enter \d+ <Table name> to query the details of a table.

      If the table and index are created in the tablespace for compression, you can use the TPC feature. The following output shows the schema of the sysbench table.

                                                                 Table "public.sbtest1"
       Column |      Type      | Collation | Nullable |               Default               | Storage  | Compression | Stats target | Description 
      --------+----------------+-----------+----------+-------------------------------------+----------+-------------+--------------+-------------
       id     | integer        |           | not null | nextval('sbtest1_id_seq'::regclass) | plain    |             |              | 
       k      | integer        |           | not null | 0                                   | plain    |             |              | 
       c      | character(120) |           | not null | ''::bpchar                          | extended |             |              | 
       pad    | character(60)  |           | not null | ''::bpchar                          | extended |             |              | 
      Indexes:
          "sbtest1_pkey" PRIMARY KEY, btree (id), tablespace "rds_compress"
          "k_1" btree (k), tablespace "rds_compress"
      Tablespace: "rds_compress"
      Access method: heap
  • Check whether data is compressed.

    You can use functions such as pg_database_size, pg_tablespace_size, pg_relation_size, pg_table_size, pg_index_size, and pg_total_relation_size to view the size of data in real time.

    Note

    You can use sysbench to separately insert the same amount of test data into a compressed table and a non-compressed table to check the disk usage of the tables. The disk usage of the compressed table is approximately 50% of the disk usage of the non-compressed table.

  • Calculate the compression ratio of a table.

    SELECT pg_relation_size('<tablename>')::float / (relpages * 8192) from pg_class WHERE relname = '<tablename>';
    Note

    In the preceding statement, the pg_relation_size('<tablename>') function is called to query the disk usage of a table in bytes from the pg_class system directory and convert the obtained value into a floating-point number. Then, the floating-point number is divided by the value of (relpages × 8192) to obtain the average size of each page in bytes. relpages specifies the total number of pages in the table. 8192 is the default page size in PostgreSQL, which is 8 KB in most cases. The result of the query indicates the average compression ratio for the table. A small value indicates a high compression ratio.

FAQ

After I enable the TPC feature for my RDS instance, do pg_dump and pg_basebackup run as expected?

Yes, pg_dump and pg_basebackup run as expected. If you want to restore backup data that is generated by using pg_basebackup to another RDS instance, you must enable the TPC feature for the destination RDS instance.