
AnalyticDB for PostgreSQL:Tiered storage of hot and cold data

Last Updated:Dec 14, 2023

AnalyticDB for PostgreSQL supports tiered storage of hot and cold data. You can move infrequently accessed hot tables to cold storage, where the data is stored in Object Storage Service (OSS), to reduce storage costs. This topic describes the usage limits and methods for tiered storage of hot and cold data.

Note

In this topic, data tables that are stored on local disks are called hot tables, and data tables that are stored in OSS are called cold tables.

Version limits

  • Tiered storage of hot and cold data is supported for AnalyticDB for PostgreSQL V6.0 instances in elastic storage mode whose minor version is V6.3.11.1 or later.

  • Tiered storage of hot and cold data is supported for AnalyticDB for PostgreSQL V7.0 instances in elastic storage mode whose minor version is V7.0.3.0 or later.

Note
  • For information about how to view the minor version, see View the minor engine version.

  • Tiered storage of hot and cold data is not supported for AnalyticDB for PostgreSQL instances in Serverless mode.

Usage limits

The usage limits on tiered storage of hot and cold data differ between AnalyticDB for PostgreSQL V6.0 and AnalyticDB for PostgreSQL V7.0 instances.

AnalyticDB for PostgreSQL V6.0 instances

  • You can move a single hot table to cold storage.

  • You can move specific partitions of a partitioned table to cold storage, but you cannot move an entire parent partitioned table to cold storage.

  • When you move hot tables or hot partitions to cold storage, the system creates temporary tables to store data. After you move the data to cold storage, the temporary tables are automatically deleted. Make sure that the remaining disk space on each compute node is greater than the total disk space that is occupied by the hot tables that you want to move to cold storage.

  • After you move a hot table to cold storage, the primary keys, indexes, sequences, rules, and comments that are associated with the original table are automatically deleted and cannot be recovered.

  • You cannot move cold tables or cold partitions to hot storage. You can use the CREATE TABLE AS SELECT statement to store cold data on local disks.

  • Data in cold tables or cold partitions is in the read-only state. You cannot perform write, delete, update, or DDL operations such as ALTER COLUMN and DROP COLUMN on cold data. You can use the DROP TABLE statement to delete cold tables.

  • Cold tables are stored in a compressed format similar to append-optimized column-oriented (AOCO) tables. The disk space that is occupied by cold tables is equal to the disk space that is occupied by the original hot tables.

  • You cannot use indexes for cold tables.
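
As noted above, cold tables cannot be moved back to hot storage directly; the only way to bring cold data back to local disks is to copy it into a new table. The following is a minimal sketch that assumes a cold table named tiered_storage_heap; the name of the new table is illustrative:

```
-- Copy the cold data into a new hot table on local disks.
CREATE TABLE tiered_storage_heap_local AS SELECT * FROM tiered_storage_heap;
-- After you verify the copy, optionally delete the cold table.
DROP TABLE tiered_storage_heap;
```

Because the primary keys and indexes of the original table were deleted when the table was moved to cold storage, recreate them on the new table if your queries require them.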

AnalyticDB for PostgreSQL V7.0 instances

  • You can move a single hot table to cold storage.

  • You can move specific partitions of a partitioned table to cold storage, but you cannot move an entire parent partitioned table to cold storage.

  • After you move a hot table to cold storage, the primary keys, indexes, sequences, rules, and comments that are associated with the original table are automatically deleted and cannot be recovered.

  • You cannot move cold tables or cold partitions to hot storage. You can use the CREATE TABLE AS SELECT statement to store cold data on local disks.

  • You can read data from or write data to cold tables or cold partitions. You cannot perform delete, update, or DDL operations such as ALTER COLUMN and DROP COLUMN on cold data. You can use the DROP TABLE statement to delete cold tables.

  • You cannot move partitions of a partitioned table that has a primary key or unique index to cold storage. You can move partitions of a partitioned table that has a non-unique index to cold storage.

  • The disk space that is occupied by cold tables is equal to or slightly less than the disk space that is occupied by the original hot tables.

  • You cannot use indexes for cold tables.
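
The preceding read and write rules can be illustrated with the tiered_storage_heap_oss table that is used later in this topic. This sketch assumes that the table has already been moved to cold storage:

```
-- Reads and inserts are allowed on cold tables of V7.0 instances.
SELECT count(*) FROM tiered_storage_heap_oss;
INSERT INTO tiered_storage_heap_oss VALUES (1001, 1);
-- Update, delete, and DDL operations on cold data are rejected.
-- UPDATE tiered_storage_heap_oss SET b = 2 WHERE a = 1001;   -- not supported
-- DELETE FROM tiered_storage_heap_oss WHERE a = 1001;        -- not supported
-- Deleting the entire cold table is allowed.
DROP TABLE tiered_storage_heap_oss;
```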

Billing rules for cold storage

After you move hot tables to cold storage, cold tables are stored in OSS and you are charged storage fees. The following billing rules apply:

  • You are charged for cold storage based on the pay-as-you-go billing method.

  • Cold storage usage is collected at 5-minute intervals and billed on an hourly basis.

  • You are charged fees based on the standard storage prices of OSS. For more information, see OSS pricing.

    For example, in a region in the Chinese mainland, the cost of OSS storage is USD 0.017 per GB-month, which is equivalent to USD 0.0000236111 per GB-hour (0.017 divided by the 720 hours in a 30-day month).

You can go to the Billing Management console, click Bill Details in the left-side navigation pane, and then click the Billing Details tab to view the billing details of cold storage.

Usage methods

Important
  • When you move data to cold storage, the system creates temporary tables, writes the data to the temporary tables, and then uploads the data to OSS. During this process, local and network I/O workloads occur, which may affect the performance of running queries. To ensure that your business runs as expected, move data to cold storage during off-peak hours.

  • After you move hot tables to cold storage, the local disk space that is occupied by the original hot tables is released.

  • For AnalyticDB for PostgreSQL V6.0 instances, the system moves hot tables to cold storage at a scheduled time. A specific period of time is required for scheduling and queuing. For AnalyticDB for PostgreSQL V7.0 instances, the system immediately moves hot tables to cold storage after you execute a statement. The amount of time that is required for cold storage varies based on the instance specifications, number of tables, and data volumes. For more information, see the "Performance data" section of this topic.

The usage methods for tiered storage of hot and cold data differ between AnalyticDB for PostgreSQL V6.0 and AnalyticDB for PostgreSQL V7.0 instances.

AnalyticDB for PostgreSQL V6.0 instances

Move an entire standard table to cold storage

Syntax

ALTER TABLE <tableName> SET ttl interval '<scheduling_interval>' move to storage_cold;

Examples

Create a standard table named tiered_storage_heap and insert data into the table.

CREATE TABLE tiered_storage_heap (a int, b int);
INSERT INTO tiered_storage_heap SELECT random() * 1000,1 FROM generate_series(1,1000);
  • Example 1: Move the tiered_storage_heap table to cold storage at a scheduled time after three days. For example, if you execute the ALTER TABLE statement at 09:00:00 on July 17, 2023, the system moves the tiered_storage_heap table to cold storage at 09:00:00 on July 20, 2023.

    ALTER TABLE tiered_storage_heap SET ttl interval '3days' move to storage_cold;
  • Example 2: Move the tiered_storage_heap table to cold storage at 16:53:58 on July 28, 2023.

    ALTER TABLE tiered_storage_heap SET ttl '2023-07-28 16:53:58'::Timestamp move to storage_cold;
  • Example 3: Immediately move the tiered_storage_heap table to cold storage by specifying an earlier scheduled time.

    • Set the scheduled time to three days ago. Sample statement:

      ALTER TABLE tiered_storage_heap SET ttl interval '-3days' move to storage_cold;
    • Set the scheduled time to an earlier point in time. For example, the current time is 16:53:58 on July 17, 2023. Sample statement:

      ALTER TABLE tiered_storage_heap SET ttl '2022-07-16 16:53:58'::Timestamp move to storage_cold;

Move a child partitioned table to cold storage

Syntax

ALTER TABLE <Name of the child partitioned table> SET ttl interval '<scheduling_interval>' move to storage_cold;

You can run the \d+ command on the psql client to query the names of child partitioned tables.

Examples

Create a partitioned table named tiered_storage_partition_hdfs.

CREATE TABLE tiered_storage_partition_hdfs(a int,b int) distributed by (a) partition by range(a) (start(1) end(20) every(10));
Note

In this example, two child partitioned tables named tiered_storage_partition_hdfs_1_prt_1 and tiered_storage_partition_hdfs_1_prt_2 are created.

Insert data into the tiered_storage_partition_hdfs_1_prt_1 table.

INSERT INTO tiered_storage_partition_hdfs_1_prt_1 values(1, 1), (2, 2), (3, 3), (4, 4);

Move the tiered_storage_partition_hdfs_1_prt_1 table to cold storage at a scheduled time after three days. For example, if you execute the ALTER TABLE statement at 09:00:00 on July 17, 2023, the system moves the tiered_storage_partition_hdfs_1_prt_1 table to cold storage at 09:00:00 on July 20, 2023. The other child partitioned table remains at the original location.

ALTER TABLE tiered_storage_partition_hdfs_1_prt_1 SET ttl interval '3days' move to storage_cold;

Query the hot or cold storage status of a table

You can use one of the following methods to check whether a table is stored in hot storage or cold storage. If the table is stored in cold storage, cold is returned. If the table is stored in hot storage, hot is returned.

  • Method 1:

    SELECT pg_tiered_storage_relation_status('<table_name>'::regclass::oid::bigint);
  • Method 2:

    SELECT pg_tiered_storage_relation_status(<OID of the table>::bigint);

    To query the OID of the table, execute the SELECT oid FROM pg_class where relname='<table_name>'; statement.
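
The two methods can also be combined into a single query that looks up the OID and checks the storage status in one step. This is a sketch based on the statements above:

```
SELECT relname,
       pg_tiered_storage_relation_status(oid::bigint) AS storage_status
FROM pg_class
WHERE relname = 'tiered_storage_heap';
```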

AnalyticDB for PostgreSQL V7.0 instances

Move an entire standard table to cold storage

Syntax

SELECT pg_tiered_storage_move_table_to_storage_cold('<schema_name>', '<table_name>');

Examples

Create a standard table named tiered_storage_heap_oss in the public schema and insert data into the table.

CREATE TABLE tiered_storage_heap_oss (a int, b int) DISTRIBUTED BY (a);
INSERT INTO tiered_storage_heap_oss SELECT random() * 1000, 1 FROM generate_series(1,100);
  • Example 1: Immediately move an entire table to cold storage.

    Execute the following statement to immediately move the entire standard table to cold storage:

    SELECT pg_tiered_storage_move_table_to_storage_cold('public', 'tiered_storage_heap_oss');
  • Example 2: Use the pg_cron extension to configure a scheduled job for moving an entire table to cold storage.

    For example, you want to use the etl_user database account to move the tiered_storage_heap_oss table of the etl database to cold storage at 01:00:00 every day. Connect to the postgres database and execute the following statement:

    SELECT cron.schedule('etl_table_transfer_to_cold', '0 1 * * *', 'SELECT pg_tiered_storage_move_table_to_storage_cold(''public'', ''tiered_storage_heap_oss'');', 'etl', 'etl_user');

    After you make sure that the table is moved to cold storage, you can execute the following statement to delete the scheduled job:

    SELECT cron.unschedule(<Job ID>);
    Note

    The ID of a scheduled job is automatically generated when you create the scheduled job. You can obtain the job ID from the jobid field of the cron.job table.
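
For example, you can look up the job ID by the job name that is used above and then delete the job. The returned ID varies by instance:

```
-- Query the ID of the scheduled job.
SELECT jobid FROM cron.job WHERE jobname = 'etl_table_transfer_to_cold';
-- Delete the scheduled job. Replace 1 with the returned job ID.
SELECT cron.unschedule(1);
```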

Move a child partitioned table to cold storage

Syntax

SELECT pg_tiered_storage_move_table_to_storage_cold('<schema_name>', '<Name of the child partitioned table>');

You can run the \d+ command on the psql client to query the names of child partitioned tables.

Examples

  • Example 1: Immediately move a child partitioned table to cold storage.

    Create a partitioned table named tiered_storage_partition_oss in the public schema.

    CREATE TABLE tiered_storage_partition_oss(a int,b int) DISTRIBUTED BY (a) PARTITION BY range(a) (start(1) end(20) every(10));
    Note

    In this example, two child partitioned tables named tiered_storage_partition_oss_1_prt_1 and tiered_storage_partition_oss_1_prt_2 are created.

    Insert data into the tiered_storage_partition_oss_1_prt_1 table.

    INSERT INTO tiered_storage_partition_oss_1_prt_1 VALUES(1, 1), (2, 2), (3, 3), (4, 4);

    Immediately move the child partitioned table to cold storage.

    SELECT pg_tiered_storage_move_table_to_storage_cold('public', 'tiered_storage_partition_oss_1_prt_1');
  • Example 2: Use the pg_cron extension to move a child partitioned table that is partitioned by day to cold storage at a scheduled time.

    Create a daily partitioned table named daily_log_details in the etl database.

    CREATE TABLE daily_log_details (id INT, log_message text, created_date character varying(64))
    PARTITION BY LIST (created_date)
    (
       PARTITION p20230601 VALUES ('20230601'),
       PARTITION p20230602 VALUES ('20230602'),
       PARTITION p20230603 VALUES ('20230603'),
       PARTITION p20230604 VALUES ('20230604'),
       PARTITION p20230605 VALUES ('20230605'),
       PARTITION p20230606 VALUES ('20230606'),
       PARTITION p20230607 VALUES ('20230607'),
       PARTITION p20230608 VALUES ('20230608'),
       PARTITION p20230609 VALUES ('20230609'),
       PARTITION p20230610 VALUES ('20230610'),
       PARTITION p20230611 VALUES ('20230611'),
       DEFAULT PARTITION others
    );

    Use the etl_user database account to move the child partitioned tables whose partition dates are at least 10 days earlier than the current date to cold storage at 03:00:00 every day.

    1. Create a cleanup function in the etl database.

      CREATE OR REPLACE FUNCTION pg_tiered_storage_move_partition_daily_table_to_cold_storage(schemaname text, tablename text) RETURNS void
      AS $$
      DECLARE
          fetch_overdue_partition_sql text;
          cold_storage_sql text;
          target record;
      BEGIN
          fetch_overdue_partition_sql := 'WITH targetpartitions AS
          (SELECT * FROM pg_partitions 
          WHERE tablename = $1
          AND schemaname = $2 
          AND partitionlevel = 1
          AND partitionisdefault = FALSE)
          SELECT partitiontablename FROM targetpartitions WHERE 
          to_date(substring(targetpartitions.partitionname FROM 2), ''YYYYMMDD'') <= current_date - INTERVAL ''10 days''';
          
          -- fetch overdue partitions
          FOR target IN EXECUTE fetch_overdue_partition_sql USING tablename, schemaname LOOP 
             cold_storage_sql := 'SELECT pg_tiered_storage_move_table_to_storage_cold($1::text, $2::text)';
             raise notice 'sql %', cold_storage_sql;
             EXECUTE cold_storage_sql USING schemaname, target.partitiontablename;
          END LOOP;
      END;
      $$
      LANGUAGE plpgsql;
    2. Connect to the postgres database and execute the following statement to move child partitioned tables to cold storage:

      SELECT cron.schedule('etl_daily_transfer_to_cold', '0 3 * * *', 'SELECT pg_tiered_storage_move_partition_daily_table_to_cold_storage(''public'', ''daily_log_details'');', 'etl', 'etl_user');
  • Example 3: Use the pg_cron extension to move a child partitioned table that is partitioned by month to cold storage at a scheduled time.

    Create a monthly partitioned table named month_log_details in the etl database.

    CREATE TABLE month_log_details (id INT, log_message text, created_date character varying(64))
    PARTITION BY LIST (created_date)
    (
       PARTITION p202306 VALUES ('202306'),
       PARTITION p202307 VALUES ('202307'),
       PARTITION p202308 VALUES ('202308'),
       PARTITION p202309 VALUES ('202309'),
       PARTITION p202310 VALUES ('202310'),
       DEFAULT PARTITION others
    );

    Use the etl_user database account to move the child partitioned tables whose partition months are at least three months earlier than the current date to cold storage at 05:00:00 on the first day of each month.

    1. Create a cleanup function in the etl database.

      CREATE OR REPLACE FUNCTION pg_tiered_storage_move_partition_table_to_cold_storage(schemaname text, tablename text) RETURNS void
      AS $$
      DECLARE
          fetch_overdue_partition_sql text;
          cold_storage_sql text;
          target record;
      BEGIN
          fetch_overdue_partition_sql := 'WITH targetpartitions AS
          (SELECT * FROM pg_partitions 
          WHERE tablename = $1
          AND schemaname = $2 
          AND partitionlevel = 1
          AND partitionisdefault = FALSE)
          SELECT partitiontablename FROM targetpartitions WHERE 
          to_date(substring(targetpartitions.partitionname FROM 2), ''YYYYMM'') <= current_date - INTERVAL ''3 months''';
          
          -- fetch overdue partitions
          FOR target IN EXECUTE fetch_overdue_partition_sql USING tablename, schemaname LOOP 
             cold_storage_sql := 'SELECT pg_tiered_storage_move_table_to_storage_cold($1::text, $2::text)';
             raise notice 'sql %', cold_storage_sql;
             EXECUTE cold_storage_sql USING schemaname, target.partitiontablename;
          END LOOP;
      END;
      $$
      LANGUAGE plpgsql;
    2. Connect to the postgres database and execute the following statement to move child partitioned tables to cold storage:

      SELECT cron.schedule('etl_month_transfer_to_cold', '0 5 1 * *', 'SELECT pg_tiered_storage_move_partition_table_to_cold_storage(''public'', ''month_log_details'');', 'etl', 'etl_user');

Query the hot or cold storage status of a table

Execute the following statement to query whether a table is stored in hot storage or cold storage. If the table is stored in cold storage, cold is returned. If the table is stored in hot storage, hot is returned.

SELECT pg_tiered_storage_table_status('<schema_name>', '<table_name>|<Name of the child partitioned table>');
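
For example, to check the status of the tables that are created earlier in this topic:

```
SELECT pg_tiered_storage_table_status('public', 'tiered_storage_heap_oss');
SELECT pg_tiered_storage_table_status('public', 'tiered_storage_partition_oss_1_prt_1');
```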

View the volumes of hot and cold data storage

Log on to the AnalyticDB for PostgreSQL console, go to the Basic Information page, and then view the Hot Storage and Cold Storage parameters in the Instance Status section.

Backup and restoration

Tiered storage of hot and cold data in AnalyticDB for PostgreSQL supports backup and restoration. The following rules apply:

AnalyticDB for PostgreSQL V6.0 instances

If the data of a table is fully backed up, you can restore the table to a specific point in time. The restored table is in the same hot or cold storage state as the original table at the backup time.

AnalyticDB for PostgreSQL V7.0 instances

If the data of a table is fully backed up, you can restore the table to a specific point in time. The restored table is in the same hot or cold storage state as the original table at the backup time. AnalyticDB for PostgreSQL V7.0 instances are subject to the following limits:

  • If no data is written after you move a table to cold storage, you can restore the table to a specific point in time.

  • If data is written after you move a table to cold storage, you can restore the table to a specific point in time before you move the table to cold storage. If you want to restore the table to the cold storage state, you can restore the table only to the state that corresponds to the last write.

To support backup and restoration, the system does not immediately release the OSS storage space that is occupied by a table after the table is deleted, and retains the space for a specific period of time. The retention period is the value of the Data Backup Retention Period (Days) parameter in the backup and restoration settings. During the retention period, you are charged for the occupied OSS storage space.

For more information, see Overview of backup and restoration.

Instance scaling

When an AnalyticDB for PostgreSQL V6.0 instance is scaled down, the system moves cold tables to local temporary tables for data redistribution. After the scale-down operation is complete, the system uploads data to OSS and deletes the local temporary tables. Make sure that the remaining disk space of all nodes after the scale-down operation is greater than the total disk space that is occupied by cold tables. During the scale-down operation, the system downloads data from OSS. The amount of time that is required for the scale-down operation varies based on the download bandwidth of OSS.

When an AnalyticDB for PostgreSQL V7.0 instance is scaled, cold tables are not moved to local temporary tables or redistributed. You do not need to consider the disk space that is occupied by cold tables.

Performance data

The following statements were executed to measure the amount of time that is required to move a single table to cold storage on a four-node instance and an eight-node instance, each of whose nodes has 2 cores and 8 GB of memory:

CREATE TABLE t333 (a int, b int);
INSERT INTO t333 SELECT random() * 1000000, random()*1000000 FROM generate_series(1,3000000000);

AnalyticDB for PostgreSQL V6.0: ALTER TABLE t333 SET ttl interval '-3days' move to storage_cold;

AnalyticDB for PostgreSQL V7.0: SELECT pg_tiered_storage_move_table_to_storage_cold('public', 't333');

The following table describes the test results.

Hot table size (GB) | V6.0, four nodes (seconds) | V7.0, four nodes (seconds) | V6.0, eight nodes (seconds) | V7.0, eight nodes (seconds)
1                   | 10                         | 5                          | 5                           | 2.8
10                  | 96                         | 48                         | 42                          | 25.2
100                 | 848                        | 490                        | 333                         | 243