Starting from version V1.3.37, Hologres supports tiered storage of hot and cold data to help you manage data tiers effectively. This topic describes how to use tiered storage.
Function introduction
Hologres offers two data storage types: Standard and Infrequent Access.
-
Standard storage uses all-SSD hot storage. It is the default storage type in Hologres and meets low-latency, high-performance data access requirements. For most scenarios, Standard storage is the most effective and cost-efficient option. In this document, it is referred to as hot storage.
-
Infrequent Access storage uses all-HDD cold storage. It provides low-cost storage for infrequently accessed data and is suitable for very large datasets where latency sensitivity is low or access is rare. In this document, it is referred to as cold storage.
-
For primary-secondary instances, we recommend upgrading to version V1.3.55 or later to enable cold storage.
Hologres also supports table-level tiered storage. You can dynamically set the storage medium for partition child tables using the CREATE PARTITION TABLE feature.
Starting from version V2.1, Hologres supports SSD-based cache acceleration to improve cold storage access speed. Cache space is allocated by default, and cache acceleration is enabled by default. Access performance is expected to improve by more than 2×.
Prerequisites
-
Tiered storage of hot and cold data is supported only in Hologres V1.3.37 and later. If your instance runs an earlier version, refer to Common upgrade preparation errors or join the Hologres DingTalk group for support. For more information, see How do I get more online support?.
-
For subscription instances running older versions, the Infrequent Access storage quota defaults to 0 GB after upgrading to V1.3.37.
Notes
-
Hologres table storage consists of Data and Meta. Data stores your user data, while Meta stores table metadata. In tiered storage, you can only set the storage medium for the Data portion. For performance reasons, Meta is always stored in hot storage. Therefore, even if you set a table’s storage policy to cold storage, it still consumes a small amount of hot storage and incurs minor hot storage fees.
-
To avoid impacting your services, hot-to-cold data migration tasks run at low priority. After you change the storage policy for table Data, the change does not take effect immediately. Hologres performs data migration asynchronously in the background. By default, migration starts at 00:00 (midnight) the same day. Starting from V2.2, you can customize the migration time. For details, see Dynamic partition management.
-
When new data is written to a cold storage table, it is first written to hot storage. A background task then asynchronously migrates it to cold storage, which incurs some hot storage fees.
-
Due to disk seek time limitations, we do not recommend using cold storage tables for point query scenarios (such as Flink dimension tables or serving scenarios). Cold storage tables support read throughput that is two orders of magnitude lower than hot storage tables.
-
We do not recommend setting row-oriented tables to cold storage.
-
During batch hot-to-cold data conversion, all table
tablets are opened, increasing memory usage. We recommend converting no more than 200 tables at once.
Create tiered storage tables
When creating a table, specify the data storage policy by setting the storage_mode parameter in SET_TABLE_PROPERTY. For details, see CREATE TABLE.
Non-partitioned tables
Set storage policy for non-partitioned tables
For non-partitioned tables, use SET_TABLE_PROPERTY to set the storage policy. Supported options are:
-
hot (hot storage).
-
cold (cold storage).
For example, to set table tbl1 to cold storage, use the following code:
-- Specify cold storage when creating the table
BEGIN;
CREATE TABLE tbl1 (
"id" int NOT NULL,
"name" text NOT NULL
);
CALL set_table_property('tbl1', 'storage_mode', 'cold');
COMMIT;
Change storage medium for non-partitioned tables
You can change the storage medium after table creation. For example, changing table tbl1 to hot storage triggers an asynchronous data migration to hot storage. Example:
-- Change storage medium to hot storage after table creation
CALL set_table_property('tbl1', 'storage_mode', 'hot');
Partitioned tables
Set storage policy for partitioned tables
Partitioned tables also use SET_TABLE_PROPERTY to set storage policies. Child partitions inherit the parent table’s storage property by default, but you can set storage policies separately for parent and child tables. Supported options are:
-
hot (hot storage).
-
cold (cold storage).
For example, setting the parent partition table (tbl2) to cold storage means all its child partitions (such as tbl2_v1) will use cold storage. Code example:
-- Specify cold storage when creating the table
BEGIN;
CREATE TABLE tbl2(
c1 text NOT NULL,
c2 text
)
PARTITION BY LIST(c2);
CALL set_table_property('tbl2', 'storage_mode', 'cold');
CREATE TABLE tbl2_v1 PARTITION OF tbl2 FOR VALUES IN ('v1');
COMMIT;
Modify storage for partitioned tables
You can change the storage medium after table creation. For example, changing the parent partition table tbl2 to hot storage triggers asynchronous migration of all its child partitions to hot storage. Example:
-- Change storage policy to hot storage after table creation
CALL set_table_property('tbl2', 'storage_mode', 'hot');
To set a specific child partition to cold storage, modify its storage_mode property. Example:
-- Create multiple child partitions
CREATE TABLE tbl2_v2 PARTITION OF tbl2 FOR VALUES IN ('v2');
CREATE TABLE tbl2_v3 PARTITION OF tbl2 FOR VALUES IN ('v3');
-- Query table properties; they currently inherit the parent's hot storage setting
SELECT * FROM hg_table_storage_status('public', 'tbl2');
-- Set a child partition to cold storage
CALL set_table_property('tbl2_v3', 'storage_mode', 'cold');
Dynamically manage storage medium for partitioned tables
To intelligently manage partition storage media, use dynamic partitioning to define hot-to-cold data transformation rules. This enables smart tiered storage that balances cost and performance effectively. For more information about dynamic partition management, see Dynamic partition management.
Query storage medium status
Call the hg_table_storage_status function to check a table’s storage status. hg_table_storage_status shows only data storage size, excluding Meta storage size. Usage:
SELECT * FROM hg_table_storage_status('<schema_name>', '<table_name>');
|
Parameter |
Description |
|
|
Schema name of the table |
|
|
Table name |
The following result is returned:
|
Column |
Content |
|
table_name |
|
|
hot_size |
Hot storage size in bytes |
|
cold_size |
Cold storage size in bytes |
|
status |
Status:
|
Examples:
-- Non-partitioned table
SELECT * FROM hg_table_storage_status('public', 'tbl1');-- Returns size in bytes
table_name | hot_size | cold_size | status
------------+----------+---------------+--------
tbl1 | 145643 | 3685 | transferring
-- Parent partition table
SELECT * FROM hg_table_storage_status('public', 'tbl2');-- Returns size in bytes
table_name | hot_size | cold_size | status
-----------------+----------+-----------+--------
tbl2_2022062222 | 0 | 0 | hot
tbl2_2022062221 | 1125 | 0 | hot
tbl2_2022062220 | 1245 | 0 | hot
tbl2_2022062219 | 1358 | 0 | hot
tbl2_2022062218 | 0 | 1875 | cold
tbl2_2022062217 | 0 | 1143 | cold
tbl2_2022062216 | 0 | 1299 | cold
Query table access frequency
Starting from V1.3.37, Hologres provides the system log table hologres.hg_table_info to collect daily statistics on tables within an instance. This helps you analyze table usage and optimize accordingly. For details, see View and analyze table statistics. You can query this table to obtain hot/cold storage volume, table access frequency, and partition access frequency to decide whether to perform hot-to-cold data conversion. Sample queries:
Non-partitioned tables
SELECT a.table_name,
(a.total_read_count - b.total_read_count) AS read_count,
(a.total_write_count - b.total_write_count) AS write_count,
a.hot_storage_size
FROM (SELECT * FROM hologres.hg_table_info
WHERE type='TABLE' AND collect_time::DATE = CURRENT_DATE - interval '1 day') a
JOIN
(SELECT * FROM hologres.hg_table_info
WHERE type='TABLE' AND collect_time::DATE = CURRENT_DATE - interval '${days} day') b
ON a.table_name = b.table_name
ORDER BY hot_storage_size DESC;
Partitioned tables
Query all partitions:
SELECT parent_table_name,count(*) AS partition_cnt,
sum(hot_storage_size)/1024/1024/1024 AS hot_size_gb
FROM hologres.hg_table_info
WHERE type = 'PARTITION' AND collect_time::DATE = CURRENT_DATE - interval '1 day'
GROUP BY parent_table_name
ORDER BY hot_size_gb DESC;
Query access frequency for specific partitions:
SELECT a.table_name,
(a.total_read_count - b.total_read_count) as read_count,
(a.total_write_count - b.total_write_count) as write_count,
a.hot_storage_size
FROM (SELECT * FROM hologres.hg_table_info
WHERE type = 'PARTITION'
AND parent_table_name = '${p_table_name}'
AND collect_time::DATE = CURRENT_DATE - interval '1 day') a
JOIN
(SELECT * FROM hologres.hg_table_info
WHERE type = 'PARTITION'
AND parent_table_name = '${p_table_name}'
AND collect_time::DATE = CURRENT_DATE - interval '${days} day') b
ON a.table_name = b.table_name
ORDER BY table_name DESC;