This topic describes how to create a dynamic table.
Limits
For information about the limits of dynamic tables, see Application scope and limits of dynamic tables.
Syntax
Table creation syntax
Syntax for creating a dynamic table:
CREATE DYNAMIC TABLE [IF NOT EXISTS] <schema.tablename>(
[col_name],
[col_name]
) [PARTITION BY LIST (col_name)]
WITH (
[refresh_mode='[full|incremental]',]
[auto_refresh_enable='[true|false',]
-- Parameters for incremental data refresh:
[incremental_auto_refresh_schd_start_time='[immediate|<timestamptz>]',]
[incremental_auto_refresh_interval='[<num> {minute|minutes|hour|hours]',]
[incremental_guc_hg_computing_resource='[ local | serverless]',]
[incremental_guc_hg_experimental_serverless_computing_required_cores='<num>',]
-- Parameters for full data refresh:
[full_auto_refresh_schd_start_time='[immediate|<timestamptz>]',]
[full_auto_refresh_interval='[<num> {minute|minutes|hour|hours]',]
[full_guc_hg_computing_resource='[ local | serverless]',]-- By default, the hg_full_refresh_computing_resource parameter is set to serverless. The parameter can be configured at the database level or left empty.
[full_guc_hg_experimental_serverless_computing_required_cores='<num>',]
-- Shared parameter. You can configure Grand Unified Configuration (GUC) parameters:
[refresh_guc_<guc>='xxx]',]
-- Common properties of the dynamic table:
[orientation = '[column | row | row,column]',]
[table_group = '[tableGroupName]',]
[distribution_key = 'columnName[,...]]',]
[clustering_key = '[columnName{:asc]} [,...]]',]
[event_time_column = '[columnName [,...]]',]
[bitmap_columns = '[columnName [,...]]',]
[dictionary_encoding_columns = '[columnName [,...]]',]
[time_to_live_in_seconds = '<non_negative_literal>',]
[storage_mode = '[hot | cold]']
)
AS
<query> -- The definition of the query.
Parameters
Parameters related to refresh modes and resources
Parameter category | Parameter | Description | Required | Default value |
Shared refresh parameters | refresh_mode | The refresh mode of data. Valid values: full and incremental. If you leave this parameter empty, data is not refreshed. | No | No default value |
auto_refresh_enable | Specifies whether to enable auto refresh. Valid values:
| No | false | |
refresh_guc_<guc> | The GUC parameters that are configured for the refresh operation. For information about the supported GUC parameters, see GUC parameters. | No | No default value | |
Incremental data refresh | incremental_auto_refresh_schd_start_time | The start time of the incremental data refresh. Valid values:
| No | immediate |
incremental_auto_refresh_interval | The interval at which incremental data is refreshed. Supported units: minute, minutes, hour, and hours.
| No | No default value | |
incremental_guc_hg_computing_resource | The computing resources for the incremental data refresh. Valid values:
Note You can execute the | No | serverless | |
incremental_guc_hg_experimental_serverless_computing_required_cores | The number of serverless computing resources required for the incremental data refresh. Note The serverless computing resources available for instances of different specifications are limited. For more information, see User guide on Serverless Computing. | No | No default value | |
Full data refresh | full_auto_refresh_schd_start_time | The start time of the full data refresh. Valid values:
| No | immediate |
full_auto_refresh_interval | The interval at which full data is refreshed. Supported units: minute, minutes, hour, and hours.
| No | No default value | |
full_guc_hg_computing_resource | The computing resources for the full data refresh. Valid values:
Note You can execute the | No | serverless | |
full_guc_hg_experimental_serverless_computing_required_cores | The number of serverless computing resources required for the refresh. Note The serverless computing resources available for instances of different specifications are limited. For more information, see User guide on Serverless Computing. | No | No default value |
Parameters for table properties
Parameter | Description | Required | Default value | |
full | incremental | |||
col_name | The name of a field in the dynamic table. You can explicitly specify the names of columns in a dynamic table. However, you cannot specify the properties and data types of the columns. The engine automatically infers the properties and data types of the columns. Note If you specify the properties and data types for columns in a dynamic table, the properties and data types of the columns inferred by the engine may become incorrect. | No | Name of the queried column | Name of the queried column |
orientation | The storage orientation of the dynamic table. Valid values:
| No | column | column |
table_group | The table group to which the dynamic table belongs. By default, the default table group of the current database is used. For more information, see User guide of table groups and shard counts. | No | Name of the default table group | Name of the default table group |
distribution_key | The distribution key of the dynamic table. For more information, see Distribution key. | No | No default value | No default value |
clustering_key | The clustering key of the dynamic table. For more information, see Clustering key. | No | You can set the parameter to a custom value. A default inferred value is available. | You can set the parameter to a custom value. A default inferred value is available. |
event_time_column | The event time columns of the dynamic table. For more information, see Event time column (segment key). | No | No default value | No default value |
bitmap_columns | The bitmap columns of the dynamic table. For more information, see Bitmap index. | No | Fields of the TEXT type | Fields of the TEXT type |
dictionary_encoding_columns | The dictionary encoding columns of the dynamic table. For more information, see Dictionary encoding. | No | Fields of the TEXT type | Fields of the TEXT type |
time_to_live_in_seconds | The lifecycle of data in the dynamic table. | No | Permanent | Permanent |
storage_mode | The storage mode of the dynamic table. Valid values:
Note For more information about storage modes, see Tiered storage of hot data and cold data. | No | hot | hot |
Query
A query can be generated for data in a dynamic table. The query types and base table types that are supported in a dynamic table vary based on the refresh mode of the dynamic table. For more information, see Application scope and limits of dynamic tables.
Full data refresh
If the refresh mode is full, all data is written to a dynamic table in a query. The full data refresh has the following strengths in contrast to the incremental data refresh:
More base table types are supported.
More query types and operator types are supported.
In full data refresh mode, more data is processed and more resources may be consumed. Therefore, this data refresh mode is suitable for scenarios such as periodic report check and periodic data backfill.
For more information about the full data refresh mode, see the Full data refresh section in this topic.
Incremental data refresh
If the refresh mode is incremental, data changes in a base table can be identified and incremental data in the base table is written to a dynamic table in a query. In incremental data refresh mode, less data is processed and data processing timeliness is higher. In actual use cases, if you want to query data at the minute level, we recommend that you use the incremental data refresh mode. However, you must take note of the following items when you use this mode for a dynamic table:
You must enable binary logging for the base table of the dynamic table. However, if you want to perform a JOIN operation on a dimension table, you do not need to enable binary logging for the dimension table.
After binary logging is enabled for a base table, a specific amount of storage resources will be consumed. For information about the storage resources used by binary logs, see the Query the storage details of a table section of the "Query the storage sizes of tables and databases" topic.
After incremental data refresh is enabled, the system generates a state table in the backend to record the intermediate aggregate results. For information about the technical principles of a state table, see the Incremental refresh section of the "Overview" topic. The state table stores intermediate aggregate data, which consumes a specific amount of storage resources. For more information about storage resource usage, see Manage dynamic tables.
The incremental data refresh mode of Hologres supports incremental data refresh and full-and-incremental data refresh to meet different business requirements. For more information, see the following content.
Incremental data refresh
After you enable binary logging for an existing base table and create a dynamic table in incremental data refresh mode for the base table, the existing data in the base table is not synchronized to the dynamic table. Only the new data is synchronized to the dynamic table. In this case, data consistency issues may occur.
Full-and-incremental data refresh
A dynamic table in incremental data refresh mode also supports full-and-incremental data consumption. This indicates that the full data of a base table is consumed first, and then the incremental data of the base table is consumed. The full-and-incremental data refresh feature is controlled by a GUC parameter:
-- Enable full-and-incremental data consumption.
SET hg_experimental_enable_hybrid_incremental_mode = true;
Examples:
-- Full-and-incremental data consumption for a dynamic table in incremental data refresh mode
-- Create a base table for which binary logging is not enabled.
CREATE TABLE base_sales(
day TEXT NOT NULL,
hour INT,
user_id BIGINT,
ts TIMESTAMPTZ,
amount FLOAT,
pk text NOT NULL PRIMARY key
);
-- Import data to the base table.
INSERT INTO base_sales VALUES ('2024-08-29',1,222222,'2024-08-29 16:41:19.141528+08',5,'ddd');
-- Enable binary logging for the base table.
ALTER TABLE base_sales SET (binlog_level = replica);
-- Import incremental data to the base table.
INSERT INTO base_sales VALUES ('2024-08-29',2,3333,'2024-08-29 17:44:19.141528+08',100,'aaaaa');
-- Create a dynamic table in incremental data refresh mode.
CREATE DYNAMIC TABLE sales_incremental
WITH (refresh_mode='incremental')
AS
SELECT day, hour, SUM(amount), COUNT(1)
FROM base_sales
GROUP BY day, hour;
-- Configure a GUC parameter to enable full-and-incremental data consumption and refresh the dynamic table.
SET hg_experimental_enable_hybrid_incremental_mode = true;
REFRESH TABLE sales_incremental;
Compare data consistency:
Query the base table
SELECT day, hour, SUM(amount), COUNT(1) FROM base_sales GROUP BY day, hour;
Returned result:
day hour sum count 2024-08-29 1 5 1 2024-08-29 2 100 1
Query the dynamic table
SELECT * FROM sales_incremental;
Returned result:
day hour sum count 2024-08-29 1 5 1 2024-08-29 2 100 1
Examples
Example 1: Create a dynamic table in full data refresh mode and enable auto refresh for the dynamic table
Before you perform the following operations, refer to Import public datasets with a few clicks to import data from the public dataset tpch_10g to Hologres.
-- Create a dynamic table in full data refresh mode, enable immediate refresh for the dynamic table, and set the refresh interval to 1 hour.
CREATE DYNAMIC TABLE test.thch_q1_full
WITH (
refresh_mode='full',
auto_refresh_enable='true',
full_auto_refresh_interval='1 hours',
full_guc_hg_computing_resource='serverless',
full_guc_hg_experimental_serverless_computing_required_cores='32'
)
AS
SELECT
l_returnflag,
l_linestatus,
SUM(l_quantity) AS sum_qty,
SUM(l_extendedprice) AS sum_base_price,
SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
AVG(l_quantity) AS avg_qty,
AVG(l_extendedprice) AS avg_price,
AVG(l_discount) AS avg_disc,
COUNT(*) AS count_order
FROM
hologres_dataset_tpch_100g.lineitem
WHERE
l_shipdate <= DATE '1998-12-01' - INTERVAL '120' DAY
GROUP BY
l_returnflag,
l_linestatus;
Example 2: Create a dynamic table in incremental data refresh mode and specify the refresh start time for the dynamic table
Before you perform the following operations, refer to Import public datasets with a few clicks to import data from the public dataset tpch_10g to Hologres.
The following example shows how to create a dynamic table in incremental data refresh mode.
Before you create a dynamic table in incremental data refresh mode, you must enable binary logging for the base table of the dynamic table. You do not need to enable binary logging for a dimension table.
-- Enable binary logging for the base table.
BEGIN;
CALL set_table_property('hologres_dataset_tpch_100g.lineitem', 'binlog.level', 'replica');
COMMIT;
-- Create a dynamic table in incremental data refresh mode, specify the refresh start time, and set the refresh interval to 3 minutes.
CREATE DYNAMIC TABLE public.tpch_q1_incremental
WITH (
refresh_mode='incremental',
auto_refresh_enable='true',
incremental_auto_refresh_schd_start_time='2024-09-15 23:50:0',
incremental_auto_refresh_interval='3 minutes',
incremental_guc_hg_computing_resource='serverless',
incremental_guc_hg_experimental_serverless_computing_required_cores='30'
) AS SELECT
l_returnflag,
l_linestatus,
COUNT(*) AS count_order
FROM
hologres_dataset_tpch_100g.lineitem
WHERE
l_shipdate <= DATE '1998-12-01' - INTERVAL '120' DAY
GROUP BY
l_returnflag,
l_linestatus
;
Example 3: Create a dynamic table in full data refresh mode for which a JOIN operation is performed on multiple tables
-- In a query, create a dynamic table for which a JOIN operation is performed on multiple tables, select full as the refresh mode, and set the refresh interval to 3 hours.
CREATE DYNAMIC TABLE test.dt_q_full
WITH (
refresh_mode='full',
auto_refresh_enable='true',
full_auto_refresh_schd_start_time='immediate'
full_auto_refresh_interval='3 hours',
full_guc_hg_computing_resource='serverless',
full_guc_hg_experimental_serverless_computing_required_cores='64'
)
AS
SELECT
o_orderpriority,
COUNT(*) AS order_count
FROM
public.orders
WHERE
o_orderdate >= DATE '1996-07-01'
AND o_orderdate < DATE '1996-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
public.lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority;
Example 4: Create a dynamic table in incremental data refresh mode for which a JOIN operation is performed on a dimension table
The following example shows how to create a dynamic table in incremental data refresh mode for which a JOIN operation is performed on a dimension table.
Before you create a dynamic table in incremental data refresh mode, you must enable binary logging for the base table of the dynamic table. You do not need to enable binary logging for a dimension table.
In the semantics of a JOIN operation performed on a dimension table, each data record in a base table is associated with the latest version of data in the dimension table. This indicates that the JOIN operation is performed at the time when data is processed. Therefore, if the data in the dimension table is added, updated, or deleted after the JOIN operation is performed, the associated data remains unchanged.
-- Enable binary logging for the base table of the dynamic table. You do not need to enable binary logging for a dimension table.
BEGIN;
CALL set_table_property('hm.sale_detail ', 'binlog.level', 'replica');
COMMIT;
CREATE DYNAMIC TABLE dt_sales_incremental
WITH (
refresh_mode='incremental',
auto_refresh_enable='true',
incremental_auto_refresh_schd_start_time='2024-09-15 00:00:00'
incremental_auto_refresh_interval='5 minutes',
incremental_guc_hg_computing_resource='serverless',
incremental_guc_hg_experimental_serverless_computing_required_cores='128')
AS
SELECT
sale_detail.app_id,
sale_detail.uid,
product,
SUM(sale_detail.gmv) AS sum_gmv,
sale_detail.order_time,
user_info.province,
user_info.city
FROM hm.sale_detail
INNER JOIN hm.user_info FOR SYSTEM_TIME AS OF PROCTIME()
ON sale_detail.uid =user_info.uid
GROUP BY sale_detail.app_id,sale_detail.uid,sale_detail.product,sale_detail.order_time,user_info.province,user_info.city;
Example 5: Create a partitioned dynamic table
Prepare a base table and data.
-- Create a partitioned source table. CREATE TABLE base_sales( uid INT, opreate_time TIMESTAMPTZ, amount FLOAT, pk text NOT NULL, ds text, PRIMARY key(ds) ) PARTITION BY LIST (ds) ; CREATE TABLE base_sales_20240616 PARTITION OF base_sales FOR VALUES IN ('20240616'); INSERT INTO base_sales_20240616 VALUES (1,'2024-06-16 16:08:25.387466+08','2','1','20240616');
Create a parent partitioned dynamic table. Configure the definition of the query for the parent partitioned table and do not configure the refresh mode.
CREATE DYNAMIC TABLE test.partition_dt_base_sales PARTITION BY LIST (ds) AS SELECT public.RB_BUILD_AGG(uid), opreate_time, amount, pk, ds, COUNT(1) FROM base_sales GROUP BY opreate_time ,amount,pk,ds;
Manually create a child partitioned table and configure the refresh mode for the child partitioned table.
-- Create a child partitioned dynamic table and set the refresh mode to full for the table. After the table is created, the table is immediately refreshed. The refresh interval is 30 minutes and the resources of the current instance are used to refresh the table. CREATE DYNAMIC TABLE test.partition_dt_base_sales_20240616 PARTITION OF partition_dt_base_sales FOR VALUES IN ('20240616') WITH ( refresh_mode='full', auto_refresh_enable='true', full_auto_refresh_schd_start_time='immediate', full_auto_refresh_interval='30 minutes' );
Next step: Manage dynamic tables
After you create a dynamic table, you can perform the following operations on the table:
View the DDL and lineages of the dynamic table. For more information, see Manage dynamic tables.
Modify the properties of the dynamic table. For more information, see ALTER DYNAMIC TABLE.