Use CREATE TABLE to define the storage layout of a table in Hologres. Setting the right storage mode, distribution key, and indexes at creation time is critical — most of these properties cannot be changed after the table is created.
Prerequisites
Before you begin, ensure that you have:
A connection to the target database — HoloWeb is the recommended tool for running queries
Quick start
This example creates a transaction details table using the CREATE TABLE WITH syntax (available in Hologres V2.1 and later). It uses a hierarchical naming convention, defines multiple fields, and includes metadata comments.
BEGIN;
-- Create a transaction details fact table.
-- Use the public schema and follow the hierarchical naming convention (dwd_xxx).
CREATE TABLE IF NOT EXISTS public.dwd_trade_orders (
order_id BIGINT NOT NULL,
shop_id INT NOT NULL,
user_id TEXT NOT NULL,
order_amount NUMERIC(12, 2) DEFAULT 0.00,
payment NUMERIC(12, 2) DEFAULT 0.00,
payment_type INT DEFAULT 0, -- 0: Unpaid, 1: Alipay, 2: WeChat Pay, 3: Credit Card
is_delivered BOOLEAN DEFAULT false,
dt TEXT NOT NULL, -- Data timestamp, in YYYYMMDD format
order_time TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (order_id)
)
WITH (
orientation = 'column', -- Column-oriented: best for OLAP aggregation on large datasets
distribution_key = 'order_id', -- Shard data by order_id for even distribution
clustering_key = 'order_time:asc', -- Sort data by time within files to accelerate range queries
event_time_column = 'order_time', -- Enable file-level clipping for time range filters
bitmap_columns = 'shop_id,payment_type,is_delivered', -- Accelerate equality filters on low-cardinality columns
dictionary_encoding_columns = 'user_id:auto' -- Accelerate GROUP BY and FILTER on string columns
);
-- Add metadata comments.
COMMENT ON TABLE public.dwd_trade_orders IS 'Base fact table for transaction order details.';
COMMENT ON COLUMN public.dwd_trade_orders.order_id IS 'The unique identifier of the order.';
COMMENT ON COLUMN public.dwd_trade_orders.shop_id IS 'The unique ID of the shop.';
COMMENT ON COLUMN public.dwd_trade_orders.user_id IS 'The ID of the buyer.';
COMMENT ON COLUMN public.dwd_trade_orders.dt IS 'The data timestamp, in YYYYMMDD format.';
COMMENT ON COLUMN public.dwd_trade_orders.order_time IS 'The precise timestamp when the order was created.';
COMMIT;View the table schema
Run the following query to retrieve the Data Definition Language (DDL) statement for the table:
SELECT hg_dump_script('public.dwd_trade_orders');Insert data
Hologres is compatible with standard Data Manipulation Language (DML) syntax. The following statement inserts 10 rows of sample data:
INSERT INTO public.dwd_trade_orders
(order_id, shop_id, user_id, order_amount, payment, payment_type, is_delivered, dt, order_time)
VALUES
(50001, 101, 'U678', 299.00, 280.00, 1, true, '20231101', '2023-11-01 10:00:01+08'),
(50002, 102, 'U992', 59.00, 59.00, 2, false, '20231101', '2023-11-01 10:05:12+08'),
(50003, 101, 'U441', 150.00, 145.00, 1, true, '20231101', '2023-11-01 10:10:45+08'),
(50004, 105, 'U219', 888.00, 888.00, 3, true, '20231101', '2023-11-01 10:20:11+08'),
(50005, 102, 'U883', 35.00, 30.00, 1, false, '20231101', '2023-11-01 10:32:00+08'),
(50006, 110, 'U007', 120.50, 120.50, 2, true, '20231101', '2023-11-01 10:45:33+08'),
(50007, 101, 'U321', 210.00, 210.00, 1, true, '20231101', '2023-11-01 11:02:19+08'),
(50008, 108, 'U556', 45.00, 45.00, 2, false, '20231101', '2023-11-01 11:15:04+08'),
(50009, 101, 'U112', 300.00, 290.00, 3, true, '20231101', '2023-11-01 11:25:55+08'),
(50010, 105, 'U449', 99.90, 99.90, 1, true, '20231101', '2023-11-01 11:40:22+08');Query data
-- Calculate the total transaction amount per shop, sorted in descending order.
SELECT
shop_id,
COUNT(1) AS total_orders,
SUM(payment) AS total_payment
FROM public.dwd_trade_orders
GROUP BY shop_id
ORDER BY total_payment DESC;Expected output:
shop_id total_orders total_payment
105 2 987.90
101 4 925.00
110 1 120.50
102 1 59.00
108 1 45.00Syntax
CREATE TABLE syntax
Hologres supports two syntaxes for defining table properties and comments:
Standard syntax (recommended for Hologres V2.1 and later)
Use the WITH keyword to define properties inline. This syntax is more compact and offers better performance.
BEGIN;
CREATE TABLE [IF NOT EXISTS] [schema_name.]table_name (
{
column_name column_type [column_constraints, [...]]
| table_constraints
[,...]
}
)
[WITH (
property = 'value'
[, ...]
)];
[COMMENT ON COLUMN <[schema_name.]tablename.column> IS '<value>';]
[COMMENT ON TABLE <[schema_name.]tablename> IS '<value>';]
COMMIT;Compatible syntax (supported in all versions)
Use CALL set_table_property to set properties and COMMENT to add comments. All statements must be in the same BEGIN...COMMIT transaction block as CREATE TABLE.
BEGIN;
CREATE TABLE [IF NOT EXISTS] [schema_name.]table_name (
{
column_name column_type [column_constraints, [...]]
| table_constraints
[,...]
}
);
CALL set_table_property('[schema_name.]<table_name>', '<property>', '<value>');
COMMENT ON COLUMN <[schema_name.]tablename.column> IS '<value>';
COMMENT ON TABLE <[schema_name.]tablename> IS '<value>';
COMMIT;Table properties
Properties fall into three groups based on their function. Properties marked not modifiable cannot be changed after the table is created — you must recreate the table.
Data organization (not modifiable after creation)
| Property | Description | Column-oriented | Row-oriented | Row-column hybrid | Default |
|---|---|---|---|---|---|
orientation | Sets the storage format. See Storage modes. | column | row | row,column | column |
distribution_key | Sets the data sharding policy. See Distribution key. | Primary key by default; choose one column from the primary key for best performance. | Primary key by default. | Primary key by default. | Primary key |
clustering_key | Physically sorts data within files to accelerate range queries. See Clustering key. | Empty by default. Use at most one column; only ascending order is supported. | Primary key by default. | Empty by default. | — |
event_time_column | Divides data into file segments by time, enabling fast time range filtering. See Event time column. | First non-null timestamp field by default. | Not supported. | First non-null timestamp field by default. | First non-null timestamp |
table_group | Controls the shard count for data distribution. See Table groups and shard counts. | Default table group. | Default table group. | Default table group. | Default table group |
orientation, distribution_key, clustering_key, and event_time_column cannot be modified after table creation. Plan these carefully before creating the table. table_group also cannot be changed without recreating the table or resharding.
Index acceleration (modifiable after creation)
| Property | Description | Column-oriented | Row-oriented | Row-column hybrid |
|---|---|---|---|---|
bitmap_columns | Builds a bitmap index for fast equality filtering on low-cardinality columns. See Bitmap index. Use for columns in equality comparisons; avoid setting more than 10 columns. | Supported | Not supported | Supported |
dictionary_encoding_columns | Builds a dictionary map that converts string comparisons to numeric comparisons, accelerating GROUP BY and FILTER operations. All TEXT columns in column-oriented tables are enabled by default. In V0.9 and later, Hologres automatically determines whether to apply dictionary encoding based on data characteristics. | Supported | Not supported | Supported |
Both bitmap_columns and dictionary_encoding_columns can be modified after table creation via ALTER TABLE.
`dictionary_encoding_columns` syntax:
CALL set_table_property('table_name', 'dictionary_encoding_columns', '[columnName{:[on|off|auto]}[,...]]');Data lifecycle
| Property | Description | Column-oriented | Row-oriented | Row-column hybrid |
|---|---|---|---|---|
time_to_live_in_seconds | Sets the Time to Live (TTL) for table data in seconds. TTL starts from write time, not update time. In V1.3.24 and later, the minimum allowed value is 86,400 (one day). | Supported | Not recommended — use the default value | Not recommended |
storage_mode | Specifies whether data is stored in hot or cold storage. Supported in V1.3 and later. See Tiered data storage. | Use as needed | Use as needed | — |
`time_to_live_in_seconds` notes:
TTL is not enforced at an exact time. After the TTL expires, data is deleted within a window, not at a precise moment.
Only data is deleted; the table itself remains.
TTL can cause duplicate primary keys or inconsistent query results after deletion.
For production data lifecycle management, use partitioned tables instead. See CREATE PARTITION TABLE.
If unset, the default TTL is 100 years (effectively no expiration).
`time_to_live_in_seconds` syntax:
CALL set_table_property('table_name', 'time_to_live_in_seconds', '<non_negative_literal>');`storage_mode` syntax:
-- Set storage mode at table creation:
CREATE TABLE <table_name> (...) WITH (storage_mode = 'hot');
CREATE TABLE <table_name> (...) WITH (storage_mode = 'cold');
-- Set storage mode after table creation:
CALL set_table_property('table_name', 'storage_mode', 'hot');
CALL set_table_property('table_name', 'storage_mode', 'cold');Examples
Example: Partitioned table for large-scale time-series data
As data volumes grow, a single flat table becomes costly to maintain — purging historical data requires scanning all rows, and time-based queries perform full table scans. A partitioned table physically isolates data by day, enabling instant partition drops for cleanup and automatic partition pruning during queries.
This example upgrades the dwd_trade_orders table from the Quick start section to a partitioned structure. The field definitions and index settings are inherited, but the primary key must include the partition key dt.
BEGIN;
-- Create the partitioned parent table.
-- The primary key includes both the business key (order_id) and the partition key (dt).
CREATE TABLE IF NOT EXISTS public.dwd_trade_orders_partitioned (
order_id BIGINT NOT NULL,
shop_id INT NOT NULL,
user_id TEXT NOT NULL,
order_amount NUMERIC(12, 2) DEFAULT 0.00,
payment NUMERIC(12, 2) DEFAULT 0.00,
payment_type INT DEFAULT 0,
is_delivered BOOLEAN DEFAULT false,
dt TEXT NOT NULL,
order_time TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (order_id, dt)
)
PARTITION BY LIST (dt)
WITH (
orientation = 'column',
distribution_key = 'order_id',
event_time_column = 'order_time',
clustering_key = 'order_time:asc'
);
COMMIT;
-- Create a child table to provide physical storage for a specific date.
CREATE TABLE IF NOT EXISTS public.dwd_trade_orders_20231101
PARTITION OF public.dwd_trade_orders_partitioned FOR VALUES IN ('20231101');
-- Insert data. The application logic is identical to the flat table.
-- Data is automatically routed to the correct child table.
INSERT INTO public.dwd_trade_orders_partitioned
(order_id, shop_id, user_id, order_amount, payment, payment_type, is_delivered, dt, order_time)
VALUES
(50001, 101, 'U678', 299.00, 280.00, 1, true, '20231101', '2023-11-01 10:00:01+08'),
(50002, 102, 'U992', 59.00, 59.00, 2, false, '20231101', '2023-11-01 10:05:12+08'),
(50003, 101, 'U441', 150.00, 145.00, 1, true, '20231101', '2023-11-01 10:10:45+08'),
(50004, 105, 'U219', 888.00, 888.00, 3, true, '20231101', '2023-11-01 10:20:11+08'),
(50005, 102, 'U883', 35.00, 30.00, 1, false, '20231101', '2023-11-01 10:32:00+08'),
(50006, 110, 'U007', 120.50, 120.50, 2, true, '20231101', '2023-11-01 10:45:33+08'),
(50007, 101, 'U321', 210.00, 210.00, 1, true, '20231101', '2023-11-01 11:02:19+08'),
(50008, 108, 'U556', 45.00, 45.00, 2, false, '20231101', '2023-11-01 11:15:04+08'),
(50009, 101, 'U112', 300.00, 290.00, 3, true, '20231101', '2023-11-01 11:25:55+08'),
(50010, 105, 'U449', 99.90, 99.90, 1, true, '20231101', '2023-11-01 11:40:22+08');
-- Query with a partition filter. Hologres scans only the matching child table.
SELECT COUNT(*) FROM public.dwd_trade_orders_partitioned WHERE dt = '20231101';
-- Drop a child table to reclaim space in seconds (much faster than DELETE).
-- DROP TABLE public.dwd_trade_orders_20231101;Example: Real-time analytics for large datasets (fact table)
Scenario: Real-time dashboard summaries with large data volumes, where the core requirement is fast aggregation — for example, calculating Gross Merchandise Volume (GMV) and order counts.
Column-oriented storage excels here: high compression ratios reduce I/O, and column scans read only the columns needed for aggregation.
BEGIN;
CREATE TABLE IF NOT EXISTS public.dwd_order_summary (
order_id BIGINT PRIMARY KEY,
category_id INT NOT NULL,
gmv NUMERIC(15, 2),
order_time TIMESTAMPTZ NOT NULL
) WITH (
orientation = 'column', -- Best for large-scale aggregation; high compression, efficient column scans
distribution_key = 'order_id', -- Even distribution; enables local joins with other order tables
event_time_column = 'order_time', -- Enables file-level segment clipping for time range filters
clustering_key = 'order_time:asc' -- Reduces disk I/O for "last hour" or "specific day" queries
);
COMMENT ON TABLE public.dwd_order_summary IS 'Fact table for order summary details.';
COMMENT ON COLUMN public.dwd_order_summary.order_id IS 'The unique order ID.';
COMMENT ON COLUMN public.dwd_order_summary.category_id IS 'The category ID.';
COMMENT ON COLUMN public.dwd_order_summary.gmv IS 'The gross merchandise volume.';
COMMENT ON COLUMN public.dwd_order_summary.order_time IS 'The time when the order was placed.';
COMMIT;Example: High-concurrency point queries (dimension table)
Scenario: Retrieve a user profile by user_id in milliseconds, under high Queries Per Second (QPS).
Row-oriented storage is optimized for primary key-based lookups. All row data is stored contiguously, making point queries extremely fast without scanning unrelated columns.
BEGIN;
CREATE TABLE IF NOT EXISTS public.dim_user_persona (
user_id TEXT PRIMARY KEY,
user_level INT,
persona_jsonb JSONB
) WITH (
orientation = 'row'
-- For row-oriented tables, the primary key is automatically used as the distribution key
-- and clustering key. No additional settings are required.
);
COMMENT ON TABLE public.dim_user_persona IS 'Dimension table for user profiles.';
COMMENT ON COLUMN public.dim_user_persona.user_id IS 'The unique user ID.';
COMMENT ON COLUMN public.dim_user_persona.user_level IS 'The user level.';
COMMENT ON COLUMN public.dim_user_persona.persona_jsonb IS 'The user profile features in JSON format.';
COMMIT;Example: Hybrid workload (row-column hybrid storage)
Scenario: A logistics and after-sales system that needs both analytics (summarizing logistics statuses) and point queries (retrieving order details by order_id).
Row-column hybrid storage combines the millisecond-level point query performance of row-oriented storage with the efficient aggregation of column-oriented storage.
BEGIN;
CREATE TABLE IF NOT EXISTS public.ads_shipping_info (
order_id BIGINT PRIMARY KEY,
shipping_status INT,
receiver_address TEXT,
update_time TIMESTAMPTZ
) WITH (
orientation = 'row,column', -- Supports both point queries and aggregation
distribution_key = 'order_id', -- Controls data distribution across shards
bitmap_columns = 'shipping_status' -- Accelerates "status = X" filter queries on a low-cardinality column
);
COMMENT ON TABLE public.ads_shipping_info IS 'Application table for logistics status queries.';
COMMENT ON COLUMN public.ads_shipping_info.order_id IS 'The order ID.';
COMMENT ON COLUMN public.ads_shipping_info.shipping_status IS 'Logistics status (1: To be shipped, 2: In transit, 3: Delivered).';
COMMENT ON COLUMN public.ads_shipping_info.receiver_address IS 'The shipping address.';
COMMIT;Limitations
Primary key limits
Composite primary keys: Multiple fields can form the primary key. All fields must be
NOT NULLand must be declared in a single statement.BEGIN; CREATE TABLE public.test ( id TEXT NOT NULL, ds TEXT NOT NULL, PRIMARY KEY (id, ds) ); CALL SET_TABLE_PROPERTY('public.test', 'orientation', 'column'); COMMIT;Unsupported types: FLOAT, DOUBLE, NUMERIC, ARRAY, JSON, DATE, and other complex types cannot be used as primary key columns.
Not modifiable: The primary key cannot be changed after table creation. Recreate the table if you need a different primary key.
Storage requirements: Row-oriented and row-column hybrid tables must have a primary key. Primary keys are optional for column-oriented tables.
Constraint support
| Constraint | Column-level | Table-level |
|---|---|---|
primary key | Supported | Supported |
not null | Supported | — |
null | Supported | — |
unique | Not supported | Not supported |
check | Not supported | Not supported |
default | Supported | Not supported |
Naming and escaping rules
Column names cannot start with
hg_.Schema names cannot start with
holo_,hg_, orpg_.Table names cannot exceed 127 bytes.
Enclose names in double quotation marks (
"") when they are SQL keywords, reserved words, system fields (such asctid), case-sensitive identifiers, names with special characters, or names that start with a digit.
Syntax for escaped column names in Hologres V2.0 and later:
-- Single escaped column
BEGIN;
CREATE TABLE tbl (c1 INT NOT NULL);
CALL set_table_property('tbl', 'clustering_key', '"c1":asc');
COMMIT;
-- Multiple columns, including an uppercase one (V2.1 and later)
BEGIN;
CREATE TABLE tbl ("C1" INT NOT NULL, c2 TEXT NOT NULL) WITH (clustering_key = '"C1",c2');
COMMIT;
-- Multiple columns, including an uppercase one (V2.0 and later)
BEGIN;
CREATE TABLE tbl ("C1" INT NOT NULL, c2 TEXT NOT NULL);
CALL set_table_property('tbl', 'clustering_key', '"C1",c2');
COMMIT;Syntax for escaped column names in versions earlier than Hologres V2.0:
BEGIN;
CREATE TABLE tbl (c1 INT NOT NULL);
CALL set_table_property('tbl', 'clustering_key', '"c1:asc"');
COMMIT;
-- Multiple columns, including an uppercase one
BEGIN;
CREATE TABLE tbl ("C1" INT NOT NULL, c2 TEXT NOT NULL);
CALL set_table_property('tbl', 'clustering_key', '"C1,c2"');
COMMIT;To switch to the old parser syntax in Hologres V2.0 when needed:
-- Enable the old syntax at the session level.
SET hg_disable_parse_holo_property = on;
-- Enable the old syntax at the database level.
ALTER DATABASE <db_name> SET hg_disable_parse_holo_property = on;IF NOT EXISTS behavior
| Condition | IF NOT EXISTS specified | IF NOT EXISTS not specified |
|---|---|---|
| Table with the same name exists | Returns a NOTICE, skips creation, operation succeeds | Returns an ERROR |
| No table with the same name exists | Operation succeeds | Operation succeeds |
Modification limits
After a table is created, the following cannot be changed:
Data types (before Hologres V3.0)
Column order
Nullability constraint (
NOT NULL↔ nullable)Storage layout properties:
orientation,distribution_key,clustering_key,event_time_column
The following can be changed after table creation:
bitmap_columnsanddictionary_encoding_columns— via ALTER TABLEData types: some types in V3.0 and later; all types via REBUILD in V3.1 and later — see Modify data types and REBUILD
What's next
CREATE PARTITION TABLE — manage data lifecycle at scale using partitioned tables
ALTER TABLE — modify modifiable properties after table creation
Distribution key — learn how to choose the right distribution key
Clustering key — understand how clustering keys improve query performance
Storage modes — compare column-oriented, row-oriented, and row-column hybrid storage in depth