All Products
Search
Document Center

Hologres:CREATE TABLE

Last Updated:Mar 26, 2026

Use CREATE TABLE to define the storage layout of a table in Hologres. Setting the right storage mode, distribution key, and indexes at creation time is critical — most of these properties cannot be changed after the table is created.

Prerequisites

Before you begin, ensure that you have:

Quick start

This example creates a transaction details table using the CREATE TABLE WITH syntax (available in Hologres V2.1 and later). It uses a hierarchical naming convention, defines multiple fields, and includes metadata comments.

BEGIN;

-- Create a transaction details fact table.
-- Use the public schema and follow the hierarchical naming convention (dwd_xxx).
CREATE TABLE IF NOT EXISTS public.dwd_trade_orders (
    order_id      BIGINT NOT NULL,
    shop_id       INT NOT NULL,
    user_id       TEXT NOT NULL,
    order_amount  NUMERIC(12, 2) DEFAULT 0.00,
    payment       NUMERIC(12, 2) DEFAULT 0.00,
    payment_type  INT DEFAULT 0,              -- 0: Unpaid, 1: Alipay, 2: WeChat Pay, 3: Credit Card
    is_delivered  BOOLEAN DEFAULT false,
    dt            TEXT NOT NULL,              -- Data timestamp, in YYYYMMDD format
    order_time    TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (order_id)
)
WITH (
    orientation                  = 'column',              -- Column-oriented: best for OLAP aggregation on large datasets
    distribution_key             = 'order_id',            -- Shard data by order_id for even distribution
    clustering_key               = 'order_time:asc',      -- Sort data by time within files to accelerate range queries
    event_time_column            = 'order_time',          -- Enable file-level clipping for time range filters
    bitmap_columns               = 'shop_id,payment_type,is_delivered', -- Accelerate equality filters on low-cardinality columns
    dictionary_encoding_columns  = 'user_id:auto'         -- Accelerate GROUP BY and FILTER on string columns
);

-- Add metadata comments.
COMMENT ON TABLE  public.dwd_trade_orders           IS 'Base fact table for transaction order details.';
COMMENT ON COLUMN public.dwd_trade_orders.order_id  IS 'The unique identifier of the order.';
COMMENT ON COLUMN public.dwd_trade_orders.shop_id   IS 'The unique ID of the shop.';
COMMENT ON COLUMN public.dwd_trade_orders.user_id   IS 'The ID of the buyer.';
COMMENT ON COLUMN public.dwd_trade_orders.dt        IS 'The data timestamp, in YYYYMMDD format.';
COMMENT ON COLUMN public.dwd_trade_orders.order_time IS 'The precise timestamp when the order was created.';

COMMIT;

View the table schema

Run the following query to retrieve the Data Definition Language (DDL) statement for the table:

SELECT hg_dump_script('public.dwd_trade_orders');

Insert data

Hologres is compatible with standard Data Manipulation Language (DML) syntax. The following statement inserts 10 rows of sample data:

INSERT INTO public.dwd_trade_orders
    (order_id, shop_id, user_id, order_amount, payment, payment_type, is_delivered, dt, order_time)
VALUES
    (50001, 101, 'U678', 299.00, 280.00, 1, true,  '20231101', '2023-11-01 10:00:01+08'),
    (50002, 102, 'U992',  59.00,  59.00, 2, false, '20231101', '2023-11-01 10:05:12+08'),
    (50003, 101, 'U441', 150.00, 145.00, 1, true,  '20231101', '2023-11-01 10:10:45+08'),
    (50004, 105, 'U219', 888.00, 888.00, 3, true,  '20231101', '2023-11-01 10:20:11+08'),
    (50005, 102, 'U883',  35.00,  30.00, 1, false, '20231101', '2023-11-01 10:32:00+08'),
    (50006, 110, 'U007', 120.50, 120.50, 2, true,  '20231101', '2023-11-01 10:45:33+08'),
    (50007, 101, 'U321', 210.00, 210.00, 1, true,  '20231101', '2023-11-01 11:02:19+08'),
    (50008, 108, 'U556',  45.00,  45.00, 2, false, '20231101', '2023-11-01 11:15:04+08'),
    (50009, 101, 'U112', 300.00, 290.00, 3, true,  '20231101', '2023-11-01 11:25:55+08'),
    (50010, 105, 'U449',  99.90,  99.90, 1, true,  '20231101', '2023-11-01 11:40:22+08');

Query data

-- Calculate the total transaction amount per shop, sorted in descending order.
SELECT
    shop_id,
    COUNT(1)     AS total_orders,
    SUM(payment) AS total_payment
FROM public.dwd_trade_orders
GROUP BY shop_id
ORDER BY total_payment DESC;

Expected output:

shop_id  total_orders  total_payment
105      2             987.90
101      4             925.00
110      1             120.50
102      1              59.00
108      1              45.00

Syntax

CREATE TABLE syntax

Hologres supports two syntaxes for defining table properties and comments:

Standard syntax (recommended for Hologres V2.1 and later)

Use the WITH keyword to define properties inline. This syntax is more compact and offers better performance.

BEGIN;

CREATE TABLE [IF NOT EXISTS] [schema_name.]table_name (
    {
        column_name column_type [column_constraints, [...]]
        | table_constraints
        [,...]
    }
)
[WITH (
    property = 'value'
    [, ...]
)];

[COMMENT ON COLUMN <[schema_name.]tablename.column> IS '<value>';]
[COMMENT ON TABLE  <[schema_name.]tablename>        IS '<value>';]

COMMIT;

Compatible syntax (supported in all versions)

Use CALL set_table_property to set properties and COMMENT to add comments. All statements must be in the same BEGIN...COMMIT transaction block as CREATE TABLE.

BEGIN;

CREATE TABLE [IF NOT EXISTS] [schema_name.]table_name (
    {
        column_name column_type [column_constraints, [...]]
        | table_constraints
        [,...]
    }
);

CALL set_table_property('[schema_name.]<table_name>', '<property>', '<value>');
COMMENT ON COLUMN <[schema_name.]tablename.column> IS '<value>';
COMMENT ON TABLE  <[schema_name.]tablename>        IS '<value>';

COMMIT;

Table properties

Properties fall into three groups based on their function. Properties marked not modifiable cannot be changed after the table is created — you must recreate the table.

Data organization (not modifiable after creation)

PropertyDescriptionColumn-orientedRow-orientedRow-column hybridDefault
orientationSets the storage format. See Storage modes.columnrowrow,columncolumn
distribution_keySets the data sharding policy. See Distribution key.Primary key by default; choose one column from the primary key for best performance.Primary key by default.Primary key by default.Primary key
clustering_keyPhysically sorts data within files to accelerate range queries. See Clustering key.Empty by default. Use at most one column; only ascending order is supported.Primary key by default.Empty by default.
event_time_columnDivides data into file segments by time, enabling fast time range filtering. See Event time column.First non-null timestamp field by default.Not supported.First non-null timestamp field by default.First non-null timestamp
table_groupControls the shard count for data distribution. See Table groups and shard counts.Default table group.Default table group.Default table group.Default table group
Important

orientation, distribution_key, clustering_key, and event_time_column cannot be modified after table creation. Plan these carefully before creating the table. table_group also cannot be changed without recreating the table or resharding.

Index acceleration (modifiable after creation)

PropertyDescriptionColumn-orientedRow-orientedRow-column hybrid
bitmap_columnsBuilds a bitmap index for fast equality filtering on low-cardinality columns. See Bitmap index. Use for columns in equality comparisons; avoid setting more than 10 columns.SupportedNot supportedSupported
dictionary_encoding_columnsBuilds a dictionary map that converts string comparisons to numeric comparisons, accelerating GROUP BY and FILTER operations. All TEXT columns in column-oriented tables are enabled by default. In V0.9 and later, Hologres automatically determines whether to apply dictionary encoding based on data characteristics.SupportedNot supportedSupported

Both bitmap_columns and dictionary_encoding_columns can be modified after table creation via ALTER TABLE.

`dictionary_encoding_columns` syntax:

CALL set_table_property('table_name', 'dictionary_encoding_columns', '[columnName{:[on|off|auto]}[,...]]');

Data lifecycle

PropertyDescriptionColumn-orientedRow-orientedRow-column hybrid
time_to_live_in_secondsSets the Time to Live (TTL) for table data in seconds. TTL starts from write time, not update time. In V1.3.24 and later, the minimum allowed value is 86,400 (one day).SupportedNot recommended — use the default valueNot recommended
storage_modeSpecifies whether data is stored in hot or cold storage. Supported in V1.3 and later. See Tiered data storage.Use as neededUse as needed

`time_to_live_in_seconds` notes:

  • TTL is not enforced at an exact time. After the TTL expires, data is deleted within a window, not at a precise moment.

  • Only data is deleted; the table itself remains.

  • TTL can cause duplicate primary keys or inconsistent query results after deletion.

  • For production data lifecycle management, use partitioned tables instead. See CREATE PARTITION TABLE.

If unset, the default TTL is 100 years (effectively no expiration).

`time_to_live_in_seconds` syntax:

CALL set_table_property('table_name', 'time_to_live_in_seconds', '<non_negative_literal>');

`storage_mode` syntax:

-- Set storage mode at table creation:
CREATE TABLE <table_name> (...) WITH (storage_mode = 'hot');
CREATE TABLE <table_name> (...) WITH (storage_mode = 'cold');

-- Set storage mode after table creation:
CALL set_table_property('table_name', 'storage_mode', 'hot');
CALL set_table_property('table_name', 'storage_mode', 'cold');

Examples

Example: Partitioned table for large-scale time-series data

As data volumes grow, a single flat table becomes costly to maintain — purging historical data requires scanning all rows, and time-based queries perform full table scans. A partitioned table physically isolates data by day, enabling instant partition drops for cleanup and automatic partition pruning during queries.

This example upgrades the dwd_trade_orders table from the Quick start section to a partitioned structure. The field definitions and index settings are inherited, but the primary key must include the partition key dt.

BEGIN;

-- Create the partitioned parent table.
-- The primary key includes both the business key (order_id) and the partition key (dt).
CREATE TABLE IF NOT EXISTS public.dwd_trade_orders_partitioned (
    order_id      BIGINT NOT NULL,
    shop_id       INT NOT NULL,
    user_id       TEXT NOT NULL,
    order_amount  NUMERIC(12, 2) DEFAULT 0.00,
    payment       NUMERIC(12, 2) DEFAULT 0.00,
    payment_type  INT DEFAULT 0,
    is_delivered  BOOLEAN DEFAULT false,
    dt            TEXT NOT NULL,
    order_time    TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (order_id, dt)
)
PARTITION BY LIST (dt)
WITH (
    orientation       = 'column',
    distribution_key  = 'order_id',
    event_time_column = 'order_time',
    clustering_key    = 'order_time:asc'
);

COMMIT;

-- Create a child table to provide physical storage for a specific date.
CREATE TABLE IF NOT EXISTS public.dwd_trade_orders_20231101
    PARTITION OF public.dwd_trade_orders_partitioned FOR VALUES IN ('20231101');

-- Insert data. The application logic is identical to the flat table.
-- Data is automatically routed to the correct child table.
INSERT INTO public.dwd_trade_orders_partitioned
    (order_id, shop_id, user_id, order_amount, payment, payment_type, is_delivered, dt, order_time)
VALUES
    (50001, 101, 'U678', 299.00, 280.00, 1, true,  '20231101', '2023-11-01 10:00:01+08'),
    (50002, 102, 'U992',  59.00,  59.00, 2, false, '20231101', '2023-11-01 10:05:12+08'),
    (50003, 101, 'U441', 150.00, 145.00, 1, true,  '20231101', '2023-11-01 10:10:45+08'),
    (50004, 105, 'U219', 888.00, 888.00, 3, true,  '20231101', '2023-11-01 10:20:11+08'),
    (50005, 102, 'U883',  35.00,  30.00, 1, false, '20231101', '2023-11-01 10:32:00+08'),
    (50006, 110, 'U007', 120.50, 120.50, 2, true,  '20231101', '2023-11-01 10:45:33+08'),
    (50007, 101, 'U321', 210.00, 210.00, 1, true,  '20231101', '2023-11-01 11:02:19+08'),
    (50008, 108, 'U556',  45.00,  45.00, 2, false, '20231101', '2023-11-01 11:15:04+08'),
    (50009, 101, 'U112', 300.00, 290.00, 3, true,  '20231101', '2023-11-01 11:25:55+08'),
    (50010, 105, 'U449',  99.90,  99.90, 1, true,  '20231101', '2023-11-01 11:40:22+08');

-- Query with a partition filter. Hologres scans only the matching child table.
SELECT COUNT(*) FROM public.dwd_trade_orders_partitioned WHERE dt = '20231101';

-- Drop a child table to reclaim space in seconds (much faster than DELETE).
-- DROP TABLE public.dwd_trade_orders_20231101;

Example: Real-time analytics for large datasets (fact table)

Scenario: Real-time dashboard summaries with large data volumes, where the core requirement is fast aggregation — for example, calculating Gross Merchandise Volume (GMV) and order counts.

Column-oriented storage excels here: high compression ratios reduce I/O, and column scans read only the columns needed for aggregation.

BEGIN;

CREATE TABLE IF NOT EXISTS public.dwd_order_summary (
    order_id    BIGINT PRIMARY KEY,
    category_id INT NOT NULL,
    gmv         NUMERIC(15, 2),
    order_time  TIMESTAMPTZ NOT NULL
) WITH (
    orientation       = 'column',           -- Best for large-scale aggregation; high compression, efficient column scans
    distribution_key  = 'order_id',         -- Even distribution; enables local joins with other order tables
    event_time_column = 'order_time',       -- Enables file-level segment clipping for time range filters
    clustering_key    = 'order_time:asc'    -- Reduces disk I/O for "last hour" or "specific day" queries
);

COMMENT ON TABLE  public.dwd_order_summary            IS 'Fact table for order summary details.';
COMMENT ON COLUMN public.dwd_order_summary.order_id   IS 'The unique order ID.';
COMMENT ON COLUMN public.dwd_order_summary.category_id IS 'The category ID.';
COMMENT ON COLUMN public.dwd_order_summary.gmv        IS 'The gross merchandise volume.';
COMMENT ON COLUMN public.dwd_order_summary.order_time IS 'The time when the order was placed.';

COMMIT;

Example: High-concurrency point queries (dimension table)

Scenario: Retrieve a user profile by user_id in milliseconds, under high Queries Per Second (QPS).

Row-oriented storage is optimized for primary key-based lookups. All row data is stored contiguously, making point queries extremely fast without scanning unrelated columns.

BEGIN;

CREATE TABLE IF NOT EXISTS public.dim_user_persona (
    user_id       TEXT PRIMARY KEY,
    user_level    INT,
    persona_jsonb JSONB
) WITH (
    orientation = 'row'
    -- For row-oriented tables, the primary key is automatically used as the distribution key
    -- and clustering key. No additional settings are required.
);

COMMENT ON TABLE  public.dim_user_persona              IS 'Dimension table for user profiles.';
COMMENT ON COLUMN public.dim_user_persona.user_id      IS 'The unique user ID.';
COMMENT ON COLUMN public.dim_user_persona.user_level   IS 'The user level.';
COMMENT ON COLUMN public.dim_user_persona.persona_jsonb IS 'The user profile features in JSON format.';

COMMIT;

Example: Hybrid workload (row-column hybrid storage)

Scenario: A logistics and after-sales system that needs both analytics (summarizing logistics statuses) and point queries (retrieving order details by order_id).

Row-column hybrid storage combines the millisecond-level point query performance of row-oriented storage with the efficient aggregation of column-oriented storage.

BEGIN;

CREATE TABLE IF NOT EXISTS public.ads_shipping_info (
    order_id         BIGINT PRIMARY KEY,
    shipping_status  INT,
    receiver_address TEXT,
    update_time      TIMESTAMPTZ
) WITH (
    orientation      = 'row,column',      -- Supports both point queries and aggregation
    distribution_key = 'order_id',        -- Controls data distribution across shards
    bitmap_columns   = 'shipping_status'  -- Accelerates "status = X" filter queries on a low-cardinality column
);

COMMENT ON TABLE  public.ads_shipping_info                  IS 'Application table for logistics status queries.';
COMMENT ON COLUMN public.ads_shipping_info.order_id         IS 'The order ID.';
COMMENT ON COLUMN public.ads_shipping_info.shipping_status  IS 'Logistics status (1: To be shipped, 2: In transit, 3: Delivered).';
COMMENT ON COLUMN public.ads_shipping_info.receiver_address IS 'The shipping address.';

COMMIT;

Limitations

Primary key limits

  • Composite primary keys: Multiple fields can form the primary key. All fields must be NOT NULL and must be declared in a single statement.

    BEGIN;
    CREATE TABLE public.test (
        id TEXT NOT NULL,
        ds TEXT NOT NULL,
        PRIMARY KEY (id, ds)
    );
    CALL SET_TABLE_PROPERTY('public.test', 'orientation', 'column');
    COMMIT;
  • Unsupported types: FLOAT, DOUBLE, NUMERIC, ARRAY, JSON, DATE, and other complex types cannot be used as primary key columns.

  • Not modifiable: The primary key cannot be changed after table creation. Recreate the table if you need a different primary key.

  • Storage requirements: Row-oriented and row-column hybrid tables must have a primary key. Primary keys are optional for column-oriented tables.

Constraint support

ConstraintColumn-levelTable-level
primary keySupportedSupported
not nullSupported
nullSupported
uniqueNot supportedNot supported
checkNot supportedNot supported
defaultSupportedNot supported

Naming and escaping rules

  • Column names cannot start with hg_.

  • Schema names cannot start with holo_, hg_, or pg_.

  • Table names cannot exceed 127 bytes.

  • Enclose names in double quotation marks ("") when they are SQL keywords, reserved words, system fields (such as ctid), case-sensitive identifiers, names with special characters, or names that start with a digit.

Syntax for escaped column names in Hologres V2.0 and later:

-- Single escaped column
BEGIN;
CREATE TABLE tbl (c1 INT NOT NULL);
CALL set_table_property('tbl', 'clustering_key', '"c1":asc');
COMMIT;

-- Multiple columns, including an uppercase one (V2.1 and later)
BEGIN;
CREATE TABLE tbl ("C1" INT NOT NULL, c2 TEXT NOT NULL) WITH (clustering_key = '"C1",c2');
COMMIT;

-- Multiple columns, including an uppercase one (V2.0 and later)
BEGIN;
CREATE TABLE tbl ("C1" INT NOT NULL, c2 TEXT NOT NULL);
CALL set_table_property('tbl', 'clustering_key', '"C1",c2');
COMMIT;

Syntax for escaped column names in versions earlier than Hologres V2.0:

BEGIN;
CREATE TABLE tbl (c1 INT NOT NULL);
CALL set_table_property('tbl', 'clustering_key', '"c1:asc"');
COMMIT;

-- Multiple columns, including an uppercase one
BEGIN;
CREATE TABLE tbl ("C1" INT NOT NULL, c2 TEXT NOT NULL);
CALL set_table_property('tbl', 'clustering_key', '"C1,c2"');
COMMIT;

To switch to the old parser syntax in Hologres V2.0 when needed:

-- Enable the old syntax at the session level.
SET hg_disable_parse_holo_property = on;

-- Enable the old syntax at the database level.
ALTER DATABASE <db_name> SET hg_disable_parse_holo_property = on;

IF NOT EXISTS behavior

ConditionIF NOT EXISTS specifiedIF NOT EXISTS not specified
Table with the same name existsReturns a NOTICE, skips creation, operation succeedsReturns an ERROR
No table with the same name existsOperation succeedsOperation succeeds

Modification limits

After a table is created, the following cannot be changed:

  • Data types (before Hologres V3.0)

  • Column order

  • Nullability constraint (NOT NULL ↔ nullable)

  • Storage layout properties: orientation, distribution_key, clustering_key, event_time_column

The following can be changed after table creation:

  • bitmap_columns and dictionary_encoding_columns — via ALTER TABLE

  • Data types: some types in V3.0 and later; all types via REBUILD in V3.1 and later — see Modify data types and REBUILD

What's next

  • CREATE PARTITION TABLE — manage data lifecycle at scale using partitioned tables

  • ALTER TABLE — modify modifiable properties after table creation

  • Distribution key — learn how to choose the right distribution key

  • Clustering key — understand how clustering keys improve query performance

  • Storage modes — compare column-oriented, row-oriented, and row-column hybrid storage in depth