A warehouse-mode columnar table is the OLAP resource group's built-in analytical table format. It uses columnar storage and an MPP execution engine for both bulk and real-time analytics. Data lives entirely inside the OLAP resource group — separate from lakehouse columnar data (lindorm_columnar) and wide-table engine data (lindorm_table) — and supports a full DDL/DML surface and high-throughput OLAP execution.
After connecting to the OLAP resource group, switch to the warehouse-mode catalog:
SET CATALOG default_catalog;Table types
Pick a table type based on whether the data needs to be updated after it lands. The two types differ in how writes are processed and stored. The table type is fixed at creation and cannot be changed afterward. The engine sorts, processes, and stores incoming data according to the table type.
Duplicate Key Table
A Duplicate Key Table is the default table type. It uses an append-only write model — every row is preserved exactly as written. The engine performs no deduplication or aggregation, so multiple rows with the same sort key are allowed to coexist.
When to use
Append-only detail data that never needs updating — raw logs, behavioral events, instrumentation traces.
A single dataset that needs to be aggregated along many ad-hoc dimensions, with no fixed pre-aggregation pattern.
Full historical snapshots that must be queried at row granularity by arbitrary predicates.
When not to use: if your workload involves frequent updates or deletes (such as order status changes or user profile edits), use a Primary Key Table instead.
DDL
-- Example: user behavior event log
-- Sorted by (event_day, event_type) — speeds up filters on date and event type.
-- Hash-bucketed by user_id — co-locates a user's rows in one bucket, enabling local joins.
CREATE TABLE user_event_log (
event_day DATE NOT NULL COMMENT "Event date",
event_type TINYINT NOT NULL COMMENT "Event type (1=click, 2=impression, 3=conversion)",
user_id BIGINT NOT NULL COMMENT "User ID",
item_id BIGINT COMMENT "Content/item ID",
session_id VARCHAR(64) COMMENT "Session ID",
duration_ms INT COMMENT "Dwell time in milliseconds",
province VARCHAR(32) COMMENT "Province",
channel VARCHAR(32) COMMENT "Source channel"
)
DUPLICATE KEY(event_day, event_type)
PARTITION BY date_trunc('day', event_day)
DISTRIBUTED BY HASH(user_id);Notes
ORDER BYis an equivalent alternative toDUPLICATE KEYfor declaring the sort key. If both are present,DUPLICATE KEYis ignored. When usingORDER BY, place it afterDISTRIBUTED BY.If neither sort key nor bucket key is specified, the engine uses the first three columns as the sort key by default.
Bitmap and Bloom Filter indexes can be created on any column — there is no Key/Value column distinction.
Duplicate Key Tables do not support
UPDATEorDELETE(use a Primary Key Table when you need them).Two identical rows are stored as two independent rows — there is no automatic deduplication.
Storage model
All written rows are physically sorted by the sort key and stored in columnar files. Each record is independent and immutable in place. Background compaction merges small files and reclaims space, but never deduplicates or modifies row contents.
The strengths of this model are high write throughput and query flexibility: writes carry no merge or dedup overhead, making the table ideal for high-frequency append workloads; queries can aggregate freely along any dimension, with no pre-defined aggregation locking the schema.
Designing the sort key
The sort key is declared with DUPLICATE KEY or ORDER BY. It defines both the physical row order and the basis for the shortkey index.
Design principles:
Put the most frequently filtered column first. The shortkey index only matches a contiguous prefix of the sort key, so the leading column should be the one most often used in
WHEREclauses.Prefer low-cardinality columns in the prefix. Low-cardinality columns (date, event type, city) cluster identical values together, which combines well with the ZoneMap index to skip large amounts of I/O.
Avoid high-cardinality columns at the prefix. Columns like
user_idordevice_idare too dispersed for the shortkey index to be useful — use them as Bloom Filter columns or bucket keys instead.Keep the sort key to three columns or fewer. The shortkey index stops extending after 36 bytes, so additional columns deliver diminishing returns.
Primary Key Table
A Primary Key Table enforces a unique, non-null constraint on the primary key columns. It is purpose-built for workloads that mix real-time updates with high-throughput ad-hoc queries.
When to use
Synchronizing full CDC streams (inserts/updates/deletes) from transactional databases like MySQL or PostgreSQL via Flink CDC and similar tools.
High-frequency UPSERT workloads — order-status tracking, real-time user-profile updates, logistics waybill state sync.
Multiple data streams writing to one wide table, each updating its own subset of columns (partial column update).
Concurrent complex analytical queries that must see the latest snapshot.
DDL
-- Example: real-time analytical table for e-commerce orders
-- Primary key columns (order_id, dt) lead the column list.
-- Partition column dt and bucket column order_id are both included in the primary key.
-- ORDER BY (dt, merchant_id) speeds up range scans by date and merchant.
CREATE TABLE orders (
order_id BIGINT NOT NULL COMMENT "Order ID",
dt DATE NOT NULL COMMENT "Order date",
merchant_id INT NOT NULL COMMENT "Merchant ID",
user_id BIGINT NOT NULL COMMENT "User ID",
good_id INT NOT NULL COMMENT "Item ID",
good_name VARCHAR(128) NOT NULL COMMENT "Item name",
total_amount DECIMAL(18, 2) NOT NULL COMMENT "Order amount",
payment_amount DECIMAL(18, 2) COMMENT "Paid amount",
status TINYINT NOT NULL COMMENT "Order status",
create_time DATETIME NOT NULL COMMENT "Created at",
update_time DATETIME COMMENT "Last updated at"
)
PRIMARY KEY (order_id, dt)
PARTITION BY date_trunc('day', dt)
DISTRIBUTED BY HASH(order_id)
ORDER BY (dt, merchant_id);Notes
Primary key columns are unique and non-null.
DECIMALis not allowed in the primary key. The total primary key length must not exceed 128 bytes.Primary key columns must lead the column list with no other columns interleaved. Place all primary key columns at the very start of the
CREATE TABLEcolumn definitions; mixing non-primary-key columns in between causes the DDL to fail.When using partitioning and hash bucketing, the primary key must contain the partition columns and the bucket columns so the engine can locate the exact partition and bucket to update.
Primary key values are immutable.
UPDATEonly rewrites non-primary-key columns. To change a primary key value,DELETEthe row, thenINSERTthe new one.The sort key (
ORDER BY) is fully decoupled from the primary key — it can be any combination of columns. The sort key can be modified after table creation viaALTER TABLE tbl ORDER BY ..., but you cannot drop it or change the data type of a sort column.Primary Key Tables support the full DML surface:
INSERT,UPDATE(filter by either key or value columns), andDELETE(filter by either key or value columns).Primary Key Tables only support hash bucketing. Random bucketing is not supported.
Write model: Delete+Insert
Primary Key Tables use a Delete+Insert write strategy, in contrast to the traditional Merge-On-Read model:
On write: The engine looks up the existing row through the primary key index, marks it as deleted, writes the new row, and updates the index.
On query: Data is already in its final state — no online multi-version merge is needed; the reader just streams the live rows.
As a result, Primary Key Tables deliver markedly better query latency than Merge-On-Read engines, especially under heavy update traffic where read amplification would otherwise dominate.
Primary key design principles
The primary key should uniquely identify a business entity. Pick a column or set of columns that pins down a single record across the whole table — for example,
order_idor(user_id, event_id).The primary key must include the partition and bucket columns. When using
PARTITION BYandDISTRIBUTED BY HASH, the primary key must cover both — otherwise the engine cannot locate the exact partition and bucket for an update.Keep the primary key compact. Total length must be at most 128 bytes; aim for three columns or fewer. Prefer compact types like
INTandBIGINToverVARCHAR— too many primary key columns inflate index storage and maintenance cost.Primary key columns cannot use
DECIMAL. UseBIGINT,VARCHAR, or other supported types instead.Primary key columns must be NOT NULL. Every primary key column has to be declared
NOT NULL.
Sort key decoupled from primary key
PRIMARY KEY defines uniqueness; ORDER BY defines the physical row order. The two are fully decoupled and can be designed independently.
The sort key can be any combination of columns and does not have to include primary key columns — choose it based on the columns most frequently used in filters.
If you omit
ORDER BY, the primary key columns are used as the sort key by default.The sort key can be modified after table creation:
ALTER TABLE tbl ORDER BY (col1, col2, ...). Dropping the sort key is not supported, and you cannot change the data type of a sort column.
Partial column update
Primary Key Tables support partial column update: each writer carries only the columns it owns, and the engine merges them by primary key. This is the foundation for filling a wide table from multiple data streams in real time.
-- Update only the status and update_time columns; other columns keep their current values.
INSERT INTO orders (order_id, status, update_time)
VALUES (10001, 3, '2024-03-01 10:30:00');
-- Columns not listed (total_amount, user_id, ...) remain unchanged.Data distribution
The distribution strategy determines how data is physically laid out across the cluster. A well-designed partition and bucket scheme lets the engine prune irrelevant data at query time and spreads work evenly across compute resources.
Warehouse-mode columnar tables use a two-level layout:
Table
└── Partition ← Logical division — used for data lifecycle and query pruning.
└── Bucket ← Physical division — used for even distribution and parallel scan.
Partition
A partition splits the table into independent management units along the partition key. When a query filters on the partition key, the engine prunes irrelevant partitions and scans only the targets, drastically reducing I/O. Partitions are also the unit of lifecycle management — DROP PARTITION removes historical data far faster than DELETE.
Expression partition (recommended)
Defined with a function expression like date_trunc(). The engine creates partitions on the fly based on the time field in incoming data, so you don't have to predefine a partition list. This is the recommended choice for time-series workloads.
-- Daily partitions, automatically created
PARTITION BY date_trunc('day', event_time)
-- Monthly partitions, automatically created
PARTITION BY date_trunc('month', order_date)Range partition
Splits data into contiguous numeric or date ranges. Suitable for time-series data managed by day, week, or month.
PARTITION BY RANGE(event_day) (
PARTITION p202401 VALUES LESS THAN ("2024-02-01"),
PARTITION p202402 VALUES LESS THAN ("2024-03-01"),
PARTITION p202403 VALUES LESS THAN ("2024-04-01")
)List partition
Splits data by an enumerated set of column values. Suitable for managing data by region, status, or other discrete categories.
PARTITION BY LIST(region) (
PARTITION p_east VALUES IN ("shanghai", "jiangsu", "zhejiang"),
PARTITION p_north VALUES IN ("beijing", "tianjin", "hebei"),
PARTITION p_south VALUES IN ("guangdong", "fujian", "hainan")
)Note Partition keys must be a date type (DATE,DATETIME) or an integer type (INT,BIGINT, etc.).
Bucket
Within each partition, buckets distribute data evenly across physical storage units (Bucket / Tablet) so that scans can run in parallel.
Hash bucketing
Rows are routed by the hash of the bucket key, so rows with the same bucket key land in the same bucket. This accelerates equality predicates and same-key joins (a colocated join needs zero network shuffle). Choose a bucket key that is high-cardinality and frequently used in equality filters or join conditions to avoid skew.
-- Engine picks the bucket count automatically
DISTRIBUTED BY HASH(user_id)
-- Specify the bucket count manually
DISTRIBUTED BY HASH(order_id) BUCKETS 32Random bucketing
Rows are randomly assigned to buckets without specifying a bucket key. Writes are evenly distributed, so this option is well suited to detail-data ingest with no fixed query pattern. Duplicate Key Tables default to random bucketing; Primary Key Tables do not support it.
DISTRIBUTED BY RANDOMBest practice Aim for 100 MB to 1 GB of data per bucket. Buckets that are too large reduce parallelism; too many buckets inflate metadata and scheduling overhead. Likewise, avoid over-fine partitions (such as hourly partitions) — large numbers of small partitions add significant metadata pressure.
Combined example
-- Top-up detail table: daily expression partition, hash-bucketed by user_id
CREATE TABLE recharge_detail (
id BIGINT NOT NULL COMMENT "Sequence ID",
user_id BIGINT NOT NULL COMMENT "User ID",
recharge_money DECIMAL(18, 2) NOT NULL COMMENT "Top-up amount",
city VARCHAR(32) NOT NULL COMMENT "City",
dt DATE NOT NULL COMMENT "Top-up date"
)
DUPLICATE KEY(id)
PARTITION BY date_trunc('day', dt)
DISTRIBUTED BY HASH(user_id);Data types
Numeric types
Type | Bytes | Range |
BOOLEAN | 1 |
|
TINYINT | 1 | -128 to 127 |
SMALLINT | 2 | -32,768 to 32,767 |
INT | 4 | -2,147,483,648 to 2,147,483,647 |
BIGINT | 8 | -2⁶³ to 2⁶³-1 |
LARGEINT | 16 | -2¹²⁷ to 2¹²⁷-1, for very large integers (such as 128-bit device IDs) |
FLOAT | 4 | Single-precision floating point, approximate |
DOUBLE | 8 | Double-precision floating point, approximate |
DECIMAL(M, D) | Variable | Exact decimal — M is total digits (max 38), D is digits after the decimal point. Recommended for financial and transactional data. |
Note Numeric types do not support the UNSIGNED modifier.String types
Type | Max length | Description |
CHAR(N) | 255 bytes | Fixed-length string, padded with spaces to length N. |
VARCHAR(N) | 1,048,576 bytes | Variable-length string, stored at the actual length. |
STRING | Same as VARCHAR | Equivalent to a large-capacity VARCHAR. Suitable for unstructured text. |
Date and time types
Type | Description |
DATE | Date in |
DATETIME | Date-time in |
OLAP-specific types
Type | Description |
BITMAP | Bitmap type. Combined with |
HLL | HyperLogLog type. Combined with |
BITMAP and HLL columns cannot be used as sort key columns or primary key columns.
Semi-structured types
Type | Description |
JSON | Flexible JSON document. Path queries are supported through |
ARRAY<T> | Array of homogeneous elements. Supports array functions such as |
MAP<K, V> | Key-value map. Suitable for storing dynamic attributes, tags, or other data with a non-fixed schema. |
STRUCT<field: type> | Nested struct accessed by field name. Suitable for modeling deeply nested data. |
Compute features
SQL analytical operators
Warehouse-mode columnar tables support standard SQL analytical syntax, covering the full range of OLAP query patterns.
Projection and filter
SELECT projects any subset of columns. Columns can be aliased with AS and used in arithmetic expressions or function calls. WHERE accepts the standard set of predicate combinations:
Predicate | Example |
Equality |
|
Range |
|
Set membership |
|
Pattern match |
|
NULL check |
|
Regular expression |
|
Whenever possible, predicates are pushed down to the storage layer so that fewer rows are read from disk.
JOIN
Supported join types:
Join type | Description |
| Returns rows that match in both tables. |
| Keeps all rows from the left table; fills NULL when no match exists on the right. |
| Keeps all rows from the right table; fills NULL when no match exists on the left. |
| Keeps all rows from both tables. |
| Cartesian product. |
| Returns left-table rows that have a match on the right; right-table columns are not returned. |
| Returns left-table rows that have no match on the right. |
The optimizer chooses the join strategy automatically based on table sizes and data distribution: broadcast a small table to every node (Broadcast Join), reshuffle data on the join key (Shuffle Join), or — when the bucket key matches the join key — run a colocated join with no network shuffle.
-- Example: join order details with user profiles
SELECT
o.order_id,
o.total_amount,
u.user_level,
u.city
FROM orders o
INNER JOIN user_profile u ON o.user_id = u.user_id
WHERE o.dt >= '2024-01-01'
AND u.user_level IN ('gold', 'platinum');Set operations
Operator | Description |
| Combines results from multiple queries while keeping duplicates. Faster than |
| Combines results from multiple queries and deduplicates them. |
| Returns the intersection of two result sets. |
| Returns rows in the first result set that do not appear in the second. |
Aggregation (GROUP BY)
Standard aggregate functions are supported: COUNT, SUM, AVG, MIN, MAX, COUNT DISTINCT. Use HAVING to filter on aggregated results.
The following extended grouping syntax is also supported. Each form computes multiple aggregations in a single scan, so it outperforms running the queries separately:
-- GROUPING SETS: explicitly enumerate the dimension combinations
SELECT city, dt, SUM(amount) AS total
FROM orders
GROUP BY GROUPING SETS ((city, dt), (city), ());
-- ROLLUP: hierarchical totals (from finest grain to grand total)
-- Equivalent to GROUPING SETS ((city, dt), (city), ())
SELECT city, dt, SUM(amount) AS total
FROM orders
GROUP BY ROLLUP (city, dt);
-- CUBE: every combination of the dimensions
-- Equivalent to GROUPING SETS ((city, dt), (city), (dt), ())
SELECT city, dt, SUM(amount) AS total
FROM orders
GROUP BY CUBE (city, dt);Sort and pagination
ORDER BY sorts on multiple columns (ASC or DESC); pair it with LIMIT and OFFSET for paginated reads.
SELECT order_id, total_amount
FROM orders
WHERE dt = '2024-03-01'
ORDER BY total_amount DESC, order_id ASC
LIMIT 20 OFFSET 40;Subquery
The optimizer rewrites convertible subqueries into equivalent joins. Supported subquery forms:
Subquery type | Example |
Scalar subquery |
|
|
|
|
|
Correlated subquery | A subquery that references columns from the outer query and is evaluated row by row. |
Window functions
Window functions compute aggregates over a row group while preserving the original rows — a core capability for OLAP workloads. Define a window with OVER (PARTITION BY ... ORDER BY ... ROWS/RANGE ...).
Category | Common functions |
Ranking |
|
Aggregation |
|
Offset access |
|
First/last |
|
-- Example: cumulative spend per user and daily ranking
SELECT
user_id,
order_id,
dt,
total_amount,
SUM(total_amount) OVER (
PARTITION BY user_id
ORDER BY dt
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_amount,
RANK() OVER (
PARTITION BY dt
ORDER BY total_amount DESC
) AS daily_rank
FROM orders;
-- Example: days between consecutive purchases for each user
SELECT
user_id,
order_id,
dt,
DATEDIFF(dt, LAG(dt, 1) OVER (PARTITION BY user_id ORDER BY dt)) AS days_since_last_order
FROM orders;
CASE WHEN
Both searched and simple CASE forms are supported. CASE WHEN can appear in SELECT, WHERE, ORDER BY, aggregate function arguments, and elsewhere.
SELECT
order_id,
total_amount,
CASE
WHEN total_amount >= 10000 THEN 'large'
WHEN total_amount >= 1000 THEN 'medium'
WHEN total_amount >= 100 THEN 'small'
ELSE 'micro'
END AS amount_tier,
SUM(CASE WHEN status = 1 THEN total_amount ELSE 0 END) AS paid_amount
FROM orders
GROUP BY order_id, total_amount;Indexes
Warehouse-mode columnar tables ship with a multi-layer index system. Each index targets a different query pattern, and they work together to skip data that the query does not need.
Shortkey index
Built automatically. The engine takes the prefix of the sort key (up to 36 bytes and three columns) every 1024 rows and stores it as a sparse index entry. When a query filters on a prefix of the sort key, a binary search locates the target data block — no manual index creation is needed.
Triggering condition: the filter must form a prefix of the sort key.
Sort key: (event_day, site_id, city_code)
✅ WHERE event_day = '2024-01-01' -- Hits the shortkey index
✅ WHERE event_day = '2024-01-01' AND site_id = 10 -- Hits the shortkey index
❌ WHERE city_code = 'BJ' -- Not a prefix; cannot use the index
Place the most frequently filtered column at the start of the sort key so it is covered first by the shortkey index.
ZoneMap index
Built automatically. For every column on every data block (Page, 64 KB by default), the engine maintains the minimum, maximum, and NULL existence statistics. During a range query, the engine compares the predicate against each Page's Min/Max and skips Pages that cannot match — pruning at the data-block level.
When it helps: range filters (BETWEEN, >, <, >=, <=) on ordered types such as numerics and dates.
Bloom Filter index
A Bloom Filter–based skip index, built per data block (Page). On write, the engine hashes column values from each Page into the filter; on query, the engine hashes the predicate value and skips Pages whose filter does not contain it, reducing I/O.
When to use: equality predicates (=, IN) on high-cardinality columns such as user IDs, order IDs, or device IDs. These columns have too many distinct values for Bitmap to be cost-effective.
Specify the index through PROPERTIES's bloom_filter_columns. After table creation, you can add or change it with ALTER TABLE.
-- Define at table creation
CREATE TABLE orders (
order_id BIGINT NOT NULL,
user_id BIGINT NOT NULL,
merchant_id INT NOT NULL,
dt DATE NOT NULL,
amount DECIMAL(18, 2)
)
PRIMARY KEY(order_id)
DISTRIBUTED BY HASH(order_id)
PROPERTIES (
"bloom_filter_columns" = "user_id, merchant_id"
);
-- Add or change after creation
ALTER TABLE orders SET ("bloom_filter_columns" = "user_id, merchant_id, dt");
-- Drop the Bloom Filter index
ALTER TABLE orders SET ("bloom_filter_columns" = "");Notes:
TINYINT,FLOAT,DOUBLE, andDECIMALcolumns are not supported.Only equality predicates benefit. Range predicates (
>,<,BETWEEN) cannot use it.Not effective on low-cardinality columns — use the Bitmap index there instead.
Bitmap index
Builds a bitmap per distinct value (each bit corresponds to a row, 1 means the row contains that value). Multi-condition queries are evaluated by AND/OR/NOT operations on the bitmaps, very efficiently producing the result set. The internal storage is Roaring Bitmap, which can save up to 90% of the space compared with a classical bitmap.
When to use: equality predicates on medium-cardinality columns (10,000–100,000 distinct values is a good fit), and multi-column combination filters — for example, segmenting a user-profile table by age band, customer tier, and city in a single query.
Define a Bitmap index inside INDEX at table creation, or add it later with CREATE INDEX (an asynchronous Schema Change operation).
-- Create at table creation
CREATE TABLE user_profile (
user_id BIGINT NOT NULL COMMENT "User ID",
age_range TINYINT COMMENT "Age band (1-5)",
gender TINYINT COMMENT "Gender (0=female, 1=male)",
city_code VARCHAR(20) COMMENT "City code",
user_level TINYINT COMMENT "Customer tier (1-5)",
register_day DATE COMMENT "Registration date",
INDEX idx_age_range (age_range) USING BITMAP COMMENT "Age-band index",
INDEX idx_city_code (city_code) USING BITMAP COMMENT "City index",
INDEX idx_user_level (user_level) USING BITMAP COMMENT "Tier index"
)
DUPLICATE KEY(user_id)
DISTRIBUTED BY HASH(user_id);
-- Add after creation
CREATE INDEX idx_gender ON user_profile (gender) USING BITMAP;
-- Inspect indexes
SHOW INDEX FROM user_profile;
-- Drop an index
DROP INDEX idx_gender ON user_profile;Notes:
FLOAT,DOUBLE,BOOLEAN, andDECIMALcolumns are not supported.For very high cardinalities (more than 1 million distinct values), the bitmap storage inflates — switch to the Bloom Filter index.
If the index filters out only a small fraction of rows (for example, more than 50% of rows still match), the index hurts more than it helps — skip it.
Choosing an index
Query pattern | Recommended index |
Equality or range predicate on the sort-key prefix | Shortkey index (automatic — nothing to create) |
Range predicate on numeric or date columns | ZoneMap index (automatic — nothing to create) |
Equality predicate on medium-cardinality columns; multi-column combination filters | Bitmap index |
Equality predicate on high-cardinality (ID-like) columns | Bloom Filter index |