ApsaraDB for ClickHouse supports four families of table engines: MergeTree, Log, Integrations, and Special. Each family is optimized for a different set of workloads. This topic describes what each engine does and walks through examples for the most commonly used engines.
Engine families at a glance
| Family | Best for | Engines |
|---|---|---|
| MergeTree | High-throughput inserts with background processing, partitioning, and replication | MergeTree, ReplacingMergeTree, CollapsingMergeTree, VersionedCollapsingMergeTree, SummingMergeTree, AggregatingMergeTree, GraphiteMergeTree (the family also supports approximate nearest neighbor search indexes and full-text search using inverted indexes) |
| Log | Fast writes to small tables (~1 million rows) with full-table reads | TinyLog, StripeLog, Log |
| Integrations | Importing or querying external data sources | Kafka, MySQL, JDBC, ODBC, HDFS |
| Special | Specific architectural needs (distributed queries, materialized views, in-memory tables) | Distributed, MaterializedView, Dictionary, Merge, File, NULL, Set, Join, URL, View, Memory, Buffer |
For a complete reference of all supported table engines, see Table Engines.
MergeTree family
Engines in the MergeTree family are the primary choice for production workloads. They support high-speed inserts, data partitioning, primary key indexes, sparse indexes, data sampling, data replication, and time to live (TTL) for data.
All MergeTree engines share the same storage structure: data is written in parts and merged in the background, similar to a log-structured merge-tree (LSM tree). Storage-layer processing — including deduplication, collapsing, and aggregation — happens during compaction, not at insert time.
MergeTree in ApsaraDB for ClickHouse supports all ClickHouse SQL syntax, but primary keys work differently than in standard SQL. The primary key accelerates queries rather than enforcing uniqueness. Duplicate primary keys can exist even after compaction.
MergeTree
MergeTree is the base engine for high-load insert workloads. Data is inserted in parts and merged in the background based on the sort order defined by ORDER BY.
Example: MergeTree with partitioning
Create a table partitioned by `create_time` and sorted on `(id, create_time)`.

```sql
CREATE TABLE test_tbl ON CLUSTER default
(
    id UInt16,
    create_time Date,
    comment Nullable(String)
)
ENGINE = MergeTree()
PARTITION BY create_time
ORDER BY (id, create_time)
PRIMARY KEY (id, create_time)
SETTINGS index_granularity = 8192;
```

Insert rows with duplicate primary keys.

```sql
INSERT INTO test_tbl VALUES (1, '2019-12-13', null);
INSERT INTO test_tbl VALUES (1, '2019-12-13', null);
INSERT INTO test_tbl VALUES (2, '2019-12-14', null);
INSERT INTO test_tbl VALUES (3, '2019-12-15', null);
INSERT INTO test_tbl VALUES (3, '2019-12-15', null);
```

Query the table.

```sql
SELECT * FROM test_tbl;
```

Result — all five rows are returned, including duplicates:

```
┌─id─┬─create_time─┬─comment─┐
│  1 │  2019-12-13 │ NULL    │
│  1 │  2019-12-13 │ NULL    │
│  2 │  2019-12-14 │ NULL    │
│  3 │  2019-12-15 │ NULL    │
│  3 │  2019-12-15 │ NULL    │
└────┴─────────────┴─────────┘
```

Force compaction.

```sql
OPTIMIZE TABLE test_tbl FINAL;
```

Query again.

```sql
SELECT * FROM test_tbl;
```

Result — the duplicates are still present. MergeTree does not remove them.

```
┌─id─┬─create_time─┬─comment─┐
│  1 │  2019-12-13 │ NULL    │
│  1 │  2019-12-13 │ NULL    │
│  2 │  2019-12-14 │ NULL    │
│  3 │  2019-12-15 │ NULL    │
│  3 │  2019-12-15 │ NULL    │
└────┴─────────────┴─────────┘
```
For more information, see MergeTree.
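The family-level features listed earlier include time to live (TTL). As a minimal sketch — the table name, column names, and intervals below are hypothetical, not part of the examples in this topic — a MergeTree table can expire individual columns or whole rows:

```sql
-- Hypothetical table: names and intervals are illustrative only.
CREATE TABLE events_with_ttl
(
    id UInt64,
    event_time DateTime,
    payload String TTL event_time + INTERVAL 7 DAY   -- column-level TTL: payload is cleared after 7 days
)
ENGINE = MergeTree()
ORDER BY (id, event_time)
TTL event_time + INTERVAL 30 DAY;                    -- table-level TTL: rows expire after 30 days
```

Expired data is removed during background merges, so deletion is eventual, not immediate.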
ReplacingMergeTree
ReplacingMergeTree removes duplicate rows with the same primary key during background compaction. Use it when eventual deduplication is acceptable — it does not guarantee uniqueness at query time.
Limitations
- In a distributed cluster, rows with the same primary key may land on different shards. Deduplication happens only within a single shard, not across shards.
- Before compaction completes, some duplicates may still be present. Compaction runs in the background on an unpredictable schedule.
- On large datasets, manually running `OPTIMIZE ... FINAL` can take a long time, which makes real-time deduplication impractical.
For more information, see ReplacingMergeTree.
Example: ReplacingMergeTree deduplication
Create a table.

```sql
CREATE TABLE test_tbl_replacing
(
    id UInt16,
    create_time Date,
    comment Nullable(String)
)
ENGINE = ReplacingMergeTree()
PARTITION BY create_time
ORDER BY (id, create_time)
PRIMARY KEY (id, create_time)
SETTINGS index_granularity = 8192;
```

Insert rows with duplicate primary keys.

```sql
INSERT INTO test_tbl_replacing VALUES (1, '2019-12-13', null);
INSERT INTO test_tbl_replacing VALUES (1, '2019-12-13', null);
INSERT INTO test_tbl_replacing VALUES (2, '2019-12-14', null);
INSERT INTO test_tbl_replacing VALUES (3, '2019-12-15', null);
INSERT INTO test_tbl_replacing VALUES (3, '2019-12-15', null);
```

Query before compaction — duplicates are still present.

```sql
SELECT * FROM test_tbl_replacing;
```

```
┌─id─┬─create_time─┬─comment─┐
│  1 │  2019-12-13 │ NULL    │
│  1 │  2019-12-13 │ NULL    │
│  2 │  2019-12-14 │ NULL    │
│  3 │  2019-12-15 │ NULL    │
│  3 │  2019-12-15 │ NULL    │
└────┴─────────────┴─────────┘
```

Force compaction.

```sql
OPTIMIZE TABLE test_tbl_replacing FINAL;
```

Query again — duplicates removed.

```sql
SELECT * FROM test_tbl_replacing;
```

```
┌─id─┬─create_time─┬─comment─┐
│  1 │  2019-12-13 │ NULL    │
│  2 │  2019-12-14 │ NULL    │
│  3 │  2019-12-15 │ NULL    │
└────┴─────────────┴─────────┘
```
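If you need deduplicated results before background compaction runs, one workaround is the `FINAL` query modifier, which merges data parts at query time. A sketch against the table created above — note that `FINAL` trades query performance for correctness:

```sql
-- Merges parts at query time, so duplicates are collapsed in the result
-- even before background compaction has run.
-- Slower than a plain SELECT; avoid it on latency-sensitive paths.
SELECT * FROM test_tbl_replacing FINAL;
```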
CollapsingMergeTree
CollapsingMergeTree tracks row state changes using a Sign column instead of updating rows in place — updates are expensive in ClickHouse's append-only storage model. Each row is marked as either a state row (Sign = 1) or a cancel row (Sign = -1). During compaction, pairs of state and cancel rows with the same primary key are collapsed and removed.
How collapsing works
To update a row's state:

- Insert a cancel row with `Sign = -1` and the same primary key values as the row you want to replace.
- Insert a new state row with `Sign = 1` and the updated values.

To delete a row's state, insert a cancel row that matches the original state row in all columns except `Sign`.
Usage notes
If state and cancel rows are inserted out of order — for example, if a cancel row is inserted and then its matching state row arrives separately — they may not be collapsed correctly. Consider using VersionedCollapsingMergeTree to handle out-of-order inserts.
Collapsing only happens within a single shard. In a distributed cluster, rows with the same primary key on different nodes are not collapsed.
Before compaction completes, state and cancel rows coexist. When running aggregate queries before compaction, adjust your SQL:

- Replace `COUNT()` with `SUM(Sign)`.
- Replace `SUM(col)` with `SUM(col * Sign)`.
For more information, see CollapsingMergeTree.
Example: CollapsingMergeTree state updates
Create a table with a `Sign` column.

```sql
CREATE TABLE test_tbl_collapsing
(
    UserID UInt64,
    PageViews UInt8,
    Duration UInt8,
    Sign Int8
)
ENGINE = CollapsingMergeTree(Sign)
ORDER BY UserID;
```

Insert the initial state row.

```sql
INSERT INTO test_tbl_collapsing VALUES (4324182021466249494, 5, 146, 1);
```

Update the state: insert a cancel row for the old state and a new state row.

Insert the cancel row and the new state row in the same batch. If you insert the cancel row and then the state row in separate operations, they may arrive out of order. In that case, even if compaction is forced, rows with the same primary key cannot be collapsed or deleted.

```sql
INSERT INTO test_tbl_collapsing VALUES
    (4324182021466249494, 5, 146, -1), -- cancel old state
    (4324182021466249494, 6, 185, 1);  -- new state
```

Query before compaction — all three rows are present.

```sql
SELECT * FROM test_tbl_collapsing;
```

```
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │         5 │      146 │    1 │
│ 4324182021466249494 │         5 │      146 │   -1 │
│ 4324182021466249494 │         6 │      185 │    1 │
└─────────────────────┴───────────┴──────────┴──────┘
```

To get correct aggregate results before compaction, use `Sign`-aware expressions:

```sql
SELECT
    UserID,
    SUM(PageViews * Sign) AS PageViews,
    SUM(Duration * Sign) AS Duration
FROM test_tbl_collapsing
GROUP BY UserID
HAVING SUM(Sign) > 0;
```

```
┌──────────────UserID─┬─PageViews─┬─Duration─┐
│ 4324182021466249494 │         6 │      185 │
└─────────────────────┴───────────┴──────────┘
```

Force compaction.

```sql
OPTIMIZE TABLE test_tbl_collapsing FINAL;
```

Query again — only the latest state row remains.

```sql
SELECT * FROM test_tbl_collapsing;
```

```
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │         6 │      185 │    1 │
└─────────────────────┴───────────┴──────────┴──────┘
```
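The deletion pattern described under "How collapsing works" can be sketched against the same table: to remove the remaining state entirely rather than replace it, insert a cancel row that mirrors the current state row with `Sign = -1`.

```sql
-- Cancels the current state row (4324182021466249494, 6, 185, 1);
-- after compaction, the state/cancel pair is removed entirely.
INSERT INTO test_tbl_collapsing VALUES (4324182021466249494, 6, 185, -1);
OPTIMIZE TABLE test_tbl_collapsing FINAL;
```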
VersionedCollapsingMergeTree
VersionedCollapsingMergeTree extends CollapsingMergeTree with a Version column that lets the engine correctly match state and cancel rows regardless of insertion order. During compaction, rows with the same primary key, the same Version value, and opposite Sign values are collapsed and removed.
Use this engine when rows may arrive out of order — for example, in event-driven pipelines where cancel events can be processed before the state events they reference.
As with CollapsingMergeTree, adjust aggregate queries to use SUM(Sign) instead of COUNT() and SUM(col * Sign) instead of SUM(col).
For more information, see VersionedCollapsingMergeTree.
Example: VersionedCollapsingMergeTree with out-of-order inserts
Create a table with `Sign` and `Version` columns.

```sql
CREATE TABLE test_tbl_Versioned
(
    UserID UInt64,
    PageViews UInt8,
    Duration UInt8,
    Sign Int8,
    Version UInt8
)
ENGINE = VersionedCollapsingMergeTree(Sign, Version)
ORDER BY UserID;
```

Insert a cancel row first (out of order).

```sql
INSERT INTO test_tbl_Versioned VALUES (4324182021466249494, 5, 146, -1, 1);
```

Insert the matching state row and a new state row.

```sql
INSERT INTO test_tbl_Versioned VALUES
    (4324182021466249494, 5, 146, 1, 1), -- state row matching the cancel above
    (4324182021466249494, 6, 185, 1, 2); -- new state
```

Query before compaction.

```sql
SELECT * FROM test_tbl_Versioned;
```

```
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┬─Version─┐
│ 4324182021466249494 │         5 │      146 │   -1 │       1 │
│ 4324182021466249494 │         5 │      146 │    1 │       1 │
│ 4324182021466249494 │         6 │      185 │    1 │       2 │
└─────────────────────┴───────────┴──────────┴──────┴─────────┘
```

Use `Sign`-aware aggregation to get correct results before compaction:

```sql
SELECT
    UserID,
    SUM(PageViews * Sign) AS PageViews,
    SUM(Duration * Sign) AS Duration
FROM test_tbl_Versioned
GROUP BY UserID
HAVING SUM(Sign) > 0;
```

```
┌──────────────UserID─┬─PageViews─┬─Duration─┐
│ 4324182021466249494 │         6 │      185 │
└─────────────────────┴───────────┴──────────┘
```

Force compaction.

```sql
OPTIMIZE TABLE test_tbl_Versioned FINAL;
```

Query again — the `Version` column correctly matched the cancel and state rows despite out-of-order insertion.

```sql
SELECT * FROM test_tbl_Versioned;
```

```
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┬─Version─┐
│ 4324182021466249494 │         6 │      185 │    1 │       2 │
└─────────────────────┴───────────┴──────────┴──────┴─────────┘
```
SummingMergeTree
We recommend using SummingMergeTree together with MergeTree: store the raw detail data in a MergeTree table and keep pre-aggregated results in a SummingMergeTree table. This way, the detail data is preserved even if the aggregation key turns out to be composed incorrectly.
SummingMergeTree pre-aggregates rows with the same primary key into a single row by summing numeric columns. This reduces storage and speeds up aggregate queries.
Usage notes
- Pre-aggregation happens during background compaction, not at insert time. Some data may not yet be aggregated, so always include a `GROUP BY` clause with `SUM()` in your queries.
- Only numeric columns are summed. String and other non-numeric columns that are not part of the primary key get an arbitrary value after aggregation.
For more information, see SummingMergeTree.
Example: SummingMergeTree pre-aggregation
Create a table.

```sql
CREATE TABLE test_tbl_summing
(
    key UInt32,
    value UInt32
)
ENGINE = SummingMergeTree()
ORDER BY key;
```

Insert rows — some share the same key.

```sql
INSERT INTO test_tbl_summing VALUES (1, 1), (1, 2), (2, 1);
```

Query before compaction — the rows are not yet aggregated.

```sql
SELECT * FROM test_tbl_summing;
```

```
┌─key─┬─value─┐
│   1 │     1 │
│   1 │     2 │
│   2 │     1 │
└─────┴───────┘
```

Force compaction.

```sql
OPTIMIZE TABLE test_tbl_summing FINAL;
```

Query with `GROUP BY` and `SUM()` to get correct results.

```sql
SELECT key, SUM(value) FROM test_tbl_summing GROUP BY key;
```

```
┌─key─┬─value─┐
│   1 │     3 │
│   2 │     1 │
└─────┴───────┘
```
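The recommended pattern from the usage notes — raw detail in MergeTree, pre-aggregated results in SummingMergeTree — can be sketched with a materialized view. The table and view names below are hypothetical, not part of the example above:

```sql
-- Raw detail data is preserved in a MergeTree table.
CREATE TABLE detail_raw
(
    key UInt32,
    value UInt32
)
ENGINE = MergeTree()
ORDER BY key;

-- The materialized view maintains pre-aggregated sums per key;
-- inserts into detail_raw flow into it automatically.
CREATE MATERIALIZED VIEW summing_mv
ENGINE = SummingMergeTree()
ORDER BY key
AS SELECT key, value FROM detail_raw;
```

If the aggregation key later proves wrong, the view can be dropped and rebuilt from `detail_raw` without losing data.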
AggregatingMergeTree
AggregatingMergeTree is a more flexible pre-aggregation engine that supports any aggregate function, not just SUM. It uses the special AggregateFunction data type and a two-phase query pattern:
- Write phase: use `-State` suffix functions (for example, `sumState`, `uniqState`) to store intermediate aggregate states.
- Query phase: use `-Merge` suffix functions (for example, `sumMerge`, `uniqMerge`) to finalize the aggregation.
AggregatingMergeTree is used in two ways: with a materialized view (which automates the write phase), or directly with AggregateFunction columns.
For more information, see AggregatingMergeTree.
Example 1: AggregatingMergeTree with a materialized view
Create a detail table.

```sql
CREATE TABLE visits
(
    UserID UInt64,
    CounterID UInt8,
    StartDate Date,
    Sign Int8
)
ENGINE = CollapsingMergeTree(Sign)
ORDER BY UserID;
```

Create a materialized view that pre-aggregates the detail table using `-State` functions.

```sql
CREATE MATERIALIZED VIEW visits_agg_view
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(StartDate)
ORDER BY (CounterID, StartDate)
AS SELECT
    CounterID,
    StartDate,
    sumState(Sign) AS Visits,
    uniqState(UserID) AS Users
FROM visits
GROUP BY CounterID, StartDate;
```

Insert data into the detail table. The materialized view is updated automatically.

```sql
INSERT INTO visits VALUES (0, 0, '2019-11-11', 1);
INSERT INTO visits VALUES (1, 1, '2019-11-12', 1);
```

Query the materialized view using `-Merge` functions.

Use `sumMerge` and `uniqMerge` — the plain `sum` and `uniq` functions cannot process `AggregateFunction` columns and will return an error.

```sql
SELECT
    StartDate,
    sumMerge(Visits) AS Visits,
    uniqMerge(Users) AS Users
FROM visits_agg_view
GROUP BY StartDate
ORDER BY StartDate;
```

```
┌──StartDate─┬─Visits─┬─Users─┐
│ 2019-11-11 │      1 │     1 │
│ 2019-11-12 │      1 │     1 │
└────────────┴────────┴───────┘
```
Example 2: AggregatingMergeTree with AggregateFunction columns
Create a detail table.

```sql
CREATE TABLE detail_table
(
    CounterID UInt8,
    StartDate Date,
    UserID UInt64
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(StartDate)
ORDER BY (CounterID, StartDate);
```

Insert data.

```sql
INSERT INTO detail_table VALUES (0, '2019-11-11', 1);
INSERT INTO detail_table VALUES (1, '2019-11-12', 1);
```

Create an aggregate table with an `AggregateFunction` column.

```sql
CREATE TABLE agg_table
(
    CounterID UInt8,
    StartDate Date,
    UserID AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(StartDate)
ORDER BY (CounterID, StartDate);
```

Insert aggregated data using `uniqState`.

Insert using a `SELECT` with `-State` functions. A direct `INSERT INTO ... VALUES` with raw values fails with a type error.

```sql
INSERT INTO agg_table
SELECT CounterID, StartDate, uniqState(UserID)
FROM detail_table
GROUP BY CounterID, StartDate;
```

Query the aggregate table using `uniqMerge`.

```sql
SELECT uniqMerge(UserID) AS state
FROM agg_table
GROUP BY CounterID, StartDate;
```

```
┌─state─┐
│     1 │
│     1 │
└───────┘
```
Log family
Engines in the Log family are designed for fast writes to small tables (around one million rows) where the entire table is read at once.
Common characteristics of all Log engines:
- Data is appended to disk sequentially.
- `DELETE` and `UPDATE` are not supported.
- Indexes are not supported.
- Data is not written atomically.
- `INSERT` blocks `SELECT` on the same table.
| Engine | Concurrent reads | Storage layout | When to use |
|---|---|---|---|
| TinyLog | No | One file per column | Temporary intermediate data where performance is not critical |
| StripeLog | Yes | All columns in a single file | Higher query performance than TinyLog with fewer file handles |
| Log | Yes | One file per column | Higher query performance than TinyLog with per-column access |
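A minimal sketch of a Log-family table — the table and column names below are hypothetical:

```sql
-- Small staging table: sequential append-only writes, no indexes,
-- intended to be read in full.
CREATE TABLE log_example
(
    id UInt16,
    msg String
)
ENGINE = Log;

INSERT INTO log_example VALUES (1, 'first'), (2, 'second');
SELECT * FROM log_example;  -- reads the whole table
```

The same DDL works with `ENGINE = TinyLog` or `ENGINE = StripeLog`; only the storage layout and concurrency behavior differ, as summarized in the table above.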
Integrations family
Engines in the Integrations family connect ApsaraDB for ClickHouse to external data sources, either by importing data or by querying it in place.
| Engine | Description |
|---|---|
| Kafka | Reads data from Kafka topics into ApsaraDB for ClickHouse |
| MySQL | Queries MySQL tables directly from ApsaraDB for ClickHouse using SELECT and other operations |
| JDBC | Connects to data sources via Java Database Connectivity (JDBC) connection strings |
| ODBC | Connects to data sources via Open Database Connectivity (ODBC) connection strings |
| HDFS | Reads files in a specified format from the Hadoop Distributed File System (HDFS) |
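As a sketch of the MySQL integration engine — the host, database, table name, and credentials below are placeholders, not real endpoints:

```sql
-- Maps a remote MySQL table; SELECT queries are forwarded to the MySQL server.
-- Signature: MySQL('host:port', 'database', 'table', 'user', 'password').
CREATE TABLE mysql_orders
(
    id UInt32,
    amount Float64
)
ENGINE = MySQL('mysql-host:3306', 'shop_db', 'orders', 'mysql_user', 'mysql_password');

SELECT * FROM mysql_orders WHERE amount > 100;
```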
Special family
Engines in the Special family serve specific architectural needs.
| Engine | Description |
|---|---|
| Distributed | Does not store data. Routes queries to multiple shards and aggregates results. |
| MaterializedView | Stores pre-computed query results. Used to create materialized views. |
| Dictionary | Exposes dictionary data as a queryable table. |
| Merge | Does not store data. Reads from multiple tables simultaneously as a virtual union. |
| File | Uses a local file as the data store. |
| NULL | Discards all written data. Reads return empty results. Useful as a target for materialized views during testing. |
| Set | Always stores data in random access memory (RAM). |
| Join | Stores join data in RAM for fast lookup. |
| URL | Reads from and writes to remote HTTP or HTTPS endpoints. |
| View | Stores a SELECT query definition but no data. Acts as a non-materialized view. |
| Memory | Stores data in RAM. Data is lost when the server restarts. Suitable for temporary tables with fewer than 100 million rows and no data persistence requirements. Commonly used for temporary tables in ApsaraDB for ClickHouse. |
| Buffer | Buffers writes in RAM and flushes to a destination table when configurable thresholds are met. |
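The Distributed engine from the table above can be sketched as follows. `default` is the cluster name used elsewhere in this topic; the table names are hypothetical:

```sql
-- Local table created on every node of the cluster; this is where data lives.
CREATE TABLE hits_local ON CLUSTER default
(
    id UInt16,
    create_time Date
)
ENGINE = MergeTree()
ORDER BY id;

-- Distributed table: stores no data itself. Queries fan out to hits_local
-- on each shard; inserts are routed by the sharding key rand().
CREATE TABLE hits_dist ON CLUSTER default AS hits_local
ENGINE = Distributed(default, currentDatabase(), hits_local, rand());
```

Queries against `hits_dist` aggregate results from all shards transparently.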