This topic describes the benefits and features of the table engines in ApsaraDB for ClickHouse.

Overview

Table engines play a critical role in ApsaraDB for ClickHouse. Table engines determine how data is stored and read, whether concurrent reads and writes are supported, whether indexes and primary/secondary replication are supported, and which types of queries are supported.

ApsaraDB for ClickHouse provides a variety of table engines with a wide range of uses. Table engines in ApsaraDB for ClickHouse fall into four families: MergeTree, Log, Integration, and Special. ApsaraDB for ClickHouse also provides advanced table engines such as Replicated and Distributed. Advanced table engines are functionally independent of the four families and can be used in combination with engines from any of them. The four table engine families are summarized as follows:
  • MergeTree family: includes the most common and powerful table engines for high-load tasks. The common feature of this family is that data can be quickly inserted and processed in the background.
  • Log family: provides relatively simple features and is mainly used to quickly write data to small tables (about 1 million rows) and then read out all data.
  • Integration family: is mainly used to import external data into ApsaraDB for ClickHouse or to operate directly on external data sources from ApsaraDB for ClickHouse.
  • Special family: is mostly customized for special scenarios.

This topic describes several commonly used table engines. For more information, visit Table Engines.

MergeTree family

The MergeTree engine and other engines in this family (*MergeTree) are the most powerful ones in ApsaraDB for ClickHouse. Table engines in this family are designed to insert a large amount of data into a table. The data is quickly written to the table part by part. Then, rules are applied to merge the parts in the background. This method is more efficient than continually rewriting the data in storage during the INSERT operation.

The following sections describe the MergeTree, ReplacingMergeTree, CollapsingMergeTree, VersionedCollapsingMergeTree, SummingMergeTree, and AggregatingMergeTree engines.

MergeTree

The MergeTree engine is mainly used to analyze a large amount of data. The engine supports data partitioning, storage sorting, primary key indexes, sparse indexes, and data TTL. The MergeTree engine supports all ClickHouse SQL statements, but some behavior differs from MySQL. For example, the primary key in a MergeTree table is not used for deduplication. The following example demonstrates how to use the MergeTree engine.

  1. Create a table named test_tbl with the primary key (id, create_time). Data is sorted by the primary key, partitioned by create_time, and retained for one month based on the TTL.
    CREATE TABLE test_tbl (
      id UInt16,
      create_time Date,
      comment Nullable(String)
    ) ENGINE = MergeTree()
    PARTITION BY create_time
    ORDER BY (id, create_time)
    PRIMARY KEY (id, create_time)
    TTL create_time + INTERVAL 1 MONTH
    SETTINGS index_granularity=8192;
  2. Write data to the table. Rows with duplicate primary keys are written intentionally.
    insert into test_tbl values(0, '2019-12-12', null);
    insert into test_tbl values(0, '2019-12-12', null);
    insert into test_tbl values(1, '2019-12-13', null);
    insert into test_tbl values(1, '2019-12-13', null);
    insert into test_tbl values(2, '2019-12-14', null);
  3. Query the data. Only three distinct primary key values (id, create_time) were written, but five rows are returned.
    select count(*) from test_tbl;
    ┌─count()─┐
    │       5 │
    └─────────┘
    
    select * from test_tbl;
    ┌─id─┬─create_time─┬─comment─┐
    │  2 │  2019-12-14 │ ᴺᵁᴸᴸ    │
    └────┴─────────────┴─────────┘
    ┌─id─┬─create_time─┬─comment─┐
    │  1 │  2019-12-13 │ ᴺᵁᴸᴸ    │
    └────┴─────────────┴─────────┘
    ┌─id─┬─create_time─┬─comment─┐
    │  0 │  2019-12-12 │ ᴺᵁᴸᴸ    │
    └────┴─────────────┴─────────┘
    ┌─id─┬─create_time─┬─comment─┐
    │  1 │  2019-12-13 │ ᴺᵁᴸᴸ    │
    └────┴─────────────┴─────────┘
    ┌─id─┬─create_time─┬─comment─┐
    │  0 │  2019-12-12 │ ᴺᵁᴸᴸ    │
    └────┴─────────────┴─────────┘
  4. Because the MergeTree engine uses an LSM-tree-like structure, the storage layer processing logic runs during compaction. Therefore, even after compaction is executed in a forced manner, five rows are still displayed in the query results.
    optimize table test_tbl final;
    
    
    select count(*) from test_tbl;
    ┌─count()─┐
    │       5 │
    └─────────┘
    
    select * from test_tbl;
    ┌─id─┬─create_time─┬─comment─┐
    │  2 │  2019-12-14 │ ᴺᵁᴸᴸ    │
    └────┴─────────────┴─────────┘
    ┌─id─┬─create_time─┬─comment─┐
    │  0 │  2019-12-12 │ ᴺᵁᴸᴸ    │
    │  0 │  2019-12-12 │ ᴺᵁᴸᴸ    │
    └────┴─────────────┴─────────┘
    ┌─id─┬─create_time─┬─comment─┐
    │  1 │  2019-12-13 │ ᴺᵁᴸᴸ    │
    │  1 │  2019-12-13 │ ᴺᵁᴸᴸ    │
    └────┴─────────────┴─────────┘
    
    

Although the MergeTree engine has primary key indexes, the indexes are used to accelerate queries, rather than to keep records unique as in databases such as MySQL. Even after compaction is complete, data rows with the same primary key still exist.
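
If you need deduplicated results from a plain MergeTree table, you can deduplicate at query time instead. The following is a minimal sketch that reuses the test_tbl table from the preceding example; grouping by the primary key columns and selecting one value per group with the any function collapses the duplicates in the result set:

-- Deduplicate at query time by grouping on the primary key columns.
SELECT id, create_time, any(comment) AS comment
FROM test_tbl
GROUP BY id, create_time;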

ReplacingMergeTree

To address this MergeTree behavior, ApsaraDB for ClickHouse provides the ReplacingMergeTree engine, which deletes duplicate rows that have the same sorting key value. Example:
-- Create a table.
CREATE TABLE test_tbl_replacing (
  id UInt16,
  create_time Date,
  comment Nullable(String)
) ENGINE = ReplacingMergeTree()
PARTITION BY create_time
ORDER BY (id, create_time)
PRIMARY KEY (id, create_time)
SETTINGS index_granularity=8192;

-- Write duplicates with the same primary key to the table.
insert into test_tbl_replacing values(0, '2019-12-12', null);
insert into test_tbl_replacing values(0, '2019-12-12', null);
insert into test_tbl_replacing values(1, '2019-12-13', null);
insert into test_tbl_replacing values(1, '2019-12-13', null);
insert into test_tbl_replacing values(2, '2019-12-14', null);

-- Query data. Data rows with the same primary key are displayed before compaction.
select count(*) from test_tbl_replacing;
┌─count()─┐
│       5 │
└─────────┘

select * from test_tbl_replacing;
┌─id─┬─create_time─┬─comment─┐
│  0 │  2019-12-12 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  0 │  2019-12-12 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  1 │  2019-12-13 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  1 │  2019-12-13 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  2 │  2019-12-14 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘


--  Execute compaction in a forced manner.
optimize table test_tbl_replacing final;


-- Query data again. Data rows with the same primary key disappear.
select count(*) from test_tbl_replacing;
┌─count()─┐
│       3 │
└─────────┘

select * from test_tbl_replacing;
┌─id─┬─create_time─┬─comment─┐
│  2 │  2019-12-14 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  1 │  2019-12-13 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  0 │  2019-12-12 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘

Although the ReplacingMergeTree engine can deduplicate rows with the same primary key, it still has the following limits:
  • Before the optimize operation is thoroughly complete, deduplication may be only partially applied: some rows are deduplicated, while other rows with the same primary key still exist.
  • In a distributed scenario, data with the same primary key may be sharded to different nodes. Only data on the same shard can be deduplicated.
  • The optimize operation is performed in the background and its execution time is unpredictable.
  • A manual optimize operation takes a long time when a large amount of data is involved, and real-time queries cannot be served during the operation.

Therefore, the ReplacingMergeTree engine ensures that data is eventually deduplicated, but it cannot guarantee that query results are free of duplicates before compaction completes.
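
If query results must not contain duplicates before compaction finishes, you can add the FINAL modifier to merge data at query time on a single shard. This is a minimal sketch based on the test_tbl_replacing table above; note that FINAL performs the merge during the query and therefore slows the query down:

-- FINAL collapses rows with the same sorting key at query time.
SELECT * FROM test_tbl_replacing FINAL;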

CollapsingMergeTree

The CollapsingMergeTree engine can overcome the limits of the ReplacingMergeTree engine. This engine requires that the Sign column is specified in the statement to create a table. The Sign column has two values: 1 indicates a status row and -1 indicates a cancel row. Each time you want to add a new state, write a status row. If you want to delete a state, write a cancel row.

After compaction is complete, data rows that have the same primary key but opposite Sign values are collapsed (deleted). Before compaction is executed, both status and cancel rows are displayed. To collapse (delete) data rows with the same primary key, you must make the following modifications at the business layer:
  • To perform a delete operation, you must write a cancel row. A cancel row must contain the same values as the original status row in all columns except the Sign column. Therefore, the application layer must record the values of status rows, or query them before performing the delete operation.
  • Because the compaction execution time is unpredictable, the status and cancel rows may not yet be collapsed when you initiate a query. In addition, data with the same primary key may be sharded to different nodes, and only data on the same shard can be collapsed. Therefore, aggregations such as count(*) or sum(col) may include uncollapsed rows. To obtain correct results, you must rewrite count() to sum(Sign) and sum(col) to sum(col * Sign) at the business layer.

Example:

-- Create a table.
CREATE TABLE UAct
(
    UserID UInt64,
    PageViews UInt8,
    Duration UInt8,
    Sign Int8
)
ENGINE = CollapsingMergeTree(Sign)
ORDER BY UserID;

-- Add a status row. The value of the Sign column is 1.
INSERT INTO UAct VALUES (4324182021466249494, 5, 146, 1);

-- Add a cancel row to cancel the preceding status row. The value of the Sign column is -1 and the remaining values are consistent with the status row.
-- Add another status row with the same primary key. The PageViews value changes from 5 to 6 and the Duration value from 146 to 185.
INSERT INTO UAct VALUES (4324182021466249494, 5, 146, -1), (4324182021466249494, 6, 185, 1);

-- Query data: Before compaction is executed, both status and cancel rows are displayed.
SELECT * FROM UAct;
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │         5 │      146 │   -1 │
│ 4324182021466249494 │         6 │      185 │    1 │
└─────────────────────┴───────────┴──────────┴──────┘
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │         5 │      146 │    1 │
└─────────────────────┴───────────┴──────────┴──────┘

-- To obtain correct results, you must rewrite the following SQL statements:
-- sum(PageViews) => sum(PageViews * Sign),
-- sum(Duration) => sum(Duration * Sign)
SELECT
    UserID,
    sum(PageViews * Sign) AS PageViews,
    sum(Duration * Sign) AS Duration
FROM UAct
GROUP BY UserID
HAVING sum(Sign) > 0;
┌──────────────UserID─┬─PageViews─┬─Duration─┐
│ 4324182021466249494 │         6 │      185 │
└─────────────────────┴───────────┴──────────┘


-- Execute compaction in a forced manner.
optimize table UAct final;

-- Query data again. Data rows with the same primary key disappear. The status and cancel rows are collapsed and only the latest status row is displayed.
select * from UAct;
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │         6 │      185 │    1 │
└─────────────────────┴───────────┴──────────┴──────┘


The CollapsingMergeTree engine can quickly collapse data rows with the same primary key. However, if states change continuously and data is written from multiple threads, status and cancel rows may arrive out of order, which prevents them from being collapsed.

Example:

-- Create a table.
CREATE TABLE UAct_order
(
    UserID UInt64,
    PageViews UInt8,
    Duration UInt8,
    Sign Int8
)
ENGINE = CollapsingMergeTree(Sign)
ORDER BY UserID;

-- Add a cancel row.
INSERT INTO UAct_order VALUES (4324182021466249495, 5, 146, -1);
-- Add a status row.
INSERT INTO UAct_order VALUES (4324182021466249495, 5, 146, 1);

-- Execute compaction in a forced manner.
optimize table UAct_order final;

-- Query data. Even after compaction is complete, data rows with the same primary key are not collapsed. Two rows are displayed.
select * from UAct_order;
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249495 │         5 │      146 │   -1 │
│ 4324182021466249495 │         5 │      146 │    1 │
└─────────────────────┴───────────┴──────────┴──────┘

VersionedCollapsingMergeTree

To solve the problem of the CollapsingMergeTree engine that out-of-order data rows cannot be collapsed, the VersionedCollapsingMergeTree engine requires that a Version column be specified in the CREATE TABLE statement. The column records the mapping between status rows and cancel rows when data is written out of order. During compaction, data rows with the same primary key, the same Version value, and opposite Sign values are deleted.

Like the CollapsingMergeTree engine, the VersionedCollapsingMergeTree engine requires that you rewrite count() to sum(Sign) and sum(col) to sum(col * Sign) at the business layer to obtain correct results.

Example:

-- Create a table.
CREATE TABLE UAct_version
(
    UserID UInt64,
    PageViews UInt8,
    Duration UInt8,
    Sign Int8,
    Version UInt8
)
ENGINE = VersionedCollapsingMergeTree(Sign, Version)
ORDER BY UserID;


-- Add a cancel row. The value of the Sign column is -1 and the value of the Version column is 1.
INSERT INTO UAct_version VALUES (4324182021466249494, 5, 146, -1, 1);
-- Add a status row. The value of the Sign column is 1 and the value of the Version column is 1. Add another status row. The value of the Sign column is 1 and the value of the Version column is 2. The PageViews value changes from 5 to 6 and the Duration value from 146 to 185.
INSERT INTO UAct_version VALUES (4324182021466249494, 5, 146, 1, 1),(4324182021466249494, 6, 185, 1, 2);


-- Query data: Before compaction is executed, both status and cancel rows are displayed.
SELECT * FROM UAct_version;
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │         5 │      146 │   -1 │
│ 4324182021466249494 │         6 │      185 │    1 │
└─────────────────────┴───────────┴──────────┴──────┘
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │         5 │      146 │    1 │
└─────────────────────┴───────────┴──────────┴──────┘


-- To obtain correct results, you must rewrite the following SQL statements:
-- sum(PageViews) => sum(PageViews * Sign),
-- sum(Duration) => sum(Duration * Sign)
SELECT
    UserID,
    sum(PageViews * Sign) AS PageViews,
    sum(Duration * Sign) AS Duration
FROM UAct_version
GROUP BY UserID
HAVING sum(Sign) > 0;
┌──────────────UserID─┬─PageViews─┬─Duration─┐
│ 4324182021466249494 │         6 │      185 │
└─────────────────────┴───────────┴──────────┘


-- Execute compaction in a forced manner.
optimize table UAct_version final;


-- Query data again. Even if the cancel and status rows are out of order, they can still be collapsed.
select * from UAct_version;
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┬─Version─┐
│ 4324182021466249494 │         6 │      185 │    1 │       2 │
└─────────────────────┴───────────┴──────────┴──────┴─────────┘

SummingMergeTree

ApsaraDB for ClickHouse also supports the SummingMergeTree engine, which pre-aggregates data based on the primary key columns. During compaction, data rows with the same primary key are summed and replaced with a single row. This greatly reduces storage space and improves the performance of aggregate computing.

Take note of the following points:
  • In ApsaraDB for ClickHouse, data is pre-aggregated only during compaction. Because the compaction execution time is unpredictable, some data may be pre-aggregated while other data is not. Therefore, you must still use the GROUP BY clause when you perform aggregate computing.
  • ApsaraDB for ClickHouse pre-aggregates all columns other than the primary key columns. Columns that can be aggregated (such as numeric data types) are summed; for columns that cannot be aggregated (such as string data types), a value is selected at random.
  • We recommend that you use the SummingMergeTree engine in combination with the MergeTree engine: the MergeTree table stores the detail data and the SummingMergeTree table stores the pre-aggregated results to accelerate queries, as shown in the sketch after the following example.

Example:

-- Create a table.
CREATE TABLE summtt
(
    key UInt32,
    value UInt32
)
ENGINE = SummingMergeTree()
ORDER BY key;

-- Insert data.
INSERT INTO summtt VALUES (1,1),(1,2),(2,1);

-- Query data. Data rows with the same primary key are displayed before compaction.
select * from summtt;
┌─key─┬─value─┐
│   1 │     1 │
│   1 │     2 │
│   2 │     1 │
└─────┴───────┘

-- Perform aggregate computing by using the GROUP BY clause.
SELECT key, sum(value) FROM summtt GROUP BY key
┌─key─┬─sum(value)─┐
│   2 │          1 │
│   1 │          3 │
└─────┴────────────┘

-- Execute compaction in a forced manner.
optimize table summtt final;

-- Query data again. Data rows are pre-aggregated after compaction.
select * from summtt;
┌─key─┬─value─┐
│   1 │     3 │
│   2 │     1 │
└─────┴───────┘


-- Perform aggregate computing by using the GROUP BY clause after compaction.
SELECT key, sum(value) FROM summtt GROUP BY key
┌─key─┬─sum(value)─┐
│   2 │          1 │
│   1 │          3 │
└─────┴────────────┘
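
As recommended above, a SummingMergeTree table is typically paired with a MergeTree detail table. The following is a minimal sketch of this pattern; the names summtt_detail and summtt_agg are hypothetical. A materialized view with the SummingMergeTree engine maintains the pre-aggregated results while the MergeTree table keeps the raw rows:

-- Detail table that stores the raw rows.
CREATE TABLE summtt_detail
(
    key UInt32,
    value UInt32
) ENGINE = MergeTree()
ORDER BY key;

-- Materialized view that pre-aggregates the detail table.
CREATE MATERIALIZED VIEW summtt_agg
ENGINE = SummingMergeTree()
ORDER BY key
AS SELECT key, sum(value) AS value
FROM summtt_detail
GROUP BY key;

-- Writes go to the detail table; the view is updated automatically.
INSERT INTO summtt_detail VALUES (1,1),(1,2),(2,1);

-- The GROUP BY clause is still required, because pre-aggregation is
-- only guaranteed after compaction.
SELECT key, sum(value) FROM summtt_agg GROUP BY key;
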
AggregatingMergeTree

The AggregatingMergeTree engine also provides pre-aggregation to improve aggregate computing performance. The difference from the SummingMergeTree engine is that the SummingMergeTree engine aggregates non-primary key columns only with the SUM function, whereas the AggregatingMergeTree engine can use any aggregate function.

The syntax of the AggregatingMergeTree engine is complex. It must be used in combination with materialized views or the special AggregateFunction data type in ApsaraDB for ClickHouse. The INSERT and SELECT statements also have unique formats and requirements: the <agg>-State syntax is required for writing and the <agg>-Merge syntax is required for queries.

The following examples demonstrate how to use the AggregatingMergeTree engine.

Example 1: Use the AggregatingMergeTree engine in combination with a materialized view.

-- Create a detail table.
CREATE TABLE visits
(
    UserID UInt64,
    CounterID UInt8,
    StartDate Date,
    Sign Int8
)
ENGINE = CollapsingMergeTree(Sign)
ORDER BY UserID;

-- Create a materialized view for the detail table to pre-aggregate the detail table.
-- Note: The sumState and uniqState functions are used for pre-aggregation. They both use the write syntax <agg>-State.
CREATE MATERIALIZED VIEW visits_agg_view
ENGINE = AggregatingMergeTree() PARTITION BY toYYYYMM(StartDate) ORDER BY (CounterID, StartDate)
AS SELECT
    CounterID,
    StartDate,
    sumState(Sign)    AS Visits,
    uniqState(UserID) AS Users
FROM visits
GROUP BY CounterID, StartDate;

-- Insert detail data.
INSERT INTO visits VALUES(0, 0, '2019-11-11', 1);
INSERT INTO visits VALUES(1, 1, '2019-11-12', 1);

-- Perform final aggregation of the materialized view.
-- Note: The sumMerge and uniqMerge aggregate functions are used. They both use the query syntax <agg>-Merge.
SELECT
    StartDate,
    sumMerge(Visits) AS Visits,
    uniqMerge(Users) AS Users
FROM visits_agg_view
GROUP BY StartDate
ORDER BY StartDate;

-- The SUM and UNIQ functions cannot be used.
-- If you execute the following SQL statement, the Illegal type AggregateFunction(sum, Int8) of argument error is returned:
SELECT
    StartDate,
    sum(Visits),
    uniq(Users)
FROM visits_agg_view
GROUP BY StartDate
ORDER BY StartDate;

Example 2: Use the AggregatingMergeTree engine in combination with the AggregateFunction data type.

-- Create a detail table.
CREATE TABLE detail_table
(   CounterID UInt8,
    StartDate Date,
    UserID UInt64
) ENGINE = MergeTree() 
PARTITION BY toYYYYMM(StartDate) 
ORDER BY (CounterID, StartDate);

-- Insert detail data.
INSERT INTO detail_table VALUES(0, '2019-11-11', 1);
INSERT INTO detail_table VALUES(1, '2019-11-12', 1);

-- Create a pre-aggregate table.
-- Note: The data type of the UserID column is AggregateFunction(uniq, UInt64).
CREATE TABLE agg_table
(   CounterID UInt8,
    StartDate Date,
    UserID AggregateFunction(uniq, UInt64)
) ENGINE = AggregatingMergeTree() 
PARTITION BY toYYYYMM(StartDate) 
ORDER BY (CounterID, StartDate);

-- Read data from the detail table and insert the data to the pre-aggregate table.
-- Note: The following aggregate function is used in the subquery: uniqState. It uses the write syntax <agg>-State.
INSERT INTO agg_table
select CounterID, StartDate, uniqState(UserID)
from detail_table
group by CounterID, StartDate;

-- You cannot use the common INSERT statement in the AggregatingMergeTree engine to insert data.
-- If you execute the following SQL statement, the Cannot convert UInt64 to AggregateFunction(uniq, UInt64) error is returned:
INSERT INTO agg_table VALUES(1, '2019-11-12', 1);

-- Query data in the pre-aggregate table.
-- Note: The uniqMerge aggregate function is used in the SELECT statement. It uses the query syntax <agg>-Merge.
SELECT uniqMerge(UserID) AS state
FROM agg_table
GROUP BY CounterID, StartDate;

Log family

The Log family provides relatively simple features and is mainly used to quickly write data to small tables (about 1 million rows) and then read all of the data back, as shown in the sketch at the end of this section.

Table engines in this family have the following characteristics:
  • Data is appended to a disk in sequence.
  • The DELETE and UPDATE operations are not supported.
  • Indexes are not supported.
  • Atomic writing is not supported.
  • The INSERT operation blocks the SELECT operation.
Table engines in this family differ in the following ways:
  • TinyLog: does not support concurrent reading of data files and delivers poor query performance. It uses a simple format and is suitable for temporary storage of intermediate data.
  • StripeLog: supports concurrent reading of data files and delivers higher query performance than TinyLog. All columns are stored in one large file to reduce the number of files.
  • Log: supports concurrent reading of data files and delivers higher query performance than TinyLog. Each column is stored in a separate file.
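
For example, you can create a small Log table for intermediate results and then read it back in full. This is a minimal sketch; the table name log_tmp is hypothetical:

CREATE TABLE log_tmp
(
    id UInt32,
    msg String
) ENGINE = Log;

INSERT INTO log_tmp VALUES (1, 'first'), (2, 'second');

-- Log family tables are typically read in full.
SELECT * FROM log_tmp;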

Integration family

The Integration family is mainly used to import external data into ApsaraDB for ClickHouse or to operate directly on external data sources from ApsaraDB for ClickHouse.
  • Kafka: directly imports data from Kafka topics to ApsaraDB for ClickHouse.
  • MySQL: uses MySQL as the storage engine and directly performs operations such as SELECT on MySQL tables in ApsaraDB for ClickHouse (see the sketch after this list).
  • JDBC or ODBC: reads data from sources by specifying JDBC and ODBC connection strings.
  • HDFS: directly reads data files in a specified format on HDFS.
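
For example, the MySQL engine maps a ClickHouse table to a remote MySQL table. The following is a minimal sketch; the connection parameters, database, and table names are hypothetical placeholders:

CREATE TABLE mysql_tbl
(
    id UInt32,
    name String
) ENGINE = MySQL('mysql-host:3306', 'mydb', 'mytable', 'myuser', 'mypassword');

-- The query is forwarded to the remote MySQL table.
SELECT * FROM mysql_tbl WHERE id = 1;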

Special family

The Special family is mostly customized for special scenarios.
  • Memory: stores data in memory. Data will be lost after restart. It delivers excellent query performance and is suitable for small tables of less than 1 million rows that do not have data persistence requirements. In ApsaraDB for ClickHouse, the engine is usually used in temporary tables.
  • Buffer: sets a memory buffer for a table. When the buffer conditions are met, data is flushed to the destination table on disk (see the sketch after this list).
  • File: directly stores data in local files.
  • Null: discards written data; no data can be read.
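
For example, a Buffer table can absorb frequent small writes before they reach a MergeTree table. The following is a minimal sketch; the table names and thresholds are illustrative. The Buffer engine parameters are, in order: the destination database and table, num_layers, and the min/max thresholds for time (seconds), rows, and bytes:

-- Destination table.
CREATE TABLE metrics
(
    ts DateTime,
    value UInt32
) ENGINE = MergeTree()
ORDER BY ts;

-- Buffer table: data is flushed to default.metrics when a max threshold
-- (10 seconds, 10000 rows, or 1 MB) is reached.
CREATE TABLE metrics_buffer AS metrics
ENGINE = Buffer(default, metrics, 16, 1, 10, 100, 10000, 65536, 1048576);

INSERT INTO metrics_buffer VALUES (now(), 42);

-- Reads from the Buffer table combine buffered and already-flushed data.
SELECT * FROM metrics_buffer;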