ApsaraDB for ClickHouse supports four ways to create a table: from an explicit schema, by copying an existing table's schema, by mirroring a SELECT result set's structure, and as a distributed table that spans multiple local tables.
Create a local table
A local table stores data on the nodes where it is created. Use ON CLUSTER default to replicate the table definition across all nodes in the cluster.
Syntax
CREATE TABLE [IF NOT EXISTS] [db.]local_table_name ON CLUSTER cluster
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
...
INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,
INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2
) ENGINE = engine_name()
[PARTITION BY expr]
ORDER BY expr
[PRIMARY KEY expr]
[SAMPLE BY expr]
[SETTINGS name=value, ...];Parameters
| Parameter | Description |
|---|---|
db | Database name. Defaults to the current database. |
local_table_name | Name of the local table. |
ON CLUSTER cluster | Creates the table on each node. Fixed value: ON CLUSTER default. |
name1, name2 | Column names. |
type1, type2 | Column data types. For supported types, see Data types. |
ENGINE = engine_name() | Table engine. See Table engines for supported engines. |
ORDER BY expr | Sort key. Required. Accepts a column tuple or any expression. |
[DEFAULT|MATERIALIZED|ALIAS expr] | Default expression type for a column: DEFAULT generates a value automatically when none is provided; MATERIALIZED is a materialized expression; ALIAS is an alias expression. |
GRANULARITY | Index granularity. |
[PARTITION BY expr] | Partition key. Typically a date field or date expression. Also accepts other fields or expressions. |
[PRIMARY KEY expr] | Primary key. Defaults to the same value as the sort key. In most cases, omit this clause. |
[SAMPLE BY expr] | Sampling expression. If specified, include it in the primary key. |
[SETTINGS name=value, ...] | Additional settings that affect performance. See SETTINGS configuration items. |
ORDER BY,GRANULARITY,PARTITION BY,PRIMARY KEY,SAMPLE BY, andSETTINGSare only supported by MergeTree family table engines. See CREATE TABLE.
ReplicatedMergeTree configuration
When using a ReplicatedMergeTree engine, the following two forms are equivalent:
ReplicatedMergeTree('/clickhouse/tables/{database}/{table}/{shard}', '{replica}')— the full form; do not modify the path template.ReplicatedMergeTree()— shorthand that expands to the full form automatically.
In Double-replica Edition clusters, all tables must use Replicated table engines from the MergeTree family. Non-Replicated engines do not replicate data across replicas, which can lead to data inconsistency.
Example
CREATE TABLE local_table ON CLUSTER default
(
Year UInt16,
Quarter UInt8,
Month UInt8,
DayofMonth UInt8,
DayOfWeek UInt8,
FlightDate Date,
FlightNum String,
Div5WheelsOff String,
Div5TailNum String
)ENGINE = MergeTree()
PARTITION BY toYYYYMM(FlightDate)
PRIMARY KEY (intHash32(FlightDate))
ORDER BY (intHash32(FlightDate),FlightNum)
SAMPLE BY intHash32(FlightDate)
SETTINGS index_granularity= 8192;Create a distributed table
A distributed table is a logical view over a set of local tables. Writes to a distributed table are automatically routed to the underlying local tables based on the sharding key. Queries aggregate results from all local tables before returning them.
Only ApsaraDB for ClickHouse Community-Compatible Edition supports distributed tables using the Distributed engine. In Enterprise Edition, multiple nodes act as replicas rather than shards, so distributed tables are not needed.Create local tables on all nodes before creating a distributed table.
Syntax
CREATE TABLE [db.]distributed_table_name ON CLUSTER default
AS db.local_table_name ENGINE = Distributed(cluster, db, local_table_name [, sharding_key])Parameters
| Parameter | Description |
|---|---|
db | Database name. Defaults to the current database. |
distributed_table_name | Name of the distributed table. |
ON CLUSTER cluster | Creates the table on each node. Fixed value: ON CLUSTER default. |
local_table_name | Name of the underlying local table. |
sharding_key | Sharding expression that determines which shard receives each row. Accepts an expression such as rand() or an integer column such as user_id. |
Example
CREATE TABLE distributed_table ON CLUSTER default
AS default.local_table
ENGINE = Distributed(default, default, local_table, rand());Copy a table schema
Create a new table with the same column definitions and table engine as an existing table.
Syntax
CREATE TABLE [IF NOT EXISTS] [db.]table_name2 ON CLUSTER cluster AS [db.]table_name1 [ENGINE = engine_name];Parameters
| Parameter | Description |
|---|---|
db | Database name. Defaults to the current database. |
table_name1 | Source table whose schema is copied. |
table_name2 | New table to create. |
ON CLUSTER cluster | Creates the table on each node. Fixed value: ON CLUSTER default. |
[ENGINE = engine_name] | Table engine for the new table. If omitted, the new table uses the same engine as the source table. See Table engines. |
Example
CREATE TABLE t2 ON CLUSTER default as default.local_table;Create a table from a SELECT query
Create a table whose schema matches the result set of a SELECT statement, and populate it with the query results.
Syntax
CREATE TABLE [IF NOT EXISTS] [db.]s_table_name ON CLUSTER cluster ENGINE = engine_name() AS SELECT ...Parameters
| Parameter | Description |
|---|---|
db | Database name. Defaults to the current database. |
s_table_name | Name of the table to create. |
ON CLUSTER cluster | Creates the table on each node. Fixed value: ON CLUSTER default. |
ENGINE = engine_name() | Table engine. See Table engines. |
SELECT ... | Query whose result set defines the schema and initial data. |
Example
CREATE TABLE t3 ON CLUSTER default ENGINE =MergeTree() order by Year as select * from default.local_table;