All Products
Search
Document Center

ApsaraDB for ClickHouse:Create Table

Last Updated:Jun 13, 2023

This topic describes how to create a table in ApsaraDB for ClickHouse.

Create a local table

Syntax
CREATE TABLE [IF NOT EXISTS] [db.]local_table_name ON CLUSTER cluster
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
    INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,
    INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2
) ENGINE = engine_name()
[PARTITION BY expr]
ORDER BY expr
[PRIMARY KEY expr]
[SAMPLE BY expr]
[SETTINGS name=value, ...];
Parameters
ParameterDescription
dbThe name of the database. The default value is the name of the current database. In this example, default is used as the database name.
local_table_nameThe name of the local table.
ON CLUSTER clusterSpecifies that a local table is created on each node. Set the value to ON CLUSTER default.
name1,name2The name of the column.
type1,type2The type of the column.
Note For information about the data types that are supported by ApsaraDB for ClickHouse, see Data types.
ENGINE = engine_name()The type of the table engine.
When you create a table in a cluster of the Double-replica Edition, you must use the Replicated * engine that supports data replication among the engines in the MergeTree family. If you do not use the Replicated * engine, data cannot be replicated between replicas and inconsistent data query results are returned. When you use this engine to create a table, use one of the following methods to configure the parameters:
  • ReplicatedMergeTree('/clickhouse/tables/{database}/{table}/{shard}', '{replica}'). The characters in the parameters cannot be changed.
  • ReplicatedMergeTree(). This is equivalent to ReplicatedMergeTree('/clickhouse/tables/{database}/{table}/{shard}', '{replica}').
Note For information about the table engine types that are supported by ApsaraDB for ClickHouse, see Table engines.
ORDER BY exprThe sort key. This parameter is required. The value can be a tuple of a set of columns or an expression.
[DEFAULT|MATERIALIZED|ALIAS expr]The default expression.
  • DEFAULT: default expression. If the field is left empty, the default value is automatically generated and populated.
  • MATERIALIZED: materialized expression.
  • ALIAS: alias expression.
GRANULARITYThe index granularity.
[PARTITION BY expr]The partition key. In most cases, data is partitioned by date. You can specify another field or field expression as the partition key.
[PRIMARY KEY expr]The primary key. By default, the primary key is the same as the sort key. In most cases, you do not need to use the PRIMARY KEY clause to specify the primary key.
[SAMPLE BY expr]The sampling expression. If you want to use the sampling expression, include the sampling expression in the primary key.
[SETTINGS name=value, ...]Additional parameters that affect the performance of creating the table.
Note For information about the parameters that can be configured in SETTINGS, see SETTINGS configuration items.
Note Only table engines in the MergeTree family support the following parameters: ORDER BY, GRANULARITY, PARTITION BY, PRIMARY KEY, SAMPLE BY, and [SETTINGS name=value, ...]. For information about more parameters, see CREATE TABLE.
Example
CREATE TABLE local_table ON CLUSTER default
(
    Year UInt16,
    Quarter UInt8,
    Month UInt8,
    DayofMonth UInt8,
    DayOfWeek UInt8,
    FlightDate Date,
    FlightNum String,
    Div5WheelsOff String,
    Div5TailNum String
)ENGINE = MergeTree()
 PARTITION BY toYYYYMM(FlightDate)
 PRIMARY KEY (intHash32(FlightDate))
 ORDER BY (intHash32(FlightDate),FlightNum)
 SAMPLE BY intHash32(FlightDate)
SETTINGS index_granularity= 8192;

Create a distributed table

A distributed table is a collection of local tables. The distributed table abstracts the local tables into a unified table and supports data writes and queries. When data is written to a distributed table, the data is automatically distributed to each local table in the collection. When a distributed table is queried, local tables in the collection are separately queried and the final results are rolled up and returned. Before you create a distributed table, create a local table.

Syntax
CREATE TABLE [db.]distributed_table_name ON CLUSTER default
 AS db.local_table_name ENGINE = Distributed(cluster, db, local_table_name [, sharding_key])
Parameters
ParameterDescription
dbThe name of the database. The default value is the name of the current database. In this example, default is used as the database name.
distributed_table_nameThe name of the distributed table.
ON CLUSTER clusterSpecifies that a table is created on each node. Set the value to ON CLUSTER default.
local_table_nameThe name of the local table that you created.
sharding_keyThe sharding expression. The sharding expression determines the shard to which you want to write data.

The value of the sharding_key parameter can be an expression, such as the rand() function, or a column, such as the user_id column of the INTEGER type.

Example
CREATE TABLE distributed_table ON CLUSTER default
 AS default.local_table 
ENGINE = Distributed(default, default, local_table, rand());

Create a table by copying the schema of an existing table

You can create a table by copying the schema of an existing table so that the table has the same schema as the source table.

Syntax
CREATE TABLE [IF NOT EXISTS] [db.]table_name2 ON CLUSTER cluster AS [db.]table_name1 [ENGINE = engine_name];
Parameters
ParameterDescription
dbThe name of the database. The default value is the name of the current database. In this example, default is used as the database name.
table_name1The name of the source table from which the schema is copied. In this example, the local table local_table that has been created is used.
table_name2The name of the table that you want to create.
ON CLUSTER clusterSpecifies that a table is created on each node. Set the value to ON CLUSTER default.
[ENGINE = engine_name]The type of the table engine. If you do not specify a table engine when you create a table, the table engine of the source table is used by default.
Note For information about the table engine types that are supported by ApsaraDB for ClickHouse, see Table engines.
Example
create table t2 ON CLUSTER default as default.local_table;

Create a table by specifying a SELECT clause in a CREATE TABLE statement

You can use a specified table engine to create a table that has the same schema as the query result of the SELECT clause. The query result of the SELECT clause is populated to the table.

Syntax
CREATE TABLE [IF NOT EXISTS] [db.]s_table_name ON CLUSTER cluster ENGINE = engine_name() AS SELECT ...
Parameters
ParameterDescription
dbThe name of the database. The default value is the name of the current database. In this example, default is used as the database name.
s_table_nameThe table that you want to create by specifying the SELECT clause in the CREATE TABLE statement.
ON CLUSTER clusterSpecifies that a table is created on each node. Set the value to ON CLUSTER default.
ENGINE = engine_name()The type of the table engine.
Note For information about the table engine types that are supported by ApsaraDB for ClickHouse, see Table engines.
SELECT ...The SELECT clause.
Example
create table t3 ON CLUSTER default ENGINE =MergeTree() order by Year as select * from default.local_table;

References