This topic describes the terms used in ApsaraDB for ClickHouse.

region

A region is the geographical location of the ClickHouse servers that you purchase. You must select a region when you activate ApsaraDB for ClickHouse, and you cannot change the region afterward.

zone

A zone is a physical area that has an independent power grid and network within a region. Zones are interconnected over internal networks. Communication within the same zone has a low network latency.

ApsaraDB for ClickHouse cluster

In terms of physical composition, an ApsaraDB for ClickHouse cluster is a distributed database that consists of multiple ClickHouse servers. Depending on the purchased specifications, the cluster may contain one or more shards, and each shard may have one or more replicas. In terms of logical composition, an ApsaraDB for ClickHouse cluster can contain multiple database objects.

edition

Two editions of ApsaraDB for ClickHouse clusters are provided: High-availability Edition and Single-replica Edition. An ApsaraDB for ClickHouse cluster of the High-availability Edition is a dual-replica cluster. Each node of the cluster has two replicas. Each node of an ApsaraDB for ClickHouse cluster of the Single-replica Edition has only one replica.

For an ApsaraDB for ClickHouse cluster of the High-availability Edition, if one of the two replicas of a node becomes unavailable, the other replica of the node continues to provide services. Therefore, the cluster is highly available. For an ApsaraDB for ClickHouse cluster of the Single-replica Edition, if the only replica of a node becomes unavailable, the whole cluster becomes unavailable. The cluster cannot resume stable services until the unavailable replica is fully restored.

For the same number of nodes, an ApsaraDB for ClickHouse cluster of the High-availability Edition provides twice the amount of resources of a cluster of the Single-replica Edition. Therefore, a cluster of the High-availability Edition costs twice as much as a cluster of the Single-replica Edition.
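The relationship between the two editions can be sketched as simple arithmetic. The following Python snippet is an illustration only: the ClusterSpec class and its fields are hypothetical names, not part of any ApsaraDB for ClickHouse API, and the node count of 4 is an arbitrary example value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical description of a cluster; not an ApsaraDB API object."""
    nodes: int              # number of nodes in the cluster
    replicas_per_node: int  # 2 for High-availability Edition, 1 for Single-replica Edition

    def server_count(self) -> int:
        # Every replica of every node is backed by its own server resources.
        return self.nodes * self.replicas_per_node

    def survives_one_replica_failure(self) -> bool:
        # A node keeps serving only if another replica can take over.
        return self.replicas_per_node >= 2


single = ClusterSpec(nodes=4, replicas_per_node=1)
ha = ClusterSpec(nodes=4, replicas_per_node=2)

# Ratio of provisioned resources between the two editions.
print(ha.server_count() / single.server_count())  # prints 2.0
```

With the same number of nodes, the High-availability Edition provisions twice as many replicas, which is why it provides twice the resources at twice the cost.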

shard

In scenarios where a large volume of data needs to be processed, the storage and computing resources of a single server may become insufficient, which limits the processing performance. To improve the processing efficiency, ApsaraDB for ClickHouse distributes the large volume of data to multiple servers. Each server stores and processes only a part of the data. In this architecture, each server is called a shard.
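The idea of sharding can be illustrated with a minimal Python sketch. It assumes that rows are routed to shards by a stable hash of a key, which is only one possible placement rule and is not stated in this topic. The function names and the user_id field are hypothetical.

```python
import zlib


def shard_for(key: str, shard_count: int) -> int:
    """Pick a shard index for a key using a stable hash (illustrative assumption)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def partition(rows, shard_count, key_field="user_id"):
    """Split rows into shard_count lists so that each shard holds part of the data."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        index = shard_for(str(row[key_field]), shard_count)
        shards[index].append(row)
    return shards


rows = [{"user_id": i, "clicks": i * 10} for i in range(100)]
shards = partition(rows, shard_count=3)

# Each shard stores and processes only a part of the data.
for index, shard_rows in enumerate(shards):
    print(f"shard {index}: {len(shard_rows)} rows")
```

Because the hash is stable, the same key is always routed to the same shard, and every row lands on exactly one shard.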

replica

To protect data and keep services available when an error occurs, ApsaraDB for ClickHouse provides replicas. Data on a server can be copied to one or more other servers. Each server that holds a copy of the data is called a replica.

database

A database is the highest-level object in an ApsaraDB for ClickHouse cluster. A database may contain objects such as tables, columns, views, functions, and data types.

table

A table is a data organization form and consists of rows and columns. In terms of data distribution, ClickHouse tables are classified into local tables and distributed tables. In terms of storage engine, ClickHouse tables are classified into non-replicated tables and replicated tables.

local table

A local table is stored on only one node. Data can be written to the local table only on this node and cannot be distributed to other servers.

distributed table

A distributed table is a collection of local tables. The distributed table abstracts the local tables into a unified table and supports data writes and queries. When data is written to a distributed table, the data is automatically distributed among the local tables of the distributed table. When data in a distributed table is queried, each local table of the distributed table is queried. The query results of all the local tables are merged and then returned.

Note The difference between local tables and distributed tables lies in scalability. Local tables have no horizontal scalability. The data write and query performance of a local table is limited by the storage and computing resources of a single server. In contrast, distributed tables have high horizontal scalability. The data write and query performance of a distributed table relies on the storage and computing resources of multiple servers.
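The write and query behavior of a distributed table can be sketched in Python as follows. This is a simplified in-memory model, not ClickHouse code: the LocalTable and DistributedTable classes are hypothetical, and round-robin placement is an assumption chosen to keep the example short.

```python
from itertools import cycle


class LocalTable:
    """Rows held by a single server."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def select_where(self, predicate):
        return [row for row in self.rows if predicate(row)]


class DistributedTable:
    """Unified view over several local tables (illustrative model only)."""

    def __init__(self, local_tables):
        self.local_tables = list(local_tables)
        # Assumption: rows are placed round-robin across the local tables.
        self._next_table = cycle(self.local_tables)

    def insert(self, row):
        # A write is forwarded to exactly one local table.
        next(self._next_table).insert(row)

    def select_where(self, predicate):
        # A query runs on every local table, and the partial results are merged.
        results = []
        for table in self.local_tables:
            results.extend(table.select_where(predicate))
        return results


local_tables = [LocalTable() for _ in range(3)]
events = DistributedTable(local_tables)
for i in range(7):
    events.insert({"id": i, "even": i % 2 == 0})

matches = events.select_where(lambda row: row["even"])
print(sorted(row["id"] for row in matches))  # prints [0, 2, 4, 6]
```

The caller interacts only with the distributed table. Adding more local tables spreads both the stored rows and the query work across more servers.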

non-replicated table

A non-replicated table has only one replica. The non-replicated table is stored on only one server. The data of the non-replicated table cannot be copied to other servers.

replicated table

The data of a replicated table is automatically copied to multiple servers to form multiple replicas.

Note The difference between non-replicated tables and replicated tables lies in availability:
  • A non-replicated table cannot ensure high service availability after an error occurs.
  • A replicated table can continue to provide services if at least one of its replicas is still available after an error occurs.
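The availability difference can be sketched with a small Python model. The ReplicatedTable class is hypothetical and heavily simplified: every write is copied to all replicas immediately, and failure is modeled only as an up/down flag, with no recovery or catch-up logic. A non-replicated table corresponds to a replica count of 1.

```python
class ReplicatedTable:
    """Illustrative model of a table whose data is copied to several servers."""

    def __init__(self, replica_count):
        self.replicas = [{"up": True, "rows": []} for _ in range(replica_count)]

    def insert(self, row):
        # Simplification: each write is copied to every replica at once.
        for replica in self.replicas:
            replica["rows"].append(row)

    def fail(self, index):
        # Mark one replica as unavailable.
        self.replicas[index]["up"] = False

    def is_available(self):
        # The table can serve requests if at least one replica is still up.
        return any(replica["up"] for replica in self.replicas)

    def read(self):
        for replica in self.replicas:
            if replica["up"]:
                return list(replica["rows"])
        raise RuntimeError("no replica available")


replicated = ReplicatedTable(replica_count=2)
non_replicated = ReplicatedTable(replica_count=1)
for table in (replicated, non_replicated):
    table.insert({"id": 1})
    table.fail(0)  # the first replica becomes unavailable

print(replicated.is_available())      # prints True
print(non_replicated.is_available())  # prints False
```

After the first replica fails, the table with two replicas still returns its data from the remaining replica, while the table with a single replica cannot serve reads.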