All Products
Search
Document Center

ApsaraDB for ClickHouse:FAQ

Last Updated:Mar 25, 2024

This topic provides answers to some frequently asked questions (FAQ) about ApsaraDB for ClickHouse.

What are the unique features of ApsaraDB for ClickHouse compared with open source ClickHouse?

Multiple bugs of open source ClickHouse are fixed in ApsaraDB for ClickHouse to ensure stability. ApsaraDB for ClickHouse provides the resource queue feature for you to assign priorities to the resource queues that are bound to users.

Which version of an ApsaraDB for ClickHouse cluster is recommended for purchase?

ApsaraDB for ClickHouse provides services based on the long-term support (LTS) kernel versions released by the open source community. In most cases, if a kernel version remains stable for more than three months after release, you can purchase an ApsaraDB for ClickHouse cluster that uses the kernel version. We recommend that you purchase an ApsaraDB for ClickHouse cluster of V21.8 or later. For more information about the feature comparison among different kernel versions, see Features comparison among different kernel versions.

What are the differences between Single-replica Edition and Double-replica Edition?

  • For an ApsaraDB for ClickHouse cluster of Single-replica Edition, no shard has replica shards. This edition cannot ensure high availability. An ApsaraDB for ClickHouse cluster of Single-replica Edition creates multiple copies of your data and stores the data on disks. This edition is cost-effective and ensures the security of your data.

  • In an ApsaraDB for ClickHouse cluster of Double-replica Edition, each shard has a replica shard. This allows disaster recovery. If a shard in the cluster becomes unavailable, its replica shard takes over.

What do I do if the resources are insufficient in a selected zone when I create an ApsaraDB for ClickHouse cluster?

You can select another zone in the same region. ApsaraDB for ClickHouse clusters within the same virtual private cloud (VPC) can use the resources of different zones in the same region. The services in the same region are not affected by network latency.

What factors affect the time required by the scale-out or scale-in of a cluster?

Data is migrated during the scale-out or scale-in of a cluster. A large volume of cluster data requires an extended period of time for scaling.

How is a cluster affected during a scale-out or scale-in?

When you scale out or scale in a cluster, the cluster is read-only. This ensures data consistency after the data is migrated.

What are the suggestions for horizontal scaling?

Horizontal scaling may take a long time to complete. If the cluster performance cannot meet your business requirements, we recommend that you scale up your cluster. For more information about how to scale up a cluster, see Modify the configurations of an ApsaraDB for ClickHouse cluster.

How do ports work?

Protocol

Port

Use scenario

TCP

3306

This port is used when you use the clickhouse-client tool to connect to an ApsaraDB for ClickHouse cluster. For more information, see Use the CLI of ApsaraDB for ClickHouse to connect to a cluster.

HTTP

8123

This port is used when you use Java Database Connectivity (JDBC) to connect to an ApsaraDB for ClickHouse cluster for application development. For more information, see Use JDBC to connect to an ApsaraDB for ClickHouse cluster.

HTTPS

8443

This port is used when you use HTTPS to connect to an ApsaraDB for ClickHouse cluster. For more information, see Connect to an ApsaraDB for ClickHouse cluster over HTTPS.

What are the ports used by the SDK for each programming language to connect to an ApsaraDB for ClickHouse cluster?

Programming language

HTTP

TCP

Java

8123

3306

Python

Go

What are the recommended SDKs for Go or Python?

For more information, see Client Libraries from Third-party Developers.

What do I do if the error message "connect timed out" appears when I use a client tool to connect to an ApsaraDB for ClickHouse cluster?

To resolve this issue, perform the following operations:

  • Check whether the network connection is established. You can run the ping command to check the status of the network connection. You can also run the telnet command to check whether port 3306 and port 8123 of the ApsaraDB for ClickHouse cluster are enabled.

  • Check whether a whitelist is configured for the ApsaraDB for ClickHouse cluster to which you want to connect. For more information about how to configure a whitelist for an ApsaraDB for ClickHouse cluster, see Configure a whitelist.

  • Check whether the actual IP address of the client is used. The IP addresses of machines that are connected over the internal networks of enterprises are frequently changed. This way, the client IP address that you obtain may not be the actual IP address of the client. You can use an IP address checking tool such as WhatsMyIP to obtain the actual IP address of your client. For more information, see WhatsMyIP.

Why am I unable to connect an external table in an ApsaraDB for ClickHouse cluster to a table on platforms such as MySQL, Hadoop Distributed File System (HDFS), and Kafka?

When you create an external table in an ApsaraDB for ClickHouse cluster of V20.3 or V20.8, the system automatically verifies the network connection between the ApsaraDB for ClickHouse cluster and the external platform. If the external table is created, the network connection is established. The external table may fail to be created due to one of the following causes:

  • The table that you want to connect to is deployed in a cluster that is not in the same VPC as the ApsaraDB for ClickHouse cluster. In this case, the network connection fails to be established.

  • A whitelist is configured for the MySQL instance to which you want to connect. The CIDR block of the VPC to which the ApsaraDB for ClickHouse cluster is connected is not included in the whitelist. In this case, add the CIDR block to the whitelist.

For example, you create an external table in an ApsaraDB for ClickHouse cluster. The external table is connected to a table in a Kafka instance. When you query data from the external table, no results are returned. A common cause is that data in the Kafka instance fails to be parsed based on the fields and the format defined in the schema of the external table. The error message returns information about the parse error.

Why does my application fail to connect to an ApsaraDB for ClickHouse cluster?

For this issue, the following sections provide the common causes and solutions:

  • Cause 1: The VPC settings or Internet access settings are improper. If your application and the cluster are deployed in the same VPC, you can connect your application to the cluster over an internal network. If your application and the cluster are deployed in different VPCs, apply for a public endpoint for the cluster.

    Solution: Apply for a public endpoint for the cluster if your application and the cluster are deployed in different VPCs. For more information, see Apply for and release a public endpoint.

  • Cause 2: The IP address of your application is not added to the whitelist of the cluster.

    Solution: Add the IP address of your application to the whitelist of the cluster. For more information, see Configure a whitelist.

  • Cause 3: The security group of the Elastic Compute Service (ECS) instance on which the cluster runs denies access traffic.

    Solution: Enable the security group of the ECS instance on which the cluster runs to allow access traffic. For more information, see Work with security groups.

  • Cause 4: Your enterprise uses a network firewall.

    Solution: Modify the firewall rules.

  • Cause 5: The account password in the connection string contains any of the following special characters: !@#$%^&*()_+=. These special characters cannot be recognized during a connection. As a result, the application fails to connect to the ApsaraDB for ClickHouse cluster.

    Solution: Escape special characters in the connection string by using the following escape rules:

    ! : %21
    @ : %40
    # : %23
    $ : %24
    % : %25
    ^ : %5e
    & : %26
    * : %2a
    ( : %28
    ) : %29
    _ : %5f
    + : %2b
    = : %3d

    For example, if the password is ab@#c, the escaped characters of the password are ab%40%23c in the connection string.

  • Cause 6: The ApsaraDB for ClickHouse cluster is attached to a CLB instance by default. The CLB instance is charged by using the pay-as-you-go billing method. If your Alibaba Cloud account has an overdue payment, your ApsaraDB for ClickHouse cluster may be inaccessible.

    Solution: Check whether your Alibaba Cloud account has an overdue payment. If yes, pay off the overdue payment at your earliest opportunity.

How do I resolve a timeout error in an ApsaraDB for ClickHouse cluster?

The kernel of ApsaraDB for ClickHouse provides multiple timeout-related parameters and supports multiple protocols for interactions. For example, you can set parameters related to HTTP and TCP to resolve a timeout error in an ApsaraDB for ClickHouse cluster.

HTTP

HTTP is the most popular interaction protocol that is used by ApsaraDB for ClickHouse clusters in production environments. A number of tools and services adopt HTTP, such as the Java Database Connectivity (JDBC) driver that is provided by ClickHouse, Alibaba Cloud Data Management (DMS), and DataGrip. The common HTTP port is 8123.

  • Handle timeouts by setting the distributed_ddl_task_timeout parameter

    • The distributed_ddl_task_timeout parameter specifies the timeout period of distributed DDL queries that contain the on cluster clause. The default value is 180, in seconds. You can execute the following statement in DMS to globally set this parameter. After you set the parameter, restart your cluster.

      set global on cluster default distributed_ddl_task_timeout = 1800;

      Distributed DDL queries are asynchronously executed based on task queues that are created on ZooKeeper nodes. Even if a distributed DDL query times out, the query may not have failed. After a distributed DDL query times out, the query waits in a queue to be executed. You do not need to resubmit the query.

  • Handle timeouts by setting the max_execution_time parameter

    • The max_execution_time parameter specifies the timeout period for queries. The default timeout period for queries is 7,200 seconds in DMS and 30 seconds in the JDBC driver and DataGrip. If the waiting time of a query exceeds the specified timeout period, the query is automatically canceled. You can set this parameter in query statements. For example, you can execute the following query statement: select * from system.numbers settings max_execution_time = 3600. You can also execute the following statement in DMS to globally set this parameter:

      set global on cluster default max_execution_time = 3600;
  • Handle timeouts by setting the socket_timeout parameter

    • The socket_timeout parameter specifies the timeout period when the client listens to sockets over HTTP and waits for results. The default timeout period is 7,200 seconds in DMS and 30 seconds in the JDBC driver and DataGrip. This parameter is not a built-in parameter in ApsaraDB for ClickHouse, but a parameter of the JDBC driver for HTTP. This parameter specifies a time limit for the client to wait for results. The max_execution_time parameter may become invalid due to the setting of the socket_timeout parameter. In most cases, when you set the max_execution_time parameter, you also need to set the socket_timeout parameter. Set the socket_timeout parameter to a value slightly greater than the value of the max_execution_time parameter. To set the socket_timeout parameter, add the socket_timeout property to the JDBC connection string. The unit of the value is milliseconds. Example: 'jdbc:clickhouse://127.0.0.1:8123/default?socket_timeout=3600000'

  • Handle SLB disconnections

    • If no data packet is transmitted over a Server Load Balancer (SLB) connection in the specified time period, the connection is automatically closed and the client receives the "read timeout" error message. In this case, queries that are sent over the connection cannot be traced. You can execute the following statements in DMS to globally set the send_progress_in_http_headers and http_headers_progress_interval_ms parameters. After you set the parameters, restart your cluster.

      set global on cluster default send_progress_in_http_headers = 1; 
      set global on cluster default http_headers_progress_interval_ms = 60000; 

      If the kernel version of your cluster is 20.8, execute the following statements. If the settings of the parameters fail to take effect after you execute the following statements, you can apply for a minor version update.

      set global on cluster default http_server_enable_tcp_keep_alive = 1;
      set global on cluster default tcp_keep_alive_timeout = 60;

      After the send_progress_in_http_headers parameter is set to 1, the ApsaraDB for ClickHouse server continuously sends messages that contain information about the query execution progress in an HTTP header to the client. This way, data packets are transmitted over the connection all the time to prevent the connection from being closed.

      Even if the client is disconnected from the network, the queries that are sent over HTTP can be executed. You can query system tables to check whether the queries are executed.

      • Query information about the queries that are being executed on all nodes in the ApsaraDB for ClickHouse cluster.

        SELECT * FROM remote(default, system, processes) WHERE query LIKE 'XXX'
      • Query the execution results of the queries that are executed on the current day. The execution results include successful query results and error information about failed queries.

        SELECT * FROM remote(default, system, query_log) WHERE event_date = toDate(now()) AND query LIKE 'XXX'
  • Handle the hanging of a client that is connected to the ApsaraDB for ClickHouse server by using the server IP address

    • When the JDBC client on an ECS instance accesses the ApsaraDB for ClickHouse server in another security group, a silent connection error may occur on the ECS instance. The error occurred because the IP address of the ApsaraDB for ClickHouse server is not added to the whitelist of the security group to which the ECS instance belongs. If the system takes a long time to obtain the query result for a request of the client, the returned result may fail to be sent to the client because no route is available in the route table. In this case, the client is hung.

      To resolve this issue, set the send_progress_in_http_headers parameter to 1. This method is the same as the method of handling SLB disconnections. This method works in most cases. In rare cases, this issue persists after you set the send_progress_in_http_headers parameter to 1. If the issue persists after you set the parameter to 1, add the IP address of the ApsaraDB for ClickHouse server to the whitelist of the security group to which the ECS instance belongs. Then, check whether the issue is resolved.

TCP

TCP is mostly used in scenarios where interactive analysis is performed by using the built-in command-line tool of ApsaraDB for ClickHouse. The frequently used TCP port is 3306. TCP keepalive packets are used for scheduled connection failure detection. If TCP is used, socket timeouts do not occur. You need to set only the timeout-related parameters distributed_ddl_task_timeout and max_execution_time for TCP. You can set the parameters in the same manner in which you set the parameters for HTTP.

Why does an out of memory (OOM) error or a memory-related error occur when I import data in formats such as ORC and Parquet from an Object Storage Service (OSS) object to an external table in an ApsaraDB for ClickHouse cluster?

Common cause: The memory usage is high.

To resolve this issue, perform the following operations:

What do I do if the error message "too many parts" appears when I import data to an ApsaraDB for ClickHouse cluster?

A data part is generated each time you write data to a table in an ApsaraDB for ClickHouse cluster. If you write only one data record or a small amount of data each time, a large number of data parts are generated in your ApsaraDB for ClickHouse cluster. This increases the CPU utilization, and the period of time required to perform merge operations and queries. To prevent a large number of data parts from being generated in an ApsaraDB for ClickHouse cluster, a limit is imposed on the number of data parts that can be generated. If a large number of data parts are generated in an ApsaraDB for ClickHouse cluster, the error message "too many parts" is returned. If this error occurs, increase the batch size to write more data to a table in the ApsaraDB for ClickHouse cluster each time. If the batch size cannot be increased, you can set the merge_tree.parts_to_throw_insert parameter to a greater value in the ApsaraDB for ClickHouse console.

Why is a long period of time required to import data to a table in an ApsaraDB for ClickHouse cluster by using DataX?

For this issue, the following sections provide the common causes and solutions:

  • Cause 1: The specified parameter values are not suitable for the scenario. In ApsaraDB for ClickHouse, we recommend that you write data in multiple batches. Make sure that each batch contains a large volume of data and that a small number of concurrent data writing tasks are run at the same time. In most cases, a batch of data can contain tens of thousands of or even hundreds of thousands of rows. This depends on the size of a single row. In most cases, the estimated size of a row is 100 bytes. You must estimate a row size and calculate the number of rows based on the row size.

    Solution: Set the maximum number of concurrent write requests to 10. You can modify the settings of parameters and try again.

  • Cause 2: In an exclusive resource group that is created in DataWorks, ECS instances have low specifications. In an exclusive resource group, the low specifications of CPU cores or memory can impose limits on the request concurrency or outbound bandwidth. If you specify a large batch size but have insufficient memory capacity, garbage collection can occur in DataWorks Java processes.

    Solution: Check the specifications of the ECS instances based on the output logs of DataWorks.

  • Cause 3: A long period of time is required to perform a read operation in a data source.

    Solution: Search for totalWaitReaderTime and totalWaitWriterTime in the output logs of DataWorks. If the value of totalWaitReaderTime is much greater than the value of totalWaitWriterTime, a longer period of time is required to perform the read operation than the period of time required to perform the write operation.

  • Cause 4: A public endpoint is used. A public endpoint provides a limited bandwidth and cannot ensure high-performance data import or export.

    Solution: Replace the public endpoint with a VPC endpoint.

  • Cause 5: A data source contains dirty data. In normal cases, data is written in batches. If dirty data is read from the data source, the batch of data that is being written fails and a row of data is written each time. In addition, a large number of data parts are generated. This slows down the write operation.

    You can use the following methods to check whether dirty data is written:

    • Query data and check the error message that is returned. If the message contains "Cannot parse", dirty data is written.

      You can run the following sample code:

      SELECT written_rows, written_bytes, query_duration_ms, event_time, exception
      FROM system.query_log
      WHERE event_time BETWEEN '2021-11-22 22:00:00' AND '2021-11-22 23:00:00' AND lowerUTF8(query) LIKE '%insert into <table_name>%' and type != 'QueryStart' and exception_code != 0
      ORDER BY event_time DESC LIMIT 30;
    • Check the number of rows after a batch of data is written. If the number of rows is 1, dirty data is written.

      You can run the following sample code:

      SELECT written_rows, written_bytes, query_duration_ms, event_time
      FROM system.query_log
      WHERE event_time BETWEEN '2021-11-22 22:00:00' AND '2021-11-22 23:00:00' AND lowerUTF8(query) LIKE '%insert into <table_name>%' and type != 'QueryStart'
      ORDER BY event_time DESC LIMIT 30;

    Solution: Delete or modify the dirty data in the data source.

Why is the number of rows inconsistent between a table in a Hive instance and an external table in an ApsaraDB for ClickHouse cluster after data is synchronized between the tables?

To identify the cause of the inconsistency in the number of rows between the tables, perform the following steps:

  1. Check whether an error is reported during data synchronization based on a system table named query_log. If an error is reported, data loss may occur.

  2. Check whether the table engine that you use allows you to remove duplicates. If you use ReplacingMergeTree, the external table in the ApsaraDB for ClickHouse cluster may have a smaller number of rows compared with the table in the Hive instance.

  3. Check whether the number of rows in the table that is stored in the Hive instance is correct.

Why is the number of rows inconsistent between a table in a Kafka instance and an external table in an ApsaraDB for ClickHouse cluster after data is synchronized between the tables?

To identify the cause of the inconsistency in the number of rows between the tables, perform the following steps:

  1. Check whether an error is reported during data synchronization based on a system table named query_log. If an error is reported, data loss may occur.

  2. Check whether the table engine that you use allows you to remove duplicates. If you use ReplacingMergeTree, the external table in the ApsaraDB for ClickHouse cluster may have a smaller number of rows compared with the table in the Kafka instance.

  3. Check whether the kafka_skip_broken_messages parameter is set for the external table. If this parameter is set, ApsaraDB for ClickHouse may skip the Kafka messages that fail to be parsed. As a result, the external table in the ApsaraDB for ClickHouse cluster may have a smaller number of rows than the table in the Kafka instance.

How do I use Spark or Flink to import data to an ApsaraDB for ClickHouse cluster?

How do I migrate data from a self-managed ClickHouse database to an ApsaraDB for ClickHouse cluster?

To resolve this issue, perform the following operations:

Why is the "The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires" error reported when I use the MaterializeMySQL engine to synchronize MySQL data?

Common cause: The MaterializeMySQL engine stopped synchronizing data for a long period of time. As a result, MySQL binary logs expired and the system cleared the logs.

Solution: Delete the database that reported the error, and create a database that uses the MaterializeMySQL engine in ApsaraDB for ClickHouse.

Why does data synchronization on a table stop when the MaterializeMySQL engine is used to synchronize MySQL data? Why is the sync_failed_tables field in the system.materialize_mysql system table not null?

Common cause: The MySQL DDL statements that are executed during synchronization are not supported by ApsaraDB for ClickHouse.

Solution: Perform the following steps to execute another synchronization of MySQL data:

  1. Delete the table on which data synchronization stopped.

    DROP TABLE <table_name> ON cluster default;
    Note

    The table_name parameter specifies the name of the table on which data synchronization stopped. If the table on which data synchronization stopped is a distributed table, the local table and the distributed table must be deleted.

  2. Restart data synchronization.

    ALTER database <database_name> ON cluster default MODIFY SETTING skip_unsupported_tables = 1;
    Note

    The <database_name> parameter specifies the name of the database that uses the MaterializeMySQL engine in ApsaraDB for ClickHouse.

What do I do if the error message "Too many partitions for single INSERT block (more than 100)" appears?

Common cause: The number of partitions in a single inserted block specified in a single INSERT operation exceeds the value of max_partitions_per_insert_block. The default value of max_partitions_per_insert_block is 100. ApsaraDB for ClickHouse generates a data part for each write operation. A partition may contain one or more data parts. If a single INSERT operation inserts an excessively large number of partitions, a large number of data parts are generated in ApsaraDB for ClickHouse, which can impose a heavy burden on merge and query operations. To prevent a large number of data parts from being generated, ApsaraDB for ClickHouse imposes certain restrictions.

Solution: Perform the following operations to adjust the number of partitions or the value of max_partitions_per_insert_block:

  • Adjust the table schema and the partitioning mode or make sure that the number of partitions inserted at a time does not exceed the upper limit.

  • To prevent the number of partitions inserted at a time from exceeding the upper limit, execute the following statement to change the value of max_partitions_per_insert_block based on the data volume to increase the upper limit on the number of partitions that can be inserted at a time:

    SET GLOBAL ON cluster DEFAULT max_partitions_per_insert_block = XXX;
    Note

    In ClickHouse, the recommended default value of max_partitions_per_insert_block is 100. Do not set max_partitions_per_insert_block to an excessively large value. Otherwise, the performance may be affected. You can change the value of max_partitions_per_insert_block to the default value after you import data in batches.

What do I do if an error occurs because the memory usage exceeds the threshold value after I execute the INSERT INTO SELECT statement?

For this issue, the following sections provide the common causes and solutions:

  • Cause 1: The memory usage is high.

    Solution: Modify the settings of the max_insert_threads parameter to reduce the memory usage.

  • Cause 2: You executed the INSERT INTO SELECT statement to import data from one ApsaraDB for ClickHouse cluster to another cluster.

    Solution: Migrate data by importing a file to the destination cluster. For more information, see Migrate data from a self-managed ClickHouse cluster to an ApsaraDB for ClickHouse cluster.

How do I query the CPU utilization and memory usage?

You can view the logs of CPU utilization and memory usage in the system.query_log table. This table contains the statistics on the CPU utilization and memory usage for each query. For more information, see system.query_log.

What do I do if the memory usage of my system exceeds the threshold value when a query is performed?

The ApsaraDB for ClickHouse server provides a memory tracker for each query thread. The memory trackers of all threads for each query report the memory usage to a memory tracker named memory tracker for query. Then, memory tracker for query reports the obtained information to the upper-layer memory tracker that is named memory tracker for total. To resolve the issue, perform the following operations based on your scenario:

  • If the "Memory limit (for query)" message is returned, more than 70% of the cluster memory is consumed by queries. We recommend that you scale up the memory specifications.

  • If the "Memory limit (for total)" message is returned, more than 90% of the cluster memory is consumed. We recommend that you reduce the number of concurrent queries. If the issue persists after you perform this operation, a large amount of memory may be consumed by asynchronous tasks in the background. In most cases, the deduplication of the records that contain the same primary key value after data is written consumes a large amount of memory. We recommend that you scale up the memory specifications.

What do I do if the number of concurrent queries exceeds the upper limit?

By default, the maximum number of concurrent queries on the ApsaraDB for ClickHouse server is 100. You can change the value in the ApsaraDB for ClickHouse console. To change the maximum number of concurrent queries, perform the following steps:

  1. Log on to the ApsaraDB for ClickHouse console.

  2. On the Clusters page, click the Default Instances tab and click the ID of the cluster that you want to manage.

  3. In the left-side navigation pane of the cluster details page, click Parameter Configuration.

  4. Change the value of the max_concurrent_queries parameter. Enter a valid value and click OK.修改参数

  5. Click Submit Parameters.

  6. Click OK.

What do I do if a different query result is returned each time I execute the same statement to query data from a table when no data is being written to the table?

The number that is returned for the SELECT COUNT(*) statement may be approximately half the total number of data records, or different query results may be returned for the same statement.

To resolve this issue, perform the following operations:

  • Check whether your ApsaraDB for ClickHouse cluster is a multi-node cluster. If your ApsaraDB for ClickHouse cluster has multiple nodes, you must create distributed tables on the nodes. Then, you can write data to the distributed tables and query data from the distributed tables. This way, the same query result is returned each time you execute the statement to query data from the distributed tables. If you do not use distributed tables, the data of a different shard may be queried each time you execute the statement. As a result, different query results are returned for the same statement. For more information about how to create a distributed table, see Create a table.

  • Check whether your ApsaraDB for ClickHouse cluster is a master-replica cluster. If you use a master-replica cluster, create replicated tables on nodes. This ensures that your data is consistent between the replicas of a node. If you do not use replicated tables, the query result may vary based on the replica that is queried. For more information about how to create a replicated table, see Table engines.

Why am I unable to find the table that I created and what do I do if a different number of records are returned each time I execute the same statement?

For this issue, the following sections provide the common causes and solutions:

  • Cause 1: An invalid DDL statement is used to create a table in an ApsaraDB for ClickHouse cluster. To create a table in a distributed ApsaraDB for ClickHouse cluster, you must include the ON CLUSTER default clause in the CREATE TABLE statement. If you execute only the CREATE TABLE statement, the table is created only on the server to which your client is connected. In this case, you can query data only from this table. If you connect your client to another server, you cannot find the table.

    Solution:

    1. When you create a table, execute the CREATE TABLE <table_name> ON CLUSTER default statement. Due to the ON CLUSTER default clause, the statement is executed on all nodes in the default cluster. The following sample statement provides an example:

      CREATE TABLE test ON cluster default (a UInt64) Engine = MergeTree() ORDER BY tuple();
    2. Create a distributed table engine on the table named test. You can execute the following statement to create a distributed table named test_dis:

      CREATE TABLE test_dis ON cluster default AS test Engine = Distributed(default, default, test, cityHash64(a));
  • Cause 2: A ReplicatedMergeTree table is incorrectly configured. The ReplicatedMergeTree table engine is developed based on the MergeTree table engine and can be used to synchronize data between replicated tables. Only MergeTree table engines can be created in an ApsaraDB for ClickHouse cluster of Single-replica Edition. Only ReplicatedMergeTree table engines can be created in an ApsaraDB for ClickHouse cluster of Double-replica Edition.

    Solution: When you create a table in an ApsaraDB for ClickHouse cluster of Double-replica Edition, use ReplicatedMergeTree('/clickhouse/tables/{database}/{table}/{shard}', '{replica}') or ReplicatedMergeTree() to configure the ReplicatedMergeTree table engine. ReplicatedMergeTree('/clickhouse/tables/{database}/{table}/{shard}', '{replica}') is a fixed configuration and does not need to be modified.

What do I do if the timestamp data that is queried from a table is different from the timestamp data that is written to the table?

Execute the SELECT timezone() statement to query the time zone of your cluster. If your cluster does not use the local time zone, set the timezone parameter to the local time zone. For more information about how to change the values of parameters, see What do I do if the number of concurrent queries exceeds the upper limit?

What do I do if a created table is not found when I query data from the table?

Common cause: A DDL statement is executed only on one node.

Solution: Check whether the DDL statement that you executed to create the table includes the on cluster keyword. For more information, see Create Table.

Why does the volume of data in the local table that stores the data of a Kafka external table remain unchanged after I write data from the external table?

To identify the cause, execute the SELECT * FROM statement to query data from the external table. If an error message is reported, you can identify the cause based on the error message. In most cases, the error message indicates that data failed to be parsed. If a query result is returned and no errors are reported, check whether the fields in the local table and the fields in the external table are consistent. If data cannot be written to the local table, the fields in the local table and the fields in the external table are inconsistent. You can execute the following sample statement:

insert into <Local table> as select * from <External table that is connected to a table in a Kafka instance>;

Why are the time and time zone displayed on the ApsaraDB for ClickHouse client different from those displayed on the ApsaraDB for ClickHouse server?

The use_client_time_zone parameter is set for the client, and the parameter value is not set to the local time zone.

Why am I unable to view the data that is written to a table in an ApsaraDB for ClickHouse cluster?

In most cases, this is because a local table and its associated distributed table use different schemas. You can query a system table named system.distribution_queue to check whether an error occurs when you write data to the distributed table.

Why does the OPTIMIZE statement require a long period of time to execute?

The execution of the OPTIMIZE statement consumes a large number of CPU resources and increases the disk throughput. High CPU utilization and high disk throughput degrade query performance and increase the period of time that is required to execute the OPTIMIZE statement. If the OPTIMIZE statement is executed on a node that handles a heavy load, a long period of time is required to execute this statement. In this version, no solutions are available.

Why are the records that contain the same primary key value not deduplicated after the OPTIMIZE statement is executed?

Before you execute the OPTIMIZE statement, make sure that the following conditions are met. This ensures that the logic used to deduplicate the records that contain the same primary key value is valid.

  • When you define a table to store data, the fields specified in the PARTITION BY clause must be included in the fields specified in the order by clause. If two records in different partitions contain the same primary key value, the records are not deduplicated.

  • When you define a distributed table, the fields that you want to hash must be included in the fields specified in the order by clause. If two records on different nodes contain the same primary key value, the records are not deduplicated.

The following table describes common OPTIMIZE statements.

Statement

Description

optimize table test;

Attempts to deduplicate the records that contain the same primary key value in specific data parts of a MergeTree table. The system may return the result without deduplicating the records that contain the same primary key value. The execution of this statement does not ensure that all records that contain the same primary key value in the table are deduplicated. In most cases, this statement is not used.

optimize table test partition tuple();

Deduplicates the records that contain the same primary key value in all data parts of a specified partition. The system may return a result without deduplicating the records that contain the same primary key value. After this statement is executed, the records that contain the same primary key value in all data parts of the specified partition are deduplicated. After the deduplication process is complete, all records in the partition are stored in the same data part. If this statement is executed on a table that contains a default single partition, the records that contain the same primary key value in the single partition are deduplicated. If data is written during the execution of this statement, the written data is not deduplicated. If the partition that you specify contains only one data part, this statement is not executed.

Note

If a table does not contain a partition key, the default partition of the table is partition tuple().

optimize table test final;

Deduplicates the records that contain the same primary key value for each partition of a table. This statement can be executed on a table partition that contains one or more data parts. You can execute this statement to remove expired data records based on a specified time-to-live (TTL) value. Among the three statements, the execution of the OPTIMIZE TABLE test FINAL; statement consumes the largest amount of resources. After the OPTIMIZE TABLE test FINAL; statement is executed, the system may return a result without deduplicating the records that contain the same primary key value.

If you execute one of the preceding statements, you can set the optimize_throw_if_noop parameter. This parameter specifies whether to return an exception error if a statement does not deduplicate the records that contain the same primary key value.

Why does the TTL setting not take effect after the OPTIMIZE statement is executed?

For this issue, the following sections provide the common causes and solutions:

  • Cause 1: Eviction based on the specified TTL value occurs when the records that contain the same primary key value are being deduplicated. If the records that contain the same primary key value in data parts are not deduplicated for a long period of time, the expired records cannot be removed.

    Solution:

    • You can execute the OPTIMIZE TABLE statement by using the FINAL or PARTITION clause to trigger a task that deduplicates the records that contain the same primary key value.

    • When you create a table, you can set parameters such as merge_with_ttl_timeout and ttl_only_drop_parts to increase the deduplication frequency of the records that contain the same primary key value in the data parts that contain the expired data.

  • Cause 2: The TTL for a table is specified or changed. As a result, TTL information may be incomplete or incorrect in existing data parts, and expired data may not be deleted.

    Solution:

    • You can execute the ALTER TABLE materialize ttl statement to reconfigure the TTL.

    • You can execute the OPTIMIZE TABLE test PARTITION tuple(); statement to update the TTL.

Why is data not updated or deleted after the OPTIMIZE statement is executed?

ApsaraDB for ClickHouse updates or deletes data in an asynchronous manner. No measures can be taken to intervene in the progress of updating or deleting data. You can view the progress of each operation in the system.mutations table.

How do I create, delete, or modify a column by executing a DDL statement in ApsaraDB for ClickHouse?

For a local table, you can execute a DDL statement to create, delete, or modify a column. For a distributed table, you can use either of the following methods to create, delete, or modify a column in different scenarios:

  • If no data is being written to the distributed table, you can create, delete, or modify a column in the local table and then in the distributed table.

  • If data is being written to the distributed table, you can use one of the following procedures to perform different operations.

    Type

    Operation

    Create a nullable column.

    1. Create or modify a column in the local table.

    2. Create or modify a column in the distributed table.

    Change the data type of a column if data type conversion is supported.

    Delete a nullable column.

    1. Create or modify a column in the distributed table.

    2. Create or modify a column in the local table.

    Create a non-nullable column.

    1. Stop writing data.

    2. Execute the SYSTEM FLUSH DISTRIBUTED statement.

    3. Create or modify a column in the local table.

    4. Create or modify a column in the distributed table.

    5. Resume writing data to the distributed table.

    Delete a non-nullable column.

    Change the name of a column.

Why does the system stop responding during the execution of DDL statements or a long period of time is required to execute DDL statements?

Common cause: If you execute global DDL statements, the statements are executed in serial mode. A complex query may cause a deadlock to occur.

To resolve this issue, perform the following operations:

  • Wait until all DDL statements are executed.

  • Stop executing DDL statements in the ApsaraDB for ClickHouse console.

  • In the ApsaraDB for ClickHouse console, go to the cluster details page. In the left-side navigation pane, click Parameter Configuration. On the page that appears, change the value of a parameter and retain the previous value of the parameter. Then, click Submit Parameters.

    Note

    For more information about how to change the values of parameters, see What do I do if the number of concurrent queries exceeds the upper limit?

What do I do if the "longer than distributed_ddl_task_timeout (=xxx) seconds" error is returned when a DDL statement is executed in a distributed manner?

To resolve this issue, execute the SET GLOBAL ON CLUSTER default distributed_ddl_task_timeout=xxx statement to change the default timeout period. Replace xxx in the statement with the timeout period that you want to specify. The timeout period is measured in seconds. For more information about how to modify the settings of global parameters, see Set parameters in the user.xml file.

What do I do when a syntax error occurs for the SET GLOBAL ON CLUSTER default statement?

For this issue, the following sections provide the common causes and solutions:

  • Cause 1: The ApsaraDB for ClickHouse client parses the syntax of statements. However, the SET GLOBAL ON CLUSTER default statement is valid only on the ApsaraDB for ClickHouse server. If the client version does not match the server version, the client reports a syntax error when you execute the SET GLOBAL ON CLUSTER default statement.

    Solution:

    • Use tools that do not parse the syntax of statements on a client. For example, you can use DataGrip or DBeaver with a JDBC driver.

    • You can also write a JDBC program to execute the SET GLOBAL ON CLUSTER default statement.

  • Cause 2: You specified a string value that is not enclosed in single quotation marks (' ') in the SET GLOBAL ON CLUSTER default statement.

    Solution: Enclose the string value in the SET GLOBAL ON CLUSTER default statement in single quotation marks (' ').

What are the recommended business intelligence (BI) tools?

We recommend that you use Quick BI.

What are the recommended integrated development environments (IDEs) for data queries?

We recommend that you use DataGrip or DBeaver.

How do I check the disk space that is used by each table?

To check the disk space that is used by each table, you can execute the following sample statement:

SELECT table, formatReadableSize(sum(bytes)) as size, min(min_date) as min_date, max(max_date) as max_date FROM system.parts WHERE active GROUP BY table; 

How do I query the size of cold data?

To query the size of cold data, you can execute the following sample statement:

SELECT * FROM system.disks;

How do I check which data is in cold storage?

To check which data is in cold storage, you can execute the following sample statement:

SELECT * FROM system.parts WHERE disk_name = 'cold_disk';

How do I dump data in a partition to cold storage?

To dump data in a partition to cold storage, you can execute the following sample statement:

ALTER TABLE table_name MOVE PARTITION partition_expr TO DISK 'cold_disk';

Why is cluster monitoring data missing for a specific period of time?

This issue occurs due to one of the following causes:

  • When you query data, an OOM error is reported.

  • The configuration of your cluster is modified. This causes your cluster to restart.

  • The specifications of your cluster are scaled up or scaled down. This causes your cluster to restart.

Does an ApsaraDB for ClickHouse cluster of V20.8 or later support seamless updates that do not require data migration?

Yes, you can update an ApsaraDB for ClickHouse cluster of V20.8 or later without the need for data migration.

What are the common system tables?

The following table describes the common system tables.

Name

Description

system.processes

Contains the SQL statements that are being executed.

system.query_log

Contains the SQL statements that were executed.

system.merges

Contains information about the merge operations that are performed in a cluster.

system.mutations

Contains information about mutations in a cluster.

How do I modify the settings of system parameters? After I modify the settings of system parameters, do I need to restart my cluster? What are the impacts on the cluster if the cluster is restarted?

The config.xml file contains information about system settings. You can modify the settings of system parameters in the file. To modify the settings of a system parameter, perform the following steps:

  1. Log on to the ApsaraDB for ClickHouse console.

  2. On the Clusters page, click the Default Instances tab and click the ID of the cluster that you want to manage.

  3. In the left-side navigation pane of the cluster details page, click Parameter Configuration.

  4. Change the value of the max_concurrent_queries parameter. Enter a valid value and click OK.修改参数

  5. Click Submit Parameters.

  6. Click OK.

After you click OK, the ApsaraDB for ClickHouse server is automatically restarted. The restart causes a transient connection that lasts for approximately 1 minute.

How do I modify the settings of user parameters?

The users.xml file contains information about user settings. You can modify the settings of user parameters in the file. To modify the settings of a user parameter, execute the following sample statement:

SET global ON cluster default ${key}=${value};

Unless otherwise specified, the new parameter value immediately takes effect after the statement is executed.

How do I change a resource quota?

You can use the SETTINGS statement to change a resource quota. To change a resource quota, execute the following sample statement:

settings max_memory_usage = XXX;