
ApsaraDB for ClickHouse: FAQ

Updated: Dec 04, 2025

This topic lists frequently asked questions (FAQs) and provides solutions for ApsaraDB for ClickHouse.

What features does ApsaraDB for ClickHouse offer compared to the official version?

ApsaraDB for ClickHouse primarily fixes stability bugs found in the community version and provides resource queues that allow you to configure resource usage priorities at the user role level.

When I purchase an ApsaraDB for ClickHouse instance, which version is recommended?

ApsaraDB for ClickHouse is based on stable Long-Term Support (LTS) kernel versions from the open source community. A version is typically offered as a cloud service after it has been stable for approximately three months. We recommend that you purchase version 21.8 or later. For more information about the features of different versions, see Version feature comparison.

What are the features of single-replica and master-replica instances?

  • Single-replica instances do not have a replica node for each shard and therefore do not provide high availability. They are highly cost-effective, and data security is still ensured through multi-replica storage at the cloud disk layer.

  • Master-replica instances have a replica service node for each shard. The replica node provides disaster recovery if the primary node fails.

How do I handle the "Insufficient resources in the current region" message when I purchase resources?

You can choose another zone in the same region to make the purchase. VPCs support connectivity between different zones within the same region, and the network latency is negligible.

What factors affect the time required for horizontal scaling?

Horizontal scaling involves data migration. The more data an instance contains, the longer the migration takes.

How does scaling affect an instance?

To ensure data consistency after migration, the instance is in a read-only state during the scaling process.

What are the recommendations for horizontal scaling?

Horizontal scaling takes a long time. If your cluster performance is insufficient, prioritize vertical scaling. For more information about how to perform vertical scaling, see Vertical scaling and horizontal scaling for Community-Compatible Edition clusters.

What does each port do?

| Protocol | Port number | Scenarios |
| --- | --- | --- |
| TCP | 3306 | Use to connect to ApsaraDB for ClickHouse with the clickhouse-client tool. For more information, see Connect to ClickHouse through the command line interface. |
| HTTP | 8123 | Use to connect to ApsaraDB for ClickHouse using JDBC for application development. For more information, see Connect to ClickHouse through JDBC. |
| HTTPS | 8443 | Use to access ApsaraDB for ClickHouse using HTTPS. For more information, see Connect to ClickHouse through the HTTPS protocol. |

What are the corresponding ports for connecting to ApsaraDB for ClickHouse using SDKs in different programming languages?

| Programming language | HTTP protocol | TCP protocol |
| --- | --- | --- |
| Java | 8123 | 3306 |
| Python | 8123 | 3306 |
| Go | 8123 | 3306 |

What SDKs are recommended for Go and Python?

For more information, see third-party development libraries.

How do I resolve the "connect timed out" error when I connect to a cluster using a client?

You can use one of the following solutions:

  • Check the network connection. You can run the ping command to check network connectivity. You can also run the telnet command to check whether database ports 3306 and 8123 are open.

  • Check whether a whitelist is configured for ClickHouse. For more information, see Configure a whitelist.

  • Check whether the client IP address is correct. The IP addresses of computers in a corporate network often change, so the displayed IP address may be incorrect. You can use a professional IP address lookup service to determine the originating IP address. For more information, see whatsmyip.

Why can't I connect to external tables such as MySQL, HDFS, and Kafka?

Versions 20.3 and 20.8 automatically verify the connection when you create an external table. If the table is created, it indicates that the network is connected. If the table cannot be created, the common causes are as follows:

  • The destination endpoint and the ClickHouse cluster are not in the same VPC, which causes a network disconnection.

  • The MySQL endpoint has whitelist settings configured. You must add the IP address of the ClickHouse cluster to the whitelist of the MySQL endpoint.

In the case of a Kafka external table, the table may be created, but a query on the table returns no results. A common cause is that the data in Kafka cannot be parsed based on the fields and format specified in the table schema. The error message indicates the specific location of the parsing failure.

Why can't my program connect to ClickHouse?

The common causes and solutions are as follows:

  • Cause 1: The VPC or public network environment is not correctly configured. You can use an internal network connection if the program and the cluster are in the same VPC. If they are not in the same VPC, you must enable a public endpoint to connect.

    Solution: For more information about how to apply for a public endpoint, see Apply for and release a public endpoint.

  • Cause 2: A whitelist is not configured for the cluster.

    Solution: For more information about how to configure a whitelist, see Configure a whitelist.

  • Cause 3: The required ports are not open in the ECS security group.

    Solution: For more information about how to configure a security group, see Security group operations.

  • Cause 4: A corporate network firewall is blocking the connection.

    Solution: Modify the firewall rules.

  • Cause 5: The password in the connection string contains special characters, such as !@#$%^&*()_+=. These special characters may not be recognized during the connection and can cause the connection to fail.

    Solution: You must escape the special characters in the connection string. The escape rules are as follows:

    ! : %21
    @ : %40
    # : %23
    $ : %24
    % : %25
    ^ : %5e
    & : %26
    * : %2a
    ( : %28
    ) : %29
    _ : %5f
    + : %2b
    = : %3d

    Example: If the password is ab@#c, you must escape the special characters in the connection string. The password becomes ab%40%23c.

  • Cause 6: By default, ApsaraDB for ClickHouse mounts an SLB instance, which is a pay-as-you-go service. If your account has an overdue payment, your ApsaraDB for ClickHouse instance may become inaccessible.

    Solution: Check whether your Alibaba Cloud account has an overdue payment. If it does, pay the outstanding balance promptly.

How do I handle ClickHouse timeout issues?

The ApsaraDB for ClickHouse kernel has many timeout-related parameters and supports multiple interaction protocols. For example, you can set parameters for the HTTP and TCP protocols to resolve timeout issues.

HTTP protocol

The HTTP protocol is the most common method for interacting with ApsaraDB for ClickHouse in production environments. It is used on the backend by tools such as the official JDBC driver, Alibaba Cloud DMS, and DataGrip. The common port number for the HTTP protocol is 8123.

  • Resolving distributed_ddl_task_timeout issues

    • The default wait time for a distributed DDL query that includes `ON CLUSTER` to run is 180 seconds. You can run the following command in DMS to set the global parameter. A cluster restart is required for the new setting to take effect.

      set global on cluster default distributed_ddl_task_timeout = 1800;

      Distributed DDL uses ZooKeeper to build an asynchronous task queue. A timeout during execution does not mean that the query has failed. It only indicates that the previously sent request is still in the queue. You do not need to resend the task.

  • Resolving max_execution_time timeout issues

    • The default execution timeout for general queries is 7,200 seconds on the DMS platform, and 30 seconds for the JDBC driver and DataGrip. The query is automatically canceled after the timeout limit is reached. You can change this at the query level. For example, you can run select * from system.numbers settings max_execution_time = 3600. You can also run the following command in DMS to set the global parameter:

      set global on cluster default max_execution_time = 3600;
  • Resolving socket_timeout issues

    • The default wait time for the HTTP protocol to receive a response from a socket is 7,200 seconds on the DMS platform, and 30 seconds for the JDBC driver and DataGrip. This parameter is not a ClickHouse system parameter but a JDBC parameter for the HTTP protocol. However, it affects the max_execution_time parameter because it determines the maximum time that the client waits for a response. Therefore, when you adjust the max_execution_time parameter, also adjust the socket_timeout parameter to be slightly higher than max_execution_time. To set this parameter, add the socket_timeout property to the JDBC connection string and specify the value in milliseconds. For example: `jdbc:clickhouse://127.0.0.1:8123/default?socket_timeout=3600000`.

  • Client hangs when directly connecting to the ClickHouse server IP address

    • When ECS instances on Alibaba Cloud connect across security groups, silent connection failures may occur. This happens when the security group whitelist of the ECS instance that hosts the JDBC client has not been opened to the ClickHouse server. If a query takes a very long time to return results, the response packets may fail to reach the client because no route back to the client exists, and the client hangs.

      The solution to this problem is the same as for SLB connection disconnection issues. Enabling `send_progress_in_http_headers` can solve most problems. In rare cases where enabling `send_progress_in_http_headers` does not solve the problem, you must configure the security group whitelist of the ECS instance where the JDBC client is located. Add the ClickHouse server address to the whitelist.

TCP protocol

The TCP protocol is most commonly used for interactive analysis with the built-in command-line interface of ClickHouse. The common port for Community-Compatible Edition clusters is 3306. Because the TCP protocol uses connection keep-alive messages, it does not experience socket-level timeout issues. You only need to monitor the timeouts for the distributed_ddl_task_timeout and max_execution_time parameters. These parameters are set in the same way as those for the HTTP protocol.

Why do memory errors or out-of-memory (OOM) crashes occur when I import data in formats such as ORC and PARQUET from an OSS external table?

This issue is commonly caused by high memory usage when large ORC or Parquet files are parsed in memory. You can reduce peak memory usage by splitting large source files into smaller files and importing them in batches, or by lowering the import concurrency.

How do I handle the "too many parts" error when I import data?

ClickHouse generates a data part for each write operation. If you write a small amount of data at a time, many data parts are created. This places a heavy burden on merge operations and queries. To prevent too many data parts from being created, ClickHouse has internal limitations, which cause the "too many parts" error. If this error occurs, you can increase the batch size for write operations. If you cannot adjust the batch size, you can modify the merge_tree.parts_to_throw_insert parameter in the console and set it to a larger value.
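
To confirm that part counts are the cause before you adjust anything, you can count the active parts per table in the system.parts system table. A sketch:

SELECT database, table, count() AS active_parts
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY active_parts DESC
LIMIT 10;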

Why is DataX data import slow?

The common causes and solutions are as follows:

  • Cause 1: Parameters are improperly configured. ClickHouse is suited to write operations that use large batches and low concurrency. In most cases, the batch size can reach tens of thousands or even hundreds of thousands of rows, depending on your row size. You can estimate the row size at 100 bytes per row, but calculate it based on your actual data characteristics.

    Solution: We recommend that you keep the number of concurrent processes at 10 or fewer and experiment with different parameter values.

  • Cause 2: The ECS instance of the exclusive resource group for DataWorks has low specifications. For example, if the CPU and memory of the exclusive resource are too small, the number of concurrent processes and the outbound bandwidth are limited. If the batch size is too large and the memory is too small, Java garbage collection (GC) may be triggered in the DataWorks process.

    Solution: Check the ECS instance specifications in the DataWorks output logs.

  • Cause 3: Data is read slowly from the data source.

    Solution: Search for `totalWaitReaderTime` and `totalWaitWriterTime` in the DataWorks output logs. If `totalWaitReaderTime` is significantly greater than `totalWaitWriterTime`, the bottleneck is at the read end, not the write end.

  • Cause 4: A public endpoint is used. The bandwidth of public endpoints is limited and cannot support high-performance data import and export.

    Solution: Replace the public endpoint with a VPC endpoint.

  • Cause 5: Dirty data is present. Without dirty data, data is written in batches. However, if dirty data is encountered, the current batch write operation fails and falls back to row-by-row writing. This generates many data parts and significantly reduces the write speed.

    You can use one of the following methods to check for dirty data:

    • Check the error message. If the returned message contains Cannot parse, dirty data exists.

      The code is as follows:

      SELECT written_rows, written_bytes, query_duration_ms, event_time, exception
      FROM system.query_log
      WHERE event_time BETWEEN '2021-11-22 22:00:00' AND '2021-11-22 23:00:00' AND lowerUTF8(query) LIKE '%insert into <table_name>%' and type != 'QueryStart' and exception_code != 0
      ORDER BY event_time DESC LIMIT 30;
    • Check the number of rows in the batch. If the number of rows is 1, dirty data exists.

      The code is as follows:

      SELECT written_rows, written_bytes, query_duration_ms, event_time
      FROM system.query_log
      WHERE event_time BETWEEN '2021-11-22 22:00:00' AND '2021-11-22 23:00:00' AND lowerUTF8(query) LIKE '%insert into <table_name>%' and type != 'QueryStart'
      ORDER BY event_time DESC LIMIT 30;

    Solution: Delete or modify the dirty data in the data source.

Why does the number of rows in ClickHouse not match the number of rows in Hive after I import data from Hive?

You can use the following methods to troubleshoot the issue:

  1. Check the `query_log` system table to see if any errors occurred during the import. If errors occurred, data loss may have happened.

  2. Determine whether the table engine supports deduplication. For example, if you use the ReplacingMergeTree engine, the row count in ClickHouse may be smaller than in Hive.

  3. Reconfirm the accuracy of the row count in Hive. The source row count may have been determined incorrectly.

Why does the number of rows in ClickHouse not match the number of rows in Kafka after I import data from Kafka?

You can use the following methods to troubleshoot the issue:

  1. Check the `query_log` system table to see if any errors occurred during the import. If errors occurred, data loss may have happened.

  2. Determine whether the table engine supports deduplication. For example, if you use the ReplacingMergeTree engine, the row count in ClickHouse may be smaller than in Kafka.

  3. Check whether the `kafka_skip_broken_messages` parameter is configured for the Kafka external table. If this parameter is present, Kafka messages that fail to be parsed may be skipped. This causes the total number of rows in ClickHouse to be smaller than in Kafka.

How do I import data using Spark or Flink?

For more information, see the topics about importing data from Spark and importing data from Flink.

How do I import data from an existing ClickHouse instance to an ApsaraDB for ClickHouse instance?

You can use the `remote` function or file export and import. For more information, see Migrate data from a self-managed ClickHouse instance to an ApsaraDB for ClickHouse Community-Compatible Edition instance.

When I synchronize MySQL data using the MaterializeMySQL engine, why do I receive the following error: The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires?

Common cause: The MaterializeMySQL engine stopped synchronizing for too long, which caused the MySQL binary logs to expire and be deleted.

Solution: Delete the database that has the error, and re-create the synchronized database in ApsaraDB for ClickHouse.

When I synchronize MySQL data using the MaterializeMySQL engine, why does table synchronization stop? Why is the sync_failed_tables field in the system.materialize_mysql system table not empty?

Common cause: MySQL DDL statements that are not supported by ApsaraDB for ClickHouse were used during the synchronization.

Solution: Resynchronize the MySQL data. To do so, perform the following steps:

  1. Delete the table that stopped synchronizing.

    DROP TABLE <table_name> ON cluster default;
    Note

    <table_name> is the name of the table that stopped synchronizing. If the table has a distributed table, you must delete both the local table and the distributed table.

  2. Restart the synchronization process.

    ALTER database <database_name> ON cluster default MODIFY SETTING skip_unsupported_tables = 1;
    Note

    <database_name> is the database in ApsaraDB for ClickHouse that is being synchronized.

How do I handle the "Too many partitions for single INSERT block (more than 100)" error?

Common cause: A single INSERT operation exceeds the `max_partitions_per_insert_block` limit. The default value is 100. ClickHouse generates a data part for each write operation. A partition may contain one or more data parts. If a single INSERT operation inserts data into too many partitions, it creates many data parts in ClickHouse. This places a heavy burden on merge operations and queries. To prevent too many data parts, ClickHouse has internal limitations.

Solution: Adjust the number of partitions or the `max_partitions_per_insert_block` parameter.

  • Adjust the table schema or modify the partitioning method so that a single insert does not write to more partitions than the limit allows.

  • If you cannot change the partitioning method, increase the `max_partitions_per_insert_block` parameter based on the data volume. The syntax is as follows:

    Standalone instance

    SET GLOBAL max_partitions_per_insert_block = XXX;

    Multi-node deployment

    SET GLOBAL ON cluster DEFAULT max_partitions_per_insert_block = XXX;
    Note

    The ClickHouse community recommends the default value of 100. Do not set the value too high, because a large number of data parts can degrade performance. After a bulk data import, you can change the value back to the default.

How do I handle an out-of-memory error when I run an INSERT INTO SELECT statement?

An INSERT INTO SELECT statement reads and writes data at the same time, so it can consume a large amount of memory. You can diagnose it in the same way as a query out-of-memory error. For more information, see How do I handle an out-of-memory error during a query?

How do I query CPU and memory usage?

You can view the CPU and memory usage logs for queries in the `system.query_log` system table. This table contains statistics on the CPU and memory usage for each query. For more information, see system.query_log.
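
For example, the following sketch against the standard system.query_log schema lists the ten most memory-intensive completed queries:

SELECT event_time, query_duration_ms, memory_usage, read_rows, query
FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY memory_usage DESC
LIMIT 10;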

How do I handle an out-of-memory error during a query?

The ClickHouse server has a memory tracker for each query thread. The trackers for all threads in the same query report to a `memory tracker for query`. Above that is a `memory tracker for total`. You can use one of the following solutions based on your situation:

  • If you encounter a Memory limit (for query) error, the query is using too much memory (70% of the instance's total memory) and has failed. In this case, you can perform a vertical upgrade to increase the instance's memory.

  • If you encounter a Memory limit (for total) error, the total memory usage of the instance has exceeded the limit (90% of the instance's total memory). In this case, you can try to reduce query concurrency. If the issue persists, background asynchronous tasks may be using a large amount of memory. This is often the case for primary key merge tasks after a write. You need to perform a vertical upgrade to increase the instance's memory.
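
To check the memory limits that are currently in effect before you upgrade, you can query the memory-related settings. A sketch:

SELECT name, value
FROM system.settings
WHERE name LIKE 'max_memory_usage%';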

Why do I receive a memory limit error when I run an SQL statement on an Enterprise Edition cluster?

Cause: In an ApsaraDB for ClickHouse Enterprise Edition cluster, the total number of ClickHouse Compute Units (CCUs) is the sum of the CCUs from all nodes. A single node has 32 cores and 128 GB of memory (32 CCUs), with a memory limit of 128 GB. This includes memory for the operating system, which leaves approximately 115 GB of available memory. By default, a single SQL statement runs on one node. Therefore, if a single SQL statement consumes more than 115 GB of memory, a memory limit error may occur.

Note

The number of nodes in a cluster is determined only by the cluster's CCU upper limit. When the CCU upper limit is 64 or greater, the number of nodes in an Enterprise Edition cluster is calculated as: CCU upper limit / 32. When the CCU upper limit is less than 64, an Enterprise Edition cluster has 2 nodes.

Solution: Set the following parameters after the SQL statement to run the query in parallel across multiple nodes. This can reduce the load and prevent memory limit errors.

SETTINGS 
    allow_experimental_analyzer = 1,
    allow_experimental_parallel_reading_from_replicas = 1;

How do I resolve an SQL error that is caused by excessive memory usage when I run a GROUP BY operation on a large result set?

You can set the max_bytes_before_external_group_by parameter to limit the memory consumption of GROUP BY operations. Note that the allow_experimental_analyzer parameter affects whether this setting takes effect.
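
A minimal query-level sketch; the table name and thresholds are illustrative, and the common guideline is to set max_bytes_before_external_group_by to roughly half of max_memory_usage:

SELECT user_id, count() AS cnt
FROM <large_table>
GROUP BY user_id
SETTINGS max_bytes_before_external_group_by = 10000000000, max_memory_usage = 20000000000;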

How do I handle a concurrency limit error during a query?

The default maximum query concurrency for the server is 100. You can modify this in the console. To modify the running parameter value, perform the following steps:

  1. Log on to the ApsaraDB for ClickHouse console.

  2. On the Cluster List page, select List of Community Edition Instances, and click the target cluster ID.

  3. In the navigation pane on the left, click Parameter Configuration.

  4. On the Parameter Configuration page, click the edit button next to Operating parameter value for the max_concurrent_queries parameter.

  5. Enter the target value in the pop-up box and click Confirm.

  6. Click Submit parameters.

How do I handle inconsistent results for the same query statement when data writing is stopped?

Problem description: When you query data using select count(*), the result is only about half of the total data, or the data volume constantly fluctuates.

You can use one of the following solutions:

  • Check whether it is a multi-node cluster. For multi-node clusters, you must create distributed tables, write data to them, and query them to obtain consistent results. Otherwise, each query accesses data from different shards, which results in inconsistent results. For more information about how to create a distributed table, see Create a distributed table.

  • Check whether it is a master-replica cluster. Master-replica clusters require tables with Replicated series table engines to synchronize data between replicas. Otherwise, each query accesses a different replica, which results in inconsistent results. For more information about how to create a table with a Replicated series table engine, see Table engines.

Why can I sometimes not see a created table and the query results fluctuate?

The common causes and solutions are as follows:

  • Cause 1: Issues with the table creation process. A distributed ClickHouse cluster does not have native distributed DDL semantics. If you use create table to create a table in a self-managed ClickHouse cluster, the query may return a success message, but the table is created only on the server to which you are currently connected. If you reconnect to a different server, you will not see this table.

    Solution:

    1. When you create a table, use the create table <table_name> on cluster default statement. The on cluster default declaration broadcasts this statement to all nodes in the default cluster for execution. The following code provides an example:

      CREATE TABLE test ON cluster default (a UInt64) Engine = MergeTree() ORDER BY tuple();
    2. Create a distributed table engine on the `test` table. The table creation statement is as follows:

      CREATE TABLE test_dis ON cluster default AS test Engine = Distributed(default, default, test, cityHash64(a));
  • Cause 2: Issues with the ReplicatedMergeTree storage table configuration. The ReplicatedMergeTree table engine is an enhanced version of the MergeTree table engine with master-replica synchronization. Single-replica instances can only create MergeTree tables. Master-replica instances can only create ReplicatedMergeTree tables.

    Solution: When you create a table on a master-replica instance, use ReplicatedMergeTree('/clickhouse/tables/{database}/{table}/{shard}', '{replica}') or ReplicatedMergeTree() to configure the ReplicatedMergeTree table engine. The ReplicatedMergeTree('/clickhouse/tables/{database}/{table}/{shard}', '{replica}') configuration is fixed and does not require modification.

How do I handle the issue where the queried result of timestamp data is different from the actual data written to a table?

You can run the SELECT timezone() statement to check whether the time zone is set to your local time zone. If not, modify the value of the `timezone` configuration item to your local time zone. For more information about how to modify the value, see Modify the running parameter value of a configuration item.
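
For example, assuming Asia/Shanghai is your local time zone (an illustrative value), you can compare the server time zone with an explicit conversion:

SELECT timezone(), now(), toDateTime(now(), 'Asia/Shanghai');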

How do I handle the issue where a table does not exist after it is created?

Common cause: The DDL statement was run on only one node.

Solution: Check whether the DDL statement includes the on cluster keyword. For more information, see CREATE TABLE syntax.

Why does data not increase after a Kafka external table is created?

You can run a select * from query on the Kafka external table. If the query returns an error, use the error message to determine the cause, which is usually a data parsing failure. If the query returns results, check whether the fields of the destination table (the storage table for the Kafka external table) match the fields of the Kafka external table. If the write fails, the fields do not match. The following is a sample statement:

INSERT INTO <destination_table> SELECT * FROM <kafka_external_table>;

Why is the time displayed on the client different from the time in the specified time zone?

The client has enabled use_client_time_zone but has an incorrect time zone configured.

Why is data not visible after it is written?

Problem description: Why can't I query data after it has been written?

Cause: The common causes are as follows:

  • The table schemas of the distributed table and the local tables are inconsistent.

  • After data is written to a distributed table, the temporary files are not distributed.

  • After data is written to one of the replicas in a master-replica setup, replica synchronization is incomplete.

Cause analysis and solutions:

Inconsistent table schemas between distributed and local tables

You can query the system.distribution_queue system table to check whether errors occurred when data was written to a distributed table.
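
A sketch of such a check; the last_exception column usually reveals the schema mismatch:

SELECT database, table, is_blocked, error_count, data_files, last_exception
FROM system.distribution_queue;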

Temporary files are not distributed after data is written to a distributed table

Cause analysis: In a multi-node ApsaraDB for ClickHouse cluster, if your service connects to the database using a domain name and runs an INSERT statement on a distributed table, the INSERT request is forwarded by a front-end SLB instance and is routed to a random node in the cluster. After a node receives the INSERT request, it writes a portion of the data directly to its local disk. The remaining data is staged on the node as temporary files and then asynchronously distributed to other nodes in the cluster. If this distribution process is not complete, a subsequent query may fail to retrieve the undistributed data.

Solution: If your business requires high accuracy for query results immediately after an INSERT operation, you can add settings insert_distributed_sync = 1 to the INSERT statement. After you configure this parameter, INSERT operations on distributed tables become synchronous. The INSERT statement returns a success message only after all nodes have completed the distribution. For more information about how to configure this parameter, see the following content:

Important
  • After you set this parameter, the execution time of the INSERT statement may increase because it must wait for data distribution to complete. Consider the write performance requirements of your business before you set this parameter.

  • This parameter takes effect at the cluster level and should be considered carefully. We recommend that you test by adding this parameter to individual queries. After you verify that it is correct, decide whether to apply it at the cluster level as needed.

  • To make the setting effective for a single query, you can add it after the query. The following is an example:

    INSERT INTO <table_name> values() settings insert_distributed_sync = 1;
  • To make the setting effective at the cluster level, you can set it in the `user.xml` file. For more information, see Configure user.xml parameters.

Replica synchronization is incomplete after data is written to one of the replicas

Cause analysis: In a master-replica ApsaraDB for ClickHouse cluster, when an INSERT statement is run, only one of the two replicas, selected at random, runs the statement. The other replica asynchronously synchronizes the data. Therefore, after an INSERT statement is run, if a SELECT query is handled by the replica that has not yet completed data synchronization, the query may not find the expected data.

Solution: If your business requires high accuracy for query results immediately after an INSERT operation, you can add settings insert_quorum = 2 to the write statement. After this parameter is configured, data synchronization between replicas becomes synchronous. The INSERT statement returns a success message only after synchronization is complete across all replicas.

Important

Note the following when you set this parameter:

  • After you set this parameter, the execution time of the INSERT statement may increase because it must wait for data synchronization between replicas to complete. Consider the write performance requirements of your business before you set this parameter.

  • After you set this parameter, an INSERT statement can succeed only after data is synchronized between replicas. This means that if a replica is unavailable, all writes configured with `insert_quorum = 2` fail. This conflicts with the reliability guarantee of a master-replica cluster.

  • This parameter takes effect at the cluster level and should be considered carefully. We recommend that you test by adding this parameter to individual queries. After you verify that it is correct, decide whether to apply it at the cluster level as needed.

  • To make the setting effective for a single query, you can add it after the query. The following is an example:

    INSERT INTO <table_name> values() settings insert_quorum = 2;
  • To make the setting effective at the cluster level, you can set it in the `user.xml` file. For more information, see Configure user.xml parameters.

Why is expired data not deleted after a TTL is set in ClickHouse?

Problem description

A Time to Live (TTL) is correctly configured for a table, but the expired data in the table is not automatically deleted. This indicates that the TTL is not in effect.

Troubleshooting

  1. Check whether the TTL settings for the table are reasonable.

    Set the TTL based on your business needs. We recommend that you set the TTL at the day level. Avoid setting the TTL at the second or minute level, such as TTL event_time + INTERVAL 30 SECOND.

  2. Check the materialize_ttl_after_modify parameter.

    This parameter controls whether to apply a new TTL rule to historical data after you run an ALTER MODIFY TTL statement. The default value is 1, which means the rule is applied. A value of 0 means the rule applies only to new data. Historical data is not affected by the TTL limit.

    • View the parameter settings.

      SELECT * FROM system.settings WHERE name like 'materialize_ttl_after_modify';
    • Modify the parameter settings.

      Important

      This command scans all historical data, which can consume significant resources. Use it with caution.

      ALTER TABLE $table_name MATERIALIZE TTL;
  3. Check the partition cleanup policy.

    If the ttl_only_drop_parts parameter is set to 1, a data partition is deleted only when all data within the partition expires.

    • View the ttl_only_drop_parts parameter settings.

      SELECT * FROM system.merge_tree_settings WHERE name LIKE 'ttl_only_drop%';
    • View the partition expiration status.

      SELECT partition, name, active, bytes_on_disk, modification_time, min_time, max_time, delete_ttl_info_min, delete_ttl_info_max FROM system.parts WHERE database = 'your_dbname' AND table = 'your_tablename' LIMIT 100;
      • delete_ttl_info_min: The minimum date and time key value in the part used for TTL DELETE rules.

      • delete_ttl_info_max: The maximum date and time key value in the part used for TTL DELETE rules.

    • If the partitioning rules do not match the TTL rules, some data may not be cleaned up for a long time. The following describes how partitioning rules and TTL rules match:

      • If the partitioning rule matches the TTL rule (for example, data is partitioned by day and the TTL rule also deletes data by day), the system can evaluate the TTL based on the `partition_id` and drop one partition at a time. This policy has the lowest cost. We recommend that you combine partitioning (such as daily partitioning) with the ttl_only_drop_parts=1 setting to efficiently delete expired data and improve performance.

      • If the partitioning rule does not match the TTL rule and ttl_only_drop_parts = 1, matching is performed based on the `ttl_info` of each part. The TTL operation is performed only when the entire part exceeds the `delete_ttl_info_max` time.

      • If the partitioning rule does not match the TTL rule and ttl_only_drop_parts = 0, the system must scan the data in each part to find and delete the data that needs to be deleted. This policy has the highest cost.

  4. Control the merge trigger frequency.

    Expired data is deleted asynchronously during the data merging process, not in real time. You can use the merge_with_ttl_timeout parameter to control the merge frequency, or use the ALTER TABLE ... MATERIALIZE TTL command to force the application of TTL.

    • View the parameter.

      SELECT * FROM system.merge_tree_settings WHERE name = 'merge_with_ttl_timeout';
      Note

      The unit is seconds. The default value for online instances is 7200 seconds (2 hours).

    • Modify the parameter.

      If merge_with_ttl_timeout is set too high, the TTL merge frequency decreases, which causes expired data to remain uncleared for a long time. You can lower this parameter to increase the cleanup frequency. For more information, see Parameter Description.

  5. Check the thread pool parameter settings.

    Data expiration based on TTL occurs during the part merge phase. This process is limited by the max_number_of_merges_with_ttl_in_pool and background_pool_size parameters. The default values for online instances are 2 and 16, respectively.

    • Query the current background thread activity.

      SELECT * FROM system.metrics WHERE metric LIKE 'Background%';

      `BackgroundPoolTask` represents the real-time monitoring value of the `background_pool_size` metric.

    • Modify the parameters.

      If your other parameter settings are normal and the CPU is relatively idle, you can first increase the max_number_of_merges_with_ttl_in_pool parameter, for example, from 2 to 4, or from 4 to 8. If this adjustment has no effect, you can increase the background_pool_size parameter.

      Important

      Adjusting the max_number_of_merges_with_ttl_in_pool parameter requires a cluster restart. Increasing the background_pool_size parameter does not require a cluster restart, but decreasing the background_pool_size parameter does.

  6. Check whether the table schema and partition design are reasonable.

    If a table is not partitioned properly or the partition granularity is too large, the TTL cleanup efficiency is reduced. To efficiently clean up expired data, we recommend that you match the partition granularity with the TTL granularity (for example, both by day). For more information, see Best practices.

  7. Check whether the cluster has sufficient disk space.

    TTL is triggered in the background along with merge operations, which requires a certain amount of disk space. If large parts exist or disk space is insufficient (usage exceeds 90%), TTL may fail to run.

  8. Check other system parameter settings in system.merge_tree_settings.

    • merge_with_recompression_ttl_timeout: The minimum delay before merging again with recompression TTL. For online instances, this is set to 4 hours by default. By default, TTL rules are applied to tables at least once every 4 hours. To apply TTL rules more frequently, you can modify this setting.

    • max_number_of_merges_with_ttl_in_pool: A parameter that controls the maximum number of threads that can be used for TTL tasks. When the number of merge tasks with TTL in progress in the background thread pool exceeds the value of this parameter, new merge tasks with TTL are not allocated.

Why is the OPTIMIZE task slow?

OPTIMIZE tasks consume a lot of CPU and disk throughput. Queries and OPTIMIZE tasks affect each other. When the node load is high, the OPTIMIZE task appears slow. Currently, there is no special optimization method.

Why is data not merged by primary key after an OPTIMIZE task is run?

To ensure that data is correctly merged by primary key, the following two prerequisites must be met:

  • The `PARTITION BY` field defined in the storage table must be included in the ORDER BY clause. Data from different partitions is not merged by primary key.

  • The hash algorithm field defined in the distributed table must be included in the ORDER BY clause. Data from different nodes is not merged by primary key.
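
The following sketch shows a local and distributed table pair that satisfies both prerequisites; all names are illustrative. The partition key field event_date and the hash field user_id both appear in the ORDER BY clause:

CREATE TABLE test_local ON cluster default
(
    user_id UInt64,
    event_date Date,
    value UInt64
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id);

CREATE TABLE test_dist ON cluster default AS test_local
ENGINE = Distributed(default, default, test_local, cityHash64(user_id));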

Common OPTIMIZE commands and their descriptions are as follows:

| Command | Description |
| --- | --- |
| optimize table test; | Attempts to select MergeTree data parts for merging. It may return without running any tasks. Even if it runs, it does not guarantee that all records in the table are merged by primary key. This command is generally not used. |
| optimize table test partition tuple(); | Specifies a partition and selects all data parts in the partition for merging. It may return without running any tasks. After the task runs, all data in the partition is merged into a single data part, and primary key merging is completed within that partition. However, data written during the task does not participate in the merge. If the partition contains only one data part, the task is not run again. Note: For tables without a partition key, the default partition is `partition tuple()`. |
| optimize table test final; | Forces a merge for all partitions in the entire table. Even if a partition has only one data part, it is re-merged. This can be used to forcibly remove records that have expired due to TTL. This task has the highest execution cost, but it may also return without running any merge tasks. |

For the three commands, you can set the optimize_throw_if_noop parameter to detect whether a task is run using exception reporting.
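
For example, a sketch that uses the test table from the table above:

SET optimize_throw_if_noop = 1;
OPTIMIZE TABLE test PARTITION tuple();

If no merge task is run, the OPTIMIZE statement then returns an exception instead of succeeding silently.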

Why is the TTL of data not in effect after an OPTIMIZE task is run?

The common causes and solutions are as follows:

  • Cause 1: Data TTL expiration is handled during the primary key merge phase. If data parts are not merged by primary key for a long time, expired data cannot be removed.

    Solution:

    • Manually trigger merge tasks using OPTIMIZE FINAL or OPTIMIZE PARTITION.

    • Set parameters such as `merge_with_ttl_timeout` and `ttl_only_drop_parts` when you create a table to increase the merge frequency of data parts that contain expired data.

  • Cause 2: The table's TTL has been modified or added, and existing data parts lack TTL information or have incorrect information. This can also prevent expired data from being removed.

    Solution:

    • Regenerate TTL information using the ALTER TABLE ... MATERIALIZE TTL command.

    • Update TTL information using OPTIMIZE PARTITION.

Why are update and delete operations not in effect after an OPTIMIZE task is run?

Updates and deletions in ApsaraDB for ClickHouse are run asynchronously. Currently, there is no mechanism to intervene in their progress. You can view the progress in the `system.mutations` system table.
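
For example, the following sketch lists mutations that are still in progress, together with the latest failure reason if one exists:

SELECT database, table, mutation_id, command, create_time, is_done, latest_fail_reason
FROM system.mutations
WHERE is_done = 0;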

How do I run DDL statements to add, delete, or modify columns?

Modifications to local tables can be run directly. To modify a distributed table, the procedure varies based on the situation:

  • If no data is being written, you can modify the local table first, and then modify the distributed table.

  • If data is being written, you must handle different types of operations differently:

    | Type | Procedure |
    | --- | --- |
    | Add a `Nullable` column; Modify a column's data type (if the types are convertible) | 1. Modify the local table. 2. Modify the distributed table. |
    | Delete a `Nullable` column | 1. Modify the distributed table. 2. Modify the local table. |
    | Add a non-`Nullable` column; Delete a non-`Nullable` column; Modify a column name | 1. Stop data writing. 2. Run `SYSTEM FLUSH DISTRIBUTED` on the distributed table. 3. Modify the local table. 4. Modify the distributed table. 5. Resume data writing. |
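
For example, to add a `Nullable` column while data is being written, follow the first row of the table above. This sketch assumes a local table named test_local and a distributed table named test_dist (both names are illustrative):

ALTER TABLE test_local ON cluster default ADD COLUMN new_col Nullable(String);
ALTER TABLE test_dist ON cluster default ADD COLUMN new_col Nullable(String);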

Why are DDL statements executed slowly and frequently get stuck?

Common cause: Global DDL statements are run sequentially. Complex queries can cause deadlocks.

You can use one of the following solutions:

  • Wait for the operation to complete.

  • Try to terminate the query in the console.

How do I handle the "longer than distributed_ddl_task_timeout (=xxx) seconds" error for a distributed DDL statement?

You can use the set global on cluster default distributed_ddl_task_timeout=xxx command to modify the default timeout. `xxx` is the custom timeout in seconds. For more information about how to modify global parameters, see Modify cluster parameters.

How do I handle the "set global on cluster default" syntax error?

The common causes and solutions are as follows:

  • Cause 1: The ClickHouse client parses syntax, and set global on cluster default is syntax added by the server. If the client has not been updated to a version that is aligned with the server, this syntax is blocked by the client.

    Solution:

    • Use tools that do not parse syntax on the client side, such as a JDBC driver, DataGrip, or DBeaver.

    • Write a JDBC program to run the statement.

  • Cause 2: In set global on cluster default key = value;, `value` is a string, but the quotation marks are missing.

    Solution: Add quotation marks on both sides of the string-type value.

What BI tools are recommended?

Quick BI.

What data query IDE tools are recommended?

DataGrip and DBeaver.

Does ApsaraDB for ClickHouse support vector search?

ApsaraDB for ClickHouse supports vector search. For more information, see the vector search documentation.

What do I do if the ON CLUSTER is not allowed for Replicated database error is reported when I create a table?

If your cluster is an Enterprise Edition cluster and your table creation statement includes ON CLUSTER default, you may receive the ON CLUSTER is not allowed for Replicated database error. You can upgrade your instance to the latest version. Some minor versions have this bug. For more information about how to upgrade the version, see Upgrade the minor engine version.

What do I do if the Double-distributed IN/JOIN subqueries is denied (distributed_product_mode = 'deny') error is reported when I use subqueries (JOIN or IN) with distributed tables?

Problem description: If you have a multi-node Community-Compatible Edition cluster, you may receive the Exception: Double-distributed IN/JOIN subqueries is denied (distributed_product_mode = 'deny'). error when you use JOIN or IN subqueries on multiple distributed tables.

Cause analysis: When you use JOIN or IN subqueries on multiple distributed tables, the query is amplified. For example, with three nodes, a distributed table JOIN or IN subquery expands to 3 × 3 local table subqueries. This leads to resource waste and increased latency. Therefore, the system does not allow such queries by default.

Solution principle: Replace IN or JOIN with GLOBAL IN or GLOBAL JOIN. This allows the subquery on the right side of GLOBAL IN or GLOBAL JOIN to be completed on one node and stored in a temporary table. The temporary table is then sent to other nodes for higher-level queries.

Impact of replacing IN or JOIN with GLOBAL IN or GLOBAL JOIN:

  • The temporary table is sent to all remote servers. Avoid using large datasets.

  • When you use the `remote` function to query data from external instances, replacing IN or JOIN with GLOBAL IN or GLOBAL JOIN causes subqueries that should be run on the external instance to be run on the local instance. This can lead to incorrect query results.

    For example, you can run the following statement on instance `a` to query data from the external instance cc-bp1wc089c**** using the `remote` function.

    SELECT *
    FROM remote('cc-bp1wc089c****.clickhouse.ads.aliyuncs.com:3306', `default`, test_tbl_distributed1, '<your_Account>', '<YOUR_PASSWORD>')
    WHERE id GLOBAL IN
        (SELECT id
         FROM test_tbl_distributed1);

    Based on the solution principle, instance `a` runs the subquery to the right of GLOBAL IN, SELECT id FROM test_tbl_distributed1, to generate temporary table A. It then passes the data of the temporary table to instance cc-bp1wc089c**** for the parent query. Ultimately, the statement run by instance cc-bp1wc089c**** is SELECT * FROM default.test_tbl_distributed1 WHERE id IN (temporary_table_A);

    This is how GLOBAL IN and GLOBAL JOIN execute. Continuing with the example, the following explains why replacing IN or JOIN with GLOBAL IN or GLOBAL JOIN can lead to incorrect results when you use the `remote` function to query data from external instances.

    As described, the statement that is ultimately run by instance cc-bp1wc089c**** is SELECT * FROM default.test_tbl_distributed1 WHERE id IN (temporary_table_A);. However, the condition set, temporary table A, is generated on instance `a`. In this example, instance cc-bp1wc089c**** should have run SELECT * FROM default.test_tbl_distributed1 WHERE id IN (SELECT id FROM test_tbl_distributed1 );. The condition set should have been sourced from instance cc-bp1wc089c****. Therefore, using GLOBAL IN or GLOBAL JOIN causes the subquery to retrieve its condition set from the wrong source, which leads to incorrect results.

Solutions:

Solution 1: Modify the SQL in your business code. Manually change IN or JOIN to GLOBAL IN or GLOBAL JOIN.

For example, you can change the following statement:

SELECT * FROM test_tbl_distributed WHERE id IN (SELECT id FROM test_tbl_distributed1);

Add GLOBAL to modify it to:

SELECT * FROM test_tbl_distributed WHERE id GLOBAL IN (SELECT id FROM test_tbl_distributed1);

Solution 2: Modify the `distributed_product_mode` or `prefer_global_in_and_join` system parameter to have the system automatically replace IN or JOIN with GLOBAL IN or GLOBAL JOIN.

distributed_product_mode

You can use the following statement to set `distributed_product_mode` to `global`. This allows the system to automatically replace IN or JOIN queries with GLOBAL IN or GLOBAL JOIN.

SET GLOBAL ON cluster default distributed_product_mode='global';

distributed_product_mode usage instructions

  • Function: An important setting in ClickHouse that controls the behavior of distributed subqueries.

  • Value descriptions:

    • `deny` (default): Prohibits the use of IN and JOIN subqueries and throws a "Double-distributed IN/JOIN subqueries is denied" exception.

    • `local`: Replaces the database and table in the subquery with the local table of the destination server (shard) and retains the regular IN or JOIN.

    • `global`: Replaces IN or JOIN queries with GLOBAL IN or GLOBAL JOIN.

    • `allow`: Allows the use of IN and JOIN subqueries.

  • Applicable scenarios: Only applicable to queries that use multiple distributed tables with JOIN or IN subqueries.

prefer_global_in_and_join

You can use the following statement to set `prefer_global_in_and_join` to 1. This allows the system to automatically replace IN or JOIN queries with GLOBAL IN or GLOBAL JOIN.

SET GLOBAL ON cluster default prefer_global_in_and_join = 1;

prefer_global_in_and_join usage instructions

  • Function: An important setting in ClickHouse that controls the behavior of IN and JOIN operators.

  • Value descriptions:

    • 0 (default): Prohibits the use of IN and JOIN subqueries and throws a "Double-distributed IN/JOIN subqueries is denied" exception.

    • 1: Enables IN and JOIN subqueries and replaces IN or JOIN queries with GLOBAL IN or GLOBAL JOIN.

  • Applicable scenarios: Only applicable to queries that use multiple distributed tables with JOIN or IN subqueries.

How do I view the disk space occupied by each table?

You can use the following code to view the disk space occupied by each table:

SELECT table, formatReadableSize(sum(bytes)) as size, min(min_date) as min_date, max(max_date) as max_date FROM system.parts WHERE active GROUP BY table; 

How do I view the size of cold data?

The following is sample code:

SELECT * FROM system.disks;

How do I query which data is in cold storage?

The following is sample code:

SELECT * FROM system.parts WHERE disk_name = 'cold_disk';

How do I move partitioned data to cold storage?

The following is sample code:

ALTER TABLE table_name MOVE PARTITION partition_expr TO DISK 'cold_disk';

Why are there gaps in monitoring data?

The common causes are as follows:

  • A query triggered an OOM error.

  • A configuration change triggered a restart.

  • The instance was restarted after an upgrade or downgrade.

Do versions later than 20.8 support smooth upgrades without data migration?

Whether a ClickHouse cluster supports a smooth upgrade depends on when the cluster was created. For clusters purchased after December 1, 2021, you can perform an in-place smooth upgrade to a major engine version without data migration. For clusters purchased before December 1, 2021, you must migrate data to upgrade the major engine version. For more information about how to upgrade the version, see Upgrade a major engine version.

What are the common system tables?

The common system tables and their functions are as follows:

| Name | Function |
| --- | --- |
| system.processes | Queries the SQL statements that are running. |
| system.query_log | Queries the SQL statements that have been run. |
| system.merges | Queries merge information on the cluster. |
| system.mutations | Queries mutation information on the cluster. |
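
For example, you can use system.processes to find a long-running query and then terminate it by its query ID. This is a sketch; KILL QUERY is standard ClickHouse syntax, and <query_id> is a placeholder:

SELECT query_id, user, elapsed, memory_usage, query
FROM system.processes
ORDER BY elapsed DESC;

KILL QUERY WHERE query_id = '<query_id>';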

How do I modify system-level parameters? Is a restart required? What are the impacts?

System-level parameters correspond to some configuration items in the `config.xml` file. To modify them, perform the following steps:

  1. Log on to the ApsaraDB for ClickHouse console.

  2. On the Cluster List page, select List of Community Edition Instances, and click the target cluster ID.

  3. In the navigation pane on the left, click Parameter Configuration.

  4. On the Parameter Configuration page, click the edit button next to the Operating parameter value of the parameter that you want to modify (for example, max_concurrent_queries).

  5. Enter the target value in the pop-up box and click Confirm.

  6. Click Submit parameters.

After you submit the parameters, the `clickhouse-server` process automatically restarts. This causes a transient disconnection of about 1 minute.

How do I modify user-level parameters?

User-level parameters correspond to some configuration items in the `users.xml` file. You can run the following sample statement:

SET global ON cluster default ${key}=${value};

Unless otherwise specified, parameters take effect immediately after the statement is run.

How do I modify a quota?

You can append the setting to the statement when you run it. The following is sample code:

settings max_memory_usage = XXX;

Why do the CPU utilization, memory utilization, and memory usage vary significantly among nodes?

If your cluster is a master-replica cluster or a single-replica multi-node cluster, the CPU and memory utilization of the write node will be higher than other nodes when you perform many write operations. After data is synchronized to other nodes, the CPU and memory utilization will become balanced.

How do I view detailed system log information?

  • Problem description:

    How to view detailed system log information to troubleshoot errors or identify potential issues.

  • Solution:

    1. Check the `text_log.level` parameter of the cluster and perform the following operations:

      1. If `text_log.level` is empty, `text_log` is not enabled. You can set `text_log.level` to enable `text_log`.

      2. If `text_log.level` is not empty, check whether the `text_log` level meets your current requirements. If not, you can modify this parameter to set the `text_log` level.

      For more information about how to view and modify the `text_log.level` parameter, see Configure config.xml parameters.

    2. Log on to the destination database. For more information, see Connect to a database.

    3. Run the following statement for analysis:

      SELECT * FROM system.text_log;

How do I resolve network connectivity issues between the target cluster and the data source?

If the target cluster and the data source use the same VPC and are in the same region, check whether they have added each other's IP addresses to their whitelists. If not, add the IP addresses to the whitelists.

  • For more information about how to add an IP address to a whitelist in ClickHouse, see Configure a whitelist.

  • For more information about how to add an IP address to a whitelist for other data sources, see the documentation for the respective product.

If the target cluster and the data source do not meet the preceding conditions, you can choose an appropriate network solution to resolve the network issue. Then, add each other's IP addresses to their whitelists.

| Scenario | Solution |
| --- | --- |
| Cloud-to-on-premises connectivity | Cloud-to-on-premises network connectivity |
| Cross-region and cross-account VPC-to-VPC connection | Cross-account VPC-to-VPC connection |
| Connectivity between different VPCs in the same region | Use Cloud Enterprise Network Basic Edition to connect VPCs in the same region |
| Cross-region and cross-account VPC-to-VPC connection | Use Cloud Enterprise Network Basic Edition to connect VPCs across regions and accounts |
| Internet connectivity | Use the SNAT feature of an Internet NAT gateway to access the Internet |

Can I migrate an ApsaraDB for ClickHouse Community-Compatible Edition cluster to an Enterprise Edition cluster?

Yes, you can migrate an ApsaraDB for ClickHouse Community-Compatible Edition cluster to an Enterprise Edition cluster.

There are two main methods for migrating data between Enterprise Edition and Community-Compatible Edition clusters: using the `remote` function or using file export and import. For more information, see Migrate data from a self-managed ClickHouse instance to an ApsaraDB for ClickHouse Community-Compatible Edition instance.

How do I handle inconsistent database and table schemas across different shards during data migration?

Problem description

Data migration requires consistent database and table schemas across all shards. Otherwise, some schemas may fail to migrate.

Solution

  • The schemas of MergeTree tables (not inner tables of materialized views) are inconsistent across different shards.

    Check whether your business logic has caused the schema differences between shards:

    • If your business expects all shard table schemas to be identical, you can re-create them.

    • If your business requires different table schemas across shards, Submit a ticket to contact technical support.

  • The schemas of inner tables of materialized views are inconsistent across different shards.

    • Solution 1: Rename the inner table and explicitly map the materialized view and distributed table to the target MergeTree table. Use the original materialized view up_down_votes_per_day_mv as an example. Perform the following steps:

      1. List the tables whose count is not equal to the number of nodes. `NODE_NUM = Number of shards × Number of replicas`.

        SELECT database,table,any(create_table_query) AS sql,count() AS cnt
        FROM cluster(default, system.tables)
        WHERE database NOT IN ('system', 'information_schema', 'INFORMATION_SCHEMA')
        GROUP BY  database, table
        HAVING cnt != <NODE_NUM>;
      2. View the materialized views with an abnormal inner table count.

        SELECT substring(hostName(),38,8) AS host,*
        FROM cluster(default, system.tables)
        WHERE uuid IN (<UUID1>, <UUID2>, ...);
      3. Disable the default cluster synchronization behavior. This is required for ApsaraDB for ClickHouse but not for self-managed ClickHouse. Then, rename the inner table so that the table names are consistent on each node. To reduce operational risks, you can obtain the IP address of each node, connect to port 3005, and run the operations on each node one by one.

        SELECT count() FROM mv_test.up_down_votes_per_day_mv;
        SET enforce_on_cluster_default_for_ddl=0; 
        RENAME TABLE `mv_test`.`.inner_id.9b40675b-3d72-4631-a26d-25459250****` TO `mv_test`.`up_down_votes_per_day`;
      4. Delete the materialized view. You must run this on each node.

        SELECT count() FROM mv_test.up_down_votes_per_day_mv;
        SET enforce_on_cluster_default_for_ddl=0; 
        DROP TABLE mv_test.up_down_votes_per_day_mv;
      5. Create a new materialized view that explicitly points to the renamed inner table. You must run this on each node.

        SET enforce_on_cluster_default_for_ddl=0;
        CREATE MATERIALIZED VIEW mv_test.up_down_votes_per_day_mv TO `mv_test`.`up_down_votes_per_day`
        (
            `Day` Date,
            `UpVotes` UInt32,
            `DownVotes` UInt32
        ) AS
        SELECT toStartOfDay(CreationDate) AS Day,
               countIf(VoteTypeId = 2) AS UpVotes,
               countIf(VoteTypeId = 3) AS DownVotes
        FROM mv_test.votes
        GROUP BY Day;

        Note: You must define the columns of the target table in the materialized view according to the original format. Do not rely on the SELECT statement to infer the schema, because this can cause exceptions. For example, if the `tcp_cnt` column is produced by `sumIfState` in the SELECT statement, it must be declared as `AggregateFunction(sum, Float64)` in the target table definition.

        Correct usage

        CREATE MATERIALIZED VIEW net_obs.public_flow_2tuple_1m_local TO net_obs.public_flow_2tuple_1m_local_inner
        (
         ... 
        tcp_cnt AggregateFunction(sum, Float64),
        ) AS
        SELECT
        ...
        sumIfState(pkt_cnt, protocol = '6') AS tcp_cnt,
        FROM net_obs.public_flow_5tuple_1m_local
        ...

        Incorrect usage

        CREATE MATERIALIZED VIEW net_obs.public_flow_2tuple_1m_local TO net_obs.public_flow_2tuple_1m_local_inner AS
        SELECT
        ...
        sumIfState(pkt_cnt, protocol = '6') AS tcp_cnt,
        FROM net_obs.public_flow_5tuple_1m_local
        ...
    • Solution 2: Rename the inner tables, rebuild the materialized views globally, and migrate the inner table data.

    • Solution 3: Dual-write to the materialized views and wait for 7 days.
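    Whichever solution you choose, you can verify the result by re-running the table inventory query from step 1 of Solution 1:

      SELECT database, table, any(create_table_query) AS sql, count() AS cnt
      FROM cluster(default, system.tables)
      WHERE database NOT IN ('system', 'information_schema', 'INFORMATION_SCHEMA')
      GROUP BY database, table
      HAVING cnt != <NODE_NUM>;  -- an empty result means the schemas are consistent across nodes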

The same SQL statement runs without errors on an existing instance but fails on an Enterprise Edition instance of version 24.5 or later. How do I fix this?

New Enterprise Edition instances of version 24.5 or later use the new analyzer as the default query engine. The new analyzer offers better query performance but may be incompatible with some older SQL syntax, which can cause parsing errors. If you encounter this error, you can run the following statement to revert to the old analyzer. For more information about the new analyzer, see New analyzer enabled by default.

SET allow_experimental_analyzer = 0;
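If you want to fall back to the old analyzer for a single statement instead of the whole session, you can also append the setting to that query. A minimal sketch, with `<table_name>` as a placeholder:

SELECT * FROM <table_name> SETTINGS allow_experimental_analyzer = 0;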

How do I pause an ApsaraDB for ClickHouse cluster?

The pause feature is not supported for Community-Compatible Edition clusters, but it is available for Enterprise Edition clusters. To pause an Enterprise Edition cluster, go to the Enterprise Edition cluster list page, select the destination region in the upper-left corner, find the target cluster, and then choose Pause in the Actions column.

How do I convert MergeTree tables to ReplicatedMergeTree tables in a cluster?

Problem description

Users who are unfamiliar with the features and principles of ClickHouse often mistakenly create tables with the MergeTree engine on multi-replica clusters. As a result, data is not synchronized between the replica nodes of each shard, and queries against distributed tables return inconsistent results. In this case, you need to convert the original MergeTree tables into ReplicatedMergeTree tables.

Solution

In ClickHouse, there is no DDL statement to directly modify the storage engine of a table. Therefore, to convert a MergeTree table to a ReplicatedMergeTree table, you must create a ReplicatedMergeTree table and import the data from the MergeTree table into it.

For example, your multi-replica cluster has a MergeTree table named `table_src` and a corresponding distributed table named `table_src_d`. To convert it to a ReplicatedMergeTree table, perform the following steps:

  1. Create a destination table named `table_dst` of the ReplicatedMergeTree type and its corresponding distributed table `table_dst_d`. For more information about how to create tables, see CREATE TABLE.

  2. Import data from the MergeTree table `table_src` into `table_dst_d`. There are two solutions:

Note
  • In both solutions, queries against the source data are directed at the local MergeTree tables.

  • When you insert data into the destination table, if the data volume is small, you can insert it directly into the distributed table `table_dst_d` so that the data is balanced across shards.

  • If the original MergeTree table `table_src` already has balanced data across nodes and a large data volume, you can insert the data directly into the ReplicatedMergeTree local table `table_dst`.

  • If the data volume is large, the execution takes a long time. When you use the `remote` function, pay attention to the timeout settings for the function, as shown in the sketch after this note.
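For long-running copies, you can relax the relevant session timeouts before you run the INSERT statement. The following is a minimal sketch under the assumption that the default limits are in effect; tune the values to your data volume:

SET max_execution_time = 0;   -- 0 removes the per-query execution time limit
SET send_timeout = 3600;      -- network send timeout, in seconds
SET receive_timeout = 3600;   -- network receive timeout, in seconds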

Import data using the remote function

  1. Connect to a ClickHouse cluster using DMS.

  2. Obtain the IP address of each node.

    SELECT 
        cluster,
        shard_num,
        replica_num,
        is_local,
        host_address
    FROM system.clusters
    WHERE cluster = 'default';
    
  3. Use the remote function to import data.

    Pass the IP address of each node obtained in the previous step into the `remote` function and run it.

    INSERT INTO table_dst_d SELECT * FROM remote('<node_ip>', db.table_src);

    For example, if you find two node IP addresses, `10.10.0.165` and `10.10.0.167`, you can run the following INSERT statements:

    INSERT INTO table_dst_d SELECT * FROM remote('10.10.0.167', default.table_src);
    INSERT INTO table_dst_d SELECT * FROM remote('10.10.0.165', default.table_src);

    After you run the statements with all node IP addresses, the conversion of the MergeTree table to a ReplicatedMergeTree table in the cluster is complete.
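    After all INSERT statements finish, a quick row-count comparison between the source and destination distributed tables serves as a sanity check. The following is a minimal sketch that assumes the distributed table `table_src_d` exists over `table_src`:

    SELECT
        (SELECT count() FROM table_src_d) AS src_rows,  -- rows in the source table across all shards
        (SELECT count() FROM table_dst_d) AS dst_rows;  -- rows in the destination table across all shards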

Import data using local tables

If you have an ECS instance in your VPC with the ClickHouse client installed, you can also log on to each node individually and run the following operations.

  1. Connect to a ClickHouse cluster using the command-line interface.

  2. Obtain the IP address of each node.

    SELECT 
        cluster,
        shard_num,
        replica_num,
        is_local,
        host_address
    FROM system.clusters
    WHERE cluster = 'default';
  3. Import the data.

    Use the node IP addresses to log on to each node individually and run the following statement:

    INSERT INTO table_dst_d SELECT * FROM db.table_src;

    After you log on and run the statement on all nodes, the conversion of the MergeTree table to a ReplicatedMergeTree table in the cluster is complete.

How do I execute multiple SQL statements in the same session?

If you set a unique session_id identifier, the ClickHouse server maintains the same context for all requests that carry the same session ID, which allows multiple SQL statements to run in the same session. The following example uses the ClickHouse Java Client (V2) to connect to ClickHouse. The key implementation steps are as follows:

  1. Add the dependency to the `pom.xml` file of the Maven project.

    <dependency>
        <groupId>com.clickhouse</groupId>
        <artifactId>client-v2</artifactId>
        <version>0.8.2</version>
    </dependency>
  2. Add a custom session ID in `CommandSettings`.

    package org.example;
    
    import com.clickhouse.client.api.Client;
    import com.clickhouse.client.api.command.CommandSettings;
    public class Main {
    
        public static void main(String[] args) {
            Client client = new Client.Builder()
                    // Add the instance endpoint.
                    .addEndpoint("endpoint")
                    // Add the username.
                    .setUsername("username")
                    // Add the password.
                    .setPassword("password")
                    .build();
    
            try {
                client.ping(10);
                CommandSettings commandSettings = new CommandSettings();
                // Set the session_id.
                commandSettings.serverSetting("session_id","examplesessionid");
                // Set the max_block_size parameter to 65409 within the session.
                client.execute("SET max_block_size=65409 ",commandSettings);
                // Run the query.
                client.execute("SELECT 1 ",commandSettings);
                client.execute("SELECT 2 ",commandSettings);
    
            } catch (Exception e) {
                throw new RuntimeException(e);
            } finally {
                client.close();
            }
        }
    }

In the example above, both SELECT statements are executed within the same session, and they both have the max_block_size parameter set to 65409. For more information about how to use ClickHouse Java Client, see Java Client | ClickHouse Docs.

Why does the FINAL keyword for deduplication fail due to JOIN in ClickHouse?

Problem description

When you use the FINAL keyword to deduplicate query results, if the SQL statement contains a JOIN, the deduplication fails and duplicate data still exists in the results. The following is a sample SQL statement:

SELECT * FROM t1 FINAL JOIN t2 FINAL WHERE xxx;

Cause

This is a known bug that has not been fixed by ClickHouse. The deduplication fails because of conflicts between the execution logic of FINAL and JOIN. For more information, see ClickHouse's issues.

Solution

  • Solution 1 (Recommended): Enable the new analyzer. You can append a setting at the end of the query to enable query-level FINAL, so no table-level FINAL keyword is needed.

    For example, the original SQL statement is:

    SELECT * FROM t1 FINAL JOIN t2 FINAL WHERE xxx;

    Remove the FINAL keyword after the table names and append `SETTINGS allow_experimental_analyzer = 1, final = 1` at the end of the SQL statement. The adjusted statement is as follows:

    SELECT * FROM t1 JOIN t2 WHERE xxx SETTINGS allow_experimental_analyzer = 1, final = 1;
    Important

    The allow_experimental_analyzer parameter is supported only in version 23.8 and later. If your version is earlier than 23.8, you must upgrade your version before you adjust the SQL statement. For more information about how to upgrade, see Upgrade a major engine version.

  • Solution 2 (Use with caution):

    1. Force merge and deduplication: Periodically run `OPTIMIZE TABLE <local_table_name> FINAL` to merge data in advance. Use with caution for large tables because of the high I/O overhead.

    2. Adjust the query SQL: Remove the FINAL keyword and rely on querying the merged data for deduplication.

    Important

    Operate with caution. This operation consumes significant I/O resources and affects performance when the table contains a large amount of data.

Why do DELETE or UPDATE operations remain incomplete in an ApsaraDB for ClickHouse Community-Compatible Edition cluster?

Problem description

In an ApsaraDB for ClickHouse Community-Compatible Edition cluster, DELETE or UPDATE operations remain in an incomplete state for a long time.

Cause analysis

Unlike synchronous operations in MySQL, DELETE and UPDATE operations in an ApsaraDB for ClickHouse Community-Compatible Edition cluster are run asynchronously based on the Mutation mechanism and do not take effect in real time. The core flow of a Mutation is as follows:

  1. Submit task: A user runs ALTER TABLE ... UPDATE/DELETE to generate an asynchronous task.

  2. Mark data: The background process creates a mutation_*.txt file to record the data range to be modified. The change does not take effect immediately.

  3. Background rewrite: ClickHouse gradually rewrites the affected data part and applies the changes during merges.

  4. Clean up old data: Old data blocks are marked for deletion after the merge is complete.

If too many Mutation operations are issued in a short period, Mutation tasks may be blocked, which can cause DELETE and UPDATE operations to remain incomplete. Before you issue a Mutation, run the following SQL statement to check how many Mutations are running, and proceed only when few are running, to avoid Mutation stacking.

SELECT * FROM clusterAllReplicas('default', system.mutations) WHERE is_done = 0;

Solution

  1. Check whether there are too many Mutations running in the cluster.

    You can run the following SQL statement to view the current Mutation status in the cluster:

    SELECT * FROM clusterAllReplicas('default', system.mutations) WHERE is_done = 0;
  2. If there are many running Mutations, you can use a privileged account to cancel some or all of them.

    • Cancel all Mutation tasks on a single table.

      KILL MUTATION WHERE database = 'default' AND table = '<table_name>';
    • Cancel a specific Mutation task.

      KILL MUTATION WHERE database = 'default' AND table = '<table_name>' AND mutation_id = '<mutation_id>';

      You can obtain the `mutation_id` using the following SQL statement:

      SELECT mutation_id, * FROM clusterAllReplicas('default', system.mutations) WHERE is_done = 0;
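The `system.mutations` table also records why a mutation is stuck. Before you cancel tasks, you can inspect the failure-related columns, as shown in the following sketch:

SELECT database, table, mutation_id, command, parts_to_do, latest_fail_time, latest_fail_reason
FROM clusterAllReplicas('default', system.mutations)
WHERE is_done = 0;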

Why are consecutive query results inconsistent in ApsaraDB for ClickHouse Community-Compatible Edition clusters?

Problem description

In an ApsaraDB for ClickHouse Community-Compatible Edition cluster, the query results may be inconsistent when you run the same SQL statement multiple times.

Cause analysis

The same SQL query can return inconsistent results in a Community-Compatible Edition cluster for two main reasons:

  • The target table of multiple queries in a multi-shard cluster is a local table.

    When you create tables in a multi-shard cluster of the Community-Compatible Edition, you must create both local tables and a distributed table. In this type of cluster, the data writing flow is roughly as follows:

    When data is written, it is first written to the distributed table. The distributed table then distributes the data to local tables on different shards for storage.

    When you query data, different table types result in different data sources:

    • Querying a distributed table: The distributed table aggregates and returns data from the local tables on all shards.

    • Querying a local table: Each query returns data from a local table on a random shard. In this case, each query result may be inconsistent with the previous one.

  • A master-replica cluster was created without using a Replicated* series engine.

    When you create tables in a master-replica cluster of the Community-Compatible Edition, you need to use a `Replicated*` series engine, such as the ReplicatedMergeTree engine. `Replicated*` series engines enable data synchronization between replicas.

    If a master-replica cluster does not use a `Replicated*` series engine when tables are created, data is not synchronized between replicas. This can also cause inconsistent query results.

Solution

  1. Determine the cluster type.

    You can determine whether your cluster is a multi-shard cluster, a master-replica cluster, or a multi-shard master-replica cluster based on the cluster information. Perform the following steps:

    1. Log on to the ApsaraDB for ClickHouse console.

    2. In the upper-left corner of the page, select List of Community Edition Instances.

    3. In the cluster list, click the ID of the target cluster to go to the cluster information page.

      Check the Series under Cluster Attributes and the Number of node groups under Configuration Information. Determine the cluster type. The logic is as follows:

      • Number of node groups is greater than 1: Multi-shard cluster.

      • Series is High-availability Edition: Master-replica cluster.

      • Both of the preceding conditions are met: Multi-shard master-replica cluster.

  2. Choose a solution based on the cluster type.

    Multi-shard cluster

    Check the table type that is being queried. If it is a local table, query the distributed table instead.

    If you have not created a distributed table, you can create one. For more information, see Create a table.

    Master-replica cluster

    Check the table creation statement of the target table to see if its engine is a Replicated* series engine. If not, you must re-create the table. For more information, see Create a table.

    Multi-shard master-replica cluster

    Check the table type that is being queried.

    If it is a local table, query the distributed table instead. If you have not created a distributed table, you can create one.

    If you are querying a distributed table, check whether the engine of the local table that corresponds to the distributed table is a Replicated* series engine. If not, you must re-create the local table and use a Replicated* series engine. For more information, see Create a table.
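    To check which engine a table uses, you can query `system.tables` across all replicas. The following is a minimal sketch; replace the placeholders with your own database and table names:

    SELECT hostName() AS host, database, name, engine
    FROM clusterAllReplicas('default', system.tables)
    WHERE database = '<database_name>' AND name = '<table_name>';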

Why does the ReplacingMergeTree engine fail to deduplicate data after a forced data merge using the OPTIMIZE command in an ApsaraDB for ClickHouse Community-Compatible Edition cluster?

Problem description

A ReplacingMergeTree table in ClickHouse deduplicates rows that have the same primary key during the data merging process. However, after you run the following command to force a data merge, duplicate rows with the same primary key may still exist:

OPTIMIZE TABLE <table_name> ON CLUSTER default FINAL;

Cause analysis

The ReplacingMergeTree engine deduplicates data only within a single node. If the sharding expression sharding_key of the distributed table is not explicitly specified, it defaults to rand() and distributes rows randomly, so rows with the same primary key can land on different nodes. Because ReplacingMergeTree cannot deduplicate across nodes, queries against the entire cluster can still return duplicate data.

Solution

Re-create the local table and the distributed table. When you create the distributed table, set the sharding expression sharding_key to the primary key of the local table. For more information about the table creation syntax, see CREATE TABLE.

Important

You must re-create both the distributed table and the local table. If you rebuild only the distributed table, the new sharding expression applies only to newly inserted data, and historical data cannot be deduplicated.
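The following is a minimal sketch of the re-created pair. The sharding expression is set to a hash of the primary key so that rows with the same key always land on the same shard. All database, table, and column names are placeholders, not values from this topic:

-- Local table: ReplacingMergeTree deduplicates rows with the same ORDER BY key within a node.
CREATE TABLE db.t_local ON CLUSTER default
(
    id UInt64,
    val String
)
ENGINE = ReplacingMergeTree
ORDER BY id;

-- Distributed table: sipHash64(id) routes rows with the same primary key to the same shard.
CREATE TABLE db.t_d ON CLUSTER default AS db.t_local
ENGINE = Distributed(default, db, t_local, sipHash64(id));

On a master-replica cluster, use the Replicated* counterpart of the engine instead, as described in the preceding FAQ entries.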