To improve the high availability (HA) of your ApsaraDB for ClickHouse Community-Compatible Edition cluster, you can upgrade it to a multi-zone deployment. This process upgrades a single-replica cluster to a two-replica cluster or a single-zone cluster to a multi-zone cluster.
Prerequisites
The cluster is a Community-Compatible Edition cluster.
The cluster is in the Running status.
The cluster has no unpaid renewal orders.
Note: Log on to the ApsaraDB for ClickHouse console. In the upper-right corner of the page, choose Expenses > Expenses and Costs. In the navigation pane on the left, click Orders. You can then pay for or cancel the order.
Precautions
After you upgrade a cluster to a multi-zone deployment, historical data from MergeTree engine tables is migrated to the new cluster and automatically redistributed.
The following items are supported for migration:
Databases, data dictionaries, and materialized views.
Table schema: All table schemas except for tables that use the Kafka or RabbitMQ engine.
Data: Incremental migration of data from MergeTree family tables.
The following items are not supported for migration:
Tables that use the Kafka or RabbitMQ engine and their data.
Important: When you change the configuration, data is migrated to a new instance, and traffic is eventually switched to the new instance. To ensure that Kafka and RabbitMQ data is not split, first delete the Kafka and RabbitMQ engine tables from the source cluster. After the change is complete, recreate them.
Data from tables that are not of the MergeTree type, such as external tables and Log tables.
Important: During the upgrade, you must manually handle the unsupported content by following the procedure in this topic.
Do not perform Data Definition Language (DDL) operations during the upgrade to a multi-zone deployment. If you do, data validation may fail at the end of the upgrade, which will cause the upgrade to fail.
After the upgrade to a multi-zone deployment, the internal node IP addresses change. If your data write and access operations depend on node IP addresses, you must obtain the VPC CIDR block of the cluster again. For more information, see Obtain the VPC CIDR block of a cluster.
After you change the cluster configuration, frequent merge operations occur for a period of time. These operations increase I/O usage and can lead to increased latency for business requests. You should plan for the potential impact of this increased latency. For information about how to calculate the duration of merge operations, see Calculate the merge duration after migration.
During the upgrade to a multi-zone deployment, the CPU and memory usage of the cluster increases. The estimated resource usage per node is less than 5 cores and 20 GB of memory.
Costs
After you change the cluster configuration, the cost changes. The actual cost is displayed on the console. For more information, see Billing for configuration changes.
Procedure
Step 1: Handle tables with the Kafka and RabbitMQ engines
Migration is not supported for tables that use the Kafka or RabbitMQ engine. You must handle these tables manually.
Log on to the cluster and run the following statement to query for the tables that you need to handle. For more information, see Connect to a ClickHouse cluster using DMS.
SELECT * FROM `system`.`tables` WHERE engine IN ('RabbitMQ', 'Kafka');
View and back up the `CREATE TABLE` statement for each target table.
SHOW CREATE TABLE <aim_table_name>;
Delete the tables that use the Kafka and RabbitMQ engines.
Important: When you delete a Kafka table, you must also delete the materialized views that reference it. Otherwise, the scale-out or scale-in operation fails.
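As a sketch of this step, the following statements back up the definitions of a Kafka engine table and its dependent materialized view, then delete both. The database, table, and view names are placeholders for illustration:

```sql
-- Save the output of these statements so you can recreate the tables later.
SHOW CREATE TABLE default.kafka_src_table;
SHOW CREATE TABLE default.kafka_consumer_mv;

-- Delete the dependent materialized view first, then the Kafka engine table.
DROP TABLE IF EXISTS default.kafka_consumer_mv;
DROP TABLE IF EXISTS default.kafka_src_table;
```

Dropping the materialized view before the Kafka table avoids leaving a view that references a deleted source table.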
Step 2: Back up business data from non-MergeTree tables
Log on to the cluster and run the following statement to identify the non-MergeTree tables whose data requires migration.
SELECT `database` AS database_name, `name` AS table_name, `engine`
FROM `system`.`tables`
WHERE (`engine` NOT LIKE '%MergeTree%')
  AND (`engine` != 'Distributed')
  AND (`engine` != 'MaterializedView')
  AND (`engine` NOT IN ('Kafka', 'RabbitMQ'))
  AND (`database` NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema'))
  AND (`database` NOT IN (
    SELECT `name` FROM `system`.`databases`
    WHERE `engine` IN ('MySQL', 'MaterializedMySQL', 'MaterializeMySQL', 'Lazy', 'PostgreSQL', 'MaterializedPostgreSQL', 'SQLite')
  ));
Back up the data.
You must back up the data from the non-MergeTree tables that you identified. For more information, see Back up data to OSS.
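As one possible sketch of the backup, the statement below exports a non-MergeTree table to OSS in CSV format. The `oss()` function parameters, endpoint, bucket path, and table schema shown here are placeholders; see Back up data to OSS for the exact function signature and authentication options supported by your cluster:

```sql
-- Sketch only: export a Log engine table to OSS as CSV.
-- All oss() arguments are illustrative placeholders.
INSERT INTO FUNCTION oss('<oss-endpoint>', '<access-key-id>', '<access-key-secret>',
                         'oss://my-bucket/backup/log_table.csv', 'CSV',
                         'id UInt64, msg String')
SELECT id, msg
FROM default.log_table;
```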
Step 3: Perform the upgrade in the console
Log on to the ApsaraDB for ClickHouse console.
In the upper-left corner of the page, select the region where the cluster is located.
On the Clusters page, select Clusters of Community-compatible Edition.
In the Actions column of the target Cluster ID, click Change Configuration.
In the Change Configuration dialog box, select Upgrade To Multi-zone Deployment and click OK.
In the detection window that appears, check the detection status.
Detection successful: Click Next.
Detection failed: Make changes as prompted on the page, and then click Retry Detection. After the detection is successful, click Next.
The detection may fail during the upgrade for several reasons.
Missing unique distributed table: A local table does not have a corresponding distributed table. You need to create one.
Corresponding distributed table is not unique: A local table has more than one distributed table. Delete the extra distributed tables and keep only one.
Kafka/RabbitMQ engine tables are not supported: Kafka or RabbitMQ engine tables exist. Delete them.
A primary-replica instance has non-replicated *MergeTree tables: Data is inconsistent between replicas. This will cause an exception during data migration for the scale-out or scale-in operation.
The columns of the distributed table and the local table are inconsistent: You must ensure that the columns of the distributed table and the local table are consistent. Otherwise, an exception occurs during data migration for the scale-out or scale-in operation.
The table is missing on some nodes: You need to create tables with the same name on different shards. For the inner table of a materialized view, rename the inner table and then rebuild the materialized view to point to the renamed inner table. For more information, see The inner table of a materialized view is inconsistent across shards.
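For example, if detection reports a missing distributed table, you can create one over the existing local table. The cluster name, database, table names, and sharding key below are placeholders for illustration:

```sql
-- Create a distributed table over the local table local_db.events_local.
-- 'default' is a placeholder cluster name; rand() shards writes randomly.
CREATE TABLE local_db.events_dist ON CLUSTER default
AS local_db.events_local
ENGINE = Distributed('default', 'local_db', 'events_local', rand());
```

The `AS local_db.events_local` clause copies the local table's column definitions, which also satisfies the requirement that distributed and local table columns stay consistent.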
On the Upgrade/Downgrade page, configure the deployment plan, zone, VPC vSwitch, and write suspension time as needed.
Note: The upgrade to a multi-zone deployment involves data migration. To ensure a successful migration, the write suspension time is subject to the following requirements:
Set the write suspension time to at least 30 minutes.
The upgrade must be completed within 5 days after the configuration change is created. Therefore, the date for End Write Suspension Time must be less than or equal to Current Date + 5.
To reduce the impact of the migration on your business, set the write suspension time to off-peak hours.
Click Buy Now and complete the payment as prompted.
On the The order is complete page, click Console.
In the Status column of the Clusters of Community-compatible Edition list, you can view the status of the target cluster. When the cluster status changes from Scaling to Running, the upgrade is complete.
The upgrade to a multi-zone deployment is expected to take more than 30 minutes. The duration depends on the amount of data. The actual task status is indicated by the cluster status displayed in the console.
Step 4: Recreate tables with the Kafka and RabbitMQ engines
Log on to the cluster and execute the `CREATE TABLE` statements that you backed up in Step 1: Handle tables with the Kafka and RabbitMQ engines. For more information, see Connect to a ClickHouse cluster using DMS.
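As a sketch, recreating a Kafka engine table and its materialized view from the backed-up definitions might look like the following. The schema, broker, topic, group, and target table names are placeholders; use the exact `CREATE TABLE` statements you saved in Step 1:

```sql
-- Recreate the Kafka engine table (values are illustrative placeholders).
CREATE TABLE default.kafka_src_table
(
    id  UInt64,
    msg String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'broker1:9092',
         kafka_topic_list  = 'my_topic',
         kafka_group_name  = 'my_group',
         kafka_format      = 'JSONEachRow';

-- Recreate the materialized view that consumes from the Kafka table
-- and writes into an existing target table.
CREATE MATERIALIZED VIEW default.kafka_consumer_mv TO default.target_table
AS SELECT id, msg FROM default.kafka_src_table;
```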
Step 5: Migrate business data from non-MergeTree tables
Log on to the cluster and use OSS to migrate the data backed up in Step 2: Back up business data from non-MergeTree tables. For more information, see Import data from OSS.
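Continuing the backup sketch from Step 2, the statement below reloads the exported CSV from OSS into the recreated table. The `oss()` arguments are again placeholders; see Import data from OSS for the exact function signature supported by your cluster:

```sql
-- Sketch only: reload the backed-up CSV from OSS.
-- All oss() arguments are illustrative placeholders.
INSERT INTO default.log_table
SELECT *
FROM oss('<oss-endpoint>', '<access-key-id>', '<access-key-secret>',
         'oss://my-bucket/backup/log_table.csv', 'CSV',
         'id UInt64, msg String');
```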