ApsaraMQ for Kafka: Best practice for single-zone disaster recovery

Last Updated: Apr 09, 2025

This topic describes how to quickly switch traffic to a secondary instance to restore the service when a zone-level failure occurs on an ApsaraMQ for Kafka instance deployed in a single zone.

Background information

When a zone-level failure occurs on an ApsaraMQ for Kafka instance deployed in a single zone, the service may become unavailable and data may be lost. To mitigate these risks, you can use the connector ecosystem integration feature of ApsaraMQ for Kafka to back up messages to a secondary instance in another region. When a failure occurs, you can then switch traffic to the secondary instance and reset consumer offsets to quickly restore the service.
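The offset reset mentioned above is commonly done by time, because offsets on a secondary instance generally do not match those on the primary. The following is a minimal sketch, not a prescribed procedure: it only computes a candidate reset timestamp by stepping back from the observed failure time. The function name `reset_timestamp_ms` and the `safety_margin` parameter are hypothetical, and the sketch assumes that message timestamps on the secondary instance are close to those on the primary.

```python
from datetime import datetime, timedelta, timezone

# Reference point for converting datetimes to epoch milliseconds.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def reset_timestamp_ms(failure_time: datetime, safety_margin: timedelta) -> int:
    """Return an epoch-millisecond timestamp to reset consumer offsets to.

    failure_time:  when the primary instance is believed to have failed
                   (must be timezone-aware).
    safety_margin: hypothetical buffer that covers backup lag, so that
                   consumption restarts slightly before the failure.
    """
    if failure_time.tzinfo is None:
        raise ValueError("failure_time must be timezone-aware")
    if safety_margin < timedelta(0):
        raise ValueError("safety_margin must not be negative")
    reset_point = failure_time - safety_margin
    # Integer division of timedeltas avoids floating-point rounding.
    return (reset_point - EPOCH) // timedelta(milliseconds=1)


if __name__ == "__main__":
    failure = datetime(2025, 4, 9, 12, 0, 0, tzinfo=timezone.utc)
    print(reset_timestamp_ms(failure, timedelta(minutes=5)))
```

A larger margin lowers the chance of skipping messages but increases the number of messages that are consumed twice, which is why idempotent consumption matters after a switchover.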

Usage notes

  • To prevent instance unavailability caused by region-level failures, select different regions for the primary and secondary instances.

  • After you switch traffic to the secondary instance, you must reset consumer offsets to quickly restore the service. Because resetting offsets can cause some messages to be consumed again, we recommend that you implement message idempotence to reduce the business impact of duplicate consumption.

  • We recommend that you add a CNAME record that resolves the custom domain name used by the client to the domain name of the ApsaraMQ for Kafka instance. This lets you switch traffic quickly when a failure occurs.
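The idempotence recommendation above can be illustrated with a small deduplication wrapper. This is a minimal sketch under stated assumptions: each message carries a business-level identifier (for example, an order ID) that is the same on both instances, and the set of processed identifiers is kept in memory. The class name `IdempotentProcessor` and its parameters are hypothetical. A production system would more likely record processed identifiers in a durable store.

```python
from collections import OrderedDict
from typing import Callable


class IdempotentProcessor:
    """Invoke a handler at most once per message ID (in-memory sketch)."""

    def __init__(self, handler: Callable[[str, bytes], None],
                 max_tracked: int = 100_000) -> None:
        self._handler = handler
        self._max_tracked = max_tracked
        # Insertion-ordered so the oldest ID can be evicted first.
        self._seen: "OrderedDict[str, None]" = OrderedDict()

    def process(self, message_id: str, payload: bytes) -> bool:
        """Return True if the handler ran, False if the message was a duplicate."""
        if message_id in self._seen:
            return False
        self._handler(message_id, payload)
        # Record the ID only after the handler succeeded.
        self._seen[message_id] = None
        if len(self._seen) > self._max_tracked:
            self._seen.popitem(last=False)  # drop the oldest tracked ID
        return True


if __name__ == "__main__":
    processor = IdempotentProcessor(lambda mid, body: print("handled", mid))
    processor.process("order-1", b"{}")
    processor.process("order-1", b"{}")  # redelivered after an offset reset
```

The identifier should come from the message content rather than from the Kafka offset, because offsets on the secondary instance may differ from those on the primary.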

Procedure

Step 1: Create a connector

For more information, see Create ApsaraMQ for Kafka sink connectors.

(Optional) Step 2: Add a CNAME record

For more information, see CNAME record.

Step 3: Modify the endpoint on the client

  • CNAME record mode

    • Configure the client to connect through the custom domain name for which you added the CNAME record.

    • After a failure occurs, you only need to update the CNAME record so that the custom domain name points to the domain name of the secondary instance. This lets you switch traffic quickly without restarting the business application.

  • Regular mode

    After a failure occurs, you must change the endpoint configured on the client to the endpoint of the secondary instance and then restart the application to restore the service. We recommend that you use the CNAME record mode to reduce the impact of failures.
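The difference between the two modes can be sketched from the client's point of view. This is an illustration only: the host names, the port 9092, the group ID, and the helper `build_client_config` are all hypothetical placeholders, and `bootstrap.servers` and `group.id` are the standard Kafka client configuration keys.

```python
def build_client_config(endpoint_host: str, port: int = 9092,
                        group_id: str = "demo-group") -> dict:
    """Build a minimal Kafka client configuration for the given endpoint."""
    if not endpoint_host:
        raise ValueError("endpoint_host must not be empty")
    return {
        "bootstrap.servers": f"{endpoint_host}:{port}",
        "group.id": group_id,
    }


# CNAME record mode: the client always uses the custom domain name
# (hypothetical). Failover changes the DNS record, not this configuration.
cname_before = build_client_config("kafka.example.com")
cname_after = build_client_config("kafka.example.com")

# Regular mode: the client uses instance endpoints directly (hypothetical
# names), so failover requires a configuration change and a restart.
regular_before = build_client_config("primary-instance.example.internal")
regular_after = build_client_config("secondary-instance.example.internal")

if __name__ == "__main__":
    print("CNAME mode config unchanged:", cname_before == cname_after)
    print("Regular mode config unchanged:", regular_before == regular_after)
```

In CNAME record mode the configuration is identical before and after the switchover, which is what makes a switch without redeploying the application possible.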

Important

If you want to access ApsaraMQ for Kafka instances across regions, you can use Cloud Enterprise Network (CEN) to connect virtual private clouds (VPCs) in these regions. For more information, see Connect VPCs in different regions.