Hologres supports multi-zone disaster recovery across three availability zones (AZs) in the same region. This feature extends instance availability from a single AZ to three AZs in the same region, providing cross-AZ fault isolation to keep your business online if an AZ fails. You can use this capability to handle scenarios such as carrier network failures or compute infrastructure failures in a single AZ, which improves the disaster recovery capability of your application.
Important notes
You can upgrade only instances running Hologres V3.0.19 or later to zone-redundant storage instances.
Available service regions include the following: China (Shenzhen), China (Shanghai), China (Beijing), China (Hong Kong), Singapore, Japan (Tokyo), China (Shanghai) Finance Cloud, and China (Hangzhou) Finance Cloud.
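The two notes above can be turned into a quick pre-check before you request an upgrade. The following is a minimal sketch, not an official tool: the function names are hypothetical, the version string format is an assumption (any text containing a major.minor.patch number is accepted), and the region set is copied from the list above, which may change over time.

```python
import re

# Values copied from the notes above; the region list may not be exhaustive.
MIN_VERSION = (3, 0, 19)
SUPPORTED_REGIONS = {
    "China (Shenzhen)",
    "China (Shanghai)",
    "China (Beijing)",
    "China (Hong Kong)",
    "Singapore",
    "Japan (Tokyo)",
    "China (Shanghai) Finance Cloud",
    "China (Hangzhou) Finance Cloud",
}


def parse_version(text):
    """Extract the first 'major.minor.patch' triple from a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def can_upgrade_to_zrs(version_text, region):
    """Return True if both the version and the region meet the notes above."""
    # Tuples compare element by element, so (3, 0, 5) < (3, 0, 19).
    return parse_version(version_text) >= MIN_VERSION and region in SUPPORTED_REGIONS
```

Comparing parsed tuples rather than raw strings matters here, because a plain string comparison would rank "3.0.5" above "3.0.19".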
Multi-zone disaster recovery based on 3 AZs
Function overview
Hologres multi-zone disaster recovery based on three AZs extends instance availability from a single AZ to three AZs in the same region. It provides cross-AZ fault isolation to keep your business running if an AZ fails. This feature is useful for scenarios such as carrier network failures or compute infrastructure failures in a single AZ, which improves the disaster recovery capability of your application.
This capability includes zone-redundant storage and multi-AZ compute high availability. The details are as follows:
Zone-redundant storage: This feature stores instance data across multiple AZs in the same region. The system selects and uses the underlying AZs automatically, so you do not need to know which specific AZs are involved. If a storage data center in one AZ becomes unavailable, the system automatically accesses replica data in another AZ. No manual switchover is required. This provides data center-level disaster recovery within the same region.
Multi-AZ compute high availability: If your instance uses zone-redundant storage and a compute data center fails, you can manually switch compute nodes to a healthy AZ, provided that sufficient compute resources are available in the target AZ. This helps avoid downtime caused by compute infrastructure failures and improves high availability at the compute layer.
Technical principles
Locally redundant storage: If your instance uses locally redundant storage, the instance and its data are located in a single AZ within a region. If the data center in that AZ becomes unavailable, the data becomes inaccessible. In this configuration, neither the storage nor the compute layer supports cross-AZ high availability.
Zone-redundant storage: If your instance uses 3-AZ zone-redundant storage (multi-zone disaster recovery), its data is stored redundantly across multiple AZs in the same region. If the data center in one AZ fails, zone-redundant storage ensures that data remains accessible. This provides data center-level disaster recovery for both the storage and compute layers within the same region.
Compared to locally redundant storage, zone-redundant storage offers higher availability and better disaster recovery capabilities. As a result, storage fees are higher. For more information, see Billing overview. All other fees remain unchanged.
Multi-zone disaster recovery based on three AZs includes storage disaster recovery and compute disaster recovery. The technical principles are as follows:
Storage disaster recovery principle
In zone-redundant storage mode, instance data is stored across multiple AZs in the same region. The AZ where the instance is running is the primary zone. Other AZs are pre-deployed physical zones where replicas are stored.
An AZ refers to the physical zone where underlying servers reside. The system automatically selects other AZs based on your instance’s primary zone. You do not need to know which specific AZs are used.
When the instance's zone is operating normally:
Write operations: Data is written to all AZs simultaneously. The system returns a success message only after the write operation is complete in all AZs. If the write operation fails in any AZ, the entire operation fails. The storage system guarantees the atomicity of every write operation.
Query operations: By default, queries read data from the primary AZ.
When the zone where the instance is located fails:
Write operations: Write operations skip the failed AZ and proceed in the other healthy AZs. The system always maintains multiple replicas. Even in extreme scenarios, at least one AZ remains available.
Query operations: Queries are automatically routed to the nearest replica in another AZ. This ensures service continuity and high availability.
After the instance's zone recovers from a fault:
Write operations: Write operations resume in the original AZ. The system asynchronously copies the new data that was written during the outage from the replica AZs back to the recovered AZ.
Query operations: The storage engine automatically routes queries. It attempts to read from the primary AZ first. If the primary AZ does not yet have the latest data, the query is automatically routed to a replica AZ. This routing keeps query results correct and up to date without requiring you to track replication progress.
In zone-redundant storage mode, the system provides highly available storage with automatic disaster recovery switching. No manual intervention is required to maintain business availability.
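The write and query routing rules for the three phases above (normal operation, zone failure, and recovery) can be illustrated with a toy in-memory model. This is only a sketch of the described behavior, not how Hologres is implemented: the class names, zone names, and the explicit backfill step are all assumptions made for illustration, and partial write failures are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    name: str
    healthy: bool = True
    data: dict = field(default_factory=dict)  # key -> (version, value)

    def version_of(self, key):
        return self.data.get(key, (0, None))[0]


class ToyZrsStore:
    """Toy model of the routing rules described above (hypothetical)."""

    def __init__(self, primary, replicas):
        # zones[0] is the primary zone; the others hold replicas.
        self.zones = [Zone(primary)] + [Zone(name) for name in replicas]
        self.version = 0

    def _zone(self, name):
        return next(z for z in self.zones if z.name == name)

    def _healthy(self):
        zones = [z for z in self.zones if z.healthy]
        if not zones:
            raise RuntimeError("no healthy zone available")
        return zones

    def write(self, key, value):
        """Write to every healthy zone, skipping failed ones."""
        targets = self._healthy()
        self.version += 1
        for zone in targets:
            zone.data[key] = (self.version, value)
        return [z.name for z in targets]

    def read(self, key):
        """Prefer the primary; fall back to a replica if the primary is down or stale."""
        healthy = self._healthy()
        latest = max(z.version_of(key) for z in healthy)
        zone = next(z for z in healthy if z.version_of(key) == latest)
        return zone.name, zone.data.get(key, (0, None))[1]

    def fail(self, name):
        self._zone(name).healthy = False

    def recover(self, name):
        # The zone is reachable again, but may still hold stale data.
        self._zone(name).healthy = True

    def backfill(self, name):
        """Copy entries written during the outage into the recovered zone."""
        target = self._zone(name)
        for source in self._healthy():
            for key, entry in source.data.items():
                if entry[0] > target.version_of(key):
                    target.data[key] = entry


store = ToyZrsStore("az-a", ["az-b", "az-c"])
store.write("order-1", "created")  # lands in all three zones
store.fail("az-a")
store.write("order-1", "paid")     # skips az-a
store.recover("az-a")
print(store.read("order-1"))       # served by a replica until az-a is backfilled
store.backfill("az-a")
print(store.read("order-1"))       # served by the primary again
```

In the model, a recovered primary keeps serving only the keys for which it already holds the latest version, which mirrors the point that you do not need to track replication progress yourself.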

Compute disaster recovery principle
Compute disaster recovery is available only for instances that use 3-AZ zone-redundant storage. Storage disaster recovery switches AZs and routes traffic automatically; compute disaster recovery does not. Compute resources are stateless, so if a compute data center fails, you must manually select Switch Computing Zone in the console to move compute nodes to a healthy AZ and restore compute availability.
If the target AZ has insufficient compute resources, the switchover may fail, because resource availability in the target AZ is not guaranteed. In this case, promptly submit a ticket or join the Hologres user group to contact Hologres support.
Purchase and use a multi-zone disaster recovery instance
When you purchase a new instance, set Storage Redundancy Type to ZRS.
Billing: You are charged based on the rate for zone-redundant storage. Compared to standard instances, only the storage fees increase. For more information, see Billing overview.
Existing instances default to locally redundant storage (single-AZ storage). Only instances running Hologres V3.0.19 or later can be upgraded to 3-AZ storage. For details, see Convert a standard instance to a 3-AZ multi-zone disaster recovery instance.

After you purchase the instance, go to the Instance Details page. Under Storage Resources, confirm that Storage Redundancy Type shows Zone-redundant Storage (ZRS). You can use this instance in the same way as a locally redundant storage instance. If an AZ fails, follow the steps in Disaster recovery guidance.
Disaster recovery guidance
Storage disaster recovery guidance
If the AZ that hosts your instance's storage fails, you will receive an SMS or email notification from Hologres. Hologres performs an automatic recovery process as follows:
Hologres automatically switches the storage to a healthy AZ. No action is required from your application. The service resumes automatically.
After the switchover, data continues to be written to the healthy AZs. Queries are automatically routed to the nearest AZ that holds your data. No code changes are required. If any jobs failed during the outage, you must rerun them.
Monitor your application to confirm that it has fully recovered.
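Rerunning the jobs that failed during the outage can be scripted on the client side. The following is a minimal sketch under stated assumptions: each job is represented by a hypothetical zero-argument callable that raises an exception on failure, and the retry count and delays are illustrative defaults rather than recommended values. Rerun only jobs that are safe to execute more than once.

```python
import time


def rerun_failed_jobs(jobs, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Rerun each job until it succeeds or the attempts run out.

    jobs maps a job name to a zero-argument callable that raises on failure.
    Returns the names of the jobs that still fail after all attempts.
    """
    still_failing = []
    for name, job in jobs.items():
        for attempt in range(1, max_attempts + 1):
            try:
                job()
                break  # job succeeded; move on to the next one
            except Exception:
                if attempt == max_attempts:
                    still_failing.append(name)
                else:
                    # Exponential backoff: base_delay, 2x, 4x, ...
                    sleep(base_delay * 2 ** (attempt - 1))
    return still_failing
```

The jobs returned by the function are the ones to investigate manually while you monitor the application.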

Compute disaster recovery guidance
If your instance uses zone-redundant storage (multi-zone disaster recovery), Hologres supports manual compute AZ switching to achieve multi-AZ compute high availability and quickly restore the service.
If the AZ that hosts your instance's compute resources fails, you will receive an SMS or email notification from Hologres. You must perform the following steps manually:
Go to the Hologres Management Console. On the Instances page, click the instance ID to go to the Instance Details page.
In the left navigation pane, click Backup and Disaster Recovery, then select the Zone-disaster Recovery tab.
In the Compute disaster recovery section, click Switch Computing Zone.

If sufficient compute resources are available in the target AZ, select the desired Computing Zone for Disaster Recovery in the Switch Computing Zone dialog box, and then click OK to migrate the compute nodes.
After migration, your instance’s endpoint and other basic configurations remain unchanged. When the instance status is Running, rerun any failed jobs and monitor your application until it is fully recovered.
Manual compute AZ switching is supported only for instances that use zone-redundant storage (multi-zone disaster recovery). If your instance uses locally redundant storage, see Convert a standard instance to a 3-AZ multi-zone disaster recovery instance.
If the target AZ has insufficient compute resources, the switchover will fail. In this case, promptly submit a ticket or join the Hologres user group to contact Hologres support.
After manual compute AZ migration, your instance’s endpoint and network configuration remain unchanged.
No additional compute fees are incurred after a manual compute AZ migration.
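The notes above amount to a small decision procedure that you could encode in a runbook. The sketch below is purely illustrative: the function name, the action labels, and the capacity figures are hypothetical, and the real availability check is performed by the console when you click Switch Computing Zone.

```python
def plan_compute_switch(storage_redundancy, failed_zone, free_capacity, required_capacity):
    """Return (action, target_zone) following the notes above.

    free_capacity maps a zone name to its available compute units. This is a
    hypothetical input; only the console knows the real availability.
    """
    if storage_redundancy != "ZRS":
        # Manual compute AZ switching requires zone-redundant storage.
        return ("convert_to_zrs_first", None)
    candidates = {
        zone: capacity
        for zone, capacity in free_capacity.items()
        if zone != failed_zone and capacity >= required_capacity
    }
    if not candidates:
        # Insufficient resources in every healthy zone: escalate to support.
        return ("contact_support", None)
    # Pick the healthy zone with the most headroom.
    target = max(candidates, key=candidates.get)
    return ("switch_computing_zone", target)
```

Choosing the zone with the most free capacity is one possible policy, not a documented one; any healthy zone with enough resources satisfies the notes above.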
Convert a standard instance to a 3-AZ multi-zone disaster recovery instance
If your instance uses locally redundant storage, its data resides in a single AZ within a region. If the storage data center in that AZ fails, the related data becomes inaccessible.
To enable multi-zone disaster recovery, you can submit a ticket or join the Hologres user group. An O&M engineer will perform the conversion in the background. Take note of the following:
Only Hologres instances running V3.0.19 or later support zone-redundant storage. If your instance runs an earlier version, you must upgrade it first. You can perform the upgrade by following the instructions in Instance upgrade, or join the Hologres user group for help (see How do I get more online support?).
Impact of conversion:
During the conversion, write operations are paused but read operations can continue. If your jobs support automatic failover, you do not need to manually stop them.
The conversion time depends on the number of tables in your instance. For most instances, the conversion is completed within 10 minutes. Your Hologres support engineer will provide a more precise time estimate.
After the conversion, you are charged based on the rate for zone-redundant storage. This results in higher storage costs. We recommend that you monitor your bills closely.
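Because only the storage fee changes after the conversion, the expected increase is the stored volume multiplied by the difference between the two storage rates. The helper below is a minimal sketch of that arithmetic. The function name is hypothetical, and the rates must be taken from Billing overview; no actual prices are assumed here.

```python
def storage_cost_increase(stored_gb, lrs_rate_per_gb, zrs_rate_per_gb):
    """Estimate the extra storage fee per billing period after conversion.

    Both rates are per GB for the same billing period and must come from the
    official billing documentation; this function only does the arithmetic.
    """
    if stored_gb < 0:
        raise ValueError("stored_gb must not be negative")
    return stored_gb * (zrs_rate_per_gb - lrs_rate_per_gb)
```

Compute fees and all other fees are left out of the estimate because they remain unchanged.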