E-MapReduce: Disaster recovery management

Last Updated: Mar 26, 2026

Disaster recovery management lets you deploy EMR Serverless StarRocks instances across multiple availability zones (AZs) within a region, so that a single zone failure does not take your service offline. When a zone becomes unavailable, you can switch frontend (FE) and compute nodes to a healthy secondary zone to restore operations. Underlying data uses zone-redundant storage, so it remains intact across zone failures.

This feature is available only for instances that use a storage and compute separation architecture.

Zone types

Choose your deployment model based on how much zone failure your workload can tolerate.

| Zone type | Disaster recovery capability | Data storage | Fault recovery | Deployment cost |
| --- | --- | --- | --- | --- |
| Single zone | No cross-data center disaster recovery. | Redundant storage across multiple devices within the same zone. | BE/CN node failures are auto-recovered. Zone-level failures cause task interruptions. | No extra cost. |
| Two-zone | Withstands a single zone failure. | Redundant storage across multiple zones within the same region. | If the primary zone fails, switch FE and compute nodes to the secondary zone. | Two sets of FE nodes + one set of compute nodes + extra storage cost. |
| Three-zone | Withstands two concurrent zone failures. | Redundant storage across multiple zones within the same region. | If the primary zone fails, select a functioning secondary zone for a primary/secondary failover. FE and compute nodes switch to that zone. | Three sets of FE nodes + one set of compute nodes + extra storage cost. |
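
The comparison above can be encoded as a small lookup, for example to pick the smallest deployment model that tolerates a required number of concurrent zone failures. This is a minimal sketch: `ZoneType`, `ZONE_TYPES`, and `pick_zone_type` are hypothetical names used for illustration and are not part of any EMR SDK. The single-zone entry assumes one set of FE nodes, which the table implies but does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ZoneType:
    name: str
    tolerated_zone_failures: int  # concurrent zone failures the model withstands
    fe_node_sets: int             # sets of FE nodes deployed (single zone: assumed 1)
    compute_node_sets: int        # sets of compute nodes deployed
    zone_redundant_storage: bool  # True when storage spans multiple zones


# Ordered from cheapest to most expensive, following the table.
ZONE_TYPES = [
    ZoneType("Single zone", 0, 1, 1, False),
    ZoneType("Two-zone", 1, 2, 1, True),
    ZoneType("Three-zone", 2, 3, 1, True),
]


def pick_zone_type(required_failures: int) -> Optional[ZoneType]:
    """Return the first (cheapest) model that tolerates the required failures."""
    for zone_type in ZONE_TYPES:
        if zone_type.tolerated_zone_failures >= required_failures:
            return zone_type
    return None  # no deployment model in the table tolerates this many failures
```

For example, `pick_zone_type(1)` selects the two-zone model, while a request to tolerate three zone failures returns `None` because no model in the table covers it.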

Limitations

  • The primary zone cannot be changed after an instance is created.

  • The primary instance and disaster recovery nodes must be in different zones within the same region. Cross-region deployment is not supported.

  • After a zone switchover, FE and compute nodes are migrated to the secondary zone. The instance's network configuration and domain name remain unchanged. Make sure the vSwitch in the secondary zone has enough available IP addresses.

  • After you enable multi-zone disaster recovery, you cannot disable it.
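
The zone, region, and IP address constraints above lend themselves to a pre-flight check before you enable disaster recovery. The following is a minimal sketch under stated assumptions: the `VSwitch` record and `check_dr_plan` function are hypothetical, the vSwitch data is supplied by you rather than fetched from any API, and `required_ips` is a value you choose, because this document does not state how many IP addresses a switchover needs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VSwitch:
    vswitch_id: str
    region_id: str
    zone_id: str
    available_ips: int


def check_dr_plan(primary: VSwitch, backups: List[VSwitch], required_ips: int) -> List[str]:
    """Return a list of problems; an empty list means the plan passes these checks."""
    problems = []
    if len(backups) not in (1, 2):
        problems.append("expected 1 backup vSwitch (two zones) or 2 (three zones)")
    seen_zones = {primary.zone_id}
    for vsw in backups:
        # Cross-region deployment is not supported.
        if vsw.region_id != primary.region_id:
            problems.append(f"{vsw.vswitch_id}: not in the primary region")
        # Every zone in the plan must be distinct.
        if vsw.zone_id in seen_zones:
            problems.append(f"{vsw.vswitch_id}: zone {vsw.zone_id} is already used")
        seen_zones.add(vsw.zone_id)
        # The secondary vSwitch needs enough free IP addresses for migrated nodes.
        if vsw.available_ips < required_ips:
            problems.append(
                f"{vsw.vswitch_id}: {vsw.available_ips} free IPs, need {required_ips}"
            )
    return problems
```

Because multi-zone disaster recovery cannot be disabled and the primary zone cannot be changed, a check of this kind is most useful before the first enablement rather than after.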

Billing

Multi-zone disaster recovery incurs the following additional costs:

  • Observer computing resources: Additional FE nodes are deployed in secondary zones as Observer nodes, generating additional CU costs. These appear as OBSERVER Computing Resources in your bill. For details, see Computing fees.

  • Zone-redundant storage: The underlying storage spans multiple zones, which incurs additional storage costs. For details, see Data storage (multi-zone) fees.
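
As a rough sizing aid, the additional Observer capacity can be estimated from the FE size and the number of zones. This sketch assumes that each secondary zone hosts one Observer set of the same size as the FE set, which is an inference from the zone type table and the usage notes, not a billing formula from this document. The function name `extra_observer_cu` is hypothetical, and actual charges are defined in Computing fees.

```python
def extra_observer_cu(fe_cu: float, zone_count: int) -> float:
    """Estimate additional Observer CUs for a two-zone or three-zone deployment.

    Assumption: one Observer set per secondary zone, each sized like the FE set.
    """
    if zone_count not in (2, 3):
        raise ValueError("multi-zone disaster recovery uses two or three zones")
    if fe_cu < 0:
        raise ValueError("fe_cu must be non-negative")
    secondary_zones = zone_count - 1
    return fe_cu * secondary_zones
```

Under this assumption, an FE set of 8 CUs adds about 8 Observer CUs in a two-zone deployment and about 16 in a three-zone deployment. Zone-redundant storage costs are separate and are not estimated here.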

Enable multi-zone disaster recovery

Enable when creating an instance

  1. Log on to the E-MapReduce console. In the left navigation pane, choose EMR Serverless > StarRocks.

  2. In the top menu bar, select the target region.

  3. On the Instance List page, click Create Instance.

  4. On the E-MapReduce Serverless StarRocks page, set Multi-zone Disaster Recovery to Two zones or Three zones, and specify the backup vSwitch for each zone.

For more information about instance creation parameters, see Create an instance.

Enable for an existing instance

  1. Log on to the E-MapReduce console. In the left navigation pane, choose EMR Serverless > StarRocks.

  2. Click the name of the target instance, then click the Disaster Recovery tab.

  3. Click Disaster Recovery Settings.

  4. In the panel that appears, set Multi-zone Disaster Recovery to Two zones or Three zones, and specify the backup vSwitch for each zone.

  5. Review and accept the service agreement, then click OK.

Switch zones

Important

A zone switchover makes the instance unavailable for approximately 1–2 hours. Perform a switchover only when the instance is completely unavailable and cannot be recovered through other means. If the destination zone has insufficient server resources, the switchover may fail, so submit a ticket in advance to request capacity in the destination zone.

Before performing a zone switchover, understand the following impacts:

  • Downtime: The instance is unavailable for approximately 1–2 hours during the switchover.

  • Post-switchover state: After the switchover, the instance continues operating from the secondary zone. The network configuration and domain name remain unchanged, so no client-side updates are required to reconnect.

Steps:

  1. Log on to the E-MapReduce console. In the left navigation pane, choose EMR Serverless > StarRocks.

  2. Click the name of the target instance, then click the Disaster Recovery tab.

  3. In the row of the target secondary zone, click Switch Zone.

  4. In the dialog box, enter the current instance name to confirm, then click OK.
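
Because the domain name does not change after a switchover, a client can wait out the downtime by polling the same endpoint until it accepts connections again. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, the host shown in the usage comment is a placeholder, and port 9030 is an assumption based on the default StarRocks FE query port; use the endpoint and port shown for your instance.

```python
import socket
import time
from typing import Callable


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_reachable(
    probe: Callable[[], bool],
    max_wait_s: float,
    interval_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() every interval_s seconds until it succeeds or time runs out."""
    deadline = clock() + max_wait_s
    while True:
        if probe():
            return True
        # Stop if the next attempt would start after the deadline.
        if clock() + interval_s > deadline:
            return False
        sleep(interval_s)


# Usage (placeholder host; allow more than the expected 1-2 hours):
# ok = wait_until_reachable(
#     lambda: tcp_reachable("your-instance-endpoint.example.com", 9030),
#     max_wait_s=3 * 3600,
# )
```

A successful TCP connection only shows that the endpoint is listening again. Run a test query before resuming production traffic.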

Usage notes

  • If the primary zone fails before all data is synced to the disaster recovery nodes, that unsynced data may be lost after a switchover.

  • When FE resources are adjusted (for example, by scaling out nodes, upgrading configurations, or expanding disks), Observer resources in the secondary zone are automatically synced with the FE.