The Three Data Centers Across Two Regions architecture deploys a PolarDB-X instance across three data centers: two primary data centers in one region and one secondary data center in a second region. This topology provides cross-region high availability with a recovery point objective (RPO) of zero, meeting Level 4 through Level 6 disaster recovery requirements in the financial industry.
This topic describes how the architecture works, the mechanisms that underpin it, and the operations you need to manage it.
Supported version
This architecture requires Apsara Stack DBStack V1.2.1 or later.
Disaster recovery levels
| Level | Recovery time objective (RTO) | RPO | Deployment requirement |
|---|---|---|---|
| Level 4 | ≤ 30 minutes | 0 | Zone-disaster recovery or geo-disaster recovery |
| Level 5 | ≤ 15 minutes | 0 | Geo-disaster recovery, at least one replica in the remote region |
| Level 6 | ≤ 1 minute | 0 | Geo-disaster recovery, at least two replicas in the remote region |
PolarDB-X uses five replicas based on the Paxos majority consensus protocol to achieve RPO = 0 across regions. When the primary data centers fail completely, the remote secondary instance restores service within the RTO constraints above.
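As a reading aid for the table above, the following minimal Python sketch maps a measured RTO and RPO to the strictest level they satisfy. The function name `highest_dr_level` and the `DR_LEVELS` constant are hypothetical and not part of PolarDB-X. The sketch checks only the RTO and RPO columns and ignores the deployment requirement column.

```python
from typing import Optional

# (level, maximum RTO in minutes) copied from the table above.
# Every level in the table requires RPO = 0.
DR_LEVELS = [(6, 1.0), (5, 15.0), (4, 30.0)]


def highest_dr_level(rto_minutes: float, rpo_seconds: float) -> Optional[int]:
    """Return the strictest level whose RTO and RPO limits are met, else None.

    Hypothetical helper: the deployment requirement column is not checked.
    """
    if rpo_seconds != 0:
        return None  # all three levels require zero data loss
    for level, max_rto in DR_LEVELS:  # ordered from strictest to loosest
        if rto_minutes <= max_rto:
            return level
    return None
```

For example, a drill that restores service in 10 minutes with no data loss would be classified as Level 5 by this sketch.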
How it works
A PolarDB-X instance in this topology runs five replicas: four replicas distributed across two primary data centers in one region, and one replica in a secondary data center in another region. Majority synchronization requires responses from at least three replicas.
Within the primary data centers, the four replicas communicate over low-latency local networks—majority synchronization typically completes in approximately 1 millisecond. Network latency between the primary and secondary data centers is approximately 30 milliseconds, which is typical for cross-region connections in industries such as financial services.
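The latency figures above follow from simple majority arithmetic. The sketch below is illustrative only: it assumes the leader counts its own vote immediately and that each follower acknowledges after one round trip. The function names are hypothetical, and the round-trip values are the approximate figures quoted in this topic.

```python
from typing import List


def majority(voters: int) -> int:
    """Smallest number of votes that forms a majority of `voters`."""
    return voters // 2 + 1


def commit_latency_ms(follower_rtts_ms: List[float]) -> float:
    """Time until the leader holds a majority, given follower round-trip times.

    Assumption: the leader's own vote is available at time zero.
    """
    voters = len(follower_rtts_ms) + 1           # followers plus the leader
    acks_needed = majority(voters) - 1           # leader already counts as one vote
    if acks_needed == 0:
        return 0.0
    # The commit completes when the acks_needed-th fastest follower responds.
    return sorted(follower_rtts_ms)[acks_needed - 1]


# Three followers in the primary data centers (~1 ms) and one remote follower (~30 ms).
NORMAL_TOPOLOGY_RTTS = [1.0, 1.0, 1.0, 30.0]
```

With five voters, the majority is three, so the leader needs two follower acknowledgements. The two fastest followers are local, which is why the remote replica does not slow down writes in normal operation.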
Four mechanisms make this architecture work reliably:
Weighted election mechanism: Keeps leader elections local to avoid unnecessary cross-region latency.
Dynamic replica adjustment: Restores low-latency majority synchronization after a data center failure.
Forced start of a single replica: Lets the secondary data center serve requests when both primary data centers fail.
Remote secondary instance: Provides geo-disaster recovery to meet RTO requirements.
Failure scenarios and responses
| Failure scenario | Scope | Response |
|---|---|---|
| Leader replica failure | Primary data centers | Leader re-election is triggered. A follower in the same data center is prioritized to minimize traffic rerouting. |
| Follower replica failure | Primary data centers | No action required. |
| Follower replica failure | Secondary data center | No action required. |
| Primary data center failure | One of the two primary data centers | Five replicas are dynamically downgraded to three replicas. Cross-region synchronization may occur. |
| Secondary data center failure | — | Four replicas remain in the primary data centers. Paxos protocol performance is unaffected. |
| Both primary data centers fail | Regional failure | One replica remains in the secondary data center. Run force_single_mode to start that replica in single-replica mode. Switch business traffic to the remote secondary instance. |
| Secondary data center fails | Regional failure | No action required. |
Key mechanisms
Weighted election mechanism
PolarDB-X applies a weighted election mechanism so that leader re-elections prefer replicas in the same data center, avoiding unnecessary cross-region latency.
Replica election weights:
| Data center | Replica role | Election weight |
|---|---|---|
| Primary Data Center 1 | Leader | 9 |
| Primary Data Center 1 | Follower | 7 |
| Primary Data Center 2 | Follower | 5 |
| Primary Data Center 2 | Follower | 3 |
| Secondary Data Center | Follower | 1 |
The mechanism has two parts:
Optimistic weighted election: Each node waits a calculated delay before initiating a leader election. The delay is inversely proportional to the node's weight, so higher-weight nodes initiate elections first.
Mandatory weighted election: When a new leader discovers it does not have the highest weight among all nodes, it enters an abdication phase instead of immediately accepting writes. During this phase, the node sends heartbeat signals (for example, every 1 to 2 seconds). If a higher-weight node responds before the abdication phase ends, leadership transfers to that node.
For example, if the leader in Primary Data Center 1 fails, the follower in the same data center (weight 7) is prioritized over all others, keeping traffic local.
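The two parts of the mechanism can be sketched in Python as follows. This is a simplified model, not the PolarDB-X implementation: the `Replica` class, the `BASE_DELAY_MS` constant, and all function names are hypothetical. Only the weights come from the table above.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Replica:
    name: str
    weight: int
    alive: bool = True


# Weights from the table above; replica names are illustrative.
REPLICAS = [
    Replica("dc1-leader", 9),
    Replica("dc1-follower", 7),
    Replica("dc2-follower-a", 5),
    Replica("dc2-follower-b", 3),
    Replica("remote-follower", 1),
]

BASE_DELAY_MS = 900.0  # hypothetical scaling constant


def election_delay_ms(replica: Replica) -> float:
    """Optimistic part: delay is inversely proportional to the weight."""
    return BASE_DELAY_MS / replica.weight


def first_candidate(replicas: List[Replica]) -> Replica:
    """The live replica with the shortest delay starts the election first."""
    alive = [r for r in replicas if r.alive]
    return min(alive, key=election_delay_ms)


def must_abdicate(leader: Replica, replicas: List[Replica]) -> bool:
    """Mandatory part: a leader yields if a live replica has a higher weight."""
    return any(r.alive and r.weight > leader.weight for r in replicas)


def mark_failed(replicas: List[Replica], name: str) -> List[Replica]:
    """Return a copy of the replica list with one replica marked as failed."""
    return [replace(r, alive=False) if r.name == name else r for r in replicas]
```

In this model, calling `first_candidate(mark_failed(REPLICAS, "dc1-leader"))` returns the weight 7 follower in Primary Data Center 1, matching the example above. If a lower-weight replica wins anyway, `must_abdicate` reports that it should hand over leadership.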
Dynamic adjustment of replica quantities
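When a primary data center fails, only three of the five replicas remain reachable: two in the surviving primary data center and one in the secondary data center. Because the majority is still three, every write must wait for the secondary data center replica, adding approximately 30 milliseconds of cross-region latency. Downgrading the two unreachable followers to learners reduces the voting group to three replicas, so a majority of two can again be reached within the surviving primary data center. Use the following commands to adjust replica counts: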
| Transition | Command | Notes |
|---|---|---|
| Five replicas to three replicas | downgrade_follower | Converts two followers to learners |
| Three replicas to five replicas | upgrade_learner | Converts two learners back to followers; make sure replication logs are current before upgrading |
| One replica to three replicas | add_follower | Adds new replicas as learners; they are automatically promoted to followers once their logs are current |
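The effect of the five-to-three transition on write latency can be illustrated with a small Python model. It is a sketch under stated assumptions: the `Member` class and both functions are hypothetical, the leader's own vote is assumed to be immediate, and `downgrade_unreachable` only models the outcome of `downgrade_follower` rather than calling it.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Member:
    name: str
    rtt_ms: float           # round trip from the leader
    reachable: bool = True
    voter: bool = True      # False models a learner


def commit_latency_ms(members: List[Member]) -> Optional[float]:
    """Latency to reach a majority; `members` excludes the leader.

    Returns None when too few voters are reachable to form a majority.
    """
    voters = [m for m in members if m.voter]
    total = len(voters) + 1        # voting followers plus the leader
    needed = total // 2            # majority (total // 2 + 1) minus the leader's vote
    if needed == 0:
        return 0.0
    acks = sorted(m.rtt_ms for m in voters if m.reachable)
    if len(acks) < needed:
        return None
    return acks[needed - 1]


def downgrade_unreachable(members: List[Member]) -> List[Member]:
    """Model of the downgrade: unreachable followers become learners."""
    return [m if m.reachable else replace(m, voter=False) for m in members]


# Primary Data Center 2 has failed; the leader is in Primary Data Center 1.
AFTER_DC2_FAILURE = [
    Member("dc1-follower", 1.0),
    Member("dc2-follower-a", 1.0, reachable=False),
    Member("dc2-follower-b", 1.0, reachable=False),
    Member("remote-follower", 30.0),
]
```

In this model, `commit_latency_ms(AFTER_DC2_FAILURE)` is 30.0 because the remote replica is needed for the third vote, while `commit_latency_ms(downgrade_unreachable(AFTER_DC2_FAILURE))` drops back to 1.0 because the three-member group needs only one local acknowledgement.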
Forced start of a single replica
When both primary data centers fail, the single remaining replica in the secondary data center cannot satisfy the majority consensus requirement on its own. Run force_single_mode to force the system into single-replica mode, sidelining all follower replicas so the remaining replica can serve requests.
Once the primary data centers recover, PolarDB-X rebuilds the distributed system incrementally:
Add replicas from one to three: run add_follower.
Add replicas from three to five: run upgrade_learner.
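The reasoning behind the forced start and the rebuild order can be sketched as follows. This is an illustrative model only: `has_majority`, `rebuild_plan`, and `REBUILD_STEPS` are hypothetical names. The command names are the ones described in this topic, and the sketch does not execute them.

```python
from typing import List


def has_majority(reachable_voters: int, configured_voters: int) -> bool:
    """True when the reachable voters form a majority of the configured group."""
    return reachable_voters >= configured_voters // 2 + 1


# (replicas before, replicas after, command) as described in this topic.
REBUILD_STEPS = [
    (1, 3, "add_follower"),
    (3, 5, "upgrade_learner"),
]


def rebuild_plan(current: int, target: int = 5) -> List[str]:
    """Ordered commands to grow the group from `current` to `target` replicas."""
    plan = []
    for start, end, command in REBUILD_STEPS:
        if current == start and current < target:
            plan.append(command)
            current = end
    return plan
```

With five configured voters and one reachable replica, `has_majority(1, 5)` is false, which is why the surviving replica cannot serve requests on its own. After the group is forced to a single voter, `has_majority(1, 1)` is true. Once the primary data centers recover, `rebuild_plan(1)` yields `add_follower` followed by `upgrade_learner`.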
Remote secondary instance
In distributed database systems, replication progress can differ across Paxos groups during a distributed transaction. Without coordination, data from a partially replicated transaction could appear in the secondary instance, causing transaction inconsistency.
PolarDB-X addresses this using Change Data Capture (CDC) log nodes deployed in the remote region. These nodes sort and reorganize distributed transactions to guarantee atomic replication—no transaction is partially committed when data moves from the primary to the secondary instance. This guarantee holds during both routine disaster recovery drills and real failovers.
Two design points govern how replication and failover work:
Primary instance replication: The primary instance uses the cross-region Paxos protocol, requiring responses from at least three replicas. Because four replicas are in the primary data centers, majority synchronization normally completes locally. The remote replica in the secondary data center responds asynchronously, so cross-region latency does not affect primary instance write performance.
Secondary instance replication: The secondary instance is in the remote region. CDC log nodes replicate data in near real time across regions. Replication latency may occur, but atomic transaction replication ensures the secondary instance never reflects a partially committed transaction.
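The idea of atomic transaction replication can be illustrated with a buffering sketch in Python. This is not the PolarDB-X CDC algorithm, whose internals this topic does not describe. The `Fragment` and `AtomicReplayBuffer` names are hypothetical, and the sketch assumes that each log fragment carries its transaction ID, its commit timestamp, and the full set of participating Paxos groups.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Fragment:
    txn_id: str
    commit_ts: int
    group: str                    # Paxos group that produced this fragment
    participants: FrozenSet[str]  # all groups taking part in the transaction


class AtomicReplayBuffer:
    """Holds fragments until whole transactions can be released in commit order."""

    def __init__(self) -> None:
        self._pending: Dict[str, List[Fragment]] = {}

    def add(self, fragment: Fragment) -> List[str]:
        """Buffer one fragment and return the IDs of transactions now releasable."""
        self._pending.setdefault(fragment.txn_id, []).append(fragment)
        return self._drain()

    @staticmethod
    def _is_complete(fragments: List[Fragment]) -> bool:
        seen = {f.group for f in fragments}
        return seen == set(fragments[0].participants)

    def _drain(self) -> List[str]:
        released = []
        by_commit_ts = sorted(
            self._pending, key=lambda t: self._pending[t][0].commit_ts
        )
        for txn_id in by_commit_ts:
            if not self._is_complete(self._pending[txn_id]):
                break  # an earlier transaction is still partial; keep commit order
            released.append(txn_id)
        for txn_id in released:
            del self._pending[txn_id]
        return released
```

In this sketch, a transaction spanning groups `g1` and `g2` is held back until fragments from both groups have arrived, so a consumer never sees half of it. Later complete transactions also wait behind it, which is one simple way to preserve commit order.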
Common operations and maintenance (O&M)
Quick reference: scenario to command
| Scenario | Action |
|---|---|
| One primary data center fails | Run downgrade_follower to reduce five replicas to three |
| The failed primary data center recovers | Run upgrade_learner to restore five replicas |
| Both primary data centers fail | Run force_single_mode to start single-replica mode; switch traffic to the remote secondary instance |
| Primary data centers recover after a regional failure | Run add_follower to go from one replica to three, then upgrade_learner to go from three to five |
Create an instance with this topology
When creating a PolarDB-X instance, set the Topology parameter to Three Data Centers Across Two Regions.
View instance topology
On the Basic Information page of the instance, find the Topology Information section to see the zone details of all resources.
Perform a failover
Before you begin
Schedule the failover during low-traffic periods to reduce the impact on write performance.
On the Basic Information page, verify the current topology and confirm which data center you want to designate as the primary zone.
Steps
Log on to the PolarDB for Xscale console.
In the top navigation bar, select the region where the target instance is located.
On the Instances page, click the PolarDB-X 2.0 tab.
Find the target instance and click its ID.
On the Basic Information page, click Specify Primary Zone in the upper-right corner of the Topology Information section.
In the Specify Primary Zone dialog box, set the Data Center, Primary Zone, and Switch Mode parameters.
Click OK.
After the failover
Verify that the Topology Information section reflects the new primary zone before resuming normal write traffic.