Database primary-standby replication is one of the standard methods of open source databases to solve the problem of single point availability. For example, in ApsaraDB for RDS PostgreSQL dual-node architecture, you can create a standby node using PostgreSQL internal stream replication and switch to it when the primary node exhibits abnormal behavior.
The active-standby redo replication of the community version of PostgreSQL supports the following synchronization modes:
#synchronous_commit = on # synchronization level; # off, local, remote_write, remote_apply, or on #synchronous_standby_names = '' # standby servers that provide sync rep # method to choose sync standbys, number of sync standbys, # and comma-separated list of application_name # from standby(s); '*' = all [FIRST] num_sync ( standby_name [, ...] ) ANY num_sync ( standby_name [, ...] ) standby_name [, ...]
To ensure zero loss and availability, a customer needs to select at least three nodes, or 2.5 nodes, with one node only receiving redo logs. In case of a particular node failure, there will be no data loss and availability issue with this setup.
However, the community version doesn’t support the semi-synchronization mode. Synchronization mode is useful when standby is normal, and asynchronization mode is beneficial when standby is abnormal or in jitter. This mode does not fully guarantee zero data loss, but the probability is high. Therefore, some customers prefer the two-node semi-synchronization mode after considering the costs and availability.
ApsaraDB RDS PG supports semi-synchronization by enhancing its kernel. The configuration method is relatively simple, with two parameters required as follows:
The maximum time in milliseconds to wait for WAL synchronous replication. In the case of timeout, synchronous replication change to asynchronous until replication is caught up.
If the replication delays or standby exception occurs, 0 indicates waiting forever, equivalent to full synchronization in the community version. A value greater than 0 indicates the maximum wait time of commit or rollback. If the wait time exceeds this number, the mode degrades to asynchronous.
If a standby exception occurs in semi-synchronous mode and the wait time exceeds the set value, the mode degrades to asynchronous. When standby recovers and catches up, the mode changes to synchronous.
Adjust the protection level of the transaction with the following parameters:
It sets the current transaction's synchronization level.
The configuration settings are as follows:
1) Testing and edge services with requirements on high performance and low data reliability.
2) Services with requirements on high performance and certain data reliability (a small amount of data loss is acceptable).
3) Services that require the highest reliability to ensure zero loss even at the expense of availability.
Solution to Ensure Availability and Reliability (rpo,rto) of PostgreSQL Two-node Stream Replication
PostgreSQL Multi-replica Wait Behavior, Transaction Snapshots, and Data Visibility
ApsaraDB - August 1, 2022
Alibaba Clouder - August 1, 2017
Alibaba Clouder - July 22, 2020
digoal - September 30, 2021
afzaalvirgoboy - February 25, 2020
Alibaba Clouder - January 8, 2021
Fully managed and less trouble database servicesLearn More
An online MPP warehousing service based on the Greenplum Database open source programLearn More
SDDP automatically discovers sensitive data in a large amount of user-authorized data, and detects, records, and analyzes sensitive data consumption activities.Learn More
A comprehensive DDoS protection for enterprise to intelligently defend sophisticated DDoS attacks, reduce business loss risks, and mitigate potential security threats.Learn More
More Posts by digoal