Community Blog Solution to Ensure Availability and Reliability (rpo,rto) of PostgreSQL Two-node Stream Replication

Solution to Ensure Availability and Reliability (rpo,rto) of PostgreSQL Two-node Stream Replication

This article explains how to ensure availability and reliability of PostgreSQL with a semi synchronous feature also known as two node stream replication process.

By digoal


In the two-node HA architecture, how can we ensure that RPO is 0 in reliability dimension across data centers and control RTO in terms of availability at the same time?

Semi-synchronous feature is a good choice.

1) When only one node is suspended, the RPO is 0. The example is shown as below:

Primary -> standby (failed)
Primary (failed) -> standby 

2) One node fails and another node recovers and starts the synchronization mode. If the (current) primary node fails, RPO is greater than 0. Although the standby database is active at this time, the synchronization mode has not been started. The example is given below:

Primary (failed) -> standby (in OPEN state, but has failed before and not started the synchronization mode)
RPO > 0, which is similar to the condition of failure of both nodes at the same time

3) What is the way to ensure the RTO time controllably?

Under the synchronous mode, before committing, a transaction needs to wait for WALs of sync STANDBY to copy feedback, ensuring transaction WALs persist multiple replicas and return feedback to client. In this process, first, make the master database to be persistent. Then, synchronize the sync STANDBY. Next, wait for the feedback from the WAL synchronization timestamp in sync STANDBY. When the STANDBY fails, waiting is open-ended. So, two-node synchronous replication cannot take into account the availability (RTO).

Therefore, what is the solution?

The status of the (pg_stat_activity) wait event can be monitored. If the waiting time of a synchronous transaction exceeds a certain threshold (RTO threshold), the mode is degraded to the asynchronous mode.

Database restart is not required for degradation.

3.1) Modify the configuration.

3.2) Reload. This setting takes immediate effect for both existing and new connections.

3.3) Cancel the wait signal for the current waiting process.

4) After degrading, how can the mode be upgraded to the synchronization mode?

Similarly, the status of (pg_stat_replication) can be monitored. When sync standby is in streaming status, the mode can be switched to the synchronization mode.

Database restart is not required for the upgrade.

4.1) Modify configuration.

4.2) Reload. The setting takes effect immediately both for existing connections and new connections.

Technical Points Involved

1) Transaction submission parameters.

on, remote_apply, remote_write, local 

2) Synchronous configuration parameters.

[FIRST] num_sync ( standby_name [, ...] )    
ANY num_sync ( standby_name [, ...] )    
standby_name [, ...]    
ANY 3 (s1, s2, s3, s4)    
FIRST 3 (s1, s2, s3, s4)    
*  表示所有节点 

3) Active sessions: View the wait event status when the transaction is submitted.


Wait events

To know more, visit the following link


4) Stream status, pg_stat_replication.

    Current WAL sender state. Possible values are:    
startup: This WAL sender is starting up.    
catchup: This WAL sender's connected standby is catching up with the primary.    
streaming: This WAL sender is streaming changes after its connected standby server has caught up with the primary.    
backup: This WAL sender is sending a backup.    
stopping: This WAL sender is stopping.  



1) Primary

synchronous_commit = remote_write    
wal_level = replica    
max_wal_senders = 8    
synchronous_standby_names = '*'  

2) Standby

restore_command = 'cp /data01/digoal/wal/%f %p'    
primary_conninfo = 'host=localhost port=8001 user=postgres'     '

Synchronous Degradation and Upgrade - Practices

Disable standby to simulate a standby database exception to see how a semi-synchronous feature is achieved.

Simulate the STANDBY recovery to simulate the upgrade to the synchronous mode.

1) Monitor pg_stat_activity. If it is found that the transaction submission wait exceeds a certain threshold (RTO threshold), degrade it.

select max(now()-query_start) from pg_stat_activity where wait_event='SyncRep';     

2) View the waiting time of the preceding result (RTO threshold).

When it is greater than a threshold value, it starts degradation.

Note that NULL is used to ensure that no transactions are in the SyncRep wait state.

3) Degrade step 1 to modify the synchronous_commit parameter. Change to WAL local persistence with asynchronous stream replication.

alter system set synchronous_commit=local;    

4) Degrade step 2 to RELOAD the effective parameters.

select pg_reload_conf();

5) Degrade step 3 to clear the current waiting queue. After the SyncRep waiting process receives the CANCEL signal, it clears the queue and notifies the client that the local WAL of the current transaction has been persisted. The transaction ends normally.

select pg_cancel_backend(pid) from pg_stat_activity where wait_event='SyncRep';    

6) The client that receives the clearing signal returns normal results. The client can see that the transaction is submitted normally.

postgres=# end;    
WARNING:  01000: canceling wait for synchronous replication due to user request    
DETAIL:  The transaction has already been committed locally, but might not have been replicated to the standby.    
LOCATION:  SyncRepWaitForLSN, syncrep.c:264    


The redo information of the transaction has been persisted in the local WAL, and the commit status is normal.

For subsequent requests of the current session, the asynchronous stream replication mode, that is, WAL local persistence mode (synchronous_commit=local), will be implemented.

Upgrade process is as follows:

7) Upgrade step 1 to monitor standby state. If sync_state='sync' state of standby is streaming state, it means that WALs of the standby and primary are fully synchronized.

select * from pg_stat_replication where sync_state='sync' and state='streaming';    

The result returned indicates that the standby has received the WAL of the primary. Then, the synchronization can be implemented.

8) Upgrade step 2 to change the transaction commit mode to synchronous mode with synchronous_commit=remote_write. When submitting a transaction, wait until the sync standby receives the WAL and writes it later.

alter system set synchronous_commit=remote_write; 

9) Upgrade step 3 to RELOAD the effective parameters. Reset synchronous_commit=remote_write for all sessions, including existing connections and new connections.

select pg_reload_conf();    


1) Without modifying the kernel of PostgreSQL, the two-node semi-synchronous mode is implemented through external auxiliary monitoring and manipulation, for example, the 5-second monitoring interval. If the two-node or single-node structure runs normally, RPO is 0 and controllable. For example, the maximum wait_event='SyncRep' wait time is greater than 10 seconds.

2) Kernel modification suggestions.

  • Degradation: Add HOOK to the wait queue, and wait_event='SyncRep' degrades to the asynchronous mode after timeout.
  • Upgrade: Add HOOK to the wal_sender code, and change to the synchronization mode after the standby is recovered.


1) https://www.postgresql.org/docs/11/monitoring-stats.html#MONITORING-STATS-VIEWS

0 0 0
Share on


211 posts | 14 followers

You may also like