×
Community Blog PostgreSQL Multi-replica Wait Behavior, Transaction Snapshots, and Data Visibility

PostgreSQL Multi-replica Wait Behavior, Transaction Snapshots, and Data Visibility

In this article, you’ll learn about the multi-replica wait behavior of PostgreSQL by performing multiple synchronous tests.

By digoal

Background

PostgreSQL has built-in multi-replica function, relating to several parameters.

配置多副本样式  
synchronous_standby_names = ''  # standby servers that provide sync rep  
                                # method to choose sync standbys, number of sync standbys,  
                                # and comma-separated list of application_name  
                                # from standby(s); '*' = all  
  
  
配置事务的等待模式  
#synchronous_commit = on                # synchronization level;  
                                        # off, local, remote_write, remote_apply, or on  
  
配置standby名字  
#primary_conninfo = ''                  # connection string to sending server  
                                        # (change requires restart)  application_name 区分standby  

Many similar articles have introduced the meaning of these parameters, however, this article will not be including those parameters. To learn about the parameters, visit the following link:

HA Protection Configuration of ApsaraDB RDS for PostgreSQL – Maximum Protection, Availability, and Performance

Next, analyze whether other sessions can apply the transaction while waiting, once the transaction is committed. Besides, check whether other sessions apply modifications after canceling the wait and whether changes are visible to other sessions when the session is killed by other sessions.

Test 1 by Synchronous Mode

Disable the standby database and simulate waiting.

Session 1:

db1=# set synchronous_commit =on;  
SET  
  
db1=# select * from t;  
 id | info |         crt_time         | mod_time   
----+------+--------------------------+----------  
  1 | abc  | 2020-02-13 14:04:43.5781 |        1  
  2 |      |                          |        3  
(2 rows)  
  
db1=# begin;  
BEGIN  
db1=# insert into t values (3);  
INSERT 0 1  
db1=# select * from t;  
 id | info |         crt_time         | mod_time   
----+------+--------------------------+----------  
  1 | abc  | 2020-02-13 14:04:43.5781 |        1  
  2 |      |                          |        3  
  3 |      |                          |        4  
(3 rows)  
  
db1=# commit;  
由于从库关掉了, 所以commit处于等待状态.  

Session 2:

Query that Session 1 is in the commit waiting state and waits for the SyncRep event.

db1=# select * from pg_stat_activity where pid=87173;  
-[ RECORD 1 ]----+------------------------------  
datid            | 17765  
datname          | db1  
pid              | 87173  
usesysid         | 10  
username          | postgres  
application_name | psql  
client_addr      |   
client_hostname  |   
client_port      | -1  
backend_start    | 2020-02-13 14:57:26.579201+08  
xact_start       | 2020-02-13 15:00:31.732551+08  
query_start      | 2020-02-13 15:00:36.534047+08  
state_change     | 2020-02-13 15:00:36.534048+08  
wait_event_type  | IPC  
wait_event       | SyncRep  
state            | active  
backend_xid      | 246780732  
backend_xmin     |   
query            | commit;  
backend_type     | client backend  

Query the transaction snapshot of the current database. 246780732, the transaction number of Session 1, has not yet been submitted.

db1=# select * from txid_current_snapshot();  
 txid_current_snapshot   
-----------------------  
 246780732:246780732:  
(1 row)  

txid_snapshot's textual representation is xmin:xmax:xip_list. For example, 10:20:10,14,15 means xmin=10, xmax=20, xip_list=10, 14, 15.

Name Description
xmin Earliest transaction ID (txid) that is still active. All earlier transactions will either be committed and visible, or rolled back and dead.
xmax First as-yet-unassigned txid. All txids greater than or equal to this are not yet started as of the time of the snapshot, and thus invisible.
xip_list Active txids at the time of the snapshot. The list includes only those active txids between xmin and xmax; there might be active txids higher than xmax. A txid that is xmin <= txid < xmax and not in this list was already completed at the time of the snapshot, and thus either visible or dead according to its commit status. The list does not include txids of subtransactions.

246780732 corresponds to the transaction in the commit waiting state, which is an uncommitted transaction. Therefore, the change is invisible to other sessions.

In Session 2, the content of the database modified by the transaction when Session 1 is still waiting for commit cannot be queried.

db1=# select * from t;  
 id | info |         crt_time         | mod_time   
----+------+--------------------------+----------  
  1 | abc  | 2020-02-13 14:04:43.5781 |        1  
  2 |      |                          |        3  
(2 rows) 

Session 1:

Actively cancel the waiting and an alert that the local transaction has been committed and is received. The remote access is a complete WAL log for receiving this transaction.

db1=# commit;  
^CCancel request sent  
WARNING:  canceling wait for synchronous replication due to user request  
DETAIL:  The transaction has already been committed locally, but might not have been replicated to the standby.  
COMMIT  

Session 2:

After Session 1 has been canceled, it is committed locally, and the transaction number shows that the transaction is in the committed state.

db1=# select * from t;  
 id | info |         crt_time         | mod_time   
----+------+--------------------------+----------  
  1 | abc  | 2020-02-13 14:04:43.5781 |        1  
  2 |      |                          |        3  
  3 |      |                          |        4  
(3 rows)  

Test 2 by Synchronous Mode

Similarly, disable the standby database and simulate waiting.

Session 1:

Start synchronous transactions, write, and wait.

db1=# begin;  
BEGIN  
db1=# set synchronous_commit =on;  
SET  
db1=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          87528  
(1 row)  
  
db1=# insert into t values (6);  
INSERT 0 1  
db1=# commit;  
无限等待中. 

Session 2:

kill Session 1.

db1=# select * from t;  
 id | info |         crt_time         | mod_time   
----+------+--------------------------+----------  
  1 | abc  | 2020-02-13 14:04:43.5781 |        1  
  2 |      |                          |        3  
  3 |      |                          |        4  
  4 |      |                          |        5  
  5 |      |                          |        6  
(5 rows)  
  
db1=# select pg_terminate_backend(87528);  
 pg_terminate_backend   
----------------------  
 t  
(1 row)  

Session 1:

Receive a warning, indicating that Session 1 has been killed.

WARNING:  canceling the wait for synchronous replication and terminating connection due to administrator command  
DETAIL:  The transaction has already committed locally, but might not have been replicated to the standby.  
server closed the connection unexpectedly  
        This probably means the server terminated abnormally  
        before or while processing the request.  
The connection to the server was lost. Attempting reset: Succeeded. 

Session 2:

For transactions that enter the SyncRep waiting state and have changes in Session 1, only a few variables that have not been released prevent other sessions from querying the changes. Therefore, the variables are released when Session 1 is killed, and changes are visible to other sessions.

db1=# select * from t;  
 id | info |         crt_time         | mod_time   
----+------+--------------------------+----------  
  1 | abc  | 2020-02-13 14:04:43.5781 |        1  
  2 |      |                          |        3  
  3 |      |                          |        4  
  4 |      |                          |        5  
  5 |      |                          |        6  
  6 |      |                          |        7  
(6 rows)  

Test 3 by Synchronous Mode

Similarly, disable the standby database and simulate waiting.

Session 1:

Start synchronous transactions, write, and wait.

db1=# begin;  
BEGIN  
db1=# set synchronous_commit =on;  
SET  
db1=# insert into t values (7);  
INSERT 0 1  
db1=# commit;  

Session 2:

Cancel the query of Session 1.

db1=# select pg_cancel_backend(87539);  
 pg_cancel_backend   
-------------------  
 t  
(1 row)

Session 1:

Receive a warning that the query has been canceled. The result is the same as that of self-initiated cancellation. The transaction is locally committed.

WARNING:  canceling wait for synchronous replication due to user request  
DETAIL:  The transaction has already been committed locally, but might not have been replicated to the standby.  
COMMIT  

Session 2:

The transaction for Session 1 is committed locally and is visible.

db1=# select * from t;  
 id | info |         crt_time         | mod_time   
----+------+--------------------------+----------  
  1 | abc  | 2020-02-13 14:04:43.5781 |        1  
  2 |      |                          |        3  
  3 |      |                          |        4  
  4 |      |                          |        5  
  5 |      |                          |        6  
  6 |      |                          |        7  
  7 |      |                          |        8  
(7 rows)  

Summary

1) If parameters are submitted with multiple replicas, their feedback should be awaited after committing.

2) During the waiting, the transaction number is in the uncommitted state in the transaction snapshot, so other sessions cannot view the database changes made by the transaction.

3) During the waiting, users can actively cancel the current waiting, which can also be canceled or terminated by other sessions. After canceling, users will receive a warning, indicating that the transaction has been committed locally.

4) The transaction number is then removed from the snapshot's uncommitted transactions, and changes made by this committed transaction to the database are visible to itself and other sessions.

0 0 0
Share on

digoal

211 posts | 14 followers

You may also like

Comments