×
Community Blog How Does PostgreSQL Maintain Heartbeat and Support Automatic Upgrade and Synchronization

How Does PostgreSQL Maintain Heartbeat and Support Automatic Upgrade and Synchronization

UDF customization can help prevent the blocking of heartbeat and support automatic upgrade and degradation in synchronous or asynchronous mode.

By digoal

Background

The heartbeat can be prevented from blocking by customizing UDFs. Currently, UDFs allow automatic upgrade and degradation in synchronous or asynchronous mode based on configurations, realizing semi-synchronous function.

UDF input:

1) Priority mode (synchronous and asynchronous)

2) Synchronous wait timeout

Assume that the synchronous mode and synchronous configuration are selected. If the standby database has an exception and the transaction commitment waits for longer than a specified time, the system automatically degrades to asynchronous mode.

When the asynchronous mode and synchronous configuration are preferred, it is automatically degraded to asynchronous mode.

Assume that the synchronous mode and asynchronous configuration are used. If the standby database is restored to the streaming mode, it is automatically upgraded to the synchronous mode.

Technical points of use:

1) Alter system

2) Reload configuration

3) Cancel backend

4) dblink asynchronous calls

Heartbeat UDF Logic

Check the current instance status  
  
  Read-only  
  
    Exit 
  
  Read/write  
  
    Determine the current transaction mode   
  
      Asynchronous  
  
        Send heartbeat  
  
        What is the priority mode  
  
          Asynchronous  
  
            Exit  
  
          Synchronization  
  
            Determine whether the upgrade is required  
  
              Upgrade  
  
              Exit  
  
  
      Synchronization  
  
        Consume asynchronous messages  
  
        Send remote heartbeat  
  
        Query timeout  
  
          Degradation  
  
        Otherwise  
  
          Consume asynchronous messages  
  
        What is the priority mode  
  
        Asynchronous  
  
          Degradation  
  
          Exit  
  
        Synchronous  
  
          Exit  

Design

1) Configure the current postgresql.conf file.

synchronous_commit='remote_write';  
synchronous_standby_names='*';

The above statement shows synchronous mode.

2) Heartbeat table design.

create table t_keepalive(id int primary key, ts timestamp, pos pg_lsn);  

3) Heartbeat write method.

insert into t_keepalive values (1,now(),pg_current_wal_lsn()) on conflict (id) do update set ts=excluded.ts,pos=excluded.pos returning id,ts,pos;

4) Create a connection function and do not report errors.

create or replace function conn(        
  name,   -- dblink名字        
  text    -- 连接串,URL        
) returns void as $$          
declare          
begin          
  perform dblink_connect($1, $2);         
  return;          
exception when others then          
  return;          
end;          
$$ language plpgsql strict;     

5) Create the heartbeat UDF based on the above logic.

create or replace function keepalive (  
  prio_commit_mode text,    
  tmout interval  
) returns t_keepalive as $$  
declare  
  res1 int;  
  res2 timestamp;  
  res3 pg_lsn;  
  commit_mode text;  
  conn text := format('hostaddr=%s port=%s user=%s dbname=%s application_name=', '127.0.0.1', current_setting('port'), current_user, current_database());  
  conn_altersys text := format('hostaddr=%s port=%s user=%s dbname=%s', '127.0.0.1', current_setting('port'), current_user, current_database());  
  app_prefix_stat text := 'keepalive_dblink';  
begin  
  if prio_commit_mode not in ('sync','async') then  
    raise notice 'prio_commit_mode must be [sync|async]';  
    return null;  
  end if;  
  
  show synchronous_commit into commit_mode;  
  
  create extension IF NOT EXISTS dblink;  
  
  -- 判断当前实例状态  
  if pg_is_in_recovery()   
  
  -- 只读  
  then  
    raise notice 'Current instance in recovery mode.';  
    return null;  
      
  -- 读写  
  else  
  
    -- 判断当前事务模式   
    if commit_mode in ('local','off')  
  
    -- 异步  
    then  
  
      -- 发心跳  
      insert into t_keepalive values (1,now(),pg_current_wal_lsn()) on conflict (id) do update set ts=excluded.ts,pos=excluded.pos returning id,ts,pos into res1,res2,res3;  
  
      -- 优先模式是什么  
      if prio_commit_mode='async'   
  
      -- 异步  
      then  
  
        -- 退出  
        return row(res1,res2,res3)::t_keepalive;  
  
      -- 同步  
      else  
  
        -- 判断是否需要升级  
        perform 1 from pg_stat_replication where state='streaming' limit 1;  
        if found  
  
        -- 升级  
        then  
          perform dblink_exec(conn_altersys, 'alter system set synchronous_commit=remote_write', true);   
          perform pg_reload_conf();   
  
          -- 退出  
          return row(res1,res2,res3)::t_keepalive;  
        end if;  
  
        return row(res1,res2,res3)::t_keepalive;  
      end if;  
  
  
    -- 同步  
    else  
  
      -- 消耗异步消息  
      perform conn(app_prefix_stat,  conn||app_prefix_stat);     
      perform t from dblink_get_result(app_prefix_stat, false) as t(id int, ts timestamp, pos pg_lsn);  
  
      -- 发远程心跳  
      perform dblink_send_query(app_prefix_stat, $_$ insert into t_keepalive values (1,now(),pg_current_wal_lsn()) on conflict (id) do update set ts=excluded.ts,pos=excluded.pos returning id,ts,pos $_$);    
  
      -- 查询是否超时  
      <<ablock>>  
      loop  
        perform pg_sleep(0.2);  
  
        perform 1 from pg_stat_activity where application_name=app_prefix_stat and state='idle' limit 1;  
        -- 未超时  
        if found then  
          select id,ts,pos into res1,res2,res3 from dblink_get_result(app_prefix_stat, false) as t(id int, ts timestamp, pos pg_lsn);  
          raise notice 'no timeout';  
          exit ablock;  
        end if;  
            
        perform 1 from pg_stat_activity where wait_event='SyncRep' and application_name=app_prefix_stat and clock_timestamp()-query_start > tmout limit 1;  
        -- 降级  
        if found then  
          perform dblink_exec(conn_altersys, 'alter system set synchronous_commit=local', true);   
          perform pg_reload_conf();  
          perform pg_cancel_backend(pid) from pg_stat_activity where wait_event='SyncRep';  
          select id,ts,pos into res1,res2,res3 from dblink_get_result(app_prefix_stat, false) as t(id int, ts timestamp, pos pg_lsn);  
          raise notice 'timeout';  
          exit ablock;  
        end if;  
            
        perform pg_sleep(0.2);  
      end loop;  
  
      -- 优先模式是什么  
      if prio_commit_mode='async'   
  
      -- 异步  
      then  
        show synchronous_commit into commit_mode;  
        -- 降级  
        if commit_mode in ('on','remote_write','remote_apply')   
        then  
          perform dblink_exec(conn_altersys, 'alter system set synchronous_commit=local', true);   
          perform pg_reload_conf();  
          perform pg_cancel_backend(pid) from pg_stat_activity where wait_event='SyncRep';  
        end if;  
              
        -- 退出  
        return row(res1,res2,res3)::t_keepalive;  
  
      -- 同步  
      else  
  
        -- 退出  
        return row(res1,res2,res3)::t_keepalive;  
      end if;  
          
    end if;  
  
  end if;  
end;  
$$ language plpgsql strict;

Testing

1) The current mode is synchronous mode.

postgres=# show synchronous_commit ;  
 synchronous_commit   
--------------------  
 remote_write  
(1 row)  

2) The standby database is manually closed. The heartbeat automatically changes the database to the asynchronous mode, and notifies all waiting sessions.

postgres=# select * from keepalive ('sync','5 second');  
NOTICE:  extension "dblink" already exists, skipping  
NOTICE:  timeout  
 id |             ts             |     pos       
----+----------------------------+-------------  
  1 | 2019-01-30 00:48:39.800829 | 23/9501D5F8  
(1 row)  
  
postgres=# show synchronous_commit ;  
 synchronous_commit   
--------------------  
 local  
(1 row)  

3) When the standby database is restored, the heartbeat automatically upgrades the database to the synchronous mode.

postgres=# select * from keepalive ('sync','5 second');  
NOTICE:  extension "dblink" already exists, skipping  
 id |             ts             |     pos       
----+----------------------------+-------------  
  1 | 2019-01-30 00:48:47.329119 | 23/9501D6E8  
(1 row)  
  
postgres=# select * from keepalive ('sync','5 second');  
NOTICE:  extension "dblink" already exists, skipping  
NOTICE:  no timeout  
 id |             ts             |     pos       
----+----------------------------+-------------  
  1 | 2019-01-30 00:49:11.991855 | 23/9501E0C8  
(1 row)  
  
postgres=# show synchronous_commit ;  
 synchronous_commit   
--------------------  
 remote_write  
(1 row)  

Summary

The heartbeat can be prevented from blocking by customizing UDFs. Currently, UDFs allow automatic upgrade and degradation in synchronous or asynchronous mode based on configurations, realizing semi-synchronous function.

UDF input:

1) Priority mode (synchronous and asynchronous)

2) Synchronous wait timeout

Assume that the synchronous mode and synchronous configuration are selected. If the standby database has an exception and the transaction commitment waits for longer than a specified time, the system automatically degrades to asynchronous mode.

When the asynchronous mode and synchronous configuration are preferred, it is automatically degraded to asynchronous mode.

Assume that the synchronous mode and asynchronous configuration are used. If the standby database is restored to the streaming mode, it is automatically upgraded to the synchronous mode.

Technical points of use:

1) Alter system

2) Reload configuration

3) Cancel backend

4) dblink asynchronous calls

By using heartbeat to achieve a semi-synchronous feature, the whole process of switching between synchronous and asynchronous modes is greatly simplified. If the kernel level can be implemented and several parameters are configured, it will be more perfect.

0 0 0
Share on

digoal

277 posts | 24 followers

You may also like

Comments