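This topic describes how to combine PostgreSQL on an ECS instance with pgpool to implement read/write splitting. You can also use an RDS PostgreSQL instance together with read-only instances to simplify the procedure.
Background information
When pgpool is not used to provide database high availability, pgpool itself is stateless, adds very little performance overhead, and can be scaled out horizontally. Paired with an RDS PostgreSQL instance, which has its own high-availability architecture, it offers a quick and convenient way to implement read/write splitting.
Deployment environment
If you have already purchased a PostgreSQL 10 instance with high-performance local disks and its read-only instances (for details, see Quickly create an RDS PostgreSQL instance and Create a PostgreSQL read-only instance), you only need to install pgpool and can then skip to Configure pgpool for the remaining steps.
Read-only instances for PostgreSQL with cloud disks are not yet available.
Test environment:
The ECS instance has 16 CPU cores, 64 GB of memory, and a 1.8 TB SSD cloud disk.
The ECS instance runs CentOS 7.7 x64.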
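As a rough cross-check for the 64 GB test instance, the sizing rules quoted in the kernel-parameter comments in the steps below (kernel.shmall at about 80% of memory in pages, kernel.shmmax at about half of memory in bytes, and 1 GB of vm.min_free_kbytes per 32 GB of memory) can be worked out with a short script. This is only a sketch: the function name `suggest_kernel_mem` and the 4096-byte page size are assumptions made for illustration, and the results are rule-of-thumb starting points rather than values this procedure prescribes.

```shell
#!/bin/sh
# Hypothetical helper: derive rule-of-thumb kernel memory settings.
# $1 = total memory in GiB
# $2 = page size in bytes (assumed 4096; check with `getconf PAGE_SIZE`)
suggest_kernel_mem() {
    mem_gib=$1
    page_size=${2:-4096}
    mem_bytes=$((mem_gib * 1024 * 1024 * 1024))

    # kernel.shmall: about 80% of memory, expressed in pages.
    shmall=$((mem_bytes / page_size * 80 / 100))

    # kernel.shmmax: about half of memory, expressed in bytes.
    shmmax=$((mem_bytes / 2))

    # vm.min_free_kbytes: 1 GiB (1048576 kB) per 32 GiB of memory.
    min_free_kb=$((mem_gib * 1048576 / 32))

    echo "kernel.shmall = $shmall"
    echo "kernel.shmmax = $shmmax"
    echo "vm.min_free_kbytes = $min_free_kb"
}

# The 64 GB test instance described above.
suggest_kernel_mem 64
```

The sample sysctl.conf in the steps uses larger values than these rules give for 64 GB, so it may be worth comparing the two before applying the file to a machine of a different size.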
Perform the following steps.
Modify the sysctl.conf configuration file:
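    sudo vi /etc/sysctl.conf

    # add by digoal.zhou
    fs.aio-max-nr = 1048576
    fs.file-max = 76724600

    # Optional: kernel.core_pattern = /data01/corefiles/core_%e_%u_%t_%s.%p
    # Create /data01/corefiles in advance with permissions 777. If it is a symbolic link, set the target directory to 777.

    kernel.sem = 4096 2147483647 2147483646 512000
    # Semaphores. View them with ipcs -l or ipcs -u. Processes are grouped in sets of 16, and each set needs 17 semaphores.

    kernel.shmall = 107374182
    # Limit on the combined size of all shared memory segments (80% of memory is recommended), in pages.
    kernel.shmmax = 274877906944
    # Maximum size of a single shared memory segment (half of memory is recommended), in bytes. Versions later than 9.2 use far less shared memory.
    kernel.shmmni = 819200
    # Maximum number of shared memory segments. Each PostgreSQL database cluster needs at least two.

    net.core.netdev_max_backlog = 10000
    net.core.rmem_default = 262144
    # The default setting of the socket receive buffer in bytes.
    net.core.rmem_max = 4194304
    # The maximum receive socket buffer size in bytes.
    net.core.wmem_default = 262144
    # The default setting (in bytes) of the socket send buffer.
    net.core.wmem_max = 4194304
    # The maximum send socket buffer size in bytes.
    net.core.somaxconn = 4096
    net.ipv4.tcp_max_syn_backlog = 4096
    net.ipv4.tcp_keepalive_intvl = 20
    net.ipv4.tcp_keepalive_probes = 3
    net.ipv4.tcp_keepalive_time = 60
    net.ipv4.tcp_mem = 8388608 12582912 16777216
    net.ipv4.tcp_fin_timeout = 5
    net.ipv4.tcp_synack_retries = 2
    net.ipv4.tcp_syncookies = 1
    # Enable SYN cookies. When the SYN wait queue overflows, cookies are used to handle requests, which protects against small-scale SYN attacks.
    net.ipv4.tcp_timestamps = 1
    # Reduce TIME_WAIT.
    net.ipv4.tcp_tw_recycle = 0
    # If set to 1, fast recycling of TIME-WAIT sockets is enabled, but this may cause connection failures in NAT environments. Disabling it on the server is recommended.
    net.ipv4.tcp_tw_reuse = 1
    # Enable reuse. Allow TIME-WAIT sockets to be reused for new TCP connections.
    net.ipv4.tcp_max_tw_buckets = 262144
    net.ipv4.tcp_rmem = 8192 87380 16777216
    net.ipv4.tcp_wmem = 8192 65536 16777216

    net.nf_conntrack_max = 1200000
    net.netfilter.nf_conntrack_max = 1200000

    vm.dirty_background_bytes = 409600000
    # When dirty pages reach this amount, the background flush process (pdflush or its equivalent) writes dirty pages older than (dirty_expire_centisecs/100) seconds to disk.
    # The default is 10% of memory. On machines with large memory, specifying an absolute number of bytes is recommended.
    vm.dirty_expire_centisecs = 3000
    # Dirty pages older than this value are flushed to disk. 3000 means 30 seconds.
    vm.dirty_ratio = 95
    # If background flushing is too slow and dirty pages exceed 95% of memory, user processes that write to disk (for example, through fsync or fdatasync) must flush dirty pages themselves.
    # A high value effectively keeps user processes from flushing dirty pages, which is very useful when one machine hosts multiple instances and cgroups limit the IOPS of each instance.
    vm.dirty_writeback_centisecs = 100
    # Wake-up interval of the background flush process (pdflush or its equivalent). 100 means 1 second.

    vm.swappiness = 0
    # Do not use swap.
    vm.mmap_min_addr = 65536
    vm.overcommit_memory = 0
    # Allow a small amount of overcommit when allocating memory. If set to 1, the kernel assumes that enough memory is always available. Test environments with little memory can use 1.
    vm.overcommit_ratio = 90
    # When overcommit_memory = 2, this value is used to calculate how much memory may be allocated.
    vm.zone_reclaim_mode = 0
    # Disable NUMA, or disable it in vmlinux.
    net.ipv4.ip_local_port_range = 40000 65535
    # Range of local TCP and UDP port numbers that are assigned automatically.
    fs.nr_open=20480000
    # Maximum number of file handles that a single process can open.

    # Pay attention to the following parameters.
    #vm.extra_free_kbytes = 4096000
    # Do not set such a large value on machines with little memory, or the system may fail to boot.
    #vm.min_free_kbytes = 6291456
    # For vm.min_free_kbytes, 1 GB per 32 GB of memory is recommended.
    # On machines with little memory, setting the two values above is not recommended.
    # vm.nr_hugepages = 66536
    # Huge pages are recommended when shared buffers exceed 64 GB. For the page size, see Hugepagesize in /proc/meminfo.
    #vm.lowmem_reserve_ratio = 1 1 1
    # Recommended when memory exceeds 64 GB. Otherwise, the default value 256 256 32 is recommended.

Modify the limits.conf configuration file: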
    sudo vi /etc/security/limits.conf

    * soft nofile 1024000
    * hard nofile 1024000
    * soft nproc unlimited
    * hard nproc unlimited
    * soft core unlimited
    * hard core unlimited
    * soft memlock unlimited
    * hard memlock unlimited
    # Comment out the other entries.
    # Also comment out the entries in /etc/security/limits.d/20-nproc.conf.

Disable transparent huge pages, configure huge pages, and start PostgreSQL automatically at boot:
    sudo chmod +x /etc/rc.d/rc.local
    sudo vi /etc/rc.local

    # Disable transparent huge pages.
    if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
       echo never > /sys/kernel/mm/transparent_hugepage/enabled
    fi
    # Two instances, each with 16 GB of shared buffers.
    #sysctl -w vm.nr_hugepages=17000
    # Start the two instances automatically.
    su - postgres -c "pg_ctl start -D /data01/pg12_3389/pg_data"
    su - postgres -c "pg_ctl start -D /data01/pg12_8002/pg_data"

Create the file system:
Warning: Perform this step only on a new disk. Confirm which disk is the newly attached one (for example, vdb rather than vda) before you proceed. Operating on the wrong disk may erase the data on it.

    parted -a optimal -s /dev/vdb mklabel gpt mkpart primary 1MiB 100%FREE
    mkfs.ext4 /dev/vdb1 -m 0 -O extent,uninit_bg -E lazy_itable_init=1 -b 4096 -T largefile -L vdb1

    vi /etc/fstab
    LABEL=vdb1 /data01 ext4 defaults,noatime,nodiratime,nodelalloc,barrier=0,data=writeback 0 0

    mkdir /data01
    mount -a

Start irqbalance:
    sudo systemctl status irqbalance
    sudo systemctl enable irqbalance
    sudo systemctl start irqbalance
    sudo systemctl status irqbalance

Install PostgreSQL 12 and pgpool:
    sudo yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
    sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
    sudo yum search all postgresql
    sudo yum search all pgpool
    sudo yum install -y postgresql12*
    sudo yum install -y pgpool-II-12-extensions

Initialize the database data directory:
    mkdir /data01/pg12_3389
    sudo chown postgres:postgres /data01/pg12_3389

Configure the environment variables of the postgres user:
    su - postgres
    vi .bash_profile

    # Append the following:
    export PS1="$USER@`/bin/hostname -s`-> "
    export PGPORT=3389
    export PGDATA=/data01/pg12_$PGPORT/pg_data
    export LANG=en_US.utf8
    export PGHOME=/usr/pgsql-12
    export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH
    export DATE=`date +"%Y%m%d%H%M"`
    export PATH=$PGHOME/bin:$PATH:.
    export MANPATH=$PGHOME/share/man:$MANPATH
    export PGHOST=$PGDATA
    export PGUSER=postgres
    export PGDATABASE=db1
    alias rm='rm -i'
    alias ll='ls -lh'
    unalias vi

Initialize the primary database:
    initdb -D $PGDATA -U postgres -E UTF8 --lc-collate=C --lc-ctype=en_US.utf8

Modify the postgresql.conf configuration file:
    listen_addresses = '0.0.0.0'
    port = 3389
    max_connections = 1500
    superuser_reserved_connections = 13
    unix_socket_directories = '., /var/run/postgresql, /tmp'
    tcp_keepalives_idle = 60
    tcp_keepalives_interval = 10
    tcp_keepalives_count = 10
    shared_buffers = 16GB
    huge_pages = on
    work_mem = 8MB
    maintenance_work_mem = 1GB
    dynamic_shared_memory_type = posix
    vacuum_cost_delay = 0
    bgwriter_delay = 10ms
    bgwriter_lru_maxpages = 1000
    bgwriter_lru_multiplier = 10.0
    bgwriter_flush_after = 512kB
    effective_io_concurrency = 0
    max_worker_processes = 128
    max_parallel_maintenance_workers = 3
    max_parallel_workers_per_gather = 4
    parallel_leader_participation = off
    max_parallel_workers = 8
    backend_flush_after = 256
    wal_level = replica
    synchronous_commit = off
    full_page_writes = on
    wal_compression = on
    wal_buffers = 16MB
    wal_writer_delay = 10ms
    wal_writer_flush_after = 1MB
    checkpoint_timeout = 15min
    max_wal_size = 64GB
    min_wal_size = 8GB
    checkpoint_completion_target = 0.2
    checkpoint_flush_after = 256kB
    random_page_cost = 1.1
    effective_cache_size = 48GB
    log_destination = 'csvlog'
    logging_collector = on
    log_directory = 'log'
    log_filename = 'postgresql-%a.log'
    log_truncate_on_rotation = on
    log_rotation_age = 1d
    log_rotation_size = 0
    log_min_duration_statement = 1s
    log_checkpoints = on
    log_connections = on
    log_disconnections = on
    log_line_prefix = '%m [%p] '
    log_statement = 'ddl'
    log_timezone = 'Asia/Shanghai'
    autovacuum = on
    log_autovacuum_min_duration = 0
    autovacuum_vacuum_scale_factor = 0.1
    autovacuum_analyze_scale_factor = 0.05
    autovacuum_freeze_max_age = 800000000
    autovacuum_multixact_freeze_max_age = 900000000
    autovacuum_vacuum_cost_delay = 0
    vacuum_freeze_table_age = 750000000
    vacuum_multixact_freeze_table_age = 750000000
    datestyle = 'iso, mdy'
    timezone = 'Asia/Shanghai'
    lc_messages = 'en_US.utf8'
    lc_monetary = 'en_US.utf8'
    lc_numeric = 'en_US.utf8'
    lc_time = 'en_US.utf8'
    default_text_search_config = 'pg_catalog.english'

Modify the pg_hba.conf configuration file:
Note: pgpool-II and the database server run on the same ECS instance, so the entry for 127.0.0.1 is set to md5, which requires a password to log in.

    # "local" is for Unix domain socket connections only
    local   all             all                                     trust
    # IPv4 local connections:
    host    all             all             127.0.0.1/32            md5
    # IPv6 local connections:
    host    all             all             ::1/128                 trust
    # Allow replication connections from localhost, by a user with the
    # replication privilege.
    local   replication     all                                     trust
    host    replication     all             127.0.0.1/32            trust
    host    replication     all             ::1/128                 trust
    host    db123           digoal          0.0.0.0/0               md5

Create a streaming replication user. Example:
    create role rep123 login replication encrypted password 'xxxxxxx';
    CREATE ROLE

Create a business user. Example:
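    create role digoal login encrypted password 'xxxxxxx';
    CREATE ROLE
    create database db123 owner digoal;
    CREATE DATABASE

Create the user that pgpool uses for database health checks (heartbeat) and for checking the WAL replay lag of read-only nodes. This user only needs to be able to log in to the postgres database or another specified database, and it is referenced by the corresponding pgpool parameters. Example:

    create role nobody login encrypted password 'xxxxxxx';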
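With md5 authentication, PostgreSQL stores a role's password as the string md5 followed by the MD5 digest of the password concatenated with the role name. The sketch below reproduces that computation, which can help when checking entries in a password file such as the pool_passwd file named in pgpool.conf. The function name `pg_md5_hash` is hypothetical, the `username:hash` line layout is an assumption about that file, and the script relies on the `md5sum` utility being installed.

```shell
#!/bin/sh
# Hypothetical helper: compute a PostgreSQL-style md5 password hash.
# $1 = role name, $2 = clear-text password
pg_md5_hash() {
    user=$1
    password=$2
    # The digest input is the password immediately followed by the role name.
    digest=$(printf '%s%s' "$password" "$user" | md5sum | cut -d ' ' -f 1)
    printf 'md5%s\n' "$digest"
}

# Print a line in the assumed "username:hash" layout for the health check user.
# 'xxxxxxx' is the placeholder password used in the examples above.
printf '%s:%s\n' nobody "$(pg_md5_hash nobody 'xxxxxxx')"
```

Because the role name is part of the digest input, the same password yields a different hash for each role, so a hash cannot be copied from one user to another.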
Create the standby database
To simplify the test procedure, the standby database is created on the same ECS instance.
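Because the primary and the standby run on the same ECS instance, and the sample postgresql.conf sets shared_buffers = 16GB with huge_pages = on, any huge page reservation has to cover both instances. The arithmetic can be sketched as follows. The function name `hugepages_needed` is hypothetical, and the 2048 kB huge page size is an assumption (the actual value is the Hugepagesize line in /proc/meminfo).

```shell
#!/bin/sh
# Hypothetical helper: number of huge pages needed for shared buffers.
# $1 = number of PostgreSQL instances on the host
# $2 = shared_buffers per instance, in GiB
# $3 = huge page size in kB (assumed 2048 if omitted)
hugepages_needed() {
    instances=$1
    shared_buffers_gib=$2
    hugepage_kb=${3:-2048}
    total_kb=$((instances * shared_buffers_gib * 1024 * 1024))
    # Round up so that a partial page still gets a full huge page.
    echo $(( (total_kb + hugepage_kb - 1) / hugepage_kb ))
}

# Two instances with 16 GiB of shared buffers each.
hugepages_needed 2 16
```

The commented-out `vm.nr_hugepages=17000` in rc.local is slightly above this result, which leaves some headroom for the shared memory that PostgreSQL allocates in addition to shared buffers.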
Use pg_basebackup to create the standby database online:
    pg_basebackup -D /data01/pg12_8002/pg_data -F p --checkpoint=fast -P -h 127.0.0.1 -p 3389 -U rep123

Modify the postgresql.conf configuration file of the standby database:
    cd /data01/pg12_8002/pg_data
    vi postgresql.conf

    # Compared with the primary configuration, change the following:
    port = 8002
    primary_conninfo = 'hostaddr=127.0.0.1 port=3389 user=rep123'  # No password is needed because the primary allows trust access.
    hot_standby = on
    wal_receiver_status_interval = 1s
    wal_receiver_timeout = 10s
    recovery_target_timeline = 'latest'

Create the standby.signal marker file for the standby database:
    cd /data01/pg12_8002/pg_data
    touch standby.signal

Check whether replication between the primary and the standby is working properly:
    db1=# select * from pg_stat_replication ;
    -[ RECORD 1 ]----+------------------------------
    pid              | 21065
    usesysid         | 10
    usename          | postgres
    application_name | walreceiver
    client_addr      | 127.0.0.1
    client_hostname  |
    client_port      | 47064
    backend_start    | 2020-02-29 00:26:28.485427+08
    backend_xmin     |
    state            | streaming
    sent_lsn         | 0/52000060
    write_lsn        | 0/52000060
    flush_lsn        | 0/52000060
    replay_lsn       | 0/52000060
    write_lag        |
    flush_lag        |
    replay_lag       |
    sync_priority    | 0
    sync_state       | async
    reply_time       | 2020-02-29 01:32:40.635183+08
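The LSN columns in pg_stat_replication are printed as two hexadecimal numbers separated by a slash: the first is the high 32 bits and the second is the low 32 bits of a 64-bit byte position. The sketch below turns the gap between sent_lsn and replay_lsn into bytes so that it can be compared against a threshold such as the delay_threshold value in pgpool.conf. The function names `lsn_to_bytes` and `lsn_lag_bytes` are hypothetical, and the 512000-byte threshold is simply the value used in the sample pgpool.conf.

```shell
#!/bin/sh
# Hypothetical helper: convert an LSN such as 0/52000060 to a byte position.
lsn_to_bytes() {
    lsn=$1
    hi=${lsn%%/*}   # hexadecimal digits before the slash (high 32 bits)
    lo=${lsn##*/}   # hexadecimal digits after the slash (low 32 bits)
    echo $(( 0x$hi * 4294967296 + 0x$lo ))
}

# Hypothetical helper: bytes sent by the primary but not yet replayed.
# $1 = sent_lsn, $2 = replay_lsn
lsn_lag_bytes() {
    sent=$(lsn_to_bytes "$1")
    replayed=$(lsn_to_bytes "$2")
    echo $(( sent - replayed ))
}

# Values taken from the sample output above.
lag=$(lsn_lag_bytes 0/52000060 0/52000060)
threshold=512000
if [ "$lag" -gt "$threshold" ]; then
    echo "standby lag of $lag bytes exceeds $threshold"
else
    echo "standby lag of $lag bytes is within $threshold"
fi
```

Inside the database, the built-in function `pg_wal_lsn_diff(sent_lsn, replay_lsn)` returns the same difference without any manual conversion.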
Configure pgpool
Query the installation location of pgpool:
    rpm -qa|grep pgpool
    pgpool-II-12-extensions-4.1.1-1.rhel7.x86_64
    pgpool-II-12-4.1.1-1.rhel7.x86_64

    rpm -ql pgpool-II-12-4.1.1

Modify the pgpool.conf configuration file:
cd /etc/pgpool-II-12/ cp pgpool.conf.sample-stream pgpool.conf vi pgpool.conf # ---------------------------- # pgPool-II configuration file # ---------------------------- # # This file consists of lines of the form: # # name = value # # Whitespace may be used. Comments are introduced with "#" anywhere on a line. # The complete list of parameter names and allowed values can be found in the # pgPool-II documentation. # # This file is read on server startup and when the server receives a SIGHUP # signal. If you edit the file on a running system, you have to SIGHUP the # server for the changes to take effect, or use "pgpool reload". Some # parameters, which are marked below, require a server shutdown and restart to # take effect. # #------------------------------------------------------------------------------ # CONNECTIONS #------------------------------------------------------------------------------ # - pgpool Connection Settings - listen_addresses = '0.0.0.0' # Host name or IP address to listen on: # '*' for all, '' for no TCP/IP connections # (change requires restart) port = 8001 # Port number # (change requires restart) socket_dir = '/tmp' # Unix domain socket path # The Debian package defaults to # /var/run/postgresql # (change requires restart) reserved_connections = 0 # Number of reserved connections. # Pgpool-II does not accept connections if over # num_init_chidlren - reserved_connections. # - pgpool Communication Manager Connection Settings - pcp_listen_addresses = '' # Host name or IP address for pcp process to listen on: # '*' for all, '' for no TCP/IP connections # (change requires restart) pcp_port = 9898 # Port number for pcp # (change requires restart) pcp_socket_dir = '/tmp' # Unix domain socket path for pcp # The Debian package defaults to # /var/run/postgresql # (change requires restart) listen_backlog_multiplier = 2 # Set the backlog parameter of listen(2) to # num_init_children * listen_backlog_multiplier. 
# (change requires restart) serialize_accept = off # whether to serialize accept() call to avoid thundering herd problem # (change requires restart) # - Backend Connection Settings - backend_hostname0 = '127.0.0.1' # Host name or IP address to connect to for backend 0 backend_port0 = 3389 # Port number for backend 0 backend_weight0 = 1 # Weight for backend 0 (only in load balancing mode) backend_data_directory0 = '/data01/pg12_3389/pg_data' # Data directory for backend 0 backend_flag0 = 'ALWAYS_MASTER' # Controls various backend behavior # ALLOW_TO_FAILOVER, DISALLOW_TO_FAILOVER # or ALWAYS_MASTER backend_application_name0 = 'server0' # walsender's application_name, used for "show pool_nodes" command backend_hostname1 = '127.0.0.1' backend_port1 = 8002 backend_weight1 = 1 backend_data_directory1 = '/data01/pg12_8002/pg_data' backend_flag1 = 'DISALLOW_TO_FAILOVER' backend_application_name1 = 'server1' # - Authentication - enable_pool_hba = on # Use pool_hba.conf for client authentication pool_passwd = 'pool_passwd' # File name of pool_passwd for md5 authentication. # "" disables pool_passwd. # (change requires restart) authentication_timeout = 60 # Delay in seconds to complete client authentication # 0 means no timeout. 
allow_clear_text_frontend_auth = off # Allow Pgpool-II to use clear text password authentication # with clients, when pool_passwd does not # contain the user password # - SSL Connections - ssl = off # Enable SSL support # (change requires restart) #ssl_key = './server.key' # Path to the SSL private key file # (change requires restart) #ssl_cert = './server.cert' # Path to the SSL public certificate file # (change requires restart) #ssl_ca_cert = '' # Path to a single PEM format file # containing CA root certificate(s) # (change requires restart) #ssl_ca_cert_dir = '' # Directory containing CA root certificate(s) # (change requires restart) ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # Allowed SSL ciphers # (change requires restart) ssl_prefer_server_ciphers = off # Use server's SSL cipher preferences, # rather than the client's # (change requires restart) ssl_ecdh_curve = 'prime256v1' # Name of the curve to use in ECDH key exchange ssl_dh_params_file = '' # Name of the file containing Diffie-Hellman parameters used # for so-called ephemeral DH family of SSL cipher. #------------------------------------------------------------------------------ # POOLS #------------------------------------------------------------------------------ # - Concurrent session and pool size - num_init_children = 128 # Number of concurrent sessions allowed # (change requires restart) max_pool = 4 # Number of connection pool caches per connection # (change requires restart) # - Life time - child_life_time = 300 # Pool exits after being idle for this many seconds child_max_connections = 0 # Pool exits after receiving that many connections # 0 means no exit connection_life_time = 0 # Connection to backend closes after being idle for this many seconds # 0 means no close client_idle_limit = 0 # Client is disconnected after being idle for that many seconds # (even inside an explicit transactions!) 
# 0 means no disconnection #------------------------------------------------------------------------------ # LOGS #------------------------------------------------------------------------------ # - Where to log - log_destination = 'syslog' # Where to log # Valid values are combinations of stderr, # and syslog. Default to stderr. # - What to log - log_line_prefix = '%t: pid %p: ' # printf-style string to output at beginning of each log line. log_connections = on # Log connections log_hostname = off # Hostname will be shown in ps status # and in logs if connections are logged log_statement = off # Log all statements log_per_node_statement = off # Log all statements # with node and backend informations log_client_messages = off # Log any client messages log_standby_delay = 'if_over_threshold' # Log standby delay # Valid values are combinations of always, # if_over_threshold, none # - Syslog specific - syslog_facility = 'LOCAL0' # Syslog local facility. Default to LOCAL0 syslog_ident = 'pgpool' # Syslog program identification string # Default to 'pgpool' # - Debug - #log_error_verbosity = default # terse, default, or verbose messages #client_min_messages = notice # values in order of decreasing detail: # debug5 # debug4 # debug3 # debug2 # debug1 # log # notice # warning # error #log_min_messages = warning # values in order of decreasing detail: # debug5 # debug4 # debug3 # debug2 # debug1 # info # notice # warning # error # log # fatal # panic #------------------------------------------------------------------------------ # FILE LOCATIONS #------------------------------------------------------------------------------ pid_file_name = '/var/run/pgpool-II-12/pgpool.pid' # PID file name # Can be specified as relative to the" # location of pgpool.conf file or # as an absolute path # (change requires restart) logdir = '/tmp' # Directory of pgPool status file # (change requires restart) #------------------------------------------------------------------------------ # 
CONNECTION POOLING #------------------------------------------------------------------------------ connection_cache = on # Activate connection pools # (change requires restart) # Semicolon separated list of queries # to be issued at the end of a session # The default is for 8.3 and later reset_query_list = 'ABORT; DISCARD ALL' # The following one is for 8.2 and before #reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT' #------------------------------------------------------------------------------ # REPLICATION MODE #------------------------------------------------------------------------------ replication_mode = off # Activate replication mode # (change requires restart) replicate_select = off # Replicate SELECT statements # when in replication mode # replicate_select is higher priority than # load_balance_mode. insert_lock = off # Automatically locks a dummy row or a table # with INSERT statements to keep SERIAL data # consistency # Without SERIAL, no lock will be issued lobj_lock_table = '' # When rewriting lo_create command in # replication mode, specify table name to # lock # - Degenerate handling - replication_stop_on_mismatch = off # On disagreement with the packet kind # sent from backend, degenerate the node # which is most likely "minority" # If off, just force to exit this session failover_if_affected_tuples_mismatch = off # On disagreement with the number of affected # tuples in UPDATE/DELETE queries, then # degenerate the node which is most likely # "minority". 
# If off, just abort the transaction to # keep the consistency #------------------------------------------------------------------------------ # LOAD BALANCING MODE #------------------------------------------------------------------------------ load_balance_mode = on # Activate load balancing mode # (change requires restart) ignore_leading_white_space = on # Ignore leading white spaces of each query white_function_list = '' # Comma separated list of function names # that don't write to database # Regexp are accepted black_function_list = 'currval,lastval,nextval,setval' # Comma separated list of function names # that write to database # Regexp are accepted black_query_pattern_list = '' # Semicolon separated list of query patterns # that should be sent to primary node # Regexp are accepted # valid for streaming replication mode only. database_redirect_preference_list = '' # comma separated list of pairs of database and node id. # example: postgres:primary,mydb[0-4]:1,mydb[5-9]:2' # valid for streaming replication mode only. app_name_redirect_preference_list = '' # comma separated list of pairs of app name and node id. # example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby' # valid for streaming replication mode only. allow_sql_comments = off # if on, ignore SQL comments when judging if load balance or # query cache is possible. # If off, SQL comments effectively prevent the judgment # (pre 3.4 behavior). disable_load_balance_on_write = 'transaction' # Load balance behavior when write query is issued # in an explicit transaction. # Note that any query not in an explicit transaction # is not affected by the parameter. # 'transaction' (the default): if a write query is issued, # subsequent read queries will not be load balanced # until the transaction ends. # 'trans_transaction': if a write query is issued, # subsequent read queries in an explicit transaction # will not be load balanced until the session ends. 
# 'always': if a write query is issued, read queries will # not be load balanced until the session ends. statement_level_load_balance = off # Enables statement level load balancing #------------------------------------------------------------------------------ # MASTER/SLAVE MODE #------------------------------------------------------------------------------ master_slave_mode = on # Activate master/slave mode # (change requires restart) master_slave_sub_mode = 'stream' # Master/slave sub mode # Valid values are combinations stream, slony # or logical. Default is stream. # (change requires restart) # - Streaming - sr_check_period = 3 # Streaming replication check period # Disabled (0) by default sr_check_user = 'nobody' # Streaming replication check user # This is necessary even if you disable streaming # replication delay check by sr_check_period = 0 sr_check_password = '' # Password for streaming replication check user # Leaving it empty will make Pgpool-II to first look for the # Password in pool_passwd file before using the empty password sr_check_database = 'postgres' # Database name for streaming replication check delay_threshold = 512000 # Threshold before not dispatching query to standby node # Unit is in bytes # Disabled (0) by default # - Special commands - follow_master_command = '' # Executes this command after master failover # Special values: # %d = failed node id # %h = failed node host name # %p = failed node port number # %D = failed node database cluster path # %m = new master node id # %H = new master node hostname # %M = old master node id # %P = old primary node id # %r = new master port number # %R = new master database cluster path # %N = old primary node hostname # %S = old primary node port number # %% = '%' character #------------------------------------------------------------------------------ # HEALTH CHECK GLOBAL PARAMETERS #------------------------------------------------------------------------------ health_check_period = 5 # Health 
check period # Disabled (0) by default health_check_timeout = 10 # Health check timeout # 0 means no timeout health_check_user = 'nobody' # Health check user health_check_password = '' # Password for health check user # Leaving it empty will make Pgpool-II to first look for the # Password in pool_passwd file before using the empty password health_check_database = '' # Database name for health check. If '', tries 'postgres' first, health_check_max_retries = 60 # Maximum number of times to retry a failed health check before giving up. health_check_retry_delay = 1 # Amount of time to wait (in seconds) between retries. connect_timeout = 10000 # Timeout value in milliseconds before giving up to connect to backend. # Default is 10000 ms (10 second). Flaky network user may want to increase # the value. 0 means no timeout. # Note that this value is not only used for health check, # but also for ordinary connection to backend. #------------------------------------------------------------------------------ # HEALTH CHECK PER NODE PARAMETERS (OPTIONAL) #------------------------------------------------------------------------------ #health_check_period0 = 0 #health_check_timeout0 = 20 #health_check_user0 = 'nobody' #health_check_password0 = '' #health_check_database0 = '' #health_check_max_retries0 = 0 #health_check_retry_delay0 = 1 #connect_timeout0 = 10000 #------------------------------------------------------------------------------ # FAILOVER AND FAILBACK #------------------------------------------------------------------------------ failover_command = '' # Executes this command at failover # Special values: # %d = failed node id # %h = failed node host name # %p = failed node port number # %D = failed node database cluster path # %m = new master node id # %H = new master node hostname # %M = old master node id # %P = old primary node id # %r = new master port number # %R = new master database cluster path # %N = old primary node hostname # %S = old primary node port 
number # %% = '%' character failback_command = '' # Executes this command at failback. # Special values: # %d = failed node id # %h = failed node host name # %p = failed node port number # %D = failed node database cluster path # %m = new master node id # %H = new master node hostname # %M = old master node id # %P = old primary node id # %r = new master port number # %R = new master database cluster path # %N = old primary node hostname # %S = old primary node port number # %% = '%' character failover_on_backend_error = off # Initiates failover when reading/writing to the # backend communication socket fails # If set to off, pgpool will report an # error and disconnect the session. detach_false_primary = off # Detach false primary if on. Only # valid in streaming replication # mode and with PostgreSQL 9.6 or # after. search_primary_node_timeout = 300 # Timeout in seconds to search for the # primary node when a failover occurs. # 0 means no timeout, keep searching # for a primary node forever. #------------------------------------------------------------------------------ # ONLINE RECOVERY #------------------------------------------------------------------------------ recovery_user = 'nobody' # Online recovery user recovery_password = '' # Online recovery password # Leaving it empty will make Pgpool-II to first look for the # Password in pool_passwd file before using the empty password recovery_1st_stage_command = '' # Executes a command in first stage recovery_2nd_stage_command = '' # Executes a command in second stage recovery_timeout = 90 # Timeout in seconds to wait for the # recovering node's postmaster to start up # 0 means no wait client_idle_limit_in_recovery = 0 # Client is disconnected after being idle # for that many seconds in the second stage # of online recovery # 0 means no disconnection # -1 means immediate disconnection auto_failback = off # Detached backend node reattach automatically # if replication_state is 'streaming'. 
auto_failback_interval = 60 # Min interval of executing auto_failback in # seconds. #------------------------------------------------------------------------------ # WATCHDOG #------------------------------------------------------------------------------ # - Enabling - use_watchdog = off # Activates watchdog # (change requires restart) # -Connection to up stream servers - trusted_servers = '' # trusted server list which are used # to confirm network connection # (hostA,hostB,hostC,...) # (change requires restart) ping_path = '/bin' # ping command path # (change requires restart) # - Watchdog communication Settings - wd_hostname = '' # Host name or IP address of this watchdog # (change requires restart) wd_port = 9000 # port number for watchdog service # (change requires restart) wd_priority = 1 # priority of this watchdog in leader election # (change requires restart) wd_authkey = '' # Authentication key for watchdog communication # (change requires restart) wd_ipc_socket_dir = '/tmp' # Unix domain socket path for watchdog IPC socket # The Debian package defaults to # /var/run/postgresql # (change requires restart) # - Virtual IP control Setting - delegate_IP = '' # delegate IP address # If this is empty, virtual IP never bring up. # (change requires restart) if_cmd_path = '/sbin' # path to the directory where if_up/down_cmd exists # If if_up/down_cmd starts with "/", if_cmd_path will be ignored. # (change requires restart) if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev eth0 label eth0:0' # startup delegate IP command # (change requires restart) if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev eth0' # shutdown delegate IP command # (change requires restart) arping_path = '/usr/sbin' # arping command path # If arping_cmd starts with "/", if_cmd_path will be ignored. 
# (change requires restart) arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I eth0' # arping command # (change requires restart) # - Behaviour on escalation Setting - clear_memqcache_on_escalation = on # Clear all the query cache on shared memory # when standby pgpool escalate to active pgpool # (= virtual IP holder). # This should be off if client connects to pgpool # not using virtual IP. # (change requires restart) wd_escalation_command = '' # Executes this command at escalation on new active pgpool. # (change requires restart) wd_de_escalation_command = '' # Executes this command when master pgpool resigns from being master. # (change requires restart) # - Watchdog consensus settings for failover - failover_when_quorum_exists = on # Only perform backend node failover # when the watchdog cluster holds the quorum # (change requires restart) failover_require_consensus = on # Perform failover when majority of Pgpool-II nodes # agrees on the backend node status change # (change requires restart) allow_multiple_failover_requests_from_node = off # A Pgpool-II node can cast multiple votes # for building the consensus on failover # (change requires restart) enable_consensus_with_half_votes = off # apply majority rule for consensus and quorum computation # at 50% of votes in a cluster with even number of nodes. # when enabled the existence of quorum and consensus # on failover is resolved after receiving half of the # total votes in the cluster, otherwise both these # decisions require at least one more vote than # half of the total votes. # (change requires restart) # - Lifecheck Setting - # -- common -- wd_monitoring_interfaces_list = '' # Comma separated list of interfaces names to monitor. 
# if any interface from the list is active the watchdog will # consider the network is fine # 'any' to enable monitoring on all interfaces except loopback # '' to disable monitoring # (change requires restart) wd_lifecheck_method = 'heartbeat' # Method of watchdog lifecheck ('heartbeat' or 'query' or 'external') # (change requires restart) wd_interval = 10 # lifecheck interval (sec) > 0 # (change requires restart) # -- heartbeat mode -- wd_heartbeat_port = 9694 # Port number for receiving heartbeat signal # (change requires restart) wd_heartbeat_keepalive = 2 # Interval time of sending heartbeat signal (sec) # (change requires restart) wd_heartbeat_deadtime = 30 # Deadtime interval for heartbeat signal (sec) # (change requires restart) heartbeat_destination0 = 'host0_ip1' # Host name or IP address of destination 0 # for sending heartbeat signal. # (change requires restart) heartbeat_destination_port0 = 9694 # Port number of destination 0 for sending # heartbeat signal. Usually this is the # same as wd_heartbeat_port. # (change requires restart) heartbeat_device0 = '' # Name of NIC device (such like 'eth0') # used for sending/receiving heartbeat # signal to/from destination 0. # This works only when this is not empty # and pgpool has root privilege. 
# (change requires restart) #heartbeat_destination1 = 'host0_ip2' #heartbeat_destination_port1 = 9694 #heartbeat_device1 = '' # -- query mode -- wd_life_point = 3 # lifecheck retry times # (change requires restart) wd_lifecheck_query = 'SELECT 1' # lifecheck query to pgpool from watchdog # (change requires restart) wd_lifecheck_dbname = 'template1' # Database name connected for lifecheck # (change requires restart) wd_lifecheck_user = 'nobody' # watchdog user monitoring pgpools in lifecheck # (change requires restart) wd_lifecheck_password = '' # Password for watchdog user in lifecheck # Leaving it empty will make Pgpool-II to first look for the # Password in pool_passwd file before using the empty password # (change requires restart) # - Other pgpool Connection Settings - #other_pgpool_hostname0 = 'host0' # Host name or IP address to connect to for other pgpool 0 # (change requires restart) #other_pgpool_port0 = 5432 # Port number for other pgpool 0 # (change requires restart) #other_wd_port0 = 9000 # Port number for other watchdog 0 # (change requires restart) #other_pgpool_hostname1 = 'host1' #other_pgpool_port1 = 5432 #other_wd_port1 = 9000 #------------------------------------------------------------------------------ # OTHERS #------------------------------------------------------------------------------ relcache_expire = 0 # Life time of relation cache in seconds. # 0 means no cache expiration(the default). # The relation cache is used for cache the # query result against PostgreSQL system # catalog to obtain various information # including table structures or if it's a # temporary table or not. The cache is # maintained in a pgpool child local memory # and being kept as long as it survives. # If someone modify the table by using # ALTER TABLE or some such, the relcache is # not consistent anymore. # For this purpose, cache_expiration # controls the life time of the cache. relcache_size = 8192 # Number of relation cache # entry. 
    # If you see frequently:
    # "pool_search_relcache: cache replacement happend"
    # in the pgpool log, you might want to increase this number.
check_temp_table = catalog
    # Temporary table check method. catalog, trace or none.
    # Default is catalog.
check_unlogged_table = on
    # If on, enable unlogged table check in SELECT statements.
    # This initiates queries against system catalog of primary/master
    # thus increases load of master.
    # If you are absolutely sure that your system never uses unlogged tables
    # and you want to save access to primary/master, you could turn this off.
    # Default is on.
enable_shared_relcache = on
    # If on, relation cache stored in memory cache,
    # the cache is shared among child process.
    # Default is on.
    # (change requires restart)
relcache_query_target = master
    # Target node to send relcache queries. Default is master (primary) node.
    # If load_balance_node is specified, queries will be sent to load balance node.

#------------------------------------------------------------------------------
# IN MEMORY QUERY MEMORY CACHE
#------------------------------------------------------------------------------
memory_cache_enabled = off
    # If on, use the memory cache functionality, off by default
    # (change requires restart)
memqcache_method = 'shmem'
    # Cache storage method. either 'shmem'(shared memory) or
    # 'memcached'. 'shmem' by default
    # (change requires restart)
memqcache_memcached_host = 'localhost'
    # Memcached host name or IP address. Mandatory if
    # memqcache_method = 'memcached'.
    # Defaults to localhost.
    # (change requires restart)
memqcache_memcached_port = 11211
    # Memcached port number. Mandatory if memqcache_method = 'memcached'.
    # Defaults to 11211.
    # (change requires restart)
memqcache_total_size = 67108864
    # Total memory size in bytes for storing memory cache.
    # Mandatory if memqcache_method = 'shmem'.
    # Defaults to 64MB.
    # (change requires restart)
memqcache_max_num_cache = 1000000
    # Total number of cache entries.
    # Mandatory if memqcache_method = 'shmem'.
    # Each cache entry consumes 48 bytes on shared memory.
    # Defaults to 1,000,000(45.8MB).
    # (change requires restart)
memqcache_expire = 0
    # Memory cache entry life time specified in seconds.
    # 0 means infinite life time. 0 by default.
    # (change requires restart)
memqcache_auto_cache_invalidation = on
    # If on, invalidation of query cache is triggered by corresponding
    # DDL/DML/DCL(and memqcache_expire). If off, it is only triggered
    # by memqcache_expire. on by default.
    # (change requires restart)
memqcache_maxcache = 409600
    # Maximum SELECT result size in bytes.
    # Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
    # (change requires restart)
memqcache_cache_block_size = 1048576
    # Cache block size in bytes. Mandatory if memqcache_method = 'shmem'.
    # Defaults to 1MB.
    # (change requires restart)
memqcache_oiddir = '/var/log/pgpool/oiddir'
    # Temporary work directory to record table oids
    # (change requires restart)
white_memqcache_table_list = ''
    # Comma separated list of table names to memcache
    # that don't write to database
    # Regexp are accepted
black_memqcache_table_list = ''
    # Comma separated list of table names not to memcache
    # that don't write to database
    # Regexp are accepted

The important settings that need to be modified are as follows:
listen_addresses = '0.0.0.0'
port = 8001
socket_dir = '/tmp'
reserved_connections = 0
pcp_listen_addresses = ''
pcp_port = 9898
pcp_socket_dir = '/tmp'

# - Backend Connection Settings -

backend_hostname0 = '127.0.0.1'
    # Host name or IP address to connect to for backend 0
backend_port0 = 3389
    # Port number for backend 0
backend_weight0 = 1
    # Weight for backend 0 (only in load balancing mode)
backend_data_directory0 = '/data01/pg12_3389/pg_data'
    # Data directory for backend 0
backend_flag0 = 'ALWAYS_MASTER'
    # Controls various backend behavior
    # ALLOW_TO_FAILOVER, DISALLOW_TO_FAILOVER
    # or ALWAYS_MASTER
backend_application_name0 = 'server0'
    # walsender's application_name, used for "show pool_nodes" command
backend_hostname1 = '127.0.0.1'
backend_port1 = 8002
backend_weight1 = 1
backend_data_directory1 = '/data01/pg12_8002/pg_data'
backend_flag1 = 'DISALLOW_TO_FAILOVER'
backend_application_name1 = 'server1'

# - Authentication -

enable_pool_hba = on
    # Use pool_hba.conf for client authentication
pool_passwd = 'pool_passwd'
    # File name of pool_passwd for md5 authentication.
    # "" disables pool_passwd.
    # (change requires restart)
allow_clear_text_frontend_auth = off
    # Allow Pgpool-II to use clear text password authentication
    # with clients, when pool_passwd does not
    # contain the user password

# - Concurrent session and pool size -

num_init_children = 128
    # Number of concurrent sessions allowed
    # (change requires restart)
max_pool = 4
    # Number of connection pool caches per connection
    # (change requires restart)

# - Life time -

child_life_time = 300
    # Pool exits after being idle for this many seconds
child_max_connections = 0
    # Pool exits after receiving that many connections
    # 0 means no exit
connection_life_time = 0
    # Connection to backend closes after being idle for this many seconds
    # 0 means no close
client_idle_limit = 0
    # Client is disconnected after being idle for that many seconds
    # (even inside an explicit transactions!)
    # 0 means no disconnection

#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------

# - Where to log -

log_destination = 'syslog'
    # Where to log
    # Valid values are combinations of stderr,
    # and syslog. Default to stderr.
log_connections = on
    # Log connections
log_standby_delay = 'if_over_threshold'
    # Log standby delay
    # Valid values are combinations of always,
    # if_over_threshold, none

#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------
pid_file_name = '/var/run/pgpool-II-12/pgpool.pid'
    # PID file name
    # Can be specified as relative to the
    # location of pgpool.conf file or
    # as an absolute path
    # (change requires restart)
logdir = '/tmp'
    # Directory of pgPool status file
    # (change requires restart)

#------------------------------------------------------------------------------
# CONNECTION POOLING
#------------------------------------------------------------------------------
connection_cache = on
    # Activate connection pools
    # (change requires restart)
    # Semicolon separated list of queries
    # to be issued at the end of a session
    # The default is for 8.3 and later
reset_query_list = 'ABORT; DISCARD ALL'

#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------
load_balance_mode = on
    # Activate load balancing mode
    # (change requires restart)
ignore_leading_white_space = on
    # Ignore leading white spaces of each query
white_function_list = ''
    # Comma separated list of function names
    # that don't write to database
    # Regexp are accepted
black_function_list = 'currval,lastval,nextval,setval'
    # Comma separated list of function names
    # that write to database
    # Regexp are accepted
black_query_pattern_list = ''
    # Semicolon separated list of query patterns
    # that should be sent to primary node
    # Regexp are accepted
    # valid for streaming replication mode only.
database_redirect_preference_list = ''
    # comma separated list of pairs of database and node id.
    # example: 'postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
    # valid for streaming replication mode only.
app_name_redirect_preference_list = ''
    # comma separated list of pairs of app name and node id.
    # example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
    # valid for streaming replication mode only.
allow_sql_comments = off
    # if on, ignore SQL comments when judging if load balance or
    # query cache is possible.
    # If off, SQL comments effectively prevent the judgment
    # (pre 3.4 behavior).
disable_load_balance_on_write = 'transaction'
    # Load balance behavior when write query is issued
    # in an explicit transaction.
    # Note that any query not in an explicit transaction
    # is not affected by the parameter.
    # 'transaction' (the default): if a write query is issued,
    # subsequent read queries will not be load balanced
    # until the transaction ends.
    # 'trans_transaction': if a write query is issued,
    # subsequent read queries in an explicit transaction
    # will not be load balanced until the session ends.
    # 'always': if a write query is issued, read queries will
    # not be load balanced until the session ends.
statement_level_load_balance = off
    # Enables statement level load balancing

#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------
master_slave_mode = on
    # Activate master/slave mode
    # (change requires restart)
master_slave_sub_mode = 'stream'
    # Master/slave sub mode
    # Valid values are combinations stream, slony
    # or logical. Default is stream.
    # (change requires restart)

# - Streaming -

sr_check_period = 3
    # Streaming replication check period
    # Disabled (0) by default
sr_check_user = 'nobody'
    # Streaming replication check user
    # This is necessary even if you disable streaming
    # replication delay check by sr_check_period = 0
sr_check_password = ''
    # Password for streaming replication check user
    # Leaving it empty will make Pgpool-II to first look for the
    # Password in pool_passwd file before using the empty password
sr_check_database = 'postgres'
    # Database name for streaming replication check
delay_threshold = 512000
    # Threshold before not dispatching query to standby node
    # Unit is in bytes
    # Disabled (0) by default

#------------------------------------------------------------------------------
# HEALTH CHECK GLOBAL PARAMETERS
#------------------------------------------------------------------------------
health_check_period = 5
    # Health check period
    # Disabled (0) by default
health_check_timeout = 10
    # Health check timeout
    # 0 means no timeout
health_check_user = 'nobody'
    # Health check user
health_check_password = ''
    # Password for health check user
    # Leaving it empty will make Pgpool-II to first look for the
    # Password in pool_passwd file before using the empty password
health_check_database = ''
    # Database name for health check. If '', tries 'postgres' first,
health_check_max_retries = 60
    # Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 1
    # Amount of time to wait (in seconds) between retries.
connect_timeout = 10000
    # Timeout value in milliseconds before giving up to connect to backend.
    # Default is 10000 ms (10 second). Flaky network user may want to increase
    # the value. 0 means no timeout.
    # Note that this value is not only used for health check,
    # but also for ordinary connection to backend.
#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
failover_on_backend_error = off
    # Initiates failover when reading/writing to the
    # backend communication socket fails
    # If set to off, pgpool will report an
    # error and disconnect the session.
relcache_expire = 0
    # After a schema change, it is recommended to set this to 1, reload,
    # and then change it back. You can also set it directly to a time value.
    # Life time of relation cache in seconds.
    # 0 means no cache expiration(the default).
    # The relation cache is used for cache the
    # query result against PostgreSQL system
    # catalog to obtain various information
    # including table structures or if it's a
    # temporary table or not. The cache is
    # maintained in a pgpool child local memory
    # and being kept as long as it survives.
    # If someone modify the table by using
    # ALTER TABLE or some such, the relcache is
    # not consistent anymore.
    # For this purpose, cache_expiration
    # controls the life time of the cache.
relcache_size = 8192
    # Number of relation cache
    # entry. If you see frequently:
    # "pool_search_relcache: cache replacement happend"
    # in the pgpool log, you might want to increase this number.

Configure the pool_passwd password file. The commands are as follows:
Note: A password file is required when you connect to the database through pgpool. In effect, pgpool itself speaks the PostgreSQL authentication protocol.
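cd /etc/pgpool-II-12
# Usage:
# pg_md5 --md5auth --username=username password
# Generate the passwords for digoal and nobody. They are written to pool_passwd automatically.
pg_md5 --md5auth --username=digoal "xxxxxxx"
pg_md5 --md5auth --username=nobody "xxxxxxx"

The pool_passwd file is generated automatically. To view it, run the following commands: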
cd /etc/pgpool-II-12
cat pool_passwd
digoal:md54dd55116da69d3d03bf2e3a1470564f9
nobody:md54240e76623e2511d607f431043a5d1c1

Configure the pool_hba.conf file. The commands are as follows:
cd /etc/pgpool-II-12
cp pool_hba.conf.sample pool_hba.conf
vi pool_hba.conf
# Add the following line:
host all all 0.0.0.0/0 md5

Configure the pcp administration password file. The commands are as follows:
Note: This user and password are used to manage pgpool. They are not a database user and password.
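cd /etc/pgpool-II-12
pg_md5 abc    # For example, the password is abc.
900150983cd24fb0d6963f7d28e17f72
cp pcp.conf.sample pcp.conf
vi pcp.conf
# Format: USERID:MD5PASSWD
# The following entry means that the manage user is used for pcp administration.
manage:900150983cd24fb0d6963f7d28e17f72

Start pgpool. The commands are as follows: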
cd /etc/pgpool-II-12
pgpool -f ./pgpool.conf -a ./pool_hba.conf -F ./pcp.conf

Note: To view the pgpool log, run the following command:
less /var/log/messages

Connect to the database through pgpool. The command is as follows:
psql -h 127.0.0.1 -p 8001 -U digoal postgres
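Once the connection works, one way to confirm that pgpool sees both backends is to run SHOW pool_nodes through the pgpool port. This is the command that the backend_application_name comments in pgpool.conf refer to. The following is a minimal sketch: the function name show_pool_nodes is made up for illustration, and it assumes that the password for digoal is supplied through PGPASSWORD or ~/.pgpass so that psql does not prompt.

```shell
# Sketch only: host, port, user, and database are the values used in this article.
# show_pool_nodes is a hypothetical helper name, not a pgpool tool.
show_pool_nodes() {
  # -X skips ~/.psqlrc so the output is not affected by local settings.
  psql -h 127.0.0.1 -p 8001 -U digoal -d postgres -X -c 'SHOW pool_nodes;'
}
```

Run show_pool_nodes after pgpool has started. Each configured backend should appear as one row with its port, status, weight, and role.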
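FAQ
How do I test whether read/write splitting works?
Connect and query pg_is_in_recovery(), then disconnect, reconnect, and query pg_is_in_recovery() again. The function returns false on the primary and true on a standby. If the result varies between false and true across connections, the requests are being sent to both the primary and the standby, which means that read/write splitting works.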
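The reconnect-and-query check can be scripted. The following is a minimal sketch that opens a fresh connection per query and tallies the answers. The function name check_rw_split is hypothetical, and the sketch assumes that the password for digoal is available through PGPASSWORD or ~/.pgpass.

```shell
# Sketch only: check_rw_split is a hypothetical helper name.
# Opens N separate connections through pgpool (port 8001) and counts
# how many were answered by the primary (f) and by a standby (t).
check_rw_split() {
  n="${1:-10}"
  i=0
  while [ "$i" -lt "$n" ]; do
    # -A -t prints only the bare value: f = primary, t = standby.
    psql -h 127.0.0.1 -p 8001 -U digoal -d postgres -X -A -t \
      -c 'SELECT pg_is_in_recovery();'
    i=$((i + 1))
  done | sort | uniq -c
}
```

For example, check_rw_split 20 prints one line per distinct answer with its count. Seeing both f and t suggests that sessions are being spread across the primary and the standby.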
Does pgpool increase latency?
Yes, slightly. In the test environment of this article, latency increased by about 0.12 ms.
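To get a rough figure for your own environment, you could compare pgbench latency on a direct connection with latency through pgpool. The sketch below is not the method used to obtain the number above. It assumes that the pgbench tables already exist (pgbench -i), that the password is available through PGPASSWORD or ~/.pgpass, and that pgbench prints a "latency average = ... ms" line, as recent PostgreSQL versions do. The function names are hypothetical.

```shell
# Sketch only: avg_latency_ms and pgpool_overhead_ms are hypothetical helper names.
# Run a short select-only pgbench against one port and print the average latency in ms.
avg_latency_ms() {
  pgbench -n -S -c 1 -j 1 -T "${DURATION:-30}" \
    -h 127.0.0.1 -p "$1" -U digoal postgres \
    | awk '/^latency average/ { print $4 }'
}

# Print (latency through pgpool) minus (latency on a direct connection), in ms.
# 3389 is the primary and 8001 is pgpool, as configured in this article.
pgpool_overhead_ms() {
  direct="$(avg_latency_ms 3389)"
  pooled="$(avg_latency_ms 8001)"
  awk -v d="$direct" -v p="$pooled" 'BEGIN { printf "%.3f\n", p - d }'
}
```

A single client and a single thread keep the comparison about per-query latency rather than throughput. Because load balancing is on, some reads through pgpool may be served by the standby.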
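How do pgpool's replication delay check and health check work?
pgpool does not send SQL requests to a read-only node whose WAL replay delay exceeds the configured threshold (delay_threshold, in bytes). It sends requests to that node again once the delay falls below the threshold.
Note: To measure the delay yourself, connect to the primary and query the current WAL write position (LSN 1), then connect to the read-only node and query the current WAL replay position (LSN 2), and compare the difference in bytes between LSN 1 and LSN 2.
pgpool also checks the health of each backend. If a backend is found to be unhealthy, SQL requests are not routed to it.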
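The LSN comparison described in the note can be sketched as follows. An LSN is printed as two hexadecimal halves separated by a slash (for example 0/3000060), so the byte position is the high half shifted left by 32 bits plus the low half. The helper names are hypothetical. pg_current_wal_lsn() and pg_last_wal_replay_lsn() are the PostgreSQL 10 and later function names, and the ports are the ones used in this article.

```shell
# Sketch only: all function names below are hypothetical helpers.
# Convert an LSN such as 0/3000060 to an absolute byte position.
lsn_to_bytes() {
  hi="${1%%/*}"   # hex digits before the slash
  lo="${1##*/}"   # hex digits after the slash
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Replay lag in bytes: primary write position minus standby replay position.
replay_lag_bytes() {
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# Fetch LSN 1 from the primary (3389) and LSN 2 from the standby (8002).
primary_lsn() {
  psql -h 127.0.0.1 -p 3389 -U digoal -d postgres -X -A -t \
    -c 'SELECT pg_current_wal_lsn();'
}
standby_lsn() {
  psql -h 127.0.0.1 -p 8002 -U digoal -d postgres -X -A -t \
    -c 'SELECT pg_last_wal_replay_lsn();'
}
```

For example, replay_lag_bytes "$(primary_lsn)" "$(standby_lsn)" prints the lag in bytes, which can be compared with delay_threshold (512000 in the configuration above).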
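How do I stop pgpool or reload its configuration?
Run pgpool --help to view the available commands. For example, to stop pgpool in fast mode:
cd /etc/pgpool-II-12
pgpool -f ./pgpool.conf -m fast stop
To reload the configuration after you edit pgpool.conf:
pgpool -f ./pgpool.conf reload

If there are multiple read-only instances, how should they be configured?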
Edit the pgpool.conf file and add one group of backend settings for each read-only instance. Each instance must use its own index suffix (1, 2, and so on). For example:
backend_hostname1 = 'xx.xx.xxx.xx'
backend_port1 = 8002
backend_weight1 = 1
backend_data_directory1 = '/data01/pg12_8002/pg_data'
backend_flag1 = 'DISALLOW_TO_FAILOVER'
backend_application_name1 = 'server1'
backend_hostname2 = 'xx.xx.xx.xx'
backend_port2 = 8002
backend_weight2 = 1
backend_data_directory2 = '/data01/pg12_8002/pg_data'
backend_flag2 = 'DISALLOW_TO_FAILOVER'
backend_application_name2 = 'server2'

How do I query backend status through pcp?
A sample command is as follows:
# pcp_node_info -U manage -h /tmp -p 9898 -n 1 -v
Password: (enter the password)
Hostname               : 127.0.0.1
Port                   : 8002
Status                 : 2
Weight                 : 0.500000
Status Name            : up
Role                   : standby
Replication Delay      : 0
Replication State      :
Replication Sync State :
Last Status Change     : 2020-02-29 00:20:29

Which ports are listened on?
The following ports are listened on:
Primary: 3389
Standby: 8002
pgpool: 8001
pcp: 9898
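A quick way to check these ports on the host is to look at the TCP listening sockets reported by ss. The following is a minimal sketch with hypothetical helper names. Note that the configuration in this article sets pcp_listen_addresses = '', so pcp is probably reachable only through the Unix socket in /tmp (as in the pcp_node_info example) and may not appear as a TCP listener.

```shell
# Sketch only: is_listening and check_ports are hypothetical helper names.
# Succeeds if some TCP socket is listening on the given port.
is_listening() {
  ss -ltn | awk -v p="$1" '
    NR > 1 && $4 ~ (":" p "$") { found = 1 }   # column 4 is Local Address:Port
    END { exit !found }'
}

# Report the state of each port listed in this article.
check_ports() {
  for spec in primary:3389 standby:8002 pgpool:8001 pcp:9898; do
    name="${spec%%:*}"
    port="${spec##*:}"
    if is_listening "$port"; then
      echo "$name $port listening"
    else
      echo "$name $port not listening"
    fi
  done
}
```

Run check_ports on the ECS instance after PostgreSQL and pgpool have started.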