This topic describes the flashback technology of Zettabyte File System (ZFS) that
is supported by ApsaraDB RDS for PostgreSQL. After you configure your ECS instance as
a standby for your RDS instance, ZFS allows your ECS instance to synchronize data from
your RDS instance and invoke the cron tool to create snapshots at fixed intervals.
If your RDS instance becomes faulty, you can use the snapshots on the ECS instance
to restore your RDS instance in seconds.
Background information
ZFS is a dynamic file system that offers features and benefits not found in many
other file systems. ZFS manages physical storage devices by using the concept of
storage pools: ZFS aggregates physical storage devices into a storage pool, which
describes the characteristics of those devices, such as device layout and data
redundancy, and acts as an arbitrary data store from which file systems can be
created. File systems are no longer constrained to individual physical devices, and
all file systems in a pool share the disk space of the devices in that pool.
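For example, the following hedged sketch shows how a pool aggregates devices, assuming two hypothetical block devices named /dev/sdb and /dev/sdc:
zpool create tank mirror /dev/sdb /dev/sdc    # Aggregate the two devices into a mirrored pool named tank.
zfs create tank/fs01    # Create a file system that draws space from the pool.
zfs create tank/fs02    # A second file system that shares the same pooled space.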
ZFS uses the copy-on-write technique to manage data. When new data is written, the
blocks that contain the original data are retained, so all of the data that a snapshot
requires is already in place. This allows ZFS to create snapshots in seconds. In
addition, ZFS allows modifications to a file to be shared between a file system and
its snapshots, which minimizes the storage that snapshots consume.
These features allow you to create snapshots at fixed intervals and use the latest
snapshot to restore data in the event of faults or disasters.
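For illustration, the following hedged sketch assumes a pool named tank with a file system named tank/fs01:
zfs snapshot tank/fs01@before_change    # Completes in seconds because no data blocks are copied.
zfs clone tank/fs01@before_change tank/fs01_check    # A writable clone that shares unchanged blocks with the snapshot.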
Before you begin
- Create a suitable ECS instance. For more information, see ECS instance creation overview. We recommend that your ECS instance run the 64-bit CentOS 7 operating system, have
the same specifications as your RDS instance, and reside in the same zone, Virtual
Private Cloud (VPC), and vSwitch. This ensures fast communication between your ECS
and RDS instances over an internal network. If you require cross-region disaster
recovery, create a Cloud Enterprise Network instance. For more information, see Tutorial overview.
- Configure a whitelist to grant your ECS instance access to your RDS instance. For
more information, see Configure an IP address whitelist for an ApsaraDB RDS for PostgreSQL instance.
Create a snapshot
- Connect to your ECS instance. For more information, see Overview.
- Install ZFS.
wget http://download.zfsonlinux.org/epel/zfs-release.el7_7.noarch.rpm
rpm -ivh zfs-release.el7_7.noarch.rpm
yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install -y "kernel-devel-uname-r == $(uname -r)" zfs
- Modify the rc.local file.
vi /etc/rc.local
# Add the following line to the file:
/sbin/modprobe zfs
# Grant execute permissions on the file:
chmod +x /etc/rc.local
- Test ZFS. After you check that ZFS is properly running, restart your ECS instance.
# modprobe zfs
# zpool list
no pools available
# reboot
- Install PostgreSQL 12 on your ECS instance.
yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
yum install postgresql12*
- Configure the disks of your ECS instance. A verification sketch follows these commands.
parted -a optimal -s /dev/vdb mklabel gpt mkpart primary 1MiB 100%    # Align partitions. Replace /dev/vdb with the device name of the new data disk.
zpool create zp1 -f -o ashift=13 vdb1    # Create the ZFS storage pool on the new partition.
zfs set canmount=off zp1    # Prevent the root dataset of the pool from being mounted directly.
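The ashift=13 setting aligns pool I/O to 2^13 = 8192-byte sectors, which matches the 8 KB block size of PostgreSQL. As a hedged verification sketch, you can read the value back from the cached pool configuration; the output is expected to contain a line similar to ashift: 13.
zdb -C zp1 | grep ashift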
- Create a directory that is used to store data files.
zfs create -o mountpoint=/zpdata01 -o recordsize=8K -o atime=off -o primarycache=metadata -o logbias=throughput -o secondarycache=none zp1/zpdata01
- Create a directory that is used to store archived write-ahead logging (WAL) files. A property verification sketch follows the zfs list output.
zfs create -o mountpoint=/zpdata02 -o recordsize=8K -o atime=off -o primarycache=metadata -o logbias=throughput -o secondarycache=none -o compression=on zp1/zpdata02
# zfs list
NAME           USED  AVAIL  REFER  MOUNTPOINT
zp1           1.29M  1.42T   192K  /zp1
zp1/zpdata01   192K  1.42T   192K  /zpdata01
zp1/zpdata02   192K  1.42T   192K  /zpdata02
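To confirm that the datasets carry the intended properties, the following hedged verification sketch reads them back:
zfs get recordsize,primarycache,logbias zp1/zpdata01    # recordsize=8K matches the PostgreSQL block size.
zfs get compression zp1/zpdata02    # Compression is enabled because archived WAL files compress well.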
- Create a directory that is used to store the synchronized data from your RDS instance.
mkdir /zpdata01/pg12_1921_data
mkdir /zpdata02/pg12_1921_wal
chown -R postgres:postgres /zpdata01/pg12_1921_data
chown -R postgres:postgres /zpdata02/pg12_1921_wal
- Synchronize data from your RDS instance to your ECS instance. A monitoring sketch follows the commands.
su - postgres
export PGPASSWORD=<The password of the privileged account for your RDS instance>
nohup pg_basebackup -D /zpdata01/pg12_1921_data -F p -R -c fast -X stream -h <The endpoint that is used to connect to your RDS instance> -p <The port that is used to connect to your RDS instance> -U <The username of the privileged account for your RDS instance> >./bak.log 2>&1 &
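The backup runs in the background. As a minimal monitoring sketch, you can follow the log file, to which pg_basebackup writes its messages and errors:
tail -f ./bak.log    # Press Ctrl+C to stop following the log.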

- Comment out the following parameters in the /zpdata01/pg12_1921_data/postgresql.conf file. A sed sketch follows the list.
#Fri Mar 13 09:55:03 CST 2020
#ssl_key_file='server.key'
#huge_pages=try
#auto_explain.sample_rate=1
#zhparser.multi_zall=off
#shared_preload_libraries='pg_stat_statements,auth_delay,auto_explain,zhparser,timescaledb,pg_pathman'
#promote_trigger_file='/data/postgresql.trigger'
#ssl=off
#rds_max_log_files=20
#pg_pathman.enable_auto_partition=on
#shared_buffers=32768MB
#zhparser.punctuation_ignore=off
#pg_pathman.override_copy=on
#port=1922
#pg_stat_statements.max=5000
#auth_delay.milliseconds=3s
#auto_explain.log_nested_statements=off
#track_io_timing=on
#zhparser.multi_zmain=off
#auto_explain.log_analyze=off
#archive_mode=on
#ssl_cert_file='server.crt'
#zhparser.multi_short=off
#zhparser.dict_in_memory=off
#auto_explain.log_format=text
#auto_explain.log_min_duration=-1
#rds.rds_max_non_super_conns=12800
#pg_pathman.enable=on
#archive_command='/bin/date'
#auto_explain.log_verbose=off
#log_line_prefix='\1\n\t%p\t%r\t%u\t%d\t%t\t%e\t%T\t%S\t%U\t%E\t\t'
#pg_pathman.enable_runtimemergeappend=on
#zhparser.extra_dicts='dict_extra.xdb'
#auto_explain.log_buffers=off
#pg_stat_statements.track=top
#jit_provider='llvmjit'
#pg_pathman.enable_partitionrouter=off
#pg_stat_statements.track_utility=off
#pg_stat_statements.save=off
#zhparser.dicts_type='EXTRA'
#auto_explain.log_timing=on
#pg_pathman.enable_runtimeappend=on
#zhparser.seg_with_duality=off
#rds.rds_max_super_conns=100
#pg_pathman.enable_partitionfilter=on
#log_destination='stderr,csvlog'
#zhparser.multi_duality=off
#pg_pathman.insert_into_fdw='postgres'
#pg_pathman.enable_bounds_cache=on
#rds.rds_max_non_super_wal_snd=32
#auto_explain.log_triggers=off
#rds_sync_replication_timeout=0
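If you prefer not to edit the file by hand, the following hedged sketch comments out one of the listed parameters with sed. The shared_preload_libraries parameter is used as the example, and the same pattern applies to the other parameters:
sed -i 's/^shared_preload_libraries/#shared_preload_libraries/' /zpdata01/pg12_1921_data/postgresql.conf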
- Modify the postgresql.auto.conf file. An expansion example of the archive_command parameter follows this step.
vi /zpdata01/pg12_1921_data/postgresql.auto.conf
primary_conninfo = 'user=<The username of the privileged account for your RDS instance> password=''<The password of the privileged account for your RDS instance>'' host=''<The endpoint that is used to connect to your RDS instance>'' port=<The port that is used to connect to your RDS instance> application_name=hello_rds_pg12'
port=1922
shared_buffers=32GB
log_destination='csvlog'
archive_mode=always
archive_command='test ! -f /zpdata02/pg12_1921_wal/%f && cp %p /zpdata02/pg12_1921_wal/%f'
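For illustration, when PostgreSQL archives a WAL segment, %p expands to the path of the segment relative to the data directory and %f expands to its file name. Assuming a hypothetical segment named 000000010000011D00000097, the archive_command expands to the following command, and the test ! -f part ensures that an archived segment is never overwritten:
test ! -f /zpdata02/pg12_1921_wal/000000010000011D00000097 && cp pg_wal/000000010000011D00000097 /zpdata02/pg12_1921_wal/000000010000011D00000097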
- Modify the permissions on the created directories.
chmod 700 /zpdata02/pg12_1921_wal
chmod 700 /zpdata01/pg12_1921_data
- Start the PostgreSQL 12 database on your ECS instance.
su - postgres
/usr/pgsql-12/bin/pg_ctl start -D /zpdata01/pg12_1921_data
Note If the following error is reported because your ECS instance does not have sufficient
memory, we recommend that you upgrade your ECS instance or set shared_buffers to a smaller value:
HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages.
- Configure the PostgreSQL 12 database on your ECS instance to automatically start.
vi /etc/rc.local
# Add the following line to the file:
su - postgres -c "/usr/pgsql-12/bin/pg_ctl start -D /zpdata01/pg12_1921_data"
- Configure the PostgreSQL 12 database on your ECS instance to automatically create
snapshots of the directory that is used to store data files. You do not need to create
snapshots of the directory that is used to store archived WAL files.
- Create a script that creates snapshots, and then grant execute permissions on the script.
vi /etc/snap.sh
# Add the following lines to the script:
STIME=`date +%F%T`    # Name each snapshot with the current date and time.
/usr/sbin/zfs snapshot zp1/zpdata01@$STIME
# Grant execute permissions on the script:
chmod 500 /etc/snap.sh
- Check whether snapshots can be properly created.
/etc/snap.sh
# zfs list -t snapshot
NAME                             USED  AVAIL  REFER  MOUNTPOINT
zp1/zpdata01@2020-03-2117:06:47  144K      -   770M  -
- Configure the crond daemon process to automatically start.
# systemctl start crond
# systemctl enable crond
# systemctl status crond
● crond.service - Command Scheduler
Loaded: loaded (/usr/lib/systemd/system/crond.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2020-03-21 16:16:08 CST; 53min ago
Main PID: 2526 (crond)
CGroup: /system.slice/crond.service
└─2526 /usr/sbin/crond -n
- Configure the crontab file to run the snapshot script at 01:01 every day.
# crontab -e
1 1 * * * /etc/snap.sh
# crontab -l -u root
1 1 * * * /etc/snap.sh
Note You can configure automatic deletion of snapshots or archives based on your disk usage
and business requirements, as shown in the sketch after this note. To manually delete
snapshots or archives, run the following commands:
Manually delete snapshots:
# zfs list -t snapshot
NAME                             USED  AVAIL  REFER  MOUNTPOINT
zp1/zpdata01@2020-03-2117:06:47  144K      -   770M  -
zp1/zpdata01@2020-03-2117:17:01    0B      -   786M  -
# zfs destroy zp1/zpdata01@2020-03-2117:06:47
# zfs list -t snapshot
NAME                             USED  AVAIL  REFER  MOUNTPOINT
zp1/zpdata01@2020-03-2117:17:01    0B      -   786M  -
Manually delete archives:
find /zpdata02/pg12_1921_wal/ -type f -mtime +7 -exec rm -f {} \;
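As a hedged sketch of automatic cleanup, the following script, saved under a hypothetical name such as /etc/snap_clean.sh and scheduled with cron in the same way as /etc/snap.sh, destroys snapshots of zp1/zpdata01 that were created more than seven days ago:
#!/bin/bash
# Destroy snapshots of zp1/zpdata01 that are older than 7 days.
LIMIT=`date -d '-7 days' +%s`
for SNAP in `/usr/sbin/zfs list -H -t snapshot -o name -r zp1/zpdata01`; do
    CREATION=`/usr/sbin/zfs get -H -p -o value creation $SNAP`
    if [ $CREATION -lt $LIMIT ]; then
        /usr/sbin/zfs destroy $SNAP
    fi
done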
- Check the validity of backup sets. For more information, visit GitHub.
- Obtain the latency of data synchronization between your RDS and ECS instances.
postgres=> select pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_flush_lsn(),sent_lsn)) as sent_delay,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_flush_lsn(),replay_lsn)) as replay_delay,*
from pg_stat_replication ;
-[ RECORD 1 ]----+------------------------------
sent_delay       | 0 bytes
replay_delay     | 0 bytes
pid              | 84098
usesysid         | 886185
usename          | rep
application_name | hello_rds_pg12
client_addr      | 192.168.0.173
client_hostname  |
client_port      | 60402
backend_start    | 2020-03-21 16:59:01.890775+08
backend_xmin     |
state            | streaming
sent_lsn         | 11D/97002068
write_lsn        | 11D/97002068
flush_lsn        | 11D/97002068
replay_lsn       | 11D/97002068
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
reply_time       | 2020-03-21 17:01:17.198139+08
After you complete the preceding steps, snapshots will be created at fixed intervals
on your ECS instance. You can use the created snapshots to restore your RDS instance
in seconds for real-time disaster recovery.
Restore your RDS instance in seconds
- Clone the ZFS file system based on a suitable snapshot.
# zfs list -t snapshot
NAME                             USED  AVAIL  REFER  MOUNTPOINT
zp1/zpdata01@2020-03-2117:17:01  312K      -   786M  -
# zfs clone -o mountpoint=/test_recovery zp1/zpdata01@2020-03-2117:17:01 zp1/zpdata_test
[root@iZbp135pwcjjoxqgfpw9k1Z pg12_1921_wal]# cd /test_recovery
[root@iZbp135pwcjjoxqgfpw9k1Z test_recovery]# ll
total 17
drwx------ 20 postgres postgres 35 Mar 21 16:59 pg12_1921_data
- Configure the parameters that are used to restore the data of your RDS instance.
Note If your ECS instance cannot provide sufficient memory, you can set shared_buffers
to a smaller value.
vi /test_recovery/pg12_1921_data/postgresql.auto.conf
port=1923
shared_buffers=32GB
log_destination='csvlog'
restore_command = 'cp /zpdata02/pg12_1921_wal/%f %p'      # Fetch the archived WAL files that the restoration requires.
recovery_target_time = '2020-03-21 17:28:37.670338+08'    # The point in time to which you want to restore.
recovery_target_timeline = 'latest'
recovery_target_action = 'pause'                          # Pause when the target is reached so that you can check the data.
- Delete the socket and pid files from the cloned ZFS file system.
rm -f /test_recovery/pg12_1921_data/.s.*
rm /test_recovery/pg12_1921_data/postmaster.pid
- Restore the data from the cloned ZFS file system.
su - postgres
/usr/pgsql-12/bin/pg_ctl start -D /test_recovery/pg12_1921_data
- View the restored data.
psql -h /test_recovery/pg12_1921_data -p 1923 -U <The username of the privileged account for your RDS instance> postgres
psql (12.1)
Type "help" for help.
postgres=> \dt

Note After you check that the restored data is correct, you can use the pg_dump tool to
copy the restored data to your RDS instance, as shown in the sketch after this note.
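As a hedged sketch, the following command copies a single restored table named t_restore (a hypothetical name) from the postgres database back to your RDS instance:
/usr/pgsql-12/bin/pg_dump -h /test_recovery/pg12_1921_data -p 1923 -U <The username of the privileged account for your RDS instance> -t t_restore postgres | /usr/pgsql-12/bin/psql -h <The endpoint that is used to connect to your RDS instance> -p <The port that is used to connect to your RDS instance> -U <The username of the privileged account for your RDS instance> postgres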
Delete restored data from your ECS instance
After the restoration is complete, run the following commands to delete the restored
data from the test_recovery directory on your ECS instance:
su - postgres
/usr/pgsql-12/bin/pg_ctl stop -m fast -D /test_recovery/pg12_1921_data
exit    # Switch back to the root user.
zfs destroy zp1/zpdata_test