This topic describes how to migrate data from a source self-managed PostgreSQL database to a destination database on your ApsaraDB RDS for PostgreSQL instance. The source self-managed database is hosted on an Elastic Compute Service (ECS) instance.
Prerequisites
An ECS instance that runs a Linux operating system is created. In addition, the ECS instance can connect to your RDS instance.

Procedure
- Install the PostgreSQL Community edition on the ECS instance.
Note: The pg_dump utility must run the same PostgreSQL version as the source self-managed database, and the pg_restore utility must run the same PostgreSQL version as your RDS instance. In this topic, PostgreSQL 12 is used as an example. You can select a different PostgreSQL version based on your business requirements.
```shell
# Install the PostgreSQL repository RPM.
sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
# Install the PostgreSQL Community edition.
sudo yum install -y postgresql12-*
```
- Configure the required environment variables to ensure that a suitable PostgreSQL version is used for the data export and import.
```shell
su - postgres
vi .bash_profile
# Append the following content:
export PS1="$USER@`/bin/hostname -s`-> "
export LANG=en_US.utf8
export PGHOME=/usr/pgsql-12
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH
export DATE=`date +"%Y%m%d%H%M"`
export PATH=$PGHOME/bin:$PATH:.
export MANPATH=$PGHOME/share/man:$MANPATH
alias rm='rm -i'
alias ll='ls -lh'
unalias vi
```
- Export all of the user data from the source self-managed database.
Note: The user data must be migrated before the other data. Otherwise, the migration may fail due to issues related to object permissions or owners.
```shell
pg_dumpall -g -h 127.0.0.1 -p 5432 -U postgres
```
The command returns output similar to the following:
```sql
--
-- PostgreSQL database cluster dump
--

SET default_transaction_read_only = off;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;

--
-- Roles
--

CREATE ROLE postgres;
ALTER ROLE postgres WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION BYPASSRLS PASSWORD 'md5d5df0dxxxxxxxc88a541fec598f';

--
-- PostgreSQL database cluster dump complete
--
```
- Modify the role-related statements from the preceding step. For example, replace SUPERUSER with rds_SUPERUSER. Then, run the modified statements on your RDS instance to import the user data from the source self-managed database.
```sql
CREATE ROLE postgres;
ALTER ROLE postgres WITH rds_SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION BYPASSRLS PASSWORD 'md5d5df0dxxxxxxxc88a541fec598f';
```
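If the dump contains many roles, the replacement can be scripted instead of edited by hand. The following is a minimal sketch; the file name roles.sql and the sample ALTER ROLE line are stand-ins for the real pg_dumpall -g output:

```shell
# Hypothetical file: roles.sql holds the role statements dumped by pg_dumpall -g.
# A single sample line stands in for the real dump in this sketch.
cat > roles.sql <<'EOF'
ALTER ROLE postgres WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION BYPASSRLS PASSWORD 'md5xxx';
EOF
# Replace the SUPERUSER attribute with rds_SUPERUSER, as this step requires.
sed -i 's/\bSUPERUSER\b/rds_SUPERUSER/g' roles.sql
grep -o 'rds_SUPERUSER' roles.sql   # prints: rds_SUPERUSER
```

The rewritten roles.sql can then be run against the RDS instance with psql.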
- Connect to your RDS instance and create a destination database.
Note: The destination database must use the same encoding format as the source self-managed database. The UTF-8 encoding format is used in this example.
```sql
create database db1 with template template0 encoding 'UTF8' lc_ctype 'en_US.utf8' lc_collate 'C';
```
- Optional. If the preceding method is unsuitable for your workloads, use one of the following three methods to export and import data:
- Method 1: Online migration
If the source self-managed database can connect to your RDS instance, you can use this method.
- Configure a password file. The content of the password file is in the following format:
```
host:port:dbname:username:password
```
Note: The values of the dbname and username parameters must be in lowercase, unless the database name and username were created in uppercase and enclosed in a pair of double quotation marks ("). The metadata of a PostgreSQL database is stored in lowercase by default.
```shell
vi ~/.pgpass
```
Add the following lines to the file:
```
pgm-xxx.pg.rds.aliyuncs.com:1921:db1:Username:Password
127.0.0.1:5432:postgres:Username:Password
```
Then restrict the permissions of the file:
```shell
chmod 400 ~/.pgpass
```
- Migrate data by using a pipeline.
```shell
nohup pg_dump -F p -h 127.0.0.1 -p 5432 -U postgres -d postgres --no-tablespaces | time psql -h pgm-bpxxxxx.pg.rds.aliyuncs.com -p 1921 -U postgres --single-transaction db1 > ./pg.dump.log 2>&1 &
```
Note: If errors occur, you can view the error logs in the pg.dump.log file. If you have specified the --single-transaction option, you can resume the data import immediately after you fix the errors.
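As a side note on the password file used in this method: libpq only reads a password file whose permissions are 0600 or stricter, which is why the chmod step is required. The following is a minimal, self-contained sketch that builds the file in a temporary directory; the host names and credentials are placeholders, and a real setup would write to ~/.pgpass instead:

```shell
# Build a .pgpass-style file in a temporary directory (placeholder credentials, demo only).
PGPASSFILE="$(mktemp -d)/.pgpass"
cat > "$PGPASSFILE" <<'EOF'
pgm-xxx.pg.rds.aliyuncs.com:1921:db1:username:password
127.0.0.1:5432:postgres:username:password
EOF
# libpq ignores the file unless it is readable only by its owner.
chmod 400 "$PGPASSFILE"
stat -c '%a' "$PGPASSFILE"   # prints: 400
```

Setting the PGPASSFILE environment variable, as above, tells libpq-based tools such as psql and pg_dump to read this file instead of ~/.pgpass.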
- Method 2: Offline migration
If the source self-managed database cannot connect to your RDS instance, you can use this method. In this case, you use the pg_dump utility to export data from the source self-managed database to files. Then, you copy the files to a host that can connect to your RDS instance and use the pg_restore utility to import the files into your RDS instance.
Note: The pg_dump utility that is used to export data must run the same PostgreSQL version as the source self-managed database, and the pg_restore utility that is used to import data must run the same PostgreSQL version as your RDS instance.
- Export data from the source self-managed database as files.
```shell
nohup pg_dump -F c -h 127.0.0.1 -p 5432 -U postgres -d postgres --no-tablespaces -f ./pg.dump > ./pg.dump.log 2>&1 &

ll pg.dump
-rw-rw-r-- 1 digoal digoal 4.2M Aug 31 10:17 pg.dump
```
Note: Wait until the export is complete. Then, check that no errors are found in the pg.dump.log file.
- Import the files into your RDS instance.
```shell
pg_restore -h pgm-bpxxxxx.pg.rds.aliyuncs.com -p 1921 -U postgres -d db1 --no-tablespaces --single-transaction pg.dump > ./pg.restore.log
```
Note If you have specified an MD5-encrypted password for the specified user, errors may occur. We recommend that you reset the password in the ApsaraDB RDS console. For more information, see Reset the password of an account on an ApsaraDB RDS for PostgreSQL instance.
Wait until the import is complete. If errors occur, you can view the error logs in the pg.restore.log file. If you have specified the --single-transaction option, you can resume the data import immediately after you fix the errors.
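Before you treat the offline import as complete, it helps to scan the restore log for error messages rather than only checking that the command returned. The following is a minimal sketch; the log line written here is fabricated sample content for the demo, and a real run would scan the log produced by pg_restore:

```shell
# Scan the restore log for errors before declaring the migration complete.
LOG=./pg.restore.log
# A single fabricated line stands in for real pg_restore output in this demo.
printf 'pg_restore: connecting to database for restore\n' > "$LOG"
if grep -qi 'error' "$LOG"; then
  echo "errors found; review $LOG before retrying the import"
else
  echo "no errors found in $LOG"
fi
```

If the log is clean, you can proceed to verifying row counts in the destination database.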
- Method 3: Parallelism-based accelerated offline migration
This method is similar to Method 2. However, when you run the pg_restore utility to import data, you can specify the -j option to enable parallelism and accelerate the import.
```shell
nohup pg_restore -U postgres -d db1 --no-tablespaces -j 4 /tmp/pg.dump > ./pg.restore.log 2>&1 &
```
Note: The -j option that is used to enable parallelism cannot be used together with the --single-transaction option.
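The number of parallel jobs is typically sized to the CPU count of the host that runs pg_restore. The following is a minimal sketch, assuming a Linux host with the nproc command; the cap of 8 is an arbitrary choice for this example, not a documented limit:

```shell
# Derive a parallel job count from the CPU count; fall back to 4 if nproc is unavailable.
JOBS=$(nproc 2>/dev/null || echo 4)
# Cap the job count (arbitrary cap for this sketch) to avoid overloading the RDS instance.
if [ "$JOBS" -gt 8 ]; then JOBS=8; fi
echo "pg_restore -U postgres -d db1 --no-tablespaces -j $JOBS /tmp/pg.dump"
```

The echoed command is a template only; in practice you would run it with nohup and redirect its output to a log file, as shown above.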