AnalyticDB for PostgreSQL is optimized by Alibaba Cloud based on Greenplum. AnalyticDB for PostgreSQL supports vector computing and transaction processing in the multi-coordinator architecture. It uses the same API operations as Greenplum Community Edition, which makes migration straightforward. This topic walks you through the complete migration process — from choosing a method to validating results.
Before you migrate
Review the following constraints before starting:
One-way migration only. Data migration does not support continuous synchronization. Pause write operations during migration to avoid performance degradation and data gaps; if writes must continue, backfill the new data after migration completes.
Post-migration statistics. After migration, AnalyticDB for PostgreSQL runs ANALYZE to collect statistics for query planning. CPU load on the coordinator node increases temporarily based on data volume.
Node count. Create an AnalyticDB for PostgreSQL instance with at least as many compute nodes as the source Greenplum cluster. When using gpcopy, matching the node count gives the best migration throughput.
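If you want to trigger or verify statistics collection yourself rather than wait for the automatic run, you can use standard ANALYZE. A minimal sketch; public.t1 is a placeholder table name:

```sql
-- Collect planner statistics for one table
-- (run ANALYZE with no arguments to cover the whole database)
ANALYZE public.t1;

-- Confirm statistics exist: reltuples should be close to the real row count
SELECT relname, reltuples, relpages
FROM pg_catalog.pg_class
WHERE relname = 't1';
```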
Extensions. Check which extensions the self-managed Greenplum cluster uses against those supported by AnalyticDB for PostgreSQL. For the full list, see Manage extensions.
Syntax compatibility. AnalyticDB for PostgreSQL is partially incompatible with Greenplum 4X syntax. Review and update affected SQL before migration.
Partition and inherit tables. Migrate these based on child tables rather than the parent table.
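To migrate a partitioned table by its child tables, you first need the list of leaf partitions. A sketch using the Greenplum pg_partitions catalog view; public.sales is a placeholder name:

```sql
-- List the child (leaf) tables of a partitioned table so each one
-- can be exported and migrated individually
SELECT partitionschemaname, partitiontablename
FROM pg_catalog.pg_partitions
WHERE schemaname = 'public' AND tablename = 'sales';
```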
Migration overview
Migration follows these four stages:
Evaluate migration risks and select a migration method.
Purchase a test AnalyticDB for PostgreSQL instance and verify your migration approach.
Purchase a production AnalyticDB for PostgreSQL instance and migrate your production data.
Connect your application to the production instance and verify the migration.
Choose a migration method
| Method | When to use |
|---|---|
| gpcopy | The self-managed Greenplum cluster can connect directly to the AnalyticDB for PostgreSQL instance |
| gptransfer | The AnalyticDB for PostgreSQL instance can connect directly to the self-managed Greenplum cluster |
| OSS foreign table | The self-managed Greenplum cluster and the AnalyticDB for PostgreSQL instance cannot connect to each other |
gpcopy does not support instances in elastic storage mode (V7.0) or Serverless mode. Use gptransfer or the OSS foreign table method for those instance types.
You can also use Alibaba Cloud DataWorks (Data Integration) for migration. For details, see Use Data Integration to migrate and batch synchronize data.
Migrate data using gpcopy
Prerequisites
Before starting, confirm the following:
The self-managed Greenplum cluster can reach the AnalyticDB for PostgreSQL instance over the network.
A superuser account is available on the Greenplum cluster.
The AnalyticDB for PostgreSQL instance is not in elastic storage mode (V7.0) or Serverless mode. gpcopy does not support those instance types.
Step 1: Contact Alibaba Cloud technical support
Submit a ticket and provide:
The ID of your AnalyticDB for PostgreSQL instance
The IP address and port number of the self-managed Greenplum cluster
A superuser account on the Greenplum cluster
The IP address of the server where the Greenplum cluster resides
Alibaba Cloud technical support will download and install gpcopy on your AnalyticDB for PostgreSQL instance.
Step 2: Update the IP address whitelist
Add 0.0.0.0/0 to an IP address whitelist of the AnalyticDB for PostgreSQL instance to allow external access.
0.0.0.0/0 allows all IP addresses to access the instance. Remove this entry and restore a specific IP allowlist immediately after migration.
Step 3: Install gpcopy on the Greenplum server
Download gpcopy on the server where the self-managed Greenplum cluster resides.
This topic uses gpcopy 2.3.2. Use gpcopy 2.3.x for migration. gpcopy 2.4.x requires SSL encryption — see Configure SSL encryption if you use that version.
In the directory where the gpcopy package is located, run the following script to install gpcopy:
#!/bin/bash
tar xzvf gpcopy-2.3.2.tar.gz
cd gpcopy-2.3.2/
cp gpcopy $GPHOME/bin
cp gpcopy_helper $GPHOME/bin
chmod 755 $GPHOME/bin/gpcopy
chmod 755 $GPHOME/bin/gpcopy_helper
host_sql=`psql -A -t -c "select address from gp_segment_configuration where role = 'p' and content >= 0 group by address"`
host=(${host_sql})
host_len=${#host[@]}
host_str=""
for ((i=0; i<$host_len; ++i)); do
    host_str=$host_str" -h "${host[$i]}
done
gpscp $host_str gpcopy_helper =:$GPHOME/bin
gpssh $host_str -e "chmod 755 $GPHOME/bin/gpcopy_helper"
Step 4: Run the migration
Run the following command to migrate data:
gpcopy --full \
--source-host <SOURCEHOST> \
--source-port <SOURCEPORT> \
--source-user <SOURCEUSER> \
--dest-host <DESTHOST> \
--dest-port <DESTPORT> \
--dest-user <DESTUSER> \
--jobs 4 \
--drop \
--validate count
Replace the placeholders with actual values:
| Placeholder | Description | Notes |
|---|---|---|
| <SOURCEHOST> | IP address of the Greenplum server | Use localhost if running the command on the Greenplum server |
| <SOURCEPORT> | Port number of the Greenplum cluster | Default: 5432. Run gpstate -s to check the actual port |
| <SOURCEUSER> | Greenplum superuser account | — |
| <DESTHOST> | IP address of the AnalyticDB for PostgreSQL coordinator node | Contact Alibaba Cloud technical support |
| <DESTPORT> | Port number of the coordinator node | Contact Alibaba Cloud technical support |
| <DESTUSER> | AnalyticDB for PostgreSQL superuser account | Contact Alibaba Cloud technical support |
Key options:
| Option | Description |
|---|---|
--full | Migrates all user databases. Replace with -d <database name> to migrate a single database |
| --jobs | Number of tables migrated in parallel. Default: 4. Configure this based on your business requirements; a value of n opens 2 × n + 1 database connections, so make sure the clusters can accept that many |
--validate count | Validates migration by comparing row counts between source and destination tables |
Step 5: Restore access controls
After migration is complete:
Update the IP address whitelist from 0.0.0.0/0 to the IP addresses required for your business.
Contact Alibaba Cloud technical support to restore the superuser account configuration, security groups, and pg_hba password-free logon settings.
Migrate data using gptransfer
Prerequisites
Before starting, confirm the following:
The AnalyticDB for PostgreSQL instance can reach the self-managed Greenplum cluster over the network.
A privileged account exists on the AnalyticDB for PostgreSQL instance. If not, create one — see Create a database account.
Steps
Create a privileged account for the AnalyticDB for PostgreSQL instance if one does not already exist.
Add 0.0.0.0/0 to an IP address whitelist of the AnalyticDB for PostgreSQL instance to allow external access.
Warning: 0.0.0.0/0 allows all IP addresses to access the instance. Remove this entry immediately after migration.
Update the AnalyticDB for PostgreSQL instance to the latest minor version. See View the minor engine version and Update the minor engine version.
Submit a ticket and provide: the instance ID, the IP address and port number of the self-managed Greenplum cluster, an account of the Greenplum cluster that has permissions to migrate data, and the server IP address.
Wait for the migration to complete.
Update the IP address whitelist from 0.0.0.0/0 to the IP addresses required for your business.
Verify your data, then connect your application to the instance.
Migrate data using an OSS foreign table
Use this method when the Greenplum cluster and AnalyticDB for PostgreSQL instance cannot connect directly. The process has three stages: schema migration, table data migration, and data verification.
Schema migration
On the master node of the self-managed Greenplum cluster, export the DDL schema using pg_dumpall:
export PGHOST=<IP address of the self-managed Greenplum cluster>
export PGPORT=<Port number of the self-managed Greenplum cluster>
export PGUSER=<Superuser account of the self-managed Greenplum cluster>
export PGPASSWORD=<Password of the superuser account>
# For Greenplum 4X
pg_dumpall -s -q --gp-syntax > full_schema.sql
# For Greenplum 5X or 6X
pg_dumpall -s --gp-syntax > full_schema.sql
Check for custom libraries. Connect to the Greenplum cluster using psql and run:
SELECT * FROM pg_catalog.pg_library;
If custom libraries exist, create the corresponding schema manually on the AnalyticDB for PostgreSQL instance before proceeding.
Apply the DDL schema to the AnalyticDB for PostgreSQL instance:
export PGHOST=<IP address of the AnalyticDB for PostgreSQL instance>
export PGPORT=<Port number of the AnalyticDB for PostgreSQL instance>
export PGUSER=<Privileged account of the AnalyticDB for PostgreSQL instance>
export PGPASSWORD=<Password of the privileged account>
psql postgres -f full_schema.sql > psql.log 2>&1 &
Check psql.log for errors. Most errors relate to SQL syntax differences, especially for Greenplum 4X. For compatibility details, see Compatibility comparisons between AnalyticDB for PostgreSQL V4.3 and V6.0 and Check for incompatibility between AnalyticDB for PostgreSQL V4.3 and V6.0.
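One way to triage the log is to collapse repeated failures into a list of distinct error messages, so each incompatibility only has to be fixed once. A sketch run against a small sample log; the ERROR lines below are invented for illustration, in practice you would grep the real psql.log:

```shell
# Build a hypothetical sample log for demonstration
cat > psql_demo.log <<'EOF'
SET
CREATE TABLE
psql:full_schema.sql:120: ERROR:  type "abstime" does not exist
CREATE TABLE
psql:full_schema.sql:245: ERROR:  syntax error at or near "EXCLUSIVE"
psql:full_schema.sql:301: ERROR:  type "abstime" does not exist
EOF

# Count each distinct error message, most frequent first
grep -oE 'ERROR:.*' psql_demo.log | sort | uniq -c | sort -rn
```

The same one-liner against the real psql.log gives a worklist of unique syntax issues ordered by how often they occur.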
Table data migration
Step 1: Export data from the Greenplum cluster
Choose one of the following export methods:
Option A: COPY TO
Run the COPY TO statement on the psql client:
-- Works with Greenplum 4X, 5X, and 6X
COPY public.t1 TO '/data/gpload/public_t1.csv' FORMAT CSV ENCODING 'UTF8';
For Greenplum 5X or 6X, you can use ON SEGMENT to export in parallel:
COPY public.t1 TO '<SEG_DATA_DIR>/public_t1_<SEGID>.csv' FORMAT CSV ENCODING 'UTF8' ON SEGMENT;
Do not modify the <SEGID> placeholder; the system replaces it automatically during export. You can keep <SEG_DATA_DIR> as-is or replace it with an absolute path.
Option B: gpfdist external table
Start the gpfdist service:
mkdir -p /data/gpload
gpfdist -d /data/gpload -p 8000 &
Connect to the Greenplum cluster using psql and export the data:
-- Create a writable external table for public.t1
CREATE WRITABLE EXTERNAL TABLE ext_w_t1 (LIKE public.t1)
LOCATION ('gpfdist://<IP address of the self-managed Greenplum cluster>:8000/public_t1.csv')
FORMAT 'CSV' ENCODING 'UTF8';
-- Write the table data to the external table
INSERT INTO ext_w_t1 SELECT * FROM public.t1;
Step 2: Upload data to Object Storage Service (OSS)
Upload the exported files to an OSS bucket. The bucket must be in the same region as the AnalyticDB for PostgreSQL instance. For upload instructions, see Upload objects.
Step 3: Import data into AnalyticDB for PostgreSQL
Use an OSS foreign table to load the data from OSS into your AnalyticDB for PostgreSQL instance. See the Import OSS data to an AnalyticDB for PostgreSQL table section for details.
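The import mirrors the gpfdist export: define a readable external table over the OSS objects, then INSERT from it into the destination table. A sketch only, assuming the oss_ext style of OSS external tables; the exact syntax varies by instance version, and the bucket, directory, endpoint, and credential values are placeholders:

```sql
-- Readable external table over the CSV files uploaded to OSS
CREATE READABLE EXTERNAL TABLE oss_t1 (LIKE public.t1)
LOCATION ('oss://oss-cn-hangzhou-internal.aliyuncs.com
    dir=gp_migration/ bucket=<your bucket>
    id=<your AccessKey ID> key=<your AccessKey secret>')
FORMAT 'CSV' ENCODING 'UTF8';

-- Load the data into the destination table
INSERT INTO public.t1 SELECT * FROM oss_t1;
```

Use the internal OSS endpoint of the instance's region where possible to avoid public-network transfer charges.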
Verify migration results
Run both checks to confirm the migration is complete:
Data consistency check. Compare row counts and aggregate values of numeric fields between the source and destination. For large tables, sample-based comparison is acceptable.
Business consistency check. Connect your application to the AnalyticDB for PostgreSQL instance and verify that business queries execute correctly and return expected results.
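For the data consistency check, run the same summary query on the source and the destination and compare the results. A minimal sketch; public.t1 and its amount column are placeholder names, and the aggregates you choose should cover the numeric fields that matter to your business:

```sql
-- Run on both the source and the destination, then compare the rows
SELECT count(*)    AS row_cnt,
       sum(amount) AS amount_sum,
       min(amount) AS amount_min,
       max(amount) AS amount_max
FROM public.t1;
```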