
Migrate from a self-managed Greenplum cluster to AnalyticDB for PostgreSQL

Last Updated: Mar 28, 2026

AnalyticDB for PostgreSQL is developed by Alibaba Cloud based on Greenplum and adds optimizations such as vector computing and transaction processing in a multi-coordinator architecture. It provides the same API operations as Greenplum Community Edition, which makes migration straightforward. This topic walks you through the complete migration process, from choosing a migration method to validating the results.

Before you migrate

Review the following constraints before starting:

  • One-way migration only. Migration does not support continuous synchronization. Pause write operations during migration to avoid performance degradation and data gaps; if writes must continue, backfill the new data after the migration completes.

  • Post-migration statistics. After migration, AnalyticDB for PostgreSQL runs ANALYZE to collect statistics for query planning. CPU load on the coordinator node increases temporarily based on data volume.

  • Node count. Create an AnalyticDB for PostgreSQL instance with at least as many compute nodes as the source Greenplum cluster. When using gpcopy, matching the node count gives the best migration throughput.

  • Extensions. Check which extensions the self-managed Greenplum cluster uses against those supported by AnalyticDB for PostgreSQL. For the full list, see Manage extensions.

  • Syntax compatibility. AnalyticDB for PostgreSQL is partially incompatible with Greenplum 4X syntax. Review and update affected SQL before migration.

  • Partition and inherit tables. Migrate these based on child tables rather than the parent table.
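
The child tables of a partitioned or inherited parent can be enumerated from the pg_inherits catalog so that each one is migrated individually. A minimal sketch, assuming a placeholder parent table named public.sales:

```sql
-- List the child tables of the placeholder parent public.sales;
-- migrate each child table returned here instead of the parent.
SELECT inhrelid::regclass AS child_table
FROM pg_catalog.pg_inherits
WHERE inhparent = 'public.sales'::regclass;
```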

Migration overview

Migration follows these four stages:

  1. Evaluate migration risks and select a migration method.

  2. Purchase a test AnalyticDB for PostgreSQL instance and verify your migration approach.

  3. Purchase a production AnalyticDB for PostgreSQL instance and migrate your production data.

  4. Connect your application to the production instance and verify the migration.

Choose a migration method

  • gpcopy: Use when the self-managed Greenplum cluster can connect directly to the AnalyticDB for PostgreSQL instance.

  • gptransfer: Use when the AnalyticDB for PostgreSQL instance can connect directly to the self-managed Greenplum cluster.

  • OSS foreign table: Use when the self-managed Greenplum cluster and the AnalyticDB for PostgreSQL instance cannot connect to each other.
Note

gpcopy does not support instances in elastic storage mode (V7.0) or Serverless mode. Use gptransfer or the OSS foreign table method for those instance types.

You can also use Alibaba Cloud DataWorks (Data Integration) for migration. For details, see Use Data Integration to migrate and batch synchronize data.

Migrate data using gpcopy

Prerequisites

Before starting, confirm the following:

  • The self-managed Greenplum cluster can reach the AnalyticDB for PostgreSQL instance over the network.

  • A superuser account is available on the Greenplum cluster.

  • The AnalyticDB for PostgreSQL instance is not in elastic storage mode (V7.0) or Serverless mode. gpcopy does not support those instance types.

Step 1: Contact Alibaba Cloud technical support

Submit a ticket and provide:

  • The ID of your AnalyticDB for PostgreSQL instance

  • The IP address and port number of the self-managed Greenplum cluster

  • A superuser account on the Greenplum cluster

  • The IP address of the server where the Greenplum cluster resides

Alibaba Cloud technical support will download and install gpcopy on your AnalyticDB for PostgreSQL instance.

Step 2: Update the IP address whitelist

Add 0.0.0.0/0 to an IP address whitelist of the AnalyticDB for PostgreSQL instance to allow external access.

Warning

0.0.0.0/0 allows all IP addresses to access the instance. Remove this entry and restore a specific IP allowlist immediately after migration.

Step 3: Install gpcopy on the Greenplum server

  1. Download gpcopy on the server where the self-managed Greenplum cluster resides.

    This topic uses gpcopy 2.3.2. Use gpcopy 2.3.x for migration. gpcopy 2.4.x requires SSL encryption; see Configure SSL encryption if you use that version.
  2. In the directory where the gpcopy package is located, run the following script to install gpcopy:

    #!/bin/bash
    
    # Unpack the gpcopy package and install the binaries on the coordinator host.
    tar xzvf gpcopy-2.3.2.tar.gz
    
    cd gpcopy-2.3.2/
    cp gpcopy $GPHOME/bin
    cp gpcopy_helper $GPHOME/bin
    chmod 755 $GPHOME/bin/gpcopy
    chmod 755 $GPHOME/bin/gpcopy_helper
    
    # Collect the distinct host addresses of all primary segments.
    host_sql=$(psql -A -t -c "select address from gp_segment_configuration where role = 'p' and content >= 0 group by address")
    host=(${host_sql})
    host_len=${#host[@]}
    
    # Build a "-h host1 -h host2 ..." argument list for gpscp and gpssh.
    host_str=""
    for ((i=0; i<$host_len; ++i)); do
        host_str=$host_str" -h "${host[$i]}
    done
    
    # Copy gpcopy_helper to every segment host and make it executable.
    gpscp $host_str gpcopy_helper =:$GPHOME/bin
    gpssh $host_str -e "chmod 755 $GPHOME/bin/gpcopy_helper"

Step 4: Run the migration

Run the following command to migrate data:

gpcopy --full \
  --source-host <SOURCEHOST> \
  --source-port <SOURCEPORT> \
  --source-user <SOURCEUSER> \
  --dest-host <DESTHOST> \
  --dest-port <DESTPORT> \
  --dest-user <DESTUSER> \
  --jobs 4 \
  --drop \
  --validate count

Replace the placeholders with actual values:

  • <SOURCEHOST>: IP address of the Greenplum server. Use localhost if you run the command on the Greenplum server.

  • <SOURCEPORT>: Port number of the Greenplum cluster. Default: 5432. Run gpstate -s to check the actual port.

  • <SOURCEUSER>: Superuser account of the Greenplum cluster.

  • <DESTHOST>: IP address of the AnalyticDB for PostgreSQL coordinator node. Contact Alibaba Cloud technical support to obtain this value.

  • <DESTPORT>: Port number of the coordinator node. Contact Alibaba Cloud technical support to obtain this value.

  • <DESTUSER>: Superuser account of the AnalyticDB for PostgreSQL instance. Contact Alibaba Cloud technical support to obtain this value.

Key options:

  • --full: Migrates all user databases. Replace with -d <database name> to migrate a single database.

  • --jobs: Number of tables migrated in parallel. Default: 4. gpcopy opens 2 × <jobs> + 1 database connections, so set this value based on your business requirements and the connection limits of both clusters.

  • --validate count: Validates the migration by comparing row counts between source and destination tables.
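
The --jobs setting determines how many database connections gpcopy opens. A quick sketch of the relation, assuming connections = 2 × jobs + 1:

```shell
# Connections opened by gpcopy for a few --jobs values,
# assuming the relation connections = 2 * jobs + 1.
for jobs in 2 4 8; do
  echo "jobs=$jobs connections=$((2 * jobs + 1))"
done
```

With the default of 4 jobs, make sure both the source cluster and the destination instance allow at least 9 additional connections.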

Step 5: Restore access controls

After migration is complete:

  1. Update the IP address whitelist from 0.0.0.0/0 to the IP addresses required for your business.

  2. Contact Alibaba Cloud technical support to restore the superuser account configuration, security groups, and pg_hba password-free logon settings.

Migrate data using gptransfer

Prerequisites

Before starting, confirm the following:

  • The AnalyticDB for PostgreSQL instance can reach the self-managed Greenplum cluster over the network.

  • A privileged account exists on the AnalyticDB for PostgreSQL instance. If not, create one — see Create a database account.

Steps

  1. Create a privileged account for the AnalyticDB for PostgreSQL instance if one does not already exist.

  2. Add 0.0.0.0/0 to an IP address whitelist of the AnalyticDB for PostgreSQL instance to allow external access.

    Warning

    0.0.0.0/0 allows all IP addresses to access the instance. Remove this entry immediately after migration.

  3. Update the AnalyticDB for PostgreSQL instance to the latest minor version. See View the minor engine version and Update the minor engine version.

  4. Submit a ticket and provide: the instance ID, the IP address and port number of the self-managed Greenplum cluster, an account of the Greenplum cluster that has permissions to migrate data, and the server IP address.

  5. Wait for the migration to complete.

  6. Update the IP address whitelist from 0.0.0.0/0 to the IP addresses required for your business. Verify your data, then connect your application to the instance.

Migrate data using an OSS foreign table

Use this method when the Greenplum cluster and AnalyticDB for PostgreSQL instance cannot connect directly. The process has three stages: schema migration, table data migration, and data verification.

Schema migration

  1. On the master node of the self-managed Greenplum cluster, export the DDL schema using pg_dumpall:

    export PGHOST=<IP address of the self-managed Greenplum cluster>
    export PGPORT=<Port number of the self-managed Greenplum cluster>
    export PGUSER=<Superuser account of the self-managed Greenplum cluster>
    export PGPASSWORD=<Password of the superuser account>
    
    # For Greenplum 4X
    pg_dumpall -s -q --gp-syntax > full_schema.sql
    
    # For Greenplum 5X or 6X
    pg_dumpall -s --gp-syntax > full_schema.sql
  2. Check for custom libraries. Connect to the Greenplum cluster using psql and run:

    SELECT * FROM pg_catalog.pg_library;

    If custom libraries exist, create the corresponding schema manually on the AnalyticDB for PostgreSQL instance before proceeding.

  3. Apply the DDL schema to the AnalyticDB for PostgreSQL instance:

    export PGHOST=<IP address of the AnalyticDB for PostgreSQL instance>
    export PGPORT=<Port number of the AnalyticDB for PostgreSQL instance>
    export PGUSER=<Privileged account of the AnalyticDB for PostgreSQL instance>
    export PGPASSWORD=<Password of the privileged account>
    
    psql postgres -f full_schema.sql > psql.log 2>&1 &

    Check psql.log for errors. Most errors relate to SQL syntax differences, especially for Greenplum 4X. For compatibility details, see Compatibility comparisons between AnalyticDB for PostgreSQL V4.3 and V6.0 and Check for incompatibility between AnalyticDB for PostgreSQL V4.3 and V6.0.
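
A quick way to triage psql.log is to count the ERROR lines and deduplicate the messages. A minimal sketch, using a fabricated two-error log in place of a real psql.log:

```shell
# Fabricated sample standing in for a real psql.log (illustrative only).
cat > psql.log <<'EOF'
CREATE TABLE
psql:full_schema.sql:42: ERROR:  syntax error at or near "DISTRIBUTED"
CREATE TABLE
psql:full_schema.sql:97: ERROR:  syntax error at or near "DISTRIBUTED"
EOF

# Total ERROR lines, then each distinct error message once.
grep -c 'ERROR' psql.log
grep 'ERROR' psql.log | sed 's/^psql:[^:]*:[0-9]*: //' | sort -u
```

Fixing each distinct message usually resolves many failing statements at once, since the same incompatibility tends to repeat across tables.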

Table data migration

Step 1: Export data from the Greenplum cluster

Choose one of the following export methods:

Option A: COPY TO

Run the COPY TO statement on the psql client:

-- Works with Greenplum 4X, 5X, and 6X
COPY public.t1 TO '/data/gpload/public_t1.csv' FORMAT CSV ENCODING 'UTF8';

For Greenplum 5X or 6X, you can use ON SEGMENT to export in parallel:

COPY public.t1 TO '<SEG_DATA_DIR>/public_t1_<SEGID>.csv' FORMAT CSV ENCODING 'UTF8' ON SEGMENT;
Do not modify the <SEGID> placeholder — the system replaces it automatically during export. You can keep <SEG_DATA_DIR> as-is or replace it with an absolute path.
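
Per-segment exports produce one file per segment. If you prefer to upload a single object per table, the files can be concatenated after they are collected onto one host. A minimal sketch with two stand-in segment files (names and contents are illustrative):

```shell
# Stand-in per-segment exports (illustrative contents only).
printf '1,alpha\n2,beta\n' > public_t1_0.csv
printf '3,gamma\n' > public_t1_1.csv

# Concatenate the per-segment files into a single CSV for upload.
cat public_t1_*.csv > public_t1.csv
wc -l < public_t1.csv
```

Concatenation is optional: an OSS foreign table can typically read all objects under a common prefix, so you can also upload the per-segment files as they are.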

Option B: gpfdist external table

  1. Start the gpfdist service:

    mkdir -p /data/gpload
    gpfdist -d /data/gpload -p 8000 &
  2. Connect to the Greenplum cluster using psql and export the data:

    -- Create a writable external table for public.t1
    CREATE WRITABLE EXTERNAL TABLE ext_w_t1 (LIKE public.t1)
    LOCATION ('gpfdist://<IP address of the self-managed Greenplum cluster>:8000/public_t1.csv')
    FORMAT 'CSV' ENCODING 'UTF8';
    
    -- Write the table data to the external table
    INSERT INTO ext_w_t1 SELECT * FROM public.t1;

Step 2: Upload data to Object Storage Service (OSS)

Upload the exported files to an OSS bucket. The bucket must be in the same region as the AnalyticDB for PostgreSQL instance. For upload instructions, see Upload objects.

Step 3: Import data into AnalyticDB for PostgreSQL

Use an OSS foreign table to load the data from OSS into your AnalyticDB for PostgreSQL instance. See the Import OSS data to an AnalyticDB for PostgreSQL table section for details.
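
As a rough illustration only, the import step typically defines a readable OSS external table over the uploaded files and inserts from it into the target table. The LOCATION option names and values below (endpoint, dir, id, key, bucket) are assumptions based on the oss_ext syntax; confirm the exact form in the linked topic:

```sql
-- Rough sketch; the LOCATION option names and values are placeholders.
CREATE READABLE EXTERNAL TABLE oss_r_t1 (LIKE public.t1)
LOCATION('oss://<OSS endpoint>
    dir=gpload/ id=<AccessKey ID> key=<AccessKey secret> bucket=<bucket name>')
FORMAT 'CSV' ENCODING 'UTF8';

INSERT INTO public.t1 SELECT * FROM oss_r_t1;
```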

Verify migration results

Run both checks to confirm the migration is complete:

  • Data consistency check. Compare row counts and aggregate values of numeric fields between the source and destination. For large tables, sample-based comparison is acceptable.

  • Business consistency check. Connect your application to the AnalyticDB for PostgreSQL instance and verify that business queries execute correctly and return expected results.
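
The row-count comparison can be scripted once per-table counts have been exported from each side, for example with psql -A -t -c "SELECT count(*) FROM <table>". A minimal sketch over two hypothetical count files:

```shell
# Hypothetical per-table row counts, one "table,count" line per table,
# exported separately from the source cluster and the destination instance.
cat > source_counts.csv <<'EOF'
public.t1,1000000
public.t2,25000
EOF
cat > dest_counts.csv <<'EOF'
public.t1,1000000
public.t2,25000
EOF

# Any differing line indicates a table whose row count does not match.
if diff source_counts.csv dest_counts.csv >/dev/null; then
  echo "row counts match"
else
  echo "row counts differ"
fi
```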

What's next