
Data Transmission Service: Synchronize data from an RDS for PostgreSQL instance to a SelectDB instance

Last Updated: Mar 28, 2026

Data Transmission Service (DTS) synchronizes data from a PostgreSQL database — including self-managed PostgreSQL databases and RDS for PostgreSQL instances — to a SelectDB instance for large-scale analytics. This topic uses an RDS for PostgreSQL instance as the source.

How it works

A DTS synchronization task runs three phases in sequence:

  1. Schema synchronization — DTS replicates the table structure from the source to the destination.

  2. Full data synchronization — DTS copies all existing data from the source to the destination as the baseline.

  3. Incremental data synchronization — DTS captures ongoing changes from the PostgreSQL write-ahead log (WAL) using logical replication and applies them to the destination.

Incremental data synchronization is charged. Schema synchronization and full data synchronization are free. For details, see Billing overview.

Prerequisites

Before you begin, make sure you have:

  • A destination SelectDB instance with storage space larger than the data size used by the source RDS for PostgreSQL instance. To create one, see Create an instance.

  • The wal_level parameter of the source RDS for PostgreSQL instance set to logical. To set this, see Set instance parameters.
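
You can confirm the current setting with a query like the following. (On an RDS for PostgreSQL instance, the parameter itself must be changed in the console, not with ALTER SYSTEM; a restart is required for the change to take effect.)

```sql
-- Check whether logical replication is enabled on the source.
-- The expected value is: logical
SHOW wal_level;

-- Equivalent catalog query:
SELECT setting FROM pg_settings WHERE name = 'wal_level';
```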

Required permissions

Database | Required permissions | How to grant
Source RDS for PostgreSQL instance | A privileged account that is the Owner of the database to be synchronized | Create an account and Create a database
Destination SelectDB instance | Cluster access permission (Usage_priv) and database read/write permissions (Select_priv, Load_priv, Alter_priv, Create_priv, Drop_priv) | Cluster permission management and Basic permission management
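
As a sketch of the destination-side grants, assuming the Apache Doris-compatible GRANT syntax that SelectDB exposes (dts_user, demo_db, and new_cluster are placeholder names):

```sql
-- Grant cluster access, then database read/write privileges.
-- dts_user, demo_db, and new_cluster are placeholders.
GRANT USAGE_PRIV ON CLUSTER new_cluster TO dts_user;
GRANT SELECT_PRIV, LOAD_PRIV, ALTER_PRIV, CREATE_PRIV, DROP_PRIV
    ON demo_db.* TO dts_user;
```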

Supported SQL operations

Operation type | SQL statements
DML | INSERT, UPDATE, DELETE
DDL | ADD COLUMN, DROP COLUMN
DDL operations that modify multiple columns at once, and consecutive DDL operations on the same table, are not supported.
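
For example, assuming a hypothetical source table named orders:

```sql
-- Supported: one column change per DDL statement.
ALTER TABLE orders ADD COLUMN note VARCHAR(32);
ALTER TABLE orders DROP COLUMN note;

-- Not supported: modifying multiple columns in one statement.
-- ALTER TABLE orders ADD COLUMN a INT, ADD COLUMN b INT;
```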

Limitations

Review the following limitations before configuring the synchronization task.

Source database requirements

  • Primary keys and UNIQUE constraints — All tables to be synchronized must have a primary key or a UNIQUE constraint to avoid duplicate data in the destination.

    • If the destination table was not created by DTS (that is, you did not select Schema synchronization for Synchronization types), make sure the destination table has the same primary key or non-empty UNIQUE constraint as the source table. Otherwise, duplicate data may appear in the destination.

    • For tables that have neither a primary key nor a UNIQUE constraint: select Schema synchronization for Synchronization types, then set Engine to duplicate for those tables on the Configurations for databases, tables, and columns step. If you skip this, the task may fail or data may be lost.

      Note

      During initial schema synchronization, DTS adds columns to the destination table. For details, see Additional columns.

  • Database names — Database names to be synchronized cannot contain hyphens (-). For example, dts-testdata is invalid.

  • Large numbers of tables — If you synchronize at the table level, edit objects (such as mapping column names), and the total number of tables in a single task exceeds 5,000, split the tables into multiple tasks or configure the task to synchronize the entire database. Exceeding this limit may cause a request error when you submit the task.

  • WAL log retention — DTS reads incremental changes from the WAL. If the logs are purged before DTS processes them, the task fails and may result in data inconsistency or data loss that is not covered by the DTS service level agreement (SLA). You can change the log retention period to more than 24 hours after initial full data synchronization completes.

    Synchronization type | Minimum WAL retention period
    Incremental data synchronization only | More than 24 hours
    Full data synchronization + incremental data synchronization | At least 7 days
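
    To gauge how much WAL each replication slot is holding back on the source, you can run a query like the following (valid on PostgreSQL 10 and later). Rapid growth of retained_wal suggests the consumer is falling behind or the slot is orphaned.

```sql
-- Show the WAL volume retained by each replication slot.
SELECT slot_name,
       active,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots;
```
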
  • Failover behavior — If the source is a self-managed PostgreSQL database and a failover occurs, the synchronization task fails. For RDS for PostgreSQL instances, enable Logical Replication Slot Failover to prevent replication interruptions during failovers. For details, see Logical Replication Slot Failover.

  • Large incremental records — If a single incremental change exceeds 256 MB after encoding, the synchronization instance fails and cannot be recovered. Reconfigure the synchronization instance.

  • DDL operations during synchronization — Do not run Data Definition Language (DDL) operations that change database or table schemas during schema synchronization or full data synchronization. Such operations cause the task to fail.

    During full data synchronization, DTS queries the source database and acquires metadata locks. This may block DDL operations on the source.
  • Long-running transactions — If the source database has long-running transactions during incremental synchronization, WAL logs generated before those transactions are committed cannot be cleared. This causes WAL logs to accumulate and may exhaust the disk space of the source database.
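
    You can spot long-running transactions on the source with a query like the following (the 10-minute threshold is an example; adjust it to your workload):

```sql
-- Find transactions that have been open for more than 10 minutes.
SELECT pid,
       now() - xact_start AS open_for,
       state,
       left(query, 80) AS current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '10 minutes'
ORDER BY open_for DESC;
```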

  • Major version upgrades — Do not perform a major engine version upgrade on the source database while a synchronization task is running. The task fails and cannot be recovered. Reconfigure the synchronization instance.

Destination and configuration requirements

  • Supported table engines — DTS can only synchronize data to tables that use the Unique or Duplicate engine in SelectDB.

  • Tables using the Unique engine — All unique keys in the destination table must also exist in the source table and be included as synchronization objects. Otherwise, data inconsistency may occur.

  • Tables using the Duplicate engine — Duplicate data may appear in the destination in these scenarios: Use the additional columns (_is_deleted, _version, and _record_id) to identify and remove duplicates. For column details, see Additional columns.

    • The synchronization instance has been retried.

    • The synchronization instance has been restarted.

    • Two or more DML operations are performed on the same data record after the task starts.

    When the destination table uses the Duplicate engine, DTS converts UPDATE and DELETE statements to INSERT statements.
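
    One way to read back a deduplicated view of such a table is to keep only the latest surviving version of each row, using the additional columns. This is a sketch assuming window functions are available in the destination; orders and id are placeholders for your table and its logical key.

```sql
-- Latest surviving version of each row in a Duplicate-engine table.
SELECT *
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY id
               ORDER BY _version DESC, _record_id DESC
           ) AS rn
    FROM orders
) t
WHERE rn = 1
  AND _is_deleted = 0;
```
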
  • `bucket_count` parameter — In the Selected objects box, you can only set the bucket_count parameter. The value must be a positive integer. The default is auto.

  • Naming requirements — SelectDB only supports database and table names that start with a letter. If a name does not start with a letter, use the mapping feature to rename it. Similarly, if a synchronization object name (database, table, or column) contains Chinese characters, use the mapping feature to rename it to an English name. Otherwise, the task may fail.

  • Single database per task — One synchronization instance can synchronize only one database. To synchronize multiple databases, configure a separate instance for each database.

  • Unsupported object types — DTS does not synchronize TimescaleDB extension tables, tables with cross-schema inheritance, or tables that contain expression-based unique indexes.

  • REPLICA IDENTITY FULL requirement — To keep data consistent, set REPLICA IDENTITY to FULL for the tables to be synchronized by running the command below before writing data to them. Do not perform table locking operations while this command runs, or the tables may be locked. If you skip the related precheck items, DTS runs this command automatically during instance initialization. Run this command when:

    • The instance runs for the first time.

    • The synchronization granularity is schema, and a new table is created in the synchronized schema, or a table is rebuilt using RENAME.

    • You use the Modify objects feature.

    ALTER TABLE schema.table REPLICA IDENTITY FULL;

    Replace schema and table with the actual schema and table names. Run this command during off-peak hours.
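
    To verify the setting afterwards, you can inspect the relreplident flag in the system catalog (again replacing schema.table with the actual names):

```sql
-- relreplident is 'f' once REPLICA IDENTITY FULL is in effect
-- ('d' = default, 'n' = nothing, 'i' = index).
SELECT relreplident
FROM pg_class
WHERE oid = 'schema.table'::regclass;
```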

  • Full data synchronization load — During full data synchronization, DTS uses read and write resources on both the source and destination databases, which increases database load. Evaluate the performance of both databases before starting, and schedule synchronization during off-peak hours when CPU load is below 30%.

  • Backend node changes — Do not add backend (BE) nodes to the SelectDB database during synchronization. Doing so causes the task to fail. Restart the synchronization instance to resume.

  • New clusters — Do not create a new cluster in the destination SelectDB instance during synchronization. Doing so causes the task to fail. Restart the synchronization instance to resume.

  • Multi-table merge — In a multi-table merge scenario where data from multiple source tables is synchronized to a single destination table, all source tables must have the same schema. Otherwise, data inconsistency or task failure may occur.

  • Metadata validation — DTS validates data content but does not validate metadata such as sequences. Validate metadata manually.

  • Partitioned tables — Include both the parent table and all its child tables as synchronization objects. The parent table of a PostgreSQL partitioned table does not store data directly — all data resides in the child tables. Omitting child tables results in data inconsistency.

  • Temporary tables — DTS creates the following temporary tables in the source database to capture DDL statements, table structures, and heartbeat information. Do not delete these tables during synchronization, or the task becomes abnormal. They are deleted automatically when the DTS instance is released: public.dts_pg_class, public.dts_pg_attribute, public.dts_pg_type, public.dts_pg_enum, public.dts_postgres_heartbeat, public.dts_ddl_command, public.dts_args_session, and public.aliyun_dts_instance.

  • Replication slots — DTS creates a replication slot with the prefix dts_sync_ in the source database and uses it to fetch incremental logs. When the task fails or the instance is released, DTS attempts to clear this slot automatically within 15 minutes.

    Important

    If you change the database account password or remove DTS IP addresses from the source database whitelist during synchronization, the replication slot cannot be cleared automatically. Clear it manually to prevent disk space accumulation. If a failover occurs on the source database, log on to the secondary database to clear the replication slot manually.
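
    To clear a leftover slot manually, list the DTS slots and drop the inactive ones; dts_sync_xxxx below is a placeholder for the actual slot name returned by the first query.

```sql
-- List leftover DTS replication slots.
SELECT slot_name, active
FROM pg_replication_slots
WHERE slot_name LIKE 'dts_sync_%';

-- Drop a slot that is no longer in use (active = f).
SELECT pg_drop_replication_slot('dts_sync_xxxx');
```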


  • Incremental synchronization latency — DTS uses a batch synchronization policy to reduce load on the destination. By default, DTS writes data at most once every 5 seconds per synchronization object, resulting in typical latency within 10 seconds. To reduce latency, modify the selectdb.reservoir.timeout.milliseconds parameter of the DTS instance in the console. The valid range is 1,000–10,000 milliseconds.

    Reducing the batching interval increases write frequency. This may increase the load and write response time of the destination instance, which in turn increases synchronization latency. Adjust based on the actual load of the destination.
  • Task recovery — If the task fails, DTS technical support attempts recovery within 8 hours. During recovery, DTS may restart the task or adjust task parameters (not database parameters). Parameters that may be modified include those described in Modify instance parameters.

Special cases

Scenario | Requirement
Source is an RDS for PostgreSQL instance | Do not change the endpoint or zone of the instance during synchronization. Doing so causes the task to fail.
Source is a self-managed PostgreSQL database | Make sure the values of max_wal_senders and max_replication_slots are each greater than the sum of existing replication slots and the number of DTS instances using this database as the source.
Source is Google Cloud SQL for PostgreSQL | The database account must have the cloudsqlsuperuser permission and must be authorized to manage all objects selected for synchronization (or granted the Owner permission via GRANT <owner_of_the_object> TO <source_database_account>). An account with cloudsqlsuperuser permission cannot manage objects owned by another cloudsqlsuperuser account.
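
For a self-managed source, you can compare the configured limits with the slots already in use:

```sql
-- Configured limits on the source.
SHOW max_wal_senders;
SHOW max_replication_slots;

-- Replication slots currently in existence.
SELECT count(*) AS existing_slots FROM pg_replication_slots;
```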

Create a synchronization task

Step 1: Go to the data synchronization task list

Use one of the following methods to navigate to the synchronization task list in the destination region.

From the DTS console:

  1. Log on to the Data Transmission Service (DTS) console.

  2. In the left navigation pane, click Data Synchronization.

  3. In the upper-left corner, select the region where the synchronization instance resides.

From the DMS console:

The actual steps may vary based on the mode and layout of the DMS console. For details, see Simple mode console and Customize the layout and style of the DMS UI.
  1. Log on to Data Management (DMS).

  2. On the top menu bar, choose Data + AI > Data Transmission (DTS) > Data Synchronization.

  3. To the right of Data Synchronization Tasks, select the region where the synchronization instance resides.

Step 2: Configure the source and destination databases

  1. Click Create Task.

  2. Configure the source and destination databases.

    Task name: DTS generates a task name automatically. Specify a descriptive name for easy identification. The name does not need to be unique.

    Source database

    Parameter | Description
    Select existing connection | Select a saved database instance from the list if you have added it to the system. Otherwise, configure the connection manually. In the DMS console, this parameter is named Select a DMS database instance.
    Database type | Select PostgreSQL.
    Access method | Select Alibaba Cloud Instance.
    Instance region | Select the region where the source RDS for PostgreSQL instance resides.
    Replicate data across Alibaba Cloud accounts | Select No if the source instance belongs to the current Alibaba Cloud account.
    Instance ID | Select the ID of the source RDS for PostgreSQL instance.
    Database name | Enter the name of the database containing the objects to be synchronized.
    Database account | Enter the database account. For required permissions, see Required permissions.
    Database password | Enter the password for the database account.

    Destination database

    Parameter | Description
    Select existing connection | Select a saved database instance from the list if you have added it to the system. Otherwise, configure the connection manually. In the DMS console, this parameter is named Select a DMS database instance.
    Database type | Select SelectDB.
    Access method | Select Alibaba Cloud Instance.
    Instance region | Select the region where the destination SelectDB instance resides.
    Replicate data across Alibaba Cloud accounts | Select No if the destination instance belongs to the current Alibaba Cloud account.
    Instance ID | Select the ID of the destination SelectDB instance.
    Database account | Enter the database account. For required permissions, see Required permissions.
    Database password | Enter the password for the database account.
  3. Click Test Connectivity and Proceed.

    Make sure the IP address blocks of DTS servers are added to the security settings of the source and destination databases. This can be done automatically or manually. For details, see Add the IP address blocks of DTS servers to a whitelist. If the source or destination is a self-managed database (where Access method is not Alibaba Cloud Instance), also click Test Connectivity in the CIDR blocks of DTS servers dialog box.

Step 3: Configure synchronization objects

On the Configure objects page, set the following parameters.

Parameter | Description
Synchronization types | Incremental Data Synchronization is selected by default. Also select Schema Synchronization and Full Data Synchronization. After the precheck passes, DTS initializes all selected objects from the source and uses them as the baseline for incremental synchronization.

Important

When synchronizing from PostgreSQL to SelectDB, data types are converted. If you do not select Schema Synchronization, create Unique or Duplicate model tables with the correct schemas in the destination SelectDB instance in advance. For details, see Data type mapping and Data models.

Processing mode of conflicting tables | Precheck and report errors: DTS checks whether a table with the same name exists in the destination. If one exists, the precheck fails and the task does not start. To resolve this without deleting the table, use the mapping feature to rename it. See Map schemas, tables, and columns. Ignore errors and proceed: skips the check.

Warning

This may cause data inconsistency. If schemas match, records with the same primary key or unique key in the destination are overwritten. If schemas differ, initialization may fail or only partial data may be synchronized.

Capitalization of object names in destination instance | Configure the case sensitivity policy for database, table, and column names synchronized to the destination. The default is the DTS default policy. For details, see Case sensitivity policy for destination object names.
Source objects | Click an object in the Source objects box, then click the right arrow icon to move it to the Selected objects box. Select objects at the schema or table level.
Selected objects | Right-click an object to rename it, set a WHERE clause to filter rows, or select which SQL operations to include for incremental synchronization. To remove an object, click it and then click the left arrow icon. If you selected Schema Synchronization and added objects at the table level, right-click a table to set the bucket_count parameter under Parameter settings.

Note

Using the object name mapping feature may cause synchronization failures for other objects that depend on the renamed object.

Click Next: Advanced Settings.

Step 4: Configure advanced settings (optional)

Parameter | Description
Dedicated cluster for task scheduling | By default, DTS schedules tasks on a shared cluster. For more stable performance, purchase a dedicated cluster. See What is a DTS dedicated cluster?.
Retry time for failed connections | After the task starts, if the connection to the source or destination fails, DTS retries for up to 720 minutes by default. Set a custom duration between 10 and 1,440 minutes. Set this to 30 minutes or more. If the reconnection succeeds within the retry period, the task resumes automatically.

Note

If multiple DTS instances share the same source or destination, the shortest retry time among those instances applies to all. DTS charges for task runtime during the retry period.

Retry time for other issues | If non-connectivity issues occur (such as DDL or DML execution errors), DTS retries for up to 10 minutes by default. Set a custom duration between 1 and 1,440 minutes. Set this to 10 minutes or more.

Important

This value must be less than the retry time for failed connections.

Enable throttling for full data synchronization | Limit the read rate to the source and write rate to the destination during full synchronization to reduce database load. Configure Queries per second (QPS) to the source database, RPS of full data migration, and Data migration speed for full migration (MB/s). This setting is available only when Full Data Synchronization is selected. You can also adjust the full synchronization rate after the instance is running.
Enable throttling for incremental data synchronization | Limit the write rate to the destination during incremental synchronization. Configure RPS of incremental data synchronization and Data synchronization speed for incremental synchronization (MB/s).
Environment tag | Select an environment tag to identify the instance. This setting is optional.
Configure ETL | Enable the extract, transform, and load (ETL) feature to transform data during synchronization. Select Yes and enter processing statements in the code editor. For details, see Configure ETL in a data migration or data synchronization task.
Monitoring and alerting | Configure alerts for task failures or latency threshold breaches. Select Yes, then specify the alert threshold and notification recipients. For details, see Configure monitoring and alerting during task configuration.

Step 5: Configure database and table fields (optional)

If you selected Schema Synchronization, click Next: Configure Database and Table Fields to set the Primary Key Column, Distribution Key, and Engine for each destination table.

  • Set Definition status to All to view and modify all tables.

  • Select one or more columns for Primary Key Column. The Distribution Key must include at least one column from the primary key.

  • For tables without a primary key or UNIQUE constraint, set Engine to duplicate. Otherwise, the task may fail or data may be lost.

Step 6: Run the precheck and purchase the instance

  1. Click Next: Save Task Settings and Precheck.

    To preview the OpenAPI parameters for this task, move the pointer over the Next: Save Task Settings and Precheck button and click Preview OpenAPI parameters.
  2. Wait for the precheck to complete. The task can start only after all precheck items pass.

    • If a precheck item fails, click View Details, fix the issue, and run the precheck again.

    • If a precheck item generates a warning that can be ignored, click Confirm Alert Details, Ignore, OK, and Precheck Again in sequence. Ignoring warnings may cause data inconsistency.

  3. When Success Rate reaches 100%, click Next: Purchase Instance.

  4. On the Purchase page, select the billing method and link specification.

    Parameter | Description
    Billing method | Subscription: Pay upfront for a set duration. Suitable for long-term use and more cost-effective than pay-as-you-go. Longer subscriptions offer higher discounts. Pay-as-you-go: Charged hourly. Suitable for short-term use. Release the instance after use to stop charges.
    Resource group configuration | The resource group to which the instance belongs. The default is the default resource group. See What is Resource Management?.
    Link specification | The synchronization performance level. Higher specifications support higher synchronization rates. See Data synchronization link specifications.
    Subscription duration | (Subscription only) Choose monthly (1–9 months) or yearly (1, 2, 3, or 5 years).
  5. Read the Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start. In the confirmation dialog box, click OK.

You can monitor task progress on the Data Synchronization page.

Data type mapping

Category | PostgreSQL data type | SelectDB data type
Numeric | SMALLINT | SMALLINT
Numeric | INTEGER | INT
Numeric | BIGINT | BIGINT
Numeric | DECIMAL | DECIMAL
Numeric | NUMERIC | DECIMAL
Numeric | REAL | DOUBLE
Numeric | DOUBLE PRECISION | DOUBLE
Numeric | SMALLSERIAL | SMALLINT
Numeric | SERIAL | INT
Numeric | BIGSERIAL | BIGINT
Monetary | MONEY | STRING
Character | CHAR(n), VARCHAR(n) | VARCHAR

Important

To prevent data loss, CHAR(n) and VARCHAR(n) are converted to VARCHAR(4*n). If no length is specified, the SelectDB default VARCHAR(65533) is used. If the data length exceeds 65,533 characters, the data is converted to STRING.

Character | TEXT | STRING
Binary | BYTEA | STRING
Date and time | TIMESTAMP [(P)] [WITHOUT TIME ZONE] | DATETIMEV2
Date and time | TIMESTAMP [(P)] WITH TIME ZONE | DATETIMEV2
Date and time | DATE | DATEV2
Date and time | TIME [(P)] [WITHOUT TIME ZONE] | VARCHAR(50)
Date and time | TIME [(P)] WITH TIME ZONE | VARCHAR(50)
Date and time | INTERVAL [FIELDS] [(P)] | STRING
Boolean | BOOLEAN | BOOLEAN
Geometric | POINT, LINE, LSEG, BOX, PATH, POLYGON, CIRCLE | STRING
Network address | CIDR, INET, MACADDR, MACADDR8 | STRING
Text search | TSVECTOR | STRING
XML | XML | STRING
JSON | JSON | JSON

Additional columns

DTS automatically adds the following columns to destination tables that use the Duplicate engine. If the table was not created by DTS, add these columns manually.

Column | Data type | Default value | Description
_is_deleted | Int | 0 | Indicates whether the row is deleted. INSERT: 0. UPDATE: 0. DELETE: 1.
_version | Bigint | 0 | For full data synchronization: 0. For incremental data synchronization: the timestamp (in seconds) from the source WAL entry.
_record_id | Bigint | 0 | For full data synchronization: 0. For incremental data synchronization: the unique record ID of the incremental log entry. This value is unique and increments monotonically.
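
If you pre-create a Duplicate-engine table yourself, the three columns can be declared alongside your own. The following is a sketch using the Apache Doris-compatible DDL that SelectDB accepts; demo_db.orders, its business columns, and the bucket count are placeholders.

```sql
CREATE TABLE demo_db.orders (
    id          BIGINT          NOT NULL,
    amount      DECIMAL(10, 2),
    -- DTS bookkeeping columns described above:
    _is_deleted INT             DEFAULT "0",
    _version    BIGINT          DEFAULT "0",
    _record_id  BIGINT          DEFAULT "0"
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 10;
```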

What's next