
Data Transmission Service: Synchronize data from PolarDB for PostgreSQL to SelectDB

Last Updated: Mar 28, 2026

Use Data Transmission Service (DTS) to synchronize data from a PolarDB for PostgreSQL cluster to an ApsaraDB for SelectDB instance for large-scale data analytics. DTS supports schema synchronization, full data synchronization, and ongoing incremental data synchronization.

Prerequisites

Before you begin, make sure that you have:

  • A destination ApsaraDB for SelectDB instance with more disk space than the amount used by the source PolarDB for PostgreSQL cluster. For more information, see Create an instance.

  • The wal_level parameter of the source PolarDB for PostgreSQL cluster set to logical. For more information, see Set cluster parameters.
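
To confirm the second prerequisite, you can run the following statement on the source cluster with psql or a similar client. This is a standard PostgreSQL command, not a DTS-specific one:

```sql
-- Check that logical decoding is enabled on the source cluster.
SHOW wal_level;
-- The result must be 'logical'. If it is not, change the wal_level
-- parameter in the console; the change requires a cluster restart.
```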

Billing

Synchronization type | Fee
Schema synchronization and full data synchronization | Free of charge
Incremental data synchronization | Charged. For more information, see Billing overview.

Supported SQL operations

Operation type | SQL statements
DML | INSERT, UPDATE, DELETE
DDL | ADD COLUMN, DROP COLUMN

Required permissions

Database | Required permissions | How to grant
Source PolarDB for PostgreSQL cluster | A privileged account that owns the database to be synchronized | Create a database account and Database Management
Destination ApsaraDB for SelectDB instance | Cluster access permission (Usage_priv) and database access permissions (Select_priv, Load_priv, Alter_priv, Create_priv, and Drop_priv) | Cluster Permission Management and Basic Permission Management

Limitations

Source database

  • All tables to be synchronized must have a primary key or a non-null unique index. If table fields are not unique, duplicate data may appear in the destination database.

  • For tables without a primary key or a non-null unique index, select Schema Synchronization for Synchronization Types and set Engine to duplicate in the Configurations for Databases, Tables, and Columns step.

  • Long-running transactions in the source database can cause write-ahead log (WAL) files to accumulate before those transactions commit, which may exhaust disk space on the source database.

  • The PolarDB for PostgreSQL cluster must support and have Logical Replication Slot Failover enabled. If the cluster uses PostgreSQL 14, which does not support Logical Replication Slot Failover, a high-availability (HA) switchover may cause the synchronization instance to fail permanently and require full reconfiguration.

  • If a single incremental change exceeds 256 MB, the synchronization instance may fail permanently. Reconfigure the synchronization instance to recover.

  • Do not run DDL operations that alter schemas during initial schema synchronization or initial full data synchronization. DTS queries the source database during initial full data synchronization, which creates metadata locks that can block DDL operations.
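
As a rough way to audit the primary-key limitation above, the following catalog query lists regular tables in user schemas that have neither a primary key nor a unique index. It does not verify that unique index columns are non-null, so review the results manually:

```sql
-- List tables that lack both a primary key and a unique index.
SELECT n.nspname AS schema_name, c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
      SELECT 1
      FROM pg_index i
      WHERE i.indrelid = c.oid
        AND (i.indisprimary OR i.indisunique)
  );
```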

Destination database

SelectDB data model requirements

DTS can synchronize data only to tables that use the Unique Key model or the Duplicate key model in ApsaraDB for SelectDB. Choose the model based on your requirements:

Requirement | Recommended model | Notes
Consistent, deduplicated records | Unique Key model | All unique keys of the destination table must exist in the source table and the synchronization object. Missing keys cause data inconsistency.
Append-only or high-throughput ingestion | Duplicate key model | Duplicate data may appear if a retry occurs or two or more DML operations run on the same row after the synchronization instance starts. Use _is_deleted, _version, and _record_id columns to deduplicate. DTS converts UPDATE and DELETE statements to INSERT statements for Duplicate key model tables.
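
For Duplicate key model tables, deduplication with the three helper columns can be sketched as follows. The table name dest_table and key column id are placeholders for your own schema:

```sql
-- Latest state per key: keep the newest record and drop deleted rows.
SELECT *
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY id
               ORDER BY _version DESC, _record_id DESC
           ) AS rn
    FROM dest_table
) t
WHERE rn = 1
  AND _is_deleted = 0;
```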

Object and naming constraints

  • Only the bucket_count parameter can be specified in the Selected Objects section. The value must be a positive integer. Default value: auto.

  • ApsaraDB for SelectDB supports only database and table names that start with a letter. Use the object name mapping feature to rename objects that do not start with a letter.

  • Object names (databases, tables, or columns) that contain Chinese characters must be renamed using the object name mapping feature.

  • A single synchronization instance can synchronize only one database. To synchronize multiple databases, configure a separate synchronization instance for each database.

Synchronization scope and DDL limits

  • Synchronization of TimescaleDB extension tables and tables with cross-schema inheritance is not supported.

  • DDL operations that modify multiple columns in a single statement are not supported. Consecutive DDL operations on the same table are also not supported.

  • In a multi-table merge scenario (synchronizing multiple source tables to a single destination table), the schemas of all source tables must be identical. Otherwise, data inconsistency or task failure may occur.

Partitioned tables

When synchronizing partitioned tables, include both the parent table and all its child partitions as synchronization objects. The parent table of a PostgreSQL partitioned table does not store data directly—all data is in child partitions. Omitting partitions causes data inconsistency.
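
To enumerate the child partitions of a parent table so that none is omitted, you can query pg_inherits on the source database. The name public.orders below is a placeholder:

```sql
-- List all child partitions of a partitioned parent table.
SELECT inhrelid::regclass AS child_partition
FROM pg_inherits
WHERE inhparent = 'public.orders'::regclass;
```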

REPLICA IDENTITY requirement

Run the following command on the relevant tables before data is written to them in these three scenarios:

  • The synchronization instance runs for the first time.

  • You select objects at the schema level and a new table is created or an existing table is rebuilt using RENAME.

  • You use the feature to modify synchronization objects.

ALTER TABLE schema.table REPLICA IDENTITY FULL;

Replace schema and table with the actual schema name and table name. Run this command during off-peak hours. Do not perform table lock operations while this command is running. If you skip the related precheck, DTS automatically runs this command during instance initialization.
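
To verify the setting afterwards, the relreplident column of pg_class reports the replica identity ('f' means FULL). As above, replace schema.table with the actual names:

```sql
-- Confirm that REPLICA IDENTITY FULL is in effect ('f' = FULL).
SELECT relname, relreplident
FROM pg_class
WHERE oid = 'schema.table'::regclass;
```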

Operations that cause task failure

The following operations cause the synchronization task to fail. Restart the synchronization instance to resume:

  • Adding backend nodes to the destination ApsaraDB for SelectDB database during synchronization

  • Creating clusters in the destination ApsaraDB for SelectDB instance during synchronization

Temporary tables

DTS creates the following temporary tables in the source database to support incremental synchronization. Do not delete them during synchronization—they are automatically removed after the DTS instance is released:

public.dts_pg_class, public.dts_pg_attribute, public.dts_pg_type, public.dts_pg_enum, public.dts_postgres_heartbeat, public.dts_ddl_command, public.dts_args_session, and public.aliyun_dts_instance.

Replication slot

DTS creates a replication slot with the prefix dts_sync_ in the source database. This slot retains incremental logs for the last 15 minutes. When the synchronization task fails or the instance is released, DTS attempts to clean up the slot automatically.

If you change the password of the source database account or remove the DTS IP address whitelist during synchronization, the replication slot cannot be cleaned up automatically. Manually clean up the replication slot to prevent disk space from being exhausted. If a primary/secondary switchover occurs, log on to the secondary database to perform the cleanup.
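
A manual cleanup can be sketched with the standard PostgreSQL replication views and functions. The exact slot name varies per instance, so look it up first; dts_sync_example below is a placeholder:

```sql
-- Find DTS replication slots and the WAL position they retain.
SELECT slot_name, active, restart_lsn
FROM pg_replication_slots
WHERE slot_name LIKE 'dts_sync_%';

-- Drop an inactive slot by name to release retained WAL.
-- Replace the name with the value returned by the query above.
SELECT pg_drop_replication_slot('dts_sync_example');
```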


Incremental synchronization latency

DTS uses a batch synchronization policy to reduce load on the destination. By default, DTS writes data at most once every 5 seconds per synchronization object, which results in a normal synchronization latency of up to 10 seconds. To reduce this latency, modify the selectdb.reservoir.timeout.milliseconds parameter of the DTS instance in the console. The valid range is 1,000 to 10,000 milliseconds.

A lower batching time increases write frequency, which may increase the load and write response time of the destination and, in turn, increase the DTS synchronization latency. Adjust the batching time based on the load of the destination.

Other notes

  • DTS does not check the validity of metadata such as sequences. Manually verify metadata validity before and after synchronization.

  • Perform full data synchronization when the CPU load of both the source and destination databases is below 30%.

  • If an instance fails, the DTS support team will attempt to recover it within 8 hours. During recovery, only DTS instance parameters may be modified—database parameters are not changed. For the parameters that may be modified, see Modify instance parameters.

Create a synchronization task

Step 1: Go to the Data Synchronization page

Use one of the following consoles.

DTS console

  1. Log on to the DTS console.

  2. In the left-side navigation pane, click Data Synchronization.

  3. In the upper-left corner of the page, select the region where the synchronization task resides.

DMS console

Note

The actual operations may vary based on the mode and layout of the DMS console. For more information, see Simple mode and Customize the layout and style of the DMS console.

  1. Log on to the DMS console.

  2. In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.

  3. From the drop-down list to the right of Data Synchronization Tasks, select the region where the synchronization instance resides.

Step 2: Configure source and destination databases

  1. Click Create Task to go to the task configuration page.

  2. Configure the source and destination databases using the following parameters.

    Category | Parameter | Description
    None | Task Name | The task name. DTS automatically generates a name. Specify a descriptive name to make the task easy to identify. A unique name is not required.
    Source Database | Select Existing Connection | If you registered the database instance with DTS, select it from the drop-down list. DTS automatically fills in the database parameters. Otherwise, configure the following parameters manually.
    Source Database | Database Type | Select PolarDB for PostgreSQL.
    Source Database | Access Method | Select Alibaba Cloud Instance.
    Source Database | Instance Region | Select the region where the source PolarDB for PostgreSQL cluster resides.
    Source Database | Replicate Data Across Alibaba Cloud Accounts | Select No if the source and destination databases belong to the same Alibaba Cloud account.
    Source Database | Instance ID | Select the ID of the source PolarDB for PostgreSQL cluster.
    Source Database | Database Name | Enter the name of the database that contains the objects to be synchronized.
    Source Database | Database Account | Enter the database account. For permission requirements, see Required permissions.
    Source Database | Database Password | Enter the password for the database account.
    Destination Database | Select Existing Connection | If you registered the database instance with DTS, select it from the drop-down list. DTS automatically fills in the database parameters. Otherwise, configure the following parameters manually.
    Destination Database | Database Type | Select SelectDB.
    Destination Database | Access Method | Select Alibaba Cloud Instance.
    Destination Database | Instance Region | Select the region where the destination SelectDB instance resides.
    Destination Database | Replicate Data Across Alibaba Cloud Accounts | Select No if the source and destination databases belong to the same Alibaba Cloud account.
    Destination Database | Instance ID | Select the ID of the destination SelectDB instance.
    Destination Database | Database Account | Enter the database account. For permission requirements, see Required permissions.
    Destination Database | Database Password | Enter the password for the database account.
  3. Click Test Connectivity and Proceed.

    Make sure the CIDR blocks of DTS servers are added to the security settings of both the source and destination databases. For more information, see Add DTS server IP addresses to a whitelist.

Step 3: Configure synchronization objects

  1. In the Configure Objects step, configure the synchronization parameters.

    Synchronization Types: By default, Incremental Data Synchronization is selected. Also select Schema Synchronization and Full Data Synchronization. After the precheck completes, DTS synchronizes historical data as the basis for ongoing incremental synchronization.

    Important: When synchronizing from PolarDB for PostgreSQL to SelectDB, data types are converted. If you do not select Schema Synchronization, create the destination tables manually using the corresponding Unique or Duplicate model. For more information, see Data type mapping, Additional column information, and Data model.

    Processing Mode of Conflicting Tables:

      • Precheck and Report Errors: DTS checks for tables with the same name in the destination database. If a duplicate table exists, the precheck fails and the task does not start.

        Note: If you cannot delete or rename the duplicate table, use the object name mapping feature. For more information, see Map schema, table, and column names.

      • Ignore Errors and Proceed: DTS skips the check for duplicate table names.

        Warning: This option may cause data inconsistency. If table schemas are the same and a record has a matching primary key or unique key, the source record overwrites the destination record. If schemas differ, initialization may fail or only some columns may be synchronized.

    Capitalization of Object Names in Destination Instance: The capitalization policy for database names, table names, and column names in the destination instance. Default: DTS default policy. For more information, see Specify the capitalization of object names in the destination instance.

    Source Objects: Select one or more objects and click the rightwards arrow icon to move them to Selected Objects. Objects can be selected at the schema, table, or column level.

    Selected Objects:

      • To rename a destination object or specify a receiving object, right-click it in the Selected Objects section. For more information, see Map object names.

      • To remove a selected object, click it and then click the remove icon.

      • To set the bucket_count parameter for a table (available only when Schema Synchronization is selected and objects are selected at the table level): right-click the table, set Enable Parameter Settings to Yes, enter the value, and click OK.

    Note:

      • Renaming an object using object name mapping may cause dependent objects to fail synchronization.

      • To filter data with WHERE conditions, right-click a table and specify the conditions. For more information, see Specify filter conditions.

      • To select SQL operations for incremental synchronization, right-click an object and select the operations.

  2. Click Next: Advanced Settings and configure the following parameters.

    Dedicated Cluster for Task Scheduling: By default, DTS schedules the task to the shared cluster. To improve stability, purchase a dedicated cluster. For more information, see What is a DTS dedicated cluster.

    Retry Time for Failed Connections: The time range for retrying failed connections. Valid values: 10 to 1440 minutes. Default: 720 minutes. Set a value of more than 30 minutes. If DTS reconnects within this window, the task resumes. Otherwise, the task fails.

    Note: If multiple tasks share the same source or destination database and have different retry windows, the shortest window takes precedence. DTS charges for the instance during retries.

    Retry Time for Other Issues: The time range for retrying failed DDL or DML operations. Valid values: 1 to 1440 minutes. Default: 10 minutes. Set a value of more than 10 minutes. This value must be smaller than Retry Time for Failed Connections.

    Enable Throttling for Full Data Synchronization: Throttle full data synchronization to reduce load on the source and destination. Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Synchronization is selected.

    Enable Throttling for Incremental Data Synchronization: Throttle incremental data synchronization by configuring RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s).

    Environment Tag: An optional tag to identify the instance.

    Configure ETL: Specify whether to enable the extract, transform, and load (ETL) feature. Select Yes to enter data processing statements in the code editor. For more information, see Configure ETL in a data migration or data synchronization task. Select No to skip ETL configuration.

    Monitoring and Alerting: Specify whether to configure alerting. Select Yes to configure the alert threshold and notification settings. For more information, see the Configure monitoring and alerting when you create a DTS task section.
  3. (Optional) Click Next: Configure Database and Table Fields to set the Primary Key Column, Distribution Key, and Engine for destination tables.

    • This step is available only when Schema Synchronization is selected for Synchronization Types. Set Definition Status to All to make modifications.

    • You can select multiple columns as a composite primary key. At least one column in Primary Key Column must also be selected as the Distribution Key.

    • For tables without a primary key or UNIQUE constraint, set Engine to duplicate. Otherwise, the synchronization may fail or data may be lost.

Step 4: Run the precheck

  1. Click Next: Save Task Settings and Precheck. DTS runs a precheck before the synchronization task can start. The task starts only after the precheck passes.

    To view the API parameters for this configuration, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
  2. If the precheck fails, click View Details next to each failed item, troubleshoot the issue, and rerun the precheck.

  3. If an alert is triggered:

    • If the alert cannot be ignored, click View Details, fix the issue, and rerun the precheck.

    • If the alert can be ignored, click Confirm Alert Details, click Ignore in the View Details dialog box, click OK, and then click Precheck Again.

    Ignoring alerts may cause data inconsistency and expose your business to risk.

Step 5: Purchase the instance

  1. Wait until the Success Rate reaches 100%, then click Next: Purchase Instance.

  2. On the buy page, configure the following parameters.

    Parameter | Description
    Billing Method | Subscription: Pay upfront for a fixed duration. More cost-effective for long-term use. Pay-as-you-go: Billed hourly. Suitable for short-term use. Release the instance when no longer needed to reduce costs.
    Resource Group Settings | The resource group for the synchronization instance. Default: default resource group. For more information, see What is Resource Management?
    Instance Class | The synchronization speed tier. Select based on your business requirements. For more information, see Instance classes of data synchronization instances.
    Subscription Duration | Available only for the subscription billing method. Options: 1 to 9 months, 1 year, 2 years, 3 years, or 5 years.
  3. Read and select Data Transmission Service (Pay-as-you-go) Service Terms.

  4. Click Buy and Start. In the dialog box that appears, click OK.

The task appears in the task list. Monitor its progress from there.

Data type mapping

The following table shows how PolarDB for PostgreSQL data types map to SelectDB data types.

Category | PolarDB for PostgreSQL data type | SelectDB data type
Numeric | SMALLINT | SMALLINT
Numeric | INTEGER | INT
Numeric | BIGINT | BIGINT
Numeric | DECIMAL | DECIMAL
Numeric | NUMERIC | DECIMAL
Numeric | REAL | DOUBLE
Numeric | DOUBLE PRECISION | DOUBLE
Numeric | SMALLSERIAL | SMALLINT
Numeric | SERIAL | INT
Numeric | BIGSERIAL | BIGINT
Monetary | MONEY | STRING
Character | CHAR(n), VARCHAR(n) | VARCHAR (see the note below)
Character | TEXT | STRING
Binary | BYTEA | STRING
Date and time | TIMESTAMP [(P)] [WITHOUT TIME ZONE] | DATETIMEV2
Date and time | TIMESTAMP [(P)] WITH TIME ZONE | DATETIMEV2
Date and time | DATE | DATEV2
Date and time | TIME [(P)] [WITHOUT TIME ZONE] | VARCHAR(50)
Date and time | TIME [(P)] WITH TIME ZONE | VARCHAR(50)
Date and time | INTERVAL [FIELDS] [(P)] | STRING
Boolean | BOOLEAN | BOOLEAN
Geometric | POINT, LINE, LSEG, BOX, PATH, POLYGON, CIRCLE | STRING
Network address | CIDR, INET, MACADDR, MACADDR8 | STRING
Text search | TSVECTOR | STRING
XML | XML | STRING
JSON | JSON | JSON

Important: To prevent data loss, CHAR(n) and VARCHAR(n) are converted to VARCHAR(4*n). If no length is specified, the SelectDB default VARCHAR(65533) is used. If the length exceeds 65533, the data is converted to STRING.
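
If you create destination tables manually instead of using schema synchronization, the mapping above leads to DDL along the following lines. This is a sketch under assumptions: the table and column names are invented, and BUCKETS AUTO assumes a SelectDB version that supports automatic bucketing:

```sql
-- Source (PolarDB for PostgreSQL):
--   CREATE TABLE orders (id BIGINT PRIMARY KEY, name VARCHAR(100), created TIMESTAMP);
-- A possible SelectDB counterpart using the Unique Key model:
CREATE TABLE orders (
    id      BIGINT,
    name    VARCHAR(400),   -- VARCHAR(n) widens to VARCHAR(4*n)
    created DATETIMEV2      -- TIMESTAMP maps to DATETIMEV2
)
UNIQUE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS AUTO;
```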

Additional column information

For tables using the Duplicate key model, DTS automatically adds or requires the following columns in the destination table.

Column name | Data type | Default value | Description
_is_deleted | Int | 0 | Deletion flag. Insert: 0. Update: 0. Delete: 1.
_version | Bigint | 0 | Full data synchronization: 0. Incremental data synchronization: the corresponding timestamp in seconds from the source database's write-ahead log (WAL).
_record_id | Bigint | 0 | Full data synchronization: 0. Incremental data synchronization: the record ID from the incremental log, which uniquely identifies the log entry. The value is unique and monotonically increasing.