Data Transmission Service (DTS) lets you continuously synchronize data from a PolarDB-X 2.0 instance to an AnalyticDB for PostgreSQL instance, so you can run real-time analytics and queries against your transactional data without impacting the source database.
Prerequisites
Before you begin, make sure you have:
An AnalyticDB for PostgreSQL instance with enough available storage to hold all data from the source PolarDB-X 2.0 instance. For setup instructions, see Create an instance.
A database created in the destination instance to receive the synchronized data. For setup instructions, see the "CREATE DATABASE" section of SQL syntax.
The correct permissions on both the source and destination databases (see Permissions required for database accounts).
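As a sketch, creating the destination database requires only a standard statement; `dts_dest_db` is a placeholder name:

```sql
-- Run against the destination AnalyticDB for PostgreSQL instance.
-- The database name is a placeholder; choose one that matches your task.
CREATE DATABASE dts_dest_db;
```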
Limitations
Source database
| Limitation | Details |
|---|---|
| Table requirements | Tables must have a PRIMARY KEY or UNIQUE constraint, and the constrained columns must contain unique values. Tables without such a constraint can result in duplicate rows in the destination. |
| Table count (when editing object names) | If you rename tables or columns in the destination, a single task supports up to 5,000 tables. Exceeding this limit causes a request error. Split the work across multiple tasks, or synchronize at the database level instead. |
| Binary logging | Binary logging must be enabled in the PolarDB-X 2.0 console (see Parameter settings), and binlog_row_image must be set to full. Failing either requirement causes the precheck to fail and prevents the task from starting. |
| Binary log retention | Incremental sync only: retain binary logs for at least 24 hours. Full + incremental sync: retain binary logs for at least 7 days; after full sync completes, you can set the retention period to more than 24 hours. If DTS cannot read the binary logs, the task fails and data inconsistency or loss may occur. |
| MySQL compatibility | The PolarDB-X instance must be compatible with MySQL V5.7. |
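You can verify the binary-log requirements from any MySQL-compatible client before running the precheck; a minimal sketch (variable names follow MySQL 5.7 conventions):

```sql
-- On the source PolarDB-X 2.0 instance:
SHOW VARIABLES LIKE 'log_bin';           -- should return ON
SHOW VARIABLES LIKE 'binlog_row_image';  -- should return FULL
SHOW VARIABLES LIKE 'expire_logs_days';  -- binary log retention, in days
```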
Other limitations
Only tables can be selected as synchronization objects. Append-optimized (AO) tables are not supported.
If source and destination column mappings are incomplete, or the schemas differ, data in unmapped columns is lost.
Run synchronization tasks during off-peak hours. Initial full data synchronization uses both read and write resources on source and destination databases, which can increase server load.
Concurrent INSERT operations during initial full data synchronization cause table fragmentation in the destination, so the destination tablespace will be larger than the source after full sync completes.
Write data to the destination database only through DTS during synchronization. Writing through other tools causes data inconsistency, and DMS online DDL operations can cause data loss in the destination database.
Special cases
DTS periodically updates the dts_health_check.ha_health_check table in the source database to advance the binary log position.
Billing
| Synchronization type | Cost |
|---|---|
| Schema synchronization and full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview. |
Supported SQL operations
DTS synchronizes the following SQL operations:
DML: INSERT, UPDATE, DELETE
DDL: ADD COLUMN
CREATE TABLE is not supported. To synchronize a newly created table, add it to the selected objects manually. See Add an object to a data synchronization task.
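For illustration, the statements below show what replicates and what does not; `orders` and `order_archive` are hypothetical tables:

```sql
-- Synchronized to the destination:
INSERT INTO orders (id, amount) VALUES (1, 9.99);
UPDATE orders SET amount = 19.99 WHERE id = 1;
DELETE FROM orders WHERE id = 1;
ALTER TABLE orders ADD COLUMN note VARCHAR(255);  -- ADD COLUMN only

-- NOT synchronized: a table created after the task starts must be
-- added to the task's selected objects manually.
CREATE TABLE order_archive (id BIGINT PRIMARY KEY, amount DECIMAL(10, 2));
```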
Permissions required for database accounts
Grant the required permissions to the database accounts before you create a synchronization task.
| Database | Required permissions |
|---|---|
| Source PolarDB-X 2.0 | SELECT on the objects to synchronize, REPLICATION SLAVE, REPLICATION CLIENT |
| Destination AnalyticDB for PostgreSQL | Read and write on the destination database. Use the initial account or an account with the RDS_SUPERUSER permission. |
For more information, see Manage database accounts and Permissions required to synchronize data.
For the destination AnalyticDB for PostgreSQL instance, see Create a database account and Manage account permissions.
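As a sketch of the source-side grants (MySQL-compatible syntax; `dts_user` and `mydb` are placeholder names — on the destination, use the initial account or one with the RDS_SUPERUSER permission instead):

```sql
-- On the source PolarDB-X 2.0 instance:
GRANT SELECT ON mydb.* TO 'dts_user'@'%';
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'dts_user'@'%';
```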
Create a synchronization task
Step 1: Go to the Data Synchronization Tasks page
Log on to the DMS console.
In the top navigation bar, click DTS.
In the left-side navigation pane, choose DTS (DTS) > Data Synchronization.
The actual navigation may vary depending on your DMS mode and layout. See Simple mode and Customize the layout and style of the DMS console. Alternatively, go directly to the Data Synchronization Tasks page in the new DTS console.
Step 2: Select a region
From the drop-down list to the right of Data Synchronization Tasks, select the region where you want to create the task.
In the new DTS console, select the region in the upper-left corner of the page.
Step 3: Configure source and destination databases
Click Create Task. In the Create Task wizard, configure the following parameters:
Source database
| Parameter | Value |
|---|---|
| Task Name | A descriptive name for the task. DTS assigns a name automatically, but a meaningful name helps you identify the task later. Task names do not need to be unique. |
| Select an existing DMS database instance | (Optional) Select an existing registered instance to auto-populate the database parameters. If you skip this, configure the parameters manually. |
| Database Type | Select PolarDB-X 2.0. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region of the PolarDB-X 2.0 instance. |
| Replicate Data Across Alibaba Cloud Accounts | Select No for same-account synchronization. |
| Instance ID | The ID of the PolarDB-X 2.0 instance. |
| Database Account | The account for the PolarDB-X 2.0 instance. See Permissions required for database accounts. |
| Database Password | The password of the database account. |
Destination database
| Parameter | Value |
|---|---|
| Select an existing DMS database instance | (Optional) Select an existing registered instance to auto-populate the database parameters. If you skip this, configure the parameters manually. |
| Database Type | Select AnalyticDB for PostgreSQL. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region of the AnalyticDB for PostgreSQL instance. |
| Instance ID | The ID of the AnalyticDB for PostgreSQL instance. |
| Database Name | The name of the database in the destination instance that receives the synchronized data. |
| Database Account | The account for the AnalyticDB for PostgreSQL instance. See Permissions required for database accounts. |
| Database Password | The password of the database account. |
Step 4: Test connectivity
Click Test Connectivity and Proceed.
DTS automatically adds its server CIDR blocks to the whitelist of Alibaba Cloud database instances, and to the security group rules of Elastic Compute Service (ECS) instances hosting self-managed databases. For self-managed databases in on-premises data centers or third-party clouds, manually add the DTS server CIDR blocks to the database whitelist. See Add the CIDR blocks of DTS servers to the security settings of on-premises databases.
Adding DTS server CIDR blocks to whitelists or security groups introduces potential security risks. Before proceeding, take protective measures such as strengthening credentials, limiting exposed ports, authenticating API calls, regularly auditing whitelist rules, and blocking unauthorized CIDR blocks. Consider connecting through Express Connect, VPN Gateway, or Smart Access Gateway (SAG) for enhanced security.
Step 5: Select objects and synchronization types
Configure the following parameters:
| Parameter | Description |
|---|---|
| Synchronization Types | Select all three: Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization. DTS pre-selects Incremental Data Synchronization by default. Full data synchronization migrates historical data first, establishing the baseline for ongoing incremental sync. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (recommended): DTS checks for table name conflicts before starting. If conflicts exist, the precheck fails. Use object name mapping to rename conflicting tables in the destination. Ignore Errors and Proceed: skips conflict checks. During full sync, DTS skips conflicting rows and keeps existing destination data. During incremental sync, conflicting rows overwrite destination data. If schemas differ, initialization may fail or columns may be missing. Use with caution. |
| Capitalization of object names in destination instance | Controls whether database, table, and column names in the destination follow the source or destination casing convention. Default is DTS default policy. See Specify the capitalization of object names in the destination instance. |
| Source Objects | Select one or more tables from Source Objects and click the arrow icon to move them to Selected Objects. Only tables are supported — views, triggers, and stored procedures are not synchronized. |
| Selected Objects | Right-click an object to rename it, set WHERE filter conditions, or choose which SQL operations to synchronize. Click Batch Edit to rename multiple objects at once. Note that renaming an object may cause dependent objects to fail synchronization. |
Step 6: Configure advanced settings
Click Next: Advanced Settings and configure the following:
| Parameter | Description |
|---|---|
| Select the dedicated cluster used to schedule the task | Leave at the default (shared cluster) unless you have purchased a dedicated cluster. See What is a DTS dedicated cluster. |
| Set Alerts | Select Yes to receive notifications when the task fails or synchronization latency exceeds a threshold. Configure the alert threshold and contacts. See Configure monitoring and alerting when you create a DTS task. |
| Retry Time for Failed Connections | The duration for which DTS retries a failed connection after the task starts. Recommended: more than 30 minutes (valid range: 10–1,440 minutes; default: 720). If multiple tasks share the same source or destination, the shortest retry time among those tasks applies. |
| Wait time before a retry for other issues | The duration for which DTS retries after a DDL or DML operation fails. Recommended: more than 10 minutes (valid range: 1–1,440 minutes; default: 10). This value must be less than the Retry Time for Failed Connections value. |
| Enable Throttling for Full Data Migration | Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and BPS of Full Data Migration to reduce load on the destination during full sync. Available only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Configure RPS of Incremental Data Migration and BPS of Incremental Data Migration to reduce load on the destination during incremental sync. |
| Environment Tag | (Optional) Tag the DTS instance for identification. |
| Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Select Yes to prevent DTS from writing heartbeat operations to the source database. This avoids interfering with features such as physical backup and cloning, but the DTS console may display a latency offset. Select No to let DTS write heartbeat operations to the source database, which may affect those features. |
| Configure ETL | Select Yes to configure the extract, transform, and load (ETL) feature and enter data processing statements in the code editor. See Configure ETL in a data migration or synchronization task. |
Step 7: (Optional) Configure database and table fields
Click Next: Configure Database and Table Fields.
On this page, specify the type, primary key column, and distribution key for each table you are synchronizing to AnalyticDB for PostgreSQL. This step is available only if you selected Schema Synchronization in Step 5.
Set Definition Status to All to view and edit all tables.
In Primary Key Column, you can specify multiple columns to form a composite primary key. At least one primary key column must also be set as the distribution key. See Manage tables and Define table distribution.
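The resulting destination table has roughly this shape (illustrative only; table and column names are placeholders). Note that in AnalyticDB for PostgreSQL the distribution key must be part of the primary key:

```sql
-- Destination table in AnalyticDB for PostgreSQL (sketch):
CREATE TABLE orders (
    order_id    BIGINT,
    customer_id BIGINT,
    amount      NUMERIC(10, 2),
    PRIMARY KEY (order_id, customer_id)   -- composite primary key
) DISTRIBUTED BY (order_id);              -- distribution key is part of the primary key
```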
Step 8: Run the precheck
Click Next: Save Task Settings and Precheck.
DTS runs a precheck before starting the task. The task can only proceed after all checks pass.
If a check fails, click View Details next to the failed item, resolve the issue, and run the precheck again.
If an alert is triggered for an item that can be safely ignored, click Confirm Alert Details, then Ignore, confirm with OK, and click Precheck Again. Ignoring alerts may cause data inconsistency.
To preview the API parameters used to configure this instance, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
Step 9: Purchase an instance
Wait until the precheck success rate reaches 100%, then click Next: Purchase Instance.
On the Purchase Instance page, configure the following:
| Parameter | Description |
|---|---|
| Billing Method | Subscription: pay upfront for one to nine months, or one, two, three, or five years. More cost-effective for long-running tasks. Pay-as-you-go: charged hourly. Suitable for short-term tasks; release the instance when done to stop charges. |
| Resource Group | The resource group for the instance. Default: default resource group. See What is Resource Management?. |
| Instance Class | The synchronization specification determines throughput and performance. Select based on your data volume and latency requirements. See Specifications of data synchronization instances. |
Step 10: Start the task
Read and agree to the Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start.
The synchronization task starts. Track its progress in the task list.