Data Transmission Service (DTS) synchronizes data from a PolarDB-X 1.0 instance to an AnalyticDB for PostgreSQL instance, so you can centralize analytical workloads without building a custom data pipeline.
Prerequisites
Before you begin, make sure that:
The storage type of the PolarDB-X 1.0 instance is ApsaraDB RDS for MySQL. PolarDB for MySQL is not supported.
An AnalyticDB for PostgreSQL instance exists, and its storage capacity exceeds the occupied storage of the source PolarDB-X 1.0 instance. For instructions, see Create an instance.
A destination database is created in the AnalyticDB for PostgreSQL instance. For the SQL syntax, see the "CREATE DATABASE" section of SQL syntax.
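As a sketch only, the destination database can be created with a standard PostgreSQL-style `CREATE DATABASE` statement. The helper below composes such a statement; the database name `dts_dest_db` and owner `adbpg_user` are placeholders, not values from this document.

```python
# Hypothetical helper: compose the CREATE DATABASE statement to run
# against the AnalyticDB for PostgreSQL instance (for example, via psql).
def create_database_sql(name, owner=""):
    """Return a CREATE DATABASE statement for the destination instance."""
    stmt = f'CREATE DATABASE "{name}"'
    if owner:
        stmt += f' OWNER "{owner}"'
    return stmt + ";"

print(create_database_sql("dts_dest_db"))
# CREATE DATABASE "dts_dest_db";
print(create_database_sql("dts_dest_db", "adbpg_user"))
# CREATE DATABASE "dts_dest_db" OWNER "adbpg_user";
```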
Billing
| Synchronization type | Fee |
|---|---|
| Schema synchronization and full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview. |
SQL operations supported
| Operation type | Statements |
|---|---|
| DML | INSERT, UPDATE, DELETE |
DDL operations are not synchronized.
Required account permissions
| Database | Required permission | References |
|---|---|---|
| Source PolarDB-X 1.0 instance | Read permissions on the objects to be synchronized | Manage accounts |
| Destination AnalyticDB for PostgreSQL instance | Read and write permissions on the destination database. The initial account or an account with the RDS_SUPERUSER permission also works. | Create and manage a database account and Manage users and permissions |
Limitations
Source database requirements
Tables must have a PRIMARY KEY or UNIQUE constraint, and the constrained columns must contain unique values. Otherwise, the destination database may contain duplicate records.
Tables with only a UNIQUE constraint do not support schema synchronization. Use tables with a PRIMARY KEY constraint where possible.
Tables with secondary indexes cannot be synchronized.
If you select tables as objects and plan to rename tables or columns in the destination, a single task can synchronize up to 5,000 tables. Exceeding this limit causes a request error. Split the work into multiple tasks, or synchronize at the database level instead.
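The split-into-multiple-tasks workaround can be sketched as simple batching: divide the object list into chunks of at most 5,000 tables, one chunk per synchronization task. The table names below are placeholders.

```python
# Sketch: split a large object list into per-task batches that respect
# the 5,000-table limit that applies when tables or columns are renamed
# at the destination.
MAX_TABLES_PER_TASK = 5000

def plan_tasks(tables, limit=MAX_TABLES_PER_TASK):
    """Return table batches, one batch per DTS synchronization task."""
    return [tables[i:i + limit] for i in range(0, len(tables), limit)]

batches = plan_tasks([f"t{i}" for i in range(12000)])
print(len(batches))     # 3 tasks
print(len(batches[0]))  # 5000
print(len(batches[-1])) # 2000
```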
The binlog_row_image parameter of the attached ApsaraDB RDS for MySQL instance must be set to full. If it is not, the precheck fails and the task cannot start.
Binary log retention: If the retention period is too short, DTS may fail to read binary logs, which can cause task failure or data inconsistency. DTS service level agreements (SLAs) do not cover failures caused by insufficient binary log retention.
Incremental data synchronization only: Retain binary logs for at least 24 hours.
Full data synchronization + incremental data synchronization: Retain binary logs for at least 7 days. After full data synchronization completes, you can reduce the retention period to more than 24 hours.
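The binary log requirements above can be expressed as a small validation routine. This is an illustrative check mirroring the documented rules, not the actual DTS precheck implementation.

```python
# Minimum binary log retention, in hours, per synchronization type
# (24 hours for incremental only, 7 days for full + incremental).
MIN_RETENTION_HOURS = {"incremental": 24, "full+incremental": 7 * 24}

def binlog_precheck(row_image, retention_hours, sync_type):
    """Return a list of problems; an empty list means the check passes."""
    problems = []
    if row_image.lower() != "full":
        problems.append("binlog_row_image must be set to full")
    if retention_hours < MIN_RETENTION_HOURS[sync_type]:
        problems.append(
            f"retain binary logs for at least "
            f"{MIN_RETENTION_HOURS[sync_type]} hours"
        )
    return problems

print(binlog_precheck("FULL", 168, "full+incremental"))  # []
print(binlog_precheck("minimal", 12, "incremental"))     # two problems
```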
Restrictions during synchronization
If you change the network type of the PolarDB-X 1.0 instance during synchronization, update the network connection settings of the DTS task to match.
Do not scale the capacity of the source instance, change the distribution of physical databases and tables, change shard keys, or perform DDL operations on the source instance. These actions can cause the task to fail or produce inconsistent data.
Do not synchronize frequently accessed tables during peak hours.
Other limitations
The destination table cannot be an append-optimized (AO) table.
The following data types are not supported: GEOMETRY, CURVE, SURFACE, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, GEOMETRYCOLLECTION.
If column mapping is used, or if the source and destination schemas differ, data in source columns that do not exist in the destination is lost.
Read-only instances at the PolarDB-X 1.0 compute layer are not supported.
PolarDB-X 1.0 supports only horizontal splitting (by database and table). Vertical splitting is not supported.
When DTS synchronizes a PolarDB-X 1.0 instance, it distributes data across the attached ApsaraDB RDS for MySQL instances and runs one subtask per instance. The subtask status appears in Task Topology.
Performance impact
During initial full data synchronization, DTS reads from the source and writes to the destination concurrently. This increases the load on both databases. Schedule full data synchronization during off-peak hours.
After initial full data synchronization completes, concurrent INSERT operations cause table fragmentation in the destination. The used tablespace of the destination database will be larger than that of the source.
To prevent data inconsistency, avoid writing to the destination database from other processes while a DTS task is running.
Set up data synchronization
Step 1: Open the Data Synchronization page
Use one of the following consoles:
DTS console
Log on to the DTS console.
In the left-side navigation pane, click Data Synchronization.
In the upper-left corner, select the region where the synchronization instance resides.
DMS console
The steps may vary depending on your DMS console mode and layout. For details, see Simple mode and Customize the layout and style of the DMS console.
Log on to the DMS console.
In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.
From the drop-down list to the right of Data Synchronization Tasks, select the region where the synchronization instance resides.
Step 2: Create a task
Click Create Task to open the task wizard.
Task settings
| Parameter | Description |
|---|---|
| Task Name | DTS generates a name automatically. Specify a descriptive name to make the task easy to identify. |
Source database
| Parameter | Description |
|---|---|
| Select an existing DMS database instance | Optional. If you select one, DTS populates the parameters below automatically. |
| Database Type | Select PolarDB-X 1.0. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region where the source PolarDB-X 1.0 instance resides. |
| Replicate Data Across Alibaba Cloud Accounts | Select No for same-account synchronization. |
| Instance ID | The ID of the source PolarDB-X 1.0 instance. |
| Database Account | An account with the required read permissions. |
| Database Password | The password for the database account. |
Destination database
| Parameter | Description |
|---|---|
| Select an existing DMS database instance | Optional. If you select one, DTS populates the parameters below automatically. |
| Database Type | Select AnalyticDB for PostgreSQL. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region where the destination AnalyticDB for PostgreSQL instance resides. |
| Instance ID | The ID of the destination AnalyticDB for PostgreSQL instance. |
| Database Name | The destination database that will receive the synchronized data. |
| Database Account | An account with read and write permissions on the destination database. |
| Database Password | The password for the database account. |
Step 3: Test connectivity
Click Test Connectivity and Proceed.
DTS automatically adds its server CIDR blocks to the whitelist of Alibaba Cloud database instances and to the security group rules of Elastic Compute Service (ECS) instances hosting self-managed databases. For databases in data centers or with third-party cloud providers, add the CIDR blocks manually. See Add the CIDR blocks of DTS servers.
Adding DTS CIDR blocks to your whitelist or security group rules introduces security exposure. Before proceeding, take preventive measures: use strong credentials, limit exposed ports, authenticate API calls, review whitelist entries regularly, and remove unauthorized CIDR blocks. Alternatively, connect DTS to your database through Express Connect, VPN Gateway, or Smart Access Gateway.
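The recommendation to review whitelist entries regularly can be automated. The sketch below, using Python's standard `ipaddress` module, flags entries that fall outside an approved set of CIDR blocks; the CIDR values shown are placeholders, not the actual DTS server ranges.

```python
# Sketch of a periodic whitelist audit: flag entries that are not
# contained in any approved CIDR block. APPROVED holds placeholder
# ranges; substitute the DTS server CIDR blocks for your region.
import ipaddress

APPROVED = [ipaddress.ip_network(c) for c in ("10.0.0.0/8", "192.168.1.0/24")]

def unauthorized_entries(whitelist):
    """Return whitelist entries that fall outside every approved block."""
    bad = []
    for entry in whitelist:
        net = ipaddress.ip_network(entry, strict=False)
        if not any(net.subnet_of(approved) for approved in APPROVED):
            bad.append(entry)
    return bad

print(unauthorized_entries(["10.1.2.0/24", "203.0.113.5/32"]))
# ['203.0.113.5/32']
```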
Step 4: Configure objects and synchronization options
Synchronization types
Select which phases to run:
| Option | Description |
|---|---|
| Schema Synchronization | Copies table schemas from source to destination. Required if you want DTS to create tables automatically. |
| Full Data Synchronization | Copies all existing data from the source. Required as the baseline for incremental synchronization. |
| Incremental Data Synchronization | Continuously applies DML changes (INSERT, UPDATE, DELETE) from the source after full synchronization completes. Charged. |
By default, Incremental Data Synchronization is pre-selected. You must also select Schema Synchronization and Full Data Synchronization for a complete, ongoing synchronization pipeline.
Processing mode for conflicting tables
| Mode | Behavior |
|---|---|
| Precheck and Report Errors | The precheck fails if the destination contains tables with the same names as the source. Resolve the conflict before starting. To rename tables at the destination, use Map object names. |
| Ignore Errors and Proceed | Skips the identical-name precheck. During full synchronization, existing records in the destination are kept if they share a primary key or unique key with source records. During incremental synchronization, matching records are overwritten. If schemas differ, initialization may fail or only some columns may be synchronized. Use with caution. |
Other object-level settings
| Parameter | Description |
|---|---|
| Capitalization of Object Names in Destination Instance | Controls the capitalization of database, table, and column names in the destination. Default is DTS default policy. See Specify the capitalization of object names. |
| Source Objects | Select objects and move them to Selected Objects. Only tables can be selected. |
| Selected Objects | Right-click a table to rename it or set filter conditions. Click Batch Edit to rename multiple objects at once. If you rename an object, dependent objects may fail to synchronize. See Map object names and Specify filter conditions. |
Step 5: Configure advanced settings
Click Next: Advanced Settings.
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS schedules the task to the shared cluster. For higher stability, purchase and specify a dedicated cluster. See What is a DTS dedicated cluster. |
| Set Alerts | Sends notifications when the task fails or synchronization latency exceeds a threshold. Select Yes to configure the alert threshold and notification settings. See Configure monitoring and alerting. |
| Retry Time for Failed Connections | How long DTS retries after a connection failure. Valid range: 10–1440 minutes. Default: 720 minutes. We recommend a value greater than 30 minutes. If you configure multiple tasks with the same source or destination, the shortest retry time applies across all of them. DTS charges for the instance during retry attempts. |
| Retry Time for Other Issues | How long DTS retries after a DDL or DML failure. Valid range: 1–1440 minutes. Default: 10 minutes. We recommend a value greater than 10 minutes. This value must be less than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Migration | Limits the load on the source and destination during full data synchronization. Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s) to control throughput. Appears only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Limits the RPS and throughput (MB/s) for incremental synchronization. |
| Environment Tag | Tags the DTS instance by environment type, such as production or staging. |
| Configure ETL | Enables the extract, transform, and load (ETL) feature. Select Yes to enter data processing statements in the code editor. See Configure ETL in a data migration or data synchronization task. |
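The constraints on the two retry-time parameters in the table above can be sketched as a validation routine. This is illustrative only, not part of the DTS console.

```python
# Illustrative validation of the two retry settings, per the documented
# ranges: connection retry 10-1440 minutes, other-issue retry 1-1440
# minutes and strictly less than the connection retry time.
def validate_retry_settings(conn_retry_min, other_retry_min):
    """Return violations of the documented constraints (empty = OK)."""
    issues = []
    if not 10 <= conn_retry_min <= 1440:
        issues.append("Retry Time for Failed Connections must be 10-1440 minutes")
    if not 1 <= other_retry_min <= 1440:
        issues.append("Retry Time for Other Issues must be 1-1440 minutes")
    if other_retry_min >= conn_retry_min:
        issues.append("Retry Time for Other Issues must be less than "
                      "Retry Time for Failed Connections")
    return issues

print(validate_retry_settings(720, 10))  # [] -> the defaults are valid
```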
Step 6: (Optional) Configure table fields for AnalyticDB for PostgreSQL
If you selected Schema Synchronization, click Next: Configure Database and Table Fields to specify how tables are created in the destination.
Set Definition Status to All to view and modify all tables.
| Field | Description |
|---|---|
| Type | The table storage type in AnalyticDB for PostgreSQL. |
| Primary Key Column | Specify one or more columns as the primary key. Multiple columns form a composite primary key. |
| Distribution Key | Specify one or more primary key columns as the distribution key. At least one primary key column must be designated as the distribution key. See Manage tables and Define table distribution. |
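The distribution-key rule in the table above amounts to a subset check: the distribution key must be a non-empty subset of the primary key columns. A minimal sketch, with placeholder column names:

```python
# Sketch of the rule above: every distribution key column must be one
# of the primary key columns, and at least one must be designated.
def check_distribution_key(primary_key, distribution_key):
    """True if the distribution key is a non-empty subset of the primary key."""
    return bool(distribution_key) and set(distribution_key) <= set(primary_key)

print(check_distribution_key(["id", "region"], ["id"]))  # True
print(check_distribution_key(["id"], ["created_at"]))    # False: not in PK
print(check_distribution_key(["id"], []))                # False: none chosen
```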
Step 7: Run the precheck
Click Next: Save Task Settings and Precheck.
To view the API parameters for this task configuration before saving, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
DTS runs a precheck before starting the task. Precheck results fall into two categories:
Failed: The check item blocks the task from starting. Fix the issue and click Precheck Again.
Warning: The task can proceed, but the issue may affect your data or business. Review each warning item and decide whether to resolve it or acknowledge and continue. Ignoring warnings may result in data inconsistency.
To resolve: fix the issue, then click Precheck Again.
To acknowledge: click Confirm Alert Details, then click Ignore in the dialog box, click OK, and then click Precheck Again.
Step 8: Purchase and start the instance
When Success Rate reaches 100%, click Next: Purchase Instance.
On the buy page, configure the instance:
| Section | Parameter | Description |
|---|---|---|
| New Instance Class | Billing Method | Subscription: Pay upfront for a fixed term. More cost-effective for long-term use. Pay-as-you-go: Billed hourly. Release the instance when it is no longer needed to stop charges. |
| | Resource Group Settings | The resource group for the synchronization instance. Default: default resource group. |
| | Instance Class | Determines the synchronization throughput. See Instance classes of data synchronization instances. |
| | Subscription Duration | Available when Subscription is selected. Options: 1–9 months, or 1, 2, 3, or 5 years. |
Read and accept Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start. In the confirmation dialog box, click OK.
The task appears in the task list. Monitor its progress from there.