Use Data Transmission Service (DTS) to migrate data from a PolarDB-X 2.0 instance to an ApsaraDB for SelectDB instance.
This feature is in the grayscale testing phase and is available only to some users.
Prerequisites
Before you begin, make sure you have:
-
A PolarDB-X 2.0 instance (source) with at least 100 Mbit/s outbound bandwidth
-
An ApsaraDB for SelectDB instance (destination)
-
The
binlog_row_imageparameter set tofullon the source instance (required for incremental data migration). For details, see Parameter settings -
Database accounts with the required permissions on both instances (see Required permissions)
-
DTS server CIDR blocks added to the security settings of both instances. For details, see Add the CIDR blocks of DTS servers
Unsupported objects
The following object types cannot be migrated:
-
Tables with uppercase letters in their names
-
INDEX, PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, and foreign keys (FK)
-
TABLEGROUP objects
-
Databases or tables that contain the Locality attribute
-
Read-only instances of the Enterprise Edition of PolarDB-X 2.0
Tables to migrate must have PRIMARY KEY or UNIQUE constraints, with all fields unique. Without these constraints, the destination database may contain duplicate records.
If you select tables as migration objects and need to rename them or their columns in the destination, a single task supports up to 1,000 tables. For more than 1,000 tables, configure multiple tasks or migrate the entire database at once.
Restrictions during migration
Do not perform the following operations while the migration task is running:
-
On the source instance: Execute DDL statements that change database or table schemas during schema migration or full data migration. Doing so causes the task to fail.
-
On the destination instance:
-
Create clusters in the ApsaraDB for SelectDB instance. If this causes a task failure, restart the migration instance to resume.
-
Add backend nodes to the ApsaraDB for SelectDB database. If this causes a task failure, restart the migration instance to resume.
-
Allow other data sources to write to the destination while DTS is running. This may cause data inconsistency or task failure.
-
-
DDL restrictions: Do not execute DDL statements that modify multiple columns at once or modify the same table in succession.
SelectDB-specific requirements
-
Database and table names: Must start with a letter. Use the object name mapping feature to rename any objects that do not meet this requirement.
-
Chinese character names: Objects with Chinese character names must be renamed using object name mapping (for example, from Chinese to English). Otherwise, the task may fail.
-
VARCHAR length: In PolarDB-X 2.0,
VARCHAR(M)specifies character length. In SelectDB,VARCHAR(N)specifies byte length. If you skip schema migration, set SelectDB VARCHAR field lengths to four times the corresponding PolarDB-X 2.0 field lengths. -
Destination table engine: Data can only be migrated to Unique engine tables or Duplicate engine tables. See Unique engine tables and Duplicate engine tables below.
Unique engine tables
All unique keys in the destination Unique engine table must exist in the source table and be included in the objects to migrate. Otherwise, data inconsistency may occur.
Duplicate engine tables
When the destination table is a Duplicate engine table, DTS converts UPDATE and DELETE statements to INSERT statements. Duplicate records may appear in the following scenarios:
-
The migration instance is retried or restarted.
-
The same data is modified by DML operations twice or more after the migration instance starts.
To deduplicate records, use the additional columns that DTS automatically creates: _is_deleted, _version, and _record_id. See Additional columns.
Other notes
-
Full data migration performance: Full data migration uses read and write resources on both instances, which increases database load. Run migration during off-peak hours, when CPU load on both instances is below 30%.
-
Storage fragmentation: Because full data migration uses concurrent INSERT operations, destination tables become fragmented. After full data migration, destination tables occupy more storage space than the source tables.
-
Heartbeat writes: DTS periodically runs
CREATE DATABASE IF NOT EXISTS \`test\`on the source instance to write heartbeat data and advance the binary log position. If you set Whether to delete SQL operations on heartbeat tables of forward and reverse tasks to Yes, or if the source account lacks permission to create databases, and no DML operations are performed on the source for an extended period, the latency information may be inaccurate. To refresh latency data, perform a DML operation on the source instance. -
Task recovery: If a DTS task fails, DTS support attempts to restore it within 8 hours. During restoration, the task may be restarted and certain task parameters may be adjusted. Database parameters are not modified.
Billing
| Migration type | Task configuration fee | Data transfer fee |
|---|---|---|
| Schema migration and full data migration | Free | Free when Access Method is Alibaba Cloud Instance. Charges apply when Access Method is Public IP Address. See Billing overview. |
| Incremental data migration | Charged. See Billing overview. | — |
SQL operations supported in incremental migration
| Operation type | SQL statements |
|---|---|
| DML | INSERT, UPDATE, DELETE |
| DDL | ADD COLUMN; DROP COLUMN; DROP TABLE; TRUNCATE TABLE; RENAME TABLE |
RENAME TABLE operations may cause data inconsistency. For example, if you select a table as the migration object and rename it during migration, that table's data is not migrated. To avoid this, select the database containing the table as the migration object, and make sure both the pre-rename and post-rename database names are included in the objects to migrate.
Required permissions
| Database | Schema migration | Full migration | Incremental migration |
|---|---|---|---|
| Source PolarDB-X 2.0 | SELECT | SELECT | REPLICATION SLAVE, REPLICATION CLIENT, and SELECT on the objects to migrate |
| Destination ApsaraDB for SelectDB | Usage_priv (cluster access), Select_priv, Load_priv, Alter_priv, Create_priv, Drop_priv | ||
For instructions on creating accounts and granting permissions:
PolarDB-X 2.0: Manage database accounts and Account permissions during data synchronization
ApsaraDB for SelectDB: Cluster permission management and Basic permission management
Create a migration task
Step 1: Go to the Data Migration page
Use one of the following methods:
DTS console
Log on to the DTS console.
In the left-side navigation pane, click Data Migration.
In the upper-left corner, select the region where the migration instance resides.
DMS console
The actual steps may vary depending on the DMS console mode and layout. See Simple mode and Customize the layout and style of the DMS console.
Log on to the DMS console.
In the top navigation bar, move the pointer over Data + AI > DTS (DTS) > Data Migration.
From the drop-down list to the right of Data Migration Tasks, select the region where the migration instance resides.
Step 2: Configure source and destination databases
Click Create Task.
Configure the parameters described in the following table.
Category Parameter Description N/A Task Name A name for the DTS task. DTS generates a default name. Specify an informative name that makes the task easy to identify. The name does not need to be unique. Source Database Select Existing Connection If the source instance is registered with DTS, select it from the drop-down list. DTS auto-fills the remaining parameters. Otherwise, configure the parameters manually. In the DMS console, select the instance from Select a DMS database instance. Database Type Select PolarDB-X 2.0. Access Method Select Alibaba Cloud Instance. Instance Region Select the region where the source PolarDB-X 2.0 instance resides. Replicate Data Across Alibaba Cloud Accounts Select No if the source and destination instances belong to the same Alibaba Cloud account. Instance ID Select the ID of the source PolarDB-X 2.0 instance. Database Account Enter the database account. See Required permissions for the minimum permissions needed. Database Password Enter the password for the database account. Destination Database Select Existing Connection If the destination instance is registered with DTS, select it from the drop-down list. DTS auto-fills the remaining parameters. Otherwise, configure the parameters manually. In the DMS console, select the instance from Select a DMS database instance. Database Type Select SelectDB. Access Method Select Alibaba Cloud Instance. Instance Region Select the region where the destination SelectDB instance resides. Replicate Data Across Alibaba Cloud Accounts Select No if the source and destination instances belong to the same Alibaba Cloud account. Instance ID Select the ID of the destination SelectDB instance. Database Account Enter the database account. See Required permissions for the minimum permissions needed. Database Password Enter the password for the database account. Click Test Connectivity and Proceed.
Step 3: Configure migration objects
On the Configure Objects page, set the following parameters.
Parameter Description Migration Types Select the migration types based on your requirements: <br>- Schema Migration + Full Data Migration: Migrates existing data only. Do not write new data to the source during migration. <br>- Schema Migration + Full Data Migration + Incremental Data Migration: Migrates existing data and keeps the destination in sync with ongoing changes. Use this combination to minimize service disruption. <br><br>If you skip Schema Migration, create Unique or Duplicate engine tables with the correct structure in SelectDB before starting the task. For details, see Data type mapping, Additional columns, and Data model. Processing Mode of Conflicting Tables - Precheck and Report Errors: The precheck fails if the destination contains tables with the same names as the source. To resolve conflicts, use object name mapping to rename the migrated tables. See Map object names. <br>- Ignore Errors and Proceed: Skips the precheck for identical table names. WarningThis may cause data inconsistency. During full migration, DTS skips conflicting records and keeps the existing destination data. During incremental migration, DTS overwrites conflicting records. If the source and destination have different schemas, only partial columns are migrated or the task may fail.
Capitalization of Object Names in Destination Instance The capitalization policy for database names, table names, and column names in the destination. Defaults to DTS default policy. See Specify the capitalization of object names in the destination instance. Source Objects Select the databases or tables to migrate, then click
to move them to Selected Objects.Selected Objects - To rename a table in the destination, right-click it and select an object name mapping option. See Map object names. <br>- To filter rows, right-click a table and specify WHERE conditions. See Specify filter conditions. <br>- To select specific SQL operations for a database or table, right-click the object and choose the operations to include. <br>- If Schema Migration is selected, you can only select tables (not databases) as migration objects, and must set the bucket_countparameter: right-click a table, enable Enable Parameter Settings, set Value to a positive integer (default: auto), and click OK. <br><br>NoteRenaming an object with object name mapping may cause dependent objects to fail migration.
Click Next: Advanced Settings and configure the following parameters.
Parameter Description Dedicated Cluster for Task Scheduling By default, DTS schedules the task to a shared cluster. To improve task stability, purchase a dedicated cluster. See What is a DTS dedicated cluster. Retry Time for Failed Connections How long DTS retries failed connections after the task starts. Valid range: 10–1,440 minutes. Default: 720 minutes. Set to more than 30 minutes. If DTS reconnects within this period, the task resumes. Otherwise, the task fails. NoteWhen multiple tasks share the same source or destination, the most recently set retry time takes precedence. DTS charges for the instance during retry.
Retry Time for Other Issues How long DTS retries failed DDL or DML operations. Valid range: 1–1,440 minutes. Default: 10 minutes. Set to more than 10 minutes. Must be less than Retry Time for Failed Connections. Enable Throttling for Full Data Migration Limits read and write throughput during full migration to reduce load on both instances. Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Migration is selected. Enable Throttling for Incremental Data Migration Limits throughput during incremental migration. Configure RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s). Available only when Incremental Data Migration is selected. Whether to delete SQL operations on heartbeat tables of forward and reverse tasks Controls whether DTS writes heartbeat SQL operations to the source instance. <br>- Yes: DTS does not write heartbeat operations. Latency information may be inaccurate. <br>- No: DTS writes heartbeat operations. Physical backup and cloning of the source may be affected. Environment Tag An optional tag for identifying the instance. Configure ETL Whether to enable the extract, transform, and load (ETL) feature. See What is ETL? and Configure ETL in a data migration or data synchronization task. Monitoring and Alerting Whether to receive alerts when the task fails or migration latency exceeds a threshold. If Yes, configure the alert threshold and notification settings. See Configure monitoring and alerting when you create a DTS task. (Optional) Click Next: Configure Database and Table Fields. For each table to migrate, specify Primary Key Column, Distribution Key, and Engine.
- This step is available only when Schema Migration is selected. Set Definition Status to All to view all tables. - Primary Key Column supports multiple columns. One or more primary key columns can also be selected as the Distribution Key. - If a table has no primary key or UNIQUE constraint, select duplicate for Engine. Otherwise, the task may fail or data loss may occur.
Step 4: Run the precheck
Click Next: Save Task Settings and Precheck.
To preview the API parameters for this configuration before proceeding, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
The task must pass the precheck before it can start.
If a precheck item fails, click View Details next to the failed item, resolve the issue, and click Precheck Again.
If an alert is triggered: if it cannot be ignored, resolve it and recheck. If it can be safely ignored, click Confirm Alert Details, click Ignore in the dialog box, confirm, and click Precheck Again. Ignoring alerts may cause data inconsistency.
Step 5: Purchase and start the instance
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
On the Purchase Instance page, configure the instance.
Section Parameter Description New Instance Class Resource Group The resource group for the migration instance. Defaults to default resource group. See What is Resource Management? Instance Class The instance class determines migration speed. Select a class based on your requirements. See Instance classes of data migration instances. Select the Data Transmission Service (Pay-as-you-go) Service Terms check box.
Click Buy and Start, then click OK in the confirmation dialog box.
The task appears on the Data Migration page. Track its progress there.
If the task does not include incremental data migration, it stops automatically when complete. The Status shows Completed.
If the task includes incremental data migration, it runs continuously. The Status shows Running. The incremental data migration task never stops or completes automatically.
Data type mapping
| Category | PolarDB-X 2.0 type | SelectDB type |
|---|---|---|
| Numeric | TINYINT | TINYINT |
| TINYINT UNSIGNED | SMALLINT | |
| SMALLINT | SMALLINT | |
| SMALLINT UNSIGNED | INT | |
| MEDIUMINT | INT | |
| MEDIUMINT UNSIGNED | BIGINT | |
| INT | INT | |
| INT UNSIGNED | BIGINT | |
| BIGINT | BIGINT | |
| BIGINT UNSIGNED | LARGEINT | |
| BIT(M) | INT | |
| Decimal | Decimal ( Note zerofill is not supported) | |
| Numeric | Decimal | |
| Float | Float | |
| Double | DOUBLE | |
| BOOL / BOOLEAN | BOOLEAN | |
| Date and time | DATE | DATEV2 |
| DATETIME[(fsp)] | DATETIMEV2 | |
| Timestamp[(fsp)] | DATETIMEV2 | |
| Time[(fsp)] | VARCHAR | |
| YEAR[(4)] | INT | |
| String | CHAR / VARCHAR | VARCHAR |
| BINARY / VARBINARY | STRING | |
| TINYTEXT / TEXT / MEDIUMTEXT / LONGTEXT | STRING | |
| TINYBLOB / BLOB / MEDIUMBOLB / LONGBLOB | STRING | |
| ENUM | STRING | |
| SET | STRING | |
| JSON | STRING |
To avoid data loss, CHAR and VARCHAR(n) are converted to VARCHAR(4*n) in SelectDB.
If no length is specified, the SelectDB default of VARCHAR(65533) is used.
If the length exceeds 65533, the data is converted to STRING.
Additional columns
DTS automatically creates the following columns in Duplicate engine tables (or you can add them manually):
| Column | Type | Default value | Description |
|---|---|---|---|
_is_deleted | Int | 0 | Indicates whether the record is deleted. Set to 0 for INSERT and UPDATE operations, 1 for DELETE operations. |
_version | Bigint | 0 | Set to 0 during full data migration. During incremental migration, contains the timestamp (in seconds) from the source binary log file. |
_record_id | Bigint | 0 | Set to 0 during full data migration. During incremental migration, contains the unique ID of the incremental log entry. The ID auto-increments for each new log entry. |