Data Transmission Service (DTS) supports migrating data from a PolarDB-X 2.0 instance to an Elasticsearch cluster, covering schema migration, full data migration, and incremental data migration.
Prerequisites
Before you begin, ensure that you have:
A source PolarDB-X 2.0 instance
A destination Elasticsearch cluster. See Create an Alibaba Cloud Elasticsearch instance
Verified version compatibility between the source and destination. See Overview of migration scenarios
Destination Elasticsearch storage space that is larger than the source PolarDB-X 2.0 instance storage space
Full data migration uses concurrent INSERT operations, which cause index fragmentation in the destination. As a result, the storage space used by the tables in the destination database is larger than that in the source instance after full migration is complete.
Migration types
DTS supports three migration types for this scenario:
| Migration type | Description |
|---|---|
| Schema migration | Migrates schema definitions of the migration objects from the source database to the destination cluster. |
| Full data migration | Migrates all historical data of the specified migration objects. No instance configuration fee is charged. |
| Incremental data migration | After full data migration completes, continuously migrates incremental data updates. Enables migration without interrupting your applications. Instance configuration fee is charged. |
For no-downtime migration, select all three types together.
Billing
| Migration type | Instance configuration fee | Internet traffic fee |
|---|---|---|
| Schema migration + full data migration | Free | Charged when the Access Method of the destination is set to Public IP Address. See Billing overview. |
| Incremental data migration | Charged. See Billing overview. | — |
SQL operations supported by incremental migration
| Operation type | SQL statements | Notes |
|---|---|---|
| DML | INSERT, UPDATE, DELETE | UPDATE operations that remove fields are not supported. |
Account permissions
| Database | Schema migration | Full migration | Incremental migration |
|---|---|---|---|
| Source PolarDB-X 2.0 | SELECT | SELECT | REPLICATION SLAVE, REPLICATION CLIENT, and SELECT on objects to be migrated. See Account permission issues during data synchronization. |
| Destination Elasticsearch | Read and write permissions (typically the elastic account) | — | — |
Data type mappings
PolarDB-X 2.0 and Elasticsearch support different data types, so direct mapping is not always possible. DTS maps data types based on the types that the destination Elasticsearch instance supports. See Data type mappings for initial schema synchronization.
DTS does not set the dynamic parameter of the mapping during schema migration. The behavior depends on your Elasticsearch instance settings. If source data is in JSON format, make sure values for the same key have the same data type across all rows in a table. Otherwise, DTS may report synchronization errors. See dynamic.
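To catch inconsistent JSON value types before they trip the destination's dynamic mapping, you can scan the source rows ahead of migration. A minimal Python sketch (the function name and sample rows are illustrative, not part of DTS):

```python
import json

def check_json_type_consistency(docs):
    """Return (key, first_type, conflicting_type) tuples for keys whose
    values change Python type across documents."""
    seen = {}       # key -> type name first observed
    conflicts = []
    for doc in docs:
        for key, value in doc.items():
            t = type(value).__name__
            if key in seen and seen[key] != t:
                conflicts.append((key, seen[key], t))
            seen.setdefault(key, t)
    return conflicts

# Two rows that store "price" first as a number and then as a string
# would conflict once dynamic mapping fixes the field type; the scan flags them:
rows = [json.loads('{"price": 10}'), json.loads('{"price": "10"}')]
assert check_json_type_consistency(rows) == [("price", "int", "str")]
```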
The following table shows how Elasticsearch concepts map to relational database concepts:
| Elasticsearch | Relational database |
|---|---|
| Index | Database |
| Type | Table |
| Document | Row |
| Field | Column |
| Mapping | Database schema |
Limitations
Review the following limitations before starting a migration task. Encountering these constraints mid-task may require you to restart the migration.
Source database limitations
Bandwidth: The server hosting the source database must have sufficient outbound bandwidth. Insufficient bandwidth affects migration speed.
Unsupported source type: Read-only instances of PolarDB-X 2.0 Enterprise Edition cannot be used as the source.
Primary key or UNIQUE constraint required: Tables must have a primary key or UNIQUE constraint with unique fields. Without this, duplicate data may appear in the destination.
1,000-table limit for column mapping: When migrating at the table level with column name mapping, a single task supports a maximum of 1,000 tables. If you exceed this limit, split the tables into multiple tasks, or configure a task to migrate the entire database instead.
Uppercase letters in table names: For PolarDB-X 2.0 tables whose names contain uppercase letters, only schema migration is supported.
Unsupported migration objects:
Table groups and databases or tables with the Locality property
Tables whose names are reserved words (for example, select)
Database partitions in DRDS mode
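The duplicate-data risk behind the primary-key requirement is easy to see in miniature. The following Python sketch is not DTS internals, only an illustration of why a re-delivered batch (for example, after a task retry) duplicates rows when no unique key exists, while a unique key makes the write idempotent:

```python
def apply_batch(dest, rows, key=None):
    """Apply a batch of rows to a destination list (no unique key)
    or dict (keyed by a unique column)."""
    if key is None:
        dest.extend(rows)          # no unique key: blind append
    else:
        for row in rows:
            dest[row[key]] = row   # unique key: idempotent upsert

rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]

no_key = []
apply_batch(no_key, rows)
apply_batch(no_key, rows)          # simulated re-delivery after a retry
assert len(no_key) == 4            # duplicates appear

with_key = {}
apply_batch(with_key, rows, key="id")
apply_batch(with_key, rows, key="id")
assert len(with_key) == 2          # re-delivery is harmless
```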
Binary log requirements for incremental migration
Binary logging must be enabled on the source database.
The binlog_row_image parameter must be set to full. If it is not, the precheck fails and the migration task cannot start.
Binary log retention period: If the retention period is shorter than required, the DTS task may fail, and in extreme cases data inconsistency or data loss may occur. Issues caused by an insufficient retention period are not covered by the Service-Level Agreement (SLA).
For incremental-only tasks: retain binary logs for at least 24 hours.
For tasks that include both full and incremental migration: retain binary logs for at least 7 days. After full migration completes, you can reduce the retention period to 24 hours or more.
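The retention guidance above can be captured in a small helper. This is our own hedged summary, not a DTS API:

```python
def min_binlog_retention_hours(includes_full_migration: bool) -> int:
    """Minimum binary log retention per the guidance above:
    7 days when the task includes full migration, else 24 hours."""
    return 7 * 24 if includes_full_migration else 24

assert min_binlog_retention_hours(False) == 24   # incremental-only task
assert min_binlog_retention_hours(True) == 168   # full + incremental task
```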
DDL restrictions during migration
During schema migration and full migration, do not perform DDL operations that change the schema of databases or tables. Otherwise, the migration task fails.
During full migration, DTS queries the source database and creates a metadata lock, which may block DDL operations on the source.
If you switch the network type of the PolarDB-X 2.0 instance during migration, update the network connection information for the migration link after the switch completes.
For full-migration-only tasks, do not write new data to the source instance during migration. To maintain real-time data consistency, select schema migration, full data migration, and incremental data migration together.
Elasticsearch destination limitations
Join field type not supported: Migrating data to an index that has a parent-child relationship or a Join field type mapping is not supported. Attempting this may cause the task to become abnormal or cause query failures in the destination.
Development and test specifications not supported: Elasticsearch development and test instance specifications cannot be used as the destination.
Adding a column mid-migration: To add a column to a table being migrated, first modify the mapping in the destination Elasticsearch index, then perform the DDL operation in the source database, and then pause and restart the migration task.
Other considerations
Migrate during off-peak hours. Full data migration consumes read and write resources on both source and destination databases, which may increase load.
DTS attempts to resume failed migration tasks that are less than seven days old. Before switching your business to the destination instance, end or release the task, or revoke the write permissions of the DTS account using the revoke command. This prevents the source data from overwriting destination data if the task resumes automatically.
If a task fails, DTS support staff will attempt to restore it within eight hours. During restoration, they may restart the task or adjust DTS task parameters (not database parameters). For parameters that may be adjusted, see Modify instance parameters.
DTS periodically updates the dts_health_check.ha_health_check table in the source database to advance the binary log offset.
During schema migration, DTS migrates foreign keys from the source database to the destination. During full and incremental migration, DTS temporarily disables constraint checks and foreign key cascade operations at the session level. If cascade update or delete operations occur in the source while the task is running, data inconsistency may occur.
Create a migration task
Step 1: Open the migration task list
Navigate to the migration task list using one of the following methods.
From the DTS console
Log on to the Data Transmission Service (DTS) console.
In the left navigation pane, click Data Migration.
In the upper-left corner, select the region where the migration instance is located.
From the DMS console
Log on to the Data Management (DMS) console.
In the top menu bar, choose Data + AI > Data Transmission (DTS) > Data Migration.
To the right of Data Migration Tasks, select the region where the migration instance is located.
Step 2: Configure source and destination databases
Click Create Task.
Configure the source and destination databases using the following settings.
Warning: After selecting source and destination instances, carefully read the limits displayed at the top of the page. Skipping this may cause task failure or data inconsistency.
Source database settings:
| Parameter | Description |
|---|---|
| Task Name | DTS auto-generates a name. Specify a descriptive name for easy identification. The name does not need to be unique. |
| Select Existing Connection | Select a previously added database instance from the drop-down list (auto-fills the settings below), or configure the connection manually. In the DMS console, this parameter is named Select a DMS database instance. |
| Database Type | Select PolarDB-X 2.0. |
| Connection Type | Select Cloud Instance. |
| Instance Region | Select the region where the source PolarDB-X 2.0 instance resides. |
| Replicate Data Across Alibaba Cloud Accounts | Select No for same-account migration. |
| Instance ID | Select the ID of the source PolarDB-X 2.0 instance. |
| Database Account | Enter the database account. See Account permissions. |
| Database Password | Enter the password for the database account. |

Destination database settings:
| Parameter | Description |
|---|---|
| Select Existing Connection | Select a previously added database instance, or configure the connection manually. In the DMS console, this parameter is named Select a DMS database instance. |
| Database Type | Select Elasticsearch. |
| Connection Type | Select Cloud Instance. |
| Instance Region | Select the region where the destination Elasticsearch instance resides. |
| Type | Select Cluster or Serverless based on your needs. |
| Instance ID | Select the ID of the destination Elasticsearch instance. |
| Database Account | Enter the Elasticsearch account (default: elastic). See Account permissions. |
| Database Password | Enter the password for the account. |
| Encryption | Select HTTP or HTTPS. |

Click Test Connectivity and Proceed.
Make sure the IP address ranges of the DTS service are added to the security settings of the source and destination databases. See Add DTS server IP addresses to a whitelist. If the source or destination is a self-managed database (Access Method is not Alibaba Cloud Instance), also click Test Connectivity in the CIDR Blocks of DTS Servers dialog.
Step 3: Configure migration objects
On the Configure Objects page, set the following parameters:
| Parameter | Description |
|---|---|
| Migration Types | Select the migration types based on your downtime requirements: <br>- Full migration only: select Schema Migration and Full Data Migration. <br>- No-downtime migration: select Schema Migration, Full Data Migration, and Incremental Data Migration. <br><br>If you skip Schema Migration, make sure the destination already has the target database and tables, or use object name mapping in Selected Objects. If you skip Incremental Data Migration, do not write new data to the source during migration. |
| Index Name | Sets how the index name is created in the destination: <br>- Table Name: the index name matches the table name (example: order). <br>- Database_Table: the index name is database_table (example: dtstest_order). |
| Processing Mode of Conflicting Tables | - Precheck and Report Errors: checks for same-name indexes in the destination before starting. The task does not start if a conflict is found. To resolve conflicts, see Object name mapping. <br>- Ignore Errors and Proceed: skips the check. During full migration, existing destination records with the same primary key are kept. During incremental migration, source records overwrite destination records. Inconsistent schemas may result in partial or failed migration. |
| Capitalization of Object Names in Destination Instance | Configures the case policy for database, table, and column names in the destination. Default: DTS Default Policy. See Case conversion policy for destination object names. |
| Source Objects | Select one or more objects in the Source Objects section, and then click the icon to move them to the Selected Objects section. |
| Selected Objects | - To rename a single object, right-click it in Selected Objects. See Individual table column mapping. <br>- To rename multiple objects at once, click Batch Edit. See Map multiple object names at a time. <br>- To filter rows, right-click a table and set a WHERE clause condition. See Filter task data using SQL conditions. <br>- To select SQL operations to migrate at the database or table level, right-click the object to be migrated in Selected Objects and select the SQL operations to migrate in the dialog box that appears. <br><br>Index and type names support only underscores (_) as special characters. If you use object name mapping, dependent objects may fail to migrate. |
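The Index Name setting above reduces to a simple naming rule. A Python sketch using the table's own examples (the mode strings are illustrative labels, not actual DTS parameter values):

```python
def index_name(database: str, table: str, mode: str) -> str:
    """Derive the destination index name from the Index Name setting."""
    if mode == "table_name":
        return table                      # index matches the table name
    if mode == "database_table":
        return f"{database}_{table}"      # Database_Table form
    raise ValueError(f"unknown mode: {mode}")

assert index_name("dtstest", "order", "table_name") == "order"
assert index_name("dtstest", "order", "database_table") == "dtstest_order"
```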
Step 4: Configure advanced settings
Click Next: Advanced Settings and configure the following parameters:
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | DTS schedules tasks on a shared cluster by default. For more stable performance, purchase a dedicated cluster. |
| Retry Time for Failed Connections | Default: 720 minutes. Range: 10–1,440 minutes. Set to more than 30 minutes. If DTS reconnects within this period, the task resumes automatically. Note: you are charged during the retry period. |
| Retry Time for Other Issues | Default: 10 minutes. Range: 1–1,440 minutes. Set to more than 10 minutes. Must be less than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Migration | Limits QPS to the source, RPS, and migration speed (MB/s) during full migration. Only available when Full Data Migration is selected. You can also adjust this after the task starts. |
| Enable Throttling for Incremental Data Migration | Limits RPS and migration speed (MB/s) during incremental migration. Only available when Incremental Data Migration is selected. You can also adjust this after the task starts. |
| Environment Tag | (Optional) Tags the instance for environment identification. |
| Shard Configuration | Sets the number of primary and replica shards for indexes, based on the maximum shard configuration of the destination Elasticsearch instance. |
| String Index | Controls how strings are indexed in the destination: <br>- analyzed: analyzes the string before indexing. Select a specific analyzer. See Analyzers. <br>- not analyzed: indexes the original value without analysis. <br>- no: does not index the string. |
| Time Zone | Specifies the time zone for time-type data (such as DATETIME and TIMESTAMP) migrated to Elasticsearch. If the destination does not require time zone information, set the document type for time-type fields in the destination before migration. |
| DOCID | The document ID in Elasticsearch. Default: the primary key of the table. If the table has no primary key, DOCID is the ID column automatically generated by Elasticsearch. |
| Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Controls whether DTS writes heartbeat SQL to the source database: <br>- Yes: does not write heartbeat SQL. The DTS instance may display latency. <br>- No: writes heartbeat SQL. This may interfere with physical backups and cloning operations on the source. |
| Configure ETL | Enables extract, transform, and load (ETL) processing. See What is ETL?. Select Yes to enter data processing statements. See Configure ETL. |
| Monitoring and Alerting | Select Yes to configure an alert threshold and alert notifications. DTS sends notifications when a migration fails or latency exceeds the threshold. |
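The three String Index choices above differ in which terms end up searchable. A self-contained sketch, assuming a simple lowercase whitespace analyzer for the analyzed case (real Elasticsearch analyzers are configurable; see Analyzers):

```python
def index_terms(value: str, mode: str):
    """Terms stored for a string field under each String Index choice."""
    if mode == "analyzed":
        return value.lower().split()   # tokenized, normalized terms
    if mode == "not analyzed":
        return [value]                 # single exact term (keyword-like)
    if mode == "no":
        return []                      # field is not searchable
    raise ValueError(f"unknown mode: {mode}")

assert index_terms("Quick Brown Fox", "analyzed") == ["quick", "brown", "fox"]
assert index_terms("Quick Brown Fox", "not analyzed") == ["Quick Brown Fox"]
assert index_terms("Quick Brown Fox", "no") == []
```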
Step 5: Configure Elasticsearch routing and document IDs
Click Next: Configure Database and Table Fields and set the following Elasticsearch-specific parameters.
These settings control how documents are stored and identified in the destination cluster. Choose based on your primary Elasticsearch use case:
Search and analytics use case: Elasticsearch routes documents by document ID. Use No for _routing and keep the default Primary key column of the table for _id.
Key-value lookup use case: Use a business key as the document ID to support deterministic lookups and upserts. Select Business primary key for _id.
| Parameter | Description |
|---|---|
| Set _routing | Controls which shard stores a document. <br>- Yes: uses custom columns for routing. <br>- No: uses _id for routing (required for Elasticsearch 7.x). See _routing. |
| Value of _id | Sets the document ID. <br>- Primary key column of the table: composite primary keys are merged into a single column. <br>- Business primary key: set the corresponding Business primary key column. |
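Conceptually, _routing decides the shard as hash(routing_value) mod number_of_primary_shards, and by default the routing value is the document _id. The sketch below uses MD5 as a stand-in for Elasticsearch's actual murmur3 hash, so the shard numbers are illustrative only:

```python
import hashlib

def shard_for(routing_value: str, num_primary_shards: int) -> int:
    """Pick a shard from the routing value (illustrative hash, not ES's)."""
    digest = hashlib.md5(routing_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_primary_shards

# The same routing value always maps to the same shard, which is why a
# document must be fetched with the same _routing it was indexed with.
assert shard_for("order-1001", 5) == shard_for("order-1001", 5)
assert 0 <= shard_for("order-1001", 5) < 5
```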
Step 6: Save the task and run the precheck
Click Next: Save Task Settings and Precheck.
To view the OpenAPI parameters for this configuration, move the pointer over the Next: Save Task Settings and Precheck button and click Preview OpenAPI parameters.
DTS runs a precheck before starting the migration task. The task starts only after the precheck passes.
If a check item fails: click View Details, fix the issue based on the prompt, and run the precheck again.
If a warning is reported:
For warnings that cannot be ignored: click View Details, fix the issue, and run the precheck again.
For warnings that can be ignored: click Confirm Alert Details > Ignore > OK > Precheck Again. Ignoring a warning may cause data inconsistency.
Step 7: Purchase the instance
When Success Rate reaches 100%, click Next: Purchase Instance.
On the Purchase page, configure the instance:
| Parameter | Description |
|---|---|
| Resource Group Settings | Select the resource group for the instance. Default: default resource group. See What is Resource Management? |
| Instance Class | Select a specification based on your migration volume and speed requirements. See Data migration link specifications. |

Read and accept the Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start. In the dialog that appears, click OK.
View migration progress on the Data Migration Tasks list page.
Tasks without incremental migration stop automatically after full migration completes. Status changes to Completed.
Tasks with incremental migration continue running. Status remains Running.
Verify migrated indexes and data
After the migration task enters the Running state, use Kibana to connect to the Elasticsearch instance and verify that the created indexes and migrated data meet your expectations. For login instructions, see Log on to the Kibana console.
If the results do not meet your expectations, delete the index and its data, then reconfigure the migration task.