
Data Transmission Service: Synchronize PolarDB for MySQL to Elasticsearch

Last Updated: Mar 28, 2026

Use Data Transmission Service (DTS) to continuously synchronize data from a PolarDB for MySQL cluster to an Elasticsearch instance. DTS handles schema synchronization, full data load, and ongoing incremental replication in a single task.

Quick start

Before you configure the task, complete these three steps:

  1. Prepare the source PolarDB for MySQL cluster: Enable binary logging, set loose_polar_log_bin to ON, and retain binary logs for at least 3 days (7 days recommended).

  2. Prepare the destination Elasticsearch instance: Create an instance with storage space larger than the source cluster. Development and test specifications are not supported.

  3. Prepare database accounts: Grant read permissions on the source cluster, and have the elastic login credentials ready for the destination instance.

Prerequisites

Before you begin, ensure that you have:

  • A destination Elasticsearch instance in the same region as your synchronization task, with storage space larger than the source PolarDB for MySQL cluster. See Create an Alibaba Cloud Elasticsearch instance.

  • Binary logging enabled on the source PolarDB for MySQL cluster, with the loose_polar_log_bin parameter set to ON. See Enable binary logging and Set cluster and node parameters.

  • Binary logs retained for at least 3 days (7 days recommended). Retaining logs for less than the required period can cause task failures, data inconsistency, or data loss—issues not covered by the DTS Service-Level Agreement (SLA).

  • Database accounts with the required permissions. See Database account permissions.

Enabling binary logging on a PolarDB for MySQL cluster consumes storage space and incurs fees.
For supported source and destination database versions, see Synchronization overview. Different Elasticsearch instance specifications support different storage capacities.
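As a sanity check before configuring the task, the binlog requirements above can be verified programmatically. A minimal sketch, assuming the variable values have already been fetched from the cluster (for example, with SHOW VARIABLES); the function name and the binlog_retention_days key are hypothetical helpers, not a DTS API:

```python
def check_source_prerequisites(variables):
    """Return a list of problems with the source-cluster settings.

    An empty list means the binlog requirements described above are met.
    The dictionary keys are illustrative; fetch the real values from the
    cluster's parameter settings.
    """
    problems = []
    # loose_polar_log_bin must be ON, or the precheck fails.
    if str(variables.get("loose_polar_log_bin", "OFF")).upper() != "ON":
        problems.append("loose_polar_log_bin must be ON")
    # Binary logs must be retained for at least 3 days (7 recommended).
    if float(variables.get("binlog_retention_days", 0)) < 3:
        problems.append("retain binary logs for at least 3 days (7 recommended)")
    return problems
```

Running the check with both settings missing reports both problems; a compliant cluster yields an empty list.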

Billing

Synchronization type | Pricing
Schema synchronization and full data synchronization | Free of charge
Incremental data synchronization | Charged. See Billing overview.

Limitations

Review the following limitations before configuring the task.

Source database

  • Tables must have a primary key or a UNIQUE constraint with unique field values. Without this, duplicate data may appear in the destination.

  • For table-level synchronization with column mapping, a single task supports a maximum of 1,000 tables. If you exceed this limit, split the tables across multiple tasks, or configure the task to synchronize the entire database instead.

  • Do not run DDL operations that change database or table schemas during schema synchronization or full data synchronization—the task will fail.

    During full data synchronization, DTS queries the source database, creating metadata locks that may block DDL operations on the source.
  • Binary logging must be enabled with loose_polar_log_bin set to ON. If not, the precheck fails and the DTS instance cannot start.

Other limitations

  • Synchronization from a read-only node of the source PolarDB for MySQL cluster is not supported.

  • OSS foreign tables from the source cluster cannot be synchronized.

  • DTS does not support synchronizing INDEX, PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, or foreign key (FK) objects.

  • Synchronizing to Elasticsearch indexes that contain parent-child relationships or Join field types may cause task errors or query failures in the destination.

  • Primary/secondary failover of the database instance is not supported during initial full data synchronization. If a failover occurs, reconfigure the synchronization task promptly.

  • To add columns to source tables, first update the corresponding mapping in the Elasticsearch instance, then run the DDL on the source, and finally pause and restart the synchronization task.

  • PolarDB for MySQL and Elasticsearch support different data types, so one-to-one type mapping is not possible. During schema initialization, DTS maps types based on what the destination supports. See Data type mappings for schema initialization.

  • Run the synchronization during off-peak hours. Full data synchronization consumes read and write resources on both databases, which may increase load.

  • Full data synchronization runs concurrent INSERT operations, which causes fragmentation in destination tables. As a result, the tablespace of the destination instance is larger than that of the source after full data synchronization completes.

  • For table-level synchronization, do not use tools such as pt-online-schema-change for online DDL operations on synchronized objects in the source—the task will fail.

  • For table-level synchronization, if no data other than DTS writes goes to the destination, use Data Management (DMS) for online DDL operations. See Change schemas without locking tables.

  • If data other than DTS writes goes to the destination during synchronization, data inconsistency between source and destination may occur.

  • If a task fails, DTS support staff will attempt to restore it within 8 hours. During restoration, they may restart the task or adjust DTS task parameters (not database parameters). Parameters that may be adjusted are listed in Modify instance parameters.

DTS periodically executes the CREATE DATABASE IF NOT EXISTS `test` command on the source database to advance the binary log offset.

Supported SQL operations

Operation type | SQL operations
DML | INSERT, UPDATE, DELETE

The UPDATE statement cannot be used to remove fields.
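The note on UPDATE can be modeled: DTS applies an UPDATE to the destination as a partial document change, so fields can be overwritten or added but never dropped from the document. A pure-Python sketch of that behavior (illustrative only, not DTS code):

```python
def apply_update(document, changed_columns):
    """Model how an UPDATE reaches the Elasticsearch document:
    changed columns overwrite or add fields, but no existing field
    is ever removed from the document."""
    merged = dict(document)
    merged.update(changed_columns)  # overwrite or add, never delete
    return merged
```

For example, updating only the name column leaves every other field of the document in place.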

Database account permissions

Database | Required permissions | How to create and grant
Source PolarDB for MySQL cluster | Read permissions on the objects to synchronize | See Create an account and Modify account permissions
Destination Elasticsearch instance | Login name (default: elastic) and password set when the instance was created | N/A

Data type mappings

Because PolarDB for MySQL and Elasticsearch support different data types, DTS maps types based on what the destination Elasticsearch instance supports during schema initialization. See Data type mappings for initial schema synchronization.

DTS does not set the dynamic mapping parameter during schema migration. The behavior depends on your Elasticsearch instance settings. If source data is in JSON format, ensure that values for the same key have the same data type across all rows in a table; otherwise, DTS may report synchronization errors. See dynamic.
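To catch the JSON inconsistency described above before it breaks the task, the type of each key's value can be compared across rows. A hypothetical pre-flight check in Python (the row and column shapes are assumptions, not a DTS interface):

```python
import json

def check_json_column_types(rows, column):
    """For a JSON-typed column, verify that every key holds the same
    data type across all rows of a table. With dynamic mapping, a type
    mismatch can make DTS report synchronization errors."""
    seen = {}       # key -> type name from the first row that used it
    conflicts = []
    for row in rows:
        for key, value in json.loads(row[column]).items():
            t = type(value).__name__
            if seen.setdefault(key, t) != t:
                conflicts.append((key, seen[key], t))
    return conflicts
```

A table where "price" is sometimes a number and sometimes a string would be flagged, while consistent rows pass cleanly.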

The following table shows how Elasticsearch concepts map to relational database concepts:

Elasticsearch | Relational database
Index | Database
Type | Table
Document | Row
Field | Column
Mapping | Database schema

Create a synchronization task

Step 1: Open the data synchronization page

Open the data synchronization task list in one of the following ways:

DTS console

  1. Log on to the DTS console.

  2. In the navigation pane on the left, click Data Synchronization.

  3. In the upper-left corner of the page, select the region where the synchronization instance is located.

DMS console

Note

The actual steps may vary depending on the mode and layout of the DMS console. For more information, see Simple mode console and Customize the layout and style of the DMS console.

  1. Log on to the DMS console.

  2. In the top menu bar, choose Data + AI > DTS (DTS) > Data Synchronization.

  3. To the right of Data Synchronization Tasks, select the region of the synchronization instance.

Step 2: Configure source and destination databases

  1. Click Create Task.

  2. Configure the source and destination databases using the following settings.

General

Parameter | Description
Task Name | DTS automatically generates a name. Specify a descriptive name for easy identification. The name does not need to be unique.

Source database

Parameter | Description
Select Existing Connection | Select a registered database instance from the drop-down list to auto-fill the connection details. If no registered instance exists or you prefer not to use one, configure the connection details manually. Note: In the DMS console, this field is labeled Select a DMS database instance.
Database Type | Select PolarDB for MySQL.
Access Method | Select Alibaba Cloud Instance.
Instance Region | Select the region where the source PolarDB for MySQL cluster resides.
Replicate Data Across Alibaba Cloud Accounts | Select No if the source cluster belongs to the current Alibaba Cloud account.
PolarDB Cluster ID | Select the ID of the source PolarDB for MySQL cluster.
Database Account | Enter the database account for the source cluster. See Database account permissions.
Database Password | Enter the password for the database account.
Encryption | Select as needed. See Configure SSL encryption.

Destination database

Parameter | Description
Select Existing Connection | Select a registered database instance from the drop-down list to auto-fill the connection details. If no registered instance exists or you prefer not to use one, configure the connection details manually. Note: In the DMS console, this field is labeled Select a DMS database instance.
Database Type | Select Elasticsearch.
Access Method | Select Alibaba Cloud Instance.
Instance Region | Select the region where the destination Elasticsearch instance resides.
Type | Select Cluster or Serverless as needed.
Instance ID | Select the ID of the destination Elasticsearch instance.
Database Account | Enter the default login name elastic.
Database Password | Enter the password for the elastic account.
Encryption | Select HTTP or HTTPS as needed.
  3. Click Test Connectivity and Proceed at the bottom of the page.

Add the CIDR blocks of DTS servers to the security settings of both the source and destination databases to allow access. See Add the IP address whitelist of DTS servers. If the source or destination is a self-managed database (where Access Method is not Alibaba Cloud Instance), also click Test Connectivity in the CIDR Blocks of DTS Servers dialog box.

Step 3: Configure task objects

On the Configure Objects page, specify the objects to synchronize.

Parameter | Description
Synchronization Types | DTS always selects Incremental Data Synchronization. Also select Schema Synchronization and Full Data Synchronization (selected by default). After the precheck, DTS initializes the destination cluster with the full data of the selected source objects as the baseline for incremental synchronization.
Index Name | Table Name: the index name in the destination matches the table name. Database Name_Table Name: the index name is the database name, an underscore (_), and the table name concatenated. This setting applies to all tables.
Processing Mode of Conflicting Tables | Precheck and Report Errors: checks for tables with the same name in the destination. If found, the precheck reports an error and the task does not start. If you cannot delete or rename the conflicting table, map it to a different name. See Database Table Column Name Mapping. Ignore Errors and Proceed: skips the same-name check. Warning: This option may cause data inconsistency. During full data synchronization, DTS skips source records that conflict with destination records. During incremental synchronization, DTS overwrites destination records. If table schemas are inconsistent, initialization may fail or result in partial synchronization.
Capitalization of Object Names in Destination Instance | Set the case policy for database, table, and column names in the destination. The default is DTS default policy. See Case policy for destination object names.
Source Objects | In the Source Objects box, click the objects to synchronize, then click the arrow to move them to the Selected Objects box. Select objects at the database or table level.
Selected Objects | To modify the index name, type name, field name, or filter condition for a table, right-click the table in the Selected Objects area. See Map database and table column names and Set filter conditions. Only underscores (_) are allowed as special characters in index and type names.
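The Index Name policy and the underscore-only rule for index and type names can be sketched together. A hypothetical helper (the function name and the policy strings are illustrative, not DTS parameters):

```python
import re

def index_name(database, table, policy="table"):
    """Derive the destination index name under the two Index Name
    policies described above: "table" uses the table name as-is,
    "db_table" concatenates database name, underscore, table name."""
    name = table if policy == "table" else f"{database}_{table}"
    # Only underscores are allowed as special characters in
    # index and type names.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError(f"invalid index name: {name!r}")
    return name
```

A table named order-items would be rejected under either policy until it is mapped to an underscore-only name.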

Step 4: Configure advanced settings

Click Next: Advanced Settings and configure the following:

Parameter | Description
Dedicated cluster for task scheduling | By default, DTS uses a shared cluster. For greater task stability, purchase a dedicated cluster. See What is a DTS dedicated cluster?
Retry time for failed connections | If the connection to the source or destination fails after the task starts, DTS immediately retries. The default retry duration is 720 minutes. Set a value from 10 to 1,440 minutes; 30 minutes or more is recommended. If the connection is restored within this period, the task resumes. Note: If multiple DTS instances share a source or destination, DTS applies the shortest configured retry duration across all instances. DTS charges for task runtime during connection retries.
Retry time for other issues | If a non-connection issue occurs (for example, a DDL or DML execution error), DTS immediately retries. The default is 10 minutes. Set a value from 1 to 1,440 minutes; 10 minutes or more is recommended. Important: This value must be less than the retry time for failed connections.
Enable throttling for full data synchronization | Limit the synchronization rate to reduce pressure on the destination by setting Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Synchronization is selected. See also Adjust the rate of full data synchronization.
Enable throttling for incremental data synchronization | Limit the incremental synchronization rate by setting RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s).
Environment Tag | Select an environment label to identify the instance, if needed.
Shard Configuration | Set the number of primary and replica shards for the index, based on the maximum shard configuration in the destination Elasticsearch instance.
String Index | How strings are indexed in the destination: analyzed (analyze first, then index; also select an analyzer, see Analyzers), not analyzed (index the raw value directly), or no (do not index).
Time Zone | When synchronizing DATETIME or TIMESTAMP data types to Elasticsearch, select the time zone to use. Note: If time zone information is not needed, pre-configure the document type for this data in the destination instance.
DOCID | No configuration required. DOCID defaults to the table's primary key. If no primary key exists, Elasticsearch auto-generates the ID.
Whether to delete SQL operations on heartbeat tables | Yes: DTS does not write heartbeat SQL to the source. The DTS instance may display latency. No: DTS writes heartbeat SQL to the source, which may interfere with operations such as physical backups and cloning.
Configure ETL | Choose whether to enable extract, transform, and load (ETL). Yes: enables ETL; enter data processing statements in the code editor. See Configure ETL in a data migration or data synchronization task. No: disables ETL. See What is ETL?
Monitoring and Alerting | No: no alerts configured. Yes: configure the alert threshold and notification contacts. See Configure monitoring and alerting during task configuration.
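The connection-retry behavior described above can be sketched as a retry loop bounded by a time window. A simplified, hypothetical model (DTS's actual scheduler is not public; the interval parameter is an assumption for illustration):

```python
import time

def retry_within_window(operation, window_minutes=720, interval_seconds=1.0):
    """Keep retrying a failed operation until it succeeds or the retry
    window elapses (default 720 minutes, matching the default retry
    time for failed connections). If the window is exhausted, the last
    error propagates and the task fails."""
    deadline = time.monotonic() + window_minutes * 60
    while True:
        try:
            return operation()
        except ConnectionError:
            if time.monotonic() >= deadline:
                raise  # retry window exhausted
            time.sleep(interval_seconds)
```

If the connection is restored within the window, the loop returns normally and the task resumes; otherwise the failure surfaces, mirroring how the DTS task ends after the retry time runs out.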

Step 5: Configure database and table fields

Click Next: Configure Database and Table Fields to set the routing strategy and document ID for tables in the destination Elasticsearch instance.

Set Definition Status to All to view and edit all tables.
Parameter | Description
Set _routing | Yes: define a custom column for routing documents to specific shards. See _routing. No: route using _id. If the destination Elasticsearch instance is version 7.x, select No.
_routing Column | Select the column to use for routing. Required only when Set _routing is Yes.
Value of _id | Select the column to use as the document ID.
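Elasticsearch routes each document to a primary shard by hashing the _routing value (the document _id when Set _routing is No) modulo the number of primary shards. A sketch of that formula; Elasticsearch actually uses a murmur3 hash internally, so the CRC32 used here is only a stand-in for illustration:

```python
import zlib

def shard_for(routing_value, number_of_primary_shards):
    """Pick the primary shard for a document: hash(_routing) modulo
    the number of primary shards. CRC32 stands in for Elasticsearch's
    murmur3; the point is that equal routing values always land on
    the same shard."""
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards
```

With a custom _routing column, every row sharing that column's value is stored on the same shard, which is why the column choice matters for query locality.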

Step 6: Save the task and run the precheck

  • To view the API parameters for this configuration, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.

  • Click Next: Save Task Settings and Precheck to save the task and start the precheck.

DTS runs a precheck before starting the task. The task starts only if the precheck passes.
If the precheck fails, click View Details next to the failed item, fix the issue as prompted, and rerun the precheck.
For non-ignorable warnings, click View Details, fix the issue, and rerun the precheck.
For ignorable warnings, click Confirm Alert Details > Ignore > OK, then click Precheck Again. Ignoring precheck warnings may cause data inconsistency. Proceed with caution.

Step 7: Purchase the instance

  1. When the Success Rate reaches 100%, click Next: Purchase Instance.

  2. On the Purchase page, configure the instance.

Category | Parameter | Description
New instance class | Billing method | Subscription: pay upfront for a set duration. Cost-effective for long-term, continuous tasks. Pay-as-you-go: billed hourly for actual usage. Ideal for short-term or test tasks; release the instance at any time to stop charges.
New instance class | Resource group settings | The resource group for the instance. Defaults to default resource group. See What is Resource Management?
New instance class | Instance class | Different specifications affect the synchronization rate. Select based on your requirements. See Data synchronization link specifications.
New instance class | Subscription duration | In subscription mode, select the duration and quantity. Monthly options: 1–9 months. Yearly options: 1, 2, 3, or 5 years. Appears only when Billing method is Subscription.
  3. Select the checkbox for Data Transmission Service (Pay-as-you-go) Service Terms.

  4. Click Buy and Start, then click OK.

Monitor the task progress on the data synchronization page.