When you need to perform full-text search or complex analysis on business data in PolarDB for MySQL, operating directly on the database can impact the stability of your core business. PolarDB provides the AutoETL feature to automatically and continuously synchronize data from a read-write node to a PolarSearch node within the cluster. You can create data synchronization links using search views or ETL stored procedures. This approach eliminates the need to deploy and maintain separate ETL tools and isolates search and analytics workloads from online transactional processing workloads.
When you create a link using AutoETL, the AutoETL engine is granted permission by default to access PolarDB data for synchronization.
Feature overview
AutoETL is a built-in data synchronization capability of PolarDB for MySQL that automatically synchronizes data between different types of nodes within a cluster. The current version supports only data synchronization from PolarDB for MySQL to a PolarSearch node in the same cluster for high-performance search and analysis.
AutoETL provides two methods for creating a data synchronization link:
-
Search view: Uses the
CREATE SEARCH VIEWsyntax to define data synchronization logic in standard SQL. This method is ideal for most single-table synchronization and multi-table aggregation scenarios, as the system automatically handles the underlying connection details. -
ETL stored procedure (
dbms_etl.sync_by_sql): Uses a stored procedure with Flink SQL-compatible syntax to define complex logic for data cleansing, transformation, and aggregation.
Prerequisites
Before you use the AutoETL feature, ensure that your environment meets the following requirements:
-
Cluster versions:
-
Search view:
-
MySQL 8.0.1, revision version 8.0.1.1.54 or later.
-
MySQL 8.0.2, revision version 8.0.2.2.34 or later.
-
-
ETL stored procedure (sync_by_sql):
-
MySQL 8.0.1, revision version 8.0.1.1.52 or later.
-
MySQL 8.0.2, revision version 8.0.2.2.33 or later.
-
-
-
Binlog: The cluster must have binlog enabled.
-
Synchronization direction: Only synchronization from PolarDB for MySQL to a PolarSearch node within the same cluster is supported.
-
DDL limitations: When you perform DDL operations on a source table with an associated search view or ETL stored procedure, you must follow specific rules to avoid interrupting synchronization. Some incompatible changes require you to re-create the search view. For more information, see DDL Change Rules and Best Practices.
-
Data types: AutoETL does not support synchronizing the
BITdata type or spatial data types such asGEOMETRY,POINT,LINESTRING,POLYGON,MULTIPOINT,MULTILINESTRING,MULTIPOLYGON, andGEOMETRYCOLLECTION. -
Search view query limitations: Search views can only be used to define synchronization semantics and do not support data queries. To query data, connect directly to the PolarSearch node.
Search views
A search view is a declarative data synchronization mechanism provided by AutoETL. You can use standard SQL syntax to create a search view, which automatically establishes a continuous data synchronization link from the source table to a PolarSearch node.
Create a search view
Syntax
CREATE SEARCH VIEW view_name [(column_list, PRIMARY KEY (pk_column_list))] AS select_statement;
Parameters
|
Parameter |
Required |
Description |
|
|
Yes |
The name of the search view. This is also the name of the target index in the PolarSearch node. |
|
|
No |
Manually defines the columns of the search view. Use commas ( Note
For single-table synchronization, you do not need to specify |
|
|
No |
Specifies the primary key columns for the search view. The structure of a search view corresponds to the index mapping of a PolarSearch node, and If this parameter is not specified, the first column in |
|
|
Yes |
A |
Usage notes and limitations
-
The source table must have a primary key or a unique key.
-
You must have
ALTERpermissions on all source tables referenced in the search view andSELECTpermissions on the relevant columns or the entire tables. -
After you create a search view, the system does not synchronize new columns added to the source table by default. To synchronize new columns, see Modify a search view.
-
If you want to use a custom configuration for the target index, you can first manually create the index and define its configuration in the PolarSearch node, and then create the search view. If the target index does not exist when you create the search view, the system creates it automatically.
-
To configure advanced synchronization parameters, such as JSON field conversions or routing fields for multi-table aggregations or complex queries, see AutoETL Parameter Configuration and Use Cases.
Prepare the data
To prepare test data for the following examples, run these SQL statements in PolarDB for MySQL.
CREATE DATABASE IF NOT EXISTS db1;
CREATE DATABASE IF NOT EXISTS db2;
USE db1;
CREATE TABLE IF NOT EXISTS t1 (
id INT PRIMARY KEY,
c1 VARCHAR(100),
c2 VARCHAR(100)
);
INSERT INTO t1(id, c1, c2) VALUES
(1, 'apple', 'red'),
(2, 'banana', 'yellow'),
(3, 'grape', 'purple');
USE db2;
CREATE TABLE IF NOT EXISTS t2 (id INT PRIMARY KEY, c2 INT);
INSERT INTO t2(id, c2) VALUES (1, 111), (2, 222), (4, 444);
Examples
-
Full table synchronization: Synchronize all data from
db1.t1to PolarSearch. The view nameview_testserves as the name of the target index in PolarSearch.CREATE SEARCH VIEW view_test AS SELECT * FROM db1.t1; -
Specific column synchronization: Synchronize only the
c1andc2columns, and manually define the column types and primary key.CREATE SEARCH VIEW view_test1 AS SELECT c1, c2 FROM db1.t1; -
Conditional filtering: Synchronize only the data that meets the
WHEREcondition.CREATE SEARCH VIEW view_test2 AS SELECT id, c1, c2 FROM db1.t1 WHERE c1 > 10; -
Multi-table JOIN: Join
db1.t1anddb2.t2on theidfield and synchronize the result to PolarSearch.CREATE SEARCH VIEW view_test3(id, c1, c2) AS SELECT t1.id, t1.c1, t2.c2 FROM db1.t1 AS t1 LEFT JOIN db2.t2 AS t2 ON t1.id = t2.id; -
Multi-table UNION: Merge and synchronize multiple tables with the same structure. The
SELECTstatements must have the same number and types of columns.CREATE SEARCH VIEW view_test4(id, c2) AS SELECT id, c2 FROM db1.t1 UNION ALL SELECT id, c2 FROM db2.t2; -
Group and aggregate: Group and aggregate data before synchronization. You must manually define the columns and primary key when using
GROUP BY.CREATE SEARCH VIEW view_test5 (id, max_c) AS SELECT t1.id, MAX(t1.c1) AS max_c FROM db1.t1 GROUP BY t1.id;
Verify the data
Check the sync status of the search view:
SHOW SEARCH VIEW STATUS;
When the status is active, the search view is synchronizing data as expected. Connect to the PolarSearch node and use the Elasticsearch-compatible REST API to verify the data:
# Replace : with the credentials for the PolarSearch node and with the connection endpoint and port of the PolarSearch node.
curl -u : -X GET "http:///view_test/_search"
Manage search views
You can use the following commands to view your search views.
-
View the status of all search views:
SHOW SEARCH VIEW STATUS;The following result is returned. A status of
activeindicates that the search view is synchronizing data as expected.+------------+--------+----------+---------+---------------------+---------------------+ | View Name | Type | Status | Message | Created_at | Updated_at | +------------+--------+----------+---------+---------------------+---------------------+ | view_test | search | active | | 2026-03-18 18:44:12 | 2026-03-18 18:51:37 | +------------+--------+----------+---------+---------------------+---------------------+ -
View the creation statement for a specific search view:
SHOW CREATE SEARCH VIEW view_test;The following result is returned:
+-----------+------------------------------------------------------+ | View Name | Create Search View | +-----------+------------------------------------------------------+ | view_test | CREATE SEARCH VIEW view_test AS SELECT * FROM db1.t1 | +-----------+------------------------------------------------------+
Delete a search view
Deleting a search view is a high-risk operation. Proceed with caution. This operation stops data synchronization for the search view and cleans up related resources, but it does not delete the index data in PolarSearch.
DROP SEARCH VIEW view_name;
The system's behavior when you delete a search view varies based on its status:
-
For a search view in the
activestate, the status first changes todropping. After the system finishes cleaning up the synchronization resources, the status changes todropped. -
If you delete a search view that is in the
droppedstate, the system permanently removes all information about that search view. -
You cannot delete a search view that is in any other state.
Modify a search view
When you need to modify the synchronization logic of a search view, such as adding a synchronized field or changing a query condition, re-create it by using a "new index + new search view" approach. This method does not affect your business queries.
-
Create a new search view that synchronizes data to a new PolarSearch index.
-
Check the status of the new search view by running
SHOW SEARCH VIEW STATUS. When the synchronization latency drops to between 0 and 1 second, switch your business query logic from the old index to the new index. -
Delete the old search view.
For more information about the impact of source table DDL changes on search views and best practices for modifications, see DDL Change Rules and Best Practices.
ETL stored procedure (sync_by_sql)
For scenarios that require complex transformations, aggregations, or calculations, you can use the CALL dbms_etl.sync_by_sql stored procedure to define data synchronization logic with Flink SQL-compatible syntax.
Create a synchronization link
Syntax
CALL dbms_etl.sync_by_sql("search", "");
Example
The system automatically configures the connection information for the source and target tables, such as host address, port, and credentials. You do not need to specify it in the WITH clause.
CALL dbms_etl.sync_by_sql("search", "
-- Step 1: Define the PolarDB source table
CREATE TEMPORARY TABLE `db1`.`t1` (
`id` BIGINT,
`c1` STRING,
PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
'connector' = 'mysql',
'database-name' = 'db1',
'table-name' = 't1'
);
-- Step 2: Define the PolarSearch target table
CREATE TEMPORARY TABLE `dest` (
`id` BIGINT,
`max_c` STRING,
PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
'connector' = 'opensearch',
'index' = 'dest'
);
-- Step 3: Define the computation and insertion logic
INSERT INTO `dest`
SELECT
`t1`.`id`,
MAX(`t1`.`c1`)
FROM `db1`.`t1` AS `t1`
GROUP BY `t1`.`id`;
");
Verify the data
Connect to the PolarSearch node and use the Elasticsearch-compatible REST API to query and confirm that the data has been synchronized.
# Replace with the connection endpoint of the PolarSearch node.
curl -u : -X GET "http:///dest/_search"
Manage synchronization links
You can use the following commands to view your data synchronization links.
-
To view all links:
CALL dbms_etl.show_sync_link(); -
To view a specific link by its ID, replace
<sync_id>with the ID returned when you created the link.CALL dbms_etl.show_sync_link_by_id('<sync_id>')\GThe following describes the returned parameters:
*************************** 1. row *************************** SYNC_ID: crb5rmv8rttsg NAME: crb5rmv8rttsg SYSTEM: search SYNC_DEFINITION: db1.t1 -> dest SOURCE_TABLES: db1.t1 SINK_TABLES: dest STATUS: active -- Link status. `active` indicates the link is running normally. MESSAGE: -- If an error occurs, the error message is displayed here. CREATED_AT: 2024-05-20 11:55:06 UPDATED_AT: 2024-05-20 17:28:04 OPTIONS: ...
Delete a synchronization link
This operation stops data synchronization and cleans up related resources.
Deleting a data synchronization link is a high-risk operation. Proceed with caution. This operation stops data synchronization for the link and cleans up related resources, but it does not delete the index data in PolarSearch.
CALL dbms_etl.drop_sync_link('<sync_id>');
The system's behavior when you run drop_sync_link to delete a link varies based on its status:
-
For a link in the
activestate, the status first changes todropping. After the system finishes cleaning up the link resources, the status changes todropped. -
If you delete a link that is in the
droppedstate, the system permanently removes all information about that link. -
You cannot delete a link that is in any other state.
Modify a synchronization link
When you need to modify a data synchronization link, re-create it by using the same "new index + new link" method as you would for a search view.
-
Create a new data synchronization link that synchronizes data to a new PolarSearch index.
-
Check the status of the new synchronization link by running
CALL dbms_etl.show_sync_link_by_id('<sync_id>'). When the synchronization latency drops to between 0 and 1 second, switch your business queries from the old index to the new index. -
Delete the old data synchronization link.