Sync MySQL Data to PolarSearch via AutoETL - PolarDB

Running full-text search or complex analysis on PolarDB for MySQL can affect core service stability. AutoETL is a built-in capability of PolarDB for MySQL that continuously synchronizes data from the read-write node to a PolarSearch node within the same cluster—no external ETL tools required. Use a search view or an ETL stored procedure to create a synchronization link and offload search and analysis workloads from online transaction processing (OLTP) workloads.

Note

AutoETL is currently in phased release. Submit a ticket to request access before using this feature.

Choose a synchronization method

AutoETL provides two methods for creating a data synchronization link:

Method	Syntax	Best for
Search view	`CREATE SEARCH VIEW` (standard SQL)	Most single-table sync and multi-table aggregation scenarios. The system handles all underlying connection details automatically.
ETL stored procedure (`dbms_etl.sync_by_sql`)	Flink SQL-compatible syntax	Complex data cleansing, transformation, and aggregation. Gives you full control over source and target connection configuration.

For most use cases, start with a search view—it requires less configuration and no Flink SQL knowledge.

Prerequisites

Before you begin, make sure that:

Cluster version meets the minimum revision for your chosen method:
Method MySQL 8.0.1 (minimum revision) MySQL 8.0.2 (minimum revision)
Search view 8.0.1.1.54 8.0.2.2.34
ETL stored procedure 8.0.1.1.52 8.0.2.2.33
Binlog is enabled on your cluster. See Enable binary logging.
The source table has a primary key or unique key.
Your account has the ALTER permission on all source tables and the SELECT permission on the relevant columns.

Method	MySQL 8.0.1 (minimum revision)	MySQL 8.0.2 (minimum revision)
Search view	8.0.1.1.54	8.0.2.2.34
ETL stored procedure	8.0.1.1.52	8.0.2.2.33

Limitations

Only synchronization from PolarDB for MySQL to a PolarSearch node within the same cluster is supported.
The following data types are not supported: BIT, and spatial types (GEOMETRY, POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, GEOMETRYCOLLECTION).
Search views define synchronization semantics only—they cannot be queried directly. To query synchronized data, connect to the PolarSearch node.
New columns added to a source table after a search view is created are not automatically synchronized. To synchronize new columns, see Modify a search view.
DDL changes on a source table require specific handling to avoid interrupting synchronization. For details, see DDL Change Rules and Practices.

Search views

A search view uses standard SQL to declare synchronization logic. After creation, the system automatically establishes a continuous data synchronization link from the source table to the PolarSearch node.

Create a search view

Syntax

CREATE SEARCH VIEW view_name [(column_list, PRIMARY KEY (pk_column_list))] AS select_statement;

Parameters

Parameter	Required	Default	Description
`view_name`	Yes	—	Name of the search view. Also used as the target index name on the PolarSearch node.
`column_list`	No	Source table column names (for single-table sync)	Column definitions for the search view. Separate multiple columns with a comma (`,`).
`pk_column_list`	No	First column in `column_list`; for single-table sync, the source table primary key	Specifies which column maps to the document ID on the PolarSearch index.
`select_statement`	Yes	—	A `SELECT` statement that defines the data source and synchronization logic. Supports single-table queries and `JOIN`, `UNION ALL`, and `GROUP BY` operations.

Usage notes

To use a custom target index, create and configure it on the PolarSearch node before creating the search view. If the target index does not exist at creation time, the system creates it automatically.
To configure advanced synchronization parameters—such as JSON field conversions or routing fields for multi-table aggregation—see AutoETL Parameter Configuration and Practice Cases.

Data preparation

The following examples use test tables. Run these statements in PolarDB for MySQL to set up the data:

CREATE DATABASE IF NOT EXISTS db1;
CREATE DATABASE IF NOT EXISTS db2;

USE db1;
CREATE TABLE IF NOT EXISTS t1 (
    id INT PRIMARY KEY,
    c1 VARCHAR(100),
    c2 VARCHAR(100)
);
INSERT INTO t1(id, c1, c2) VALUES
(1, 'apple', 'red'),
(2, 'banana', 'yellow'),
(3, 'grape', 'purple');

USE db2;
CREATE TABLE IF NOT EXISTS t2 (id INT PRIMARY KEY, c2 INT);
INSERT INTO t2(id, c2) VALUES (1, 111), (2, 222), (4, 444);

Examples

Full table sync — synchronize all columns from db1.t1. The view name view_test becomes the target index name in PolarSearch.
```
CREATE SEARCH VIEW view_test AS SELECT * FROM db1.t1;
```

Column selection — synchronize only c1 and c2.

CREATE SEARCH VIEW view_test1 AS SELECT c1, c2 FROM db1.t1;

Conditional filtering — synchronize only rows where c1 > 10.

CREATE SEARCH VIEW view_test2 AS SELECT id, c1, c2 FROM db1.t1 WHERE c1 > 10;

Multi-table JOIN — join db1.t1 and db2.t2 on id and synchronize the result.

CREATE SEARCH VIEW view_test3(id, c1, c2) AS SELECT t1.id, t1.c1, t2.c2 FROM db1.t1 AS t1 LEFT JOIN db2.t2 AS t2 ON t1.id = t2.id;

Multi-table UNION — merge two tables with compatible structures. Both SELECT statements must return the same number and types of columns.
```
CREATE SEARCH VIEW view_test4(id, c2) AS SELECT id, c2 FROM db1.t1 UNION ALL SELECT id, c2 FROM db2.t2;
```
Grouped aggregation — group and aggregate before synchronization. When using GROUP BY, define the columns and primary key manually.
```
CREATE SEARCH VIEW view_test5 (id, max_c) AS SELECT t1.id, MAX(t1.c1) AS max_c FROM db1.t1 GROUP BY t1.id;
```

Verify data synchronization

Check the synchronization status:

SHOW SEARCH VIEW STATUS;

A status of active means the view is synchronizing normally:

+------------+--------+----------+---------+---------------------+---------------------+
| View Name  | Type   | Status   | Message | Created_at          | Updated_at          |
+------------+--------+----------+---------+---------------------+---------------------+
| view_test  | search | active   |         | 2026-03-18 18:44:12 | 2026-03-18 18:51:37 |
+------------+--------+----------+---------+---------------------+---------------------+

Once the status is active, query the synchronized data through the PolarSearch node using the Elasticsearch-compatible REST API:

# Replace <user>:<password> with the PolarSearch node credentials
# and <polarsearch_endpoint> with the endpoint and port of the PolarSearch node.
curl -u <user>:<password> -X GET "http://<polarsearch_endpoint>/view_test/_search"

The response includes the synchronized documents:

{
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "hits": [
      { "_id": "1", "_source": { "id": 1, "c1": "apple", "c2": "red" } },
      { "_id": "2", "_source": { "id": 2, "c1": "banana", "c2": "yellow" } },
      { "_id": "3", "_source": { "id": 3, "c1": "grape", "c2": "purple" } }
    ]
  }
}

Manage search views

View all search views and their status:
```
SHOW SEARCH VIEW STATUS;
```

View the definition of a specific search view:

SHOW CREATE SEARCH VIEW view_test;

Output:

+-----------+------------------------------------------------------+
| View Name | Create Search View                                   |
+-----------+------------------------------------------------------+
| view_test | CREATE SEARCH VIEW view_test AS SELECT * FROM db1.t1 |
+-----------+------------------------------------------------------+

Modify a search view

To change synchronization logic—such as adding a new field or updating a query condition—use a "new index + new search view" approach to avoid affecting running queries.

Create a new search view that synchronizes data to a new PolarSearch index.
Run SHOW SEARCH VIEW STATUS and wait until the synchronization latency for the new view drops to 0–1 seconds.
Switch your business queries from the old index to the new one.
Drop the old search view.

For details on how DDL changes to a source table affect a search view, see DDL Change Rules and Practices.

Drop a search view

Important

Dropping a search view stops data synchronization and cleans up related resources, but does not delete the index data in PolarSearch. This operation cannot be undone—confirm the consequences before proceeding.

DROP SEARCH VIEW view_name;

State transitions when dropping:

`active`: Changes to dropping first, then to dropped after resources are cleaned up.
`dropped`: All information is permanently deleted.
Other states: Dropping is not supported.

ETL stored procedure (sync_by_sql)

For complex transformations, aggregations, or computations not covered by search views, use dbms_etl.sync_by_sql to define synchronization logic with Flink SQL-compatible syntax.

Create a synchronization link

Important

Do not hard-code passwords in SQL statements. The example below shows the syntax structure only. In production, use a secrets management solution instead of plaintext passwords.

Syntax

CALL dbms_etl.sync_by_sql("search", "<sync_sql>");

The system automatically substitutes the following placeholders in <sync_sql>. You only need to write the SQL using these fixed placeholders—no manual connection configuration required.

Placeholder	Substituted value
`{mysql_host}`	Internal hostname of the PolarDB read-write node
`{mysql_port}`	Port of the PolarDB read-write node
`{mysql_user}`	PolarDB username
`{mysql_password}`	PolarDB password
`{search_host}`	Hostname of the PolarSearch node
`{search_port}`	Port of the PolarSearch node
`{search_user}`	PolarSearch username
`{search_password}`	PolarSearch password

Example

The following example defines a source table, a target table, and a GROUP BY aggregation:

CALL dbms_etl.sync_by_sql("search", "
-- Step 1: Define the PolarDB source table
CREATE TEMPORARY TABLE `db1`.`sbtest1` (
  `id`   BIGINT,
  `c1`   STRING,
  `c2`   STRING,
  PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
  'connector'     = 'mysql',
  'hostname'      = '{mysql_host}',
  'port'          = '{mysql_port}',
  'username'      = '{mysql_user}',
  'password'      = '{mysql_password}',  -- Use a secrets manager in production.
  'database-name' = 'db1',
  'table-name'    = 't1',
  'server-id'     = '10000-11000'        -- Must be unique across all synchronization links.
);

-- Step 2: Define the PolarSearch target table
CREATE TEMPORARY TABLE `dest` (
  `id`    BIGINT,
  `max_c` STRING,
  PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
  'connector' = 'opensearch',
  'hosts'     = '{search_host}:{search_port}',
  'index'     = 'dest',
  'username'  = '{search_user}',
  'password'  = '{search_password}'     -- Use a secrets manager in production.
);

-- Step 3: Define the transformation and write it to the target
INSERT INTO `dest`
SELECT
    `t1`.`id`,
    MAX(`t1`.`c1`)
FROM `db1`.`t1` AS `t1`
GROUP BY `t1`.`id`;
");

If you create multiple Flink SQL synchronization links, each must use a unique server-id range in the WITH clause to avoid conflicts.

Manage synchronization links

View all links:
```
CALL dbms_etl.show_sync_link();
```

View a specific link by ID (replace <sync_id> with the ID returned when you created the link):

CALL dbms_etl.show_sync_link_by_id('<sync_id>')\G

Output fields:

*************************** 1. row ***************************
        SYNC_ID: crb5rmv8rttsg
           NAME: crb5rmv8rttsg
         SYSTEM: search
SYNC_DEFINITION: db1.t1 -> dest
  SOURCE_TABLES: db1.t1
    SINK_TABLES: dest
         STATUS: active   -- active = synchronizing normally
        MESSAGE:           -- error message, if any
     CREATED_AT: 2024-05-20 11:55:06
     UPDATED_AT: 2024-05-20 17:28:04
        OPTIONS: ...

Modify a synchronization link

Use the same "new index + new link" approach as for search views.

Create a new synchronization link to a new PolarSearch index.
Run CALL dbms_etl.show_sync_link_by_id('<sync_id>') and wait until the synchronization latency drops to 0–1 seconds.
Switch your business queries from the old index to the new one.
Drop the old synchronization link.

Drop a synchronization link

Important

Dropping a synchronization link stops data synchronization and cleans up related resources, but does not delete the index data in PolarSearch. This operation cannot be undone—confirm the consequences before proceeding.

CALL dbms_etl.drop_sync_link('<sync_id>');

State transitions when dropping:

`active`: Changes to dropping first, then to dropped after resources are cleaned up.
`dropped`: All link information is permanently deleted.
Other states: Dropping is not supported.

FAQ

What is the difference between a search view and an ETL stored procedure?

Search views use standard SQL (CREATE SEARCH VIEW) and have the system manage all connection details automatically—no Flink SQL required. They cover most single-table sync and multi-table aggregation scenarios.

The ETL stored procedure (CALL dbms_etl.sync_by_sql) uses Flink SQL-compatible syntax and requires you to manually define source and target connection configurations. Choose this method when you need complex data cleansing, transformation, or aggregation that exceeds what standard SQL can express.

PolarDB:Data synchronization from PolarDB MySQL to PolarSearch

Choose a synchronization method

Prerequisites

Limitations

Search views

Create a search view

Verify data synchronization

Manage search views

Modify a search view

Drop a search view

ETL stored procedure (sync_by_sql)

Create a synchronization link

Manage synchronization links

Modify a synchronization link

Drop a synchronization link

FAQ

What's next