All Products
Search
Document Center

PolarDB:Search views and ETL stored procedures

Last Updated:Jun 06, 2026

When you need to perform full-text search or complex analysis on business data in PolarDB for MySQL, operating directly on the database can impact the stability of your core business. PolarDB provides the AutoETL feature to automatically and continuously synchronize data from a read-write node to a PolarSearch node within the cluster. You can create data synchronization links using search views or ETL stored procedures. This approach eliminates the need to deploy and maintain separate ETL tools and isolates search and analytics workloads from online transactional processing workloads.

Note

When you create a link using AutoETL, the AutoETL engine is granted permission by default to access PolarDB data for synchronization.

Feature overview

AutoETL is a built-in data synchronization capability of PolarDB for MySQL that automatically synchronizes data between different types of nodes within a cluster. The current version supports only data synchronization from PolarDB for MySQL to a PolarSearch node in the same cluster for high-performance search and analysis.

AutoETL provides two methods for creating a data synchronization link:

  • Search view: Uses the CREATE SEARCH VIEW syntax to define data synchronization logic in standard SQL. This method is ideal for most single-table synchronization and multi-table aggregation scenarios, as the system automatically handles the underlying connection details.

  • ETL stored procedure (dbms_etl.sync_by_sql): Uses a stored procedure with Flink SQL-compatible syntax to define complex logic for data cleansing, transformation, and aggregation.

Prerequisites

Before you use the AutoETL feature, ensure that your environment meets the following requirements:

  • Cluster versions:

    • Search view:

      • MySQL 8.0.1, revision version 8.0.1.1.54 or later.

      • MySQL 8.0.2, revision version 8.0.2.2.34 or later.

    • ETL stored procedure (sync_by_sql):

      • MySQL 8.0.1, revision version 8.0.1.1.52 or later.

      • MySQL 8.0.2, revision version 8.0.2.2.33 or later.

  • Binlog: The cluster must have binlog enabled.

  • Synchronization direction: Only synchronization from PolarDB for MySQL to a PolarSearch node within the same cluster is supported.

  • DDL limitations: When you perform DDL operations on a source table with an associated search view or ETL stored procedure, you must follow specific rules to avoid interrupting synchronization. Some incompatible changes require you to re-create the search view. For more information, see DDL Change Rules and Best Practices.

  • Data types: AutoETL does not support synchronizing the BIT data type or spatial data types such as GEOMETRY, POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, and GEOMETRYCOLLECTION.

  • Search view query limitations: Search views can only be used to define synchronization semantics and do not support data queries. To query data, connect directly to the PolarSearch node.

Search views

A search view is a declarative data synchronization mechanism provided by AutoETL. You can use standard SQL syntax to create a search view, which automatically establishes a continuous data synchronization link from the source table to a PolarSearch node.

Create a search view

Syntax

CREATE SEARCH VIEW view_name [(column_list, PRIMARY KEY (pk_column_list))] AS select_statement;

Parameters

Parameter

Required

Description

view_name

Yes

The name of the search view. This is also the name of the target index in the PolarSearch node.

column_list

No

Manually defines the columns of the search view. Use commas (,) to separate multiple columns.

Note

For single-table synchronization, you do not need to specify column_list. The original column names from the source table are used by default.

pk_column_list

No

Specifies the primary key columns for the search view. The structure of a search view corresponds to the index mapping of a PolarSearch node, and pk_column_list is used to specify the index document ID for the PolarSearch node.

If this parameter is not specified, the first column in column_list is used as the document ID by default. For single-table synchronization, the primary key of the source table is used as the document ID by default.

select_statement

Yes

A SELECT statement that defines the data source and synchronization logic. This statement retrieves data from the source table and stores the result in PolarSearch. It supports operations such as single-table queries and multi-table JOIN, UNION ALL, and GROUP BY clauses.

Usage notes and limitations

  • The source table must have a primary key or a unique key.

  • You must have ALTER permissions on all source tables referenced in the search view and SELECT permissions on the relevant columns or the entire tables.

  • After you create a search view, the system does not synchronize new columns added to the source table by default. To synchronize new columns, see Modify a search view.

  • If you want to use a custom configuration for the target index, you can first manually create the index and define its configuration in the PolarSearch node, and then create the search view. If the target index does not exist when you create the search view, the system creates it automatically.

  • To configure advanced synchronization parameters, such as JSON field conversions or routing fields for multi-table aggregations or complex queries, see AutoETL Parameter Configuration and Use Cases.

Prepare the data

To prepare test data for the following examples, run these SQL statements in PolarDB for MySQL.

CREATE DATABASE IF NOT EXISTS db1;
CREATE DATABASE IF NOT EXISTS db2;

USE db1;
CREATE TABLE IF NOT EXISTS t1 (
    id INT PRIMARY KEY,
    c1 VARCHAR(100),
    c2 VARCHAR(100)
);
INSERT INTO t1(id, c1, c2) VALUES
(1, 'apple', 'red'),
(2, 'banana', 'yellow'),
(3, 'grape', 'purple');

USE db2;
CREATE TABLE IF NOT EXISTS t2 (id INT PRIMARY KEY, c2 INT);
INSERT INTO t2(id, c2) VALUES (1, 111), (2, 222), (4, 444);

Examples

  • Full table synchronization: Synchronize all data from db1.t1 to PolarSearch. The view name view_test serves as the name of the target index in PolarSearch.

    CREATE SEARCH VIEW view_test AS SELECT * FROM db1.t1;
  • Specific column synchronization: Synchronize only the c1 and c2 columns, and manually define the column types and primary key.

    CREATE SEARCH VIEW view_test1 AS SELECT c1, c2 FROM db1.t1;
  • Conditional filtering: Synchronize only the data that meets the WHERE condition.

    CREATE SEARCH VIEW view_test2 AS SELECT id, c1, c2 FROM db1.t1 WHERE c1 > 10;
  • Multi-table JOIN: Join db1.t1 and db2.t2 on the id field and synchronize the result to PolarSearch.

    CREATE SEARCH VIEW view_test3(id, c1, c2) AS SELECT t1.id, t1.c1, t2.c2 FROM db1.t1 AS t1 LEFT JOIN db2.t2 AS t2 ON t1.id = t2.id;
  • Multi-table UNION: Merge and synchronize multiple tables with the same structure. The SELECT statements must have the same number and types of columns.

    CREATE SEARCH VIEW view_test4(id, c2) AS SELECT id, c2 FROM db1.t1 UNION ALL SELECT id, c2 FROM db2.t2;
  • Group and aggregate: Group and aggregate data before synchronization. You must manually define the columns and primary key when using GROUP BY.

    CREATE SEARCH VIEW view_test5 (id, max_c) AS SELECT t1.id, MAX(t1.c1) AS max_c FROM db1.t1 GROUP BY t1.id;

Verify the data

Check the sync status of the search view:

SHOW SEARCH VIEW STATUS;

When the status is active, the search view is synchronizing data as expected. Connect to the PolarSearch node and use the Elasticsearch-compatible REST API to verify the data:

# Replace : with the credentials for the PolarSearch node and  with the connection endpoint and port of the PolarSearch node.
curl -u : -X GET "http:///view_test/_search"

Manage search views

You can use the following commands to view your search views.

  • View the status of all search views:

    SHOW SEARCH VIEW STATUS;

    The following result is returned. A status of active indicates that the search view is synchronizing data as expected.

    +------------+--------+----------+---------+---------------------+---------------------+
    | View Name  | Type   | Status   | Message | Created_at          | Updated_at          |
    +------------+--------+----------+---------+---------------------+---------------------+
    | view_test  | search | active   |         | 2026-03-18 18:44:12 | 2026-03-18 18:51:37 |
    +------------+--------+----------+---------+---------------------+---------------------+
  • View the creation statement for a specific search view:

    SHOW CREATE SEARCH VIEW view_test;

    The following result is returned:

    +-----------+------------------------------------------------------+
    | View Name | Create Search View                                   |
    +-----------+------------------------------------------------------+
    | view_test | CREATE SEARCH VIEW view_test AS SELECT * FROM db1.t1 |
    +-----------+------------------------------------------------------+

Delete a search view

Important

Deleting a search view is a high-risk operation. Proceed with caution. This operation stops data synchronization for the search view and cleans up related resources, but it does not delete the index data in PolarSearch.

DROP SEARCH VIEW view_name;

The system's behavior when you delete a search view varies based on its status:

  • For a search view in the active state, the status first changes to dropping. After the system finishes cleaning up the synchronization resources, the status changes to dropped.

  • If you delete a search view that is in the dropped state, the system permanently removes all information about that search view.

  • You cannot delete a search view that is in any other state.

Modify a search view

When you need to modify the synchronization logic of a search view, such as adding a synchronized field or changing a query condition, re-create it by using a "new index + new search view" approach. This method does not affect your business queries.

  1. Create a new search view that synchronizes data to a new PolarSearch index.

  2. Check the status of the new search view by running SHOW SEARCH VIEW STATUS. When the synchronization latency drops to between 0 and 1 second, switch your business query logic from the old index to the new index.

  3. Delete the old search view.

For more information about the impact of source table DDL changes on search views and best practices for modifications, see DDL Change Rules and Best Practices.

ETL stored procedure (sync_by_sql)

For scenarios that require complex transformations, aggregations, or calculations, you can use the CALL dbms_etl.sync_by_sql stored procedure to define data synchronization logic with Flink SQL-compatible syntax.

Create a synchronization link

Syntax

CALL dbms_etl.sync_by_sql("search", "");

Example

Note

The system automatically configures the connection information for the source and target tables, such as host address, port, and credentials. You do not need to specify it in the WITH clause.

CALL dbms_etl.sync_by_sql("search", "

-- Step 1: Define the PolarDB source table
CREATE TEMPORARY TABLE `db1`.`t1` (
  `id`   BIGINT,
  `c1`   STRING,
  PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
  'connector' = 'mysql',
  'database-name' = 'db1',
  'table-name' = 't1'
);

-- Step 2: Define the PolarSearch target table
CREATE TEMPORARY TABLE `dest` (
  `id`  BIGINT,
  `max_c` STRING,
  PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
  'connector' = 'opensearch',
  'index' = 'dest'
);

-- Step 3: Define the computation and insertion logic
INSERT INTO `dest`
SELECT
    `t1`.`id`,
    MAX(`t1`.`c1`)
FROM `db1`.`t1` AS `t1`
GROUP BY `t1`.`id`;
");

Verify the data

Connect to the PolarSearch node and use the Elasticsearch-compatible REST API to query and confirm that the data has been synchronized.

# Replace  with the connection endpoint of the PolarSearch node.
curl -u : -X GET "http:///dest/_search"

Manage synchronization links

You can use the following commands to view your data synchronization links.

  • To view all links:

    CALL dbms_etl.show_sync_link();
  • To view a specific link by its ID, replace <sync_id> with the ID returned when you created the link.

    CALL dbms_etl.show_sync_link_by_id('<sync_id>')\G

    The following describes the returned parameters:

    *************************** 1. row ***************************
            SYNC_ID: crb5rmv8rttsg
               NAME: crb5rmv8rttsg
             SYSTEM: search
    SYNC_DEFINITION: db1.t1 -> dest
      SOURCE_TABLES: db1.t1
        SINK_TABLES: dest
             STATUS: active  -- Link status. `active` indicates the link is running normally.
            MESSAGE:         -- If an error occurs, the error message is displayed here.
         CREATED_AT: 2024-05-20 11:55:06
         UPDATED_AT: 2024-05-20 17:28:04
            OPTIONS: ...

Delete a synchronization link

This operation stops data synchronization and cleans up related resources.

Important

Deleting a data synchronization link is a high-risk operation. Proceed with caution. This operation stops data synchronization for the link and cleans up related resources, but it does not delete the index data in PolarSearch.

CALL dbms_etl.drop_sync_link('<sync_id>');

The system's behavior when you run drop_sync_link to delete a link varies based on its status:

  • For a link in the active state, the status first changes to dropping. After the system finishes cleaning up the link resources, the status changes to dropped.

  • If you delete a link that is in the dropped state, the system permanently removes all information about that link.

  • You cannot delete a link that is in any other state.

Modify a synchronization link

When you need to modify a data synchronization link, re-create it by using the same "new index + new link" method as you would for a search view.

  1. Create a new data synchronization link that synchronizes data to a new PolarSearch index.

  2. Check the status of the new synchronization link by running CALL dbms_etl.show_sync_link_by_id('<sync_id>'). When the synchronization latency drops to between 0 and 1 second, switch your business queries from the old index to the new index.

  3. Delete the old data synchronization link.

FAQ

What is the difference between a search view and an ETL stored procedure (sync_by_sql)?

The main differences are their use cases and complexity:

  • Search view: Uses standard SQL syntax (CREATE SEARCH VIEW). The system automatically handles the underlying connection and synchronization details without requiring you to write Flink SQL. It is suitable for most single-table synchronization and multi-table aggregation scenarios.

  • ETL stored procedure: Uses Flink SQL-compatible syntax (CALL dbms_etl.sync_by_sql). It is suitable for scenarios that require complex data cleansing, transformation, and aggregation, providing full control with Flink SQL. The system automatically handles the connection configurations for the source and target tables.