AnalyticDB for MySQL: Use federated analytics to synchronize data to Data Lakehouse Edition

Last Updated: Jan 08, 2024

You can use federated analytics together with the AnalyticDB Pipeline Service (APS) feature of AnalyticDB for MySQL to synchronize data from PolarDB for MySQL to AnalyticDB for MySQL Data Lakehouse Edition (V3.0) in real time. This topic describes how to create a synchronization job from a PolarDB for MySQL cluster to an AnalyticDB for MySQL Data Lakehouse Edition (V3.0) cluster in the PolarDB console.

You can join the DingTalk group 33600023146 to learn more about the federated analytics feature.

Prerequisites

Limits

  • PolarDB for MySQL supports federated analytics only for AnalyticDB for MySQL Data Lakehouse Edition (V3.0) clusters.

  • Federated analytics is supported only in the following regions: China (Beijing), China (Hangzhou), China (Shanghai), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), US (Silicon Valley), US (Virginia), Germany (Frankfurt), and UK (London).

  • You can create up to three synchronization jobs for each PolarDB for MySQL cluster and up to 30 synchronization jobs in each region.
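The two job quotas above can be checked before you create a job. The following is a minimal Python sketch, not a console or API feature: the function name can_create_job and the idea of passing in the current job counts are assumptions made for illustration. Only the limits of 3 and 30 come from this topic.

```python
# Quotas stated in the Limits section of this topic.
MAX_JOBS_PER_CLUSTER = 3
MAX_JOBS_PER_REGION = 30


def can_create_job(jobs_on_cluster: int, jobs_in_region: int) -> bool:
    """Return True if one more synchronization job fits within both quotas.

    Hypothetical helper: the caller supplies the current job counts,
    for example by counting the entries on the Federated Analytics page.
    """
    if jobs_on_cluster < 0 or jobs_in_region < 0:
        raise ValueError("job counts must be non-negative")
    return (jobs_on_cluster < MAX_JOBS_PER_CLUSTER
            and jobs_in_region < MAX_JOBS_PER_REGION)
```

For example, a cluster that already has three jobs cannot take another one, even if the region is far below its limit of 30.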

Create a synchronization job

  1. Log on to the PolarDB console.

  2. In the upper-left corner of the console, select a region.

  3. In the left-side navigation pane, click Federated Analytics.

  4. Click Create Job. In the Create Job panel, configure the parameters that are described in the following table.

    Parameter
      Description

    Job Name
      The name of the job. Default value: data-sync-<Time>.

    PolarDB for MySQL Cluster
      The ID of the source PolarDB for MySQL cluster.

    Database Account Name
      The database account that federated analytics automatically creates on the PolarDB for MySQL cluster to synchronize data. The account name starts with sync. Do not delete the account or modify its name.

    AnalyticDB for MySQL Data Lakehouse Edition Cluster
      The ID of the destination AnalyticDB for MySQL Data Lakehouse Edition (V3.0) cluster. You can select an existing cluster, or click the Click to create an AnalyticDB for MySQL cluster link to create one.

    Advanced Settings
      By default, advanced settings are disabled and all databases and tables in the source cluster are synchronized. After you enable advanced settings, you can configure the Select the database or table to synchronize and Large Table Partition Key Settings parameters.

    Select the database or table to synchronize
      The databases and tables that you want to synchronize. By default, all databases and tables are synchronized.

      Important
      • Tables that do not have primary keys cannot be synchronized. These tables are automatically filtered out.
      • Each AnalyticDB for MySQL cluster can contain up to 2,048 databases. For more information, see Limits.

    Large Table Partition Key Settings
      The partition keys of the tables. To improve write and query performance, we recommend that you specify partition keys for tables. For more information, see Schema design.

      The following partition formats are supported:
      • value: partitioned by value.
      • yyyyMMdd: partitioned by year, month, and day.
      • yyyyMM: partitioned by year and month.
      • yyyy: partitioned by year.

  5. Click OK. The job automatically starts.

    The created job is displayed on the Federated Analytics page. To manage the job, click View, Edit, Delete, Suspend, or Start in the Actions column.

    Important

    Deleted jobs cannot be recovered.

  6. To analyze data, click the destination cluster ID to go to the AnalyticDB for MySQL console. For more information, see SQL editor.
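Two details of the advanced settings in step 4 can be sketched in Python. The first function builds a standard MySQL information_schema query that lists base tables without a primary key, which are the tables that the job filters out. You can run the query on the source cluster with any MySQL client before you create the job. The second function shows which label a date maps to under the yyyyMMdd, yyyyMM, and yyyy formats. The function names are hypothetical, the list of excluded system schemas is an assumption, and the label mapping only illustrates the format names. It does not describe how AnalyticDB for MySQL computes partitions internally.

```python
from datetime import date

# System schemas skipped by the query (assumption: adjust as needed).
SYSTEM_SCHEMAS = ("mysql", "information_schema", "performance_schema", "sys")


def tables_without_primary_key_sql() -> str:
    """Build a MySQL query that lists base tables with no primary key."""
    excluded = ", ".join(f"'{name}'" for name in SYSTEM_SCHEMAS)
    return (
        "SELECT t.table_schema, t.table_name\n"
        "FROM information_schema.tables AS t\n"
        "LEFT JOIN information_schema.table_constraints AS c\n"
        "  ON c.table_schema = t.table_schema\n"
        " AND c.table_name = t.table_name\n"
        " AND c.constraint_type = 'PRIMARY KEY'\n"
        "WHERE t.table_type = 'BASE TABLE'\n"
        "  AND c.constraint_name IS NULL\n"
        f"  AND t.table_schema NOT IN ({excluded})\n"
        "ORDER BY t.table_schema, t.table_name;"
    )


# strftime patterns for the three date-based formats named in step 4.
_DATE_FORMATS = {"yyyyMMdd": "%Y%m%d", "yyyyMM": "%Y%m", "yyyy": "%Y"}


def partition_label(value, partition_format: str) -> str:
    """Illustrate the partition label that a column value falls under."""
    if partition_format == "value":
        # Partitioned by value: the value itself is the label.
        return str(value)
    if partition_format not in _DATE_FORMATS:
        raise ValueError(f"unsupported partition format: {partition_format}")
    if not isinstance(value, date):
        raise TypeError("date-based formats need a date or datetime value")
    return value.strftime(_DATE_FORMATS[partition_format])


if __name__ == "__main__":
    print(tables_without_primary_key_sql())
    print(partition_label(date(2024, 1, 8), "yyyyMM"))
```

Coarser formats such as yyyy put more rows under each label, while yyyyMMdd puts fewer, which is the trade-off to weigh when you pick a format for a large table.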