Use federated analytics to synchronize data from PolarDB for MySQL

Important

The PolarDB for MySQL Federated Analytics feature was upgraded on July 23, 2024. Since the upgrade, the Federated Analytics entry point is no longer available and you can no longer create synchronization jobs from it. To create synchronization jobs, go to Data Integration in the PolarDB console. For more information, see Use zero-ETL to synchronize data. If you already have a Federated Analytics link, the entry point remains available, for link management only, in the China (Beijing), China (Hangzhou), China (Shanghai), China (Shenzhen), and US (Virginia) regions.

Federated Analytics uses AnalyticDB Pipeline Service (APS) to synchronize data in real time from PolarDB for MySQL to AnalyticDB for MySQL. Supported destination cluster editions: Enterprise Edition, Basic Edition, and Data Lakehouse Edition.

To ask questions about this feature, search for DingTalk group 33600023146.

Prerequisites

Before you begin, make sure that you have:

A PolarDB for MySQL cluster and an AnalyticDB for MySQL cluster (Enterprise Edition, Basic Edition, or Data Lakehouse Edition) in the same region. For setup instructions, see Create an instance and Create a cluster.
Binary logging enabled for the PolarDB for MySQL cluster. See Enable binary logging.
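The binary logging prerequisite can be checked from any MySQL client before you create a job. The following is a minimal sketch, assuming the cluster reports the standard MySQL `log_bin` variable and that your client returns `SHOW VARIABLES` rows as (name, value) pairs. The helper name `binlog_enabled` is hypothetical and not part of any Alibaba Cloud SDK.

```python
# Statement to run on the PolarDB for MySQL cluster (standard MySQL syntax).
CHECK_BINLOG_SQL = "SHOW VARIABLES LIKE 'log_bin'"


def binlog_enabled(rows):
    """Return True if the SHOW VARIABLES result reports log_bin as enabled.

    `rows` is a list of (variable_name, value) tuples, the shape most
    MySQL client libraries return for SHOW VARIABLES (an assumption here).
    """
    for name, value in rows:
        if name.lower() == "log_bin":
            return str(value).upper() in ("ON", "1")
    return False  # variable not reported: treat as disabled


# Hypothetical result rows, as a client might return them.
print(binlog_enabled([("log_bin", "ON")]))
print(binlog_enabled([("log_bin", "OFF")]))
```

If the check returns False, follow Enable binary logging before continuing.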

Limitations

Supported regions

China (Beijing), China (Hangzhou), China (Shanghai), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), US (Silicon Valley), US (Virginia), Germany (Frankfurt), and UK (London).

Job quotas

Up to 3 synchronization jobs per PolarDB for MySQL cluster.
Up to 30 synchronization jobs per region.
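The two quotas apply together: a new job must fit under both the per-cluster and the per-region limit. As a rough sketch of that check, the function below takes the source cluster IDs of the jobs that already exist in a region. The function name and the cluster IDs are hypothetical; only the limits of 3 and 30 come from this page.

```python
MAX_JOBS_PER_CLUSTER = 3   # per PolarDB for MySQL cluster
MAX_JOBS_PER_REGION = 30   # per region


def can_create_job(existing_job_clusters, cluster_id):
    """Return True if one more job for `cluster_id` stays within both quotas.

    `existing_job_clusters` holds one source cluster ID per existing job
    in the region.
    """
    jobs_in_region = len(existing_job_clusters)
    jobs_on_cluster = sum(1 for c in existing_job_clusters if c == cluster_id)
    return (jobs_in_region < MAX_JOBS_PER_REGION
            and jobs_on_cluster < MAX_JOBS_PER_CLUSTER)


# Hypothetical cluster IDs: pc-a already has 3 jobs, pc-b has 1.
jobs = ["pc-a", "pc-a", "pc-a", "pc-b"]
print(can_create_job(jobs, "pc-a"))  # pc-a is at its per-cluster limit
print(can_create_job(jobs, "pc-b"))
```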

Table restrictions

Tables without a primary key cannot be synchronized and are automatically excluded.
The destination AnalyticDB for MySQL cluster supports up to 2,048 databases. See Limits.

Tip: To exclude specific databases or tables from synchronization, enable Advanced Settings when creating the job and configure Select the database or table to synchronize.
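To see in advance which tables would be excluded for lacking a primary key, you can query `information_schema` on the source cluster. This is a sketch using standard MySQL metadata tables, not a tool provided by Federated Analytics. The helper `qualified_names` and the sample rows are hypothetical.

```python
# Standard MySQL information_schema query listing base tables that have
# no PRIMARY KEY constraint. System schemas are skipped.
TABLES_WITHOUT_PK_SQL = """
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
  ON c.table_schema = t.table_schema
 AND c.table_name = t.table_name
 AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'sys')
  AND c.constraint_name IS NULL
ORDER BY t.table_schema, t.table_name
"""


def qualified_names(rows):
    """Format (schema, table) result rows as 'schema.table' strings."""
    return [f"{schema}.{table}" for schema, table in rows]


# Hypothetical result rows, as a MySQL client might return them.
print(qualified_names([("shop", "audit_log"), ("shop", "tmp_import")]))
```

Any table the query returns needs a primary key added on the source before it can be synchronized.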

Create a synchronization job

Log on to the PolarDB console.
In the upper-left corner, select a region.
In the left navigation pane, click Federated Analytics.

Click Create Job and configure the following parameters:

Basic settings

Parameter	Description
Job name	Name of the synchronization job. Default: `data-sync-<Time>`.
PolarDB for MySQL cluster	ID of the source PolarDB for MySQL cluster.
Source database account	Database account auto-created by Federated Analytics. The account name starts with `sync`. Do not delete or rename this account.
AnalyticDB for MySQL cluster	ID of the destination cluster. Select an existing cluster or click Click To Create An AnalyticDB For MySQL Cluster to create one.

Advanced settings (optional)

Enable Advanced Settings to customize the synchronization scope and partition keys. By default, the entire source cluster is synchronized.

Parameter	Description
Select the database or table to synchronize	Databases and tables to include. By default, all databases and tables are synchronized.
Large table partition key settings	Partition keys to improve write and query performance. See Set partition keys. Supported formats: `value` (partitioned by value), `yyyyMMdd` (partitioned by year, month, and day), `yyyyMM` (partitioned by year and month), `yyyy` (partitioned by year).
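To get a feel for how coarse each partition key format is, the sketch below computes the partition value that a given timestamp would fall into. It assumes the date-based formats group rows by the formatted date of the partition column. The mapping to `strftime` patterns and the function name are illustrative only and do not describe how the service is implemented.

```python
from datetime import datetime

# Illustrative mapping from the supported formats to strftime patterns.
FORMAT_TO_STRFTIME = {
    "yyyyMMdd": "%Y%m%d",  # one partition per day
    "yyyyMM": "%Y%m",      # one partition per month
    "yyyy": "%Y",          # one partition per year
}


def partition_value(column_value, fmt):
    """Return the partition value for `column_value` under format `fmt`."""
    if fmt == "value":
        # Partitioned directly by the column value.
        return str(column_value)
    return column_value.strftime(FORMAT_TO_STRFTIME[fmt])


ts = datetime(2024, 7, 23, 15, 30)
for fmt in ("yyyyMMdd", "yyyyMM", "yyyy"):
    print(fmt, "->", partition_value(ts, fmt))
# yyyyMMdd -> 20240723
# yyyyMM -> 202407
# yyyy -> 2024
```

A finer format such as `yyyyMMdd` produces many small partitions, while `yyyy` produces a few large ones, so choose the format that matches how the table is usually filtered.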

Click OK. The job starts automatically.

Important

Deleted jobs cannot be recovered.

Manage synchronization jobs

After creation, the job appears on the Federated Analytics page. Use the Actions column to View, Edit, Delete, Suspend, or Start a job.

To query synchronized data, click the destination cluster ID to open the AnalyticDB for MySQL console and use the SQL editor.

What's next

Use zero-ETL to synchronize data: the recommended replacement for Federated Analytics when creating new synchronization jobs.