All Products
Search
Document Center

DataWorks:Offline synchronization of the entire MySQL database to MaxCompute

Last Updated:Jun 27, 2025

Data Integration supports offline synchronization of entire databases from AnalyticDB for MySQL 3.0, ClickHouse, Hologres, MySQL, PolarDB, and other sources to MaxCompute. This topic uses MySQL to MaxCompute synchronization as an example to demonstrate how to synchronize an entire Hologres database to MaxCompute in a single operation.

Prerequisites

Limits

  • Real-time synchronization of all data in a MySQL database to MaxCompute does not support a MaxCompute data source with tenant-level schema syntax enabled.

  • Synchronization of source data to MaxCompute external tables is not supported.

Procedure

Step 1: Select the synchronization task type

  1. Go to the Data Integration page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Integration > Data Integration. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Integration.

  2. In the left-side navigation pane, click Synchronization Tasks, and then click Create Synchronization Task at the top of the page to go to the synchronization task creation page. Configure the following basic information:

    • Data Source And Destination: MySQLMaxCompute

    • New Task Name: Customize the synchronization task name.

    • Synchronization Type: Offline Database.

    • Synchronization Steps: Select Full Synchronization and Incremental Synchronization.

Step 2: Configure network and resources

  1. In the Network And Resource Configuration section, select the Resource Group for the synchronization task. You can allocate the number of CUs for Task Resource Usage.

  2. For Source Data Source, select the added MySQL data source. For Destination Data Source, select the added MaxCompute data source, and then click Test Connectivity.image

  3. After confirming that both the source and destination data sources are connected, click Next.

Step 3: Select the databases and tables to synchronize

In this step, you can select the tables from the source data source that you want to synchronize in the Source Database Tables section, and click the image icon to move them to the Selected Database Tables section on the right.

image

Step 4: Set destination table properties

You need to click the Configure button after Partition Initialization Settings to uniformly initialize the partitions for the new destination tables. The modifications made here will overwrite the partition settings for all new destination tables.

Step 5: Configure full and incremental control

  1. Configure the synchronization mode for the synchronization task.

    • If you selected both Synchronization Steps Full Synchronization and Incremental Synchronization, this section defaults to one-time full synchronization and periodic incremental synchronization, which cannot be changed.

    • If you selected Synchronization Steps Full Synchronization, you can choose whether the task type is one-time full synchronization or periodic full synchronization.

    • If you selected Synchronization Steps Incremental Synchronization, you can choose whether the task type is one-time incremental synchronization or periodic incremental synchronization.

      Note

      In this example, the synchronization mode of one-time synchronization of full data and periodic synchronization of incremental data is used.

  2. Configure the parameters for periodic scheduling.

    If your task involves periodic synchronization, you can click Scheduling Parameters to configure the settings.

Step 6: Map destination tables

After selecting the tables to synchronize in the previous step, the tables are automatically displayed on this page. However, the properties of the destination tables are in the Pending Refresh Mapping state by default. You need to define and confirm the mapping relationship between the source tables and destination tables, which determines how data is read and written. Then, click Refresh Mapping to proceed to the next step. You can directly refresh mappings between source tables and destination tables, or you can refresh mappings after configuring destination table settings.

Note
  • You can select the tables to be synchronized and click Batch Refresh Mapping. If no mapping rule is configured, the default table name rule is ${source_database_name}_${table_name}. If a table with the same name does not exist in the destination, a new table is automatically created.

  • Because periodic scheduling is required, you need to define the related properties for the periodic scheduling task, including Scheduling Cycle, Scheduling Time, Scheduling Resource Group, and more. The scheduling configuration for the current synchronization is the same as the scheduling configuration for nodes in Data Development. For more information about the parameters, see Node scheduling.

  • You need to set Incremental Condition to filter the source using a WHERE clause. You only need to write the WHERE clause here, without the WHERE keyword. If you configure the scheduling parameters for implementing periodic synchronization of incremental data, you can use the system parameter variables.

  • In the Custom Destination Table Name Mapping column, you can click the Edit button to customize the destination table name rule.

    You can concatenate built-in variables and specified strings into a destination table name. You can edit built-in variables. For example, when creating a mapping rule, you can add a suffix to a variable that indicates a source table name to form a destination table name.

1. Edit field type mapping

The synchronization task has default mappings between source field types and destination field types. You can click Edit Field Type Mapping in the upper-right corner of the table to customize the field type mapping relationship between source tables and destination tables. After configuration, click Apply And Refresh Mapping.

2. Edit destination table structure and add field assignments

When the destination table is in the To Be Created state, you can add fields to the destination table based on the original table structure. To do this, perform the following steps:

  1. Add fields to destination tables.

    • Add fields to a single table: Click the button in the Destination Table Nameimage.png column to add fields.

    • Add fields in batches: Select all tables to be synchronized, and at the bottom of the table, select Batch Modification > Destination Table Structure - Batch Modify And Add Fields.

  2. Assign values to the fields. You can perform one of the following operations to assign values to the fields:

    • Assign values to a single table: Click the Configure button in the Destination Table Field Assignment column to assign values to the destination table fields.

    • Assign values in batches: At the bottom of the list, select Batch Modification > Destination Table Field Assignment to assign values to the same fields in the destination tables in batches.

    Note

    When assigning values, you can assign constants and variables. You can switch the assignment mode by clicking the image icon.

3. Customize advanced parameters

If you need to make detailed configurations for the task to meet custom synchronization requirements, you can click Configure in the Custom Advanced Parameters column to modify the advanced parameters.

Important

Before modifying the configurations of advanced parameters, make sure that you understand the meanings of the parameters to prevent unexpected errors or data quality issues.

4. Set source split column

You can select a field from the source table in the source split column drop-down list or select No Split.

Step 7: Configure alerts

To prevent synchronization task failures from causing latency in business data synchronization, you can configure different alert rules for the synchronization task.

  1. Click Alert Configuration in the upper-right corner of the page to go to the alert settings page.

  2. Select the scheduling task for the synchronization table and configure alerts for it. For more information, see Alert information.

Step 8: Configure advanced parameters

You can change the values of specific parameters configured for the synchronization task based on your business requirements. For example, you can specify an appropriate value for the Maximum read connections parameter to prevent the current synchronization task from imposing excessive pressure on the source database and affecting data production.

Note

To prevent unexpected errors or data quality issues, we recommend that you understand the meanings of parameters before changing their values.

  1. Click Advanced Parameter Configuration in the upper-right corner of the interface to go to the advanced parameter configuration page.

  2. Modify the parameter values on the Advanced Parameter Configuration page.

Step 9: Configure resource groups

You can click Resource Group Configuration in the upper-right corner of the interface to view and switch the resource group used by the current task.

Step 10: Execute the synchronization task

  1. After completing all configurations, click Complete Configuration at the bottom of the page.

  2. On the Data Integration > Synchronization Tasks page, find the created synchronization task and click Start in the Operation column.

  3. Click the Name/ID of the corresponding task in the Task List to view the detailed execution process of the task.

Perform O&M operations on the synchronization task

View the status of the synchronization task

After the synchronization task is created, you can go to the Synchronization Task page to view all synchronization tasks created in the workspace and their basic information.

image

  • You can Start or Stop the synchronization task in the Operation column. In the More menu, you can Edit, View, and perform other operations on the synchronization task.

  • For started tasks, you can see the basic status of the task in Execution Overview. You can also click the corresponding overview area to view the execution details.

    image

    In the offline synchronization task from MySQL to MaxCompute:

    • If your task synchronization step is Full Synchronization, this section shows structure migration and full synchronization.

    • If your task synchronization step is Incremental Synchronization, this section shows structure migration and incremental synchronization.

    • If your task synchronization step is Full Synchronization + Incremental Synchronization, this section shows structure migration, full synchronization, and incremental synchronization.

Task rerun

  • Direct rerun: Without modifying the task configuration, directly click the Rerun operation.

    Effect: Reruns one-time tasks and updates periodic task properties.

  • Modify and rerun (add or remove tables): Edit the task, add or remove tables, and click Complete. At this point, the task operation will change to Apply Update. Clicking Apply Update will directly trigger the modified task to rerun.

    Effect: Only newly added tables will be synchronized. Previously synchronized tables will not be synchronized again.

  • Modify and rerun (modify table name information): Edit the task, modify the destination table name or switch the destination table for synchronization, and click Complete. At this point, the task operation will change to Apply Update. Clicking Apply Update will directly trigger the modified task to rerun.

    Effect: Synchronizes the modified tables. Unmodified tables will not be synchronized again.