DataWorks: Offline synchronization of an entire MySQL database to MaxCompute

Last Updated: Nov 27, 2025

Data Integration supports offline synchronization of entire databases from sources such as AnalyticDB for MySQL 3.0, ClickHouse, Hologres, MySQL, and PolarDB to MaxCompute. This topic describes how to perform a one-time offline synchronization of an entire MySQL database to MaxCompute.

Prerequisites

Limits

Synchronizing source data to MaxCompute external tables is not supported.

Procedure

Step 1: Select a sync task type

  1. Go to the Data Integration page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Integration > Data Integration. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Integration.

  2. In the left navigation pane, click Synchronization Task. At the top of the page, click Create Synchronization Task. In the dialog box that opens, configure the following basic parameters.

    • Source Type: MySQL

    • Destination Type: MaxCompute

    • Task Name: Enter a name for the sync task.

    • Task Type: Full Database Offline.

    • Sync Procedure: Select Full initialization and Incremental synchronization.

Step 2: Configure the network and resources

  1. In the Network and Resource Configuration section, select a Resource Group for the sync task and specify the number of CUs for Task Resource Usage.

  2. Set Source to your MySQL data source and Destination to your MaxCompute data source, and then click Test Connectivity.

  3. After the connectivity tests for the source and destination data sources are successful, click Next.

Step 3: Select the databases and tables to synchronize

In the Source Table area, select the tables to sync from the source data source, and then click the move icon to add them to the Selected Tables list.

Step 4: Set destination table properties

Click the Configure button next to Partition Initialization Configuration to set the initial partition configuration for all new destination tables. This configuration overwrites the partition settings for these tables.

Step 5: Configure full and incremental synchronization control

  1. Configure the full and incremental sync type for the task.

    • If you select both Full initialization and Incremental synchronization for Synchronization Mode, the task defaults to a one-time full sync followed by recurring incremental syncs. This setting cannot be changed.

    • If you select only Full initialization for Synchronization Mode, you can configure the task as a one-time full sync or a recurring full sync.

    • If you select only Incremental synchronization for Synchronization Mode, you can configure the task as a one-time incremental sync or a recurring incremental sync.

      Note

      The following steps use a one-time full sync and recurring incremental sync task as an example.

  2. Configure recurring schedule parameters.

    If you want the task to run on a recurring schedule, click Configure Scheduling Parameters for Periodical Scheduling.
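The three cases in step 1 amount to a small lookup from the selected synchronization mode to the run frequencies that can be configured. The following is a minimal Python sketch of that rule as stated in this topic. The function name and return shape are hypothetical and are not part of any DataWorks API.

```python
def schedule_options(full: bool, incremental: bool) -> dict:
    """Return the run frequencies available for each selected phase.

    Models the rules described in this topic only; the name and the
    return format are illustrative assumptions.
    """
    if full and incremental:
        # Fixed combination: one-time full sync, then recurring incremental syncs.
        return {"full": ["one-time"], "incremental": ["recurring"]}
    if full:
        return {"full": ["one-time", "recurring"]}
    if incremental:
        return {"incremental": ["one-time", "recurring"]}
    raise ValueError("Select Full initialization, Incremental synchronization, or both")


if __name__ == "__main__":
    print(schedule_options(full=True, incremental=True))
    print(schedule_options(full=True, incremental=False))
```

When both phases are selected, each phase has exactly one option, which reflects that the setting cannot be changed.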

Step 6: Configure destination table mapping

After you select the tables to sync in the previous step, they are automatically displayed on this page. The destination tables have a status of 'mapping to be refreshed'. You must define the mapping between the source and destination tables, which specifies how data is read from the source tables and written to the destination tables. Then, click Refresh to proceed. You can refresh the mapping immediately or customize the destination table rules first.

Note
  • You can select the tables to sync and click Batch Refresh Mapping Results. If no mapping rule is configured, the default naming rule for tables is ${SourceDBName}_${TableName}. If a table with the same name does not exist in the destination, a new table is automatically created.

  • Because the task runs on a schedule, you must configure its scheduling properties, such as Scheduling Cycle, Time Properties, and Resource Group for Scheduling. This sync task uses the same scheduling configuration as a node in Data Studio. For more information, see Node Scheduling.

  • For the Condition for Incremental Synchronization, enter the content of a WHERE clause to filter the source data. Do not include the WHERE keyword. If periodic scheduling is enabled, you can use system parameter variables.

  • In the Customize Mapping Rules column, click Edit to customize the destination table naming rule.

    You can use built-in variables and manually entered strings to create the destination table name. You can also edit the built-in variables. For example, you can create a new table naming rule that adds a suffix to the source table name to form the destination table name.
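Two of the notes above can be illustrated with a short Python sketch: how a naming rule such as ${SourceDBName}_${TableName} expands into a destination table name, and how an incremental condition is written without the WHERE keyword. The sketch uses Python's string.Template only to mimic the substitution. It does not reproduce how Data Integration resolves variables. The database name shop, the table orders, the suffix _ods, the column gmt_modified, and the use of a ${bizdate} variable formatted as yyyymmdd are all assumptions made for illustration.

```python
import re
from datetime import date
from string import Template

# Default rule quoted in this topic.
DEFAULT_RULE = "${SourceDBName}_${TableName}"


def destination_table_name(source_db, table, rule=DEFAULT_RULE):
    """Expand a naming rule into a destination table name (illustrative only)."""
    return Template(rule).substitute(SourceDBName=source_db, TableName=table)


def render_incremental_condition(condition, bizdate):
    """Fill in a ${bizdate} placeholder in a filter condition.

    The condition must be the body of a WHERE clause, without the keyword.
    The ${bizdate} name and its yyyymmdd format are assumptions here.
    """
    if re.match(r"\s*where\b", condition, flags=re.IGNORECASE):
        raise ValueError("Enter only the condition, without the WHERE keyword")
    return Template(condition).substitute(bizdate=bizdate.strftime("%Y%m%d"))


if __name__ == "__main__":
    print(destination_table_name("shop", "orders"))
    # A custom rule that appends a suffix to the source table name.
    print(destination_table_name("shop", "orders", rule="${TableName}_ods"))
    # Hypothetical filter on a gmt_modified column.
    print(render_incremental_condition(
        "DATE_FORMAT(gmt_modified, '%Y%m%d') = '${bizdate}'", date(2025, 11, 26)))
```

With the default rule, the hypothetical table orders in database shop maps to shop_orders. With the suffix rule, it maps to orders_ods.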

1. Edit mapping of field data types

A sync task maps source field types to destination field types by default. To customize this mapping, click Edit Mapping of Field Data Types in the upper-right corner of the table. After you configure the mapping, click Apply and Refresh Mapping.
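Conceptually, a custom field type mapping is a set of overrides applied on top of a baseline mapping. The following Python sketch shows that idea. The baseline table below is a hypothetical example and is not the default mapping that DataWorks applies. The column names and the override are also made up for illustration.

```python
# Hypothetical baseline mapping from MySQL types to MaxCompute types.
# This is NOT the DataWorks default; it exists only to make the sketch runnable.
BASE_TYPE_MAP = {
    "INT": "BIGINT",
    "BIGINT": "BIGINT",
    "VARCHAR": "STRING",
    "TEXT": "STRING",
    "DATETIME": "DATETIME",
    "DECIMAL": "DECIMAL",
}


def map_column_types(columns, overrides=None):
    """Map each source column type to a destination type.

    `columns` maps column name to MySQL type, for example "varchar(255)".
    `overrides` maps a MySQL base type to the destination type to use instead.
    Length and precision are ignored in this simplified sketch.
    """
    type_map = dict(BASE_TYPE_MAP)
    for source_type, dest_type in (overrides or {}).items():
        type_map[source_type.upper()] = dest_type

    mapped = {}
    for name, source_type in columns.items():
        base_type = source_type.split("(")[0].strip().upper()
        if base_type not in type_map:
            raise KeyError(f"No mapping configured for source type {source_type!r}")
        mapped[name] = type_map[base_type]
    return mapped


if __name__ == "__main__":
    source_columns = {"id": "bigint", "title": "varchar(255)", "created_at": "datetime"}
    # Override: write DATETIME source fields as STRING in the destination.
    print(map_column_types(source_columns, overrides={"datetime": "STRING"}))
```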

2. Edit the destination table schema and assign field values

If a destination table has a status of To Be Created, you can add fields to its schema. Follow these steps:

  1. Add fields to the destination table.

    • To add a field to a single table, click the add-field button in the Target Table Name column.

    • To add fields in batches, select all tables to sync. At the bottom of the table, choose Batch Modify > Destination Table Schema - Batch Modify and Add Field.

  2. Assign values to the fields. You can use the following operations to assign values to the fields that you just added.

    • To assign values for a single table, click Configure in the Destination Table Field Assignment column.

    • To assign values in batches, choose Batch Modify > Destination Table Field Assignment at the bottom of the list. This assigns values to identical fields across multiple destination tables.

    Note

    You can assign constants or variables. Click the switch icon to change the assignment mode.

3. Custom advanced parameters

For fine-grained control over the task, click Configure in the Custom Advanced Parameters column.

Important

Modify these parameters only if you fully understand what they do. Incorrect settings can cause unexpected errors or data quality issues.

4. Set source split column

From the source shard key drop-down list, select a field from the source table or select Not Split.
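A split column is typically used to divide a table into ranges that can be read in parallel. The following Python sketch shows generic range splitting over an integer column. It is an illustration of the general technique, not the algorithm that Data Integration uses. The column name id, the value range, and the number of splits are assumptions.

```python
def split_ranges(low, high, parts):
    """Divide the inclusive integer range [low, high] into near-equal ranges."""
    if parts < 1 or high < low:
        raise ValueError("parts must be >= 1 and high must be >= low")
    total = high - low + 1
    parts = min(parts, total)          # never create empty ranges
    base, extra = divmod(total, parts)

    ranges = []
    start = low
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first ranges
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def split_predicates(column, low, high, parts):
    """Build one filter predicate per range for the given split column."""
    return [f"{column} BETWEEN {a} AND {b}" for a, b in split_ranges(low, high, parts)]


if __name__ == "__main__":
    # Hypothetical table whose id column spans 1 to 10, read with 3 parallel splits.
    for predicate in split_predicates("id", 1, 10, 3):
        print(predicate)
```

Selecting Not Split corresponds to reading the table as a single range, which is usually the safer choice when no evenly distributed column such as an integer primary key is available.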

Step 7: Configure advanced parameters

The sync task provides several parameters that you can modify as needed. For example, you can limit the maximum number of connections to prevent the sync task from exerting too much pressure on your production database.

Note

Modify these parameters only if you fully understand what they do. Incorrect settings can cause unexpected errors or data quality issues.

  1. In the upper-right corner of the page, click Configure Advanced Parameters to go to the advanced parameter configuration page.

  2. On the Configure Advanced Parameters page, modify the parameter values.

Step 8: Configure the resource group

In the upper-right corner of the page, click Resource Group Configuration to view or switch the resource group for the current task.

Step 9: Run the sync task

  1. After you finish the configuration, click Complete at the bottom of the page.

  2. On the Data Integration > Synchronization Task page, find the created sync task and click Start in the Actions column.

  3. In the Tasks list, click the Name/ID of the task to view the execution details.

Step 10: Configure alert rule

After the task runs, a scheduled job is generated in the Operation Center. To prevent task errors from causing data sync latency, you can set an alarm policy for the sync task.

  1. In the Tasks list, find the running sync task. In the Actions column, choose More > Edit to open the task editing page.

  2. Click Next. Then, click Configure Alert Rule in the upper-right corner of the page to open the alarm settings page.

  3. In the Scheduling Information column, click the scheduled job to open the task details page in the Operation Center and retrieve the Task ID.

  4. In the navigation pane on the left of the Operation Center, choose Node Alarm > Alarm > Rule Management to go to the Rule Management page.

  5. Click Create Custom Rule and set Rule Object, Trigger Condition, and Alert Details. For more information, see Rule management.

    In the Rule Object field, search for the target task using the obtained Task ID and set an alert.

Sync task O&M

View task status

After you create a sync task, you can view the list of created tasks and their basic information on the Synchronization Task page.

  • In the Actions column, you can Start or Stop a sync task. Click More to perform other operations, such as Edit and View.

  • You can view the status of a running task in the Execution Overview section. You can also click an area in the overview to view execution details.

    For an offline sync task that synchronizes an entire MySQL database to MaxCompute:

    • If you select Full initialization for the synchronization step, the details for the schema migration and full synchronization are displayed.

    • If you select Incremental synchronization for the synchronization step, the details for schema migration and incremental synchronization are displayed.

    • If you select Full initialization and Incremental synchronization as the synchronization steps, details for schema migration, full synchronization, and incremental synchronization are displayed.

Rerun a task

  • Click Rerun to rerun the task without changing the task configuration.

    Effect: This operation reruns a one-time task or updates the properties of a recurring task.

  • To rerun a task after modifying it by adding or removing tables, edit the task and click Complete. The task status then changes to Apply Update. Click Apply Update to immediately trigger a rerun of the modified task.

    Effect: Only the new tables are synced. Tables that were previously synced are not synced again.

  • After you edit a task (for example, by changing a destination table name or switching to a different destination table) and click Complete, the available operation for the task changes to Apply Update. Click Apply Update to immediately trigger a rerun of the modified task.

    Effect: The modified tables are synced. Unmodified tables are not synced again.
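The effects described for Apply Update can be modeled as a comparison between the previous and the updated table mappings: a table is synced again only if it is new or its destination changed. The following Python sketch expresses that rule. It models the documented behavior only and is not the actual implementation. The table names are hypothetical.

```python
def tables_to_sync(previous, updated):
    """Return the source tables that an update would sync.

    `previous` and `updated` map source table name to destination table name.
    A table is included when it is newly added or its destination changed.
    Tables removed from the task and unchanged tables are not included.
    """
    return sorted(
        source for source, destination in updated.items()
        if previous.get(source) != destination
    )


if __name__ == "__main__":
    before = {"orders": "shop_orders", "users": "shop_users"}
    after = {
        "orders": "shop_orders",      # unchanged: not synced again
        "users": "shop_users_v2",     # destination renamed: synced
        "payments": "shop_payments",  # newly added: synced
    }
    print(tables_to_sync(before, after))
```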