After you configure data sources, network environments, and resource groups, you can
create and run synchronization nodes. This topic describes how to configure a synchronization
solution to synchronize data to MaxCompute in real time and view the status of the
nodes generated by the sync solution.
Prerequisites
Before you create synchronization nodes, make sure that the following operations are
performed:
Billing
The One-click real-time synchronization to MaxCompute solution periodically merges
full and incremental data, which consumes MaxCompute computing resources. The fees
for these resources are included in your MaxCompute bill and are positively correlated
with the size of the full data and the merging cycle. For more information, see Billing method.
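To illustrate what merging full and incremental data involves, the following is a minimal Python sketch. It is not the MaxCompute implementation: the function name, the row layout, and the operation labels (insert, update, delete) are assumptions made for illustration. It applies an ordered change log to a full snapshot, matching rows by primary key.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]
# A change is (operation, row); the operation labels are hypothetical.
Change = Tuple[str, Row]


def merge_full_and_incremental(
    full: Iterable[Row], changes: Iterable[Change], key: str = "id"
) -> List[Row]:
    """Apply an ordered change log to a full snapshot, keyed by primary key."""
    # Index the full snapshot by primary key; every existing row is read once.
    merged = {row[key]: dict(row) for row in full}
    for op, row in changes:
        if op == "delete":
            merged.pop(row[key], None)
        elif op in ("insert", "update"):
            merged[row[key]] = dict(row)
        else:
            raise ValueError(f"unknown operation: {op}")
    # Sort by key so that the result is deterministic.
    return [merged[k] for k in sorted(merged)]


full_snapshot = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
change_log = [
    ("update", {"id": 2, "name": "b2"}),
    ("insert", {"id": 3, "name": "c"}),
    ("delete", {"id": 1}),
]
merged_rows = merge_full_and_incremental(full_snapshot, change_log)
```

In this sketch, each merge pass reads the whole snapshot, which is one way to see why the cost of merging grows with the size of the full data. It also shows why a primary key is needed: without one, a change cannot be matched to the row that it modifies.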
Create a synchronization solution
- Go to the Create Data Synchronization Solution wizard. Select the source and the destination
for data synchronization from the drop-down lists. In this scenario, select MaxCompute
as the destination. After that, select One-click real-time synchronization to MaxCompute from the available synchronization solutions.
- In the Set Synchronization Sources and Rules step, configure basic information such
as the solution name for the data synchronization solution.
In the Basic Configuration section, set the parameters.

Parameter | Description |
Solution Name | The name of the sync solution. The name can be up to 50 characters in length. |
Description | The description of the sync solution. The description can be up to 50 characters in length. |
Location | If you select Automatic Workflow Creation, DataWorks automatically creates a workflow named in the format of clone_database_Source name+to+Destination name. All sync nodes generated by the sync solution are placed in the Data Integration folder of this workflow.
If you clear Automatic Workflow Creation, select a directory from the Select Location drop-down list. All sync nodes generated by the sync solution are placed in the specified directory. |
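The limits and the naming format in the Basic Configuration parameters can be checked before you open the wizard. The following Python sketch is illustrative only: both function names are hypothetical, and the workflow name is built from a literal reading of the documented format, treating `+to+` as literal text. The name that DataWorks actually generates may differ.

```python
from typing import List

MAX_LEN = 50  # documented limit for both Solution Name and Description


def validate_basic_config(solution_name: str, description: str = "") -> List[str]:
    """Return a list of problems; an empty list means the values fit the limits."""
    problems = []
    if len(solution_name) > MAX_LEN:
        problems.append(f"solution name exceeds {MAX_LEN} characters")
    if len(description) > MAX_LEN:
        problems.append(f"description exceeds {MAX_LEN} characters")
    return problems


def automatic_workflow_name(source_name: str, destination_name: str) -> str:
    # Assumption: literal reading of "clone_database_Source name+to+Destination name".
    return f"clone_database_{source_name}+to+{destination_name}"


issues = validate_basic_config("mysql_to_maxcompute", "Nightly order tables")
workflow = automatic_workflow_name("mysql_src", "odps_dst")
```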
- Select a data source as the source and configure synchronization rules.
- In the Data Source section, specify the Type and Data source parameters.
Note You can set the Type parameter only to MySQL, Oracle, or PolarDB.
- In the Source Table section, select the tables whose data you want to synchronize from the Source Table list. Then, click the icon to add the tables to the Selected Tables list.
The Source Table section displays all the tables in the source. You can select all or specific tables.
Notice If a selected table does not have a primary key, the table cannot be synchronized
in real time.
- In the Set Mapping Rules for Table/Database Names section, configure a rule as needed.
Supported options are Conversion Rule for Table Name and Rule for Destination Table name.
- Conversion Rule for Table Name: the rule that is used to convert the names of source tables to those of destination tables.
- Rule for Destination Table name: the rule that is used to add a prefix or a suffix to the converted names of destination tables.
- Click Next Step.
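The two mapping rules in the preceding step are applied in sequence: the source table name is converted first, and the prefix or suffix is then added to the converted name. The following Python sketch shows that order. It assumes that the conversion rule behaves like a regular-expression substitution, which this topic does not confirm, and the function and parameter names are hypothetical.

```python
import re
from typing import Optional


def destination_table_name(
    source_table: str,
    pattern: Optional[str] = None,
    replacement: str = "",
    prefix: str = "",
    suffix: str = "",
) -> str:
    """Derive a destination table name from a source table name."""
    converted = source_table
    if pattern is not None:
        # Step 1: conversion rule (assumed here to be a regex substitution).
        converted = re.sub(pattern, replacement, source_table)
    # Step 2: prefix and suffix are added to the converted name.
    return f"{prefix}{converted}{suffix}"


# Strip a trailing year shard, then add a layer prefix and a suffix.
name = destination_table_name(
    "order_2021", pattern=r"_\d{4}$", replacement="", prefix="ods_", suffix="_delta"
)
```

With these inputs, `name` is `ods_order_delta`. If no rule is configured, the function returns the source table name unchanged.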
- Select a data source as the destination and configure the formats for the destination
tables.
- In the Set Destination Table step, set the Destination and Write Mode parameters.
- Click the icon next to Time automatic partition setting. In the Edit dialog box, modify the partition settings for the destination tables. You can configure daily partitions. You can write data to a partitioned table or a non-partitioned table in MaxCompute.
- Optional. Set the Batch Sync for Special Tables parameter to specify whether to create a full batch synchronization node for tables
without primary keys.
- Click Refresh source table and MaxCompute Table mapping to create the mappings between the source tables and destination MaxCompute tables.
- View the mapping progress, source tables, and mapped destination tables.

No. | Description |
1 | The progress of mapping the source tables to the destination tables.
Note The mapping may take a long time if you synchronize data from a large number of tables. |
2 | The method used to obtain the destination table, shown in the Table creation method column. Valid values: Create Table and Use Existing Table. |
3 | The name of the destination table. The table name that appears depends on the value that you select from the drop-down list in the Table creation method column.
- If you set the Table creation method parameter to Create Table, the name of the automatically created destination table appears. You can click the table name to view and modify the table creation statements.
- If you set the Table creation method parameter to Use Existing Table, you must select a table name from the drop-down list in the MaxComputeBase Table name column.
Note If a source table does not have a primary key, you can click the edit icon next to No primary key in the Synchronized Primary Key column and specify a primary key for the source table so that full and incremental data can be synchronized from it. |
4 | You can click Edit additional fields to add fields to the destination table beyond those in the source table.
Note If you set the Table creation method parameter to Create Table and specify additional fields, the corresponding columns are automatically added to the destination table. If you set the Table creation method parameter to Use Existing Table, make sure that the corresponding columns already exist in the destination table so that the fields can be written to them. DataWorks does not modify the schema of an existing table to add columns. |
- Click Next Step.
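The primary key rules in the preceding steps can be summarized in a short decision function. This Python sketch is an illustration, not DataWorks code: the `SourceTable` type, the `plan_sync` function, and the returned labels are hypothetical. It encodes three statements from this topic: a table with a primary key (declared in the source or specified in the Synchronized Primary Key column) supports full and incremental synchronization, a table without one cannot be synchronized in real time, and Batch Sync for Special Tables controls whether such a table gets a full batch synchronization node.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceTable:
    name: str
    # Primary key columns declared in the source database.
    primary_key: List[str] = field(default_factory=list)
    # Key columns specified manually in the Synchronized Primary Key column.
    specified_key: List[str] = field(default_factory=list)


def plan_sync(table: SourceTable, batch_sync_for_special_tables: bool) -> str:
    """Return a hypothetical label for how the table would be handled."""
    if table.primary_key or table.specified_key:
        return "full+incremental"
    if batch_sync_for_special_tables:
        return "full-batch-only"
    return "no-real-time-sync"


tables = [
    SourceTable("orders", primary_key=["order_id"]),
    SourceTable("audit_log"),
    SourceTable("events", specified_key=["event_id"]),
]
plan = {t.name: plan_sync(t, batch_sync_for_special_tables=True) for t in tables}
```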
- Configure the resources required by the synchronization solution.
In the Set Resources for Solution Running step, set the parameters that are described in the following table.

Parameter | Description |
Synchronization engine | The engine used for data synchronization. Default value: Default embedded engine. |
Select an exclusive resource group for real-time tasks | The exclusive resource group used to run the real-time synchronization node generated by the sync solution. Select an exclusive resource group from the drop-down list. |
Real-time synchronization task name | The name of the real-time synchronization node. |
Select scheduling Resource Group | The exclusive resource groups used to run the real-time synchronization node and batch synchronization node generated by the synchronization solution. Only exclusive resource groups for Data Integration can be used to run synchronization nodes for synchronization solutions. For more information, see Create and use an exclusive resource group for Data Integration. |
Resource Groups for Full Batch Sync Nodes |
Maximum number of connections supported by source read | The maximum number of Java Database Connectivity (JDBC) connections that are allowed for the source. Specify an appropriate number based on the resources of the source. |
Offline task name rules | The name of the batch synchronization node that is used to synchronize the full data of the source. After a synchronization solution is configured, DataWorks first runs a batch synchronization node to synchronize full data, and then runs a real-time synchronization node to synchronize incremental data. |
- Click Complete Configuration. The synchronization solution is configured.
Run the synchronization solution
On the Solution task list tab of the Tasks page, find the created synchronization solution and click Submit and Run in the Operation column to run the synchronization solution.
View the status and result of the sync nodes
- On the Tasks page, find the solution that is run and click Execution details in the Operation column. Then, you can view the execution details of all nodes.
- Find a node whose execution details you want to view and click Execution details in the Status column. In the message that appears, click the provided link to go
to the DataStudio page.
Manage the sync solution
- View or edit the sync solution.
On the Tasks page, find the solution that you want to view or edit, and choose More > View Configuration in the Operation column.
Note You can click Modify Configuration to modify the sync solution only if the solution is in the Not Running state. If the sync solution is in any other state, clicking Modify Configuration in the Operation column only displays the information about that sync solution.
- Delete the sync solution.
Find the solution that you want to delete and choose More > Delete in the Operation column. In the Delete message, click OK.
Note After you click OK, only the configuration record of the sync solution is deleted. The generated sync nodes and tables are not affected.