This topic describes how to create a synchronization node to export data from MaxCompute
to a MySQL data source.
Background information
You can use Data Integration to periodically synchronize the business data generated
in a business system to a DataWorks workspace. You can create SQL nodes to compute
the data and use Data Integration to periodically synchronize the computing results
to your specified data source for further display or use. 
Data Integration can import data from and export data to various data sources, such
as ApsaraDB RDS, MySQL, SQL Server, PostgreSQL, MaxCompute, ApsaraDB for Memcache,
Distribute Relational Database Service (DRDS), Object Storage Service (OSS), Oracle,
FTP, Dameng, Hadoop Distributed File System (HDFS), and MongoDB. For more information about the data sources, see Supported data sources, readers, and writers.
Add a data source
Note Only the workspace administrator can add data sources. Members of other roles can
only view data sources.
- Go to the Data Source page.
- Log on to the DataWorks console.
- In the left-side navigation pane, click Workspaces.
- After you select the region where the required workspace resides, find the workspace
and click Data Integration in the Actions column.
- In the left-side navigation pane of the Data Integration page, choose to go to the Data Source page.
- On the Data Source page, click Add data source in the upper-right corner.
- In the Add data source dialog box, click MySQL in the Relational Database section.
- In the Add MySQL data source dialog box, set the parameters as required.
In this example, set the Data source type parameter to
Alibaba Cloud instance mode. Then, set other parameters as described in the following table.

Parameter |
Description |
Data Source Type |
The mode in which the data source is added. Set this parameter to Alibaba Cloud Instance Mode.
|
Data Source Name |
The name of the data source. The name can contain letters, digits, and underscores
(_) and must start with a letter.
|
Data Source Description |
The description of the data source. The description can be a maximum of 80 characters
in length.
|
Environment |
The environment in which the data source is used. Valid values: Development and Production.
Note This parameter is displayed only when the workspace is in standard mode.
|
Region |
The region where the data source resides. |
RDS Instance ID |
The ID of the ApsaraDB RDS for MySQL instance. You can log on to the ApsaraDB RDS console to obtain the ID.
|
RDS Instance Account ID |
The ID of the Alibaba Cloud account that is used to purchase the ApsaraDB RDS for
MySQL instance. You can use the Alibaba Cloud account to log on to the DataWorks console, move the pointer over the profile picture in the upper-right corner, and then view
the ID of the account.
|
Default Database Name |
The name of the default ApsaraDB RDS for MySQL database. The following descriptions
provide instructions for you to configure a data synchronization node or solution
that uses a MySQL data source:
- When you configure a database-level real-time or batch data synchronization node or
solution that uses a MySQL data source, you can select one or more databases on which
you have access permissions in the ApsaraDB for MySQL instance.
- If you select multiple databases when you configure a batch synchronization node,
you must add a data source for each database.
|
Username |
The username used to log on to the ApsaraDB RDS for MySQL database. |
Password |
The password that is used to log on to the ApsaraDB RDS for MySQL database. Do not
use at signs (@) in the password.
|
- Test the connectivity between the data source and resource groups.
Click the
Data Integration tab next to the Resource Group connectivity parameter, find the resource group for
Data Integration that you want to use, and then click Test connectivity in the Actions
column. Click the
Schedule tab and perform the same operations to test the connectivity between the data source
and the resource group for scheduling that you want to use. If the connectivity status
is Connected, the resource groups are connected to the data source.
Note
- A synchronization node uses only one resource group of a specific type.
- To ensure that your synchronization nodes can be run as expected, you must test the
connectivity between the data source and all types of resource groups on which your
synchronization nodes will be run.
- For more information, see Select a network connectivity solution.

- After the data source passes the connectivity test, click Complete.
Create and configure a synchronization node
This section describes how to create and configure a synchronization node named write_result and use the node to synchronize data in the result_table table to your MySQL data
source. Perform the following operations:
- Go to the DataStudio page and create a batch synchronization node named write_result.
- On the DataStudio page, configure the insert_data node as the ancestor node of the write_result node.
- On the configuration page of the batch synchronization node, select ODPS from the drop-down list, select the odps_first database from the next drop-down list, and then select result_table from the Table drop-down list. result_table is the source table from which the data is synchronized.
- Select the odps_result table that you create in the ApsaraDB RDS for MySQL database as the destination table.
- In the Mappings section, configure field mappings. Make sure that fields in the source
table on the left have a one-to-one mapping with fields in the destination table on
the right.
- In the Channel section, configure the maximum transmission rate and dirty data check rules.
After you complete the preceding steps, you can configure channel control policies
for the synchronization node.

Parameter |
Description |
Expected Maximum Concurrency |
The maximum number of parallel threads that the synchronization node uses to read
data from the source or write data to the destination. You can configure the parallelism
for the synchronization node on the codeless UI.
|
Bandwidth Throttling |
Specifies whether to enable bandwidth throttling. You can enable bandwidth throttling
and specify a maximum transmission rate to prevent heavy read workloads on the source.
We recommend that you enable bandwidth throttling and set the maximum transmission
rate to an appropriate value based on the configurations of the source.
|
Dirty Data Records Allowed |
The maximum number of dirty data records allowed. |
Distributed Execution |
The distributed execution mode. In distributed mode, your node can be sliced to multiple
ECS instances for parallel execution. This speeds up synchronization. If a large number
of synchronization nodes are run in parallel, excessive access requests are sent to
the data source. Evaluate the load on the data source before you use this mode. You
can use this mode only when you use exclusive resource groups for Data Integration.
|
- Preview and save the configuration.
After the node is configured, you can scroll up and down to view the node configuration.
After you confirm that the configuration is correct, click the

icon in the top toolbar.
Commit the synchronization node
Return to the workflow after you save the synchronization node. Click the
icon in the top toolbar to commit the synchronization node to the scheduling system.
The scheduling system automatically runs the node at the scheduled time from the next
day based on your settings.
What to do next
Now you have learned how to create a synchronization node to export data to a specific
data store. You can proceed with the next tutorial. In the next tutorial, you will
learn how to configure properties and dependencies for a synchronization node. For
more information, see Configure recurrence and dependencies for a node.