The data integration function lets you periodically import business data generated in your system into the workspace, and periodically export the computing results to a data source that you specify for further display or operation.
Currently, the data integration function can import data into the workspace from, and export data from the workspace to, the following data sources: RDS, MySQL, SQL Server, PostgreSQL, MaxCompute, ApsaraDB for Memcache, DRDS, OSS, Oracle, FTP, DM, HDFS, MongoDB, and others. For more information, see Supported data source types.
This section uses MySQL as an example to show how to export data in Data IDE to MySQL through the data integration function.
If your database is a self-built database on ECS, or an RDS or MongoDB data source, you must add the IP addresses of the data synchronization machines to your ECS security group or to the RDS/MongoDB whitelist. For more information, see Add whitelist and security group.
If you use a custom resource group to schedule RDS data synchronization tasks, you must add the machine IP address of the custom resource group to the RDS whitelist.
Only the project administrator can create a data source. Other roles can only view the data source.
Log on to the DataWorks console as an administrator and click Enter Project in the operations column of the relevant project in the Project List.
Click Data Integration from the upper menu, and click Data Sources in the left-side navigation pane.
Click New Source in the upper-right corner, as shown in the following figure.
Enter the configuration items in the create data source dialog box, as shown in the following figure.
Type: The network type of data sources.
Name: The name can contain letters, numbers, and underscores (_), but cannot begin with a number or an underscore (_), for example, abc_123.
Description: The description cannot exceed 80 characters.
JDBC URL: jdbc:mysql://host:port/database
User name/Password: The user name and password are used to connect to the database.
For configurations of different types of data sources, see the articles under Data Source Config.
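The rules for the configuration items above can be checked before you fill in the dialog box. The following is a minimal sketch, not part of DataWorks: the helper names are hypothetical, and it assumes that "letters" means ASCII letters and that the 80-character limit counts characters, not bytes.

```python
import re

# Hypothetical helpers for illustration only; DataWorks does not expose these.
# Assumption: "letters" in the naming rule means ASCII letters.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
MAX_DESCRIPTION_LENGTH = 80  # limit stated for the Description item


def is_valid_source_name(name: str) -> bool:
    """Letters, numbers, and underscores; must not start with a number or underscore."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(description: str) -> bool:
    """The description cannot exceed 80 characters."""
    return len(description) <= MAX_DESCRIPTION_LENGTH


def build_mysql_jdbc_url(host: str, port: int, database: str) -> str:
    """Fill in the jdbc:mysql://host:port/database template."""
    return f"jdbc:mysql://{host}:{port}/{database}"
```

For example, is_valid_source_name("abc_123") accepts the name used in the text, while "_abc" and "1abc" are rejected.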
Click Test Connectivity.
If the connectivity test is successful, click Complete.
Note: Make sure that the target MySQL database contains tables.
Create the table odps_result in the MySQL database. The statements used for table creation are as follows.
CREATE TABLE `odps_result` (
`education` varchar(255) NULL,
`num` int(10) NULL
);
After the table is created, you can run desc odps_result; to view the table details.
This section shows how to create and configure the synchronization node write_result, and write data from result_table to the MySQL database. The specific steps are as follows.
Create the node write_result, as shown in the following figure.
Select the source.
Select the MaxCompute data source and the source table result_table and click Next, as shown in the following figure.
Select the target.
Select the MySQL data source and the target table odps_result and click Next, as shown in the following figure.
Map the fields.
Configure the field mappings. Each field under Source Table Fields on the left maps one-to-one to a field under Target Table Fields on the right.
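The one-to-one mapping can be pictured as pairing the fields by position. The following is a minimal sketch, not the actual implementation of the synchronization node: the function name is hypothetical, and the source field names are an assumption, because the columns of result_table are not listed in this section.

```python
from typing import List, Tuple


def map_fields(source_fields: List[str], target_fields: List[str]) -> List[Tuple[str, str]]:
    """Pair each source field with the target field at the same position.

    A one-to-one mapping requires the same number of fields on both sides.
    """
    if len(source_fields) != len(target_fields):
        raise ValueError(
            f"cannot map {len(source_fields)} source fields "
            f"to {len(target_fields)} target fields one-to-one"
        )
    return list(zip(source_fields, target_fields))


# Assumed source columns of result_table; the target columns come from odps_result.
mapping = map_fields(["education", "num"], ["education", "num"])
```

If the two sides have different numbers of fields, the sketch raises an error instead of silently dropping a column.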
Control the channel.
Click Next to configure the maximum job rate and dirty data check rules, as shown in the following figure.
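To make the idea of a dirty data check rule concrete, the following is a minimal sketch under stated assumptions. It assumes that a record counts as dirty when its value cannot be converted to the type of the target column (here, num as an integer), and that the task fails once the number of dirty records exceeds a configured limit. The function name, the limit parameter, and the sample rows are all hypothetical, and the actual service may classify dirty data differently.

```python
from typing import Any, List, Tuple

Row = Tuple[Any, Any]


def split_dirty_records(rows: List[Row], max_dirty_records: int) -> Tuple[List[Row], List[Row]]:
    """Separate rows that fit (education: str, num: int) from rows that do not.

    Raises RuntimeError as soon as the dirty count exceeds max_dirty_records.
    """
    clean: List[Row] = []
    dirty: List[Row] = []
    for education, num in rows:
        try:
            clean.append((str(education), int(num)))
        except (TypeError, ValueError):
            # Assumption: a failed type conversion makes the record dirty.
            dirty.append((education, num))
            if len(dirty) > max_dirty_records:
                raise RuntimeError(
                    f"dirty record limit of {max_dirty_records} exceeded"
                )
    return clean, dirty


# Hypothetical sample rows: the second one cannot be written to an int column.
sample_rows = [("bachelor", "12"), ("master", "n/a")]
clean_rows, dirty_rows = split_dirty_records(sample_rows, max_dirty_records=1)
```

With a limit of 0, the same input would stop the run at the first dirty record.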
Preview and store.
After configuration, you can scroll up or down to view the task configurations. If no errors are found, click Save.
After you save the synchronization task, click Submit to submit it to the scheduling system. Starting from the next day, the scheduling system automatically runs the task periodically according to the configured attributes.
You now know how to create a synchronization task and export data to data sources of different types. The next tutorial shows you how to set the scheduling attributes and dependencies for a synchronization task. For more information, see Set task scheduling attribute and dependency.