This topic uses a MySQL data source that is deployed in an on-premises data center as an example to show you how to connect the data source to DataWorks.
Use cases
If your data source meets the following condition, we recommend that you use this solution:
The data source is deployed in an on-premises data center.
Solution description
If your data source is in an on-premises data center (IDC), connect it over a VPC. Use a network connectivity tool, such as Express Connect, to connect the on-premises network of the data source to the VPC that is attached to the resource group of your DataWorks workspace. After the two networks are connected, the resource group can communicate with the data source.
Prerequisites
A data source supported by DataWorks is deployed in an on-premises data center.
A workspace is created. For more information, see Create a workspace.
A serverless resource group is created and associated with your workspace. For more information, see Create a serverless resource group and Bind the resource group to a workspace.
Billing
This solution uses Express Connect, which is a paid service. For more information about Express Connect billing, see Billing overview.
Configure network connectivity
Step 1: Get basic information
Data source side
On-premises data center CIDR block
Connect to the on-premises data center server to obtain the CIDR block. You can also contact your network administrator or data center provider to obtain the CIDR block.
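If you read the address directly from the server (for example, from the output of ip -4 addr show), you usually get a host address plus a prefix length rather than the network CIDR block itself. The following is a minimal sketch of how the CIDR block could be derived from those two values. The function names and the sample address 192.168.10.37/24 are hypothetical and are not part of DataWorks or Express Connect.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Convert a 32-bit integer back to dotted-quad form.
int_to_ip() {
  local n=$1
  echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

# Print the network CIDR block that contains the given address.
# Usage: cidr_network <ipv4-address> <prefix-length>
cidr_network() {
  local ip=$1 prefix=$2 mask
  # Netmask: the top <prefix> bits set, the remaining bits cleared.
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  echo "$(int_to_ip $(( $(ip_to_int "$ip") & mask )))/$prefix"
}

# Hypothetical address and prefix length of the MySQL server.
cidr_network 192.168.10.37 24   # prints 192.168.10.0/24
```

The result covers only the subnet of that one interface. If the data center spans several subnets, the CIDR block provided by your network administrator remains the authoritative value.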
DataWorks side
Information about the VPC and vSwitch that are attached to the resource group
Go to the Resource Group page in the DataWorks console. Find the target resource group and click Network Settings in the Actions column.
In the corresponding functional module, you can view the attached VPC and VSwitch CIDR Block.
For example, to connect a MySQL database in an on-premises data center to DataWorks for data synchronization, you can view the corresponding VPC and VSwitch CIDR Block under Data Scheduling & Data Integration.

Step 2: Establish a network connection
To connect an on-premises data center to a VPC, you must use a network connectivity tool. For instructions, see Use an Express Connect circuit to establish a network connection between an on-premises data center and a VPC.
If errors occur when you configure network connectivity, submit a ticket to contact technical support of the related Alibaba Cloud service.
Step 3: Add a route for the DataWorks resource group
To allow DataWorks to access the on-premises data source, you must add a route in the DataWorks resource group that points to the CIDR block of the on-premises data center.
Go to the Resource Group page in the DataWorks console. Find the target resource group and click Network Settings in the Actions column.
In the appropriate functional module, find the attached VPC and click Custom Route in the Actions column.
Click Add Route, select CIDR Block as the Connection Method, and set Destination CIDR Block to the CIDR block of the data center where the data source is deployed.
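The route is useful only if the Destination CIDR Block actually contains the IP address of the data source. The following sketch shows one way to check this locally before you save the route. The function names and the sample values are hypothetical, and the check assumes IPv4.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if <ipv4-address> lies inside <cidr>, 1 otherwise.
# Usage: ip_in_cidr <ipv4-address> <network>/<prefix-length>
ip_in_cidr() {
  local ip=$1 net=${2%/*} prefix=${2#*/} mask
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  # Both addresses must share the same network bits.
  (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}

# Hypothetical values: the MySQL server address and the Destination CIDR Block.
if ip_in_cidr 192.168.10.37 192.168.0.0/16; then
  echo "route covers the data source"
else
  echo "route does NOT cover the data source"
fi
```

With the sample values, the script reports that the route covers the data source. The check says nothing about whether the Express Connect circuit itself is working. It only guards against a mistyped destination.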
Step 4: (Optional) Add an IP address to the whitelist
If the data source is protected by a whitelist, add the vSwitch CIDR block of the resource group to the whitelist of the data source. This allows the resource group to access the data source.
This topic uses a MySQL IP address whitelist as an example. In this example, the whitelist is configured to allow access to the database only from the vSwitch CIDR block of the resource group.
Log on to the database as an administrator.
Create an account that can be used by DataWorks to access the data source and grant the required permissions to the account.
-- "dataworks_user" is the username. You can customize it.
-- "StrongPassword123!" is the password. You can customize it.
CREATE USER 'dataworks_user'@'<vSwitch CIDR block of the resource group>' IDENTIFIED BY 'StrongPassword123!';
-- Grant the user permissions to access a specific database, such as mydatabase, from the vSwitch CIDR block of the resource group.
GRANT ALL PRIVILEGES ON mydatabase.* TO 'dataworks_user'@'<vSwitch CIDR block of the resource group>' WITH GRANT OPTION;
Run the FLUSH PRIVILEGES; command to make the permissions take effect, and then run the exit command to exit the database.
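How the host part of the account name must be written can depend on the MySQL version. To our knowledge, CIDR notation such as 172.16.0.0/24 is accepted in account names only from MySQL 8.0.23, whereas older releases expect the netmask form 172.16.0.0/255.255.255.0. Verify this against the documentation for your version. Assuming you need the netmask form, the following sketch converts a CIDR block. The function name and sample CIDR blocks are hypothetical.

```shell
#!/usr/bin/env bash
# Convert "<network>/<prefix-length>" to "<network>/<dotted netmask>".
# Usage: cidr_to_netmask_host <network>/<prefix-length>
cidr_to_netmask_host() {
  local net=${1%/*} prefix=${1#*/} mask
  # Netmask: the top <prefix> bits set, the remaining bits cleared.
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  echo "$net/$(( (mask >> 24) & 255 )).$(( (mask >> 16) & 255 )).$(( (mask >> 8) & 255 )).$(( mask & 255 ))"
}

# Hypothetical vSwitch CIDR block of the resource group.
cidr_to_netmask_host 172.16.0.0/24   # prints 172.16.0.0/255.255.255.0
```

The printed value can then replace the host placeholder in the CREATE USER and GRANT statements above. The function does not validate that the network part is the base address of the block, so pass the vSwitch CIDR block exactly as shown in the console.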
Step 5: (Optional) Configure the firewall for the on-premises data center
The configuration methods for firewall software vary. This topic uses firewalld as an example. You can configure other types of firewalls based on your requirements.
Allow the vSwitch CIDR block of the resource group to access the MySQL database:
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="<vSwitch CIDR block of the resource group>" port port="3306" protocol="tcp" accept'
sudo firewall-cmd --reload
Test the network connectivity
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Data Integration.
In the navigation pane on the left, click Data Source. On the Data Sources page, click Add Data Source. Select a data source type and configure the connection parameters as required.
In the resource group list at the bottom of the page, select the resource group that is connected to the data source and click Test Connectivity.
Note: If the connectivity test returns Failed, you can use the Connectivity Diagnosis Tool to diagnose and resolve the issue. If the issue persists, submit a ticket for assistance.
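When the console test fails, it can help to check the network path separately from DataWorks. The following sketch probes the MySQL port over TCP using the /dev/tcp feature of bash and the timeout command from GNU coreutils. It assumes that you run it on a machine inside the same VPC as the resource group (for example, an ECS instance), which this topic does not require you to have. The function name and the sample address are hypothetical.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to <host>:<port> succeeds within <seconds>.
# Return 1 if the connection fails or times out, 2 on invalid arguments.
# Usage: tcp_probe <host> <port> [seconds]
tcp_probe() {
  local host=$1 port=$2 wait=${3:-3}
  [[ -n $host && $port =~ ^[0-9]+$ ]] || return 2
  (( 10#$port >= 1 && 10#$port <= 65535 )) || return 2
  # The child shell opens the socket and closes it again when it exits.
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null || return 1
}

# Example call with a hypothetical data source address:
#   tcp_probe 192.168.10.37 3306 && echo "port reachable" || echo "port not reachable"
```

A successful probe only shows that the route, whitelist, and firewall allow TCP traffic from that machine. The resource group uses its own vSwitch, so the Test Connectivity result in the console remains the deciding check.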
