Before you configure a data synchronization solution or node, you must make sure that your exclusive resource group for Data Integration and data sources are connected to each other. You can select appropriate network connectivity solutions to establish network connections between the resource group and data sources based on the network environments in which the data sources are deployed. This topic describes the network connectivity solutions that are available when data sources are deployed in different types of network environments.
Precautions
- A data synchronization node can run only if network connections are established between its data sources and its resource group. Before you commit a data synchronization node to the production environment, make sure that the data sources used by the node pass the network connectivity test. Note that a successful network connectivity test does not guarantee that the node runs successfully.
- You can refer to the instructions provided in this topic to establish network connections between an exclusive resource group for scheduling and data sources.
- An exclusive resource group for Data Integration cannot connect to a data source that is deployed on the classic network. Before you synchronize data from or to such a data source, we recommend that you migrate the data source to a virtual private cloud (VPC).
- If you synchronize data from a data source over the Internet, the data transmission speed cannot be ensured. We recommend that you synchronize data over a VPC.
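At its simplest, a connectivity test asks whether a TCP connection to the data source endpoint can be opened. The following minimal Python sketch illustrates that idea under stated assumptions: the function name `can_reach` and the host and port in the usage comment are hypothetical, and the check runs from whichever machine executes it, not from the exclusive resource group. The network connectivity test in the DataWorks console remains the authoritative check.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage (address and port are placeholders, not real endpoints):
# can_reach("192.168.0.10", 3306)
```

A reachable port only shows that the network path exists. Credentials, permissions, or an IP address whitelist can still cause a synchronization node to fail.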
Background information
You can use an exclusive resource group for Data Integration to synchronize data between heterogeneous data sources in a complex network environment. Before you run a data synchronization node to synchronize data, you must establish network connections between the exclusive resource group for Data Integration and the data sources.

Purchase an exclusive resource group for Data Integration
- The maximum number of data synchronization nodes that can be run in parallel on a resource group and the maximum number of parallel threads supported by a resource group vary based on the specifications of the resource groups. You must purchase a resource group with appropriate specifications based on your business requirements.
- We recommend that you use different resource groups to run batch synchronization nodes and real-time synchronization nodes. If both types of nodes run on the same resource group, they compete for CPU, memory, and network resources. As a result, the batch synchronization node may slow down, the real-time synchronization node may be delayed, or, in the worst case, out-of-memory (OOM) errors may occur.
Configure network connectivity
Step 1: Associate a resource group with a VPC
The network connectivity solution that you can use varies based on the network relationship between your exclusive resource group for Data Integration and a data source. The following figure and table provide the related information.

Network type for data synchronization | Data source type | Relationship between the data source and the exclusive resource group for Data Integration | Common logic for network connections | Sample configuration |
---|---|---|---|---|
VPC | Alibaba Cloud data sources | Same Alibaba Cloud account and same region | Associate the exclusive resource group for Data Integration with the VPC in which the data source resides. | Scenario 1: Establish a network connection between an exclusive resource group for Data Integration and a data source that belong to the same Alibaba Cloud account and reside in the same region |
VPC | Alibaba Cloud data sources | Different Alibaba Cloud accounts or regions | | |
VPC | Data sources that do not belong to Alibaba Cloud | | | Scenario 4: Establish a network connection between an exclusive resource group for Data Integration and a data source that resides in a data center |
Internet | - | | The exclusive resource group for Data Integration can directly connect to the data source that is accessible over the Internet. | - |

Note: If the data source is configured with an IP address whitelist, you must add the CIDR block of the vSwitch with which the exclusive resource group for Data Integration is associated or the elastic IP address (EIP) of the exclusive resource group for Data Integration to the IP address whitelist regardless of the scenario in which you want to synchronize data. For information about how to obtain the CIDR block or IP address that must be added to the IP address whitelist, see Configure a whitelist.
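The mapping from network relationship to sample scenario can be sketched as a small lookup. The following Python snippet is illustrative only: the class `DataSourceLocation`, its fields, and the function `pick_scenario` are hypothetical names introduced here, and the returned labels simply refer to the scenario headings in this topic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSourceLocation:
    """Hypothetical description of where a data source sits relative to the resource group."""
    over_internet: bool = False    # data is synchronized over the Internet
    on_alibaba_cloud: bool = True  # False for a data source outside Alibaba Cloud
    same_account: bool = True      # same Alibaba Cloud account as the resource group
    same_region: bool = True       # same region as the resource group


def pick_scenario(loc: DataSourceLocation) -> str:
    """Return the label of the sample scenario in this topic that matches `loc`."""
    if loc.over_internet:
        return "Internet: connect directly"
    if not loc.on_alibaba_cloud:
        return "Scenario 4"
    if not loc.same_account:
        return "Scenario 3"
    if not loc.same_region:
        return "Scenario 2"
    return "Scenario 1"


# A data source in the same account but a different region.
print(pick_scenario(DataSourceLocation(same_region=False)))
```

The account check comes before the region check because Scenario 3 covers any data source that belongs to a different Alibaba Cloud account.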
Step 2: Configure the IP address whitelist of the data source
If the data source is configured with an IP address whitelist, you must add the CIDR block of the vSwitch with which the exclusive resource group for Data Integration is associated or the EIP of the exclusive resource group for Data Integration to the IP address whitelist regardless of the scenario in which you want to synchronize data.
- If you want to synchronize data over a VPC, you must add the CIDR block of the vSwitch with which the exclusive resource group for Data Integration is associated to the IP address whitelist of the data source.
- If you want to synchronize data over the Internet, you must add the EIP of the exclusive resource group for Data Integration to the IP address whitelist of the data source.
- To obtain the CIDR block for synchronization over a VPC: On the Exclusive Resource Groups tab of the DataWorks console, find the desired exclusive resource group for Data Integration and click Network Settings in the Actions column to view the CIDR block of the vSwitch to which the resource group is bound. Then, add the CIDR block to the IP address whitelist of the data source.
- To obtain the EIP for synchronization over the Internet: On the Exclusive Resource Groups tab of the DataWorks console, find the exclusive resource group for Data Integration whose EIP you want to view and click View Information in the Actions column. In the Exclusive Resource Groups dialog box, copy the EIP. Then, add the copied EIP to the IP address whitelist of the data source.
  Note: If you upgrade the configuration of the exclusive resource group for Data Integration, check whether the EIP of the resource group changes. If the EIP changes, add the new EIP to the IP address whitelist of the data source after the upgrade so that your synchronization node continues to run as expected.
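Whether a whitelist already covers the resource group can be checked offline. The following Python sketch uses the standard `ipaddress` module; the function name `is_whitelisted` and all addresses are hypothetical placeholders, and the real whitelist format depends on the data source.

```python
import ipaddress
from typing import Iterable


def is_whitelisted(source_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if `source_ip` falls inside any whitelist entry.

    Each entry may be a CIDR block (for example, a vSwitch CIDR block)
    or a single IP address (for example, an EIP).
    """
    addr = ipaddress.ip_address(source_ip)
    # strict=False tolerates entries whose host bits are set, such as "192.168.0.5/24".
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in whitelist)


# Hypothetical values: a vSwitch CIDR block and an EIP from a documentation range.
whitelist = ["192.168.0.0/24", "203.0.113.10"]
print(is_whitelisted("192.168.0.15", whitelist))  # address inside the vSwitch CIDR block
print(is_whitelisted("203.0.113.10", whitelist))  # the EIP itself
```

A single IP address is treated as a /32 network, so the same check handles both the VPC case and the Internet case.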
Sample configurations for different scenarios
Scenario 1: Establish a network connection between an exclusive resource group for Data Integration and an ApsaraDB RDS instance that belong to the same Alibaba Cloud account and reside in the same region
Scenario 2: Establish a network connection between an exclusive resource group for Data Integration and an ApsaraDB RDS instance that belong to the same Alibaba Cloud account but reside in different regions
Scenario 3: Establish a network connection between an exclusive resource group for Data Integration and an ApsaraDB RDS instance that belong to different Alibaba Cloud accounts
Scenario 4: Establish a network connection between an exclusive resource group for Data Integration and a data source that resides in a data center
If the data source that you want to use does not belong to Alibaba Cloud, you can refer to this scenario to establish a network connection between the data source and the resource group that you want to use.
- Establish a network connection between the network environment in which the data source resides and Alibaba Cloud.
  Use an Express Connect circuit to establish a network connection between the network environment in which the data source resides and a VPC within the Alibaba Cloud account to which the exclusive resource group for Data Integration belongs.
- Establish a network connection between the exclusive resource group for Data Integration and the data source.
  - Associate the exclusive resource group for Data Integration with the VPC that is connected to the data source.
  - Add a route that points to the CIDR block of the data source for the exclusive resource group for Data Integration in the DataWorks console. For more information, see General reference: Add a route.
  - Add the CIDR block of the vSwitch with which the exclusive resource group for Data Integration is associated to the IP address whitelist of the data source.
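The route and whitelist steps above can be sanity-checked before you run a node. The following Python sketch is a hedged illustration, not a DataWorks API: the function `scenario4_gaps`, its parameters, and all CIDR blocks are hypothetical, and the values would have to be copied manually from the DataWorks console and the data source configuration.

```python
import ipaddress
from typing import Iterable, List


def scenario4_gaps(
    data_source_ip: str,
    route_cidrs: Iterable[str],
    vswitch_cidr: str,
    whitelist: Iterable[str],
) -> List[str]:
    """Return a list of configuration gaps; an empty list means none were found."""
    gaps = []

    # A route must point to a CIDR block that contains the data source address.
    ip = ipaddress.ip_address(data_source_ip)
    if not any(ip in ipaddress.ip_network(cidr) for cidr in route_cidrs):
        gaps.append("no route covers the data source")

    # The whitelist must include the whole vSwitch CIDR block, either as an
    # identical entry or as a larger block that contains it.
    vswitch = ipaddress.ip_network(vswitch_cidr)
    if not any(vswitch.subnet_of(ipaddress.ip_network(entry)) for entry in whitelist):
        gaps.append("vSwitch CIDR block is not in the whitelist")

    return gaps


# Hypothetical values for a data center at 10.10.0.0/16 and a vSwitch at 172.16.1.0/24.
print(scenario4_gaps("10.10.0.8", ["10.10.0.0/16"], "172.16.1.0/24", ["172.16.1.0/24"]))
```

The check is purely arithmetic on CIDR blocks. It does not verify that the Express Connect circuit is up or that the resource group is associated with the correct VPC.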
What to do next
Configure a data synchronization solution or node. For more information, see the following topics: