Before you configure a synchronization task, make sure that network connections are established between your exclusive resource group for Data Integration and your data sources. The network connectivity solution that you use depends on the network environment in which each data source is deployed. This topic describes the solutions that are available for each type of network environment.
Precautions
A synchronization task can run only if network connections are established between its data sources and its resource group. Before you commit a synchronization task to the production environment, make sure that the data sources used by the task pass the network connectivity test. Note that passing the network connectivity test does not guarantee that the synchronization task runs successfully.
You can also follow the instructions in this topic to establish network connections between an exclusive resource group for scheduling and your data sources.
An exclusive resource group for Data Integration cannot connect to a data source that is deployed on the classic network. Before you synchronize data from or to such a data source, we recommend that you migrate the data source to a virtual private cloud (VPC).
If you synchronize data from a data source over the Internet, the speed of data transmission and the stability of your synchronization task cannot be ensured. We recommend that you synchronize data over a VPC or by using Cloud Enterprise Network (CEN).
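As a rough preliminary check before you run the network connectivity test in DataWorks, you can probe whether a data source endpoint accepts TCP connections. The following is a minimal sketch, not the DataWorks connectivity test itself. It assumes that you run it on a host in the same VPC as the resource group, and the function name `can_connect` is made up for illustration. A successful probe from another host does not prove that the resource group itself can reach the data source.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `can_connect("192.168.0.10", 3306)` probes a hypothetical MySQL-compatible endpoint. Replace the address and port with those of your own data source.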
Background information
You can use an exclusive resource group for Data Integration to synchronize data between heterogeneous data sources in a complex network environment.
Before you run a synchronization task, you must establish network connections between the resource group that you want to use and your data sources, as shown in the preceding figure. This topic focuses on network connections between an exclusive resource group for Data Integration and data sources.
Purchase an exclusive resource group for Data Integration
For more information about how to purchase an exclusive resource group for Data Integration, see Create and use an exclusive resource group for Data Integration.
The maximum number of synchronization tasks that can be run in parallel on a resource group and the maximum number of parallel threads supported by a resource group vary based on the specifications of the resource groups. You must purchase a resource group with appropriate specifications based on your business requirements.
We recommend that you use different resource groups to run batch synchronization tasks and real-time synchronization tasks. If a batch synchronization task and a real-time synchronization task run on the same resource group, they compete for CPU, memory, and network resources and affect each other. As a result, the batch synchronization task may slow down, or the real-time synchronization task may be delayed. In the worst case, either task may be killed by the out of memory (OOM) killer.
Exclusive resource groups in the same region use the same elastic IP address (EIP). If you block the EIP for a region, none of the exclusive resource groups in that region can access your data.
Configure network connectivity
Step 1: Associate a resource group with a VPC
The network connectivity solution that you can use varies based on the network relationship between your exclusive resource group for Data Integration and a data source. The following figure and table provide the related information.

Network type for data synchronization | Data source type | Relationship between the data source and the exclusive resource group for Data Integration | Common logic for network connections | Sample configuration |
VPC | Alibaba Cloud data sources | Same Alibaba Cloud account and same region | Associate the exclusive resource group for Data Integration with the VPC in which the data source is deployed. | |
VPC | Alibaba Cloud data sources | Different Alibaba Cloud accounts or regions | | |
VPC | Data sources that do not belong to Alibaba Cloud | | | |
Internet | - | | The exclusive resource group for Data Integration can directly connect to the data source that is accessible over the Internet. | - |
Note If the data source is configured with an IP address whitelist, you must add the CIDR block of the vSwitch with which the exclusive resource group for Data Integration is associated or the EIP of the exclusive resource group for Data Integration to the IP address whitelist regardless of the scenario in which you want to synchronize data. For information about how to obtain the CIDR block or IP address that must be added to the IP address whitelist, see Configure an IP address whitelist.
Step 2: Configure the IP address whitelist of the data source
If the data source is configured with an IP address whitelist, you must add the CIDR block of the vSwitch with which the exclusive resource group for Data Integration is associated or the EIP of the exclusive resource group for Data Integration to the IP address whitelist regardless of the scenario in which you want to synchronize data.
If you want to synchronize data over a VPC, you must add the CIDR block of the vSwitch with which the exclusive resource group for Data Integration is associated to the IP address whitelist of the data source.
If you want to synchronize data over the Internet, you must add the EIP of the exclusive resource group for Data Integration to the IP address whitelist of the data source.
You can use one of the following methods to obtain the CIDR block or IP address that must be added to the IP address whitelist:
- If you want to use an exclusive resource group for Data Integration to run a node to synchronize data from a data source over a VPC, you must add the CIDR block of the vSwitch to which the exclusive resource group is bound to an IP address whitelist of the data source. To obtain and add the CIDR block, perform the following operations:
  On the Exclusive Resource Groups tab of the DataWorks console, find the desired exclusive resource group for Data Integration and click Network Settings in the Actions column to view the CIDR block of the vSwitch to which the resource group is bound. Then, add the CIDR block to the IP address whitelist of the data source.
- If you want to use an exclusive resource group for Data Integration to run a node to synchronize data from a data source over the Internet, you must add the EIP of the exclusive resource group to an IP address whitelist of the data source. To obtain and add the EIP, perform the following operations:
  On the Exclusive Resource Groups tab of the DataWorks console, find the exclusive resource group for Data Integration whose EIP you want to view and click View Information in the Actions column. In the Exclusive Resource Groups dialog box, copy the EIP. Then, add the copied EIP to the IP address whitelist of the data source.
  Note If you upgrade the configuration of the exclusive resource group for Data Integration, you must check whether the EIP of the resource group changes. If the EIP changes, add the new EIP to the IP address whitelist of the data source after the configuration upgrade. This ensures that your synchronization node continues to run as expected.
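The whitelist rule described above (vSwitch CIDR block for a VPC, EIP for the Internet) can be sketched with Python's standard `ipaddress` module. This is an illustrative helper only. The function names `whitelist_entry` and `is_covered` and all addresses are made up for the example, and nothing here calls a DataWorks API. You still copy the real CIDR block or EIP from the DataWorks console as described above.

```python
import ipaddress
from typing import Iterable, Optional


def whitelist_entry(network_type: str,
                    vswitch_cidr: Optional[str] = None,
                    eip: Optional[str] = None) -> str:
    """Return the value to add to the data source whitelist.

    network_type is "vpc" or "internet" (labels chosen for this sketch).
    Raises ValueError if the required value is missing or malformed.
    """
    if network_type == "vpc":
        if vswitch_cidr is None:
            raise ValueError("a vSwitch CIDR block is required for VPC synchronization")
        # ip_network validates the CIDR block and rejects host bits such as 192.168.0.5/24.
        return str(ipaddress.ip_network(vswitch_cidr))
    if network_type == "internet":
        if eip is None:
            raise ValueError("an EIP is required for Internet synchronization")
        return str(ipaddress.ip_address(eip))
    raise ValueError(f"unknown network type: {network_type!r}")


def is_covered(client_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if client_ip matches any whitelist entry (single IP or CIDR block)."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(entry) for entry in whitelist)
```

For example, `whitelist_entry("vpc", vswitch_cidr="192.168.0.0/24")` returns the CIDR block unchanged, and `is_covered("192.168.0.17", ["192.168.0.0/24"])` confirms that an address inside that vSwitch would be admitted by the whitelist.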
Sample configurations for different scenarios
The first three scenarios that are described in this section demonstrate how to establish a network connection between an exclusive resource group for Data Integration and an ApsaraDB RDS instance over a VPC. For information about how to obtain the VPC information of an ApsaraDB RDS instance, see Change the VPC and vSwitch.
In the following scenarios, the exclusive resource group for Data Integration is associated with a basic security group. For information about basic security groups, see Overview.
Scenario 1: Establish a network connection between an exclusive resource group for Data Integration and an ApsaraDB RDS instance that belong to the same Alibaba Cloud account and reside in the same region
Instruction on establishing a network connection | Illustration |
| ![]() |
Scenario 2: Establish a network connection between an exclusive resource group for Data Integration and an ApsaraDB RDS instance that belong to the same Alibaba Cloud account but reside in different regions
Instruction on establishing a network connection | Illustration |
| ![]() |
Scenario 3: Establish a network connection between an exclusive resource group for Data Integration and an ApsaraDB RDS instance that belong to different Alibaba Cloud accounts
Instruction on establishing a network connection | Illustration |
| ![]() |
Scenario 4: Establish a network connection between an exclusive resource group for Data Integration and a data source that resides in a data center
If the data source that you want to use does not belong to Alibaba Cloud, you can refer to this scenario to establish a network connection between the data source and the resource group that you want to use.
1. Establish a network connection between the network environment in which the data source resides and Alibaba Cloud.
   - Use an Express Connect circuit to establish a network connection between the network environment in which the data source resides and a VPC within the Alibaba Cloud account to which the exclusive resource group for Data Integration belongs.
2. Establish a network connection between the exclusive resource group for Data Integration and the data source.
   - Associate the exclusive resource group for Data Integration with the VPC that is connected to the data source.
   - Add a route that points to the CIDR block of the data source for the exclusive resource group for Data Integration in the DataWorks console. For more information, see General reference: Add a route.
   - Add the CIDR block of the vSwitch with which the exclusive resource group for Data Integration is associated to the IP address whitelist of the data source.
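The route and whitelist requirements in this scenario can be checked on paper before you configure anything. The following sketch is only an offline sanity check over values that you supply yourself. The function name `scenario4_gaps` and all CIDR blocks are hypothetical, and the code does not read any configuration from DataWorks or from the data center.

```python
import ipaddress
from typing import List


def _within(inner, outer) -> bool:
    """True if network `inner` lies entirely inside network `outer`."""
    return inner.version == outer.version and inner.subnet_of(outer)


def scenario4_gaps(data_source_cidr: str,
                   route_cidrs: List[str],
                   vswitch_cidr: str,
                   whitelist: List[str]) -> List[str]:
    """List the requirements from this scenario that the given values do not satisfy."""
    data_source = ipaddress.ip_network(data_source_cidr)
    vswitch = ipaddress.ip_network(vswitch_cidr)
    gaps = []
    # A route added for the resource group must cover the data source CIDR block.
    if not any(_within(data_source, ipaddress.ip_network(r)) for r in route_cidrs):
        gaps.append("no route covers the data source CIDR block")
    # The data source whitelist must admit the vSwitch CIDR block.
    if not any(_within(vswitch, ipaddress.ip_network(w)) for w in whitelist):
        gaps.append("vSwitch CIDR block is not in the data source whitelist")
    return gaps
```

An empty list means that the supplied routes and whitelist entries are consistent with the steps above. It does not confirm that the Express Connect circuit or the VPC association is actually in place.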
What to do next
Configure a synchronization task. For information about the capabilities supported by the full and incremental synchronization feature, batch synchronization feature, and real-time synchronization feature, see the following topics: