This topic describes the basic concepts and types of resource groups, and explains how the connectivity and performance of a resource group affect data synchronization. It also compares the types of resource groups so that you can select the one that fits your needs.
Basic concepts
A resource group is a collection of computing resources on which batch sync nodes of Data Integration are run. Generally, a resource group refers to one or more servers that provide CPU, memory, and network resources.

Connectivity and performance
- Connectivity
To ensure that data can be properly synchronized, a resource group must be properly connected to the source and destination data stores. Connectivity is the most important factor that affects data synchronization.
Data Integration does not set up network connections for you. Before you use Data Integration to synchronize data, you must make sure that the resource group is properly connected to the data stores. If the resource group cannot reach the data stores, batch sync nodes cannot be run.
- Performance
Batch sync nodes consume the CPU, memory, and network resources of the servers on which they are run. Insufficient resources may lead to various issues. For example, nodes may fail to start, wait for resources for a prolonged period after startup, transmit data at a low rate, or fail to generate results as scheduled.
To ensure the smooth running of batch sync nodes, you must allocate adequate resources for them. We recommend that you use exclusive resource groups to run batch sync nodes so that the nodes do not need to compete for resources in the public resource pool.
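The connectivity requirement described above can be checked with a basic TCP reachability test. The following is a minimal sketch, not a DataWorks feature. It assumes that you can run Python on a machine in the same network as the resource group, for example a server in a custom resource group. The function names `can_reach` and `check_endpoints` and the endpoint values in the usage comment are hypothetical.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_endpoints(endpoints: dict) -> dict:
    """Map each endpoint name to the result of its reachability check.

    endpoints maps a label to a (host, port) tuple.
    """
    return {name: can_reach(host, port) for name, (host, port) in endpoints.items()}


# Example usage (hypothetical host names and ports, replace with your own):
# results = check_endpoints({
#     "source": ("source-db.example.internal", 3306),
#     "destination": ("dest-db.example.internal", 5432),
# })
# print(results)
```

A successful check shows only that the port is reachable at the network level. It does not verify credentials, IP address whitelists, or permissions on the data store.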
Types and comparison of resource groups
Type | Shared resource group | Exclusive resource group | Custom resource group |
---|---|---|---|
Ownership of resources | The resources are maintained by DataWorks and shared among all tenants. | The resources are maintained by DataWorks and exclusively used by the tenant that purchases the exclusive resources. | The resources are maintained by yourself and reside in your Internet data center (IDC). |
Network | Supports Alibaba Cloud databases on any type of network and databases that are not provided by Alibaba Cloud and are deployed on classic networks or the Internet. | Supports Alibaba Cloud databases on any type of network and databases that are not provided by Alibaba Cloud and are deployed on virtual private clouds (VPCs) or the Internet. | Supports Alibaba Cloud databases on any type of network and databases that are not provided by Alibaba Cloud and are deployed on VPCs or the Internet. |
Billing method | Tiered pricing based on the number of node instances | Subscription based on the server specifications | Monthly billing in pay-as-you-go mode based on the DataWorks edition |
Supported data stores | Specific data stores | All data stores | All data stores |
Security | High | High | Depending on the environment where your server resides |
Node running efficiency (whether nodes can be allocated sufficient computing resources to deliver the optimal performance) | Low | High | Depending on the environment where your server resides |
Reliability (whether nodes can be started and generate results as scheduled in the case that network resources are occupied by other tenants) | Low | High | Depending on the environment where your server resides |
Scenario | Suitable for a small number of non-important, non-urgent, or testing nodes | Suitable for a large number of important production nodes | Suitable for the following scenarios: |
Recommendation index | ★★ | ★★★★★ | ★ |
Based on the preceding table, we recommend that you use exclusive resource groups to run batch sync nodes.