This topic introduces the basic concepts and types of resource groups, and how the connectivity and performance of resource groups affect data synchronization. This topic also compares different types of resource groups. You can select appropriate resource groups based on your needs.

Basic concepts

A resource group is a collection of computing resources on which batch sync nodes of Data Integration are run. Generally, a resource group refers to one or more servers that provide CPU, memory, and network resources.

When a sync node runs, the resource group pulls data from the source data store and pushes the data to the destination data store.

Connectivity and performance

When you use resource groups, you must pay attention to their connectivity and performance.
  • Connectivity

    To synchronize data, a resource group must be able to connect to both the source and destination data stores. Connectivity is the most important factor that affects data synchronization.

    Data Integration does not set up network connections for you. Before you use Data Integration to synchronize data, make sure that the resource group can reach both data stores. If it cannot, batch sync nodes cannot run.

  • Performance

    Batch sync nodes consume the CPU, memory, and network resources on the servers where the nodes are run. Insufficient resources may lead to various issues. For example, nodes may fail to start, wait for resources for a prolonged period after startup, transmit data at a low rate, or fail to generate results as scheduled.

    To ensure the smooth running of batch sync nodes, you must allocate adequate resources for them. We recommend that you use exclusive resource groups to run batch sync nodes so that the nodes do not need to compete for resources in the public resource pool.
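
The connectivity requirement described above can be checked by hand before a sync node is scheduled. The following is a minimal sketch, not a Data Integration feature: it assumes you can run Python on a server that belongs to the resource group, and the function names `can_reach` and `check_data_stores` are hypothetical helpers introduced here for illustration. It only tests whether a TCP connection can be opened. It does not verify credentials, whitelists, or database permissions.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_data_stores(endpoints: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Map each named endpoint to whether it is reachable from this machine."""
    return {name: can_reach(host, port) for name, (host, port) in endpoints.items()}
```

For example, calling `check_data_stores({"source": ("source-db.example.internal", 3306), "destination": ("dest-db.example.internal", 5432)})` returns a dictionary with one Boolean per data store. The host names and ports in this call are placeholders. If either value is `False`, the resource group cannot reach that data store, and a sync node that uses it cannot run.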

Types and comparison of resource groups

Data Integration supports three types of resource groups: shared resource groups, exclusive resource groups, and custom resource groups. Each type is applicable to different scenarios. You can select a resource group as needed to run a sync node.

| Type | Shared resource group | Exclusive resource group | Custom resource group |
| --- | --- | --- | --- |
| Ownership of resources | The resources are maintained by DataWorks and shared among all tenants. | The resources are maintained by DataWorks and exclusively used by the tenant that purchases the exclusive resources. | The resources are maintained by yourself and reside in your Internet data center (IDC). |
| Network | Supports Alibaba Cloud databases on any type of network and databases that are not provided by Alibaba Cloud and are deployed on classic networks or the Internet. | Supports Alibaba Cloud databases on any type of network and databases that are not provided by Alibaba Cloud and are deployed on virtual private clouds (VPCs) or the Internet. | Supports Alibaba Cloud databases on any type of network and databases that are not provided by Alibaba Cloud and are deployed on VPCs or the Internet. |
| Billing method | Tiered pricing based on the number of node instances | Subscription based on the server specifications | Monthly billing in pay-as-you-go mode based on the DataWorks edition |
| Supported data stores | Specific data stores | All data stores | All data stores |
| Security | High | High | Depending on the environment where your server resides |
| Node running efficiency (whether nodes can be allocated sufficient computing resources to deliver the optimal performance) | Low | High | Depending on the environment where your server resides |
| Reliability (whether nodes can be started and generate results as scheduled when network resources are occupied by other tenants) | Low | High | Depending on the environment where your server resides |
| Scenario | Suitable for a small number of non-important, non-urgent, or testing nodes | Suitable for a large number of important production nodes | Suitable for the following scenarios: you want to make full use of the computing resources you have purchased; both the source and destination data stores are in the same IDC as the custom resource group. |
| Recommendation index | ★★ | ★★★★★ | |

Based on the preceding table, we recommend that you use exclusive resource groups to run batch sync nodes.
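
To illustrate how the network and data store rows of the comparison can narrow the choice, the following sketch encodes those rows as data and filters on them. It is an illustrative aid only, not part of DataWorks. The class `ResourceGroupType`, the function `candidates`, and the short network labels (`"classic"`, `"vpc"`, `"internet"`) are assumptions introduced here. The sketch covers only databases that are not provided by Alibaba Cloud, because all three types support Alibaba Cloud databases on any type of network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceGroupType:
    name: str
    # Networks supported for databases NOT provided by Alibaba Cloud.
    non_alibaba_networks: frozenset
    # False means only specific data stores are supported.
    supports_all_data_stores: bool


# Values transcribed from the Network and Supported data stores rows.
TYPES = [
    ResourceGroupType("shared", frozenset({"classic", "internet"}), False),
    ResourceGroupType("exclusive", frozenset({"vpc", "internet"}), True),
    ResourceGroupType("custom", frozenset({"vpc", "internet"}), True),
]


def candidates(network: str, needs_all_data_stores: bool = False) -> list[str]:
    """Return the resource group types that fit a non-Alibaba Cloud database.

    network: where the database is deployed ("classic", "vpc", or "internet").
    needs_all_data_stores: set to True if the data store is not among the
    specific data stores that the shared resource group supports.
    """
    return [
        t.name
        for t in TYPES
        if network in t.non_alibaba_networks
        and (t.supports_all_data_stores or not needs_all_data_stores)
    ]
```

For example, `candidates("vpc")` returns `["exclusive", "custom"]`, and `candidates("classic")` returns `["shared"]`. The filter covers connectivity and data store support only. Efficiency, reliability, and billing still need to be weighed against the table.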