This topic describes the basic concepts and types of resource groups, explains how the connectivity and performance of a resource group affect data synchronization, and compares the types so that you can select the one that fits your needs.

Basic concepts

A resource group is a collection of computing resources where batch sync nodes of Data Integration run. Generally, a resource group refers to a server that consists of CPU, memory, and network resources.

When a sync node runs, the resource group pulls data from the source data store and pushes the data to the destination data store.
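The pull-and-push flow described above can be pictured with a minimal sketch. This is not how Data Integration is implemented; `run_sync`, `batched`, and the two callables standing in for the source and destination data stores are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable, Iterator, List


def batched(rows: Iterable, batch_size: int) -> Iterator[List]:
    """Group rows into lists of at most batch_size items."""
    batch: List = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def run_sync(read_source: Callable[[], Iterable],
             write_destination: Callable[[List], None],
             batch_size: int = 1000) -> int:
    """Pull rows from the source and push them to the destination in batches.

    Returns the number of rows transferred.
    """
    total = 0
    for batch in batched(read_source(), batch_size):
        write_destination(batch)
        total += len(batch)
    return total


# In-memory stand-ins for the source and destination data stores (assumption).
source_rows = [{"id": i} for i in range(5)]
destination: List[dict] = []
transferred = run_sync(lambda: source_rows, destination.extend, batch_size=2)
print(transferred)  # 5 rows, moved in batches of 2, 2, and 1
```

In this picture, the machine that executes `run_sync` plays the role of the resource group: it needs network access to both stores, and its CPU, memory, and bandwidth bound how fast the loop can run.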

Connectivity and performance

When you use resource groups, you must pay attention to their connectivity and performance.
  • Connectivity

    To make sure that data can be properly synchronized, a resource group must be properly connected to both the source and destination data stores. Connectivity is the most important factor that affects data synchronization.

    Data Integration cannot build networks for you. Before you use it to synchronize data, make sure that the resource group is connected to both the source and destination data stores. Otherwise, batch sync nodes cannot run.

  • Performance

    Batch sync nodes consume the CPU, memory, and network resources of the servers on which they run. Insufficient resources can cause various issues. For example, nodes may fail to start, wait a long time for resources after startup, transmit data at a low rate, or fail to generate results in a timely manner.

    To guarantee smooth running of batch sync nodes, you must allocate adequate resources for them. We recommend that you use exclusive resource groups to run batch sync nodes so that the nodes do not need to compete for resources in the public resource pool.
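As a rough way to check the connectivity requirement described above, the following sketch tests whether a TCP connection can be opened to each data store. It assumes that you can run Python on a machine in the same network as the resource group (for example, a server in a custom resource group). The function names and endpoint labels are hypothetical, and a successful TCP connection does not prove that credentials or whitelists are configured correctly.

```python
import socket
from typing import Dict, Tuple


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_endpoints(endpoints: Dict[str, Tuple[str, int]],
                    timeout: float = 3.0) -> Dict[str, bool]:
    """Check every named endpoint and report which ones are reachable."""
    return {
        name: can_connect(host, port, timeout)
        for name, (host, port) in endpoints.items()
    }
```

To use it, pass a mapping such as `{"source": (host, port), "destination": (host, port)}` with the addresses of your own data stores. If either entry comes back `False`, fix the network path before you run a batch sync node.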

Types and comparison of resource groups

Data Integration supports three types of resource groups: default resource groups, exclusive resource groups, and custom resource groups. Each type is suited to different scenarios. You can select a resource group as needed to run a sync node.
| Type | Default resource group | Exclusive resource group | Custom resource group |
| --- | --- | --- | --- |
| Ownership of resources | The resources are maintained by DataWorks and shared among all tenants. | The resources are maintained by DataWorks and exclusively used by the tenant that purchases the exclusive resource group. | The resources are maintained by yourself and are located in your IDC. |
| Network | Supports non-Alibaba Cloud databases deployed on classic networks or the Internet and Alibaba Cloud databases on any type of networks | Supports non-Alibaba Cloud databases deployed on Virtual Private Clouds (VPCs) or the Internet and Alibaba Cloud databases on any type of networks | Supports non-Alibaba Cloud databases deployed on VPCs or the Internet and Alibaba Cloud databases on any type of networks |
| Billing method | Tiered pricing based on the number of node instances | Subscription based on the server specifications | Monthly billing in the pay-as-you-go mode based on the DataWorks edition |
| Supported data stores | Some data stores | All data stores | All data stores |
| Security | High | High | Depending on the environment where your server resides |
| Node running efficiency | Low | High | Depending on the environment where your server resides |
| Reliability | Low | High | Depending on the environment where your server resides |
| Scenario | Suitable for running a small number of non-important, non-urgent, or testing nodes | Suitable for running a large number of important production nodes | Suitable if you want to make full use of the computing resources you have purchased; suitable if both the source and destination data stores are in the same IDC as the custom resource group |
| Recommendation index | ★★ | ★★★★★ | |

Node running efficiency refers to whether nodes can be allocated with sufficient computing resources to achieve the highest performance.

Reliability refers to whether nodes can be started as scheduled and generate results in a timely manner in the case that network resources are occupied by other tenants.

Based on the preceding table, we recommend that you use exclusive resource groups to run batch sync nodes.
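The comparison can also be expressed as data, which may help if you want to filter the types by requirement. The sketch below encodes only the "Supported data stores" and "Reliability" rows. The class, field, and function names are hypothetical, and the filter treats "depending on the environment" as not guaranteed, which is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ResourceGroupType:
    name: str
    supported_data_stores: str  # "some" or "all"
    reliability: str            # "low", "high", or "depends"


# Values copied from the comparison of the three resource group types.
TYPES = [
    ResourceGroupType("default", "some", "low"),
    ResourceGroupType("exclusive", "all", "high"),
    ResourceGroupType("custom", "all", "depends"),
]


def suitable_types(need_all_data_stores: bool,
                   need_high_reliability: bool) -> List[str]:
    """Return the names of the types that meet the stated requirements."""
    result = []
    for t in TYPES:
        if need_all_data_stores and t.supported_data_stores != "all":
            continue
        # "depends" is treated as not guaranteed (assumption).
        if need_high_reliability and t.reliability != "high":
            continue
        result.append(t.name)
    return result


print(suitable_types(need_all_data_stores=True, need_high_reliability=True))
```

With both requirements set, only the exclusive resource group remains, which matches the recommendation above.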