If a large number of nodes must be run in parallel, exclusive computing resources are required to ensure that the nodes are run as scheduled. In this case, we recommend that you use exclusive resource groups for scheduling. This topic provides an overview of exclusive resource groups for scheduling.

Scenarios

  • You have strict requirements for the timeliness of data output. If you use the shared resource group for scheduling, node scheduling may be delayed due to resource preemption.
  • You want to adjust the size of a resource group.
  • You want to access a data source that is deployed on the Internet, in a virtual private cloud (VPC), or in a data center.
  • You want to control access to a data source by using an IP address whitelist.
  • Specific types of DataWorks nodes, such as E-MapReduce (EMR) nodes, CDH nodes, AnalyticDB for PostgreSQL nodes, and AnalyticDB for MySQL nodes, can be run by using only exclusive resource groups.

Limits

  • An exclusive resource group for scheduling is charged based on the subscription billing method. You cannot delete or release an exclusive resource group for scheduling before the resource group expires. An exclusive resource group for scheduling is suspended and released at the specified points in time after it expires.
  • An exclusive resource group for scheduling cannot be shared across regions. For example, an exclusive resource group for scheduling in the China (Shanghai) region can be used only by workspaces in the China (Shanghai) region.
  • You can purchase a maximum of 20 Elastic Compute Service (ECS) instances for each exclusive resource group, and the ECS instances must be of the same specifications.

Performance metrics

Specifications    Maximum number of parallel instances
4c8g              16
8c16g             32
12c24g            48
16c32g            64
24c48g            96
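
The table and the limit of 20 ECS instances per resource group can be combined into a rough sizing estimate. The following Python sketch is illustrative only: the function name `ecs_instances_needed` is hypothetical, and the sketch assumes that the limit in the table applies to each ECS instance and that capacity adds up linearly across the ECS instances in a resource group. Verify both assumptions against your actual workload before you purchase resources.

```python
import math

# Maximum number of parallel instances for each specification,
# copied from the performance metrics table.
MAX_PARALLEL_PER_ECS = {
    "4c8g": 16,
    "8c16g": 32,
    "12c24g": 48,
    "16c32g": 64,
    "24c48g": 96,
}

# Limit stated in this topic: at most 20 ECS instances per resource group.
MAX_ECS_PER_GROUP = 20


def ecs_instances_needed(spec: str, peak_parallel: int) -> int:
    """Estimate how many ECS instances of `spec` cover `peak_parallel`.

    Assumption of this sketch: capacity scales linearly with the number
    of ECS instances in the resource group.
    """
    if peak_parallel <= 0:
        raise ValueError("peak_parallel must be a positive integer")
    per_ecs = MAX_PARALLEL_PER_ECS[spec]
    count = math.ceil(peak_parallel / per_ecs)
    if count > MAX_ECS_PER_GROUP:
        raise ValueError(
            f"{peak_parallel} parallel instances need {count} x {spec}, "
            f"which exceeds the limit of {MAX_ECS_PER_GROUP} ECS instances"
        )
    return count


if __name__ == "__main__":
    # 100 parallel instances on 8c16g: ceil(100 / 32) = 4 ECS instances.
    print(ecs_instances_needed("8c16g", 100))  # 4
```

If the estimate exceeds 20 ECS instances for a given specification, the sketch raises an error. In that case, a larger specification is the option to evaluate first.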

Billing and related operations

(1) Billing

An exclusive resource group for scheduling is charged based on the subscription billing method. You can purchase an exclusive resource group for scheduling of appropriate specifications based on your business requirements. For more information, see Billing of exclusive resource groups for scheduling (subscription).

(2) Scaling

When you purchase an exclusive resource group for scheduling, you specify the specifications and the number of ECS instances that you need. After the purchase is complete, you can scale out or scale in the resource group to change the number of ECS instances. For more information about how to scale out and scale in a resource group, see Scale out or in a resource group.

(3) Specification change (specification upgrade or downgrade)

If the specifications of your exclusive resource group for scheduling no longer meet your business requirements, you can change the specifications of the resource group. After you change the specifications of the resource group, the specifications of all ECS instances in the resource group are changed. For more information about how to change the specifications of a resource group and the related precautions, see Change the specifications of a resource group.

(4) Renewal, suspension, and release of an exclusive resource group for scheduling

You can renew an exclusive resource group for scheduling when the resource group is about to expire. If you do not renew the resource group before it expires, the resource group is first suspended and is then automatically released at the specified points in time after expiration. For more information, see Expiration and renewal.

Use an exclusive resource group for scheduling

After you understand how an exclusive resource group for scheduling is billed, you can purchase an exclusive resource group for scheduling based on your business requirements and use the resource group to run data synchronization nodes in Data Integration or nodes in DataStudio. To purchase and use an exclusive resource group for scheduling, perform the following steps:
  1. Purchase exclusive resources for scheduling.
  2. Create an exclusive resource group for scheduling in the DataWorks console based on the purchased resources.
  3. Associate the exclusive resource group for scheduling with a workspace.
  4. (Optional) Associate the exclusive resource group for scheduling with a VPC.
  5. (Optional) Add the EIP of the exclusive resource group for scheduling or the CIDR block of the vSwitch with which the resource group is associated to the IP address whitelist of the data source that the resource group needs to access.
  6. Use the created exclusive resource group for scheduling.
For more information, see Create and use an exclusive resource group for scheduling.
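
Before you run nodes, it can help to confirm that the whitelist entries from step 5 actually cover the address that the resource group uses. The following Python sketch uses only the standard `ipaddress` module. The function name `is_whitelisted` and all addresses are hypothetical placeholders, not values from DataWorks. Replace them with the EIP of your resource group or the CIDR block of your vSwitch.

```python
import ipaddress
from typing import Iterable


def is_whitelisted(source_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if `source_ip` matches any whitelist entry.

    Each entry is either a single IP address (for example, an EIP) or a
    CIDR block (for example, the CIDR block of a vSwitch).
    """
    addr = ipaddress.ip_address(source_ip)
    for entry in whitelist:
        # strict=False accepts entries whose host bits are set,
        # such as "192.168.0.1/24".
        network = ipaddress.ip_network(entry, strict=False)
        if addr.version == network.version and addr in network:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical whitelist: one EIP and one vSwitch CIDR block.
    whitelist = ["203.0.113.10", "192.168.0.0/24"]
    print(is_whitelisted("192.168.0.57", whitelist))  # True
    print(is_whitelisted("192.168.1.5", whitelist))   # False
    print(is_whitelisted("203.0.113.10", whitelist))  # True
```

The check runs locally and only validates the whitelist entries that you plan to configure. It does not test actual network connectivity between the resource group and the data source.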

Network connectivity solutions

Similar to other types of resource groups, an exclusive resource group for scheduling is a group of Alibaba Cloud ECS instances. Before you use an exclusive resource group for scheduling to run nodes, such as nodes in DataStudio, you must make sure that network connections are established between the resource group and the data sources that the nodes access. You must also make sure that special security settings such as IP address whitelists do not block the network connections between the resource group and the data sources.
Note: If an exclusive resource group for scheduling does not need to interact with a data source, you do not need to consider the network connectivity between the resource group and the data source. You can directly use the resource group after you purchase it.

If the exclusive resource group for scheduling needs to access a data source, associate the resource group with a VPC after you purchase the resource group. Then, select a network connectivity solution based on the network environment in which your data source is deployed.