If a large number of Data Integration nodes must be run in parallel, exclusive computing resource groups are required to ensure fast and reliable data transmission. In this case, you can use exclusive resource groups for Data Integration.

Limits

  • Network connectivity between data sources and exclusive resource groups for Data Integration

    Exclusive resource groups are deployed in the virtual private clouds (VPCs) of Alibaba Cloud. If your data source is deployed on the classic network of Alibaba Cloud, the exclusive resource groups cannot access the data source. In this case, we recommend that you create a data source that is the same as the original data source in a VPC.

  • Scale-out of exclusive resource groups for Data Integration
    • Each time you purchase exclusive resources for Data Integration, the resources belong to only one exclusive resource group for Data Integration. After an exclusive resource group for Data Integration is created, you can scale out the resource group by adding resources to the resource group.
    • A scale-out only increases the number of resources in your resource group. A scale-out does not upgrade the CPU and memory configurations of the resources or increase the number of VPCs with which your resource group is associated.
    • When you perform a scale-out for your resource group, you can add only resources of the same specifications as existing resources in the resource group.
    • You can scale out your resource group without stopping the nodes that are running on it. The new configuration takes effect about 20 minutes after you pay for your order.
  • Scale-in of exclusive resource groups for Data Integration
    • A scale-in decreases the number of resources in your resource group. You must stop nodes that are running on your resource group before you perform a scale-in for the resource group. After you perform a scale-in for your resource group, the new configuration takes effect about 20 minutes later.
    • Each time you purchase exclusive resources for Data Integration, the resources belong to only one exclusive resource group for Data Integration. After the resource group is created, you can scale it in by removing resources from it. After a scale-in, you are refunded for the removed resources for the period from the time of the scale-in to the time the original order expires. Check your bill for the exact amount.
  • Specification change of resources in a resource group
    • You can change the specifications of resources in your resource group by changing the specifications of the Elastic Compute Service (ECS) instances in the resource group. You cannot change the specifications of the ECS instances while nodes are running on the resource group, so make sure that no nodes are running before you start.
    • If you upgrade the specifications, you pay for the upgraded resources for the period from the time of the upgrade to the time the original order expires. If you downgrade the specifications, you are refunded for the unused resources for the period from the time of the downgrade to the time the original order expires. Check your bill for the exact amount. The new configuration takes effect about 20 minutes after you change the specifications.

    If nodes are still running on your resource group after you change the specifications and pay for the order, you cannot confirm the specification change until the nodes stop. Either stop the nodes manually on the Stop Node tab or wait until they finish running.

  • Deletion of exclusive resource groups for Data Integration

    You cannot delete an exclusive resource group for Data Integration in DataWorks. After the resource group expires, it can be suspended or released. For more information about suspension and release, see Other instructions.
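
The limits above can be summarized as a small pre-check. The following Python sketch is illustrative only: the ResourceGroup class, the check_change function, and the action names are hypothetical and are not part of any DataWorks API. It only encodes the rules stated in this section.

```python
from dataclasses import dataclass


@dataclass
class ResourceGroup:
    """Hypothetical model of an exclusive resource group for Data Integration."""
    spec: str            # specification of every resource, for example "4c8g"
    resource_count: int  # number of resources in the group
    running_nodes: int   # number of nodes currently running on the group


def check_change(group, action, new_spec=None):
    """Return (allowed, reason) for a requested change, based on the limits above."""
    if action == "scale_out":
        # Added resources must have the same specifications as existing ones.
        if new_spec is not None and new_spec != group.spec:
            return False, "added resources must match the existing specifications"
        return True, "allowed, even while nodes are running"
    if action in ("scale_in", "change_spec"):
        # Both operations require that no nodes are running on the group.
        if group.running_nodes > 0:
            return False, "stop the running nodes first"
        return True, "allowed"
    if action == "delete":
        return False, "cannot be deleted in DataWorks; suspended or released after expiration"
    raise ValueError(f"unknown action: {action}")


busy = ResourceGroup(spec="4c8g", resource_count=2, running_nodes=3)
print(check_change(busy, "scale_out", new_spec="4c8g"))
print(check_change(busy, "scale_in"))
```

In this sketch, a scale-out is the only change that is accepted while nodes are running, which mirrors the limits listed above.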

Scenarios

  • If you want to deploy your Data Integration nodes in the production environment, we recommend that you use exclusive resource groups for Data Integration. The resources in these resource groups can be scheduled at any time, which helps ensure that nodes generate their output as expected.
  • If you want to run a large number of Data Integration nodes and want the nodes to generate data at the earliest opportunity, we recommend that you use exclusive resource groups for Data Integration.
  • If your data sources are deployed on the Internet or in VPCs, we recommend that you use exclusive resource groups for Data Integration.
Note
  • Exclusive resource groups for Data Integration guarantee the number of threads that a data synchronization instance can simultaneously run, not the number of data synchronization instances that can be simultaneously run. To guarantee the number of data synchronization instances that can simultaneously run, you can purchase exclusive resources for scheduling.

Performance metrics

Specifications | Maximum number of parallel threads for a batch synchronization node | *Maximum number of parallel real-time synchronization nodes
4c8g | 8 | 3
8c16g | 16 | 6
12c24g | 24 | 9
16c32g | 32 | 12
24c48g | 48 | 18
Note *Maximum number of parallel real-time synchronization nodes: The values in this column are for reference only. Real-time synchronization nodes differ in their configurations, so the maximum for a given specification may not be reached when the nodes actually run. The actual number of real-time synchronization nodes that can run in parallel depends on the resource usage in the resource group.
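
As an illustration of how the table can be used, the following Python sketch picks the smallest specification whose thread limit covers the number of parallel threads that a batch synchronization node needs. The PERFORMANCE dictionary copies the values from the table. The function name smallest_spec_for_threads is hypothetical and is not part of DataWorks.

```python
# Values copied from the table above:
# specification -> (max parallel threads for a batch synchronization node,
#                   max parallel real-time synchronization nodes, reference only)
PERFORMANCE = {
    "4c8g": (8, 3),
    "8c16g": (16, 6),
    "12c24g": (24, 9),
    "16c32g": (32, 12),
    "24c48g": (48, 18),
}


def smallest_spec_for_threads(required_threads):
    """Return the smallest specification that supports the required number of
    parallel threads for one batch synchronization node, or None if none does."""
    # The dictionary is listed from the smallest to the largest specification.
    for spec, (max_threads, _max_realtime_nodes) in PERFORMANCE.items():
        if max_threads >= required_threads:
            return spec
    return None


print(smallest_spec_for_threads(10))  # smallest specification with at least 10 threads
```

In the table, the thread limit for batch synchronization is twice the number of vCPUs in the specification name. Treat that as an observation about these five rows, not as a documented rule.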

Billing method

Exclusive resource groups for Data Integration are charged based on the subscription billing method. You can purchase exclusive resources with appropriate specifications for your exclusive resource group for Data Integration based on your business requirements. For more information, see Performance metrics and pricing of exclusive resource groups for Data Integration.

Network connectivity

A DataWorks resource group is a group of Alibaba Cloud ECS instances. To run nodes, such as Data Integration nodes or data analytics nodes, make sure that resource groups and data sources are connected. In addition, make sure that special security settings such as whitelists do not affect the connections between resource groups and data sources.

  • Network connectivity
    • Data source deployed on the Internet: If your data source is deployed on the Internet, your exclusive resource group can directly access the data source.
    • Data source deployed in a VPC:
      • If your data source resides in the same region as your exclusive resource group, we recommend that you associate the exclusive resource group with the VPC where the data source resides and with a vSwitch that resides in the VPC. Then, the system adds a route for the exclusive resource group. This way, the exclusive resource group can access the data source.
      • If your data source resides in a different region from your exclusive resource group, you can use an Express Connect circuit or VPN gateway to connect the VPC where the exclusive resource group resides to the VPC where the data source resides.
    • Data source deployed in a data center: If your data source is deployed in a data center, you can use an Express Connect circuit or VPN gateway to connect the VPC where the exclusive resource group resides to the data center where the data source resides.
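    • Data source deployed on the classic network: Exclusive resource groups are deployed in VPCs and cannot access a data source that is deployed on the classic network. We recommend that you create a data source that is the same as the original data source in a VPC.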
  • Whitelist configuration: If an IP address whitelist is configured for your data source, you must add the elastic IP address (EIP) of your exclusive resource group for Data Integration or the CIDR block of the vSwitch to which the resource group is bound to the whitelist. For more information, see Add the EIP or CIDR block of an exclusive resource group for Data Integration to an IP address whitelist of a data source.
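
The connectivity rules and the whitelist requirement above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the location labels, and the sample addresses (203.0.113.10 as the EIP, 192.168.0.0/24 as the vSwitch CIDR block) are hypothetical. Only the standard-library ipaddress module is used, and no DataWorks API is called.

```python
import ipaddress


def connection_method(location, same_region=True):
    """Map the location of a data source to the approach described above."""
    if location == "internet":
        return "direct access"
    if location == "vpc":
        if same_region:
            return "associate the resource group with the data source's VPC and a vSwitch in it"
        return "Express Connect circuit or VPN gateway"
    if location == "data_center":
        return "Express Connect circuit or VPN gateway"
    if location == "classic_network":
        return "not accessible; create the same data source in a VPC"
    raise ValueError(f"unknown location: {location}")


def is_whitelisted(source_ip, whitelist):
    """Check whether source_ip is covered by any whitelist entry.
    Each entry is a single IP address or a CIDR block."""
    ip = ipaddress.ip_address(source_ip)
    return any(ip in ipaddress.ip_network(entry, strict=False) for entry in whitelist)


# Hypothetical values: the EIP of the resource group and the CIDR block of its vSwitch.
whitelist = ["203.0.113.10", "192.168.0.0/24"]
print(connection_method("vpc", same_region=False))
print(is_whitelisted("192.168.0.57", whitelist))
```

A check like is_whitelisted can help confirm that an address used by the resource group falls inside the entries that you added to the whitelist of the data source.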

Other instructions

  • Scaling instructions
    • Scale-out
      You can scale out your exclusive resource group for Data Integration based on your business requirements. You pay only for the added resources for the period from the time you scale out the resource group to the time the original order expires.
    • Scale-in
      You can scale in your exclusive resource group for Data Integration based on your business requirements.
  • Specification change instructions
    If the specifications of the current exclusive resource group for Data Integration do not meet your business requirements, you can change the specifications of the exclusive resource group for Data Integration. After you change the specifications of the current resource group, the specifications of all ECS instances in the resource group are changed at the same time.
    Note The specifications of each ECS instance in an exclusive resource group for Data Integration may affect the maximum number of parallel threads that you can configure for a single data synchronization node. If your data synchronization node needs to process large amounts of data and the running duration of the synchronization node is long, you can perform the following operations to shorten the running duration of the synchronization node: upgrade the specifications of ECS instances in the current resource group, adjust the number of threads that can be run in parallel on a single ECS instance, and increase the number of parallel threads that is configured for the synchronization node.
  • Renewal

    You can renew your exclusive resource group for Data Integration before it expires or within 30 days after it expires. If you do not renew the resource group within 15 days after it expires, the system suspends the resource group as soon as the 15-day period ends.

  • Expiration
    • Suspension

      By default, the system sends expiration notifications to the mobile phone number and email address that are bound to your Alibaba Cloud account 14 days, 12 days, and 8 days before your exclusive resource group for Data Integration expires. If you do not renew the resource group within 15 days after it expires, the system suspends the resource group as soon as the 15-day period ends.

    • Release

      After the exclusive resource group for Data Integration expires, it is retained for 15 days. If you do not renew the resource group within 15 days after the resource group is suspended, the system releases the resource group.

      The system sends release notifications to the mobile phone number and email address that are bound to your Alibaba Cloud account one day before the system releases your exclusive resource group for Data Integration.
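
The expiration timeline described above can be worked out with simple date arithmetic. The following Python sketch assumes that suspension happens exactly 15 days after expiration and release exactly 15 days after suspension, which is one reading of the text above. The function name lifecycle and the sample date are hypothetical.

```python
from datetime import date, timedelta


def lifecycle(expiry):
    """Compute the key dates for a resource group that expires on `expiry`
    and is never renewed. Assumes whole-day offsets as described above."""
    suspension = expiry + timedelta(days=15)   # suspended 15 days after expiration
    release = suspension + timedelta(days=15)  # released 15 days after suspension
    return {
        # Notifications are sent 14, 12, and 8 days before expiration.
        "expiration_notices": [expiry - timedelta(days=d) for d in (14, 12, 8)],
        "suspension": suspension,
        # A release notification is sent one day before release.
        "release_notice": release - timedelta(days=1),
        "release": release,
    }


for name, value in lifecycle(date(2024, 1, 1)).items():
    print(name, value)
```

Under these assumptions, the release date falls 30 days after expiration, which matches the 30-day renewal window mentioned under Renewal.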

Note