If a large number of Data Integration nodes must be run in parallel, exclusive computing resources are required to ensure fast and stable data transmission. In this case, we recommend that you use an exclusive resource group for Data Integration. This topic provides an overview of exclusive resource groups for Data Integration.

Scenarios

  • You need to run a large number of Data Integration nodes in parallel and have strict requirements on how quickly the synchronized data is delivered.
  • You want to adjust the sizes of resource groups.
  • You want to access a data source that is deployed on the Internet, in a virtual private cloud (VPC), or in a data center.
  • You want to control access to a data source by configuring an IP address whitelist for the data source.
Note

An exclusive resource group for Data Integration guarantees the number of threads that a single data synchronization instance can run in parallel. It does not guarantee the number of data synchronization instances that can run at the same time. To guarantee the number of instances that can run at the same time, purchase an exclusive resource group for scheduling.

Limits

  • An exclusive resource group for Data Integration is billed based on the subscription billing method. You cannot delete or release an exclusive resource group for Data Integration before the resource group expires. The resource group is suspended and released at the specified points in time after it expires.
  • An exclusive resource group for Data Integration cannot be shared across regions. For example, an exclusive resource group for Data Integration in the China (Shanghai) region can be used only by workspaces in the China (Shanghai) region.
  • An exclusive resource group for Data Integration cannot access data sources that are deployed in the classic network of Alibaba Cloud. If your data source is deployed in the classic network, we recommend that you migrate the data source to the VPC in which the exclusive resource group for Data Integration is deployed.
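The region and classic network limits above can be expressed as a simple pre-check. The following is a minimal sketch, not part of any DataWorks SDK: the function name `limit_violations`, the region strings, and the network labels (`"internet"`, `"vpc"`, `"data_center"`, `"classic"`) are assumptions chosen for illustration.

```python
from typing import List


def limit_violations(group_region: str, workspace_region: str,
                     source_network: str) -> List[str]:
    """Return the limits that a planned setup would violate.

    All argument values are illustrative labels (assumptions), not
    identifiers returned by a real API.
    """
    problems = []
    # A resource group can be used only by workspaces in its own region.
    if group_region != workspace_region:
        problems.append(
            "resource group and workspace are in different regions")
    # Data sources in the classic network cannot be accessed.
    if source_network == "classic":
        problems.append(
            "data source is in the classic network; "
            "migrate it to the VPC of the resource group")
    return problems


if __name__ == "__main__":
    for problem in limit_violations("cn-shanghai", "cn-beijing", "classic"):
        print(problem)
```

An empty list means that neither of the two limits is violated. The check does not cover the subscription limit, which concerns billing rather than configuration.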

Performance metrics

Specifications | Maximum number of parallel threads for a batch synchronization node | Maximum number of parallel real-time synchronization nodes for a single table in a source | Maximum number of parallel real-time synchronization nodes for multiple tables in a source | Maximum number of parallel real-time synchronization nodes for table shards
4c8g | 8 | 3 | 3 | Not supported
8c16g | 16 | 6 | 6 | 1
12c24g | 24 | 9 | 9 | 1
16c32g | 32 | 12 | 12 | 2
24c48g | 48 | 18 | 18 | 3
Note
  • Maximum number of parallel real-time synchronization nodes for a single table in a source and for multiple tables in a source: The configurations of real-time synchronization nodes vary, so the maximum values listed for each specification may not be reached when the nodes actually run. The values in these two columns are for reference only. The actual maximum numbers depend on the resource usage of the resource group.
  • Maximum number of parallel real-time synchronization nodes for table shards: If you want to synchronize data from two or more tables in a source, at least 13 real-time synchronization nodes need to be run in parallel to accomplish the data synchronization. The specifications of the resource group that is used to run these real-time synchronization nodes must be at least 8c16g.
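The performance metrics table can be used as a lookup when you size a resource group. The following is a minimal sketch that copies the table values into a list and returns the smallest specification that meets a requirement. The names `SPECS` and `smallest_spec` are hypothetical, and the real-time values are for reference only, as described in the preceding note.

```python
from typing import Optional

# Values copied from the performance metrics table.
# Columns: specification, batch threads, single-table real-time nodes,
# multi-table real-time nodes, table-shard real-time nodes.
# None stands for "Not supported".
SPECS = [
    ("4c8g", 8, 3, 3, None),
    ("8c16g", 16, 6, 6, 1),
    ("12c24g", 24, 9, 9, 1),
    ("16c32g", 32, 12, 12, 2),
    ("24c48g", 48, 18, 18, 3),
]


def smallest_spec(batch_threads: int = 0,
                  shard_nodes: int = 0) -> Optional[str]:
    """Return the smallest specification that satisfies both requirements,
    or None if no listed specification does."""
    for name, threads, _single, _multi, shards in SPECS:
        if threads < batch_threads:
            continue
        # Skip specifications that cannot run the requested shard nodes.
        if shard_nodes > 0 and (shards is None or shards < shard_nodes):
            continue
        return name
    return None


if __name__ == "__main__":
    print(smallest_spec(batch_threads=20))
    print(smallest_spec(shard_nodes=1))
```

The list is ordered from the smallest to the largest specification, so the first match is the smallest one that fits.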

Billing and related operations

1. Billing

An exclusive resource group for Data Integration is billed based on the subscription billing method. You can purchase an exclusive resource group for Data Integration of appropriate specifications based on your business requirements. For more information, see Billing of exclusive resource groups for Data Integration (subscription).

2. Use an exclusive resource group for Data Integration

To ensure service efficiency, you can purchase an exclusive resource group for Data Integration of appropriate specifications and use the resource group to run Data Integration nodes. To purchase and use an exclusive resource group for Data Integration, perform the following steps:
  1. Create an order to purchase an exclusive resource group for Data Integration.
  2. Configure the exclusive resource group for Data Integration based on your business requirements.
  3. Associate the exclusive resource group for Data Integration with a workspace.
  4. Associate the exclusive resource group for Data Integration with a VPC.
  5. Add the elastic IP address (EIP) of the exclusive resource group for Data Integration or the CIDR block of the vSwitch to which the resource group is bound to the IP address whitelist of a data source.
  6. Application example: Use the created exclusive resource group for Data Integration.
For more information, see Create and use an exclusive resource group for Data Integration.

Network connectivity solutions

Similar to other types of exclusive resource groups, an exclusive resource group for Data Integration is a group of Alibaba Cloud ECS instances. To run Data Integration nodes, you must make sure that the exclusive resource group for Data Integration and the data source are connected. You must also make sure that security settings such as an IP address whitelist do not block the connections between the resource group and the data source.

After you purchase an exclusive resource group for Data Integration, you must associate the resource group with a VPC. Then, you can select a network connectivity solution based on the network environment in which the data source that you want to access is deployed. For more information, see Select a network connectivity solution.

  • Network connectivity solutions
    Network environment | Network connectivity solution
    The data source is deployed on the Internet. | The exclusive resource group for Data Integration that is deployed in a VPC can directly access the data source.
    The data source is deployed in a VPC and resides in the same region as the exclusive resource group for Data Integration. | We recommend that you associate the exclusive resource group for Data Integration with the VPC in which the data source resides and with a vSwitch that resides in the VPC. Then, the system adds a route for the exclusive resource group for Data Integration. This way, the exclusive resource group for Data Integration can access the data source.
    The data source is deployed in a VPC and resides in a different region from that of the exclusive resource group for Data Integration. | Use Express Connect circuits or VPN gateways to connect the data source to the VPC with which the exclusive resource group for Data Integration is associated, and add a route that points to the IP address of the destination data source to ensure the network connection between the data source and the exclusive resource group.
    The data source is deployed in a data center. | Use Express Connect circuits or VPN gateways to connect the data source to the VPC with which the exclusive resource group for Data Integration is associated, and add a route that points to the IP address of the destination data source to ensure the network connection between the data source and the exclusive resource group.
    The data source is deployed in the classic network. | Exclusive resource groups are deployed in VPCs of Alibaba Cloud. If your data source is deployed in the classic network of Alibaba Cloud, your exclusive resource group cannot access the data source. In this case, we recommend that you migrate the data source to the VPC in which your exclusive resource group is deployed.
  • Whitelist settings

    If an IP address whitelist is configured for your data source, you must add the EIP of your exclusive resource group for Data Integration or the CIDR block of the vSwitch to which the resource group is bound to the whitelist. For more information, see Add the EIP or CIDR block of an exclusive resource group for Data Integration to an IP address whitelist of a data source.
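Before you run a node, it can help to confirm that the whitelist of the data source already covers the EIP or the vSwitch CIDR block of the resource group. The following is a minimal sketch that uses the Python standard library `ipaddress` module. The function name `is_covered` and the sample addresses are assumptions for illustration, and the whitelist entries are assumed to be plain IP addresses or CIDR blocks.

```python
import ipaddress
from typing import Iterable


def is_covered(candidate: str, whitelist: Iterable[str]) -> bool:
    """Return True if `candidate` (an EIP or a vSwitch CIDR block)
    falls entirely within at least one whitelist entry."""
    # A single IP address is treated as a one-address network.
    net = ipaddress.ip_network(candidate, strict=False)
    for entry in whitelist:
        allowed = ipaddress.ip_network(entry, strict=False)
        # subnet_of() raises TypeError for mixed IPv4/IPv6, so compare
        # the IP versions first.
        if allowed.version == net.version and net.subnet_of(allowed):
            return True
    return False


if __name__ == "__main__":
    # Sample values only; replace them with your own EIP and CIDR block.
    sample_whitelist = ["203.0.113.10", "192.168.0.0/16"]
    print(is_covered("203.0.113.10", sample_whitelist))
    print(is_covered("192.168.1.0/24", sample_whitelist))
    print(is_covered("10.0.0.5", sample_whitelist))
```

A CIDR block counts as covered only if the whole block lies inside one whitelist entry. A block that is split across several entries is reported as not covered by this sketch.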