Exclusive resource groups for Data Integration can be used to run data synchronization nodes and can ensure fast and stable data transmission. Exclusive resource groups for Data Integration can also be used to synchronize data in complex network environments. Before you use the exclusive resource group for Data Integration that you purchased, you may need to perform operations such as configuring network settings and IP address whitelists. This topic describes the process from the purchase of an exclusive resource group for Data Integration to the use of the resource group.
Prerequisites
- You are familiar with the performance and billing of an exclusive resource group for Data Integration with specific specifications. The performance of an exclusive resource group for Data Integration is measured based on the number of nodes that can be run in parallel. We recommend that you determine the specifications and subscription duration based on your business requirements before you purchase an exclusive resource group for Data Integration. For more information, see Billing of exclusive resource groups for Data Integration (subscription).
- Before you use Data Integration to synchronize data, you must make sure that network connections are established between your resource group and the data sources and that the resource group is allowed to access the data sources. If the network connections are not established, your data synchronization node cannot be run. For more information about solutions for the network connectivity between an exclusive resource group for Data Integration and a data source, and precautions for configuring the IP address whitelist of a data source, see Exclusive resource groups for Data Integration.
- You have a good command of the use scenarios of exclusive resource groups for Data Integration. For more information, see Scenarios of exclusive resource groups for Data Integration.
Procedure
Step | Description | References |
---|---|---|
1 | Create an order to purchase exclusive resources for Data Integration and create an exclusive resource group for Data Integration based on the purchase order ID. Exclusive resource groups for Data Integration are charged based on the subscription billing method. | Create an exclusive resource group for Data Integration |
2 | Associate the exclusive resource group for Data Integration with a workspace based on your business requirements. After an exclusive resource group is created, the resource group does not belong to any workspace. Therefore, you must associate the exclusive resource group with a workspace. | Associate the exclusive resource group for Data Integration with a workspace |
3 | If you want to use the exclusive resource group for Data Integration to access a data source that resides in a virtual private cloud (VPC), associate the resource group with this VPC or a VPC that connects to the data source. | Associate the exclusive resource group with a VPC |
4 | If the access of the exclusive resource group for Data Integration to the data source is restricted by the IP address whitelist of the data source, add the elastic IP address (EIP) of the resource group or the CIDR block of the vSwitch with which the resource group is associated to the IP address whitelist. | Configure the IP address whitelist of a data source |
5 | Test the network connectivity between the exclusive resource group for Data Integration and the data source on the Data Source page. This ensures that a data synchronization solution or node that uses the data source can be normally configured. | Test the network connectivity of the exclusive resource group for Data Integration |
6 | Use the exclusive resource group for Data Integration to run a data synchronization node. If you want to use the exclusive resource group for Data Integration to run a data synchronization node in the workspace with which the resource group is associated, you must manually select the resource group when you configure the node in the workspace. | Use the exclusive resource group for Data Integration to run nodes |
7 | View and monitor the resource usage of the exclusive resource group for Data Integration and the number of instances that are waiting for resources in the resource group. | View the resource usage of the exclusive resource group for Data Integration and monitor the resource group |
Precautions
- Only an Alibaba Cloud account or a RAM user to which the AliyunBSSOrderAccess and AliyunDataWorksFullAccess policies are attached can create a resource group.
- Only a workspace administrator can associate a resource group with a workspace and change the workspace with which a resource group is associated.
- For information about the permissions that are required to use the features and perform operations on the Resource Groups page of the DataWorks console, see Policies that can be used to manage permissions on resource groups.
- For information about how to create a custom policy and attach the custom policy to a RAM user, see (Optional) Create a custom policy.
- You can associate an exclusive resource group for Data Integration that uses the specifications of 4 vCPUs and 8 GiB of memory with a maximum of two VPCs. You can associate an exclusive resource group for Data Integration that uses the other specifications with a maximum of three VPCs.
- You can purchase a maximum of 20 Elastic Compute Service (ECS) instances for each exclusive resource group for Data Integration, and the ECS instances must be of the same specifications.
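The VPC and ECS instance limits above can be checked before you place an order. The following is a minimal sketch in Python, not part of any DataWorks SDK: `ResourceGroupPlan`, `max_vpcs`, and `check_plan` are hypothetical names introduced only for illustration, and the plan holds a single specification because all ECS instances in a resource group must be of the same specifications.

```python
from dataclasses import dataclass

# Limit stated in the precautions above.
MAX_ECS_INSTANCES = 20


@dataclass(frozen=True)
class ResourceGroupPlan:
    """Hypothetical description of a planned resource group (not a DataWorks API object)."""
    vcpus: int           # vCPUs per ECS instance
    memory_gib: int      # memory per ECS instance, in GiB
    instance_count: int  # number of ECS instances, all of the same specifications
    vpc_count: int       # number of VPCs you plan to associate


def max_vpcs(vcpus: int, memory_gib: int) -> int:
    """Return the VPC limit: two for 4 vCPUs and 8 GiB, three for other specifications."""
    return 2 if (vcpus, memory_gib) == (4, 8) else 3


def check_plan(plan: ResourceGroupPlan) -> list[str]:
    """Return a list of problems; an empty list means the plan respects the limits."""
    problems = []
    if not 1 <= plan.instance_count <= MAX_ECS_INSTANCES:
        problems.append(
            f"instance_count must be between 1 and {MAX_ECS_INSTANCES}, got {plan.instance_count}"
        )
    limit = max_vpcs(plan.vcpus, plan.memory_gib)
    if plan.vpc_count > limit:
        problems.append(f"at most {limit} VPCs are allowed, got {plan.vpc_count}")
    return problems


if __name__ == "__main__":
    print(check_plan(ResourceGroupPlan(vcpus=4, memory_gib=8, instance_count=2, vpc_count=3)))
```

An empty list means that the plan stays within the documented limits. The check does not verify account permissions or billing.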
Create an exclusive resource group for Data Integration
- Log on to the DataWorks console.
- In the left-side navigation pane, click Resource Groups. On the Exclusive Resource Groups tab of the Resource Groups page, click Create Resource Group for Data Integration. Then, configure the parameters in the Create a dedicated resource group panel and on the DataWorks Exclusive Resources page. The following table describes the parameters that need to be configured.
You can configure other parameters based on your business requirements.
Parameter | Description |
---|---|
Region | The region in which you want to use the exclusive resource group. Note An exclusive resource group cannot be shared across regions. For example, exclusive resource groups in the China (Shanghai) region can be used only by the workspaces in the China (Shanghai) region. |
Resource Group Type | The type of the exclusive resource group. Select Exclusive Resource Group for Data Integration for this parameter. |
Resource Group Name | The name of the exclusive resource group for Data Integration. The name must be unique within a tenant. Otherwise, an error is reported when you click OK. Note A tenant refers to an Alibaba Cloud account. Each tenant can have multiple RAM users. |
Resource Group Description | The description of the exclusive resource group for Data Integration. |
Duration | Exclusive resource groups for Data Integration are charged based on the subscription billing method. To ensure service continuity, we recommend that you select Auto Renewal. You can also go to the Renewal Management page to enable or disable auto renewal after the resource group is created. For more information, see General reference: Disable auto-renewal for subscription resources. |
Note You can purchase a maximum of 20 Elastic Compute Service (ECS) instances for each exclusive resource group for Data Integration, and the ECS instances must be of the same specifications.
- After you configure the parameters in the Create a dedicated resource group panel, click OK. Then, DataWorks starts to initialize the resource group. When the resource group enters the Running state, the resource group is created in the DataWorks console.
Note DataWorks requires approximately 20 minutes to initialize the exclusive resource group for Data Integration. Wait until the status of the resource group changes to Running.
Associate the exclusive resource group for Data Integration with a workspace
You must associate the exclusive resource group for Data Integration with a workspace before you can select the resource group in the workspace. An exclusive resource group for Data Integration can be shared among multiple workspaces but cannot be used across regions. For example, you can associate an exclusive resource group for Data Integration in the China (Shanghai) region only with workspaces in the China (Shanghai) region. To associate an exclusive resource group for Data Integration with a workspace, perform the following steps:
- Log on to the DataWorks console.
- On the Exclusive Resource Groups tab of the Resource Groups page, find the created resource group and click Change Workspace in the Actions column.
- In the Modify home workspace dialog box, find the workspace with which you want to associate the resource group and click Bind in the Actions column.
Associate the exclusive resource group with a VPC
- Log on to the DataWorks console.
- In the left-side navigation pane, click Resource Groups. On the Exclusive Resource Groups tab of the Resource Groups page, find the created resource group and click Network Settings in the Actions column. On the page that appears, you can associate the resource group with a VPC. Before you associate the exclusive resource group with a VPC, you must log on to the RAM console with your Alibaba Cloud account and authorize DataWorks to access your cloud resources. You can go to the Cloud Resource Access Authorization page to authorize DataWorks to access your cloud resources. You can also authorize DataWorks to access your cloud resources by clicking the related button in the dialog box that is displayed the first time you log on to the DataWorks console with your Alibaba Cloud account.
- Associate the exclusive resource group with a VPC. Note If your data source and the exclusive resource group reside in different regions or belong to different Alibaba Cloud accounts, you must add a route that points to the IP address of your data source after you associate the exclusive resource group with a VPC.
- Optional: Add host configurations. If your data source can be accessed only by using hostnames and not by using IP addresses, you must add host configurations. Otherwise, the connectivity test fails when you add the data source by using its hostname.
Configure the IP address whitelist of a data source
- If the exclusive resource group accesses your data source over an internal network, you must add the CIDR block of the vSwitch with which the resource group is associated to the IP address whitelist of your data source. To view the CIDR block of the vSwitch with which the resource group is associated, perform the following operations: Log on to the DataWorks console and click Resource Groups in the left-side navigation pane. On the Exclusive Resource Groups tab, find the exclusive resource group and click Network Settings in the Actions column. On the VPC Binding tab, you can view the CIDR block in the VSwitch CIDR Block column.
- If the exclusive resource group accesses your data source over the Internet, you must add the elastic IP address (EIP) of the resource group to the IP address whitelist of your data source.
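The choice between the two whitelist entries can be sketched with the Python standard library `ipaddress` module. This is an illustrative helper only. The function names, the `"internal"` and `"internet"` labels, and the sample addresses are assumptions and do not come from DataWorks.

```python
import ipaddress


def whitelist_entry(access_path: str, vswitch_cidr: str = "", eip: str = "") -> str:
    """Return the value to add to the data source whitelist.

    access_path is "internal" or "internet" (labels are assumptions for this sketch).
    """
    if access_path == "internal":
        # Internal network: whitelist the CIDR block of the associated vSwitch.
        return str(ipaddress.ip_network(vswitch_cidr, strict=True))
    if access_path == "internet":
        # Internet: whitelist the EIP of the resource group.
        return str(ipaddress.ip_address(eip))
    raise ValueError(f"unknown access path: {access_path!r}")


def is_covered(source_ip: str, whitelist: list[str]) -> bool:
    """Check whether source_ip matches any whitelist entry (single IP or CIDR block)."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in whitelist)


if __name__ == "__main__":
    entry = whitelist_entry("internal", vswitch_cidr="192.168.0.0/24")  # sample CIDR block
    print(entry, is_covered("192.168.0.15", [entry]))
```

`is_covered` only helps you confirm that an address falls within the entries you plan to add. The whitelist itself is still configured on the data source side.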
Test the network connectivity of the exclusive resource group for Data Integration
After you complete the preceding network configuration, you need to test the network connectivity between the resource group and your data source by performing the following operations:
- On the Workspaces page, find the desired workspace, move the pointer over the icon in the Actions column, and then select Workspace Settings. In the Workspace Settings panel, click More. The Workspace Management page appears.
- In the left-side navigation pane, click Data Source to go to the Data Source page.
- Find the desired data source and click Edit in the Operation column.
- In the dialog box that appears, select Data Integration for Resource Group connectivity. Find the exclusive resource group for Data Integration and click Test connectivity in the Actions column. If the connectivity status is Connected, a network connection is established between the resource group and data source. Note If a network connection cannot be established between the resource group and data source, click Troubleshoot in the Connectivity status column to choose a diagnostic tool to diagnose network connection exceptions. For more information about the solutions for network connectivity between an exclusive resource group and data sources that reside in various network environments, see Establish a network connection between a resource group and a data source.
- Click Complete.
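If the connectivity test in the console fails, a plain TCP check run from a machine in the same VPC can help you determine whether the problem is basic reachability. The following sketch is not what the Test connectivity button runs. It only attempts a TCP connection, and `can_connect` and the sample hostname and port are assumptions for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout_seconds: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint: replace with the address and port of your data source.
    print(can_connect("db.example.internal", 3306))
```

A successful TCP connection shows only that the route and the whitelist allow traffic. It does not validate credentials or database permissions.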
Use the exclusive resource group for Data Integration to run nodes
After an exclusive resource group for Data Integration is created and configured, you can change the resource group used by nodes to the newly created exclusive resource group for Data Integration by using one of the following methods.
Environment for the operation | Supported change operation | Entry point |
---|---|---|
Production environment | Change the resource groups for Data Integration for multiple nodes in the production environment at the same time | Go to the Operation Center page and open the Cycle Task page from the left-side navigation pane. Select the nodes for which you want to change the resource groups, click More in the lower part of the Cycle Task page, and then select Modify Data Integration Resource Group. |
Development environment | | Go to the DataStudio page. Note If you cannot find the entry point of changing the resource groups for nodes, you can select Offline synchronization from the Node Type drop-down list in the filter condition section to search for all batch synchronization nodes. |
View the resource usage of the exclusive resource group for Data Integration and monitor the resource group
You can view the resource usage of the exclusive resource group for Data Integration and the number of instances that are waiting for resources in the resource group in the DataWorks console. You can also use the intelligent monitoring feature provided in Operation Center to monitor the resource usage of the resource group and the number of instances that are waiting for resources in the resource group. For more information about how to view the resource usage of a resource group, see View the resource usage of an exclusive resource group. For more information about how to monitor a resource group, see Create a custom alert rule.