You can purchase an exclusive resource group for scheduling based on your business requirements and use the resource group to schedule nodes. Before you use the exclusive resource group for scheduling, you may need to configure network settings and IP address whitelists. This topic describes the process from the purchase of an exclusive resource group for scheduling to the use of the resource group.

Prerequisites

  • You are familiar with the performance and billing of exclusive resource groups for scheduling with specific specifications. The performance of an exclusive resource group for scheduling is measured based on the number of nodes that can be run in parallel. We recommend that you determine the specifications and subscription duration based on your business requirements before you purchase an exclusive resource group for scheduling. For more information, see Billing of exclusive resource groups for scheduling (subscription).
  • Optional. You must understand the solutions for network connectivity between an exclusive resource group for scheduling and a data source or compute engine instance in the following cases: the resource group needs to interact with a data source (for example, a Shell node accesses a self-managed database or a private IP address in a scheduling scenario), or the resource group runs a node that uses an E-MapReduce (EMR) or CDH compute engine instance. You must also understand the precautions to take when you configure the IP address whitelist of a data source. For more information about the network connectivity solutions for different scenarios and how to configure the IP address whitelist of a data source, see Overview.
    Note If you do not need to connect an exclusive resource group to a data source and only want to resolve node delays caused by insufficient resources in the shared resource group for scheduling, you can ignore the network configuration described in this topic. In this case, you can purchase an exclusive resource group for scheduling in any zone, and you do not need to configure network settings when you use the resource group.

Procedure

To purchase and use an exclusive resource group for scheduling, you must perform the following steps.
Step | Description | References
1 | Create an order to purchase exclusive resources for scheduling. Exclusive resources for scheduling are charged based on the subscription billing method. | Create an order to purchase exclusive scheduling resources
2 | Create an exclusive resource group for scheduling in the DataWorks console based on the order ID. | Create an exclusive resource group for scheduling based on the purchase order ID
3 | Associate the exclusive resource group for scheduling with a workspace based on your business requirements. After an exclusive resource group is created, the resource group does not belong to any workspace. Therefore, you must associate the exclusive resource group with a workspace. | Associate the exclusive resource group for scheduling with a workspace
4 | If you want to use the exclusive resource group for scheduling to access a data source that is deployed in a virtual private cloud (VPC), associate the resource group with this VPC or a VPC that connects to the data source. | (Optional) Associate the exclusive resource group for scheduling with a VPC
5 | If the access of the exclusive resource group for scheduling to the data source is restricted by the IP address whitelist of the data source, add the elastic IP address (EIP) of the resource group or the CIDR block of the vSwitch with which the resource group is associated to the IP address whitelist. | (Optional) Configure the IP address whitelist of a data source
6 | Test the network connectivity between the exclusive resource group for scheduling and the data source on the Data Source page. This ensures that a data synchronization solution or node that uses the data source can be normally configured. | (Optional) Test the network connectivity of the exclusive resource group for scheduling
7 | Use the exclusive resource group for scheduling to run a data synchronization node. If you want to use the exclusive resource group for scheduling to run a data synchronization node in the workspace with which the resource group is associated, you must manually select the resource group when you configure the node in the workspace. | Change the exclusive resource group for scheduling
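
The optional steps in the preceding procedure depend on how the resource group reaches the data source. The following Python sketch shows one way to work out which steps apply to a given situation. The Scenario class and its three flags are hypothetical names introduced for illustration and are not DataWorks settings. The sketch also assumes that the connectivity test (step 6) is performed whenever a data source is involved, although the procedure marks that step as optional.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Scenario:
    """Hypothetical flags that describe a deployment; not DataWorks settings."""
    needs_data_source_access: bool = False
    data_source_in_vpc: bool = False
    data_source_has_whitelist: bool = False


def required_steps(scenario: Scenario) -> List[int]:
    """Return the step numbers from the procedure table that apply."""
    steps = [1, 2, 3]  # purchase, create, and associate with a workspace
    if scenario.needs_data_source_access:
        if scenario.data_source_in_vpc:
            steps.append(4)  # associate the resource group with a VPC
        if scenario.data_source_has_whitelist:
            steps.append(5)  # add the EIP or vSwitch CIDR block to the whitelist
        steps.append(6)      # test connectivity (assumed here, optional in the table)
    steps.append(7)          # select the resource group for the node
    return steps


print(required_steps(Scenario()))
print(required_steps(Scenario(True, True, True)))
```

With no data source access required, only steps 1, 2, 3, and 7 remain, which matches the note in the Prerequisites section about ignoring the network configuration.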

Create an order to purchase exclusive scheduling resources

  1. Log on to the DataWorks console by using your Alibaba Cloud account and go to the buy page for exclusive scheduling resources.
  2. On the buy page, configure the parameters based on your business requirements. Take note of the following items when you configure the parameters:
    • Region: Select the region in which you want to use the exclusive scheduling resources.
      Note An exclusive resource group for scheduling cannot be shared across regions. For example, an exclusive resource group for scheduling in the China (Shanghai) region can be used only by workspaces in the China (Shanghai) region.
    • Type: Select Exclusive Resources for Scheduling.
    You can configure other parameters such as Duration based on your business requirements.
    Note You can purchase a maximum of 20 Elastic Compute Service (ECS) instances for each exclusive resource group for scheduling, and the ECS instances must be of the same specifications.
  3. Click Buy Now and follow the on-screen instructions to complete the payment.
    After you complete the payment, you can view the details of the purchase order, such as the order ID, on the Orders page. You can then create an exclusive resource group for scheduling in the DataWorks console based on the order ID.

Create an exclusive resource group for scheduling based on the purchase order ID

  1. Log on to the DataWorks console.
  2. In the left-side navigation pane, click Resource Groups. On the Exclusive Resource Groups tab of the Resource Groups page, click Create Resource Group for Scheduling. In the Create a dedicated resource group panel, configure the parameters. The following table describes the parameters.
    Parameter | Description
    Resource Group Type | The type of the exclusive resource group that you want to create. Valid values: Exclusive Resource Group for Scheduling and Exclusive Resource Group for Data Integration. The value Exclusive Resource Groups indicates exclusive resource groups for scheduling. An exclusive resource group for scheduling is used to schedule common nodes. An exclusive resource group for Data Integration is used by data synchronization nodes to synchronize data.
    Resource Group Name | The name of the exclusive resource group for scheduling. The name must be unique within a tenant. Otherwise, an error is reported when you click OK.
    Note
    • A tenant refers to an Alibaba Cloud account. Each tenant can have multiple RAM users.
    • The resource group name can be up to 128 characters in length. It can contain only letters, digits, and underscores (_), must start with a letter, and cannot contain Chinese characters.
    Resource Group Description | The description of the exclusive resource group for scheduling.
    Order Number | The order ID of the exclusive scheduling resources that you purchased. You can select the order ID from the Order Number drop-down list.
    Note If you want to purchase more exclusive scheduling resources, click Purchase next to Order Number. After you complete the purchase, go back to the Create a dedicated resource group panel and create an exclusive resource group for scheduling based on the purchased exclusive scheduling resources.
  3. Click OK. DataWorks starts to initialize the exclusive resource group for scheduling. When the resource group enters the Running state, the resource group is created in the DataWorks console.
    Note DataWorks requires approximately 20 minutes to initialize the exclusive resource group for scheduling. Wait until the status of the resource group changes to Running.
After the exclusive resource group for scheduling is created in the DataWorks console, you must associate the resource group with a workspace. This way, you can select the resource group when you configure a node in the workspace.
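
The naming rule for Resource Group Name can be checked before you open the console. The following Python sketch encodes the documented constraints (at most 128 characters; letters, digits, and underscores only; starts with a letter) as a regular expression. It assumes that "letters" means ASCII letters, which is consistent with the rule that Chinese characters are not allowed. The function name is hypothetical, and the check cannot verify that the name is unique within the tenant.

```python
import re

# One leading ASCII letter followed by up to 127 letters, digits, or underscores,
# which gives a maximum total length of 128 characters.
_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,127}")


def is_valid_resource_group_name(name: str) -> bool:
    """Return True if the name satisfies the documented character rules."""
    return _NAME_RE.fullmatch(name) is not None


print(is_valid_resource_group_name("etl_group_01"))  # starts with a letter
print(is_valid_resource_group_name("1_group"))       # starts with a digit
```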

Associate the exclusive resource group for scheduling with a workspace

You must associate the exclusive resource group for scheduling with a workspace before you can select the resource group in the workspace. An exclusive resource group for scheduling can be shared among multiple workspaces but cannot be shared across regions. For example, you can associate an exclusive resource group for scheduling in the China (Shanghai) region only with workspaces in the China (Shanghai) region. To associate an exclusive resource group for scheduling with a workspace, perform the following steps:

  1. Log on to the DataWorks console.
  2. On the Exclusive Resource Groups tab of the Resource Groups page, find the created resource group and click Change Workspace in the Actions column.
  3. In the Modify home workspace dialog box, find the workspace with which you want to associate the resource group and click Bind in the Actions column.
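
The region constraint described in this section can be expressed as a simple filter. The following Python sketch is illustrative only: the Workspace class, the workspace names, and the region identifiers are assumptions introduced for the example and do not come from a DataWorks API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Workspace:
    """Hypothetical record of a workspace and the region in which it resides."""
    name: str
    region: str


def associable_workspaces(resource_group_region: str,
                          workspaces: List[Workspace]) -> List[Workspace]:
    """Keep only the workspaces in the same region as the resource group.

    A resource group can be associated with several workspaces,
    so the result may contain more than one entry.
    """
    return [w for w in workspaces if w.region == resource_group_region]


workspaces = [
    Workspace("sales_dev", "cn-shanghai"),
    Workspace("sales_prod", "cn-shanghai"),
    Workspace("ops_prod", "cn-beijing"),
]
print([w.name for w in associable_workspaces("cn-shanghai", workspaces)])
```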

(Optional) Associate the exclusive resource group for scheduling with a VPC

If an interaction is required between your exclusive resource group for scheduling and a data source (for example, you need to use a Shell node to access a self-managed database or a private IP address in a scheduling scenario) or if an exclusive resource group for scheduling is required to run a node that uses an E-MapReduce (EMR) or CDH compute engine instance, you must associate the exclusive resource group for scheduling with a VPC and configure the IP address whitelist of the data source.

Exclusive resource groups are deployed in the VPC in which DataWorks is hosted. You must associate your exclusive resource group with the VPC that you need to use to allow the resource group to access your data source. To associate an exclusive resource group for scheduling with a VPC, perform the following steps:
Notice You can associate an exclusive resource group for scheduling that uses the specifications of 4 vCPUs and 8 GiB of memory with a maximum of two VPCs. You can associate an exclusive resource group for scheduling that uses other specifications with a maximum of three VPCs.
  1. Log on to the DataWorks console.
  2. In the left-side navigation pane, click Resource Groups. On the Exclusive Resource Groups tab of the Resource Groups page, find the created exclusive resource group for scheduling and click Network Settings in the Actions column. On the VPC Binding tab of the page that appears, you can associate the resource group with a VPC.
    Before you associate the exclusive resource group for scheduling with a VPC, you must use your Alibaba Cloud account to grant permissions on your cloud resources to DataWorks in the RAM console. This way, DataWorks can access the cloud resources.
  3. Associate the exclusive resource group with a VPC.
    1. On the VPC Binding tab, click Add Binding. In the Add VPC Binding panel, configure the parameters.
      The following list describes the parameters that you must configure. The settings vary based on whether the resource group and the data source belong to the same Alibaba Cloud account and reside in the same region.
      • VPC
        If your data source and the exclusive resource group belong to the same Alibaba Cloud account, we recommend that you set this parameter to the VPC in which your data source resides.
        If your data source and the exclusive resource group belong to different Alibaba Cloud accounts, set this parameter based on the description for the scenario where your data source and the exclusive resource group reside in different regions.
        If your data source and the exclusive resource group reside in different regions, you can click Create VPC to create a VPC for the exclusive resource group. For example, if your data source does not reside in a VPC, you can create a VPC for the exclusive resource group. After you create the VPC, you can select it from the VPC drop-down list. You can also select a VPC that connects to your data source.
        Note If you click Create VPC to create a VPC for the exclusive resource group, you must connect the created VPC to the VPC in which your data source resides by using Express Connect circuits or VPN gateways, and manually add a route that points to the IP address of your data source to ensure network connectivity between the exclusive resource group and your data source.
      • Zone
        Same region and Alibaba Cloud account: Select the zone in which your data source resides.
        Different regions or Alibaba Cloud accounts: Select a zone from which a network connection to your data source is established.
      • vSwitch
        Same region and Alibaba Cloud account: If you set the VPC parameter to the VPC in which your data source resides, we recommend that you select the vSwitch to which the data source is connected.
        Note After you associate the exclusive resource group with the VPC in which the data source resides and a vSwitch that resides in the VPC, a route that points to the CIDR block of the VPC is automatically added. This ensures that the exclusive resource group can access the data sources in this VPC.
        Different regions or Alibaba Cloud accounts: Set this parameter to the vSwitch that can connect to your data source. If no vSwitch is available, you can click Create VSwitch to create a vSwitch for the exclusive resource group. After a vSwitch is created, select the vSwitch.
      • Security Groups
        Security groups allow or deny access to the exclusive resource group over the Internet or an internal network. You can select an existing security group based on your business requirements, or click Create Security Group on the right side of this parameter to create a security group for the resources in the exclusive resource group. For more information about how to create a security group, see Add a security group rule.
    2. Click OK.
    Note If your data source and the exclusive resource group belong to different regions or Alibaba Cloud accounts, after you associate the exclusive resource group with a VPC, you must add a route that points to the IP address of your data source.
  4. Optional: Add host configurations.
    You may fail to access your data source by using IP addresses. For example, you can access your data source only by using hostnames. In this case, you must perform the following steps to add host configurations. Otherwise, the connectivity test fails when you add the data source by using its hostnames.
    1. Click the Hostname-to-IP Mapping tab. Then, click Add. In the Create Hostname-to-IP Mapping dialog box, configure the parameters. The following table describes the parameters.
      Parameter | Description
      IP Address | The actual IP address of the data source.
      The hostname | The hostname that is used to access the data source. If you want to specify multiple hostnames, place each hostname on a separate line.
      Note The domain name can contain digits, letters, hyphens (-), and periods (.). It must start with a letter and end with a letter or digit.
    2. If the data source has multiple IP addresses, click Add to add more host configurations.
      Note
      • The IP addresses or hostnames that are added in a host configuration must be different from the IP addresses or hostnames in existing host configurations.
      • You can map one IP address to multiple hostnames in a host configuration. However, one hostname can point to only one IP address.
  5. Optional: Add Domain Name System (DNS) configurations.
    You may fail to access your data source by using IP addresses. For example, you can access your data source only by using the domain name of a Server Load Balancer (SLB) instance, and an internal DNS server resolves the domain name to IP addresses of your data source. In this case, you must perform the following steps to add DNS configurations. Otherwise, the connectivity test fails when you add the data source by using its DNS configuration.
    Note If a domain name that is added in a host configuration is also configured in a DNS configuration, the system preferentially uses the host configuration to access the data source.
    1. Click the DNS Configuration tab. Then, click Add. After you configure the parameters for a DNS configuration, click Save. The following table describes the parameters.
      Parameter | Description
      Domain | Optional. If you can use the same second-level domain to access your data sources, set this parameter to the second-level domain.
        For example, the domain name that is used to access data source 1 is domain1.example.com, and the domain name that is used to access data source 2 is domain2.example.com. In this example, we recommend that you set this parameter to example.com.
        Note The domain name can contain digits, letters, hyphens (-), and periods (.). It must start with a letter and end with a letter or digit.
      NameServer | Enter the IP address of the DNS server that resolves the domain name of the data source. If you want to specify multiple DNS servers, place the IP address of each DNS server on a separate line.
    2. To modify an existing DNS configuration, click Modify in the lower-left corner.
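
The rules for host configurations can be checked offline before you enter them on the Hostname-to-IP Mapping tab. The following Python sketch validates hostnames against the documented character rule and reports any hostname that points to more than one IP address. The function names, the dictionary layout (IP address mapped to a list of hostnames), and the sample addresses are assumptions made for illustration and do not reflect how DataWorks stores these settings.

```python
import ipaddress
import re
from typing import Dict, List

# Starts with a letter; may contain letters, digits, hyphens, and periods;
# ends with a letter or digit. A single letter is also accepted.
_HOSTNAME_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9.-]*[A-Za-z0-9])?")


def is_valid_hostname(hostname: str) -> bool:
    return _HOSTNAME_RE.fullmatch(hostname) is not None


def check_host_configs(configs: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems found in {ip_address: [hostname, ...]}."""
    problems: List[str] = []
    owner: Dict[str, str] = {}  # hostname -> IP address that already claims it
    for ip, hostnames in configs.items():
        try:
            ipaddress.ip_address(ip)
        except ValueError:
            problems.append(f"invalid IP address: {ip}")
        for hostname in hostnames:
            if not is_valid_hostname(hostname):
                problems.append(f"invalid hostname: {hostname}")
            elif hostname in owner and owner[hostname] != ip:
                # One hostname can point to only one IP address.
                problems.append(
                    f"hostname {hostname} maps to both {owner[hostname]} and {ip}"
                )
            else:
                owner[hostname] = ip
    return problems


configs = {
    "192.168.0.10": ["db1.internal", "db1-alias.internal"],  # one IP, two hostnames
    "192.168.0.11": ["db1.internal"],                        # conflicts with the first entry
}
for problem in check_host_configs(configs):
    print(problem)
```

Keying the dictionary by IP address keeps each IP address in a single host configuration, while the owner map enforces the rule that one hostname resolves to only one IP address.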

(Optional) Configure the IP address whitelist of a data source

An exclusive resource group for scheduling may still fail to access your data source even if the resource group and your data source reside in the same zone and are associated with the same VPC and vSwitch. This failure occurs if the access of the resource group to the data source is denied by the IP address whitelist of the data source. Configure the IP address whitelist of your data source based on the following instructions:
  • If the exclusive resource group for scheduling accesses your data source over an internal network, you must add the CIDR block of the vSwitch with which the resource group is associated to the IP address whitelist of your data source.
    To view the CIDR block of the vSwitch with which the exclusive resource group for scheduling is associated, perform the following steps: Log on to the DataWorks console and click Resource Groups in the left-side navigation pane. On the Exclusive Resource Groups tab of the Resource Groups page, find the resource group and click Network Settings in the Actions column. On the VPC Binding tab of the page that appears, you can view the CIDR block in the VSwitch CIDR Block column.
  • If the exclusive resource group for scheduling accesses your data source over the Internet, you must add the elastic IP address (EIP) of the resource group to the IP address whitelist of your data source.
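
To see why a whitelist entry must cover the address that the resource group uses, the following Python sketch checks a source IP address against a list of whitelist entries by using the standard ipaddress module. The CIDR block, the private address, and the EIP are placeholder values, and the is_allowed function is a hypothetical helper. The matching logic of a real data source may differ.

```python
import ipaddress
from typing import Iterable


def is_allowed(source_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if source_ip falls within any whitelist entry.

    Each entry can be a single IP address or a CIDR block.
    """
    address = ipaddress.ip_address(source_ip)
    return any(
        address in ipaddress.ip_network(entry, strict=False)
        for entry in whitelist
    )


# Internal network: whitelist the CIDR block of the vSwitch (placeholder value).
vswitch_whitelist = ["192.168.10.0/24"]
print(is_allowed("192.168.10.25", vswitch_whitelist))  # address inside the vSwitch
print(is_allowed("192.168.20.25", vswitch_whitelist))  # address in another CIDR block

# Internet: whitelist the EIP of the resource group (placeholder value).
eip_whitelist = ["203.0.113.7"]
print(is_allowed("203.0.113.7", eip_whitelist))
```

Adding the whole vSwitch CIDR block, rather than a single private address, covers any address that the resource group uses within that vSwitch.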

(Optional) Test the network connectivity of the exclusive resource group for scheduling

After you complete the preceding network configuration, you need to test the network connectivity between the resource group and your data source by performing the following operations:

  1. On the Workspaces page, find the desired workspace, move the pointer over the Procedure icon in the Actions column, and then select Workspace Settings. In the Workspace Settings panel, click More. The Workspace Management page appears.
  2. Find the desired data source and click Edit in the Actions column.
  3. Select Schedule for Resource Group connectivity in the dialog box that appears, find the exclusive resource group for scheduling that you want to use, and then click Test connectivity in the Actions column. If the connectivity status is Connected, the resource group is connected to the data source.
    Note For more information about the solutions for network connectivity between an exclusive resource group and data sources that reside in various network environments, see Establish a network connection between a resource group and a data source.
  4. Click Complete.

Change the exclusive resource group for scheduling

  • Change the exclusive resource group for scheduling that is used to test a node on the DataStudio page
    1. On the DataStudio page, find the node for which you want to change the resource group and double-click the node name. The configuration tab of the node appears.
    2. Click the Run with Parameters icon on the top toolbar.
    3. In the Parameters dialog box, set the Resource Group parameter to the exclusive resource group for scheduling that you want to use to test the node.
    4. Click Create.
  • Change the exclusive resource group for scheduling that is used to schedule a node on the DataStudio page
    1. On the DataStudio page, find the node for which you want to change the resource group and double-click the node name. The configuration tab of the node appears.
    2. In the right-side navigation pane, click the Properties tab. In the Resource Group section of the Properties tab, select the exclusive resource group for scheduling that you want to use from the Resource Group drop-down list. For more information, see Configure a resource group.
  • Change the exclusive resource group for scheduling in Operation Center
    1. In the left-side navigation pane of the Operation Center page, choose Cycle Task Maintenance > Cycle Task.
    2. Click the rightward arrow in the middle of the Cycle Task page to show the node list. Find the node for which you want to change the resource group, click More in the Actions column, and then select Modify Scheduling Resource Group.
      Notice You cannot change the resource group for zero load nodes, workflow nodes, or Machine Learning experiment nodes.
      To change the exclusive resource group for scheduling for multiple nodes at a time, select the nodes on the Cycle Task page and click Modify Scheduling Resource Group in the lower part of the page.
    3. In the Modify Scheduling Resource Group dialog box, select the exclusive resource group for scheduling that you want to use from the New Resource Group drop-down list and click OK.