When you activate DataWorks, the system provides three shared resource groups: the shared resource group for scheduling, the shared resource group for Data Integration (debugging), and the shared resource group for DataService Studio. You can use these resource groups for operations such as data development, node execution, and node testing. Because the shared resource groups are used by multiple tenants, tenants may compete for resources during peak hours. This topic provides an overview of the shared resource groups.

Scenarios

We recommend that you use a shared resource group only if you want to run a small number of nodes and you do not have strict requirements for how quickly output data is generated.

Limits

The shared resource groups are used by multiple tenants. DataWorks cannot ensure that sufficient resources are allocated to each tenant during peak hours.
Note
  • The shared resource group for scheduling can run a maximum of 40 scheduling nodes in parallel. During peak hours from 00:00 to 09:00, nodes may compete for resources in this resource group, and the number of nodes that can run in parallel may drop below 40.
  • The shared resource group for DataService Studio cannot be used for high-frequency, high-concurrency scheduling scenarios.
If you want to ensure dedicated and sufficient resources for your nodes, we recommend that you purchase exclusive resource groups. The following table describes the different types of exclusive resource groups that you can purchase.
Resource group type | Description | References
Exclusive resource group for scheduling | If a large number of nodes must be run in parallel, exclusive computing resources are required to ensure that the nodes are run as scheduled. In this case, we recommend that you use an exclusive resource group for scheduling. | Billing of exclusive resource groups for scheduling (subscription)
Exclusive resource group for Data Integration | If a large number of Data Integration nodes must be run in parallel, exclusive computing resources are required to ensure fast and stable data transmission. In this case, we recommend that you use an exclusive resource group for Data Integration. | Billing of exclusive resource groups for Data Integration (subscription)
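
The limits above can be turned into a quick planning check. The following Python sketch is illustrative only: the function name `classify_shared_scheduling_load` and the three result labels are hypothetical, and treating 09:00 as the first non-peak minute is an assumption. Only the limit of 40 parallel nodes and the 00:00 to 09:00 peak window come from this topic.

```python
from datetime import time

# Values stated in this topic for the shared resource group for scheduling.
MAX_PARALLEL_NODES = 40
PEAK_START = time(0, 0)
PEAK_END = time(9, 0)  # assumption: 09:00 itself is treated as non-peak


def in_peak_hours(run_at: time) -> bool:
    """Return True if run_at falls inside the 00:00 to 09:00 peak window."""
    return PEAK_START <= run_at < PEAK_END


def classify_shared_scheduling_load(parallel_nodes: int, run_at: time) -> str:
    """Hypothetical helper: label a planned workload against the stated limits."""
    if parallel_nodes > MAX_PARALLEL_NODES:
        # More nodes than the shared group can ever run in parallel.
        return "exceeds_limit"
    if in_peak_hours(run_at):
        # Capacity may fall below 40 during peak hours, so no guarantee.
        return "may_be_throttled"
    return "ok"


if __name__ == "__main__":
    print(classify_shared_scheduling_load(50, time(12, 0)))
    print(classify_shared_scheduling_load(10, time(3, 0)))
    print(classify_shared_scheduling_load(10, time(12, 0)))
```

A result other than "ok" suggests that an exclusive resource group for scheduling may be worth evaluating. The check does not measure the actual load on the shared resource group.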

Billing and related operations

1. Billing

Note After you purchase DataWorks, you can use the shared resource groups that are provided by DataWorks. You do not need to separately purchase shared resource groups.
You are charged based on items such as Elastic Compute Service (ECS) instances in the shared resource groups and the data synchronization threads that are used. The shared resource groups support the pay-as-you-go billing method. For more information about the billing of the shared resource groups, see the following topics:

2. Deductions and overdue payments

The settlement method for deductions and overdue payments varies based on the types of shared resource groups in DataWorks. For more information, see Deduction and overdue payments.

Use a shared resource group

To ensure service efficiency, you can select an appropriate type of shared resource group to run nodes for data integration or data development based on your business requirements. For more information about how to use a shared resource group, see Use a shared resource group.

Network connectivity solutions

A DataWorks resource group is a group of Alibaba Cloud ECS instances. To run nodes for data integration or data development, you must make sure that the resource group can connect to the data sources. You must also make sure that special security settings, such as an IP address whitelist, do not block the connections between the resource group and the data sources.

  • Network connectivity
    The network connectivity between a data source and a shared resource group varies based on the network environment of the data source:
    • Shared resource group for scheduling
      • If you want the shared resource group for scheduling to access a public IP address, you must add the public IP address or domain name and port number to a sandbox whitelist on the Workspace Management page. If the shared resource group for scheduling cannot access the public IP address after you perform the preceding operation, we recommend that you use an exclusive resource group for scheduling.
      • You can use the shared resource group for scheduling to access only the data sources for which no IP address whitelist is configured. If you want to use the shared resource group for scheduling to access a data source for which an IP address whitelist is configured or a data source that is deployed in a virtual private cloud (VPC), we recommend that you use an exclusive resource group for scheduling.
      Note We recommend that you use an exclusive resource group for scheduling to access a data source that is deployed on the Internet or in a VPC. For more information about how to use an exclusive resource group for scheduling, see Exclusive resource groups for scheduling.
    • Shared resource group for DataService Studio

      The following table describes the network connectivity between the shared resource group for DataService Studio and data sources in different network environments.

      Network environment | Accessible
      Internet | Yes
      Classic network | Yes
      VPC | No
    • Shared resource group for Data Integration (debugging)

      The shared resource group for Data Integration (debugging) can access data sources over the Internet.

  • Whitelist settings

    The shared resource group for scheduling provides the security sandbox feature for nodes. This feature restricts the IP addresses and domain names that nodes running on the resource group can access. If you want nodes to access an address, you can add the IP address or domain name to the whitelist of the security sandbox. For more information, see Configure security settings.
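
The connectivity rules in this section can be summarized as a small lookup. The following Python sketch is a minimal illustration, not a DataWorks API: the group and network identifiers, `STATED_ACCESS`, `can_access`, and `scheduling_can_access` are all hypothetical names. A result of `None` means that this topic does not state whether access is possible. A result of `True` means only that access is expected to work, because this topic notes that access can still fail after the sandbox whitelist is configured.

```python
from typing import Optional

# Connectivity stated in this topic. A missing key means "not stated here".
STATED_ACCESS = {
    ("dataservice_studio", "internet"): True,
    ("dataservice_studio", "classic"): True,
    ("dataservice_studio", "vpc"): False,
    ("data_integration_debug", "internet"): True,
}


def can_access(group: str, network: str) -> Optional[bool]:
    """Look up the stated connectivity, or None if this topic does not say."""
    return STATED_ACCESS.get((group, network))


def scheduling_can_access(
    network: str,
    source_has_ip_whitelist: bool,
    address_in_sandbox_whitelist: bool,
) -> Optional[bool]:
    """Apply the rules given for the shared resource group for scheduling."""
    if network == "vpc" or source_has_ip_whitelist:
        # An exclusive resource group for scheduling is recommended instead.
        return False
    if network == "internet":
        # The public address must first be added to the sandbox whitelist.
        return address_in_sandbox_whitelist
    return None  # other network environments are not covered by this topic


if __name__ == "__main__":
    print(can_access("dataservice_studio", "vpc"))
    print(scheduling_can_access("internet", False, True))
```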