DataWorks introduces serverless resource groups to streamline resource management and unify the user experience across its features. These groups encompass the functionalities of the original exclusive resource groups for scheduling, Data Integration, and DataService Studio. A single serverless resource group enables you to carry out operations such as data synchronization, task scheduling and execution, and API management. This document guides you through the creation and utilization of serverless resource groups.
Prerequisites
-
You should be familiar with the specifications, performance, billing types, and other details of serverless resource groups. Based on your business needs, plan the specifications and subscription duration of the resource groups you intend to purchase. For more information, see DataWorks resource group overview and Serverless resource group billing.
-
Serverless resource groups are supported in the following regions: China (Beijing), China (Zhangjiakou), China (Ulanqab), China (Shanghai), China (Shenzhen), China (Hangzhou), China (Hong Kong), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Japan (Tokyo), UK (London), US (Silicon Valley), Germany (Frankfurt), and US (Virginia).
-
You must have the necessary permissions for resource groups:
-
Only users with AliyunBSSOrderAccess and AliyunDataWorksFullAccess permissions can purchase resource groups. For related operations, see View RAM user permissions and Grant permissions to RAM users.
-
Only workspace administrators can associate and modify the ownership of resource groups.
-
For permission control of other operations on resource groups, see Console entity object-level permission control policy.
-
If you need to use a serverless general-purpose resource group in a virtual network operator (VNO) environment, confirm whether your provider offers this product.
-
If you have never activated DataWorks in any region, then after you activate DataWorks you can purchase and use only serverless resource groups. Old-version resource groups cannot be purchased or used.
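The region and permission prerequisites above can be expressed as a simple pre-purchase check. The following is a minimal sketch, not an official tool: the region names and policy names are copied from this section, while the function name `purchase_blockers` and its parameters are hypothetical.

```python
# Region and policy names are taken from the prerequisites above.
# The function and its parameters are hypothetical, for illustration only.
SUPPORTED_REGIONS = {
    "China (Beijing)", "China (Zhangjiakou)", "China (Ulanqab)",
    "China (Shanghai)", "China (Shenzhen)", "China (Hangzhou)",
    "China (Hong Kong)", "Singapore", "Malaysia (Kuala Lumpur)",
    "Indonesia (Jakarta)", "Japan (Tokyo)", "UK (London)",
    "US (Silicon Valley)", "Germany (Frankfurt)", "US (Virginia)",
}
PURCHASE_POLICIES = {"AliyunBSSOrderAccess", "AliyunDataWorksFullAccess"}


def purchase_blockers(region, attached_policies):
    """Return the reasons a purchase would be blocked (empty list if none)."""
    blockers = []
    if region not in SUPPORTED_REGIONS:
        blockers.append(f"region not supported: {region}")
    # Both policies are required, so report every missing one.
    for policy in sorted(PURCHASE_POLICIES - set(attached_policies)):
        blockers.append(f"missing policy: {policy}")
    return blockers


print(purchase_blockers("Singapore", ["AliyunDataWorksFullAccess"]))
# prints ['missing policy: AliyunBSSOrderAccess']
```

The check covers only the purchase prerequisites. Associating a resource group with a workspace additionally requires the workspace administrator role, which this sketch does not model.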
Considerations
-
If a pay-as-you-go serverless resource group remains unused for seven days, the system will automatically freeze it. To resume usage, you must manually enable the resource group. For the enabling operation and the definition of unused, see Freezing and enabling pay-as-you-go serverless resource groups.
-
To ensure that the resource group can access the data source (such as a database, data service, or other data in the target network environment), determine where the data source resides and set up network connectivity in advance. For more information, see Network connectivity solutions.
Important: Serverless resource groups can access data sources or addresses in complex network environments over an internal network by associating with a virtual private cloud (VPC). However, serverless resource groups do not have public network access capabilities by default. If you need to access data sources or networks over the public network, configure a public NAT gateway and EIP for the VPC associated with the serverless resource group. For specific operations, see Scenario 5: Data source on the public network.
-
If a VPC and vSwitch are already associated with a DataWorks serverless resource group, do not change the environment of the VPC and vSwitch arbitrarily, as this may cause task execution failures in DataWorks.
-
You cannot switch between different billing methods (subscription and pay-as-you-go) for resource groups. For example, if you purchase a serverless resource group using the subscription billing method, the resource group can only be billed using the subscription method in the future and cannot be switched to pay-as-you-go.
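The seven-day freeze rule for pay-as-you-go resource groups can be sketched as a date comparison. This is an illustration under stated assumptions: the authoritative definition of "unused" is in the referenced freezing document, so the `last_used` timestamp and the function name `should_freeze` are hypothetical stand-ins.

```python
from datetime import datetime, timedelta

# From the text: a pay-as-you-go group unused for seven days is frozen.
FREEZE_AFTER = timedelta(days=7)


def should_freeze(billing_method, last_used, now):
    """Return True if the group has been idle long enough to be frozen.

    `last_used` is a hypothetical timestamp of the most recent usage;
    the exact definition of "unused" is not modeled here.
    """
    if billing_method != "pay-as-you-go":
        # The freeze rule in this section applies only to pay-as-you-go groups.
        return False
    return now - last_used >= FREEZE_AFTER


now = datetime(2024, 1, 10)
print(should_freeze("pay-as-you-go", datetime(2024, 1, 3), now))  # prints True
print(should_freeze("subscription", datetime(2023, 1, 1), now))   # prints False
```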
Comparison between serverless and old-version resource groups
| Comparison item | Old-version resource groups (exclusive and shared resource groups) | Serverless resource groups |
| --- | --- | --- |
| Usage method | Classified by function into three types: data integration, scheduling, and DataService Studio resource groups. | General-purpose for all functions, without distinguishing purposes. |
| Function boundaries | Some capabilities of DataWorks are not supported by old-version resource groups. | All capabilities of DataWorks are supported. |
| Support for mixed use | Not supported. Different types cannot be mixed. | Supported. A resource group can be used for all functions (data integration, scheduling, DataService Studio). |
| Sales mode | Charged based on machine specifications and quantity. A minimum of one machine with 4 vCPUs and 8 GiB of memory is required. The minimum scaling step size is one machine with 4 vCPUs and 8 GiB of memory. | Sold by compute unit (CU). A minimum of 2 CUs is required. The minimum scaling step size is 1 CU. |
| Billing method | | Both subscription and pay-as-you-go billing methods are supported. |
| Resource waste | Limited machine specifications leave resource fragments on each machine that cannot be utilized, which wastes resources. | Select the appropriate number of CUs as needed to avoid resource waste. |
| Scalability | | Directly modify the number of CUs in the resource group. |
| Impact of scaling | Scaling affects running jobs. | Tasks that are already running are not affected. Note: Upgrades and downgrades do not affect running jobs. |
| Network security | DataWorks manages public network ingress and egress. Multiple users share resources, causing resource competition. | Fully utilizes the customer's own public network capabilities, making behavior controllable. |
| Development trend | Planned to be discontinued in the future. | Will become the only officially supported resource group in DataWorks. |
| Support for custom images | Not supported. | Supports custom image management, allowing customization of images required for task execution to meet more task execution conditions. |
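The difference in scaling granularity from the Sales mode and Resource waste rows can be illustrated with a little arithmetic. This is a rough sketch that sizes capacity by vCPU count only and ignores memory, which is a simplifying assumption; the function names are hypothetical.

```python
import math

MACHINE_VCPUS = 4  # old-version step: one machine with 4 vCPUs and 8 GiB
MIN_CUS = 2        # serverless minimum purchase, with a 1-CU step


def old_version_vcpus(required_vcpus):
    """vCPUs purchased when capacity grows in whole 4-vCPU machines."""
    machines = max(1, math.ceil(required_vcpus / MACHINE_VCPUS))
    return machines * MACHINE_VCPUS


def serverless_cus(required_cus):
    """CUs purchased with a 2-CU minimum and a 1-CU step."""
    return max(MIN_CUS, math.ceil(required_cus))


# A workload that needs 5 vCPUs (assumption: sized by vCPU only):
print(old_version_vcpus(5))  # prints 8, so 3 vCPUs sit idle
print(serverless_cus(5))     # prints 5
```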
Serverless resource group billing
For billing related to resource groups, see Serverless resource group billing.
Step 1: Add a serverless resource group
Go to the Resource Groups page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, click Resource Group to go to the Resource Groups page.
-
On the Exclusive Resource Groups tab, click Create Resource Group to proceed to the serverless resource group purchase page.
Parameter
Description
Region and zone
Select the region where the DataWorks workspace resides. The resource group and the workspace must be in the same region.
Billing method
Subscription: a prepaid model.
Pay-as-you-go: a postpaid model.
Note: You cannot switch between different billing methods (subscription and pay-as-you-go) for resource groups. For example, if you purchase a serverless resource group using the subscription billing method, the resource group can only be billed using the subscription method in the future and cannot be switched to pay-as-you-go.
You can purchase multiple resource groups with different billing methods to meet your business requirements.
Resource group specifications
When the billing method is Subscription, you need to set the resource group specifications.
Valid values: 2 CU to 99999999 CU.
Note: 1 CU = 1 vCPU + 4 GiB memory. For specific purchase suggestions and minimum specification requirements for running various tasks, see Performance metrics and purchase suggestions. The upper limit of 99999999 CU indicates that there is no upper limit on purchase specifications, but purchases may be limited by inventory. If inventory is insufficient, pay attention to the prompts on the purchase page.
Resource group name
Set the resource group name.
Resource group description
Set the resource group description.
Virtual private cloud (VPC)
Select a virtual private cloud based on the network that the resource group needs to access.
If the data source and the serverless resource group belong to the same account and reside in the same region, configure the VPC and vSwitch where the data source resides.
If the data source is in another complex network environment, you also need to use a VPN Gateway or Express Connect to connect the VPC associated with the serverless resource group to the VPC network where the data source resides. For more information, see Network connectivity solutions.
Note: If there are no options in the drop-down list, you need to go to the VPC console to create one. For more information about virtual private clouds, see What is a virtual private cloud.
Resource groups support associating with multiple VPCs. You can associate them with other VPCs after purchase.
If the billing method of the resource group is Subscription, the VPC configured here is applied to data services, data computing, and data integration. For data services, you cannot associate a new VPC or replace the existing one later, so plan the VPC in advance.
If a VPC and vSwitch are already associated with a DataWorks serverless resource group, do not change the environment of the VPC and vSwitch arbitrarily, as this may cause task execution failures in DataWorks.
vSwitch
Billing cycle
When the billing method is Subscription, you need to set the billing cycle.
Important: We recommend that you select Auto-renewal Upon Expiration so that your business is not affected when expired resources are shut down or released. With this option selected, the resource group is renewed monthly, and fees are deducted automatically at the then-current price before the instance expires.
Service-linked role
When purchasing for the first time, you need to Create A Service-linked Role (AliyunServiceRoleForDataWorks). Subsequently, the created role will be associated by default.
Note: If you encounter the prompt "Please create AliyunServiceRoleForDataWorks", provide this authorization address to the main account or other authorized personnel for authorization, and then continue the operation. This role is used to access virtual private cloud (VPC), elastic network interface (ENI), and security group resources. For more information, see DataWorks service-linked role.
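The Resource group specifications parameter above follows a fixed conversion (1 CU = 1 vCPU + 4 GiB memory) and a valid range of 2 CU to 99999999 CU. The following minimal sketch applies both rules; the function name `describe_spec` is hypothetical, and inventory limits mentioned in the note are not modeled.

```python
MIN_CU = 2
MAX_CU = 99_999_999  # nominal upper bound; real purchases may be limited by inventory


def describe_spec(cu):
    """Validate a CU count and convert it to vCPUs and memory."""
    if not isinstance(cu, int):
        raise TypeError("CU count must be an integer")
    if not MIN_CU <= cu <= MAX_CU:
        raise ValueError(f"CU count must be between {MIN_CU} and {MAX_CU}")
    # 1 CU = 1 vCPU + 4 GiB memory, as stated in the note above.
    return {"cu": cu, "vcpus": cu, "memory_gib": cu * 4}


print(describe_spec(8))
# prints {'cu': 8, 'vcpus': 8, 'memory_gib': 32}
```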
Step 2: Associate the resource group with a workspace
After creating a resource group, you must associate it with a workspace. Once associated, you can select and use the serverless resource group when creating tasks in the target workspace.
-
Associate the resource group when creating a workspace
-
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, click Workspace to go to the Workspaces page.
-
Click Create Workspace. On the Create Workspace page, change the Default Resource Group Configuration parameter to the newly created target resource group.
-
Associate the resource group with an existing workspace
Go to the Resource Groups page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, click Resource Group to go to the Resource Groups page.
-
Click Bind Workspace in the Operation column of the target resource group. Locate the workspace to be bound and click Bind in the Operation column.
Step 3: Configure network connectivity
To ensure tasks run smoothly, complete the necessary network connectivity configurations so the resource group can access the data source. For more information, see Network connectivity solutions.
Serverless resource groups can access data sources or addresses in complex network environments over an internal network by associating with a virtual private cloud (VPC). However, serverless resource groups do not have public network access capabilities by default. If you need to access data sources or networks over the public network, configure a public NAT gateway and EIP for the VPC associated with the serverless resource group. For specific operations, see Scenario 5: Data source on the public network.
The VPC associated with the resource group supports configuring DNS internal resolution, allowing DataWorks to access data sources through custom internal domain names. For example, for a CDH cluster, you can configure internal DNS resolution for the VPC associated with the serverless resource group. For more information, see Obtain CDH or CDP cluster information and configure network connectivity.
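The connectivity choices described in this step, together with the VPC parameter description in Step 1, can be summarized as a small decision function. This is a sketch only: the `location` labels and the function name are hypothetical, and the step wording paraphrases this document rather than any API.

```python
def connectivity_steps(location):
    """Return the network setup steps for a data source location.

    `location` is a hypothetical label, one of:
    "same_account_and_region", "other_network", or "public".
    """
    if location == "same_account_and_region":
        return ["Associate the VPC and vSwitch where the data source resides"]
    if location == "other_network":
        return [
            "Associate a VPC with the resource group",
            "Connect that VPC to the data source's VPC through "
            "VPN Gateway or Express Connect",
        ]
    if location == "public":
        # Serverless resource groups have no public network access by default.
        return [
            "Associate a VPC with the resource group",
            "Configure a public NAT gateway and EIP for that VPC",
        ]
    raise ValueError(f"unknown location: {location!r}")


for step in connectivity_steps("public"):
    print(step)
```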
Step 4: Adjust resource group configuration items
Quota management
Configure the CU Upper Limit or CU Guarantee for data computing, data integration, and DataService Studio to ensure the smooth operation of tasks.
-
Set the CU upper limit for pay-as-you-go resource groups to prevent excessive resource usage.
-
Set the CU guarantee for subscription resource groups to establish the minimum CU guarantee amount.
Go to the Resource Groups page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, click Resource Group to go to the Resource Groups page.
-
Adjust quotas.
-
Make adjustments on the resource groups page.
In the Operation column of the target resource group, choose Quota Management, and then modify the CU Upper Limit or CU Guarantee values for each purpose.
-
Make adjustments on the resource group details page.
On the resource groups page, click the Resource Group Name of the target resource group to view its details. In the upper-right corner, click Quota Management, and adjust the CU Upper Limit or CU Guarantee for different scenarios.
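Which quota setting applies depends on the billing method: CU Upper Limit for pay-as-you-go and CU Guarantee for subscription. The following sketch pairs a billing method with per-purpose values. The function name, the dictionary layout, and the requirement that values be positive are assumptions made for illustration, not documented rules.

```python
PURPOSES = ("data computing", "data integration", "DataService Studio")
SETTING_BY_BILLING = {
    "pay-as-you-go": "CU Upper Limit",  # caps resource usage
    "subscription": "CU Guarantee",     # reserves a minimum amount
}


def quota_plan(billing_method, cu_by_purpose):
    """Build a quota plan for one resource group (hypothetical structure)."""
    if billing_method not in SETTING_BY_BILLING:
        raise ValueError(f"unknown billing method: {billing_method!r}")
    unknown = set(cu_by_purpose) - set(PURPOSES)
    if unknown:
        raise ValueError(f"unknown purposes: {sorted(unknown)}")
    # Assumption: a quota value must be a positive CU count.
    if any(cu <= 0 for cu in cu_by_purpose.values()):
        raise ValueError("CU values must be positive")
    return {"setting": SETTING_BY_BILLING[billing_method],
            "values": dict(cu_by_purpose)}


print(quota_plan("subscription", {"data integration": 4})["setting"])
# prints CU Guarantee
```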
Data scheduling concurrency limit adjustment
For data scheduling, you can set the task concurrency limit to control the maximum number of concurrent task executions.
The default concurrency limit is 50, which can be increased to a maximum of 200.
Go to the Resource Groups page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, click Resource Group to go to the Resource Groups page.
-
Adjust the data scheduling concurrency limit.
-
Adjust on the resource groups page.
In the Operation column of the target resource group, choose Data Schedule Concurrency Limit, and then modify the Data Schedule Concurrency Limit value.
-
Adjust on the resource group details page.
On the resource groups page, click the Resource Group Name of the target resource group to view its details. In the upper-right corner, click Data Scheduling Concurrency Limit, and adjust the Data Scheduling Concurrency Limit value.
Note: The Data Scheduling Concurrency Limit configured here controls the upper limit of concurrent task scheduling and does not affect task execution behavior.
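The concurrency limit described in this section has a default of 50 and a maximum of 200. A minimal validation sketch follows; the lower bound of 1 is an assumption, because this document does not state a minimum, and the function name is hypothetical.

```python
DEFAULT_LIMIT = 50  # default from the text
MAX_LIMIT = 200     # maximum from the text
MIN_LIMIT = 1       # assumption: the document does not state a minimum


def resolve_concurrency_limit(requested=None):
    """Return the effective limit, falling back to the default."""
    if requested is None:
        return DEFAULT_LIMIT
    if not MIN_LIMIT <= requested <= MAX_LIMIT:
        raise ValueError(
            f"limit must be between {MIN_LIMIT} and {MAX_LIMIT}, got {requested}"
        )
    return requested


print(resolve_concurrency_limit())     # prints 50
print(resolve_concurrency_limit(120))  # prints 120
```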
Next step: configure serverless resource groups for tasks
Once the serverless resource group is created and configured, assign it to data integration, scheduling, and DataService Studio tasks to utilize the resource group for task execution. For detailed instructions, see General reference: Switch resource groups.
More operations
References
-
For more information about resource groups, see DataWorks resource group overview.
-
You can use the intelligent monitoring feature provided in the Operation Center to monitor the resource usage of a resource group and the number of instances waiting for resources. For more information, see Create custom rules.
-
When viewing the instance status on the resource groups page:
-
If the resource group displays Expired, choose Renew in the Operation column of the target resource group.
-
If the resource utilization of the resource group reaches the warning threshold, choose Scale-out in the Operation column of the target resource group. For more information, see the referenced document.
-
If tasks running on a serverless resource group require a specific development environment (such as third-party library dependencies), you can create a custom image that integrates the necessary development packages and dependencies, and then specify the serverless resource group as the execution resource and the image as the runtime environment when running tasks.
-
To unsubscribe from a serverless resource group, choose Unsubscribe in the Operation column of the target resource group. For more information, see the referenced document: Stop using DataWorks products.
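The status-driven actions in the list above (Renew for an expired group, Scale-out when utilization reaches the warning threshold) can be sketched as a lookup. The threshold value is not given in this document, so `warning_threshold` is a hypothetical parameter, as are the function name and the utilization scale of 0.0 to 1.0.

```python
def suggested_action(status, utilization, warning_threshold):
    """Map a resource group's state to the console action named above.

    `utilization` and `warning_threshold` are fractions between 0.0 and 1.0;
    the actual threshold is not specified in this document.
    """
    if status == "Expired":
        return "Renew"
    if utilization >= warning_threshold:
        return "Scale-out"
    return None  # no action suggested


print(suggested_action("Expired", 0.2, 0.8))  # prints Renew
print(suggested_action("Running", 0.9, 0.8))  # prints Scale-out
```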