DataWorks: Plan and configure resource groups

Last Updated: Mar 31, 2025

When you call an API through DataService Studio in DataWorks, the call consumes resources in a resource group. The resource group must have network connectivity to the data source and sufficient CPU and memory resources. Otherwise, the resource group may fail to access the data source, or API calls may fail or be throttled when they are frequent. This topic describes what to consider when you plan resource groups and provides suggestions for using each type of resource group.

Basic concepts

A resource group provides the computing resources that are required to initiate API calls in DataService Studio. In most cases, a resource group consists of one or more servers that provide CPU, memory, and network resources. An API call is processed as follows: A user initiates the API call. API Gateway receives the request and forwards it to a DataService Studio server. The server then forwards the request to the destination data source to query data.

Types

Resource groups are classified into shared resource groups and exclusive resource groups.

Shared resource groups

Shared resource groups are shared by all users of DataWorks. Users may compete for resources during peak hours. For more information about shared resource groups, see Use a shared resource group.

Exclusive resource groups

Exclusive resource groups can be used only by the users who purchase the exclusive resource groups. If highly concurrent and frequent API calls are initiated in DataService Studio, we recommend that you use exclusive resource groups. For more information about exclusive resource groups, see Exclusive resource groups for DataService Studio. For more information about how to use exclusive resource groups for DataService Studio, see Create and use an exclusive resource group for DataService Studio.

Note

Exclusive resource groups for DataService Studio are available only in the China (Shanghai) region.

Key to resource planning: connectivity and performance

When you use resource groups, take note of the connectivity and performance of the resource groups.

  • Connectivity

    After an API call is initiated, it is first sent to a DataService Studio server and then to the destination data source to query data. Before you use DataService Studio, make sure that the resource group that processes the API call can access the destination data source and the network on which the data source resides. Otherwise, the API call fails.

  • Performance

    API call nodes consume the CPU, memory, and network resources of the servers on which they run. If resources are insufficient, an API call may fail with an exception, frequent API calls may be throttled, or query results may be returned slowly. Before you initiate API calls, make sure that you have sufficient resources. We recommend that you use exclusive resource groups to run API call nodes so that the nodes do not compete for resources in the public resource pool. For information about the performance metrics of exclusive resource groups, see Billing of exclusive resource groups for DataService Studio (subscription).
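Because frequent API calls can be throttled, a caller may want to pace its own requests. The following is a minimal sketch of a client-side token bucket, not part of DataWorks. The class name `TokenBucket` and its parameters are hypothetical, and the rate you choose should match the QPS threshold of your own resource group.

```python
import time


class TokenBucket:
    """Client-side pacing helper (hypothetical, not a DataWorks API).

    Allows at most `rate` calls per second on average, with bursts of up
    to `capacity` calls.
    """

    def __init__(self, rate, capacity=None, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity if capacity is not None else rate)
        self.tokens = self.capacity  # start with a full bucket
        self.clock = clock           # injectable so the logic can be tested
        self.last = clock()

    def try_acquire(self):
        """Return True if a call may be sent now, False if it should wait."""
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A caller would check `try_acquire()` before each API call and sleep briefly or queue the request when it returns False. This only smooths traffic on the client side and does not raise the threshold of the resource group.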

Differences between resource groups and recommendations

The two types of resource groups are suitable for different scenarios. The following table describes the differences between the two types of resource groups based on resource ownership, network connectivity, billing methods, and performance. Select a resource group based on your business requirements when you create an API.

| Item | Exclusive resource group | Shared resource group |
| --- | --- | --- |
| Ownership of resources | The resources are maintained by DataWorks and exclusively used by each tenant. | The resources are maintained by DataWorks and shared among all tenants. |
| Network connectivity | This type of resource group can connect to data sources that are deployed on the Internet, in Alibaba Cloud virtual private clouds (VPCs), and in data centers. For a data source that is deployed in a VPC, you can connect an exclusive resource group to the data source by using the instance ID or connection string. | This type of resource group can connect to data sources that are deployed on the Internet, in Alibaba Cloud VPCs, and on the classic network. For a data source that is deployed in a VPC, you can connect a shared resource group to the data source by using only the instance ID. Note: You cannot connect a shared resource group to a data source that is deployed on the classic network in the China South 1 Finance region. |
| Billing methods | This type of resource group is charged by resource group specifications based on the subscription billing method. | This type of resource group is charged by the number of calls and the call duration based on different billing tiers. |
| Supported data source types | This type of resource group can connect to the following types of data sources: ClickHouse, Hologres, ApsaraDB RDS, MySQL, PostgreSQL, SQL Server, Oracle, Tablestore, AnalyticDB for MySQL V2.0, AnalyticDB for MySQL V3.0, AnalyticDB for PostgreSQL, MongoDB, PolarDB-X 1.0, StarRocks, and Doris. More types of data sources will be supported in the future. | This type of resource group can connect to the following types of data sources: Hologres, ApsaraDB RDS, MySQL, PostgreSQL, SQL Server, Oracle, Tablestore, AnalyticDB for MySQL V2.0, AnalyticDB for MySQL V3.0, AnalyticDB for PostgreSQL, MongoDB, and PolarDB-X 1.0. |
| Maximum QPS (see Note 1) | The queries per second (QPS) thresholds vary based on the specifications of exclusive resource groups. The minimum QPS is 500. You can select resource groups of different specifications based on your QPS requirement. One exclusive resource group can be associated with multiple workspaces and multiple APIs. If the number of API calls exceeds the QPS threshold of an exclusive resource group of specific specifications, throttling is triggered and the API calls fail. | A maximum of 200 QPS is supported for each tenant in each region. To increase the QPS threshold, you can use exclusive resource groups. If the number of API calls exceeds 200 QPS, throttling is triggered and the API calls fail. |
| Timeout | 30 seconds | 10 seconds |
| Reliability | High | Low |
| Security | High | High |
| Scenarios | This type of resource group is used for highly concurrent and frequent online API calls in which complex query statements are used and a large volume of data needs to be returned. | This type of resource group is used for low-concurrency or low-frequency API calls. |
| Recommendation rating | ★★★★★ | ★★★ |

Note
  • Note 1: The maximum queries per second (QPS) of an exclusive resource group depends on the actual business scenario. You can estimate the QPS threshold based on the following information:

    • Whether the API is generated in script mode.

    • Whether the pagination feature is enabled for an API call so that the returned results are displayed on multiple pages.

    • SQL statements configured for an API call take an average of 100 milliseconds to run in the data source.

    • A single API call returns an average of 3,000 bytes of data.

    If your business scenario differs from the preceding scenario, join the DataWorks DingTalk group to obtain specifications that suit your business scenario.
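The reference figures in the note above (an average SQL runtime of 100 milliseconds and an average response size of 3,000 bytes) can be turned into a rough load estimate. The sketch below is an approximation and not an official sizing formula. It assumes that Little's law (in-flight requests = arrival rate × average time in system) is a reasonable model, and the function name `estimate_load` is hypothetical.

```python
def estimate_load(qps, avg_sql_ms=100, avg_response_bytes=3000):
    """Rough load estimate for a given QPS (illustrative only).

    Returns a tuple of:
      - the average number of queries in flight at the data source,
        using Little's law: L = arrival rate x average time in system
      - the response payload volume in bytes per second
    """
    in_flight = qps * avg_sql_ms / 1000
    bytes_per_second = qps * avg_response_bytes
    return in_flight, bytes_per_second


# At the minimum exclusive threshold of 500 QPS:
print(estimate_load(500))  # (50.0, 1500000)
```

Under these assumptions, 500 QPS corresponds to about 50 queries running concurrently in the data source and about 1.5 MB of response data per second. Slower SQL statements or larger responses scale both figures proportionally.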

We recommend that you use exclusive resource groups to run API call nodes based on the preceding comparison results.
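The limits in the preceding comparison can be encoded as a simple pre-check. The following is a sketch under stated assumptions: the thresholds and data source names are taken from the comparison above, whereas the function name `reasons_for_exclusive` and its parameters are hypothetical and not part of any DataWorks API.

```python
# Limits of the shared resource group, as listed in the comparison.
SHARED_MAX_QPS = 200
SHARED_TIMEOUT_SECONDS = 10
# Data source types listed only for exclusive resource groups.
EXCLUSIVE_ONLY_SOURCES = {"ClickHouse", "StarRocks", "Doris"}


def reasons_for_exclusive(peak_qps, slowest_query_seconds,
                          data_source_type, in_data_center=False):
    """Return the reasons a workload does not fit a shared resource group.

    An empty list means that none of the listed shared limits is exceeded.
    """
    reasons = []
    if peak_qps > SHARED_MAX_QPS:
        reasons.append("peak QPS exceeds the shared limit of 200")
    if slowest_query_seconds > SHARED_TIMEOUT_SECONDS:
        reasons.append("queries may exceed the shared timeout of 10 seconds")
    if data_source_type in EXCLUSIVE_ONLY_SOURCES:
        reasons.append("data source type is listed only for exclusive groups")
    if in_data_center:
        reasons.append("data center connectivity is listed only for exclusive groups")
    return reasons


print(reasons_for_exclusive(50, 2, "MySQL"))  # []
```

An empty result does not mean that a shared resource group is the better choice. Reliability and resource contention during peak hours still favor an exclusive resource group.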

Instructions on resource group configuration

If you use a shared resource group to access a data source, you must add the required CIDR blocks or IP addresses of the related region to the IP address whitelist of the data source. For more information, see Configure network connectivity between the shared resource group for DataService Studio and a data source.

If you use an exclusive resource group, you must select a network connectivity solution based on the network on which the data source resides and configure the IP address whitelist of the data source. For more information, see Configure network connectivity between an exclusive resource group for DataService Studio and a data source.
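Before you test an API, it can help to confirm that every required CIDR block or IP address is covered by the whitelist of the data source. The following is a minimal sketch that uses the Python standard library `ipaddress` module. The function name `missing_from_whitelist` is hypothetical, and the addresses are placeholders from the documentation ranges, not real DataWorks CIDR blocks. Obtain the actual values from the linked topics.

```python
import ipaddress


def missing_from_whitelist(required_cidrs, whitelist):
    """Return the required entries that no single whitelist entry covers."""
    allowed = [ipaddress.ip_network(entry, strict=False) for entry in whitelist]
    missing = []
    for cidr in required_cidrs:
        net = ipaddress.ip_network(cidr, strict=False)
        # subnet_of() requires both networks to share the same IP version,
        # so the version is compared first.
        covered = any(net.version == a.version and net.subnet_of(a)
                      for a in allowed)
        if not covered:
            missing.append(cidr)
    return missing


# Placeholder values for illustration only.
required = ["203.0.113.0/25", "198.51.100.10"]
whitelist = ["203.0.113.0/24"]
print(missing_from_whitelist(required, whitelist))  # ['198.51.100.10']
```

The check is conservative: a required range that is covered only by the union of several smaller whitelist entries is still reported as missing.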

FAQ

  • Q: When I configure a serverless resource group for DataService Studio, the serverless resource group associated with the workspace is dimmed and cannot be selected. Why does this happen?

  • A: No quota is specified for the serverless resource group for DataService Studio. You must manually specify a quota for the serverless resource group. Procedure:

    1. Go to the Resource Groups page.

      Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, click Resource Group to go to the Resource Groups page.

    2. On the Exclusive Resource Groups tab of the Resource Groups page, find the serverless resource group, click the icon in the Actions column, and then select Manage Quota. In the Manage Quota dialog box, specify a quota for DataService Studio in the Occupied CUs column.

    3. After a quota is specified, click OK to save the configuration.

    4. Go to the Resource Group tab of the API operation and select the serverless resource group from the Exclusive Resource Group for DataService Studio drop-down list.