Serverless auto-scaling resource group - AnalyticDB

Introduction

Core features

Feature	Description
Storage and compute separation	Compute and storage are separated at the node level. Storage nodes do not participate in query computation, resulting in a clearer architecture and more flexible scaling.
Min-max auto scaling with reserved resources	You set a minimum (min) and maximum (max) compute capacity. The min provides reserved baseline resources. The system scales automatically between min and max based on workload, eliminating manual peak provisioning.
Query-level resource isolation	Each query receives an independent resource quota, reducing inter-query interference and delivering stable latency under high concurrency.
Automated operations	The system scales automatically within the min-max range based on workload, eliminating daily manual scaling operations.

Overall architecture

Auto-Scaling resource groups deploy across availability zones in a region. Multi-tenant service components with stateful metadata use three-replica disaster recovery. Query-execution nodes support dynamic scaling and failover. With zero-reservation support, you can submit queries immediately after creating a resource group without provisioning nodes in advance.

Feature	Description
Availability zone disaster recovery	Multiple deployments within a region isolate tenant groups and limit failure blast radius. Services within each deployment span three availability zones for rapid failover if one zone fails.
High availability	Coordinator and Executor nodes support failover. Resource-request metadata and service discovery are offloaded to the independent Executor Control Service and Discovery Service, minimizing the impact of any single node failure.
High performance	The system defaults to availability-zone-affinity scheduling. Both the Coordinator (task scheduling) and Executor (task execution) scale automatically. The Assignment Service handles Coordinator scaling and load balancing; the Executor Control Service manages Executor scaling and resource-request proxying.

Auto-scaling and coupled resource groups

Auto-Scaling resource groups and coupled storage-and-compute resource groups can coexist and complement each other.

Note

A "coupled storage and compute resource group" refers to the reserved storage that you specify when you purchase an Enterprise Edition cluster. After the purchase, you can create multiple Auto-Scaling resource groups. A typical setup is one coupled resource group plus N Auto-Scaling resource groups. Create additional Auto-Scaling resource groups when you need storage-compute separation, min-max auto scaling, or query-level isolation.

Type	Architecture	Use cases
Auto-Scaling resource group	Strict storage and compute separation: Compute and storage nodes are separate. Supports min-max auto scaling (min is reserved; system scales between min and max based on workload).	Scale compute and storage independently, handle fluctuating workloads with min-max auto scaling, or achieve query-level isolation for high concurrent throughput.
Coupled storage and compute resource group	Coupled storage and compute: Storage nodes also perform computations. Compute cannot be scaled independently.	Tightly coupled storage and compute, strong data-locality dependency, or preference for a coupled deployment model.

To route queries, bind database accounts to different resource groups in the console. For details, see Bind or unbind database accounts to a resource group. Queries routed to an Auto-Scaling resource group benefit from storage-compute separation, min-max auto scaling, and query-level isolation.

Use cases

Scenario	Pain points	Solution
Infrequent queries with no need for reserved compute	Queries run infrequently, and reserving compute resources for occasional queries is wasteful.	Set min to zero or a very low value. Compute resources are provisioned when the first query arrives and released afterward — a pay-as-you-go model for queries.
Idle resources and wasted costs	Resources over-provisioned for occasional peak loads remain idle most of the time.	Reserve only the min capacity for baseline performance. The system scales between min and max based on workload. Resources beyond min are billed by actual usage.
Insufficient resources during sudden peaks	Queries queue, time out, or fail during data ingestion spikes, report generation, or bursts of concurrent large SQL queries.	The system scales up to max automatically — no manual upgrades or peak provisioning needed.
OOM errors or performance fluctuations from large queries	With limited instance capacity, large SQL queries can cause OOM errors or compete with online queries for resources, destabilizing response times.	Large queries run in the Auto-Scaling resource group with query-level isolation and auto scaling, significantly improving success rates and stability.
Low throughput under high concurrency	Multiple queries share resources and contend with each other, causing latency fluctuations and limited throughput.	Query-level isolation reduces contention, improving throughput and stability under high concurrency.
Heavy burden of operations and capacity planning	Capacity planning based on peak hours requires frequent manual scaling.	The system adjusts resources based on workload, eliminating capacity planning and manual scaling.

For detailed test results, see Auto-Scaling resource group benchmarks.

How it works

Mixed workloads of large and small queries

Auto-Scaling resource groups handle both large and small queries. A built-in classifier distinguishes between them and routes each through a different execution path:

Query type	Resource evaluation	Scheduling method
Small queries	No (bypassed)	Scheduled directly based on node view. Small queries bypass resource evaluation and requests. The Coordinator schedules them directly to a physical node based on node view and load. This does not consume the Executor node quota and reduces latency.
Large queries	Yes	Scheduled after an Executor node quota is requested. After resource evaluation determines the required quota, the Coordinator sends a resource request to the Executor Control Service. The query executes only after obtaining an independent resource quota on an Executor node. For large queries, CPU resources are isolated and memory is constrained by the quota, preventing a single query from causing OOM errors or degrading other queries.
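The two execution paths can be sketched as follows. This is a minimal illustration, not the service's implementation: the `ExecutionPlan` type, the `plan_query` function, and the `evaluate_acu` callback are hypothetical names, and the classifier's actual criteria are not described here, so its result is passed in as a boolean.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ExecutionPlan:
    path: str                      # "direct" (node view) or "quota" (Executor quota)
    requested_acu: Optional[int]   # None when resource evaluation is bypassed


def plan_query(is_large: bool, evaluate_acu: Callable[[], int]) -> ExecutionPlan:
    """Pick an execution path for one query (hypothetical sketch).

    is_large     -- assumed output of the built-in classifier
    evaluate_acu -- assumed resource-evaluation step; called only for large queries
    """
    if not is_large:
        # Small query: no resource evaluation, no quota request.
        return ExecutionPlan(path="direct", requested_acu=None)
    # Large query: evaluate first, then request a quota of that size.
    return ExecutionPlan(path="quota", requested_acu=evaluate_acu())


small = plan_query(False, evaluate_acu=lambda: 32)
large = plan_query(True, evaluate_acu=lambda: 32)
print(small.path, small.requested_acu)
print(large.path, large.requested_acu)
```

The point of the sketch is that the evaluation step is never invoked on the small-query path, which is where the latency saving comes from.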

Scaling policy

Min-max auto scaling: You set a min and max compute capacity. The min provides reserved baseline resources; the system scales automatically between min and max based on workload.
Scaling triggers: The system scales based on large-query complexity, queue depth, and Executor node CPU/memory usage. No manual intervention is required.
Response time: Scaling actions take effect within seconds to minutes.
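As a rough illustration of min-max scaling, the sketch below clamps an estimated demand to the configured range. The function name and the rounding rule are assumptions made for the example; the 16 ACU value is the minimum scaling step listed in the resource group comparison table, and applying it as a rounding unit here is also an assumption.

```python
import math

SCALING_STEP_ACU = 16  # minimum scaling step from the comparison table


def target_capacity(demand_acu: float, min_acu: int, max_acu: int,
                    step: int = SCALING_STEP_ACU) -> int:
    """Return the capacity a min-max policy would settle on (hypothetical sketch)."""
    if min_acu < 0 or max_acu < min_acu or step <= 0:
        raise ValueError("expected 0 <= min_acu <= max_acu and step > 0")
    # Round the demand up to a whole number of scaling steps (assumed rule).
    stepped = math.ceil(demand_acu / step) * step
    # Never drop below the reserved min, never exceed the configured max.
    return max(min_acu, min(max_acu, stepped))


print(target_capacity(40, min_acu=16, max_acu=128))   # demand between min and max
print(target_capacity(0, min_acu=16, max_acu=128))    # idle: stays at reserved min
print(target_capacity(500, min_acu=16, max_acu=128))  # burst: capped at max
```

Setting `min_acu=0` models the zero-reservation case: with no demand the target is 0, so nothing is reserved.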

Workload isolation and concurrency

Query-level resource isolation: Each large query has an independent resource quota enforced by the Executor. This reduces contention and improves query throughput and stability under high concurrency.
Concurrency control: With Auto-Scaling resource groups, stability no longer depends on priority queues and concurrency settings of Interactive resource groups. Small queries use adaptive throttling based on node resource load. The resource estimation process naturally controls large-query concurrency before execution. Set a relatively high query queue concurrency (for example, 100) to fully leverage scaling during traffic spikes.
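The way quota requests bound large-query concurrency can be pictured with a simple admission check. The class below is a hypothetical sketch, not the Executor Control Service: it only shows that a query starts when its quota fits into the remaining capacity. In the real system an unmet request would presumably wait or trigger scale-out rather than simply being refused.

```python
class ExecutorQuotaPool:
    """Toy model of quota-based admission for large queries (hypothetical)."""

    def __init__(self, capacity_acu: int) -> None:
        self.capacity_acu = capacity_acu
        self.in_use_acu = 0

    def try_admit(self, requested_acu: int) -> bool:
        # Admit only if the whole quota fits; partial quotas are not granted.
        if self.in_use_acu + requested_acu > self.capacity_acu:
            return False
        self.in_use_acu += requested_acu
        return True

    def release(self, acu: int) -> None:
        # Return a finished query's quota to the pool.
        self.in_use_acu = max(0, self.in_use_acu - acu)


pool = ExecutorQuotaPool(capacity_acu=64)
print(pool.try_admit(48))  # fits
print(pool.try_admit(32))  # does not fit until capacity is released
pool.release(48)
print(pool.try_admit(32))  # fits after release
```

Because admission depends on available quota rather than on a fixed slot count, the number of concurrent large queries follows from their size, which is why a high queue concurrency setting does not by itself overload the group.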

Prerequisites

Your cluster kernel version is 3.2.7 or later.

Note
To view and update the minor version, go to the Configuration Information section on the Cluster Information page in the AnalyticDB for MySQL console.
Auto-Scaling resource groups are available by allowlist and only in specific regions. To enable this feature in other regions, submit a ticket.
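If you track cluster versions in a script, the version prerequisite can be checked with a numeric comparison. This sketch assumes the kernel version is a purely numeric dotted string such as 3.2.7; the function name is hypothetical.

```python
def meets_minimum_version(version: str, minimum: str = "3.2.7") -> bool:
    """Compare dotted numeric versions component by component (assumed format)."""
    def parse(value: str) -> tuple:
        return tuple(int(part) for part in value.split("."))
    # Tuple comparison is numeric, so 3.10.0 correctly ranks above 3.2.7.
    return parse(version) >= parse(minimum)


print(meets_minimum_version("3.2.7"))
print(meets_minimum_version("3.2.6"))
print(meets_minimum_version("3.10.0"))
```

Comparing the strings directly would rank "3.10.0" below "3.2.7", which is why the components are converted to integers first.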

Limitations

Incompatible kernel features: Auto-Scaling resource groups do not currently support the following features. Use other resource group types if you need them.
- Dynamic partition import for the XUANWU_V1 storage engine (INSERT OVERWRITE across multiple partitions). Query execution fails. Migrate the target table to the XUANWU_V2 engine.
- Paging Cache (deep paging performance optimization). Queries fail with the error "Resource group [xxx] is empty".
- Work Load Manager (WLM commands and workload management rules). Configured rules do not take effect.
- Incremental materialized views for XUANWU series storage engines. Queries fail.
Recommended workloads: Analytical workloads such as ad hoc queries, reports, and ETL.
Not recommended for: Latency-critical online services requiring fixed capacity, or scenarios relying on fixed execution plans and data locality (writes and point lookups). Use coupled storage-and-compute resource groups instead.
Quotas and limits: The capacity and concurrency limits for a single resource group and a single account are subject to the limits specified on the console and in the documentation for the current version.

Billing

Resource type	Description
Reserved resources (min capacity)	Billed based on the number of resident ACUs.
Elastic resources (from min to max capacity)	Billed based on the actual usage of elastic ACU-hours.

Configure the max capacity to cap scaling and therefore elastic cost. If your cluster version supports usage limits or budget alerts, set up alerts to prevent overspending. For billing details, see Billing overview.
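The two billing components can be combined into a rough estimate. The sketch below is illustrative only: the function names are hypothetical, the unit prices are placeholders rather than real prices, and billing reserved ACUs per hour is an assumption. Elastic usage is summed from per-minute capacity samples.

```python
from typing import Sequence


def elastic_acu_hours(capacity_per_minute: Sequence[int], min_acu: int) -> float:
    """Sum capacity above the reserved min, sampled once per minute."""
    return sum(max(0, c - min_acu) for c in capacity_per_minute) / 60


def estimate_cost(min_acu: int, hours: float, elastic_hours: float,
                  reserved_price: float, elastic_price: float) -> float:
    """Reserved part plus elastic part; prices are per ACU-hour (placeholders)."""
    reserved = min_acu * hours * reserved_price
    elastic = elastic_hours * elastic_price
    return reserved + elastic


# One hour: 30 minutes at the reserved 16 ACU, then 30 minutes scaled to 48 ACU.
samples = [16] * 30 + [48] * 30
usage = elastic_acu_hours(samples, min_acu=16)
print(usage)

# Placeholder prices, not taken from any price list.
print(estimate_cost(min_acu=16, hours=24, elastic_hours=100,
                    reserved_price=0.5, elastic_price=0.75))
```

Lowering the max capacity lowers the ceiling on `elastic_hours`, which is the only term in the estimate that varies with workload.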

Use an Auto-Scaling resource group

Create an Auto-Scaling resource group.
1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
2. In the navigation pane on the left, click Cluster Management > Resource Management, and then click the Resource Groups tab. In the upper-right corner of the resource group list, click Create Resource Group.
3. In the Create Resource Group dialog box, configure the following parameters.
  - Resource Group Name: Enter a name for the resource group.
  - Job Type: Select SQL.
  - Elastic Policy: Select auto scaling.
  - Min ACU and Max ACU: Define the range for auto scaling. The minimum ACU provides reserved resources, and the system scales up to the maximum ACU based on the workload.
4. Click OK.
Bind the database accounts that require elasticity and isolation to this resource group.
1. In the Actions column for the target resource group, click the icon, and then click Associate with Account.
2. In the Database Account drop-down list, select a database account, click Associate with Account, and in the Associate with Account confirmation dialog box that appears, click OK.
  
  Note
  If the drop-down list is empty, you have not created any database accounts. To create an account, see Create a database account.
For details, see Bind or unbind database accounts to a resource group.

Resource group comparison

Auto-Scaling, Job, Reserved, and Multi-Cluster resource groups all use storage-compute separation but differ in scaling policies, concurrency/isolation methods, and billing.

Item	Auto-scaling resource group	Job resource group	Reserved resource group	Multi-cluster resource group
Scaling method	Supports on-demand scaling (scales as needed; zero reservation enables pay-per-query) and reactive scaling (scales out when reserved utilization is high).	Scheduled by task or job. Compute resources are requested and released according to the task lifecycle.	Fixed reserved capacity. No auto scaling; requires user-configured scaling plans.	Scales by replica. Scale-out is based on an evaluation of queue length and node resource utilization.
Query-adaptive scaling	Supported. Estimates required resources based on query complexity. Minimum scaling step: 16 ACU. A single query can use both reserved and elastic resources.	Not supported. The amount of elastic resources is set manually by the user.	Not applicable (no elastic resources).	Not supported. Scales by replica. The manually configured base resource amount sets the scaling step. Does not adapt to individual queries. A single query uses at most one resource group replica.
Concurrency and isolation	Query-level soft resource isolation. Concurrency is naturally controlled by resource estimation and requests. Does not rely on priority queues or queue concurrency settings. Supports isolation of task scheduling nodes (Coordinator).	Query-level hard resource isolation. Elastic nodes are provisioned on a per-job basis and are not reused across queries. Job concurrency is limited by the configured max ACU. Does not support isolation of task scheduling nodes.	Multiple queries share reserved resources. Relies on query queues for concurrency control. Does not support isolation of task scheduling nodes.	Multiple queries share one resource group replica. Relies on query queues to control the concurrency limit of a single replica. Supports isolation of task scheduling nodes.
Billing	Reserved resources + elastic resources (ACU-hours billed with minute-level granularity).	Billed based on the duration of elastic resources used by the task or job.	Billed based on reserved compute resources.	Reserved resources + elastic resources (ACU-hours billed with minute-level granularity).

Auto-Scaling resource groups provide adaptive scaling and query-level isolation, and can completely replace the previous Reserved, Job, and Multi-Cluster resource groups.