A Serverless Auto-Scaling resource group in AnalyticDB for MySQL uses a compute and storage disaggregation architecture that supports min-max auto scaling and query-level resource isolation. You only need to configure the minimum (min) and maximum (max) compute capacity. The minimum capacity provides reserved resources for baseline performance. The system automatically scales resources between the min and max thresholds based on your query workload, significantly improving query throughput and stability in high-concurrency scenarios.
Introduction
Key features
Feature | Description |
Compute and storage disaggregation | Compute and storage are separated at the node level. Storage nodes do not participate in query computation, which creates a cleaner architecture and allows for more flexible scaling. |
Min-max auto scaling with reserved capacity | You configure the minimum (min) and maximum (max) compute capacity. The minimum capacity provides reserved resources for baseline performance, while the system automatically scales resources between the min and max values based on workload. This eliminates the need to provision for peak capacity manually. |
Query-level resource isolation | Each query is allocated an independent resource quota, reducing interference between queries. This results in more stable latency and higher throughput under high concurrency. |
Automated operations | The system automatically adjusts resources within the min-max range based on workload, eliminating the need for daily manual scaling. |
Architecture
A Serverless Auto-Scaling resource group is deployed across multiple availability zones within a region. Stateful, multi-tenant service components use a three-replica design for disaster recovery. The nodes that execute queries can scale dynamically and support failover. This model supports zero reserved resources, eliminating the need to pre-provision any nodes. You can start submitting queries as soon as you create the resource group.
Feature | Description |
AZ disaster recovery | The region is divided into multiple deployments to isolate tenant groups and effectively contain the blast radius of a failure. Multi-tenant services within a deployment are deployed across three AZs, allowing for rapid self-healing and recovery from an availability zone failure. |
High availability | Coordinator and Executor nodes in the SQL execution path support failover. Independent Executor Control Service and Discovery Service components manage core metadata for resource requests and service discovery. This minimizes the impact of a single Coordinator or Executor node failure. |
High performance | By default, the system uses an availability zone-aware resource scheduling strategy. Both the Coordinator and Executor can scale adaptively. The Assignment Service handles autoscaling and load balancing for Coordinators. The Executor Control Service manages the autoscaling and resource requests for Executor nodes. |
Relationship with coupled compute and storage resource groups
Serverless Auto-Scaling resource groups (with compute and storage disaggregation) and coupled compute and storage resource groups are not mutually exclusive. They can coexist and complement each other.
In this document, "coupled compute and storage resource group" refers to the resource group formed by the reserved resources that you specify when you purchase an Enterprise Edition cluster. You can later create multiple Serverless Auto-Scaling resource groups, so a typical setup is one coupled compute and storage resource group plus multiple Serverless Auto-Scaling resource groups. You can create additional Serverless Auto-Scaling resource groups whenever you need compute and storage disaggregation, min-max auto scaling, or query-level isolation.
Type | Architecture | Use cases |
Serverless Auto-Scaling resource group | Compute and storage disaggregation: Compute and storage nodes are separate. Supports min-max auto scaling (min for reserved resources, with automatic scaling between min and max based on workload). | Scenarios where you need to scale compute and storage independently, handle workload peaks and troughs with min-max auto scaling, or require query-level isolation and high-concurrency throughput. |
Coupled compute and storage resource group | Coupled compute and storage: Storage nodes are directly involved in computation. Compute resources cannot be scaled independently. | Scenarios where you prefer tightly coupled compute and storage, have strong data locality requirements, or prefer a coupled deployment model. |
You can bind the database accounts of different services to different resource groups in the console based on your business requirements for architecture and elasticity. For more information, see Bind a database account. After routing queries to a Serverless Auto-Scaling resource group, you can take advantage of the compute and storage disaggregation architecture, min-max auto scaling, and query-level isolation.
Use cases
Scenario | Common challenges | Solution |
Extremely infrequent queries, no reserved compute | You run very few queries and do not want to reserve any compute resources for them. | Supports per-query scaling. You can set a zero or very low min capacity. Compute resources are provisioned automatically when the first query arrives and released after it completes, achieving a pay-per-query model. |
Idle resources and wasted costs | You overprovision resources to handle occasional large queries or peak workloads, leaving them idle most of the time. | You only need to reserve the min capacity for baseline performance. The system scales resources between min and max based on workload, and you are billed only for the elastic resources you use beyond the min capacity. |
Insufficient resources during bursts or peaks | Temporary data ingestion, reporting, or increased concurrent large SQL queries lead to queuing, timeouts, or failures. | The system automatically scales out up to the max capacity, eliminating the need to temporarily upgrade or manually provision for peak demand. |
Large query OOM errors or performance jitter | With limited instance capacity, large SQL queries can easily cause OOM errors or compete with online queries for resources, leading to unstable response times. | Large queries are executed within the elastic resource group. Query-level resource isolation and automatic scaling significantly improve success rates and stability. |
Low throughput under high concurrency | Multiple queries share the same pool of resources, leading to contention, latency jitter, and limited overall throughput. | Query-level resource isolation reduces contention, resulting in better query throughput and stability under high-concurrency pressure. |
Heavy burden of operations and capacity planning | You need to perform capacity planning based on business peaks and frequently scale resources manually. | The system automatically adjusts resources based on workload, eliminating the need for capacity planning and routine scaling operations. |
How autoscaling works
Mixed workloads with large and small queries
A Serverless Auto-Scaling resource group supports both large and small queries submitted simultaneously. An internal classifier distinguishes between these two workload types and routes them through different execution paths:
Query type | Resource evaluation/request | Scheduling method |
Small queries | No (bypassed) | Scheduled directly based on node load information. A query classified as small bypasses the resource evaluation and request process. The Coordinator schedules it directly to a physical node based on that node's load. This approach does not consume Executor node quotas and results in lower latency. |
Large queries | Yes | Scheduled after requesting an Executor node quota. After resource evaluation determines the required quota, a resource request is sent to the Executor Control Service to obtain an independent resource quota on an Executor node before execution. CPU resources are isolated between large queries, and the quota controls each query's memory usage. This prevents a single large query from causing an OOM error or degrading node performance for other queries. |
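The two execution paths in the table above can be pictured with a toy model. The sketch below is illustrative only: the cost threshold, the `Node` and `Query` types, and the `route` function are hypothetical names invented for this example. The real classifier and the Executor Control Service are internal to AnalyticDB for MySQL and may behave differently.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold: the real classifier's criteria are internal.
SMALL_QUERY_MAX_COST = 100


@dataclass
class Node:
    name: str
    load: float  # 0.0 (idle) to 1.0 (saturated)


@dataclass
class Query:
    sql: str
    estimated_cost: int  # stand-in for whatever the classifier measures


def route(query: Query, nodes: List[Node], free_quota_acu: int, needed_acu: int) -> str:
    """Return a label for the path a query would take in this toy model."""
    if query.estimated_cost <= SMALL_QUERY_MAX_COST:
        # Small query: skip resource evaluation and pick the least-loaded node.
        target = min(nodes, key=lambda n: n.load)
        return f"direct:{target.name}"
    # Large query: it must obtain an Executor quota before it can run.
    if needed_acu <= free_quota_acu:
        return f"quota:{needed_acu}"
    # Not enough free quota yet: the query waits (or a scale-out is triggered).
    return "wait"
```

In this model a small query never touches the quota, which mirrors the statement that small queries do not consume Executor node quotas.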
Scaling policy
Min-max auto scaling: You configure the minimum (min) and maximum (max) compute capacity. The minimum capacity provides reserved resources for baseline performance, while the system automatically scales resources between min and max based on workload.
Scaling triggers: The system makes automatic scaling decisions based on factors such as the complexity of large queries, query queue depth, and the CPU/memory usage of Executor nodes. No manual intervention is required.
Response time: Scale-out and scale-in operations take effect within seconds to minutes.
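The min-max behavior amounts to clamping estimated demand into the configured range. The following is a minimal sketch under stated assumptions: `target_capacity` is a hypothetical helper, the 16-ACU step is taken from the minimum scaling increment listed in the comparison table later in this document, and rounding demand up to whole steps is an assumption about how the service behaves, not documented behavior.

```python
import math

SCALING_STEP_ACU = 16  # minimum scaling increment listed in the comparison table


def target_capacity(demand_acu: float, min_acu: int, max_acu: int) -> int:
    """Clamp estimated demand into [min_acu, max_acu], rounding up to whole steps."""
    if min_acu < 0 or max_acu < min_acu:
        raise ValueError("expected 0 <= min_acu <= max_acu")
    # Round demand up to the next multiple of the scaling step (an assumption).
    stepped = math.ceil(demand_acu / SCALING_STEP_ACU) * SCALING_STEP_ACU
    # Never drop below the reserved minimum or exceed the configured maximum.
    return max(min_acu, min(stepped, max_acu))
```

For example, with min 32 and max 128, a demand of 50 ACUs maps to 64 ACUs in this model, while a demand of 500 ACUs is capped at 128.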
Workload isolation and concurrency
Query-level resource isolation: Each large query receives an independent resource quota, and the Executor ensures these resources are delivered. This significantly reduces resource contention and performance fluctuations between queries, improving query throughput and stability under high concurrency.
Concurrency control: With Serverless Auto-Scaling resource groups, system stability is less sensitive to configurations like priority queues and concurrency limits of Interactive resource groups. Small queries have an adaptive flow control mechanism that adjusts concurrency based on the physical resources of nodes. The pre-execution resource estimation and request process naturally manages the concurrency of large queries, requiring no extra configuration. We recommend setting a relatively high concurrency limit for the query queue (for example, 100) to fully leverage the scaling capabilities for handling traffic bursts.
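To see why quota requests bound the concurrency of large queries without a separate concurrency setting, consider the toy admission model below. `QuotaPool` and its methods are hypothetical, and first-in-first-out admission is an assumption made for this sketch. The actual Executor Control Service logic is not described in this document.

```python
from collections import deque


class QuotaPool:
    """Toy model: large queries run only while their quota fits under max_acu."""

    def __init__(self, max_acu: int) -> None:
        self.max_acu = max_acu
        self.in_use = 0
        self.waiting = deque()  # (query_id, acu) pairs in arrival order
        self.running = {}       # query_id -> ACUs held

    def submit(self, query_id: str, acu: int) -> bool:
        """Admit the query if its quota fits; otherwise queue it."""
        if acu > self.max_acu:
            raise ValueError("request exceeds max_acu and can never be admitted")
        if not self.waiting and self.in_use + acu <= self.max_acu:
            self.running[query_id] = acu
            self.in_use += acu
            return True
        self.waiting.append((query_id, acu))
        return False

    def finish(self, query_id: str) -> list:
        """Release a query's quota, then admit waiting queries in FIFO order."""
        self.in_use -= self.running.pop(query_id)
        admitted = []
        while self.waiting and self.in_use + self.waiting[0][1] <= self.max_acu:
            qid, acu = self.waiting.popleft()
            self.running[qid] = acu
            self.in_use += acu
            admitted.append(qid)
        return admitted
```

With a 64-ACU maximum, two 32-ACU queries run at once and a third waits until one finishes. The number of concurrent large queries follows from the quota sizes and is never configured directly.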
Prerequisites
The cluster kernel version must be 3.2.7 or later.
Note: To view and update the minor version, go to the Configuration Information section on the Cluster Information page in the AnalyticDB for MySQL console.
Serverless Auto-Scaling resource groups are currently available through an allowlist and are supported only in specific regions. To use this feature or request it in other regions, submit a ticket.
Limitations
Incompatible kernel features: Serverless Auto-Scaling resource groups do not support the following features. Use other resource group types if you need these features.
Dynamic partition import for the XUANWU_V1 storage engine (INSERT OVERWRITE tasks across multiple partitions of a partitioned table). The query will fail. We recommend migrating the target table to the XUANWU_V2 engine.
Paging Cache (deep paging performance optimization). The query will fail with the error "Resource group [xxx] is empty".
Workload Manager (WLM commands and workload management rules). Configured rules will not take effect.
Incremental materialized views for XUANWU series storage engines. The query will fail.
Recommended workloads: Analytical workloads such as ad hoc queries, reporting, and ETL.
Not recommended for: Online services that are extremely latency-sensitive and require fixed capacity. Also not recommended for writes or point lookups that rely on fixed execution plans and strong data locality. For these scenarios, consider using a coupled compute and storage resource group.
Quotas and limits: The maximum capacity and concurrency limits for a single resource group or account are those shown in the console and in the current version of the documentation.
Billing
Resource type | Description |
Reserved resources (the min portion) | Billed based on the provisioned ACUs. |
Elastic resources (the portion from min to max) | Billed based on the actual ACU-hours consumed. |
You can configure the max value to control the upper limit of scaling. If your cluster version supports usage limits or budget alerts, you can set up alerts to prevent cost overruns. For detailed information on billing items, pricing, and statements, see Billing overview.
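The two billing components can be combined into a rough estimate. In the sketch below, `estimate_cost` is a hypothetical helper, both unit prices are placeholders rather than real prices, and charging the reserved portion per ACU-hour is an assumption. See Billing overview for the actual billing items and prices.

```python
def estimate_cost(min_acu: int, hours: float, elastic_acu_hours: float,
                  reserved_price: float, elastic_price: float) -> float:
    """Estimated cost = reserved portion + elastic portion.

    Prices are per ACU-hour and are placeholders, not published prices.
    """
    if min(min_acu, hours, elastic_acu_hours, reserved_price, elastic_price) < 0:
        raise ValueError("all inputs must be non-negative")
    reserved_cost = min_acu * hours * reserved_price  # billed on provisioned ACUs
    elastic_cost = elastic_acu_hours * elastic_price  # billed on ACU-hours consumed
    return reserved_cost + elastic_cost
```

For instance, `estimate_cost(32, 24, 100, 0.5, 1.0)` models one day with 32 reserved ACUs and 100 elastic ACU-hours at made-up prices. Setting `min_acu` to 0 leaves only the elastic term, which corresponds to the pay-per-query case.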
Use a Serverless Auto-Scaling resource group
Create a Serverless Auto-Scaling resource group.
Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the left-side navigation pane, click Cluster Management > Resource Management. Click the Resource Groups tab. Then, in the upper-right corner of the resource group list, click Create Resource Group.
In the Create Resource Group dialog box, configure the following parameters.
Resource Group Name: Enter a name for the resource group.
Job Type: Select SQL.
Elastic Policy: Select Autoscaling.
Min ACU and Max ACU: The minimum ACU provides reserved resources for baseline performance. The system automatically scales resources between the minimum and maximum ACU values based on workload.
Click OK.
Bind the database accounts that require elastic scaling and isolation to this resource group.
In the Actions column for the target resource group, click the icon, and then click Associate with Account.
From the Database Account drop-down list, select a database account, click Associate with Account, and then click OK in the Associate with Account confirmation dialog box.
Note: If the drop-down list is empty, you have not created any database accounts. For more information, see Create a database account.
For more information, see Bind or unbind a database account.
Comparison with other resource group types
Serverless Auto-Scaling, Job, reserved, and Multi-Cluster resource groups all use a compute and storage disaggregation architecture. The main differences lie in their scaling policies, concurrency and isolation methods, and billing models.
Feature | Serverless Auto-Scaling resource group | Job resource group | Reserved resource group | Multi-Cluster resource group |
Scaling method | Supports on-demand scaling (resources are provisioned and released as needed, which enables a pay-per-query effect with zero reserved resources) and reactive scaling (automatically scales out when reserved resource utilization is high). | Scheduled by task or job. Compute resources are requested and released according to the task lifecycle. | Fixed, reserved capacity with no automatic scaling. Requires you to configure a scaling plan. | Scales by replica. A scale-out is triggered based on queue length and node resource utilization. |
Adaptive resource allocation per query | Supported. The system estimates the required resources based on query complexity. The minimum scaling increment is 16 ACUs. A single query can use both reserved and elastic resources. | Not supported. The user must manually set the amount of elastic resources. | Not applicable. | Not supported. Scales by replica. The manually configured base capacity determines the scaling increment. Resource allocation is not adaptive to individual queries. A single query can use at most one resource group replica. |
Concurrency and isolation | Soft query-level resource isolation. Resource estimation and requests naturally manage concurrency. Does not rely on priority queues or queue concurrency limits. Supports Coordinator isolation. | Hard query-level resource isolation. Nodes are provisioned on a per-job basis and cannot be shared across queries. The configured max ACU limits job concurrency. Does not support Coordinator isolation. | Multiple queries share reserved resources. Relies on mechanisms like query queues for concurrency control. Does not support Coordinator isolation. | Multiple queries share one resource group replica. Relies on query queues to control the concurrency limit of a single replica. Supports Coordinator isolation. |
Billing | Reserved resources + Elastic resources (billed per ACU-hour with minute-level granularity). | Billed based on the duration of elastic resources used per task or job. | Billed based on reserved compute resources. | Reserved resources + Elastic resources (billed per ACU-hour with minute-level granularity). |
Recommendation: A Serverless Auto-Scaling resource group provides adaptive scaling and query-level isolation capabilities, making it a comprehensive replacement for reserved, Job, and Multi-Cluster resource groups.