How to configure auto scaling for a serverless cluster - PolarDB

A serverless cluster lets you define custom policies to control the upper and lower limits of auto scaling. You can also configure scheduled policies to scale up resources for predictable traffic peaks, such as promotions, and scale down during off-peak periods.

Auto scaling

Scale-up and scale-out triggers

Vertical scaling (scale-up)

PolarDB monitors the CPU utilization, memory usage, and other kernel-level metrics of the primary and read-only nodes. During a monitoring period, the system typically triggers a scale-up for a node when any of the following conditions is met:

The CPU utilization is higher than the preset threshold (default: 80%).

The memory usage is higher than a specific threshold:

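Serverless format	Memory specification	Scale-up threshold
Serverless cluster	Any	90%
Serverless feature for clusters with defined specifications	Less than or equal to 32 GB	90%
Serverless feature for clusters with defined specifications	64 GB	92%
Serverless feature for clusters with defined specifications	128 GB	96%
Serverless feature for clusters with defined specifications	256 GB to 512 GB	98%
Serverless feature for clusters with defined specifications	Other memory specifications	Vertical scaling is not supported.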

A read-only node's specifications are less than half of the primary node's.
For example, if a read-only node has specifications of 4 PCU and the primary node has specifications of 10 PCU, the read-only node is scaled up to at least 5 PCU.
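
The scale-up conditions above can be sketched as a small decision function. This is only an illustration of the documented rules, not PolarDB's actual implementation: the function and parameter names are hypothetical, the "other kernel-level metrics" are left out because they are not specified here, and passing `memory_gb=None` is an assumed way to represent a serverless cluster.

```python
from typing import Optional


def memory_scale_up_threshold(memory_gb: Optional[float] = None) -> Optional[float]:
    """Return the memory scale-up threshold in percent.

    memory_gb=None stands for a serverless cluster (hypothetical convention).
    Returns None when vertical scaling is not supported for the specification.
    """
    if memory_gb is None or memory_gb <= 32:
        return 90.0
    if memory_gb == 64:
        return 92.0
    if memory_gb == 128:
        return 96.0
    if 256 <= memory_gb <= 512:
        return 98.0
    return None  # other memory specifications


def should_scale_up(cpu_pct: float, mem_pct: float, node_pcu: float,
                    primary_pcu: float, is_read_only: bool,
                    memory_gb: Optional[float] = None,
                    cpu_threshold: float = 80.0) -> bool:
    """True if any documented scale-up condition is met for the node."""
    mem_threshold = memory_scale_up_threshold(memory_gb)
    if mem_threshold is None:
        return False  # vertical scaling is not supported
    if cpu_pct > cpu_threshold or mem_pct > mem_threshold:
        return True
    # A read-only node must not fall below half of the primary node's PCU.
    return is_read_only and node_pcu < primary_pcu / 2
```

With the example from the text, `should_scale_up(50, 50, node_pcu=4, primary_pcu=10, is_read_only=True)` is true because 4 PCU is less than half of 10 PCU.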

Horizontal scaling (scale-out)
If a read-only node in a cluster has scaled up to its configured maximum limit and still meets the scale-up trigger conditions (for example, its CPU utilization exceeds the custom threshold), a horizontal scale-out is triggered.

Scale-down and scale-in triggers

Vertical scaling (scale-down)

A scale-down is triggered for a node when its CPU utilization falls below the custom threshold (default: 50%) and its memory usage is below a specific threshold. The memory usage thresholds are as follows:

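Serverless format	Memory specification	Scale-down threshold
Serverless cluster	Any	80%
Serverless feature for clusters with defined specifications	Less than or equal to 32 GB	80%
Serverless feature for clusters with defined specifications	64 GB	86%
Serverless feature for clusters with defined specifications	128 GB	90%
Serverless feature for clusters with defined specifications	256 GB to 512 GB	94%
Serverless feature for clusters with defined specifications	Other memory specifications	Vertical scaling is not supported, so no scale-down threshold applies.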
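
As a rough sketch, the scale-down rule combines both conditions with a logical AND: CPU utilization and memory usage must both be below their thresholds. The names below are hypothetical, and `memory_gb=None` is an assumed way to represent a serverless cluster.

```python
from typing import Optional


def memory_scale_down_threshold(memory_gb: Optional[float] = None) -> Optional[float]:
    """Return the memory scale-down threshold in percent, or None if unsupported.

    memory_gb=None stands for a serverless cluster (hypothetical convention).
    """
    if memory_gb is None or memory_gb <= 32:
        return 80.0
    if memory_gb == 64:
        return 86.0
    if memory_gb == 128:
        return 90.0
    if 256 <= memory_gb <= 512:
        return 94.0
    return None  # other memory specifications: no vertical scaling


def should_scale_down(cpu_pct: float, mem_pct: float,
                      memory_gb: Optional[float] = None,
                      cpu_threshold: float = 50.0) -> bool:
    """True only if CPU and memory are both below their scale-down thresholds."""
    mem_threshold = memory_scale_down_threshold(memory_gb)
    if mem_threshold is None:
        return False
    return cpu_pct < cpu_threshold and mem_pct < mem_threshold
```

For instance, a node at 30% CPU but 85% memory on a serverless cluster is not scaled down, because the memory condition fails.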

Horizontal scaling (scale-in)
A scale-in is triggered for a read-only node if its CPU utilization remains below 15% and the CPU utilization of all other read-only nodes remains below 60% for a sustained period of 15 to 30 minutes.
Note
- To prevent node jitter, only one read-only node is scaled in at a time. The cooldown period between consecutive scale-in events is 15 to 30 minutes.
- To immediately scale in all read-only nodes, modify the Serverless Configuration. Setting both the Maximum Number of Read-only Nodes and Minimum Number of Read-only Nodes to 0 immediately triggers a scale-in of all read-only nodes.
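
The scale-in rule can be illustrated with a minimal sketch. It assumes the caller already has, for each read-only node, the highest CPU utilization observed over the sustained 15 to 30 minute window; the function name and input shape are hypothetical. At most one node is returned, matching the one-node-at-a-time behavior described in the note.

```python
from typing import Dict, Optional


def pick_scale_in_candidate(sustained_cpu: Dict[str, float]) -> Optional[str]:
    """Return the read-only node to scale in, or None if no node qualifies.

    sustained_cpu maps a node name to its peak CPU utilization (percent)
    over the observation window.
    """
    if not sustained_cpu:
        return None
    # The least-loaded node is the only one worth checking: if it fails,
    # no other node can satisfy the conditions either.
    name = min(sustained_cpu, key=sustained_cpu.get)
    if sustained_cpu[name] >= 15:
        return None
    others = (cpu for node, cpu in sustained_cpu.items() if node != name)
    if all(cpu < 60 for cpu in others):
        return name
    return None
```

For example, `pick_scale_in_candidate({"ro1": 10, "ro2": 70})` returns `None`, because removing `ro1` would shift load onto a node that is already above 60%.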

Note

The metrics for triggering scaling vary based on cluster parameter configurations and serverless configurations. You can specify thresholds for CPU scaling, but cannot change thresholds for other metrics.
When the workload of a serverless cluster suddenly increases, the nodes of the cluster are scaled step by step toward the expected specifications instead of in a single step. The minimum step size for node scaling is 0.5 PCU. To adapt quickly to the current workload, the size of the next scaling step increases based on the current number of PCUs per node.
To receive notifications when a node is scaled down, configure alert rules in the Performance Monitoring section of the console. For details, see Set up elastic monitoring.

Important notes

The maximum number of connections for a serverless cluster is 100,000, and the maximum IOPS is 84,000.
A serverless cluster uses PCU (PolarDB Capacity Unit) as the unit for both per-second billing and auto scaling resource management. One PCU is equivalent to approximately 1 CPU core and 2 GB of memory. The PCU of a node is dynamically adjusted based on the workload within the range that you define. The minimum scaling increment is 0.5 PCU.
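
The PCU arithmetic can be sketched as follows, using the approximate ratio stated above (1 PCU is about 1 CPU core and 2 GB of memory). The helper name is hypothetical, and the results are estimates rather than guaranteed allocations.

```python
from typing import Tuple


def pcu_to_resources(pcu: float) -> Tuple[float, float]:
    """Convert PCUs to approximate (CPU cores, memory in GB)."""
    if pcu <= 0:
        raise ValueError("pcu must be positive")
    return pcu * 1.0, pcu * 2.0


cores, memory_gb = pcu_to_resources(16)  # roughly 16 cores and 32 GB
```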

Configure serverless parameters

Log on to the PolarDB console, click Clusters in the left-side navigation pane, select your target region, and then click the cluster ID. On the Basic Information page, in the Database Nodes section, click Serverless Configuration.

Configure current parameters

In the Configure Serverless-related Parameters dialog box, click Edit to configure the following parameters.

Current Parameters
- Minimum Read-only Nodes: The minimum number of read-only nodes. The value can range from 0 to 15.
- Maximum Read-only Nodes: The maximum number of read-only nodes. The value can range from 0 to 15.
Note
- The number of read-only nodes automatically increases or decreases based on the actual workload, within the specified minimum and maximum limits. For more information about the scaling logic, see Auto scaling.
- To ensure high availability for the serverless cluster, we recommend setting Minimum Read-only Nodes to 1.
- Minimum Resources for Single Node: The minimum PCU for each node in the cluster. The value can range from 0.25 PCU to 32 PCU.
- Maximum Resources for Single Node: The maximum PCU for each node in the cluster. The value can range from 1 PCU to 32 PCU.
Note
Example: If you set Minimum Resources for Single Node to 2 PCU and Maximum Resources for Single Node to 16 PCU, the default specifications of the primary node and read-only nodes in the serverless cluster are 2 PCU (approximately 2 CPU cores and 4 GB of memory). When the system detects increased workloads, it automatically increases the number of PCUs for the primary node or read-only nodes. Based on your settings, the maximum is 16 PCU (approximately 16 CPU cores and 32 GB of memory).
- Read-only Column Store Nodes: The number of read-only column store nodes. The value can range from 0 to 15.
  Note
  To add a read-only column store node, the cluster must have at least one read-only node. This means you must first set Minimum Read-only Nodes to 1 or higher.
  For more information about read-only column store nodes, see columnar indexes (IMCI).
- Enable No-activity Suspension: Enable this feature to automatically suspend the cluster during periods of inactivity. When enabled, the cluster suspends if no connections are active for the specified Detection Period for No-activity Suspension. During the suspension period, you are still charged for storage on a pay-as-you-go basis. The cluster automatically resumes as soon as a new connection request is received.
- Detection Period for No-activity Suspension: The detection period can be set from 5 minutes to 24 hours and must be a multiple of 5 minutes.
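
Before you submit a configuration, the ranges listed above can be checked locally. The sketch below is illustrative only: the function and argument names are hypothetical and do not correspond to console fields or API parameters, and the rule that each minimum must not exceed its maximum is an assumption carried over from the scheduled policy parameters.

```python
from typing import List, Optional


def validate_serverless_config(min_ro: int, max_ro: int,
                               min_pcu: float, max_pcu: float,
                               column_store_nodes: int = 0,
                               suspend_minutes: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the values look valid."""
    errors = []
    if not 0 <= min_ro <= 15:
        errors.append("Minimum Read-only Nodes must be 0 to 15")
    if not 0 <= max_ro <= 15:
        errors.append("Maximum Read-only Nodes must be 0 to 15")
    if min_ro > max_ro:
        errors.append("Minimum Read-only Nodes exceeds the maximum")
    if not 0.25 <= min_pcu <= 32:
        errors.append("Minimum Resources for Single Node must be 0.25 to 32 PCU")
    if not 1 <= max_pcu <= 32:
        errors.append("Maximum Resources for Single Node must be 1 to 32 PCU")
    if min_pcu > max_pcu:
        errors.append("Minimum Resources for Single Node exceeds the maximum")
    if not 0 <= column_store_nodes <= 15:
        errors.append("Read-only Column Store Nodes must be 0 to 15")
    if column_store_nodes > 0 and min_ro < 1:
        errors.append("Column store nodes require Minimum Read-only Nodes >= 1")
    if suspend_minutes is not None:
        # 5 minutes to 24 hours, in 5-minute increments
        if not 5 <= suspend_minutes <= 24 * 60 or suspend_minutes % 5 != 0:
            errors.append("Detection period must be 5 to 1440 minutes, multiple of 5")
    return errors
```

For example, `validate_serverless_config(0, 4, 2, 16, column_store_nodes=1)` reports one problem, because a column store node requires at least one read-only node.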
Advanced Settings
You can adjust the advanced parameters based on the resource pressure on your serverless cluster.
- Scan Interval: This setting determines how quickly the cluster responds to workload changes. Select Standard or Sensitive.
- Maximum CPU Resources for Elastic Upgrade (Maximum): The CPU utilization threshold that triggers a scale-up action. The value can range from 40% to 100%.
- Minimum CPU Resources for Elastic Upgrade (Minimum): The CPU utilization threshold that triggers a scale-down action. The value can range from 10% to 70%.
Note
- The CPU scale-down threshold must be lower than the CPU scale-up threshold. The difference between the two thresholds must be at least 30 percentage points.
- Sensitive mode is suitable for workloads that are sensitive to transient load fluctuations (for example, short CPU spikes) and need a faster response. However, the cluster may scale up and down more frequently as the load changes.
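
The constraints on the two CPU thresholds can be expressed as a short check. This is a sketch with hypothetical names, based only on the ranges and the 30-point gap described above.

```python
from typing import List


def validate_cpu_thresholds(scale_up_pct: float, scale_down_pct: float) -> List[str]:
    """Return a list of problems; an empty list means the pair is acceptable."""
    errors = []
    if not 40 <= scale_up_pct <= 100:
        errors.append("scale-up threshold must be 40% to 100%")
    if not 10 <= scale_down_pct <= 70:
        errors.append("scale-down threshold must be 10% to 70%")
    if scale_up_pct - scale_down_pct < 30:
        errors.append("thresholds must differ by at least 30 percentage points")
    return errors
```

The defaults (80% and 50%) pass this check with exactly the minimum gap, whereas a pair such as 60% and 40% is rejected.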

Configure a scheduled policy

A scheduled policy scales resources based on a predefined schedule. This allows you to scale up resources for predictable traffic peaks, such as promotions, and scale down during off-peak periods.

Warning

Proceed with caution:

When you delete a scheduled policy, its pending tasks are canceled, but tasks that are already running cannot be canceled.
If you disable the serverless feature, all associated scheduled policies and their scheduled tasks are deleted.

In the Configure Serverless-related Parameters dialog box, click + Add Scheduled Policy. The parameters are described as follows:

Parameter	Value
Maximum Resources for Single Node (Maximum)	1 to 32 PCU.
Minimum Resources for Single Node (Minimum)	1 to 32 PCU. The minimum value must be less than or equal to the maximum value.
Maximum Read-only Nodes	A value from 0 to 15.
Minimum Read-only Nodes	A value from 0 to 15. The value cannot be greater than the value of Maximum Read-only Nodes.
Read-only Column Store Nodes	A value from 0 to 15.
Start/End Time	The effective time range for the scheduled policy.
Policy Scheduling	The schedule settings for the policy. Monthly: Select specific days of the month and a time for the policy to run. You can count days from the beginning of the month or from the end (Last). Separate multiple days with a comma (`,`), for example, `1,3,5`. Weekly: Select specific days of the week (Monday to Sunday) and a time for the policy to run. Daily: Select a specific time of day for the policy to run.

Note

After you configure a scheduled policy, the system adjusts the cluster's Serverless configuration parameters according to the scheduled time of the policy within the specified Start/End Time. After the adjustment is complete, the cluster's Serverless configuration parameters will not be automatically restored. If you need to restore the original configuration parameters at a specific time, configure another scheduled policy. For a detailed example, see Example.

View scheduled tasks. You can view tasks in two ways:
Note
Scheduled policies generate the underlying scheduled tasks that execute the scaling actions.
- After a scheduled policy is created, you can view its tasks on the cluster details page.
  On the cluster details page, in the Pending Scheduled and Failed Tasks section, you can view the tasks generated by the policy. This includes the task ID, status, action (such as ModifyDBClusterServerlessConf), planned start time, and planned end time. You can Cancel the task or Modify the scheduled time.
- Go to Task Management > Scheduled Tasks in the console.
  In the task list on this page, you can view the status of each scheduled task (such as Pending or Canceled) and Cancel any tasks that are in the Pending state.

Example

Scenario: Scale resources up to 5 PCU at 9:30 AM and down to 1 PCU at 10:00 PM on weekdays from August 1 to September 30. This requires two separate policies:

For the scale-up policy (runs at 09:30), set both Maximum Resources for Single Node and Minimum Resources for Single Node to 5 PCU. Set both Maximum Read-only Nodes and Minimum Read-only Nodes to 5.

For the scale-down policy (runs at 22:00), set Maximum Resources for Single Node and Minimum Resources for Single Node to 1 PCU, and set Maximum Read-only Nodes and Minimum Read-only Nodes to 1.
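
To see which tasks the two policies in this example would produce, the schedule can be modeled as plain data. The sketch below is only an illustration: the `ScheduledPolicy` class and its fields are hypothetical and are not a PolarDB API, and the year 2025 is an assumption, since the scenario does not state one.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta
from typing import List


@dataclass(frozen=True)
class ScheduledPolicy:
    name: str
    run_at: time
    min_pcu: float
    max_pcu: float
    min_ro_nodes: int
    max_ro_nodes: int


def weekday_runs(policy: ScheduledPolicy, start: date, end: date) -> List[datetime]:
    """List the run times on Monday to Friday between start and end, inclusive."""
    runs = []
    day = start
    while day <= end:
        if day.weekday() < 5:  # 0 = Monday ... 4 = Friday
            runs.append(datetime.combine(day, policy.run_at))
        day += timedelta(days=1)
    return runs


scale_up = ScheduledPolicy("scale-up", time(9, 30), 5, 5, 5, 5)
scale_down = ScheduledPolicy("scale-down", time(22, 0), 1, 1, 1, 1)

# Assumed year: 2025 (August 1 falls on a Friday).
up_runs = weekday_runs(scale_up, date(2025, 8, 1), date(2025, 9, 30))
down_runs = weekday_runs(scale_down, date(2025, 8, 1), date(2025, 9, 30))
```

Each scale-up run at 09:30 is paired with a scale-down run at 22:00 on the same day, which reflects the point that a scheduled change is not reverted automatically and needs its own policy.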