Serverless cluster FAQ
This topic covers common questions about Serverless clusters.
Basic concepts
What does PCU mean for a Serverless cluster?
PCU stands for PolarDB Capacity Unit. One PCU is roughly equivalent to the standard service capacity of 1 vCPU and 2 GB of memory. PCU is the unit used by PolarDB Serverless clusters to manage resource elasticity. The smallest elastic increment is 0.5 PCU.
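As a quick sanity check of the PCU-to-resource equivalence, here is a minimal Python sketch (the function name is illustrative and not part of any PolarDB API):

```python
def pcu_to_resources(pcu: float) -> tuple[float, float]:
    """Convert a PCU value to approximate (vCPUs, memory in GB).

    1 PCU is roughly 1 vCPU and 2 GB of memory; the smallest
    elastic increment is 0.5 PCU.
    """
    if pcu * 2 != int(pcu * 2):
        raise ValueError("PCU must be a multiple of 0.5")
    return (pcu, pcu * 2)

# e.g. 2.5 PCU is roughly 2.5 vCPUs and 5 GB of memory
```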
What is the maximum storage capacity of a PolarDB Serverless cluster?
The storage limit for a Serverless cluster is 500 TB.
Does PolarDB Serverless support hot standby storage clusters?
Yes. You can enable a Hot Standby Cluster when creating a PolarDB Serverless cluster.
Purchase and usage
How is billing handled for PolarDB Serverless clusters?
Serverless clusters are billed by the second, based on the average PCU usage during each billing interval. For details, see Serverless billing overview.
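As a back-of-the-envelope illustration of second-level billing based on average PCU usage (the unit price here is a placeholder, not an actual rate; see the billing overview for real prices):

```python
def billing_cost(pcu_samples: list[float], price_per_pcu_second: float) -> float:
    """Estimate the cost of one billing interval from per-second PCU samples.

    Cost = average PCU usage over the interval x interval length x unit price.
    """
    if not pcu_samples:
        return 0.0
    avg_pcu = sum(pcu_samples) / len(pcu_samples)
    return avg_pcu * len(pcu_samples) * price_per_pcu_second

# One hour at a steady 2 PCU costs 2 * 3600 * unit_price
```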
Elastic scaling
What is the elastic range for PolarDB Serverless?
A single node (RW or RO) supports an elastic range of 1 to 32 PCU (approximately 32 vCPUs and 64 GB of memory). You can add up to 15 RO nodes. Therefore, the theoretical maximum compute capacity for the entire cluster is 32 × 16 = 512 PCU. You can configure the per-node elastic range and the number of RO nodes in the console. For details, see Configure elastic scaling policies for Serverless clusters.
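The 512 PCU ceiling follows directly from the per-node maximum multiplied by the node count (1 RW node plus up to 15 RO nodes); as a sketch:

```python
def max_cluster_pcu(per_node_max: float = 32, ro_nodes: int = 15) -> float:
    """Theoretical compute ceiling: the RW node plus all RO nodes,
    each running at the per-node maximum."""
    return per_node_max * (1 + ro_nodes)

# Defaults reproduce the documented ceiling: 32 x 16 = 512 PCU
```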
How long does elastic scaling take for a PolarDB Serverless cluster?
In-place scaling for a single node takes less than 10 seconds (cross-host scaling takes less than 30 seconds). Elastic scaling time consists of three parts: detection time + decision time + execution time. Detection time is 5 seconds. Decision time and execution time each take less than 1 second. When a single node reaches its configured upper limit (for example, the per-node elastic range is set to 1–16 PCU, and the RW node hits 16 PCU) while current service traffic still exceeds the cluster’s processing capacity, the system automatically adds RO nodes (up to your configured maximum). Adding an RO node takes about 1 minute. When service traffic drops and RO nodes become idle, they are reclaimed and deleted. To avoid frequent additions and deletions under periodic workloads, the decision time for deleting RO nodes is on the order of minutes.
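Summing the stated components gives a rough latency estimate for a single scaling action (the constants are the nominal figures from this answer, not guarantees):

```python
def estimated_scaling_seconds(detection: float = 5.0,
                              decision: float = 1.0,
                              execution: float = 1.0) -> float:
    """Total elastic scaling latency = detection + decision + execution."""
    return detection + decision + execution

# Roughly 7 s with the nominal figures, within the stated <10 s
# bound for in-place scaling of a single node
```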
What metrics trigger scale-up for a PolarDB Serverless cluster?
Trigger conditions for scale-up and scale-out
Scale-up (node upgrade)
PolarDB monitors the CPU utilization, memory usage, and other kernel-level metrics of the primary and read-only nodes. During a monitoring cycle, a scale-up is typically triggered for a node if one of the following conditions is met:
The CPU usage is higher than the preset threshold (default: 80%).
The memory usage is higher than a specific threshold:
| Serverless type | Memory specification | Scale-up threshold |
| --- | --- | --- |
| Serverless cluster | All | 90% |
| Serverless feature for clusters with defined specifications | 32 GB or less | 90% |
| | 64 GB | 92% |
| | 128 GB | 96% |
| | 256 GB to 512 GB | 98% |
| | Other memory specifications | Scale-up is not supported. |
The specifications of a read-only node are less than half of the specifications of the primary node.
For example, if a read-only node has specifications of 4 PCU and the primary node has specifications of 10 PCU, the read-only node is scaled up to at least 5 PCU.
Scale-out (add a node)
If a read-only node in a cluster is scaled up to its maximum specifications and the scale-up threshold is still met (for example, if the CPU usage is higher than the custom threshold), a scale-out is triggered to add more read-only nodes.
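The scale-up triggers above can be sketched as a single predicate. This is a simplification that uses the default CPU threshold and the plain Serverless cluster memory threshold; the function and parameter names are illustrative, not a PolarDB API:

```python
def should_scale_up(cpu_pct: float, mem_pct: float,
                    ro_pcu: float, rw_pcu: float,
                    cpu_threshold: float = 80.0,
                    mem_threshold: float = 90.0) -> bool:
    """Return True if any documented scale-up condition holds for an RO node."""
    return (
        cpu_pct > cpu_threshold       # CPU above the preset threshold
        or mem_pct > mem_threshold    # memory above the spec-dependent threshold
        or ro_pcu < rw_pcu / 2        # RO node smaller than half the RW node
    )
```

If the node is already at its maximum specification and the predicate still holds, the system moves on to scale-out instead and adds an RO node.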
Trigger conditions for scale-down and scale-in
Scale-down (node downgrade)
A scale-down is triggered for a node when its CPU usage is below the specified threshold (default: 50%) and its memory usage is below a specific threshold. The memory thresholds are as follows:
| Serverless type | Memory specification | Scale-down threshold |
| --- | --- | --- |
| Serverless cluster | All | 80% |
| Serverless feature for clusters with defined specifications | 32 GB or less | 80% |
| | 64 GB | 86% |
| | 128 GB | 90% |
| | 256 GB to 512 GB | 94% |
| | Other memory specifications | Scale-up is not supported, so no scale-down threshold applies. |
Scale-in (remove a node)
A scale-in is triggered for a read-only node if its CPU usage stays below 15% and the CPU usage of all other read-only nodes stays below 60% for 15 to 30 minutes.
Note: To prevent node jitter, only one read-only node is scaled in at a time. The cool-down period between consecutive scale-in activities is 15 to 30 minutes.
To trigger an immediate scale-in of all read-only nodes, modify the Serverless Configuration. Set both the Maximum Number Of Read-only Nodes and the Minimum Number Of Read-only Nodes to 0.
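The scale-in condition can be expressed as a predicate over the CPU readings collected during the observation window (a simplified sketch; collecting the 15- to 30-minute window of samples is assumed to happen elsewhere):

```python
def can_scale_in(candidate_cpu_history: list[float],
                 other_nodes_cpu_history: list[list[float]]) -> bool:
    """True if the candidate RO node stayed below 15% CPU and every
    other RO node stayed below 60% CPU for the whole window."""
    if not candidate_cpu_history:
        return False
    return (
        max(candidate_cpu_history) < 15.0
        and all(max(h) < 60.0 for h in other_nodes_cpu_history if h)
    )
```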
Do maximum connections and maximum IOPS for a Serverless cluster change during scale-up?
These limits do not change during scale-up. For a Serverless cluster, the maximum number of connections is 100,000 and the maximum IOPS is 84,000.
When the Serverless feature is enabled for a cluster with defined specifications, the maximum number of connections is also 100,000. The scalable IOPS is directly proportional to the configured upper limit for resource scaling of a single Serverless node.
What does it mean that larger PCU values lead to larger scaling increments in a PolarDB Serverless cluster?
When service traffic spikes, a Serverless cluster does not scale directly to the required specification in one step. Instead, it scales incrementally toward the target. The minimum scaling increment is 0.5 PCU. To adapt faster to current traffic, the system automatically increases the next scaling increment based on the current PCU level.
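One way to picture growing increments is a loop whose step size scales with the current PCU level. The 25% growth rule below is purely an illustrative assumption; the FAQ only states that increments start at 0.5 PCU and grow with the current level:

```python
def scale_toward(current: float, target: float) -> list[float]:
    """Step from `current` toward `target`, growing the increment with size.

    Illustrative policy: increment = max(0.5, ~25% of current PCU),
    kept as a multiple of 0.5 (the minimum scaling increment).
    """
    path = [current]
    while current < target:
        step = max(0.5, round(current * 0.25 * 2) / 2)
        current = min(current + step, target)
        path.append(current)
    return path

# scale_toward(1.0, 8.0) climbs in progressively larger steps
# rather than jumping to the target in one move
```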
Why does memory usage show as 100% when a PolarDB Serverless cluster runs at 1 PCU?
Because the maximum per-node specification for a Serverless cluster is 32 PCU, certain kernel modules reserve memory space to enable rapid scale-up from 1 PCU. Therefore, the PolarDB console shows 100% memory usage at 1 PCU.
Can I set the same value for the upper and lower limits of per-node PCU scaling?
Yes. However, if you set identical upper and lower limits, the Serverless cluster will not scale with changing service traffic, which may impact your workload. Set a reasonable per-node PCU scaling range.
Why doesn’t the system automatically scale down (release) a horizontally added read-only node if its load is low?
Horizontal scale-down (node removal) is triggered only when a read-only node’s CPU usage stays below 15%, all other read-only nodes’ CPU usage stays below 60%, and this condition persists for 15 to 30 minutes.
To prevent node jitter, only one read-only node is scaled down at a time, and there is a 15- to 30-minute cool-down period between consecutive scale-down operations.
To immediately remove all read-only nodes, modify your Serverless configuration and set both the Maximum Number Of Read-only Nodes and the Minimum Number Of Read-only Nodes to 0.
Cross-host scaling
If local resources are insufficient for further vertical scaling, cross-host scaling is triggered, migrating the cluster to a host with more available resources. Cross-host scaling is enabled by default. If you do not want this feature enabled, submit a ticket to request it be disabled.
Business impact
Migrating a compute node typically takes 5–10 minutes.
During migration of the primary (read/write) node, you may experience 1–2 transient disconnections lasting 30–90 seconds. If your cluster has failover with hot replica enabled and binary logging (Binlog) disabled, primary node migration causes virtually no transient disconnection.
Migrating a read-only node has no impact on write operations.
Other
What does strong transaction consistency mean for a PolarDB Serverless cluster?
Strong transaction consistency ensures coordination between read transactions on RO nodes and the RW node, guaranteeing that data read from RO nodes complies with the ACID attributes of cluster-wide transactions.
In fixed-specification clusters, the number of RO nodes is fixed. You can configure an appropriate consistency policy based on the topology associated with the cluster endpoint.
In Serverless clusters, only one RW node exists by default, and RO nodes scale automatically based on load. Because read/write splitting cannot be predicted, always configure global consistency (high-performance mode).
How do I migrate a standard PolarDB cluster to a Serverless cluster?
You can migrate a standard PolarDB cluster to a Serverless cluster by using Data Transmission Service (DTS).
Can I upgrade a standard PolarDB cluster (subscription or pay-as-you-go) to a Serverless cluster?
Yes. You can enable the Serverless feature on standard PolarDB clusters (subscription or pay-as-you-go), creating what is known as a fixed-specification cluster with Serverless functionality. For details, see Enable the Serverless feature on fixed-specification clusters.
Does enabling the Serverless feature on a fixed-specification cluster cause transient disconnections?
When enabling the Serverless feature on a fixed-specification cluster, cluster migration (to an idle host) may occur if the current host is resource-constrained. Enable this feature during off-peak hours. For details, see Manage the Serverless feature on fixed-specification clusters.
If you connect to the database using the primary endpoint, you may experience a 5–10 second transient disconnection during migration.
If you connect using the cluster endpoint, no transient disconnection occurs during migration. Use the cluster endpoint and ensure failover with hot replica is active. For more information, see Endpoints (primary, cluster, and custom) and Failover with hot replica.