ACK Serverless clusters are built on Alibaba Cloud Elastic Container Instance (ECI) and provide elastic scaling capabilities. The cluster can scale out severalfold within minutes, or scale in quickly to reduce costs when demand drops.
This tutorial covers two scaling approaches:
-
Manual scaling: set the exact number of Pods in a Deployment on demand
-
Automatic scaling: configure a Horizontal Pod Autoscaler (HPA) that adjusts Pod count based on CPU utilization
Completing this tutorial costs approximately USD 0.5, assuming the resources run for 0.5 hours. Release the resources promptly after you finish.
Prerequisites
Before you begin, ensure that you have:
-
A web application deployed on your ACK Serverless cluster — see Quickly deploy an Nginx-based web application
Step 1: Install the metrics-server add-on
The metrics-server add-on collects CPU and memory metrics for the Pods and nodes in the cluster. HPA uses the Pod metrics to make scaling decisions.
-
Log on to the ACK console. In the left navigation pane, click Clusters.
-
Click the name of your cluster. In the left navigation pane, click Add-ons.
-
Click the Logs and Monitoring tab. Find the metrics-server card and click Install in the lower-right corner. Wait for the installation to complete.
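To confirm from the command line that the add-on is serving metrics, a check along the following lines may help. This is a minimal sketch: the function name metrics_ready is hypothetical, and it assumes kubectl is already configured for the cluster. It relies only on kubectl top, which fails until the metrics API responds.

```shell
# Hypothetical helper: succeeds once `kubectl top` can read Pod metrics
# in the given namespace (defaults to "default").
metrics_ready() {
  kubectl top pod -n "${1:-default}" >/dev/null 2>&1
}
```

For example, run metrics_ready default && echo "metrics available". Metrics may take a short while to appear after installation, so a first failure is not necessarily an error.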
Step 2: Scale a Deployment manually
Manual scaling lets you set an exact Pod count immediately, without waiting for metrics thresholds to trigger.
Console
-
Log on to the ACK console. In the left navigation pane, click Clusters.
-
Click the name of your cluster. In the left navigation pane, click Workloads > Deployments.
-
Click the Deployment named nginx-deploy to open its details page.
-
In the upper-right corner, click Scale. In the Scale panel, set Desired Pod Count to 10, then click OK. Refresh the page. Nine new Pods appear, bringing the total to 10, which confirms a successful scale-out.
-
Repeat the previous step and set Desired Pod Count to 1. Refresh the page. The count drops to 1, which confirms a successful scale-in.
kubectl
-
View the current state of the Deployment.
kubectl get deploy
Expected output:
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deploy   1/1     1            1           9m32s
-
Scale out to 10 Pods.
kubectl scale deploy nginx-deploy --replicas=10
Expected output:
deployment.extensions/nginx-deploy scaled
-
Verify the Pods are running.
kubectl get pod
Expected output (10 Pods, all Running):
NAME                            READY   STATUS    RESTARTS   AGE
nginx-deploy-55d8dcf755-8jlz2   1/1     Running   0          39s
nginx-deploy-55d8dcf755-9jbzk   1/1     Running   0          39s
nginx-deploy-55d8dcf755-bqhcz   1/1     Running   0          38s
nginx-deploy-55d8dcf755-bxk8n   1/1     Running   0          38s
nginx-deploy-55d8dcf755-cn6x9   1/1     Running   0          38s
nginx-deploy-55d8dcf755-jsqjn   1/1     Running   0          38s
nginx-deploy-55d8dcf755-lhp8l   1/1     Running   0          38s
nginx-deploy-55d8dcf755-r2clb   1/1     Running   0          38s
nginx-deploy-55d8dcf755-rchhq   1/1     Running   0          10m
nginx-deploy-55d8dcf755-xspnt   1/1     Running   0          39s
-
Scale in to 1 Pod.
kubectl scale deploy nginx-deploy --replicas=1
-
Verify the scale-in is complete.
kubectl get pod
Expected output:
NAME                            READY   STATUS    RESTARTS   AGE
nginx-deploy-55d8dcf755-bqhcz   1/1     Running   0          1m
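Instead of re-running kubectl get pod until the count matches, the scale and verify steps above could be combined into one call that blocks until the Deployment settles. The following is a sketch under assumptions: scale_and_wait is a hypothetical helper name, and the 120s default timeout is an arbitrary choice, not a value from this tutorial.

```shell
# Hypothetical helper: scale a Deployment, then block until the rollout
# reports completion or the timeout expires.
# Arguments: deployment name, replica count, timeout (default 120s).
scale_and_wait() {
  kubectl scale deployment/"$1" --replicas="$2" &&
    kubectl rollout status deployment/"$1" --timeout="${3:-120s}"
}
```

For example, scale_and_wait nginx-deploy 10 scales out, and scale_and_wait nginx-deploy 1 scales back in. A non-zero exit status means the scale request failed or the timeout was reached.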
Step 3: Configure automatic scaling with HPA
HPA automatically adjusts the number of Pods based on observed CPU utilization, keeping the average close to your target threshold.
Console
-
On the nginx-deploy details page, click the Pod Scaling tab.
-
In the HPA section, click Create.
-
In the Create panel, enter the following values and click OK.
Configuration item   Sample value
Name                 nginx-deploy
Metric               Click Add: Metric = CPU Usage, Threshold = 20%
Max. Containers      10
Min. Containers      1
kubectl
Create an HPA that targets 20% average CPU utilization, with a minimum of 1 Pod and a maximum of 10 Pods.
kubectl autoscale deployment nginx-deploy --cpu-percent=20 --min=1 --max=10
Expected output:
horizontalpodautoscaler.autoscaling/nginx-deploy autoscaled
Verify the HPA is created.
kubectl get hpa
Expected output:
NAME           REFERENCE                 TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-deploy   Deployment/nginx-deploy   0%/20%    1         10        1          35s
The TARGETS column shows 0%/20% — current CPU utilization is 0% because no load has hit the server yet. The Pod count is already at the minimum (1), so the HPA will not scale in further.
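The same policy can also be kept as a manifest, which is easier to review and version than a one-off command. The sketch below writes a roughly equivalent HorizontalPodAutoscaler to a file. The file name nginx-deploy-hpa.yaml is hypothetical, and the manifest assumes the cluster serves the autoscaling/v2 API (upstream Kubernetes 1.23 or later), which this tutorial does not state.

```shell
# Write an HPA manifest that mirrors:
#   kubectl autoscale deployment nginx-deploy --cpu-percent=20 --min=1 --max=10
# The file name is an illustrative choice.
cat > nginx-deploy-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deploy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deploy
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 20
EOF
```

To use it in place of kubectl autoscale, apply the file with kubectl apply -f nginx-deploy-hpa.yaml. In upstream Kubernetes, CPU utilization is measured as a percentage of each Pod's CPU request.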
Step 4 (Optional): Test the automatic scaling policy
Generate artificial CPU load to verify the HPA scales out and then scales back in.
Generate load
-
Log on to the ACK console. In the left navigation pane, click Clusters.
-
Click the name of your cluster. In the left navigation pane, click Workloads > Pods.
-
Find the test Pod and click Terminal in the Actions column. Select the test container to open a command-line interface.
-
Run the following command to create an infinite CPU load loop. Run this only on a test container to avoid disrupting production traffic.
while : ; do : ; done
-
Return to the console. Click More > Monitor in the Actions column to observe the CPU load.
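A time-limited variant of the load loop avoids leaving a runaway process behind if the terminal is closed. This is a sketch, not part of the original procedure: burn_cpu is a hypothetical helper name, and it assumes the timeout utility (from GNU coreutils or BusyBox) is present in the container image, which this tutorial does not confirm.

```shell
# Hypothetical helper: run the same busy loop, but stop it automatically
# after the given number of seconds (default 300).
burn_cpu() {
  timeout "${1:-300}" sh -c 'while : ; do : ; done'
}
```

For example, burn_cpu 600 keeps one CPU core busy for ten minutes and then exits on its own. With GNU timeout, the exit status is 124 when the limit is reached.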
Observe scale-out
Wait a few minutes and refresh the page. Four new Pods are created, bringing the total to 5.
At this point, one Pod runs at 100% CPU utilization while the other four Pods are at approximately 0%, bringing the average across all five Pods to approximately 20%. The scale-out is complete.
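The figure of five Pods follows from the replica formula in the upstream Kubernetes HPA documentation: desired replicas = ceil(current replicas × current utilization / target utilization). The sketch below reproduces that arithmetic with integer math. The function name desired_replicas is hypothetical, and it ignores the HPA's tolerance band, Pod readiness handling, and the min and max bounds.

```shell
# desired = ceil(current_replicas * current_utilization / target_utilization)
# Integer ceiling: (a + b - 1) / b
desired_replicas() {
  current="$1"; utilization="$2"; target="$3"
  echo $(( (current * utilization + target - 1) / target ))
}

desired_replicas 1 100 20   # one Pod at 100% with a 20% target: prints 5
desired_replicas 5 20 20    # five Pods averaging 20%: prints 5, so no further change
```

The second call shows why the scale-out stops at five: once the average matches the target, the formula returns the current replica count.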
Stop load and verify scale-in
-
Return to the container terminal. Press Ctrl+C to stop the loop.
If you closed the terminal page, open a terminal for the Pod again, run top to find the process, then run kill -9 <PID> to terminate it.
-
After 5–10 minutes, refresh the page. The number of Pods is reduced to 1, which confirms the automatic scale-in is complete.
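The delay before scale-in is consistent with the HPA scale-in stabilization window, which defaults to 300 seconds in upstream Kubernetes. Whether ACK changes that default is not stated here. Rather than refreshing the page, the replica count could be polled from the command line. In the sketch below, watch_hpa_scale_in is a hypothetical helper name, and the 900-second timeout and 30-second interval are arbitrary defaults.

```shell
# Hypothetical helper: print the HPA's current replica count at a fixed
# interval until it reaches the wanted value or the timeout passes.
# Arguments: HPA name, wanted replicas, timeout seconds, interval seconds.
watch_hpa_scale_in() {
  name="$1"; want="$2"; timeout="${3:-900}"; interval="${4:-30}"; waited=0
  while [ "$waited" -le "$timeout" ]; do
    replicas="$(kubectl get hpa "$name" -o jsonpath='{.status.currentReplicas}')"
    echo "t=${waited}s replicas=${replicas:-unknown}"
    [ "${replicas:-0}" -eq "$want" ] && return 0
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 1
}
```

For example, watch_hpa_scale_in nginx-deploy 1 returns successfully once the HPA reports a single replica.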
Step 5: Release resources
Delete the application and Service
-
In the ACK console, go to the Clusters page and click the name of your cluster.
-
In the left navigation pane, choose Workloads > Deployments. Select the Nginx application, click Batch Delete, and follow the on-screen instructions.
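The Deployment and its HPA could also be removed with kubectl. The following is a sketch under assumptions: cleanup_scaling_demo is a hypothetical helper name, and it assumes the HPA has the same name as the Deployment, as in the examples in this tutorial. The Service is not included because its name is not given here.

```shell
# Hypothetical helper: delete the HPA and the Deployment that share a name.
# --ignore-not-found keeps the command from failing if either is already gone.
cleanup_scaling_demo() {
  kubectl delete hpa "$1" --ignore-not-found
  kubectl delete deployment "$1" --ignore-not-found
}
```

For example, cleanup_scaling_demo nginx-deploy removes both objects created in this tutorial.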
Delete the cluster
ACK Serverless clusters do not incur cluster management fees. However, other Alibaba Cloud resources used by the cluster — such as ECI — are billed separately according to each product's billing rules.
-
To delete the cluster: Go to the ACK console, open the Clusters page, click More > Delete next to the target cluster, select the associated cloud resources to delete, and follow the on-screen instructions. For details, see Delete a cluster.
-
To keep the cluster: Maintain a sufficient account balance to avoid service interruption. For billing details, see Billing for cloud product resources.