
Container Service for Kubernetes:Configure elastic scaling for a cluster

Last Updated:Nov 27, 2025

Built on Alibaba Cloud Elastic Container Instance, an ACK Serverless cluster provides powerful elastic scaling capabilities. Based on the policies that you configure, the computing power of an ACK Serverless cluster can scale out severalfold within a short period, or scale in quickly to reduce costs when demand drops. This topic describes how to directly control the number of pods in a cluster and how to configure automatic scaling policies based on load.

Important

Completing this tutorial is expected to cost approximately USD 0.5, assuming the resources run for 0.5 hours. We recommend that you release the resources promptly after you complete the tutorial.

Prerequisites

A web application is deployed. For more information, see Quickly deploy an Nginx-based web application.

Step 1: Install the metrics-server component

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, find the cluster you want to manage and click its name. In the left navigation pane, click Add-ons.

  3. Click the Logs and Monitoring tab. Find the metrics-server card and click Install in the lower-right corner. Wait for the installation to complete.
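After the installation completes, you can verify from the command line that metrics-server is serving resource metrics. The following is a quick sanity check, assuming kubectl is already configured to access the cluster:

```shell
# Check that the resource metrics API is registered and available
# (v1beta1.metrics.k8s.io is the standard APIService exposed by metrics-server).
kubectl get apiservice v1beta1.metrics.k8s.io

# If metrics-server is working, resource usage is reported for nodes and pods.
kubectl top node
kubectl top pod
```

If `kubectl top` returns usage figures instead of an error, the component is ready and the HPA configured later in this topic can read CPU metrics.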

Step 2: Scale an application

Console

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, find the cluster you want to manage and click its name. In the left navigation pane, choose Workloads > Deployments.

  3. Click the Deployment named nginx-deploy to go to its details page.

  4. In the upper-right corner, click Scale. In the Scale panel, set Desired Number of Pods to 10, and then click OK.

    Refresh the page. The creation of nine new pods indicates a successful scale-out.

  5. Repeat Step 4 to change the number of pods to 1.

    Refresh the page. The number of pods is reduced to one, which indicates a successful scale-in.

kubectl

  1. Run the following command to view the details of the deployment.

    kubectl get deploy

    Expected output:

    NAME           READY   UP-TO-DATE   AVAILABLE   AGE
    nginx-deploy   1/1     1            1           9m32s
  2. Run the following command to scale out the number of pods in the deployment to 10.

    kubectl scale deploy nginx-deploy --replicas=10

    Expected output:

    deployment.extensions/nginx-deploy scaled
  3. Run the following command to view the pods.

    kubectl get pod

    Expected output:

    NAME                            READY   STATUS    RESTARTS   AGE
    nginx-deploy-55d8dcf755-8jlz2   1/1     Running   0          39s
    nginx-deploy-55d8dcf755-9jbzk   1/1     Running   0          39s
    nginx-deploy-55d8dcf755-bqhcz   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-bxk8n   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-cn6x9   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-jsqjn   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-lhp8l   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-r2clb   1/1     Running   0          38s
    nginx-deploy-55d8dcf755-rchhq   1/1     Running   0          10m
    nginx-deploy-55d8dcf755-xspnt   1/1     Running   0          38s
  4. Run the following command to scale in the number of pods in the deployment to 1.

    kubectl scale deploy nginx-deploy --replicas=1

    Expected output:

    deployment.extensions/nginx-deploy scaled
  5. Run the following command to view the pods.

    kubectl get pod

    Expected output:

    NAME                            READY   STATUS    RESTARTS   AGE
    nginx-deploy-55d8dcf755-bqhcz   1/1     Running   0          1m
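As an alternative to the imperative kubectl scale command, you can manage the replica count declaratively by setting spec.replicas in the Deployment manifest and applying it, so the desired count survives re-applies of the file. The following is a minimal sketch; the container name, image, and labels are illustrative and may differ from the application deployed in the prerequisite tutorial:

```shell
# Declarative scaling: keep the desired replica count in the manifest.
cat <<'EOF' > nginx-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deploy
spec:
  replicas: 10          # desired number of pods
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
EOF
kubectl apply -f nginx-deploy.yaml
```

Note that a value set with `kubectl scale` is overwritten the next time the manifest is applied, so pick one approach and use it consistently.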

Step 3: Configure load-based automatic scaling

Console

  1. On the application details page, click the Pod Scaling tab.

  2. In the HPA section, click Create.

  3. In the Create panel, enter the following sample values and click OK.

    • Name: nginx-deploy

    • Metric: Click Add, then set Metric to CPU Usage and Threshold to 20%.

    • Max. Containers: 10

    • Min. Containers: 1

kubectl

  1. Create a metric-based scaling policy.

    This policy ensures that the deployment has a minimum of 1 pod and a maximum of 10 pods. The deployment automatically scales to maintain an average CPU utilization of approximately 20% across all pods.

    kubectl autoscale deployment nginx-deploy --cpu-percent=20 --min=1 --max=10

    Expected output:

    horizontalpodautoscaler.autoscaling/nginx-deploy autoscaled
  2. View the information about the metric-based scaling policy.

    kubectl get hpa

    Expected output:

    NAME           REFERENCE                 TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
    nginx-deploy   Deployment/nginx-deploy   0%/20%    1         10        1          35s
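The kubectl autoscale command above is shorthand for creating a HorizontalPodAutoscaler object. The same policy can be written as a manifest, which is easier to keep in version control. The following sketch uses the autoscaling/v2 API, which assumes Kubernetes 1.23 or later; on older clusters, use autoscaling/v2beta2 instead:

```shell
# Equivalent HPA policy as a declarative manifest.
kubectl apply -f - <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deploy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deploy
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 20   # target average CPU utilization across pods
EOF
```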

Step 4 (Optional): Test the automatic scaling policy

You can test the automatic scaling policy by adding load to the containers in the cluster. To add load to a container, perform the following steps.

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, find the cluster you want and click its name. In the left-side pane, choose Workloads > Pods.

  3. On the Pods page, find the pod that you want to use for the test and click Terminal in the Actions column. Select the container to open the command-line interface.

  4. The following command runs an infinite loop in the container to maximize the CPU load. Run this command only on a test container to avoid disrupting your services.

    while : ; do : ; done
  5. Return to the console. Click More > Monitor in the Actions column to observe the CPU load of the container.

  6. Wait a few minutes and refresh the page. Four new pods are created.

    At this point, the CPU load of one pod reaches 100%, while the CPU loads of the other four pods are approximately 0%. The average CPU load across all pods in the deployment is therefore approximately 20%, which matches the configured threshold. This indicates that the scale-out is complete and the cluster is stable.

  7. Return to the container terminal page that you opened in Step 3. Press Ctrl+C to end the loop and return the container's CPU load to normal.

    Note

    If you closed the container terminal page, run the top command to view processes. Then, run the kill -9 <PID> command to terminate the process with 100% CPU load.

  8. Return to the console. After 5 to 10 minutes, refresh the page. You will see that the number of pods is reduced to one.
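The pod count observed in the test follows the HPA scaling formula: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). With one pod at 100% CPU and a 20% target, the HPA scales to 5 pods, which is why four new pods appear. The ceiling division can be checked with plain shell arithmetic:

```shell
# HPA formula: desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
current_replicas=1
current_cpu=100   # one fully loaded pod, in percent
target_cpu=20     # threshold configured in the HPA

# Integer ceiling division: ceil(a / b) == (a + b - 1) / b
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))
echo "desired replicas: $desired"   # prints "desired replicas: 5"
```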

Step 5: Release resources

If you no longer need to use the cluster, release its resources.

Delete the created application and service

  1. On the Clusters page of the Container Service for Kubernetes (ACK) console, click the name of the target cluster.

  2. In the navigation pane on the left, choose Workloads > Deployments. Select the nginx-deploy application that you created, click Batch Delete, and then follow the on-screen instructions to confirm the deletion.

Delete a cluster

ACK Serverless clusters are in public preview and offer a free trial. However, you must pay for the other Alibaba Cloud services that your ACK Serverless clusters use, based on the billing rules of those services; these services charge their fees separately. After you complete the configuration, you can manage the cluster in one of the following ways:

  • If you no longer need the cluster, log on to the ACK console. On the Clusters page, choose More > Delete in the Actions column of the cluster to delete the cluster. In the Delete Cluster dialog box, select Delete ALB Instances Created by the Cluster, Delete Alibaba Cloud DNS PrivateZone instances Created by the Cluster, and I understand the above information and want to delete the specified cluster, and then click OK. For more information about how to delete an ACK Serverless cluster, see Delete an ACK Serverless cluster.

  • If you want to continue to use the cluster, recharge your Alibaba Cloud account at least 1 hour before the trial period ends and ensure that your account has a balance of at least CNY 100. For more information about the billing of Alibaba Cloud services used by ACK Serverless Pro clusters, see Cloud service fee.