Container Service for Kubernetes: Horizontal pod autoscaling

Last Updated: Aug 25, 2023

You can create an application that has Horizontal Pod Autoscaling (HPA) enabled in the Container Service for Kubernetes (ACK) console. HPA automatically adjusts the number of pod replicas of your application based on resource usage. You can also use a YAML file to configure HPA settings.

Prerequisites

Create an application that has HPA enabled in the ACK console

  1. Log on to the ACK console.

  2. In the left-side navigation pane of the ACK console, click Clusters.

  3. On the Clusters page, find the cluster that you want to manage and click the name of the cluster or click Details in the Actions column. The details page of the cluster appears.

  4. In the left-side navigation pane of the details page, choose Workloads > Deployments.

  5. On the Deployments page, click Create from Image.

  6. On the Basic Information wizard page, enter a name for your application, set the parameters, and then click Next.

    • Namespace: Select the namespace to which the application belongs. The default namespace is automatically selected.
    • Name: Enter a name for the application.
    • Replicas: The number of pods that you want to provision for the application. Default value: 2.
    • Type: The type of the application. You can select Deployment, StatefulSet, Job, CronJob, or DaemonSet.
    • Label: Add labels to the application. The labels are used to identify the application.
    • Annotations: Add annotations to the application.
    • Synchronize Timezone: Specifies whether to synchronize the time zone between nodes and containers. This parameter is supported only by ACK clusters. Serverless Kubernetes (ASK) clusters do not support this parameter.

  7. On the Container wizard page, set the container parameters, select an image, and then configure the required computing resources. Click Next. For more information, see Configure containers.

    Note

    You must configure the computing resources required by the Deployment. Otherwise, you cannot enable HPA.

  8. On the Advanced wizard page, click Create to the right of Services in the Access Control section, and then set the parameters. For more information, see Configure advanced settings.

  9. On the Advanced wizard page, select Enable for HPA and configure the scaling threshold and related settings.
    • Metric: Select CPU Usage or Memory Usage. The selected resource type must be the same as the one you have specified in the Required Resources field.
    • Condition: Specify the resource usage threshold. HPA triggers scaling events when the threshold is exceeded. For more information about the algorithms that are used to perform horizontal pod autoscaling, see Algorithm details.
    • Max. Replicas: Specify the maximum number of pods to which the Deployment can be scaled.
    • Min. Replicas: Specify the minimum number of pods that must run for the Deployment.
  10. In the lower-right corner of the Advanced wizard page, click Create to create the application that has HPA enabled.
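
The Condition threshold configured above feeds the scaling calculation described in Algorithm details. As a rough illustration, the following Python sketch applies the replica formula from the upstream Kubernetes documentation: desired replicas = ceil(current replicas × current metric value / target metric value). The function name, its parameters, and the 0.1 tolerance (the upstream default) are assumptions made for illustration and are not part of the ACK console. The real controller also accounts for pod readiness, missing metrics, and stabilization windows, which this sketch ignores.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified sketch of the HPA replica calculation (hypothetical helper).

    current_utilization and target_utilization are percentages, for example
    80 and 50. The result is clamped to [min_replicas, max_replicas].
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")

    ratio = current_utilization / target_utilization

    # No scaling happens while the ratio stays close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Respect the Min. Replicas and Max. Replicas settings.
    return max(min_replicas, min(max_replicas, desired))


# Two pods at 100% utilization with a 50% threshold: scale out.
scale_out = desired_replicas(2, 100, 50, min_replicas=1, max_replicas=10)

# Four pods at 10% utilization with a 50% threshold: scale in.
scale_in = desired_replicas(4, 10, 50, min_replicas=1, max_replicas=10)
```

In this sketch, Max. Replicas and Min. Replicas act only as bounds on the computed value; the threshold alone determines the direction of scaling.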

Create an application that has HPA enabled by using kubectl

You can also define an HPA in an orchestration template, associate it with the Deployment that you want to scale, and then run kubectl commands to create it.

In the following example, HPA is enabled for an NGINX application.

  1. Create a file named nginx.yml and copy the following content to the file.

    Example:

    apiVersion: apps/v1 
    kind: Deployment
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx  
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9 # Replace with your own <image_name:tag>.
            ports:
            - containerPort: 80
            resources:
              requests:                         # Required for the HPA to run.
                cpu: 500m
  2. Run the following command to create an NGINX application:

    kubectl create -f nginx.yml
  3. Create an HPA.

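    Use the scaleTargetRef parameter to associate the HPA with the nginx Deployment. Save the template that matches the Kubernetes version of your cluster to a file, and then run the kubectl create -f command on that file, as in the previous step.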

    YAML template for clusters whose Kubernetes versions are 1.24 and earlier

    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50
                            

    YAML template for clusters whose Kubernetes versions are 1.26 and later

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50
                            
    Note

    You must configure resource requests for the pods of the application. Otherwise, the HPA cannot be started.

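  4. Check for missing resource requests. Run the kubectl describe hpa name command, replacing name with the name of your HPA. If resource requests are not configured for the pods, warnings similar to the following are returned: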

    Warning  FailedGetResourceMetric       2m (x6 over 4m)  horizontal-pod-autoscaler  missing request for cpu on container nginx in pod default/nginx-deployment-basic-75675f5897-mqzs7
    
    Warning  FailedComputeMetricsReplicas  2m (x6 over 4m)  horizontal-pod-autoscaler  failed to get cpu utilization: missing request for cpu on container nginx in pod default/nginx-deployment-basic-75675f5
  5. After the HPA is created, run the kubectl describe hpa name command.

    If the following output is returned, the HPA is running as expected:

    Normal SuccessfulRescale 39s horizontal-pod-autoscaler New size: 1; reason: All metrics below target

    If the average CPU utilization of the pods of the NGINX application exceeds 50%, the target specified in the HPA settings, the HPA automatically creates new pods, up to the maxReplicas value. If the average CPU utilization drops below 50%, the HPA automatically removes pods, down to the minReplicas value.
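
To make the 50% target concrete: utilization is measured relative to the CPU request of the pods (cpu: 500m in the example Deployment), which is why the HPA cannot run without resource requests. The Python sketch below is an illustration only. The helper names and the per-pod usage figures are hypothetical, the quantity parser handles only plain cores and the "m" suffix, and all pods are assumed to share the same request.

```python
def parse_cpu_millicores(quantity):
    """Convert a CPU quantity such as "500m" or "0.5" to millicores.

    Simplified, hypothetical parser: only plain cores and the "m" suffix.
    """
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def average_utilization(usage_millicores, request_millicores):
    """Average CPU utilization in percent across pods, relative to the request."""
    if request_millicores <= 0:
        # Mirrors the "missing request for cpu" warning: without a request
        # there is nothing to measure utilization against.
        raise ValueError("a positive CPU request is required")
    if not usage_millicores:
        raise ValueError("at least one pod is required")
    total_request = request_millicores * len(usage_millicores)
    return 100 * sum(usage_millicores) / total_request


request = parse_cpu_millicores("500m")   # from the example Deployment
target = 50                              # averageUtilization in the HPA

usage = [400, 300]                       # hypothetical usage of two pods
utilization = average_utilization(usage, request)
above_target = utilization > target      # the HPA would add pods
```

With these hypothetical figures, the two pods use 700m of a combined 1000m request, so the average utilization is above the 50% target and the HPA would scale out.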