Alibaba Cloud Container Service supports creating HPA-enabled applications in the console, so that the number of pod replicas scales elastically with load. You can also configure Horizontal Pod Autoscaling (HPA) for an application by defining it in a YAML orchestration template.


Create an HPA application in the Container Service console

Alibaba Cloud Container Service has integrated HPA. Developers can easily create an HPA application in the Container Service console.

  1. Log on to the Container Service console.
  2. In the left-side navigation pane, choose Applications > Deployments and then click Create from Image in the upper-right corner.
  3. Enter the application name, select the cluster and namespace, and click Next.
  4. Configure the application settings. Set the number of replicas, select the Enable check box for Automatic Scaling, and then configure the settings for scaling.
    • Metric: Supports CPU and memory. The resource type must be the same as the one you have specified in the Required Resources field.
    • Condition: Specifies the resource usage threshold. HPA starts scaling out the application (adding pod replicas) when the threshold is exceeded.
    • Max. Replicas: The maximum number of replicas that the deployment can expand to.
    • Min. Replicas: The minimum number of replicas that the deployment keeps running.
  5. Configure the container. Select an image and configure the required resources. Click Next.
    Note: You must configure required resources for the deployment. Otherwise, container auto scaling cannot take effect.
  6. On the Access Control page, click Create directly.
    Now a deployment that supports HPA has been created. You can view the auto scaling group information in the details of your deployment.
  7. In a production environment, the application scales according to the CPU load. You can also verify auto scaling in a test environment: after you run a CPU stress test on the pod, the deployment completes the horizontal scale-out within about half a minute.
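
The scaling settings in step 4 (Condition, Max. Replicas, Min. Replicas) can be read as inputs to the replica calculation that Kubernetes documents for HPA. The following Python sketch is illustrative only and is not the controller's implementation: the function name desired_replicas and its parameters are hypothetical, and the 10% tolerance is the documented Kubernetes default, which a cluster may override.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Return the replica count a simplified HPA would request.

    All utilization values are percentages of the requested resource.
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        # Documented rule: ceil(currentReplicas * currentMetric / targetMetric).
        desired = math.ceil(current_replicas * ratio)
    # Never go below Min. Replicas or above Max. Replicas.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    print(desired_replicas(2, 90, 50, min_replicas=1, max_replicas=10))
```

With 2 replicas at 90% CPU utilization and a 50% threshold, the sketch returns 4. A result above Max. Replicas or below Min. Replicas is clamped to that bound.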

Use kubectl commands to configure auto scaling

You can also create an HPA manually from an orchestration template and bind it to the Deployment object to be scaled. The following steps use kubectl commands to configure container auto scaling.

The following is an example of an NGINX application.

  1. Create the xxx.yml file with the following content.
    The orchestration template for Deployment is as follows:
    apiVersion: apps/v1 # apps/v1 is available in Kubernetes 1.9 and later
    kind: Deployment
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9 # replace with your own <image_name:tag>
            ports:
            - containerPort: 80
            resources:
              requests:        # Required for the HPA to work.
                cpu: 500m
  2. Run the following command to create the NGINX application:
    kubectl create -f xxx.yml
    Note: Replace xxx.yml with the actual file name.
  3. Create an HPA. Save the following template to a file and create it with kubectl create -f, as in the previous step.
    Use scaleTargetRef to set the object that is bound to the current HPA. In this example, a Deployment named nginx is bound.
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa
      namespace: default
    spec:
      scaleTargetRef:          # Bind the Deployment named nginx
        apiVersion: apps/v1
        kind: Deployment
        name: nginx
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          targetAverageUtilization: 50
    Note: You must configure resource requests for the pod so that the HPA can run.
  4. If the resource requests are missing, the following warnings appear when you run kubectl describe hpa <name>:
    Warning  FailedGetResourceMetric       2m (x6 over 4m)  horizontal-pod-autoscaler  missing request for cpu on container nginx in pod default/nginx-deployment-basic-75675f5897-mqzs7
    Warning  FailedComputeMetricsReplicas  2m (x6 over 4m)  horizontal-pod-autoscaler  failed to get cpu utilization: missing request for cpu on container nginx in pod default/nginx-deployment-basic-75675f5
  5. After the HPA is created with the resource requests in place, run the kubectl describe hpa <name> command again.
    If the following event is displayed, the HPA is running normally.
    Normal SuccessfulRescale 39s horizontal-pod-autoscaler New size: 1; reason: All metrics below target

    When the average CPU utilization of the NGINX pods, measured against their CPU requests, exceeds the 50% target specified in this example, the Deployment scales out horizontally, up to maxReplicas. When the utilization falls below 50%, the Deployment scales in, down to minReplicas.
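
To make the role of the CPU request concrete, the following Python sketch models how a utilization percentage can be derived from per-pod usage and requests, and why a missing request leaves the HPA with nothing to compute. It is a simplified model under stated assumptions: parse_cpu, cpu_utilization, and the pod dictionaries are hypothetical names used for illustration, and the real controller reads these values from the cluster's metrics API.

```python
def parse_cpu(quantity):
    """Convert a CPU quantity such as '500m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def cpu_utilization(pods):
    """Return total CPU usage as a percentage of total CPU requests.

    pods: list of dicts with 'name', 'usage', and an optional 'request'.
    """
    total_usage = 0
    total_request = 0
    for pod in pods:
        request = pod.get("request")
        if request is None:
            # Mirrors the 'missing request for cpu' warning shown above.
            raise ValueError(f"missing request for cpu in pod {pod['name']}")
        total_usage += parse_cpu(pod["usage"])
        total_request += parse_cpu(request)
    if total_request == 0:
        raise ValueError("no CPU requests to measure utilization against")
    return total_usage * 100 // total_request


if __name__ == "__main__":
    pods = [
        {"name": "nginx-a", "usage": "400m", "request": "500m"},
        {"name": "nginx-b", "usage": "300m", "request": "500m"},
    ]
    print(cpu_utilization(pods))
```

With the 500m request from the Deployment template, two pods using 400m and 300m are at 70% utilization, which is above the 50% target, so the HPA in this example would scale out.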