Alibaba Cloud Container Service supports creating applications with Horizontal Pod Autoscaling (HPA) enabled in the console to scale container resources elastically. You can also configure HPA for an application by defining it in a YAML configuration.
Create an HPA application in the Container Service console
Alibaba Cloud Container Service has integrated HPA. Developers can easily create an HPA application in the Container Service console.
- Log on to the Container Service console.
- In the left-side navigation pane, go to the deployments page, and then click Create from Image in the upper-right corner.
- Enter the application name, select the cluster and namespace, and click Next.
- Configure the application settings. Set the number of replicas, select the Enable check box for Automatic Scaling, and then configure the settings for scaling.
- Metric: Supports CPU and memory. The resource type must be the same as the one you have specified in the Required Resources field.
- Condition: Specifies the resource usage threshold. HPA starts adding pod replicas to the deployment when the threshold is exceeded.
- Max. Replicas: The maximum number of replicas that the deployment can expand to.
- Min. Replicas: The minimum number of replicas that the deployment keeps running.
- Configure the container. Select an image and configure the required resources. Click Next.
Note: You must configure required resources for the deployment. Otherwise, container auto scaling cannot be performed.
- On the Access Control page, click Create directly.
Now a deployment that supports HPA has been created. You can view the auto scaling group information in the details of your deployment.
- In an actual environment, the application scales according to the CPU load. You can also verify auto scaling in a test environment: after you run a CPU stress test on the pod, horizontal scaling completes within half a minute.
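How the Condition threshold interacts with the Min. Replicas and Max. Replicas settings can be illustrated with a short calculation. The following Python sketch assumes the ratio rule described in the Kubernetes HPA documentation (desired replicas = ceil(current replicas × current utilization / target utilization)). The function name desired_replicas and its parameters are hypothetical and are not part of any Container Service API, and the sketch leaves out controller details such as tolerance and stabilization.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Suggest a replica count from the HPA ratio rule (simplified sketch)."""
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    # Scale the current replica count by how far usage is from the threshold.
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Keep the result inside the Min. Replicas / Max. Replicas bounds.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 2 replicas at 90% utilization against a 50% threshold.
    print(desired_replicas(2, 90, 50, min_replicas=1, max_replicas=10))
    # Very high load: the result is capped at Max. Replicas.
    print(desired_replicas(4, 200, 50, min_replicas=1, max_replicas=10))
```

With a threshold of 50%, two replicas running at 90% utilization yield a suggestion of four replicas, while a suggestion above Max. Replicas is capped at that value.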
Use kubectl commands to configure auto scaling
You can also create an HPA manually from an orchestration template and bind it to the Deployment object to be scaled. You can then use kubectl commands to configure container auto scaling.
The following is an example of an NGINX application.
- Create the xxx.yml file with the following content.
The orchestration template for Deployment is as follows:
    apiVersion: apps/v1 # apps/v1 requires Kubernetes 1.9.0 or later; older clusters use apps/v1beta2 or apps/v1beta1
    kind: Deployment
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9 # replace it with your exact <image_name:tag>
            ports:
            - containerPort: 80
            resources:
              requests: # This parameter is required to run the HPA.
                cpu: 500m
- Run the following command to create an NGINX application:
    kubectl create -f xxx.yml
Note: Replace xxx with the actual file name.
- Create an HPA
Use scaleTargetRef to set the object that is bound to the current HPA. In this example, a Deployment named nginx is bound.
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa
      namespace: default
    spec:
      scaleTargetRef: # Bind a Deployment named nginx
        apiVersion: apps/v1
        kind: Deployment
        name: nginx
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          targetAverageUtilization: 50
Note: You need to configure the request resource for the pod to run the HPA.
- If no resource request is configured for the pod, the following warnings occur when you run kubectl describe hpa name:
    Warning FailedGetResourceMetric 2m (x6 over 4m) horizontal-pod-autoscaler missing request for cpu on container nginx in pod default/nginx-deployment-basic-75675f5897-mqzs7
    Warning FailedComputeMetricsReplicas 2m (x6 over 4m) horizontal-pod-autoscaler failed to get cpu utilization: missing request for cpu on container nginx in pod default/nginx-deployment-basic-75675f5
- After you have created an HPA, run the kubectl describe hpa name command again. If the following information is displayed, it means that the HPA is running normally.
    Normal SuccessfulRescale 39s horizontal-pod-autoscaler New size: 1; reason: All metrics below target
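The resource request is required because CPU utilization is expressed as a percentage of the requested CPU, so without a request there is nothing to measure against. The following Python sketch is a simplified illustration that assumes utilization is the total usage divided by the total requests across the pods. The function name average_cpu_utilization and the input format are hypothetical, and the error message only mirrors the wording of the warning shown earlier.

```python
def average_cpu_utilization(pods):
    """Return CPU utilization as a percentage of the requested CPU.

    pods is a list of (usage_millicores, request_millicores) tuples;
    request_millicores is None when the container has no CPU request.
    """
    if not pods:
        raise ValueError("no pods to evaluate")
    total_usage = 0
    total_request = 0
    for usage, request in pods:
        if request is None:
            # Without a request, a utilization percentage cannot be computed.
            raise ValueError("missing request for cpu")
        total_usage += usage
        total_request += request
    return 100 * total_usage / total_request


if __name__ == "__main__":
    # Two pods that each request 500m, as in the Deployment template.
    print(average_cpu_utilization([(300, 500), (200, 500)]))
```

For two pods that each request 500m and together use 500m, the result is 50, which is exactly the target set in the example HPA.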
When the average CPU utilization of the NGINX pods exceeds the 50% target specified in this example, the HPA scales out the Deployment by adding pods, up to the maxReplicas value of 10. When the utilization is lower than 50%, the HPA scales in, down to the minReplicas value of 1.
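Small deviations from the target do not necessarily trigger scaling. The following Python sketch assumes the default Kubernetes tolerance of 0.1 (set by the kube-controller-manager flag --horizontal-pod-autoscaler-tolerance); your cluster may be configured differently. The function name scaling_direction is hypothetical and the sketch only classifies the direction of a scaling decision.

```python
def scaling_direction(current_utilization, target_utilization, tolerance=0.1):
    """Classify a utilization reading as 'scale out', 'scale in', or 'hold'."""
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    ratio = current_utilization / target_utilization
    # Readings close enough to the target are ignored to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return "hold"
    return "scale out" if ratio > 1.0 else "scale in"


if __name__ == "__main__":
    for reading in (80, 52, 20):
        print(reading, scaling_direction(reading, 50))
```

Under these assumptions, a reading of 52% against the 50% target leaves the replica count unchanged, while 80% leads to a scale-out and 20% to a scale-in.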