Application Load Balancer (ALB) Ingresses support automatic application scaling based on the queries per second (QPS) collected by ALB instances. Scaling on QPS keeps the application stable under load while controlling resource costs. This topic describes how to use ALB Ingresses to enable automatic application scaling based on QPS.
Prerequisites
- alibaba-cloud-metrics-adapter 2.3.0 or later is installed. For more information, see Implement horizontal auto scaling based on Alibaba Cloud metrics.
- The ALB Ingress controller is installed. For more information, see Manage the ALB Ingress controller.
- The stress testing tool Apache Benchmark is installed. For more information, see Apache Benchmark.
- A Log Service project is created. For more information, see Manage a project.
- Two vSwitches are created in different zones of the virtual private cloud (VPC) where your cluster resides. For more information, see Create and manage a vSwitch.
Procedure
Step 1: Create an application and a Service
- Create a file named tea.yaml and copy the following content to the file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-basic
  labels:
    app: tea
spec:
  replicas: 2
  selector:
    matchLabels:
      app: tea
  template:
    metadata:
      labels:
        app: tea
    spec:
      containers:
      - name: tea
        image: nginx:1.7.9
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: tea-svc
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: tea
  type: NodePort
- Run the following command to create an application and a Service:
kubectl apply -f tea.yaml
Step 2: Create an ALB Ingress
- Create an AlbConfig object.
- Create an IngressClass.
- Create an ALB Ingress.
- Run the following command to query the ADDRESS parameter of the ALB Ingress:
kubectl get ingress
Expected output:
NAME          CLASS   HOSTS              ADDRESS                                              PORTS   AGE
tea-ingress   alb     demo.ingress.top   alb-110zvs5nhsvfv*****.cn-chengdu.alb.aliyuncs.com   80      7m5s
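When the ADDRESS value is needed in a script (for example, as the target of the stress test in Step 4), it can be read from the Ingress status instead of being copied from the table. The following is a minimal sketch, assuming that the Ingress is named tea-ingress as in the expected output above and that the controller records the ALB domain name in the hostname field of the Ingress status. The function name alb_address is hypothetical.

```shell
# Hypothetical helper: print the ALB domain name recorded in the Ingress status.
# Assumptions: the Ingress defaults to tea-ingress in the default namespace,
# and the ADDRESS column is backed by .status.loadBalancer.ingress[0].hostname.
alb_address() (
  ingress=${1:-tea-ingress}
  namespace=${2:-default}
  kubectl get ingress "$ingress" -n "$namespace" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
)

# Example usage (requires access to the cluster, and curl for the request):
#   address=$(alb_address)
#   curl -H "Host: demo.ingress.top" "http://$address/"
```

The function body runs in a subshell, so its variables do not leak into the calling shell.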
Step 3: Create an HPA
- Create a file named hpa.yaml and copy the following content to the file:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment-basic
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: sls_alb_ingress_qps # The sls_alb_ingress_qps metric is used to configure auto scaling based on QPS values.
          selector:
            matchLabels:
              sls.project: "****" # Replace the value of sls.project with the Log Service project that you want to use.
              sls.logstore: "alb_****" # Replace the value with the Logstore that you want to use.
              sls.ingress.route: "default-tea-svc-80" # Specify the value of the sls.ingress.route parameter in the <namespace>-<svc>-<port> format. Example: default-nginx-80.
        target:
          type: AverageValue # The type parameter is set to AverageValue, which indicates that the average QPS value of each pod is used to determine whether to perform scaling activities.
          averageValue: 2
- Run the following command to create an HPA:
kubectl apply -f hpa.yaml
- Run the following command to query information about the HPA:
kubectl get hpa
Expected output:
NAME          REFERENCE                           TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
ingress-hpa   Deployment/nginx-deployment-basic   0/2 (avg)   2         10        2          4h34m
- Run the following command to query detailed information about the HPA:
kubectl describe hpa ingress-hpa
Expected output:
Name:                                            ingress-hpa
Namespace:                                       default
Labels:                                          <none>
Annotations:                                     <none>
CreationTimestamp:                               Tue, 31 Jan 2023 11:35:01 +0800
Reference:                                       Deployment/nginx-deployment-basic
Metrics:                                         ( current / target )
  "sls_alb_ingress_qps" (target average value):  0 / 2
Min replicas:                                    2
Max replicas:                                    10
Deployment pods:                                 2 current / 2 desired
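The AverageValue target in hpa.yaml means that HPA divides the total QPS reported by the metric across the pods and aims for an average of 2 per pod. The following sketch shows the resulting arithmetic: the desired replica count is roughly the total QPS divided by the target, rounded up, and then limited to the range from minReplicas to maxReplicas. It is a simplification under stated assumptions: QPS is treated as an integer, and the tolerance and stabilization behavior of the HPA controller is ignored. The function name desired_replicas is hypothetical.

```shell
# Hypothetical helper: approximate the replica count that HPA aims for
# with an AverageValue target.
# Arguments: total QPS, target average per pod, minReplicas, maxReplicas.
# Assumptions: integer inputs; controller tolerance and stabilization are ignored.
desired_replicas() (
  total_qps=$1
  target_avg=$2
  min=$3
  max=$4
  # Integer ceiling division: ceil(total_qps / target_avg).
  desired=$(( (total_qps + target_avg - 1) / target_avg ))
  # Clamp the result to the configured replica range.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
)

# Values from hpa.yaml: averageValue 2, minReplicas 2, maxReplicas 10.
desired_replicas 0 2 2 10     # idle: stays at minReplicas
desired_replicas 11 2 2 10    # 11 QPS in total: ceil(11 / 2)
desired_replicas 100 2 2 10   # heavy load: capped at maxReplicas
```

With the values from hpa.yaml, the three calls print 2, 6, and 10. The first result matches the 0/2 (avg) target and 2 replicas shown in the expected output above.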
Step 4: Verify that the application can be automatically scaled based on the QPS value
- Verify that the application can be automatically scaled out based on the QPS value.
- Verify that the application can be automatically scaled in based on the QPS value.
After the stress test is complete, the QPS value drops to 0, which is below the target average value of 2 per pod, and HPA automatically scales in the application.
Run the following command to check whether the application is scaled in:
kubectl get hpa
Expected output:
NAME          REFERENCE                           TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
ingress-hpa   Deployment/nginx-deployment-basic   0/2 (avg)   2         10        2          60m
The value of the REPLICAS parameter is 2, which equals the minimum number of replicas configured for the HPA. This indicates that the number of pods created for the application decreases as the QPS value decreases.
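The verification in this step can also be scripted. The following is a minimal sketch, assuming that Apache Benchmark (ab) and kubectl are available on the machine that runs it. The function names run_stress_test and wait_for_replicas are hypothetical, and the concurrency, request count, and polling intervals are assumed values rather than values taken from this topic. Scale-in is usually not immediate, because upstream Kubernetes applies a scale-down stabilization window of 5 minutes by default, so the polling loop allows for several minutes.

```shell
# Hypothetical helper: send load through the ALB instance so that the QPS
# value exceeds the HPA target.
# Assumptions: 5 concurrent clients and 10000 requests are illustrative values.
run_stress_test() (
  address=$1
  host=${2:-demo.ingress.top}
  ab -r -c 5 -n 10000 -H "Host: $host" "http://$address/"
)

# Hypothetical helper: poll the HPA until it reports the expected replica count.
# Arguments: HPA name, expected replicas, attempts (default 60),
# seconds between attempts (default 10).
wait_for_replicas() (
  hpa=$1
  expected=$2
  attempts=${3:-60}
  interval=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    current=$(kubectl get hpa "$hpa" -o jsonpath='{.status.currentReplicas}')
    if [ "$current" = "$expected" ]; then
      echo "HPA $hpa is at $current replicas"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "HPA $hpa did not reach $expected replicas" >&2
  return 1
)

# Example usage (replace <ADDRESS> with the ADDRESS value of the ALB Ingress):
#   run_stress_test "<ADDRESS>"
#   wait_for_replicas ingress-hpa 2
```

wait_for_replicas returns a non-zero status if the expected count is not reached within the given number of attempts, so the script can be used as a simple pass or fail check.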