You can automatically scale an application that is exposed through an Application Load Balancer (ALB) Ingress based on the queries-per-second (QPS) values collected by the ALB instance. This keeps the application stable under variable load while controlling resource costs.
Prerequisites
Before you begin, ensure that you have:
- alibaba-cloud-metrics-adapter version 2.3.0 or later installed. See Implement horizontal auto scaling based on Alibaba Cloud metrics.
- The ALB Ingress controller installed. See Manage the ALB Ingress controller.
- Apache Benchmark installed. See Apache Benchmark.
- A Log Service project created. See Manage a project.
- Two vSwitches deployed in different zones of the virtual private cloud (VPC) where your cluster resides. See Create and manage a vSwitch.
How it works
- Create a Deployment and Service for your application.
- Create an ALB Ingress to route external traffic to the Service.
- Create a Horizontal Pod Autoscaler (HPA) that watches the sls_alb_ingress_qps metric from Log Service.
- When QPS exceeds the per-pod threshold, the HPA scales out the Deployment. When QPS drops, the HPA scales it back in.
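As a rough illustration of the last step, the sketch below mirrors how an HPA with an AverageValue target on an external metric generally arrives at a replica count: the total metric value is divided by the per-pod target, rounded up, and then clamped to the minimum and maximum replica counts. The function name desired_replicas is hypothetical, the inputs are assumed to be whole numbers, and the controller's tolerance and stabilization behavior are deliberately left out.

```shell
#!/bin/sh
# desired_replicas TOTAL_QPS TARGET_PER_POD MIN MAX
# Simplified AverageValue rule for an External metric:
#   desired = ceil(total / target), clamped to [min, max].
# Assumption: integer inputs; tolerance and stabilization are ignored.
desired_replicas() {
  total=$1 target=$2 min=$3 max=$4
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (total + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 7 2 2 10     # prints 4
desired_replicas 125 2 2 10   # prints 10 (63 before clamping)
desired_replicas 0 2 2 10     # prints 2
```

With the values used later in this topic (a target of 2 QPS per pod, 2 to 10 replicas), any sustained total above 20 QPS would be expected to hold the Deployment at its maximum.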
Step 1: Create an application and a service
- Create a file named tea.yaml with the following content:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-basic
      labels:
        app: tea
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: tea
      template:
        metadata:
          labels:
            app: tea
        spec:
          containers:
          - name: tea
            image: nginx:1.7.9
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: tea-svc
      namespace: default
    spec:
      ports:
      - port: 80
        protocol: TCP
        targetPort: 80
      selector:
        app: tea
      type: NodePort

- Apply the manifest:

    kubectl apply -f tea.yaml
Step 2: Create an ALB Ingress
Create an AlbConfig object
- Create a file named alb-test.yaml with the following content:

    apiVersion: alibabacloud.com/v1
    kind: AlbConfig
    metadata:
      name: alb-demo
    spec:
      config:
        name: alb-test
        addressType: Internet  # Internet-facing ALB
        zoneMappings:
        - vSwitchId: vsw-uf6ccg2a9g71hx8go****  # Replace with your first vSwitch ID
        - vSwitchId: vsw-uf6nun9tql5t8nh15****  # Replace with your second vSwitch ID (different zone)
        accessLogConfig:
          logProject: "****"    # Replace with your Log Service project name
          logStore: "alb_****"  # Replace with your Logstore name; must start with alb_

    Field         Description
    zoneMappings  Specify at least two vSwitch IDs from different zones in the same VPC.
    logProject    The name of the Log Service project you created.
    logStore      The Logstore name. Must start with alb_. If the Logstore does not exist, the system creates it automatically.

- Apply the manifest:

    kubectl apply -f alb-test.yaml
Create an IngressClass
- Create a file named alb.yaml with the following content:

    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: alb
    spec:
      controller: ingress.k8s.alibabacloud/alb
      parameters:
        apiGroup: alibabacloud.com
        kind: AlbConfig
        name: alb-demo  # Must match the AlbConfig metadata.name above

- Apply the manifest:

    kubectl apply -f alb.yaml
Create the Ingress
- Create a file named tea-ingress.yaml with the following content:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: tea-ingress
    spec:
      ingressClassName: alb
      rules:
      - host: demo.ingress.top
        http:
          paths:
          - path: /tea
            pathType: Prefix
            backend:
              service:
                name: tea-svc
                port:
                  number: 80

- Apply the manifest:

    kubectl apply -f tea-ingress.yaml

- Get the ALB address assigned to the Ingress:

    kubectl get ingress

  Expected output:

    NAME          CLASS   HOSTS              ADDRESS                                              PORTS   AGE
    tea-ingress   alb     demo.ingress.top   alb-110zvs5nhsvfv*****.cn-chengdu.alb.aliyuncs.com   80      7m5s

  Note the ADDRESS value. You will use it in the stress test command in Step 4.
Step 3: Create an HPA
- Create a file named hpa.yaml with the following content:

    # On Kubernetes 1.23 or later, autoscaling/v2 is the stable API version;
    # autoscaling/v2beta2 is no longer served as of Kubernetes 1.26.
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: ingress-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx-deployment-basic  # The Deployment to scale
      minReplicas: 2   # Minimum number of pods
      maxReplicas: 10  # Maximum number of pods
      metrics:
      - type: External
        external:
          metric:
            name: sls_alb_ingress_qps  # ALB QPS metric from Log Service
            selector:
              matchLabels:
                sls.project: "****"       # Replace with your Log Service project name
                sls.logstore: "alb_****"  # Replace with your Logstore name
                # Format: <namespace>-<service-name>-<port>
                # Example: default-nginx-80
                sls.ingress.route: "default-tea-svc-80"
          target:
            type: AverageValue  # Scale based on average QPS per pod
            averageValue: 2     # Scale out when average QPS per pod exceeds 2

  This HPA scales nginx-deployment-basic between 2 and 10 pods. It scales out when the average QPS per pod exceeds 2 and scales back in when the average drops below that threshold.

- Apply the manifest:

    kubectl apply -f hpa.yaml

- Verify the HPA was created:

    kubectl get hpa

  Expected output:

    NAME          REFERENCE                           TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    ingress-hpa   Deployment/nginx-deployment-basic   0/2 (avg)   2         10        2          4h34m

  The current QPS is 0 because no traffic is hitting the application yet. The pod count is at its minimum (2), which is the expected baseline state.

- (Optional) Inspect the HPA details:

    kubectl describe hpa ingress-hpa

  Expected output:

    Name:                                            ingress-hpa
    Namespace:                                       default
    Labels:                                          <none>
    Annotations:                                     <none>
    CreationTimestamp:                               Tue, 31 Jan 2023 11:35:01 +0800
    Reference:                                       Deployment/nginx-deployment-basic
    Metrics:                                         ( current / target )
      "sls_alb_ingress_qps" (target average value):  0 / 2
    Min replicas:                                    2
    Max replicas:                                    10
    Deployment pods:                                 2 current / 2 desired
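The selector labels in hpa.yaml are easy to mistype, so it can help to generate them. The following is a minimal sketch, assuming the <namespace>-<service-name>-<port> route format and the alb_ Logstore prefix described in this topic. The function name hpa_selector_labels and the sample values my-project and alb_demo are hypothetical placeholders, not names from a real environment.

```shell
#!/bin/sh
# hpa_selector_labels PROJECT LOGSTORE NAMESPACE SERVICE PORT
# Prints the three matchLabels entries for the sls_alb_ingress_qps metric.
# Fails if the Logstore name does not start with alb_.
hpa_selector_labels() {
  project=$1 logstore=$2 namespace=$3 service=$4 port=$5
  case "$logstore" in
    alb_*) ;;
    *) echo "Logstore name must start with alb_: $logstore" >&2
       return 1 ;;
  esac
  printf 'sls.project: "%s"\n' "$project"
  printf 'sls.logstore: "%s"\n' "$logstore"
  # Route format: <namespace>-<service-name>-<port>
  printf 'sls.ingress.route: "%s-%s-%s"\n' "$namespace" "$service" "$port"
}

# Hypothetical project and Logstore names; tea-svc on port 80 is from Step 1.
hpa_selector_labels my-project alb_demo default tea-svc 80
```

The last line of the output is sls.ingress.route: "default-tea-svc-80", which matches the value used in hpa.yaml.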
Step 4: Verify auto scaling
Verify scale-out
- Run the following stress test against the ALB address from Step 2. Replace the address with your actual ALB address.

    ab -c 5 -n 5000 -H Host:demo.ingress.top http://alb-110zvs5nhsvfv*****.cn-chengdu.alb.aliyuncs.com/tea

- While the test is running, watch the HPA status in real time:

    kubectl get hpa ingress-hpa --watch

  As QPS climbs above the threshold, you will see the replica count increase. Press Ctrl+C to stop watching when you are done.

- After the stress test completes, confirm the scale-out result:

    kubectl get hpa

  Expected output:

    NAME          REFERENCE                           TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
    ingress-hpa   Deployment/nginx-deployment-basic   12500m/2 (avg)   2         10        10         15m

  REPLICAS is 10, confirming the Deployment scaled out to the maximum as QPS exceeded the per-pod threshold.
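The TARGETS value 12500m uses Kubernetes quantity notation, where the m suffix means thousandths, so it reads as an average of 12.5 QPS per pod. The sketch below works through that arithmetic, assuming the reported average is the total QPS divided by the current replica count. The helper name quantity_to_milli is hypothetical and handles only plain integers and the m suffix.

```shell
#!/bin/sh
# quantity_to_milli QUANTITY
# Converts a quantity such as "12500m" or "2" to integer milli-units.
# Assumption: only plain integers and the "m" (milli) suffix are handled.
quantity_to_milli() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

avg_milli=$(quantity_to_milli 12500m)    # 12500, an average of 12.5 QPS per pod
target_milli=$(quantity_to_milli 2)      # 2000, the averageValue target
replicas=10

total_milli=$(( avg_milli * replicas ))  # 125000, about 125 QPS in total
# Ceiling of total / target, before the maxReplicas cap is applied
uncapped=$(( (total_milli + target_milli - 1) / target_milli ))

echo "total QPS: $(( total_milli / 1000 )), uncapped desired replicas: $uncapped"
# prints: total QPS: 125, uncapped desired replicas: 63
```

Under these assumptions the load would call for 63 pods, which is why the Deployment sits at the maxReplicas limit of 10.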
Verify scale-in
After the stress test ends, QPS drops to 0 and the HPA automatically scales the Deployment back in. By default, the HPA applies a scale-in stabilization window of 300 seconds (5 minutes), so wait at least that long before checking.
kubectl get hpa
Expected output:
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
ingress-hpa Deployment/nginx-deployment-basic 0/2 (avg) 2 10 2 60m
REPLICAS is back to 2, confirming the Deployment scaled in once QPS dropped below the threshold.
What's next
- To tune scale-out and scale-in behavior, configure the behavior field in the HPA spec. See the Kubernetes HPA documentation.
- To use additional Alibaba Cloud metrics for auto scaling, see Implement horizontal auto scaling based on Alibaba Cloud metrics.
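As a starting point for the behavior field mentioned above, the following sketch prints a fragment that could be placed under spec in hpa.yaml. The function name print_behavior_fragment is hypothetical, and the policy values (remove at most 2 pods every 60 seconds) are illustrative assumptions rather than recommended settings. The window defaults to 300 seconds, the Kubernetes default for scale-in.

```shell
#!/bin/sh
# print_behavior_fragment [SCALE_DOWN_WINDOW_SECONDS]
# Prints an HPA behavior fragment to place under spec.
# Assumption: the policy values are examples only; tune them for your workload.
print_behavior_fragment() {
  window=${1:-300}
  cat <<EOF
behavior:
  scaleDown:
    stabilizationWindowSeconds: $window
    policies:
    - type: Pods
      value: 2
      periodSeconds: 60
EOF
}

# Example: shorten the scale-in stabilization window to 120 seconds.
print_behavior_fragment 120
```

A shorter window makes scale-in react faster after a traffic spike ends, at the cost of more replica churn if traffic is bursty.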