If your application needs to dynamically adjust its computing resources based on request volume, you can use queries per second (QPS) data from an Application Load Balancer (ALB) instance to configure auto scaling for its pods.
Before you start
To learn the basics of ALB Ingress, see Create and use an ALB Ingress to expose a service.
How it works
Queries per second (QPS) is the number of requests received per second. Application Load Balancer (ALB) instances use Simple Log Service to record client access data. A Horizontal Pod Autoscaler (HPA) reads the QPS of a Service from these records and scales the corresponding workloads, such as Deployments and StatefulSets.
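The scaling decision can be illustrated with a small calculation. The following is a minimal sketch, not the controller's actual code: it assumes the standard Kubernetes rule for an AverageValue target, where the desired replica count is the total metric value divided by the per-pod target, rounded up and clamped to the configured minimum and maximum. The function name desired_replicas is hypothetical, values are passed in milli-units (1 QPS = 1000) so that integer arithmetic can handle fractional QPS, and the HPA's default tolerance is ignored for simplicity.

```shell
# Sketch: desired replicas for an AverageValue target (assumption: standard HPA rule).
# Arguments: total QPS in milli-units, per-pod target in milli-units, min replicas, max replicas.
desired_replicas() {
  dr_total=$1
  dr_target=$2
  dr_min=$3
  dr_max=$4
  # Ceiling division: ceil(total / target).
  dr_desired=$(( (dr_total + dr_target - 1) / dr_target ))
  # Clamp the result to the [min, max] range configured on the HPA.
  if [ "$dr_desired" -lt "$dr_min" ]; then dr_desired=$dr_min; fi
  if [ "$dr_desired" -gt "$dr_max" ]; then dr_desired=$dr_max; fi
  echo "$dr_desired"
}

desired_replicas 7000 2000 2 10      # 7 QPS in total, target of 2 QPS per pod
desired_replicas 0 2000 2 10         # no traffic: falls back to the minimum
desired_replicas 143750 2000 2 10    # heavy load: capped at the maximum
```

With a target of 2 QPS per pod, 7 QPS in total calls for 4 pods, while zero traffic and very heavy traffic are held at the minimum and maximum replica counts.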
Prerequisites
The ack-alibaba-cloud-metrics-adapter component, version 2.3.0 or later, is installed. For more information, see Deploy the ack-alibaba-cloud-metrics-adapter component.
The Apache Benchmark (ab) stress testing tool is installed. For more information, see the official documentation at Compiling and Installing.
A kubectl client is connected to the ACK cluster. For more information, see Get a cluster kubeconfig and connect to the cluster using kubectl.
You have created two vSwitches in different zones. The vSwitches must be in the same VPC as the cluster and in zones that support ALB. For more information, see Create and manage vSwitches.
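The tool and version prerequisites can be checked from a terminal. The following is a rough sketch under stated assumptions: version_ge is a hypothetical helper that relies on the version sort (-V) option of sort, and installed_version is a placeholder that you set by hand to the component version shown for your cluster. The sketch does not query the cluster.

```shell
# Returns success (0) when version $1 is greater than or equal to version $2.
# Assumes a sort implementation that supports version sort (-V), such as GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report any required command-line tool that is not on PATH.
for tool in kubectl ab; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing tool: $tool"
done

# Placeholder value (assumption): replace with the component version installed in your cluster.
installed_version="2.3.0"
if version_ge "$installed_version" "2.3.0"; then
  echo "metrics adapter version OK"
else
  echo "metrics adapter must be upgraded to 2.3.0 or later"
fi
```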
Step 1: Create an AlbConfig and associate a Simple Log Service project
View the Simple Log Service project associated with the cluster.
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of the target cluster. In the navigation pane on the left, click Cluster Information.
On the Basic Information tab, find the Simple Log Service Project resource and record the project name.
Create an AlbConfig.
Create a file named alb-qps.yaml, copy the following content to the file, and then specify the details of the Simple Log Service project in the accessLogConfig field.

    apiVersion: alibabacloud.com/v1
    kind: AlbConfig
    metadata:
      name: alb-qps
    spec:
      config:
        name: alb-qps
        addressType: Internet
        zoneMappings:
        - vSwitchId: vsw-uf6ccg2a9g71hx8go**** # The ID of the vSwitch.
        - vSwitchId: vsw-uf6nun9tql5t8nh15****
        accessLogConfig:
          logProject: <LOG_PROJECT> # The name of the Simple Log Service project associated with the cluster.
          logStore: <LOG_STORE> # The name of the custom Logstore. The name must start with "alb_".
      listeners:
      - port: 80
        protocol: HTTP

The following describes the fields:
| Field | Type | Description |
|---|---|---|
| logProject | string | The name of the Simple Log Service project. Default value: "". |
| logStore | string | The name of the Simple Log Service Logstore, which must start with alb_. The Logstore is automatically created if it does not exist. For more information, see Enable Simple Log Service to collect access logs. Default value: "alb_****". |

Run the following command to create the AlbConfig.
    kubectl apply -f alb-qps.yaml

Expected output:

    albconfig.alibabacloud.com/alb-qps created
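A quick sanity check on alb-qps.yaml can catch the two most likely mistakes in the access log settings: a template placeholder left in place, or a Logstore name that does not start with alb_. The following is a minimal sketch. The helper name check_access_log_config is hypothetical, and it assumes an unquoted logStore value on a single line, as in the sample manifest. If the check fails after the AlbConfig has been created, correct the file and run kubectl apply again.

```shell
# Hypothetical helper: checks the access log settings in an AlbConfig manifest file.
check_access_log_config() {
  manifest=$1
  # Fail if a template placeholder such as <LOG_PROJECT> or <LOG_STORE> was left in place.
  if grep -q '<LOG_[A-Z]*>' "$manifest"; then
    echo "placeholder not replaced in $manifest" >&2
    return 1
  fi
  # Extract the first logStore value, dropping any trailing comment.
  logstore=$(sed -n 's/^ *logStore: *\([^ #]*\).*/\1/p' "$manifest" | head -n 1)
  case "$logstore" in
    alb_*) return 0 ;;
    *)
      echo "logStore '$logstore' must start with alb_" >&2
      return 1
      ;;
  esac
}

# Example usage:
#   check_access_log_config alb-qps.yaml && echo "access log settings look valid"
```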
Step 2: Create sample resources
In addition to an AlbConfig, an ALB Ingress requires a Deployment, a Service, an IngressClass, and an Ingress to function. Follow these steps to quickly create these resources.
Create a file named qps-quickstart.yaml that contains the following content.
    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: qps-ingressclass
    spec:
      controller: ingress.k8s.alibabacloud/alb
      parameters:
        apiGroup: alibabacloud.com
        kind: AlbConfig
        name: alb-qps # The name of the AlbConfig.
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: qps-ingress
    spec:
      ingressClassName: qps-ingressclass # The name of the IngressClass.
      rules:
      - host: demo.alb.ingress.top # Replace this with your domain name.
        http:
          paths:
          - path: /qps
            pathType: Prefix
            backend:
              service:
                name: qps-svc
                port:
                  number: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: qps-svc
      namespace: default
    spec:
      ports:
      - port: 80
        protocol: TCP
        targetPort: 80
      selector:
        app: qps-deploy
      type: NodePort
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: qps-deploy
      labels:
        app: qps-deploy
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: qps-deploy
      template:
        metadata:
          labels:
            app: qps-deploy
        spec:
          containers:
          - name: qps-container
            image: nginx:1.7.9
            ports:
            - containerPort: 80

Run the following command to create the sample resources.
kubectl apply -f qps-quickstart.yaml
Step 3: Create an HPA
Create a file named qps-hpa.yaml with the following content.
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: qps-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: qps-deploy # The name of the workload that the HPA controls.
      minReplicas: 2 # The minimum number of pods.
      maxReplicas: 10 # The maximum number of pods.
      metrics:
      - type: External # Use external metrics (non-native Kubernetes metrics).
        external:
          metric:
            name: sls_alb_ingress_qps # The name of the metric (QPS of Alibaba Cloud ALB Ingress). Do not modify this value.
            selector:
              matchLabels:
                sls.project: <LOG_PROJECT> # The name of the Simple Log Service project. Replace this with the actual project name.
                sls.logstore: <LOG_STORE> # The name of the Logstore. Replace this with the actual Logstore name.
                sls.ingress.route: default-qps-svc-80 # The path of the service. The format is <namespace>-<svc>-<port>.
          target:
            type: AverageValue # The target metric type (average value).
            averageValue: "2" # The expected target value of the metric. In this example, the average QPS of all pods is 2.

The following describes the fields:
| Field | Description |
|---|---|
| scaleTargetRef | The application workload. In this example, this refers to the Deployment named qps-deploy that was created in Step 2. |
| minReplicas | The minimum number of pods to which the Deployment can be scaled in. This value must be an integer greater than or equal to 1. |
| maxReplicas | The maximum number of pods to which the Deployment can be scaled out. This value must be greater than the minimum number of replicas. |
| external.metric.name | The QPS-based metric that the HPA uses. Do not modify this value. |
| sls.project | The Simple Log Service project on which the metric is based. The value must be the same as that specified in the AlbConfig. |
| sls.logstore | The Logstore on which the metric is based. The value must be the same as that specified in the AlbConfig. |
| sls.ingress.route | The route of the Service, in the format <namespace>-<svc>-<port>. In this example, the route refers to the qps-svc Service that was created in Step 2. |
| external.target | The expected target value of the metric. In this example, the target is an average of 2 QPS per pod. The HPA adjusts the number of pods to keep the average QPS as close to the target value as possible. |
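The sls.ingress.route label is the easiest of these values to mistype, because it joins three separate pieces of information. As a small illustration, the following sketch builds the label from its parts. The function name ingress_route is hypothetical, and the format is the <namespace>-<svc>-<port> pattern described for this field.

```shell
# Builds the sls.ingress.route label value.
# Arguments: namespace, Service name, Service port.
ingress_route() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

# For the qps-svc Service in the default namespace on port 80:
ingress_route default qps-svc 80   # prints default-qps-svc-80
```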
Run the following command to create the HPA.
    kubectl apply -f qps-hpa.yaml

Run the following command to view the deployment status of the HPA.

    kubectl get hpa

Expected output:

    NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          5m41s

Run the following command to view the configuration details of the HPA.

    kubectl describe hpa qps-hpa

Expected output:

    Name:                                            qps-hpa
    Namespace:                                       default
    Labels:                                          <none>
    Annotations:                                     <none>
    CreationTimestamp:                               ******** # The timestamp of the HPA. You can ignore this parameter.
    Reference:                                       Deployment/qps-deploy
    Metrics:                                         ( current / target )
      "sls_alb_ingress_qps" (target average value):  0 / 2
    Min replicas:                                    2
    Max replicas:                                    10
    Deployment pods:                                 2 current / 2 desired
(Optional) Step 4: Verify the results
Verify that the application is scaled out.
Run the following command to view information about the Ingress.
    kubectl get ingress

Expected output:

    NAME          CLASS              HOSTS                  ADDRESS                         PORTS   AGE
    qps-ingress   qps-ingressclass   demo.alb.ingress.top   alb-********.alb.aliyuncs.com   80      10m31s

Record the values of HOSTS and ADDRESS for subsequent steps.

Run the following command to perform a stress test on the application.

Replace demo.alb.ingress.top and alb-********.alb.aliyuncs.com with the values that you recorded in the previous step.

    ab -r -c 5 -n 10000 -H Host:demo.alb.ingress.top http://alb-********.alb.aliyuncs.com/qps

Run the following command to view the auto scaling status of the application.

    kubectl get hpa

Expected output:

    NAME      REFERENCE               TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
    qps-hpa   Deployment/qps-deploy   14375m/2 (avg)   2         10        10         15m

The output shows that the value of REPLICAS is 10. This indicates that as the QPS increases, the number of application pods scales out to 10, which is the value of MAXPODS.
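The TARGETS value 14375m uses Kubernetes quantity notation, in which the m suffix means thousandths, so 14375m is 14.375. Assuming the column shows the per-pod average, as the (avg) marker suggests, each of the 10 pods is receiving roughly 14 QPS against a target of 2, which is why the HPA stays at its maximum. The following is a minimal sketch for converting such values. The function name milli_to_decimal is hypothetical, and only the m suffix and plain integers are handled.

```shell
# Converts a Kubernetes milli-quantity such as 14375m to a decimal string.
# Values without the m suffix are returned unchanged.
milli_to_decimal() {
  case "$1" in
    *m)
      md_milli=${1%m}
      # Integer part, then the remainder as exactly three decimal digits.
      printf '%d.%03d\n' $(( md_milli / 1000 )) $(( md_milli % 1000 ))
      ;;
    *)
      printf '%s\n' "$1"
      ;;
  esac
}

milli_to_decimal 14375m   # prints 14.375
milli_to_decimal 2        # prints 2
```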
Verify that the application is scaled in.
After the stress test is complete, run the following command to view the auto scaling status of the application.
    kubectl get hpa

Expected output:

    NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          28m

The output shows that the value of REPLICAS is 2. This indicates that after the QPS drops to 0, the number of application pods scales in to 2, which is the value of MINPODS.
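Scale-in does not happen immediately after the load stops. By default, Kubernetes applies a stabilization window of 300 seconds before it removes pods, unless the cluster or the HPA's behavior settings override it. Instead of rerunning kubectl get hpa by hand, you can poll until the replica count settles. The following is a sketch under stated assumptions: wait_for_replicas is a hypothetical helper, and it reads the status.currentReplicas field of the HPA through a kubectl JSONPath query.

```shell
# Polls an HPA until it reports the expected replica count.
# Arguments: HPA name, expected replicas, maximum attempts, seconds between attempts.
wait_for_replicas() {
  wr_hpa=$1
  wr_expected=$2
  wr_attempts=$3
  wr_interval=$4
  wr_i=0
  while [ "$wr_i" -lt "$wr_attempts" ]; do
    # status.currentReplicas is the replica count that the HPA last observed.
    wr_current=$(kubectl get hpa "$wr_hpa" -o jsonpath='{.status.currentReplicas}')
    if [ "$wr_current" = "$wr_expected" ]; then
      echo "$wr_hpa reached $wr_expected replicas"
      return 0
    fi
    wr_i=$(( wr_i + 1 ))
    sleep "$wr_interval"
  done
  echo "$wr_hpa still at ${wr_current:-unknown} replicas" >&2
  return 1
}

# Example usage: check every 15 seconds for up to 10 minutes.
#   wait_for_replicas qps-hpa 2 40 15
```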
References
To scale your application based on the CPU or memory usage of pods, see Horizontal pod autoscaling (HPA).
To schedule the scaling of your application, see Cron horizontal pod autoscaling (CronHPA).
For information about node auto scaling, see Node auto scaling.