If your application must dynamically adjust its compute resources based on request volume, use queries per second (QPS) data from your Application Load Balancer (ALB) instances to configure elastic scaling for the application's pods.
Before you begin
Before you begin, read Create and use an ALB Ingress to expose a service to learn the basics of using an ALB Ingress.
How it works
Queries per second (QPS) is the number of requests received per second. An Application Load Balancer (ALB) instance records client access data using Simple Log Service (SLS). A Horizontal Pod Autoscaler (HPA) monitors the QPS data for the service from these records and scales the corresponding workloads, such as deployments and StatefulSets.
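As a rough illustration of how a QPS reading becomes a replica count, the sketch below applies the standard Kubernetes rule for an AverageValue target: desired replicas = ceil(total metric value / target average per pod), clamped to the minimum and maximum replica bounds. The function name `desired_replicas` and its arguments are hypothetical. The real HPA controller also applies a tolerance and stabilization windows, which are left out here.

```shell
#!/bin/sh
# Hypothetical helper: estimate the replica count an HPA would aim for.
# Usage: desired_replicas <total_qps> <target_avg_qps_per_pod> <min> <max>
desired_replicas() {
  awk -v total="$1" -v target="$2" -v min="$3" -v max="$4" 'BEGIN {
    want = total / target
    n = int(want)
    if (n < want) n++          # round up (ceiling)
    if (n < min) n = min       # never go below minReplicas
    if (n > max) n = max       # never exceed maxReplicas
    print n
  }'
}

desired_replicas 7 2 2 10      # 7 QPS in total, target of 2 QPS per pod: prints 4
```

With a total of 0 QPS the result falls back to the minimum, and a very high total is capped at the maximum.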
Prerequisites
-
The alibaba-cloud-metrics-adapter component version 2.3.0 or later is installed. For more information, see Deploy the ack-alibaba-cloud-metrics-adapter component.
-
The Apache Benchmark stress testing tool is installed. For more information, see the official Compiling and Installing document.
-
A kubectl client is connected to the ACK cluster. For more information, see Connect to an ACK cluster using kubectl.
-
Two vSwitches are created in different zones that support ALB. The vSwitches must be in the same Virtual Private Cloud (VPC) as the cluster. For more information, see Create and manage vSwitches and Zones that support ALB.
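The two command-line prerequisites can be checked locally before you start. The following is a minimal sketch, assuming a POSIX shell. The helper name `have_tool` is hypothetical, and the check only confirms that the binaries are on the PATH, not that kubectl can reach the cluster.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command-line tool is on the PATH.
have_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# kubectl talks to the cluster; ab (Apache Benchmark) generates the test load in Step 4.
for tool in kubectl ab; do
  have_tool "$tool" || true    # keep going so every missing tool is reported
done
```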
Step 1: Create an AlbConfig and associate a Simple Log Service project
-
View the Simple Log Service project associated with the cluster.
-
Log on to the Container Service Management Console. In the left navigation pane, click Clusters.
-
On the Clusters page, click the name of your cluster. In the navigation pane on the left, click Cluster Information.
-
On the Basic Information tab, locate the Log Service Project resource and record the project name on the right.
-
Create an AlbConfig.
-
Create a file named alb-qps.yaml, copy the following content into the file, and enter the Simple Log Service project information in the accessLogConfig field.

    apiVersion: alibabacloud.com/v1
    kind: AlbConfig
    metadata:
      name: alb-qps
    spec:
      config:
        name: alb-qps
        addressType: Internet
        zoneMappings:
        - vSwitchId: vsw-uf6ccg2a9g71hx8go**** # The ID of the vSwitch
        - vSwitchId: vsw-uf6nun9tql5t8nh15****
        accessLogConfig:
          logProject: <LOG_PROJECT> # The name of the Simple Log Service project associated with the cluster
          logStore: <LOG_STORE> # A custom Logstore name. The name must start with "alb_".
      listeners:
      - port: 80
        protocol: HTTP

The following table describes the fields.

| Field | Type | Description |
| --- | --- | --- |
| logProject | string | The name of the Simple Log Service project. Default value: "". |
| logStore | string | The name of the Simple Log Service Logstore. The name must start with alb_. If the Logstore does not exist, it is automatically created. For a configuration example of a Simple Log Service Logstore, see Enable access logs. Default value: "alb_****". |

-
Run the following command to create the AlbConfig.
    kubectl apply -f alb-qps.yaml

Expected output:

    albconfig.alibabacloud.com/alb-qps created
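A Logstore name that does not start with alb_ is an easy mistake to make in this step. The sketch below shows one way to check a candidate name before writing it into alb-qps.yaml. The helper `valid_logstore_name` and the sample value `alb_qps_demo` are hypothetical, and the check assumes that at least one character must follow the prefix.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the Logstore name starts with "alb_"
# and has at least one more character (that second condition is an assumption).
valid_logstore_name() {
  case "$1" in
    alb_?*) return 0 ;;
    *)      return 1 ;;
  esac
}

LOG_STORE="alb_qps_demo"   # example value, not taken from the walkthrough
if valid_logstore_name "$LOG_STORE"; then
  echo "ok: $LOG_STORE"
else
  echo "invalid: $LOG_STORE must start with alb_"
fi
```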
Step 2: Create sample resources
In addition to an AlbConfig, an ALB Ingress requires a Deployment, a Service, an IngressClass, and an Ingress to function. Follow these steps to create these resources.
-
Create a file named qps-quickstart.yaml with the following content.

    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: qps-ingressclass
    spec:
      controller: ingress.k8s.alibabacloud/alb
      parameters:
        apiGroup: alibabacloud.com
        kind: AlbConfig
        name: alb-qps # Must be the same as the name of the AlbConfig
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: qps-ingress
    spec:
      ingressClassName: qps-ingressclass # Must be the same as the name of the IngressClass
      rules:
      - host: demo.alb.ingress.top # Replace with your domain name
        http:
          paths:
          - path: /qps
            pathType: Prefix
            backend:
              service:
                name: qps-svc
                port:
                  number: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: qps-svc
      namespace: default
    spec:
      ports:
      - port: 80
        protocol: TCP
        targetPort: 80
      selector:
        app: qps-deploy
      type: NodePort
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: qps-deploy
      labels:
        app: qps-deploy
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: qps-deploy
      template:
        metadata:
          labels:
            app: qps-deploy
        spec:
          containers:
          - name: qps-container
            image: nginx:1.7.9
            ports:
            - containerPort: 80

-
Run the following command to create the sample resources.
kubectl apply -f qps-quickstart.yaml
Step 3: Create an HPA
-
Create and save a file named qps-hpa.yaml with the following content.

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: qps-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: qps-deploy # The name of the workload controlled by the HPA
      minReplicas: 2 # The minimum number of pods
      maxReplicas: 10 # The maximum number of pods
      metrics:
      - type: External # Use external metrics (non-native Kubernetes metrics)
        external:
          metric:
            name: sls_alb_ingress_qps # The metric name (QPS of Alibaba Cloud ALB Ingress). Do not modify this value.
            selector:
              matchLabels:
                sls.project: <LOG_PROJECT> # The name of the Simple Log Service project. Replace with your actual project name.
                sls.logstore: <LOG_STORE> # The name of the Logstore. Replace with your actual Logstore name.
                sls.ingress.route: default-qps-svc-80 # The path of the service. The format is <namespace>-<svc>-<port>.
          target:
            type: AverageValue # The target metric type (average value)
            averageValue: "2" # The target value for the metric. In this example, the target is an average of 2 QPS per pod.

The following table describes the fields.

| Field | Description |
| --- | --- |
| scaleTargetRef | The workload used by the application. This example uses the Deployment named qps-deploy created in Step 2. |
| minReplicas | The minimum number of pods to which the Deployment can be scaled in. This value must be an integer greater than or equal to 1. |
| maxReplicas | The maximum number of pods to which the Deployment can be scaled out. This value must be greater than the minimum number of replicas. |
| external.metric.name | The metric for QPS data that the HPA uses. Do not modify this value. |
| sls.project | The Simple Log Service project that provides the metric data. This must be the same as the project specified in the AlbConfig. |
| sls.logstore | The Logstore that provides the metric data. This must be the same as the Logstore specified in the AlbConfig. |
| sls.ingress.route | The path of the Service, in the format <namespace>-<svc>-<port>. This example uses the qps-svc Service created in Step 2. |
| external.target | The target value for the metric. In this example, the target is an average of 2 QPS per pod. The HPA adjusts the number of pods to keep the average QPS per pod as close to the target value as possible. |

-
Run the following command to create the HPA.

    kubectl apply -f qps-hpa.yaml

-
Run the following command to view the HPA deployment status.

    kubectl get hpa

Expected output:

    NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          5m41s

-
Run the following command to view the HPA configuration details.

    kubectl describe hpa qps-hpa

Expected output:

    Name:                                            qps-hpa
    Namespace:                                       default
    Labels:                                          <none>
    Annotations:                                     <none>
    CreationTimestamp:                               ******** # The timestamp of the HPA. You can ignore this.
    Reference:                                       Deployment/qps-deploy
    Metrics:                                         ( current / target )
      "sls_alb_ingress_qps" (target average value):  0 / 2
    Min replicas:                                    2
    Max replicas:                                    10
    Deployment pods:                                 2 current / 2 desired
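
The sls.ingress.route label in the HPA manifest follows the <namespace>-<svc>-<port> format. As a small sketch, the hypothetical helper `ingress_route` below assembles that value from its three parts, which can help avoid typos when you adapt the manifest to another Service.

```shell
#!/bin/sh
# Hypothetical helper: build the sls.ingress.route label value
# in the <namespace>-<svc>-<port> format.
# Usage: ingress_route <namespace> <service> <port>
ingress_route() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

ingress_route default qps-svc 80   # prints default-qps-svc-80
```

The three inputs correspond to the namespace, name, and port of the qps-svc Service defined in qps-quickstart.yaml.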
(Optional) Step 4: Verify the result
-
Verify application scale-out.
-
Run the following command to view the Ingress information.

    kubectl get ingress

Expected output:

    NAME          CLASS              HOSTS                  ADDRESS                         PORTS   AGE
    qps-ingress   qps-ingressclass   demo.alb.ingress.top   alb-********.alb.aliyuncs.com   80      10m31s

Record the values of HOSTS and ADDRESS for later use.
-
Run the following command to perform a stress test on the application.
Replace demo.alb.ingress.top and alb-********.alb.aliyuncs.com with the values you recorded in the previous step.

    ab -r -c 5 -n 10000 -H Host:demo.alb.ingress.top http://alb-********.alb.aliyuncs.com/qps

-
Run the following command to view the scaling status of the application.

    kubectl get hpa

Expected output:

    NAME      REFERENCE               TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
    qps-hpa   Deployment/qps-deploy   14375m/2 (avg)   2         10        10         15m

The REPLICAS value in the output is 10. This indicates that as the QPS increased, the application scaled out to the maximum of 10 pods specified by MAXPODS. The TARGETS value 14375m uses Kubernetes milli-unit notation and corresponds to an average of 14.375 QPS per pod, well above the target of 2.
-
Verify application scale-in.
After the stress test completes, run the following command to view the scaling status of the application.
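
    kubectl get hpa

Expected output:

    NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          28m

The REPLICAS value in the output is 2. This indicates that after the QPS dropped to 0, the application scaled in to the minimum of 2 pods specified by MINPODS. Scale-in is not immediate: by default, Kubernetes applies a 5-minute stabilization window before it removes pods, so the replica count can take several minutes to drop after the test ends.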
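
The scale-out output reports TARGETS as 14375m/2. The m suffix is Kubernetes quantity notation for thousandths. The sketch below converts such a value to a plain decimal so that it can be compared with the target. The helper `quantity_to_decimal` is hypothetical and handles only plain numbers and the m suffix, not the other quantity suffixes that Kubernetes supports.

```shell
#!/bin/sh
# Hypothetical helper: convert a Kubernetes quantity such as "14375m" to a decimal.
# Only plain numbers and the "m" (milli) suffix are handled in this sketch.
quantity_to_decimal() {
  case "$1" in
    *m) awk -v v="${1%m}" 'BEGIN { printf "%.3f\n", v / 1000 }' ;;
    *)  awk -v v="$1"     'BEGIN { printf "%.3f\n", v }' ;;
  esac
}

quantity_to_decimal 14375m   # prints 14.375
quantity_to_decimal 0        # prints 0.000
```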
References
-
To scale your application based on the CPU or memory load of pods, see Horizontal pod autoscaling (HPA).
-
To scale your application on a schedule, see Scheduled horizontal pod autoscaling (CronHPA).
-
For node scaling, see Node scaling.