Alibaba Cloud Container Compute Service (ACS) supports declaring compute classes (compute-class) and compute Quality of Service (QoS) in pod labels. The inventory of different instance types changes dynamically. An instance of a specific type may fail to be created due to factors such as insufficient resource inventory. With custom priority scheduling, you can specify multiple compute classes or compute QoS classes for a pod. The scheduler then attempts to create the corresponding pod instances in the specified order. This feature also controls the reverse-order scale-in of the application using the pod deletion cost mechanism. This topic describes how to use custom priority scheduling in an ACS cluster.
Prerequisites
kube-scheduler is installed and its version meets the following requirements.
ACS cluster version | Scheduler component version |
1.31 | v1.31.0-aliyun-1.2.0 and later |
1.30 | v1.30.3-aliyun-1.1.1 and later |
1.28 | v1.28.9-aliyun-1.1.0 and later |
acs-virtual-node is installed and its version is v2.12.0-acs.4 or later.
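The scheduler version requirement can also be checked in a script. The following Python sketch is illustrative only. It assumes that scheduler versions always follow the vX.Y.Z-aliyun-A.B.C pattern shown above and that "and later" means a numerically greater or equal version tuple. The helper names are hypothetical, and how the installed version is obtained is left out.

```python
import re

# Minimum scheduler versions copied from the prerequisites above.
MIN_SCHEDULER_VERSION = {
    "1.31": "v1.31.0-aliyun-1.2.0",
    "1.30": "v1.30.3-aliyun-1.1.1",
    "1.28": "v1.28.9-aliyun-1.1.0",
}

# Assumption: every scheduler version string matches this pattern.
_VERSION_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)-aliyun-(\d+)\.(\d+)\.(\d+)")


def parse_scheduler_version(version: str) -> tuple:
    """Turn 'v1.31.0-aliyun-1.2.0' into (1, 31, 0, 1, 2, 0)."""
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unexpected scheduler version format: {version!r}")
    return tuple(int(part) for part in match.groups())


def scheduler_meets_requirement(cluster_version: str, scheduler_version: str) -> bool:
    """Return True if the scheduler version is at least the listed minimum."""
    minimum = MIN_SCHEDULER_VERSION.get(cluster_version)
    if minimum is None:
        # Cluster version not listed above: treat as unsupported.
        return False
    return parse_scheduler_version(scheduler_version) >= parse_scheduler_version(minimum)


print(scheduler_meets_requirement("1.30", "v1.30.3-aliyun-1.1.1"))  # True
print(scheduler_meets_requirement("1.30", "v1.30.3-aliyun-1.1.0"))  # False
```

The acs-virtual-node version (v2.12.0-acs.4) uses a different format and is not covered by this sketch.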
Notes
For more information about the compute classes that support resource scheduling based on custom priorities, see Compute classes.
ACS resource scheduling based on custom priorities uses the Pod deletion cost feature of Kubernetes to control the scale-in order of pods. Theoretically, the pod with the lowest pod deletion cost is scaled in first. However, the scale-in algorithm considers a variety of factors that depend on the implementation of the pod controller. Note that if a pod you create already has the controller.kubernetes.io/pod-deletion-cost annotation, its value is overwritten by the ACS custom priority resource scheduling policy.
Do not use system-reserved labels, such as alibabacloud.com/compute-class and alibabacloud.com/compute-qos, in the label selector of a workload, such as the spec.selector.matchLabels of a deployment. The system may modify these labels during custom priority scheduling. This can cause the controller to frequently recreate pods and affect application stability.
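The label selector restriction above can be checked before a workload is applied. The following Python sketch is a hypothetical helper, not part of ACS. It assumes the manifest has already been loaded into a dictionary, and it only knows the two reserved labels named in this topic (other reserved labels may exist).

```python
# The two system-reserved labels named in this topic.
RESERVED_LABELS = frozenset({
    "alibabacloud.com/compute-class",
    "alibabacloud.com/compute-qos",
})


def reserved_selector_labels(workload: dict) -> list:
    """Return the reserved label keys used in spec.selector of a workload."""
    selector = (workload.get("spec") or {}).get("selector") or {}
    keys = set(selector.get("matchLabels") or {})
    # matchExpressions entries also reference label keys.
    keys.update(expr.get("key") for expr in selector.get("matchExpressions") or [])
    return sorted(keys & RESERVED_LABELS)


deployment = {
    "spec": {
        "selector": {
            "matchLabels": {
                "app": "stress",
                "alibabacloud.com/compute-qos": "default",  # should be removed
            }
        }
    }
}
print(reserved_selector_labels(deployment))  # ['alibabacloud.com/compute-qos']
```

An empty result means the selector does not depend on labels that custom priority scheduling may rewrite.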
Usage
An ACS cluster provides resources in the form of virtual nodes. The main resource properties of a pod include the zone, compute class, and compute QoS. For this purpose, ACS defines the ResourcePolicy scheduling policy. This policy lets you mark a class of pods using spec.selector and configure multiple resource properties at the same time. If the resource inventory is insufficient, the scheduler creates other types of instances based on the configured order. You can use the ResourcePolicy scheduling policy as follows.
Create a ResourcePolicy scheduling policy.
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: rp-demo
  namespace: default
spec:
  selector: # Mark pods in the selector. Pods with the app=stress label will follow this scheduling policy.
    app: stress
  units: # Define the scheduling order in units.
  - resource: acs # First, request resources of the best-effort type.
    podLabels:
      alibabacloud.com/compute-class: general-purpose
      alibabacloud.com/compute-qos: best-effort
  - resource: acs # If the former is out of stock, request resources of the default type.
    podLabels:
      alibabacloud.com/compute-class: general-purpose
      alibabacloud.com/compute-qos: default
Create a workload of any type, such as a Job. Ensure the labels configuration matches the selector in the ResourcePolicy.
apiVersion: batch/v1
kind: Job
metadata:
  name: demo-job
  namespace: default
spec:
  parallelism: 3
  template:
    metadata:
      labels:
        app: stress # Associate with the configuration defined in spec.selector of the ResourcePolicy.
    spec:
      containers:
      - name: demo-job
        image: registry.cn-hangzhou.aliyuncs.com/acs/stress:v1.0.4
        args:
        - 'infinity'
        command:
        - sleep
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "1Gi"
      restartPolicy: Never
  backoffLimit: 4
Advanced configuration parameters
The following detailed YAML example of a ResourcePolicy shows the format of advanced configuration parameters for custom priority scheduling.
This topic describes only the common configurations for using ResourcePolicy in an ACS cluster. For more information about all configurations of the ResourcePolicy feature, see Custom Elastic Resource Priority Scheduling.
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: rp-demo
  namespace: default
spec:
  # The following is the application configuration. It is used to mark a group of pods. Pods that meet the condition will follow this scheduling policy.
  selector:
    app: stress
  # End of application configuration.
  # The following is the resource configuration. It is used to describe the scheduling order.
  units:
  - resource: acs # The resource type must be set to acs.
    podLabels: # First, request resources of the "general-purpose" + "best-effort" type.
      alibabacloud.com/compute-class: general-purpose
      alibabacloud.com/compute-qos: best-effort
    nodeSelector: # You can use nodeSelector to specify the zone of the virtual node.
      topology.kubernetes.io/zone: cn-hangzhou-i
  - resource: acs # If the former is out of stock, request resources of the "general-purpose" + "default" type.
    podLabels:
      alibabacloud.com/compute-class: general-purpose
      alibabacloud.com/compute-qos: default
  # End of resource configuration.
  # Other fields apply to non-ACS clusters. ResourcePolicy has default values for them after creation. You can ignore them.
Application configuration
The application configuration consists of a set of labels. Only pods with these labels follow this policy. You can configure different resource orders based on the application type.
Configuration item | Type | Description | Example |
selector | map[string]string | Pods that have all these labels are scheduled according to this ResourcePolicy rule. | |
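The "has all these labels" rule for selector can be written as a short predicate. This Python sketch only illustrates the matching rule described in the table. The function name is hypothetical, and the behavior for an empty selector (it matches every pod here) is a property of the sketch, not documented ACS behavior.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """Return True if the pod carries every key/value pair in the selector."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


selector = {"app": "stress"}

# Extra labels on the pod do not prevent a match.
print(selector_matches(selector, {"app": "stress", "tier": "batch"}))  # True
# A different value for a selected key means the policy does not apply.
print(selector_matches(selector, {"app": "web"}))  # False
```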
Resource configuration
The resource configuration is a list. Each element in the list describes detailed resource properties. The scheduler attempts to create pods that meet the conditions in the Application configuration in sequence, based on the properties of each element. If the inventory is insufficient, the scheduler automatically attempts to use the next element. If all of the specified resource types are out of stock, the pod remains in the Pending state. The scheduler continuously retries all resource types until the pod is created. The following table describes the parameters that each element contains.
Configuration item | Type | Description | Value | Example |
resource | string | The resource type. This parameter is required. Only `acs` is supported. | acs | |
nodeSelector | map[string]string | Filter virtual nodes by label, for example, by zone. | For more information about the supported labels, see node affinity scheduling. | |
podLabels[alibabacloud.com/compute-class] | string | Describes the compute class requested by the pod. | | |
podLabels[alibabacloud.com/compute-qos] | string | Describes the compute QoS requested by the pod. | | |
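The ordered fallback described in this section can be modeled in a few lines. The following Python sketch is a simplified model, not the scheduler's implementation. The has_stock callback is a hypothetical stand-in for the real inventory check, and nodeSelector filtering and retries are not modeled.

```python
from typing import Callable, Optional

CLASS_LABEL = "alibabacloud.com/compute-class"
QOS_LABEL = "alibabacloud.com/compute-qos"


def pick_unit(units: list, has_stock: Callable[[dict], bool]) -> Optional[int]:
    """Return the index of the first unit that is in stock.

    None means every unit is out of stock, which corresponds to the pod
    staying in the Pending state until a later retry succeeds.
    """
    for index, unit in enumerate(units):
        if has_stock(unit):
            return index
    return None


units = [
    {"resource": "acs", "podLabels": {CLASS_LABEL: "general-purpose", QOS_LABEL: "best-effort"}},
    {"resource": "acs", "podLabels": {CLASS_LABEL: "general-purpose", QOS_LABEL: "default"}},
]

# Hypothetical inventory: only general-purpose + default is available.
in_stock = {("general-purpose", "default")}


def has_stock(unit: dict) -> bool:
    labels = unit["podLabels"]
    return (labels[CLASS_LABEL], labels[QOS_LABEL]) in in_stock


print(pick_unit(units, has_stock))  # 1
```

In this model the first unit is skipped because it is out of stock, so the pod would be created with the labels of the second unit.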
Example
This example shows how to use a ResourcePolicy to sequentially request resources with the default and best-effort compute QoS for an application.
1. Create a file named resource-policy.yaml with the following content. This declares that for pods with the app=stress label, resources with the compute class `performance` and compute QoS `default` are requested.
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: stress-demo
  namespace: default
spec:
  selector:
    app: stress
  units:
  - resource: acs
    podLabels:
      alibabacloud.com/compute-class: performance
      alibabacloud.com/compute-qos: default
2. Run the following command to deploy the ResourcePolicy to the cluster.
kubectl apply -f resource-policy.yaml
3. Create a file named stress-dep.yaml with the following content.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: stress
  template:
    metadata:
      labels:
        # Keep this consistent with the configuration in the ResourcePolicy.
        app: stress
    spec:
      containers:
      - name: stress
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "infinity"
        resources:
          limits:
            cpu: '1'
            memory: 1Gi
          requests:
            cpu: '1'
            memory: 1Gi
4. Run the following command to deploy the stress application to the cluster.
kubectl apply -f stress-dep.yaml
5. Run the following command to view the pod status.
kubectl get pod -L alibabacloud.com/compute-class,alibabacloud.com/compute-qos
Expected output:
# The output is affected by factors such as resource stock. The actual output may vary.
NAME               READY   STATUS    RESTARTS   AGE   COMPUTE-CLASS   COMPUTE-QOS
stress-xxxxxxxx1   1/1     Running   0          53s   performance     default
6. Update the resource-policy.yaml file with the following content. This adds a resource property description to request resources in the following order:
First, request resources with the compute class `general-purpose` and compute QoS `best-effort`.
If the inventory of the preceding resources is insufficient, request resources with the compute class `performance` and compute QoS `default`.
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: stress-demo
  namespace: default
spec:
  selector:
    app: stress
  units:
  - resource: acs
    podLabels:
      alibabacloud.com/compute-class: general-purpose
      alibabacloud.com/compute-qos: best-effort
  - resource: acs
    podLabels:
      alibabacloud.com/compute-class: performance
      alibabacloud.com/compute-qos: default
7. Run the following command to update the ResourcePolicy in the cluster. The updated policy takes effect on subsequently created pods.
kubectl apply -f resource-policy.yaml
8. Run the following command to scale out the stress application created in Step 3 to two replicas.
kubectl scale deployment stress --replicas=2
9. Run the following command to view the pod status.
kubectl get pod -L alibabacloud.com/compute-class,alibabacloud.com/compute-qos
Expected output:
# The output is affected by factors such as resource stock. The actual output may vary.
NAME               READY   STATUS    RESTARTS   AGE     COMPUTE-CLASS     COMPUTE-QOS
stress-xxxxxxxx1   1/1     Running   0          2m14s   performance       default
stress-xxxxxxxx2   1/1     Running   0          33s     general-purpose   best-effort
You can see that the new replica has the compute class `general-purpose` and compute QoS `best-effort`.
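With more replicas, reading the label columns by eye becomes tedious. The following Python sketch counts pods per compute class and compute QoS. It assumes the input is the parsed JSON that kubectl get pod -o json returns (a list object with an items array). The function name and the sample data are hypothetical and only mirror the expected output above.

```python
from collections import Counter

CLASS_LABEL = "alibabacloud.com/compute-class"
QOS_LABEL = "alibabacloud.com/compute-qos"


def count_by_compute_type(pod_list: dict) -> Counter:
    """Count pods per (compute class, compute QoS) label pair."""
    counts = Counter()
    for pod in pod_list.get("items", []):
        labels = pod.get("metadata", {}).get("labels") or {}
        # Pods without the labels are counted under (None, None).
        counts[(labels.get(CLASS_LABEL), labels.get(QOS_LABEL))] += 1
    return counts


# Sample data shaped like `kubectl get pod -o json` output.
sample = {
    "items": [
        {"metadata": {"name": "stress-xxxxxxxx1",
                      "labels": {"app": "stress", CLASS_LABEL: "performance", QOS_LABEL: "default"}}},
        {"metadata": {"name": "stress-xxxxxxxx2",
                      "labels": {"app": "stress", CLASS_LABEL: "general-purpose", QOS_LABEL: "best-effort"}}},
    ]
}

for (compute_class, qos), count in sorted(count_by_compute_type(sample).items()):
    print(compute_class, qos, count)
```

Against a real cluster, the same function could be applied to json.loads() of the command output instead of the sample dictionary.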