The inter-pod affinity scheduling policy is used to express affinity preferences between pods. Compared with node affinity, inter-pod affinity scheduling limits the nodes to which pods can be scheduled based on the labels of pods that already run on nodes. In an Alibaba Cloud Container Compute Service (ACS) cluster, you can use Kubernetes-native scheduling semantics to implement inter-pod affinity scheduling. By specifying the topology domain and label rules in the podAffinity or podAntiAffinity field, you can schedule pods to the specified topology domain. This topic describes the limits and usage notes of inter-pod affinity scheduling in ACS.
Prerequisites
- An ACS cluster is created. For more information, see Create an ACS cluster.
- kube-scheduler is installed. For more information, see kube-scheduler.
- acs-virtual-node v2.12.0-acs.4 or later is installed.
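To check whether the installed acs-virtual-node version meets the requirement, you can inspect the component's image tag. The following command is a sketch only: the deployment name and the kube-system namespace are assumptions and may differ in your cluster.

```shell
# Print the image (including its version tag) of the virtual node component.
# The deployment name and namespace are assumptions; adjust them for your cluster.
kubectl get deployment acs-virtual-node -n kube-system \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
```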
Usage notes
Inter-pod affinity includes the affinity and anti-affinity modes. Both modes use the same protocol format and consist of the requiredDuringSchedulingIgnoredDuringExecution rule and the preferredDuringSchedulingIgnoredDuringExecution rule. The topologyKey field is required and corresponds to a label on the virtual node. The following table describes the topology labels supported by ACS for different types of nodes.
| Node type | Label | Description | Example |
| --- | --- | --- | --- |
| Regular node | topology.kubernetes.io/zone | Network zone | topology.kubernetes.io/zone: cn-shanghai-b |
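For reference, the following snippet is a minimal sketch of the rule format; the pod label app: demo is hypothetical, and topologyKey uses the zone label from the preceding table.

```yaml
affinity:
  podAffinity:
    # Hard requirement: schedule this pod only to a zone that already
    # runs at least one pod with the (hypothetical) label app=demo.
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: demo
      topologyKey: topology.kubernetes.io/zone
```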
ACS supports multiple compute classes. The following constraints apply to the other fields of an inter-pod affinity rule, depending on the compute class.
requiredDuringSchedulingIgnoredDuringExecution

| Compute class | Field | Description | Constraint |
| --- | --- | --- | --- |
| General-purpose, Performance-enhanced | LabelSelector | This field is used to find matching pods. Pods that match this label selector are counted to determine the number of pods in the topology domain. | Pods in other compute classes, such as the GPU-accelerated compute class, are not counted. |
| General-purpose, Performance-enhanced | Namespaces | This field is used to find matching namespaces and can be used together with the LabelSelector field. | None |
| General-purpose, Performance-enhanced | NamespaceSelector | This field is similar to the Namespaces field, except that it selects namespaces based on namespace labels. | None |
| GPU-accelerated | LabelSelector | This field is used to find matching pods. Pods that match this label selector are counted to determine the number of pods in the topology domain. | Pods in other compute classes, such as the general-purpose and performance-enhanced compute classes, are not counted. |
| GPU-accelerated | Namespaces | This field is used to find matching namespaces and can be used together with the LabelSelector field. | Not supported. |
| GPU-accelerated | NamespaceSelector | This field is similar to the Namespaces field, except that it selects namespaces based on namespace labels. | Not supported. |
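For the general-purpose and performance-enhanced compute classes, the Namespaces and NamespaceSelector fields restrict which pods are matched. The following sketch uses hypothetical namespace names and labels:

```yaml
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: demo                  # hypothetical pod label
      # Match pods only in the listed namespaces (hypothetical name).
      namespaces:
      - frontend
      # Alternatively, select namespaces by label instead of by name:
      # namespaceSelector:
      #   matchLabels:
      #     team: web                # hypothetical namespace label
      topologyKey: topology.kubernetes.io/zone
```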
preferredDuringSchedulingIgnoredDuringExecution

| Compute class | Field | Description | Constraint |
| --- | --- | --- | --- |
| General-purpose, Performance-enhanced | LabelSelector | Same as in the requiredDuringSchedulingIgnoredDuringExecution rule. | Pods in other compute classes, such as the GPU-accelerated and performance-enhanced compute classes, are not counted. |
| General-purpose, Performance-enhanced | Namespaces | Same as in the requiredDuringSchedulingIgnoredDuringExecution rule. | None |
| General-purpose, Performance-enhanced | NamespaceSelector | Same as in the requiredDuringSchedulingIgnoredDuringExecution rule. | None |
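Note that each preferred rule is a weighted term: the weight (1 to 100) and the affinity term itself are wrapped in a podAffinityTerm object, as in the following sketch with a hypothetical label:

```yaml
affinity:
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100                    # 1-100; higher weights are favored more strongly
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: demo                # hypothetical pod label
        topologyKey: topology.kubernetes.io/zone
```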
For more information about the fields, see Inter-pod affinity and anti-affinity.
Examples
The following example shows how to schedule pods to a specific zone by configuring the podAffinity field.
Run the following command to view the nodes in the cluster:
```shell
kubectl get node
```
Expected output:
```text
NAME                            STATUS   ROLES   AGE     VERSION
virtual-kubelet-cn-hangzhou-i   Ready    agent   5h42m   v1.28.3-xx
virtual-kubelet-cn-hangzhou-j   Ready    agent   5h42m   v1.28.3-xx
```
Create a file named with-affinity-pod.yaml and add the following content to the file:
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    pod-affinity-label: with-pod-affinity
  name: with-affinity-label-pod
spec:
  containers:
  - args:
    - 'infinity'
    command:
    - sleep
    image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
    imagePullPolicy: IfNotPresent
    name: stress
    resources:
      limits:
        cpu: '1'
        memory: 1Gi
      requests:
        cpu: '1'
        memory: 1Gi
```
Run the following command to deploy the with-affinity-pod.yaml file in the cluster:
```shell
kubectl apply -f with-affinity-pod.yaml
```
Run the following command to view the distribution results of pods:
```shell
kubectl get pod -o wide
```
Expected output:
```text
NAME                      READY   STATUS    RESTARTS   AGE   IP               NODE                            NOMINATED NODE   READINESS GATES
with-affinity-label-pod   1/1     Running   0          75s   192.168.xx.xxx   virtual-kubelet-cn-hangzhou-i   <none>           <none>
```
The output indicates that the pod is scheduled to the cn-hangzhou-i zone.
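To confirm which zone each virtual node belongs to, you can print the zone label as an extra column:

```shell
# -L adds the value of the given label as a column in the node list.
kubectl get node -L topology.kubernetes.io/zone
```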
Create a file named origin-affinity-pod.yaml and add the following content to the file:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-pod-affinity
  labels:
    app: pod-affinity-demo
spec:
  replicas: 4
  selector:
    matchLabels:
      app: pod-affinity-demo
  template:
    metadata:
      labels:
        app: pod-affinity-demo
    spec:
      containers:
      - name: pod-affinity-demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "infinity"
        resources:
          limits:
            cpu: '1'
            memory: 1Gi
          requests:
            cpu: '1'
            memory: 1Gi
      # Specify the affinity between pods. This pod must be deployed in the
      # same zone as the pod with the pod-affinity-label: with-pod-affinity label.
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: pod-affinity-label
                operator: In
                values:
                - with-pod-affinity
            topologyKey: topology.kubernetes.io/zone
```
Run the following command to deploy the origin-affinity-pod.yaml file in the cluster:
```shell
kubectl apply -f origin-affinity-pod.yaml
```
Run the following command to view the distribution results of pods:
```shell
kubectl get pod -o wide
```
Expected output:
```text
NAME                                READY   STATUS    RESTARTS   AGE     IP               NODE                            NOMINATED NODE   READINESS GATES
dep-pod-affinity-6b9d4f7c87-5jlfx   1/1     Running   0          3m26s   192.168.xx.xxx   virtual-kubelet-cn-hangzhou-i   <none>           <none>
dep-pod-affinity-6b9d4f7c87-hwdpc   1/1     Running   0          3m26s   192.168.xx.xxx   virtual-kubelet-cn-hangzhou-i   <none>           <none>
dep-pod-affinity-6b9d4f7c87-jfcrq   1/1     Running   0          3m26s   192.168.xx.xxx   virtual-kubelet-cn-hangzhou-i   <none>           <none>
dep-pod-affinity-6b9d4f7c87-xwbfr   1/1     Running   0          3m26s   192.168.xx.xxx   virtual-kubelet-cn-hangzhou-i   <none>           <none>
with-affinity-label-pod             1/1     Running   0          6m30s   192.168.xx.xxx   virtual-kubelet-cn-hangzhou-i   <none>           <none>
```
The output indicates that all pods are scheduled to the same zone (cn-hangzhou-i) as the pod that has the pod-affinity-label: with-pod-affinity label.
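Conversely, to keep these pods out of the zone that runs the labeled pod, you can replace podAffinity with podAntiAffinity while keeping the same term. The following is a minimal sketch that reuses the labels from this example:

```yaml
# Anti-affinity variant: do not schedule to any zone that already runs
# a pod labeled pod-affinity-label=with-pod-affinity.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: pod-affinity-label
          operator: In
          values:
          - with-pod-affinity
      topologyKey: topology.kubernetes.io/zone
```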