Implement workload scaling based on the UnitedDeployment controller provided by Kruise

The UnitedDeployment custom resource simplifies the management of multiple homogeneous workloads by grouping them into flexible units called Subsets. For example, when an application needs to be deployed across multiple availability zones, you can define a Subset for the Deployment in each zone. UnitedDeployment then manages the fine-grained updates and deployments for each Subset, eliminating the need to manually configure and maintain separate Deployment YAML files.

You can also combine UnitedDeployment with the Horizontal Pod Autoscaler (HPA) to achieve ordered scaling in clusters that use a mix of compute resources. Pods are scaled out in a specific order (for example, from the cheapest resource type to the most expensive) and scaled in in the reverse order, which helps reduce resource costs.

For details, see UnitedDeployment in the official Kruise documentation. This topic describes how to configure a UnitedDeployment by using YAML files to meet the requirements of different scenarios.

Supported workload types

The UnitedDeployment controller supports only the following workload types: StatefulSet, Advanced StatefulSet, CloneSet, and Deployment. For more information, see Use OpenKruise to deploy cloud-native applications.

Prerequisites

The ack-kruise add-on is installed. For more information, see Manage add-ons.
The ACK Virtual Node add-on is installed so that pods can be scheduled to Elastic Container Instance (ECI). For more information, see Manage add-ons.
A kubectl client is connected to the ACK cluster. For more information, see Get a cluster kubeconfig and connect to the cluster using kubectl.

Scenario 1: Use the UnitedDeployment controller together with HPA

If your cluster contains multiple types of compute resources, define one subset for each resource type in the YAML file of a UnitedDeployment. You can also set the maxReplicas field on a subset to cap the number of pods in that subset. After the cap is reached, additional pods are scheduled to other subsets. When you use HPA to scale a UnitedDeployment, pods are scaled based on the priorities that you specify for the resource types. When you configure HPA, set kind to UnitedDeployment and name to the name of the UnitedDeployment in the scaleTargetRef field.

Important

OpenKruise 1.5.0 or later is required. For the OpenKruise release notes, see OpenKruise.

In this example, the cluster contains two node pools. Node Pool A consists of subscription ECS instances, and Node Pool B consists of preemptible (spot) instances. Pods are scheduled to the different node types in the following order of priority: subscription ECS instance > preemptible instance > ECI. When nodes of one type are out of stock, the system uses nodes of the next lower priority for pod scheduling.

Create a file named test.yaml and copy the following content to the file.

The test.yaml file is used to orchestrate a UnitedDeployment. The template parameter in the file defines the configurations of a Deployment. The file also defines three subsets.

Pods of subset-a are deployed on the subscription ECS instances in Node Pool A. Pods of subset-b are deployed on the preemptible instances in Node Pool B. The number of replicas is 1 on both subset-a and subset-b.
subset-c is configured with the nodeSelectorTerm and tolerations parameters, which are used to select ECI. In this case, pods of subset-c are deployed on ECI. The number of replicas on subset-c is 3.

apiVersion: apps.kruise.io/v1alpha1
kind: UnitedDeployment
metadata:
  name: ud-nginx
spec:
  replicas: 5  # Equals the sum of the subset replicas below (1 + 1 + 3).
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: ud-nginx
  template:
    deploymentTemplate:
      metadata:
        labels:
          app: ud-nginx
      spec:
        selector:
          matchLabels:
            app: ud-nginx
        template:
          metadata:
            labels:
              app: ud-nginx
          spec:
            containers:
            - image: alibaba-cloud-linux-3-registry.cn-hangzhou.cr.aliyuncs.com/alinux3/nginx_optimized:20240221-1.20.1-2.3.0
              name: nginx
  topology:
    subsets:
    - name: subset-a
      nodeSelectorTerm:
        matchExpressions:
        - key: alibabacloud.com/nodepool-id
          operator: In
          values:
          - np92019eec42004d878fcdc990fcb9****   # Replace the value with the ID of Node Pool A. 
      replicas: 1
    - name: subset-b
      nodeSelectorTerm:
        matchExpressions:
        - key: alibabacloud.com/nodepool-id
          operator: In
          values:
          - np011de1f2de3d48bd8a92a015fc5c****  # Replace the value with the ID of Node Pool B. 
      replicas: 1
    - name: subset-c
      nodeSelectorTerm:
        matchExpressions:
        - key: type
          operator: In
          values:
          - virtual-kubelet
      tolerations:
      - key: virtual-kubelet.io/provider
        operator: Exists
      replicas: 3
  updateStrategy:
    manualUpdate:
      partitions:
        subset-a: 0
        subset-b: 0
        subset-c: 0
    type: Manual

Run the following command to deploy the UnitedDeployment:

kubectl apply -f test.yaml

Expected output:

uniteddeployment.apps.kruise.io/ud-nginx created

Run the following command to check whether pods are created:

kubectl get pod -o wide

Expected output:

NAME                                       READY   STATUS    RESTARTS   AGE   IP               NODE                            NOMINATED NODE   READINESS GATES
ud-nginx-subset-a-7lbtd-5b5bd77549-5bw6l   1/1     Running   0          73s   192.XX.XX.126    cn-hangzhou.10.XX.XX.131       <none>           <none>
ud-nginx-subset-b-nvvfw-5c9bcd6766-lv6sp   1/1     Running   0          73s   192.XX.XX.239    cn-hangzhou.10.XX.XX.132      <none>           <none>
ud-nginx-subset-c-m78fd-7796b66fd8-7p52j   1/1     Running   0          73s   192.XX.XX.130    virtual-kubelet-cn-hangzhou-h   <none>           <none>
ud-nginx-subset-c-m78fd-7796b66fd8-fd7f7   1/1     Running   0          73s   192.XX.XX.129    virtual-kubelet-cn-hangzhou-h   <none>           <none>
ud-nginx-subset-c-m78fd-7796b66fd8-mn4qb   1/1     Running   0          73s   192.XX.XX.131    virtual-kubelet-cn-hangzhou-h   <none>           <none>

The output indicates that the pods are distributed across the subsets as defined in the UnitedDeployment: one pod in subset-a, one pod in subset-b, and three pods in subset-c, which run on the virtual node.

Scenario 2: Deploy applications across zones

To improve the availability of applications, you usually need to deploy computing or storage resources across multiple zones. You can add different labels to nodes in different zones. Then, configure label selectors in the subset configurations in the YAML file of a UnitedDeployment. This way, the UnitedDeployment schedules pods of different subsets to different zones with matching labels.

Three nodes reside in different zones. Run the following commands to add a label to each node to indicate the zone where the node resides.

For example, the node=zone-a label is added to the node in Zone A, the node=zone-b label is added to the node in Zone B, and the node=zone-c label is added to the node in Zone C.

# Add the node=zone-a label to node 10.XX.XX.131.
kubectl label node cn-beijing.10.XX.XX.131 node=zone-a
# Add the node=zone-b label to node 10.XX.XX.132.
kubectl label node cn-beijing.10.XX.XX.132 node=zone-b
# Add the node=zone-c label to node 10.XX.XX.133.
kubectl label node cn-beijing.10.XX.XX.133 node=zone-c

Expected output:

node/cn-beijing.10.XX.XX.131 labeled
node/cn-beijing.10.XX.XX.132 labeled
node/cn-beijing.10.XX.XX.133 labeled

Create a file named test.yaml and copy the following content to the file.

The test.yaml file is used to orchestrate a UnitedDeployment. The template parameter in the file defines the configurations of a StatefulSet. The file also defines a subset for each zone.

The statefulSetTemplate field defines the template of the StatefulSet. The UnitedDeployment creates a StatefulSet on each subset based on the template.
The subsets section defines a subset for each zone. Pods of subset-a are deployed on the node whose label is node=zone-a. Pods of subset-b are deployed on the node whose label is node=zone-b. Pods of subset-c are deployed on the node whose label is node=zone-c.

apiVersion: apps.kruise.io/v1alpha1
kind: UnitedDeployment
metadata:
  name: sample-ud
spec:
  replicas: 6
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: sample
  template:
    statefulSetTemplate:
      metadata:
        labels:
          app: sample
      spec:
        selector:
          matchLabels:
            app: sample
        template:
          metadata:
            labels:
              app: sample
          spec:
            containers:
            - image: alibaba-cloud-linux-3-registry.cn-hangzhou.cr.aliyuncs.com/alinux3/nginx_optimized:20240221-1.20.1-2.3.0
              name: nginx
  topology:
    subsets:
    - name: subset-a
      nodeSelectorTerm:
        matchExpressions:
        - key: node
          operator: In
          values:
          - zone-a
      replicas: 1
    - name: subset-b
      nodeSelectorTerm:
        matchExpressions:
        - key: node
          operator: In
          values:
          - zone-b
      replicas: 50%
    - name: subset-c
      nodeSelectorTerm:
        matchExpressions:
        - key: node
          operator: In
          values:
          - zone-c
  updateStrategy:
    manualUpdate:
      partitions:
        subset-a: 0
        subset-b: 0
        subset-c: 0
    type: Manual

Run the following command to deploy the UnitedDeployment:

kubectl apply -f test.yaml

Expected output:

uniteddeployment.apps.kruise.io/sample-ud created

Run the following command to check whether the pods and StatefulSets are created:

kubectl get pod
# Expected output:
NAME                                     READY   STATUS    RESTARTS   AGE
sample-ud-subset-a-cplwg-0               1/1     Running   0          6m5s
sample-ud-subset-b-rj7kt-0               1/1     Running   0          6m4s
sample-ud-subset-b-rj7kt-1               1/1     Running   0          5m49s
sample-ud-subset-b-rj7kt-2               1/1     Running   0          5m43s
sample-ud-subset-c-g6jvx-0               1/1     Running   0          6m5s
sample-ud-subset-c-g6jvx-1               1/1     Running   0          5m51s

kubectl get statefulset
# Expected output:
NAME                       READY   AGE
sample-ud-subset-a-cplwg   1/1     7m34s
sample-ud-subset-b-rj7kt   3/3     7m34s
sample-ud-subset-c-g6jvx   2/2     7m34s

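The output indicates that one StatefulSet is created for each subset, with one pod in subset-a, three pods in subset-b, and two pods in subset-c. The nodeSelectorTerm of each subset places its pods on the node in the matching zone (Zone A, Zone B, or Zone C).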
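
The pod counts follow from how the subset replicas values in test.yaml relate to spec.replicas: 6. The following shell sketch only reproduces that arithmetic. resolve_replicas is a hypothetical helper and not part of Kruise or kubectl, and rounding a percentage up is an assumption (it makes no difference here, because 50% of 6 is exactly 3).

```shell
#!/bin/sh
# Hypothetical helper: resolve a subset "replicas" value, which is either an
# integer or a percentage, against the total replica count.
# Assumption: a fractional percentage result is rounded up.
resolve_replicas() {
  rr_total=$1
  rr_value=$2
  case "$rr_value" in
    *%)
      rr_pct=${rr_value%"%"}                      # strip the trailing percent sign
      echo $(( (rr_total * rr_pct + 99) / 100 ))  # integer ceiling of total * pct / 100
      ;;
    *)
      echo "$rr_value"
      ;;
  esac
}

total=6                                   # spec.replicas
a=$(resolve_replicas "$total" "1")        # subset-a: replicas: 1
b=$(resolve_replicas "$total" "50%")      # subset-b: replicas: 50%
c=$(( total - a - b ))                    # subset-c: no replicas field, takes the remainder
echo "subset-a=$a subset-b=$b subset-c=$c"
# prints: subset-a=1 subset-b=3 subset-c=2
```

The result matches the READY column of the kubectl get statefulset output above.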

Scenario 3: Use the UnitedDeployment controller to colocate applications on ECS instances and ECIs

To handle business spikes, you may need to ensure the resource supply of cluster nodes and control the costs at the same time. To address this issue, you can prioritize ECS instances for pod scheduling and configure Container Service for Kubernetes (ACK) to automatically deploy applications on ECIs when ECS instances are out of stock. When pods are scaled in, ACK first deletes pods that run on ECIs.

In the following example, a UnitedDeployment is created to show how replicas are scheduled. When the number of replicas does not exceed 4, all replicas are scheduled to ECS instances. When the number of replicas is greater than 4 (up to the HPA maximum of 10), the replicas beyond the first four are scheduled to ECIs.

Create a file named test.yaml and copy the following content to the file.

The deploymentTemplate field defines the template of the Deployment. The UnitedDeployment creates a Deployment on each subset based on the template.
The subsets section defines two subsets. The pods of the first subset (ecs) are scheduled to ECS instances, and the pods of the second subset (eci) are scheduled to ECIs. The ecs subset is limited to four replicas by maxReplicas: 4, and the eci subset has no upper limit, so it receives every replica beyond the first four.

apiVersion: apps.kruise.io/v1alpha1
kind: UnitedDeployment
metadata:
  name: ud-nginx
spec:
  replicas: 6
  selector:
    matchLabels:
      app: sample
  template:
  # statefulSetTemplate or advancedStatefulSetTemplate or cloneSetTemplate or deploymentTemplate
    deploymentTemplate:
      metadata:
        labels:
          app: sample
      spec:
        selector:
          matchLabels:
            app: sample
        template:
          metadata:
            labels:
              app: sample
          spec:
            containers:
            - image: alibaba-cloud-linux-3-registry.cn-hangzhou.cr.aliyuncs.com/alinux3/nginx_optimized:20240221-1.20.1-2.3.0
              name: nginx
              resources:
                requests:
                  cpu: "500m"
  topology:
    subsets:
    - name: ecs
      maxReplicas: 4
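    - name: eci
      # Same virtual-node selector and toleration as subset-c in Scenario 1,
      # so that the pods of this subset run on ECI.
      nodeSelectorTerm:
        matchExpressions:
        - key: type
          operator: In
          values:
          - virtual-kubelet
      tolerations:
      - key: virtual-kubelet.io/provider
        operator: Exists
      maxReplicas: null   # No upper limit.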

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: united-deployment-hpa
spec:
  scaleTargetRef:
    apiVersion: apps.kruise.io/v1alpha1
    kind: UnitedDeployment
    name: ud-nginx  # Replace the value with the name of the UnitedDeployment. 
  minReplicas: 4
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

Run the following command to deploy the UnitedDeployment and the HPA:

kubectl apply -f test.yaml

Expected output:

uniteddeployment.apps.kruise.io/ud-nginx created
horizontalpodautoscaler.autoscaling/united-deployment-hpa created

Run the following command to query the status of the pods:

kubectl get pod -o wide
# Expected output:
NAME                                  READY   STATUS    RESTARTS       AGE   IP               NODE                       NOMINATED NODE   READINESS GATES
ud-nginx-eci-dxfbz-864bdb77b-2d4t9    1/1     Running   0             3m9s   192.XX.XX.129   cn-hangzhou.192.XX.XX.120   <none>           <none>
ud-nginx-eci-dxfbz-864bdb77b-zppfh    1/1     Running   0             3m9s   192.XX.XX.11    cn-hangzhou.192.XX.XX.251   <none>           <none>
ud-nginx-ecs-5lm7r-868c4ccd5d-5mlgh   1/1     Running   0             3m9s   192.XX.XX.4     cn-hangzhou.192.XX.XX.251   <none>           <none>
ud-nginx-ecs-5lm7r-868c4ccd5d-6bdkz   1/1     Running   0             3m9s   192.XX.XX.145   cn-hangzhou.192.XX.XX.32    <none>           <none>
ud-nginx-ecs-5lm7r-868c4ccd5d-dnsfl   1/1     Running   0             3m9s   192.XX.XX.150   cn-hangzhou.192.XX.XX.20    <none>           <none>
ud-nginx-ecs-5lm7r-868c4ccd5d-mrzwc   1/1     Running   0             3m9s   192.XX.XX.128   cn-hangzhou.192.XX.XX.120   <none>           <none>

The output indicates that the replicas of the Deployment are dynamically scheduled based on the scheduling policy defined by the UnitedDeployment. The first four replicas are scheduled to ECS instances. The remaining two replicas are scheduled to ECIs.
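
The split between the two subsets can be sketched with a few lines of shell arithmetic. split_replicas is a hypothetical helper written for this illustration, and it assumes that the subsets are filled in the order in which they are listed, with ecs capped by maxReplicas: 4 and eci uncapped.

```shell
#!/bin/sh
# Hypothetical helper: split the requested replicas between the "ecs" subset
# (maxReplicas: 4) and the "eci" subset (no upper limit).
split_replicas() {
  sr_want=$1
  sr_ecs_max=4
  if [ "$sr_want" -lt "$sr_ecs_max" ]; then
    sr_ecs=$sr_want
  else
    sr_ecs=$sr_ecs_max
  fi
  sr_eci=$(( sr_want - sr_ecs ))   # everything above the cap overflows to eci
  echo "ecs=$sr_ecs eci=$sr_eci"
}

# HPA minimum, the initial spec.replicas, and the HPA maximum.
for n in 4 6 10; do
  echo "replicas=$n -> $(split_replicas "$n")"
done
# prints:
# replicas=4 -> ecs=4 eci=0
# replicas=6 -> ecs=4 eci=2
# replicas=10 -> ecs=4 eci=6
```

The middle line corresponds to the output above: four ud-nginx-ecs pods and two ud-nginx-eci pods.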

Trigger HPA to perform a scale-in activity and then run the following command to query the status of pods:

kubectl get pod -o wide

Expected output:

NAME                                  READY   STATUS    RESTARTS       AGE    IP              NODE                        NOMINATED NODE   READINESS GATES
ud-nginx-ecs-5lm7r-868c4ccd5d-5mlgh   1/1     Running   0             8m14s   192.XX.XX.4     cn-hangzhou.192.XX.XX.251   <none>           <none>
ud-nginx-ecs-5lm7r-868c4ccd5d-6bdkz   1/1     Running   0             8m14s   192.XX.XX.145   cn-hangzhou.192.XX.XX.32    <none>           <none>
ud-nginx-ecs-5lm7r-868c4ccd5d-dnsfl   1/1     Running   0             8m14s   192.XX.XX.150   cn-hangzhou.192.XX.XX.20    <none>           <none>
ud-nginx-ecs-5lm7r-868c4ccd5d-mrzwc   1/1     Running   0             8m14s   192.XX.XX.128   cn-hangzhou.192.XX.XX.120   <none>           <none>

The output indicates that the number of replicas is scaled in from 6 to 4. The pods that run on ECIs are deleted first.
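
To see when the HPA in test.yaml would request such a scale-in, the following sketch applies the basic HPA rule, desired = ceil(current replicas × current utilization / target utilization), and clamps the result to minReplicas and maxReplicas. hpa_desired is a hypothetical helper, the utilization figures are made-up inputs rather than measurements from this example, and real HPA behavior such as the tolerance band and the scale-down stabilization window is ignored.

```shell
#!/bin/sh
# Hypothetical helper: basic HPA replica calculation with clamping.
# Arguments: current replicas, current CPU utilization (%), target utilization (%),
#            minReplicas, maxReplicas.
hpa_desired() {
  hd_cur=$1; hd_util=$2; hd_target=$3; hd_min=$4; hd_max=$5
  # Integer ceiling of cur * util / target.
  hd_want=$(( (hd_cur * hd_util + hd_target - 1) / hd_target ))
  if [ "$hd_want" -lt "$hd_min" ]; then hd_want=$hd_min; fi
  if [ "$hd_want" -gt "$hd_max" ]; then hd_want=$hd_max; fi
  echo "$hd_want"
}

# Assumed load: 6 replicas at 30% average CPU utilization, target 50%.
hpa_desired 6 30 50 4 10    # prints 4: ceil(3.6) = 4, so the two eci pods are removed
# Assumed load: 4 replicas at 90% average CPU utilization, target 50%.
hpa_desired 4 90 50 4 10    # prints 8: ceil(7.2) = 8, so four pods overflow to eci
```

Because utilization is measured against the CPU request, the cpu: "500m" request in the Deployment template is what the 50% target refers to.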

References

For more information about how to add nodes in multiple zones during scale-out activities, see Implement rapid scaling across multiple zones.
To enable node auto scaling when the resource capacity of the cluster cannot meet the requirements for pod scheduling, we recommend that you refer to the node scaling solutions provided by ACK. For more information, see Node scaling.