
Container Service for Kubernetes:Dynamic distribution and descheduling

Last Updated: Mar 25, 2026

Cluster resource availability changes over time. Because the scheduler's placement decisions are based on a point-in-time snapshot of cluster resources, replicas that were successfully scheduled can later become unschedulable when nodes fail or resources are exhausted. By default, ACK One Fleet handles this automatically: it distributes Deployment, StatefulSet, and Job replicas across associated clusters using PropagationPolicy, checks for unschedulable replicas every 2 minutes, and triggers descheduling if any replica remains unschedulable for more than 30 seconds.

Prerequisites

Before you begin, ensure that you have:

  1. Created an ACK One Fleet instance and associated at least two clusters with it. The examples in this topic use the placeholders ${cluster1-id} and ${cluster2-id} for the IDs of the associated clusters.
  2. Obtained the kubeconfig of the Fleet instance and installed the AMC command-line tool, which provides the kubectl amc commands used in this topic.

Step 1: Create an application in the Fleet

  1. Create a file named web-demo.yaml with the following content:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web-demo
    spec:
      replicas: 5
      selector:
        matchLabels:
          app: web-demo
      template:
        metadata:
          labels:
            app: web-demo
        spec:
          containers:
          - name: nginx
            image: registry-cn-hangzhou.ack.aliyuncs.com/acs/web-demo:0.5.0
            ports:
            - containerPort: 80
  2. Deploy the application:

    kubectl apply -f web-demo.yaml

Step 2: Create a distribution policy

  1. Create a dynamic weight-based distribution policy. Setting dynamicWeight: AvailableReplicas tells the Fleet to automatically adjust replica allocation ratios based on available resources across all nodes in each associated cluster.

    apiVersion: policy.one.alibabacloud.com/v1alpha1
    kind: PropagationPolicy
    metadata:
      name: web-demo
    spec:
      resourceSelectors:
      - apiVersion: apps/v1
        kind: Deployment
        name: web-demo
      placement:
        clusterAffinity:
          clusterNames:
          - ${cluster1-id} # Your cluster ID.
          - ${cluster2-id}
        replicaScheduling:
          replicaSchedulingType: Divided
          replicaDivisionPreference: Weighted
          weightPreference:
            dynamicWeight: AvailableReplicas
  2. Check the application distribution status:

    kubectl amc get deploy web-demo -M

    The expected output is similar to the following (results vary based on available resources in each associated cluster):

    NAME       CLUSTER      READY   UP-TO-DATE   AVAILABLE   AGE   ADOPTION
    web-demo   cxxxxxxxx1   2/2     2            2           11s   Y
    web-demo   cxxxxxxxx2   3/3     3            3           11s   Y
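
    Dynamic weighting is recommended because it adapts to actual cluster capacity. If you instead need fixed allocation ratios, the same API also supports a static weight list. The following sketch splits replicas between the two clusters at a fixed 1:2 ratio; it reuses the ${cluster1-id} and ${cluster2-id} placeholders from above, and the staticWeightList field follows the Karmada-style weightPreference schema, so verify the field names against your Fleet version before use:

    apiVersion: policy.one.alibabacloud.com/v1alpha1
    kind: PropagationPolicy
    metadata:
      name: web-demo-static
    spec:
      resourceSelectors:
      - apiVersion: apps/v1
        kind: Deployment
        name: web-demo
      placement:
        clusterAffinity:
          clusterNames:
          - ${cluster1-id}
          - ${cluster2-id}
        replicaScheduling:
          replicaSchedulingType: Divided
          replicaDivisionPreference: Weighted
          weightPreference:
            staticWeightList:    # Fixed 1:2 split instead of dynamicWeight.
            - targetCluster:
                clusterNames:
                - ${cluster1-id}
              weight: 1
            - targetCluster:
                clusterNames:
                - ${cluster2-id}
              weight: 2

    With a static weight list, replica counts do not rebalance when cluster capacity changes, so descheduling of unschedulable replicas becomes the only corrective mechanism.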

Step 3: Verify descheduling

Simulate a scenario where insufficient resources cause replicas to become unschedulable by tainting all nodes in one cluster and then restarting the workload.

  1. Taint all nodes in Cluster1 with NoSchedule:

    kubectl --kubeconfig=<cluster1.config> taint nodes foo=bar:NoSchedule --all=true
  2. Restart the workload. Because all nodes in Cluster1 are tainted, the restarted pods cannot be scheduled and enter a Pending state.

    kubectl --kubeconfig=<cluster1.config> rollout restart deploy web-demo
  3. Confirm that the pods in Cluster1 are in a Pending state:

    kubectl --kubeconfig=<cluster1.config> get pods

    The pods appear as Pending. This is expected; wait for the descheduler to detect and reschedule them. The Fleet checks for unschedulable replicas every 2 minutes and triggers rescheduling after they remain unschedulable for more than 30 seconds.

  4. After about 3 minutes, check the scheduling results:

    kubectl amc get deploy web-demo -M

    The expected output is similar to the following (the AGE value varies):

    NAME       CLUSTER      READY   UP-TO-DATE   AVAILABLE   AGE   ADOPTION
    web-demo   cxxxxxxxx2   5/5     5            5           11s   Y

    All replicas from Cluster1 have been rescheduled to Cluster2.
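
After you finish the verification, remove the test taint so that Cluster1 can accept workloads again. As above, replace <cluster1.config> with the path to the kubeconfig of Cluster1; the trailing hyphen after the taint effect removes the taint:

    kubectl --kubeconfig=<cluster1.config> taint nodes foo=bar:NoSchedule- --all=true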