When you expose pods through a LoadBalancer Service, rolling updates can cause brief traffic interruptions. Pod containers become ready faster than cloud-controller-manager (CCM) can register them with the Classic Load Balancer (CLB) or Network Load Balancer (NLB) backend server group. Configuring a readiness gate prevents this by holding each pod out of the rolling update cycle until it is fully registered and serving traffic.
Prerequisites
Before you begin, ensure that you have:
An ACK managed cluster or ACK Serverless cluster with the following configuration:
Network plug-in set to Terway (ACK managed clusters only)
Kubernetes version 1.24 or later. For upgrade instructions, see Upgrade clusters.
CCM version 2.10.0 or later. For version details, see CCM.
kubectl connected to the cluster. For setup instructions, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
For cluster creation, see Create an ACK managed cluster or Create an ACK Serverless cluster.
How it works
Without a readiness gate, this sequence causes traffic interruptions during rolling updates:
A rolling update starts. Kubernetes terminates an old pod and starts a new one.
The new pod's containers pass their health checks and reach Running status in seconds.
Kubernetes marks the pod as Ready and routes traffic to it.
CCM has not yet registered the pod with the CLB or NLB backend server group, because registration typically takes longer than container startup.
Requests routed to the pod fail until registration completes.
When you add a readiness gate with conditionType: service.readiness.alibabacloud.com/<Service Name>, Kubernetes adds a custom condition to each pod. CCM sets this condition to True only after the pod is registered and healthy in the backend server group. Until that condition is met, the pod's READINESS GATES status shows 0/1 and Kubernetes does not route traffic to it or advance the rolling update.
If a pod serves as a backend for multiple LoadBalancer Services, configure one readiness gate per Service.
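As a sketch of the multi-Service case, the following writes a Deployment manifest whose pod template lists one readiness gate per Service. The Service names my-svc-clb and my-svc-nlb, the Deployment name my-nginx-two-gates, and the file name are hypothetical and exist only for illustration. The conditionType values follow the service.readiness.alibabacloud.com/<Service Name> pattern described above.

```shell
# Write a Deployment manifest that declares two readiness gates.
# my-svc-clb and my-svc-nlb are hypothetical Service names.
cat > my-nginx-two-gates.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx-two-gates
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      readinessGates:
      - conditionType: service.readiness.alibabacloud.com/my-svc-clb # Gate for the first Service.
      - conditionType: service.readiness.alibabacloud.com/my-svc-nlb # Gate for the second Service.
      containers:
      - name: nginx
        image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
        ports:
        - containerPort: 80
EOF
```

With a manifest like this, a pod would be expected to become Ready only after both conditions are set to True.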
Step 1: Create a CLB or NLB instance
Create a file named my-svc.yaml using one of the following templates.

CLB

    apiVersion: v1
    kind: Service
    metadata:
      name: my-svc
    spec:
      ports:
      - port: 80
        targetPort: 80
        protocol: TCP
      selector:
        app: nginx
      type: LoadBalancer

NLB

    apiVersion: v1
    kind: Service
    metadata:
      name: my-svc
      annotations:
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-zone-maps: "${zone-A}:${vsw-A},${zone-B}:${vsw-B}" # Example: cn-hangzhou-k:vsw-i123456,cn-hangzhou-j:vsw-j654321.
    spec:
      loadBalancerClass: alibabacloud.com/nlb # Set to NLB.
      ports:
      - port: 80
        targetPort: 80
        protocol: TCP
      selector:
        app: nginx
      type: LoadBalancer

Apply the Service manifest:

    kubectl apply -f my-svc.yaml

Wait until the Service has an external IP address:

    kubectl get service my-svc

The CLB or NLB instance is ready when an IP address appears in the EXTERNAL-IP column:

    NAME     TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
    my-svc   LoadBalancer   192.XX.XX.215   <IP address>   80:30493/TCP   8s
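Instead of re-running the command by hand, the wait can be scripted. The following is a minimal sketch, not part of the original procedure: the function name wait_for_lb_address, the retry count, and the POLL_INTERVAL variable are assumptions. It reads both the ip and hostname fields of the Service status, because a load balancer may publish either one.

```shell
# Poll a Service until its load balancer address is published.
# Usage: wait_for_lb_address [service-name] [max-attempts]
wait_for_lb_address() {
  svc="${1:-my-svc}"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # Empty output means no address has been assigned yet.
    addr="$(kubectl get service "$svc" \
      -o jsonpath='{.status.loadBalancer.ingress[*].ip}{.status.loadBalancer.ingress[*].hostname}')"
    if [ -n "$addr" ]; then
      echo "$addr"
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "timed out waiting for an address on service $svc" >&2
  return 1
}
```

Calling wait_for_lb_address my-svc prints the address once it is available and returns a non-zero status if the attempts run out.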
Step 2: Create a test Deployment
Create a file named my-nginx.yaml. Set conditionType to service.readiness.alibabacloud.com/my-svc to tie the pod's readiness gate to the my-svc Service.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-nginx # The name of the Deployment.
      labels:
        app: nginx
    spec:
      replicas: 2 # The number of replicated pods.
      selector:
        matchLabels:
          app: nginx # Must match the selector in the Service.
      template:
        metadata:
          labels:
            app: nginx
        spec:
          readinessGates:
          - conditionType: service.readiness.alibabacloud.com/my-svc # Readiness gate for the my-svc Service.
          containers:
          - name: nginx
            image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
            ports:
            - containerPort: 80

Apply the Deployment manifest:

    kubectl apply -f my-nginx.yaml

Check the pod status and readiness gate state:

    kubectl get pod -owide -l app=nginx

Initially, READINESS GATES shows 0/1, meaning CCM has not yet registered the pods:

    NAME                       READY   STATUS    RESTARTS   AGE   IP               NODE                         NOMINATED NODE   READINESS GATES
    my-nginx-d9f95dcf9-8dhwj   1/1     Running   0          14s   172.XX.XXX.188   cn-hangzhou.172.XX.XXX.174   <none>           0/1
    my-nginx-d9f95dcf9-z9hjm   1/1     Running   0          14s   172.XX.XXX.182   cn-hangzhou.172.XX.XXX.174   <none>           0/1

Run the command again after a short wait. When READINESS GATES changes to 1/1, the pods are registered with the CLB or NLB backend server group and are ready to serve traffic.
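The repeated check can also be replaced with a blocking wait. Kubernetes sets a pod's Ready condition to True only when its containers are ready and every readiness gate condition is True, so waiting on Ready also covers the gate. The sketch below assumes the app=nginx label from this example. The function name wait_for_gated_pods and the 180s default timeout are arbitrary choices.

```shell
# Block until every pod matching the selector reports condition Ready=True.
# Usage: wait_for_gated_pods [label-selector] [timeout]
wait_for_gated_pods() {
  selector="${1:-app=nginx}"
  timeout="${2:-180s}"
  kubectl wait pod -l "$selector" --for=condition=Ready --timeout="$timeout"
}
```

The command exits with a non-zero status if the pods are not Ready before the timeout, which makes it usable as a gate in a deployment script.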
Step 3: Perform a rolling update
Trigger a rolling update:

    kubectl rollout restart deployment my-nginx

Expected output:

    deployment.apps/my-nginx restarted

Monitor the pod status as the rolling update progresses:

    kubectl get pod -owide -l app=nginx

During the update, old and new pods coexist. New pods wait at READINESS GATES: 0/1 until CCM registers them:

    NAME                       READY   STATUS    RESTARTS   AGE    IP               NODE                         NOMINATED NODE   READINESS GATES
    my-nginx-d9f95dcf9-8dhwj   1/1     Running   0          113s   172.XX.XXX.188   cn-hangzhou.172.XX.XXX.174   <none>           1/1
    my-nginx-df5c9cf7d-6p5jc   1/1     Running   0          6s     172.XX.XXX.182   cn-hangzhou.172.XX.XXX.174   <none>           0/1
    my-nginx-df5c9cf7d-7dh2v   1/1     Running   0          15s    172.XX.XXX.189   cn-hangzhou.172.XX.XXX.174   <none>           1/1

The rolling update advances to the next pod only after the current new pod reaches READINESS GATES: 1/1, ensuring uninterrupted traffic throughout the update.
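To follow the update without polling the pod list, kubectl rollout status blocks until the rollout finishes. The wrapper below is a small sketch. The function name watch_rollout and the 5m default timeout are assumptions; the Deployment name defaults to my-nginx from this example.

```shell
# Block until the Deployment's rollout completes or the timeout expires.
# Usage: watch_rollout [deployment-name] [timeout]
watch_rollout() {
  deployment="${1:-my-nginx}"
  timeout="${2:-5m}"
  kubectl rollout status "deployment/${deployment}" --timeout="$timeout"
}
```

Because each new pod must pass its readiness gate before the rollout advances, a rollout that times out here is a hint that registration with the backend server group may be stalled.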
Troubleshooting
Readiness gate stays at 0/1 for an extended period
If a pod's READINESS GATES does not change to 1/1 after several minutes, inspect the pod's status conditions to find the reason:

    kubectl get pod <pod-name> -o yaml | grep -A8 'service.readiness.alibabacloud.com'

The output shows the condition set by CCM, including a reason field that indicates why registration has not completed. Common causes include CCM version being below 2.10.0, Terway not being the active network plug-in, or the backend server group not yet provisioned.
Check the CCM logs for more details:

    kubectl logs -n kube-system -l app=cloud-controller-manager --tail=100
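As an alternative to grep, a JSONPath filter can select only the readiness gate condition from the pod status. This is a hedged sketch: the function name show_gate_condition is hypothetical, and the Service name defaults to my-svc from this example. It prints the whole condition object, so any status, reason, and message fields that CCM has set appear together.

```shell
# Print the readiness gate condition that CCM maintains for one pod.
# Usage: show_gate_condition <pod-name> [service-name]
show_gate_condition() {
  pod="${1:?pod name required}"
  svc="${2:-my-svc}"
  cond="service.readiness.alibabacloud.com/${svc}"
  # The filter keeps only the condition whose type matches the gate.
  kubectl get pod "$pod" -o \
    'jsonpath={.status.conditions[?(@.type=="'"$cond"'")]}'
}
```

Empty output suggests that the condition has not been written to the pod yet, which is worth cross-checking against the CCM logs.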