
Container Service for Kubernetes: Zero-downtime application deployment

Updated: Dec 11, 2025

To update applications in an Alibaba Cloud Container Service for Kubernetes (ACK) cluster without service interruptions, configure a Deployment with readiness probes, readinessGates, preStop hooks, and Server Load Balancer (SLB) connection draining. This setup enables smooth traffic migration and maintains high availability during upgrades.

How it works

The RollingUpdate strategy orchestrates upgrades for stateless workloads by incrementally replacing Pods while keeping enough replicas available to handle live traffic. The core process involves several stages:

  1. Startup stage: Kubernetes creates a new Pod (v2). The system waits for this Pod to pass its readiness probe, confirming it is ready to handle requests. The Pod is isolated from Service traffic until this probe succeeds.

  2. Traffic switching stage: This stage synchronizes Kubernetes state with the underlying infrastructure. After the new Pod passes its internal probes, its IP is added to the Service's Endpoints. However, the configured readinessGates prevent the Pod from being marked fully Ready until the Cloud Controller Manager has successfully registered it with the SLB backend server group. This guarantees that the load balancer knows about the new instance before it starts routing traffic to it. At the same time, the old Pod is deregistered from the SLB and receives a termination signal so it no longer accepts new requests.

    For more information about how readinessGates works, see How readinessGates works.
  3. Graceful shutdown stage: This stage coordinates the safe termination of the old instance. When the Pod receives a termination signal, the Kubelet invokes the preStop lifecycle hook, giving the application time to finish in‑flight requests within the configured terminationGracePeriodSeconds. In parallel, the SLB performs connection draining: it keeps existing connections while stopping new ones from being routed to the Pod. This coordinated shutdown helps ensure that no requests are dropped before the Pod is removed.
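The traffic switching stage is observable in the Pod's own status: the Cloud Controller Manager patches a condition whose type matches the conditionType configured under readinessGates. The following sketch shows roughly what the Pod status looks like once SLB registration succeeds (the condition type matches the sample Service below; the exact fields the CCM writes may differ).

```yaml
# Illustrative Pod status after the CCM marks the readiness gate condition.
# Field values shown here are examples, not exact CCM output.
status:
  conditions:
  - type: service.readiness.alibabacloud.com/nginx-demo-service
    status: "True"        # set by the Cloud Controller Manager after SLB registration
  - type: Ready
    status: "True"        # becomes True only after all probes and readiness gates pass
```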


Prerequisites

  • Cluster: Version 1.24 or later. For more information about how to upgrade a cluster, see Upgrade a cluster.

  • Component: cloud-controller-manager version v2.10.0 or later. For more information about the component, see Cloud Controller Manager.

Deploy a sample application

Use one of the following methods to deploy a stateless NGINX application.

Console

  1. On the ACK Clusters page, click the name of the target cluster. In the left navigation pane, choose Workloads > Deployments.

  2. On the Deployments page, click Create from YAML. Copy the following code into the editor and click Create.

    Sample application YAML

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-demo
    spec:
      replicas: 1                 # Set to 2 or more for production HA. It is set to 1 for demonstration purposes.
      selector:
        matchLabels:
          app: nginx-demo
      # Rolling update strategy: Ensures service availability during updates.
      # strategy:
        # type: RollingUpdate     # Default strategy for Deployments.
        # rollingUpdate:
          # maxUnavailable: "25%" # Default. Max 25% of Pods can be unavailable during the update.
          # maxSurge: "25%"       # Default. Max 25% extra Pods can be created above the desired replica count.
      template:
        metadata:
          labels:
            app: nginx-demo 
        spec:
          # Pod-level graceful shutdown limit. Must exceed the sum of preStop execution and app cleanup time.
          terminationGracePeriodSeconds: 60 
          readinessGates:
          - conditionType: service.readiness.alibabacloud.com/nginx-demo-service # Configures the Readiness Gate for nginx-demo-service.
          containers:
          - name: nginx
            image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
            ports:
            - containerPort: 80
            resources:
              requests:
                cpu: 500m
                memory: 1Gi
              limits:
                cpu: 500m
            # --- Health check probes ---
            # Startup Probe: Verifies the application within the container has started.
            startupProbe:
              httpGet:
                path: / # Success indicates NGINX root path is accessible.
                port: 80
              # Allow sufficient time for startup. Total timeout = failureThreshold * periodSeconds.
              # Here: 30 * 10 = 300 seconds.
              failureThreshold: 30
              periodSeconds: 10
            # Readiness Probe: Verifies the container is ready to accept traffic.
            readinessProbe:
              httpGet:
                path: /
                port: 80
              initialDelaySeconds: 5  # Probing starts 5 seconds after the container starts.
              periodSeconds: 5        # Probes every 5 seconds.
              timeoutSeconds: 2       # Probe timeout.
              successThreshold: 1     # 1 success marks the Pod as ready.
              failureThreshold: 3     # 3 consecutive failures mark the Pod as not ready.
            # --- Service graceful shutdown configuration ---
            lifecycle:
              preStop:
                exec:
                  # Define a custom hook to process in-flight connections before shutdown.
                  # Relying solely on 'sleep' may not ensure a proper graceful exit.
                  command: ["sh", "-c", "sleep 30 && /usr/sbin/nginx -s quit"]
    ---           
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-demo-service
      annotations:
        # Timeout for SLB connection draining. Should align with the application's preStop logic. Range: 10-900 seconds.
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain-timeout: "30" 
        # Enable SLB connection draining.
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain: "on"
    spec:
      type: LoadBalancer
      selector:
        app: nginx-demo 
      ports:
        - protocol: TCP
          port: 80
  3. In the dialog box that appears, locate the Deployment and click View. Verify that the Pod status is Running.

kubectl

  1. Connect to the cluster using kubectl. For clusters without public access, click Manage Clusters Using Workbench on the Cluster Information page to connect over the internal network.

  2. Create a file named nginx-demo.yaml with the following content.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-demo
    spec:
      replicas: 1                 # For production environments, set this to 2 or more to ensure high availability. It is set to 1 here for easy verification of the rolling deployment.
      selector:
        matchLabels:
          app: nginx-demo
      # Rolling update strategy: Ensures no service interruptions during updates.
      # strategy:
        # type: RollingUpdate     # The default strategy for a deployment workload is RollingUpdate.
        # rollingUpdate:
          # maxUnavailable: "25%" # Default value. A maximum of 25% of the pods can be unavailable during the update process.
          # maxSurge: "25%"       # Default value. The number of pods can exceed the desired number of replicas by a maximum of 25% during the update process.
      template:
        metadata:
          labels:
            app: nginx-demo 
        spec:
          # Pod-level graceful shutdown limit. Must exceed the sum of preStop execution and app cleanup time.
          terminationGracePeriodSeconds: 60 
          readinessGates:
          - conditionType: service.readiness.alibabacloud.com/nginx-demo-service # Configures the Readiness Gate for nginx-demo-service.
          containers:
          - name: nginx
            image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
            ports:
            - containerPort: 80
            resources:
              requests:
                cpu: 500m
                memory: 1Gi
              limits:
                cpu: 500m
            # --- Health check probes ---
            # Startup Probe: Verifies the application within the container has started.
            startupProbe:
              httpGet:
                path: / # Success indicates NGINX root path is accessible.
                port: 80
              # Allow sufficient time for startup. Total timeout = failureThreshold * periodSeconds.
              # Here: 30 * 10 = 300 seconds.
              failureThreshold: 30
              periodSeconds: 10
            # Readiness Probe: Verifies the container is ready to accept traffic.
            readinessProbe:
              httpGet:
                path: /
                port: 80
              initialDelaySeconds: 5  # Probing starts 5 seconds after the container starts.
              periodSeconds: 5        # Probes every 5 seconds.
              timeoutSeconds: 2       # Probe timeout.
              successThreshold: 1     # 1 success marks the Pod as ready.
              failureThreshold: 3     # 3 consecutive failures mark the Pod as not ready.
            # --- Service graceful shutdown configuration ---
            lifecycle:
              preStop:
                exec:
                  # Define a custom hook to process in-flight connections before shutdown.
                  # Relying solely on 'sleep' may not ensure a proper graceful exit.
                  command: ["sh", "-c", "sleep 30 && /usr/sbin/nginx -s quit"]
    ---           
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-demo-service
      annotations:
        # Timeout for SLB connection draining. Should align with the application's preStop logic. Range: 10-900 seconds.
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain-timeout: "30" 
        # Enable SLB connection draining.
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain: "on"
    spec:
      type: LoadBalancer
      selector:
        app: nginx-demo 
      ports:
        - protocol: TCP
          port: 80
  3. Apply the configuration to deploy the application and Service.

    kubectl apply -f nginx-demo.yaml
  4. Verify that the Pod's status is Running.

    kubectl get pod | grep nginx-deployment-demo
Key configurations

  • Pod readiness probes

    • startupProbe: Ideal for slow-starting workloads (e.g., Java applications) to confirm initialization. Liveness and readiness probes are paused until the startup probe succeeds, preventing Kubelet from prematurely restarting the container due to a slow startup.

    • readinessProbe: Determines if the container is ready to handle external traffic. Once successful, the Pod's IP address is added to the Service's Endpoints, enabling traffic flow.

    • readinessGates: Defines additional prerequisites for readiness. The Pod is considered fully ready—and capable of receiving traffic—only when both the standard readinessProbe and the specific conditions in readinessGates are met.

  • Graceful shutdown

    • Application graceful shutdown

      • preStop: A lifecycle hook executed immediately before container termination. Use this hook to trigger a graceful application shutdown, ensuring in-flight requests are finalized before the process stops.

        Define a custom hook tailored to your application logic. Relying solely on a sleep command is unreliable and may result in an incomplete shutdown.
      • terminationGracePeriodSeconds: The duration Kubernetes waits for a Pod to shut down gracefully before forcibly killing it (SIGKILL). Default: 30 seconds. This value must exceed the combined duration of the preStop hook execution and the application's internal cleanup process.

    • SLB connection draining

      • service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain: Annotation to enable connection draining on the SLB instance.

      • service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain-timeout: Sets the SLB connection draining timeout (in seconds). This value should align with the time required to process in-flight requests in the preStop hook.

  • Rolling update strategy

    • strategy: The default Deployment strategy is RollingUpdate. It replaces old Pods with new ones incrementally. New Pods are created and verified before the corresponding old Pods are terminated, ensuring service availability throughout the rollout.

    • maxUnavailable: The maximum number (or percentage) of Pods that can be unavailable during the update. Default: 25%.

    • maxSurge: The maximum number (or percentage) of extra Pods that can be created above the desired replica count during the update. Higher values accelerate the rollout but increase resource consumption. Default: 25%.
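As a worked example of how the default percentages interact, assume replicas: 4. Kubernetes rounds maxUnavailable down and maxSurge up when converting percentages to Pod counts, so both resolve to 1 Pod here (values below are illustrative, not part of the sample application):

```yaml
# Illustrative strategy for a Deployment with replicas: 4 and the defaults.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: "25%"   # floor(4 * 0.25) = 1 Pod may be unavailable
    maxSurge: "25%"         # ceil(4 * 0.25) = 1 extra Pod may be created
# During the rollout the Deployment therefore keeps at least 3 ready Pods
# (4 - 1) and runs at most 5 Pods in total (4 + 1) at any given moment.
```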

Verify the zero-downtime rolling update

  1. Connect to the cluster using kubectl.

  2. Retrieve the external endpoint of the sample application.

    export NGINX_ENDPOINT=$(kubectl get service nginx-demo-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}{":"}{.spec.ports[0].port}')
    echo $NGINX_ENDPOINT
  3. Install the stress testing tool hey. Run a load test with 200 concurrent connections and 50,000 total requests. Based on the sample resource configuration, a single replica typically completes this in about 1 minute.

    hey -c 200 -n 50000 -disable-keepalive http://$NGINX_ENDPOINT

    Simultaneously, open a new terminal window and trigger a rolling restart of the Deployment.

    kubectl rollout restart deployment nginx-deployment-demo
  4. Compare the expected outputs.

    Without zero-downtime configuration

    The following sample YAML omits the probes, readinessGates, preStop hook, and connection draining annotations:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-demo
    spec:
      replicas: 1                 # For production environments, set this to 2 or more to ensure high availability. It is set to 1 here for easy verification of the rolling deployment.
      selector:
        matchLabels:
          app: nginx-demo
      template:
        metadata:
          labels:
            app: nginx-demo 
        spec:
          containers:
          - name: nginx
            image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
            ports:
            - containerPort: 80
            resources:
              requests:
                cpu: 500m
                memory: 1Gi
              limits:
                cpu: 500m
    ---           
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-demo-service
    spec:
      type: LoadBalancer
      selector:
        app: nginx-demo 
      ports:
        - protocol: TCP
          port: 80

    Expected output (traffic loss observed):

    Status code distribution:
      [200]	49644 responses
    
    Error distribution:
      [320]	Get "http://114.215.XXX.XXX": dial tcp 114.215.XXX.XXX:80: connect: connection refused
      [18]	Get "http://114.215.XXX.XXX": dial tcp 114.215.XXX.XXX:80: connect: no route to host
      [18]	Get "http://114.215.XXX.XXX": dial tcp 114.215.XXX.XXX:80: connect: operation timed out

    With zero-downtime configuration

    Expected output (zero downtime achieved):

    Status code distribution:
      [200]	50000 responses

FAQ

Why is my Pod stuck in the Running state but not ready?

Cause: This behavior typically indicates that the startup probe or readiness probe is failing.

Solution:

  • Check probe configuration: Go to the Edit page for the target Workload. Confirm that the health check path (such as /healthz) and port match the application's actual settings. If the application starts slowly, increase the Unhealthy Threshold to prevent premature failure.

    Manual verification: Temporarily disable the probe, access the terminal of the Pod, and use a command like curl to verify that the health check endpoint is responding correctly.
  • Analyze Logs: Check the Pod's Events and Logs. Select the Show the log of the last container exit option to troubleshoot previous crashes.
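For the slow-startup case, the probe can be relaxed directly in the workload YAML. A sketch of such an adjustment (the /healthz path and thresholds are placeholders; match them to the application's actual health endpoint and real startup time):

```yaml
# Illustrative startup probe for a slow-starting application.
startupProbe:
  httpGet:
    path: /healthz        # placeholder: must match the app's actual health check path
    port: 80
  periodSeconds: 10
  failureThreshold: 60    # 60 * 10 s = up to 600 s allowed for startup
```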
