When you scale in or perform a rolling restart of an ingress gateway, the number of gateway pods decreases and a small amount of traffic is lost. To prevent this, you can enable graceful shutdown for the Server Load Balancer (SLB) instance of the ingress gateway. After a gateway pod is removed, the SLB instance keeps its existing connections to that pod for a specified period of time, so in-flight requests can complete and no traffic is lost. This topic describes how to enable graceful shutdown for the SLB instance of an ingress gateway.

Prerequisites

Step 1: Enable graceful shutdown for an SLB instance

When you create an ingress gateway, you can enable graceful shutdown for the SLB instance of the ingress gateway. You can also enable graceful shutdown for the SLB instance of an existing ingress gateway.

Enable graceful shutdown for the SLB instance when you create an ingress gateway

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.
  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Gateways > Ingress Gateway.
  3. On the Ingress Gateway page, click Create.
  4. On the Create page, select a cluster in which you want to deploy the ingress gateway, set the SLB Instance Type parameter to Internet Access, select an SLB instance type below Create SLB Instance, and then set the Gateway instances parameter to 10. Retain the default configurations for other parameters.
    For more information about the parameters, see Create an ingress gateway service.
  5. Click Advanced Options, select SLB graceful offline, specify a connection timeout for the SLB instance, and then click Create.
    The parameters are described as follows:
    • SLB graceful offline: After you select SLB graceful offline, the ingress gateway service is not affected if the SLB instance becomes unavailable.
    • Connection timeout (seconds): After a pod of the ingress gateway service is removed from the SLB instance, the SLB instance keeps its connections to that pod until the specified period ends. During this period, the pod can continue to handle existing connections. The default offline grace period is 30 seconds. We recommend that you set a connection timeout that does not exceed 30 seconds.

Enable graceful shutdown for the SLB instance of an existing ingress gateway

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.
  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Gateways > Ingress Gateway.
  3. On the Ingress Gateway page, click the name of the ingress gateway.
  4. In the Advanced Options section of the gateway details page, click the Edit icon next to SLB graceful offline, select SLB graceful offline, specify a connection timeout for the SLB instance, and then click OK.
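
To confirm the setting from the command line, one option is to inspect the annotations on the LoadBalancer Service of the gateway in the ACK cluster. The following is a minimal sketch, not a documented ASM procedure. It assumes that the gateway Service is named istio-ingressgateway in the istio-system namespace, and that the console option is applied through Service annotations whose names contain connection-drain. Neither assumption is confirmed by this topic. The function name show_drain_annotations is hypothetical.

```shell
# Print Service annotation lines that mention connection draining.
# $1: Service name (assumed default: istio-ingressgateway)
# $2: namespace    (assumed default: istio-system)
# Returns non-zero if no matching annotation is found.
show_drain_annotations() {
  kubectl get service "${1:-istio-ingressgateway}" -n "${2:-istio-system}" -o yaml |
    grep 'connection-drain'
}
```

If the function prints nothing, either the assumptions above do not hold for your gateway or graceful shutdown is not enabled. The console remains the authoritative place to check.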

Step 2: Deploy a sample application

  1. Connect to the ACK cluster by using kubectl. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
  2. Create an httpbin.yaml file that contains the following content:
    
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: httpbin
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: httpbin
      labels:
        app: httpbin
        service: httpbin
    spec:
      ports:
      - name: http
        port: 8000
        targetPort: 80
      selector:
        app: httpbin
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: httpbin
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: httpbin
          version: v1
      template:
        metadata:
          labels:
            app: httpbin
            version: v1
        spec:
          serviceAccountName: httpbin
          containers:
          - image: docker.io/kennethreitz/httpbin
            imagePullPolicy: IfNotPresent
            name: httpbin
            ports:
            - containerPort: 80
  3. Run the following command to deploy the httpbin application:
    kubectl apply -f httpbin.yaml -n default
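
Before you continue, it can help to confirm that the httpbin pod is running. The following sketch wraps two standard kubectl commands in a helper function. The function name wait_for_httpbin and the 120-second default timeout are illustrative assumptions. The Deployment name, namespace, and label come from httpbin.yaml.

```shell
# Block until the httpbin Deployment is fully rolled out, then list its pods.
# Assumes kubectl is already connected to the ACK cluster (step 1).
# $1: timeout passed to kubectl (hypothetical default: 120s)
wait_for_httpbin() {
  kubectl rollout status deployment/httpbin -n default --timeout="${1:-120s}" &&
    kubectl get pods -n default -l app=httpbin
}
```

Call wait_for_httpbin without arguments to use the default timeout. The function returns a non-zero exit code if the rollout does not finish in time.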

Step 3: Create a virtual service and an Istio gateway

  1. Create a virtual service.
    1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.
    2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose Traffic Management Center > VirtualService. On the page that appears, click Create from YAML.
    3. On the Create page, select a namespace and a template, specify the following YAML code in the code editor, and then click Create.
      apiVersion: networking.istio.io/v1beta1
      kind: VirtualService
      metadata:
        name: httpbin
        namespace: default
      spec:
        gateways:
          - httpbin-gateway
        hosts:
          - '*'
        http:
          - route:
              - destination:
                  host: httpbin
                  port:
                    number: 8000
  2. Create an Istio gateway.
    1. On the details page of the ASM instance, choose ASM Gateways > Gateway in the left-side navigation pane. On the page that appears, click Create from YAML.
    2. On the Create page, select a namespace and a template, specify the following YAML code in the code editor, and then click Create.
      apiVersion: networking.istio.io/v1beta1
      kind: Gateway
      metadata:
        name: httpbin-gateway
        namespace: default
      spec:
        selector:
          istio: ingressgateway
        servers:
          - hosts:
              - '*'
            port:
              name: http
              number: 80
              protocol: HTTP
  3. Verify that the route configuration is successful.
    1. Obtain the IP address of the ingress gateway. For more information, see Create an ingress gateway service.
    2. In the address bar of your browser, enter http://<IP address of the ingress gateway>.
      If the httpbin page appears, the route configuration is successful.
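
The same check can be run from a terminal instead of a browser. The following sketch assumes that curl is installed. The function name check_route and the 5-second timeout are illustrative choices, not part of the ASM procedure.

```shell
# Succeed only if the gateway returns HTTP 200 for the httpbin root page.
# $1: IP address of the ingress gateway
check_route() {
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://$1/")"
  [ "$code" = "200" ]
}
```

Pass the IP address of the ingress gateway as the only argument. A zero exit code means that the request reached httpbin through the Istio gateway and the virtual service.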

Step 4: Verify that the graceful shutdown feature is functional

  1. Download and install hey, a lightweight stress testing tool, for your operating system. For more information, see hey on GitHub.
    The following hey binaries are available for different operating systems:
    • Linux 64-bit: hey-release.s3.us-east-2.amazonaws.com/hey_linux_amd64
    • macOS 64-bit: hey-release.s3.us-east-2.amazonaws.com/hey_darwin_amd64
    • Windows 64-bit: hey-release.s3.us-east-2.amazonaws.com/hey_windows_amd64
  2. Scale in the ingress gateway.
    1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.
    2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Gateways > Ingress Gateway.
    3. On the Ingress Gateway page, find the ingress gateway that you want to scale in and click YAML.
    4. In the Edit panel, set the replicaCount parameter to 1 and click OK.
  3. Check whether traffic loss occurs when graceful shutdown for the SLB instance is enabled.

    While the scale-in is in progress, run the following command to send requests to the httpbin application. The command sends a total of 50,000 requests with a concurrency of 200.

    hey -c 200 -n 50000 -disable-keepalive http://<IP address of the ingress gateway>/
    The result depends on whether the feature is enabled:

    • Graceful shutdown for the SLB instance is not enabled. Output:

      Status code distribution:
        [200] 49747 responses

      Error distribution:
        [253] Get "http://47.55.2xx.xx": dial tcp 47.55.2xx.xx:80: connect: connection refused

      The status code 200 is returned for only 49,747 of the 50,000 requests. The remaining 253 requests fail with a connection refused error, which indicates that a small amount of traffic is lost.

    • Graceful shutdown for the SLB instance is enabled. Output:

      ............
      Status code distribution:
        [200] 50000 responses

      The status code 200 is returned for all 50,000 requests, which indicates that no traffic is lost.
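
To compare runs without reading the full report, the two counts can be extracted from the hey output. The following is a minimal sketch that assumes the output layout shown above: a "Status code distribution:" section with a [200] line, and an optional "Error distribution:" section whose bracketed numbers are error counts. The function name summarize_hey is hypothetical.

```shell
# Read hey output on stdin and print "<ok> <failed>".
# ok     = count on the "[200] N responses" line
# failed = sum of the bracketed counts under "Error distribution:"
summarize_hey() {
  awk '
    /Status code distribution:/ { section = "status"; next }
    /Error distribution:/       { section = "error";  next }
    section == "status" && $1 == "[200]" { ok += $2 }
    section == "error" && $1 ~ /^\[[0-9]+\]$/ {
      n = $1
      gsub(/[^0-9]/, "", n)   # "[253]" -> "253"
      failed += n
    }
    END { printf "%d %d\n", ok, failed }
  '
}

# Usage (replace the placeholder with the IP address of the ingress gateway):
#   hey -c 200 -n 50000 -disable-keepalive "http://<IP address of the ingress gateway>/" | summarize_hey
```

For the first sample output above, the function prints 49747 253. For the second, it prints 50000 0.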