When you scale in or perform a rolling restart on a Service Mesh (ASM) gateway, gateway pods are deleted. This can cause a small amount of traffic loss. You can enable the graceful shutdown feature to prevent this loss. When graceful shutdown is enabled, existing connections continue to transfer data for a period of time while the gateway pods are being deleted. This topic describes how to use the graceful shutdown feature.
Scope
An ASM instance of Enterprise Edition or Ultimate Edition has been created. For more information, see Create an ASM instance.
Step 1: Enable graceful shutdown
Enable the feature for an existing gateway
In ASM 1.26 and later, changing the graceful shutdown configuration causes the gateway to restart. Perform this operation during off-peak hours.
Console
Log on to the ASM console and go to the Mesh Management page.
On the Mesh Management page, click the name of the ASM instance, and then go to the Ingress Gateway page from the left-side navigation pane.
On the Ingress Gateway page, click the name of the target gateway.
On the Gateway overview page, click Advanced Options, click the icon next to Graceful Shutdown, select the Graceful Shutdown checkbox, set Connection Timeout (Seconds), and then click Submit.
YAML configuration (for ASM versions below 1.26)
Add the required annotations to the gateway YAML file under the serviceAnnotations field.
apiVersion: istio.alibabacloud.com/v1
kind: IstioGateway
metadata:
  name: ingressgateway
  namespace: istio-system
spec:
  gatewayType: ingress
  serviceAnnotations:
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain: 'on' # Enable connection draining for the load balancer, which is graceful shutdown.
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain-timeout: '10' # The connection draining timeout period. Valid values: 10 to 30.
  ...
YAML configuration (for ASM versions 1.26 and later)
Add the required annotation to the gateway YAML file under the annotations field.
apiVersion: istio.alibabacloud.com/v1
kind: IstioGateway
metadata:
  annotations:
    # For Classic Load Balancer (CLB) and Network Load Balancer (NLB) gateways, the valid values are 10 to 890.
    # For ClusterIP and NodePort gateways, there is no upper limit.
    asm.alibabacloud.com/gateway-drain-timeout-seconds: "30"
  name: ingressgateway
  namespace: istio-system
...
Enable the feature when you create a gateway
Log on to the ASM console and go to the Mesh Management page.
On the Mesh Management page, click the name of the ASM instance, and then go to the Ingress Gateway page from the left-side navigation pane.
On the Ingress Gateway page, click Create.
On the Create page, select a Deployment Cluster, set CLB Type to Public Network Access, select a load balancer specification under New CLB Instance, and set Number Of Gateway Replicas to 10. Keep the default values for the other configuration items.
For more information about the configuration items, see Create an ingress gateway.
Click Advanced Options, select the Graceful Shutdown checkbox, set Connection Timeout (Seconds), and then click Create.
| Configuration item | Description |
| --- | --- |
| Graceful Shutdown | If you select this option, the Classic Load Balancer (CLB) instance smoothly drains existing connections during a rolling restart of gateway pods. This minimizes the impact on your services and better supports scenarios such as configuration changes and gateway upgrades. |
| Connection Timeout (Seconds) | After the CLB instance removes a gateway pod, it waits for the configured connection timeout period before it disconnects from the pod. This gives the gateway pod a buffer in which to finish processing existing connections. The default graceful shutdown time for a gateway pod is 30 seconds, so the timeout period that you configure for the CLB instance should not exceed 30 seconds. Starting from ASM 1.26, you can set the timeout period to a maximum of 890 seconds. |
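The timeout ranges stated for ASM 1.26 and later can be checked before a change is submitted. The following is a minimal sketch, not part of ASM: valid_drain_timeout is a hypothetical helper, the gateway type names it accepts are illustrative, and the lower bound of 1 for ClusterIP and NodePort gateways is an assumption, because the text above states only that there is no upper limit.

```shell
# valid_drain_timeout TYPE SECONDS
# Hypothetical helper: succeeds when SECONDS is acceptable for the gateway TYPE.
valid_drain_timeout() {
  # Reject empty values and anything that is not a non-negative integer.
  case "$2" in
    ''|*[!0-9]*) return 1 ;;
  esac
  case "$1" in
    # CLB and NLB gateways: 10 to 890 seconds (ASM 1.26 and later).
    clb|nlb) [ "$2" -ge 10 ] && [ "$2" -le 890 ] ;;
    # ClusterIP and NodePort gateways: no upper limit.
    # Assumption: any positive value is accepted.
    clusterip|nodeport) [ "$2" -ge 1 ] ;;
    # Unknown gateway type.
    *) return 1 ;;
  esac
}
```

For example, `valid_drain_timeout clb 30` succeeds, while `valid_drain_timeout nlb 891` fails because the value exceeds the 890-second limit.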
Step 2: Deploy a sample application
Connect to the ACK cluster by using kubectl. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Create an httpbin.yaml file with the following content.
Deploy the httpbin application.
kubectl apply -f httpbin.yaml -n default
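The content of httpbin.yaml is not reproduced in this step. As a stand-in, the following sketch writes a minimal manifest modeled on the upstream Istio httpbin sample. The image name, labels, and replica count are assumptions and may differ from the original file. The Service port 8000 is chosen to match the port that the virtual service in Step 3 routes to.

```shell
# Write a minimal httpbin manifest to the current directory.
# Assumption: modeled on the upstream Istio httpbin sample, not the original file.
cat > httpbin.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
spec:
  ports:
  - name: http
    port: 8000        # port referenced by the virtual service
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
  template:
    metadata:
      labels:
        app: httpbin
    spec:
      containers:
      - name: httpbin
        image: docker.io/kennethreitz/httpbin   # assumed image; substitute your own
        ports:
        - containerPort: 80
EOF
```

The script only writes the file. Deploy it with the kubectl apply command shown in this step.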
Step 3: Create a virtual service and a gateway rule
Create a virtual service.
Log on to the ASM console and go to the Mesh Management page.
On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, go to the virtual service page, and then click Create from YAML.
On the Create page, select a Namespace and a Scenario Template, enter the following YAML configuration, and then click Create.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: httpbin
  namespace: default
spec:
  gateways:
  - httpbin-gateway
  hosts:
  - '*'
  http:
  - route:
    - destination:
        host: httpbin
        port:
          number: 8000
Create a gateway rule.
On the details page of the ASM instance, go to the gateway rule page from the left-side navigation pane, and then click Create from YAML.
On the Create page, select a Namespace and a Scenario Template, enter the following YAML configuration, and then click Create.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: httpbin-gateway
  namespace: default
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP
Verify that the routing is configured successfully.
Obtain the ASM gateway address. For more information, see Create an ingress gateway.
In the address bar of your browser, enter http://<ASM gateway address>.
If the httpbin page appears, the routing is configured successfully.
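The same check can be made from a terminal instead of a browser. The following is a minimal sketch, assuming the backend is httpbin, which answers /status/200 with an HTTP 200 response. check_route is a hypothetical helper name, and the 5-second timeout is an arbitrary choice.

```shell
# check_route ADDRESS
# Hypothetical helper: succeeds when the gateway returns HTTP 200 for /status/200.
check_route() {
  # Print only the HTTP status code and discard the response body.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://$1/status/200")
  [ "$code" = "200" ]
}

# Usage (replace the placeholder with your gateway address):
#   check_route "<ASM gateway address>" && echo "routing OK"
```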

Step 4: Verify the effect of graceful shutdown
Download and install a version of the lightweight stress testing tool hey that is compatible with your operating system. For more information, see hey.
Scale in the ASM gateway.
Log on to the ASM console and go to the Mesh Management page.
On the Mesh Management page, click the name of the ASM instance, and then go to the Ingress Gateway page from the left-side navigation pane.
On the Ingress Gateway page, click Edit YAML to the right of the target gateway.
In the Edit dialog box, set the value of the replicaCount parameter to 1, and then click OK.
While the gateway is scaling in, run the following command to send 50,000 requests to the httpbin application with a concurrency of 200. Run the test once with graceful shutdown disabled and once with it enabled, and compare the traffic loss.
hey -c 200 -n 50000 -disable-keepalive http://<ASM gateway address>/
Result analysis
Graceful shutdown disabled
The following output is returned:
Status code distribution:
  [200] 49747 responses

Error distribution:
  [253] Get "http://47.55.2xx.xx": dial tcp 47.55.2xx.xx:80: connect: connection refused

Of the 50,000 requests, only 49,747 return a status code of 200. The remaining 253 requests fail with connection refused errors, so a small amount of traffic is lost.
Graceful shutdown enabled
The following output is returned:
............
Status code distribution:
  [200] 50000 responses

All 50,000 requests return a status code of 200, so no traffic is lost.
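To compare the two runs numerically, the status code line of the hey report can be reduced to a loss figure. The following is a minimal sketch that assumes the report format shown in this section. summarize_hey is a hypothetical helper, not part of hey, and it takes the total request count (the value passed to -n) as its argument.

```shell
# summarize_hey TOTAL
# Hypothetical helper: reads a hey report on stdin and prints successful
# requests, lost requests, and the loss percentage.
summarize_hey() {
  awk -v total="$1" '
    # Find the "[200]" token and read the count that follows it.
    { for (i = 1; i < NF; i++) if ($i == "[200]") ok = $(i + 1) }
    END {
      if (total <= 0) { print "total must be positive" > "/dev/stderr"; exit 1 }
      lost = total - ok
      printf "ok=%d lost=%d loss_pct=%.3f\n", ok, lost, lost * 100 / total
    }'
}

# Usage (replace the placeholder with your gateway address):
#   hey -c 200 -n 50000 -disable-keepalive "http://<ASM gateway address>/" | summarize_hey 50000
```

For the report with graceful shutdown disabled (49,747 successful responses out of 50,000), the helper prints ok=49747 lost=253 loss_pct=0.506.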