Microservices Engine: Configure empty list protection

Last Updated: Jan 23, 2024

When a registry is about to push a service list to a client and the list is empty because of an internal failure or a network issue, the registry triggers the empty list protection mechanism so that the empty list is not pushed to the client. This improves the fault tolerance and stability of the system.

Background information

When a client subscribes to a registry for a list of server addresses, a server exception may occur during service registration, and the registry may return an empty list. If you enable the empty list protection feature, the client ignores the empty list and obtains the most recent valid list from the cache.
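
The client-side behavior described above can be illustrated with a minimal shell sketch. It is not the implementation used by the MSE agent. The function name apply_pushed_list and the variable cached_list are hypothetical and exist only for illustration. The sketch keeps the most recent non-empty address list and ignores an empty push.

```shell
#!/bin/bash
# Hypothetical illustration of empty list protection on the client side.
# cached_list holds the most recent non-empty address list (space separated).
cached_list=""

apply_pushed_list() {
    pushed="$1"
    if [ -z "$pushed" ]; then
        # Empty push: keep the cached list instead of clearing it.
        echo "protected: keeping ${cached_list:-<none>}"
    else
        # Non-empty push: replace the cached list.
        cached_list="$pushed"
        echo "updated: $cached_list"
    fi
}

apply_pushed_list "10.116.0.142:18084"   # normal push from the registry
apply_pushed_list ""                      # empty push during a registry failure
echo "addresses used for calls: $cached_list"
```

Without the check for an empty value, the second call would clear the list, and subsequent service calls would have no provider address to use.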

If the service registry updates configurations, performs an upgrade or a downgrade, or encounters an exception such as network disconnection or power outage, subscription exceptions may occur. As a result, consumers may obtain an empty server list. This affects the availability of consumers. The empty list protection feature can help protect service calls and improve business reliability.

In this example, the application architecture consists of backend Spring Cloud applications. In the backend call process, Spring Cloud consumers call Spring Cloud providers. For the backend applications, service registration and discovery are implemented by using a Nacos registry.

Note

The agent for the empty list protection feature is in canary release. To use this feature, contact technical support in the DingTalk group (ID: 34754806) of Microservices Engine (MSE). You can try the feature after the upgrade.

Limits

Item                  | Requirement                    | Description
Spring Cloud version  | Spring Cloud Edgware and later | -
Dubbo version         | V2.5.3 to V2.7.8               | Dubbo 3.0 and later versions are in a canary release.
Registry type         | Nacos, Eureka, ZooKeeper       | -

Prerequisites

Deploy demo applications

  1. In the left-side navigation pane of the ACK console, click Clusters. On the Clusters page, click the name of the cluster that you want to manage.

  2. In the left-side navigation pane of the cluster details page, choose Workload > Deployments.

  3. In the upper-left corner of the Deployments page, select a namespace from the Namespace drop-down list and click Create from YAML. Configure the parameters and click Create. In this example, the sc-consumer, sc-consumer-empty, and sc-provider applications are deployed, and an open source Nacos instance is used as the service registry.


    # Enable the empty list protection feature for the sc-consumer application.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sc-consumer
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sc-consumer
      template:
        metadata:
          labels:
            msePilotCreateAppName: sc-consumer
            app: sc-consumer
        spec:
          containers:
          - env:
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            - name: spring.cloud.nacos.discovery.server-addr
              value: nacos-server:8848
            image: registry.cn-hangzhou.aliyuncs.com/mse-demo-hz/demo:sc-consumer-0.1
            imagePullPolicy: Always
            name: sc-consumer
            ports:
            - containerPort: 18091
            livenessProbe:
              tcpSocket:
                port: 18091
              initialDelaySeconds: 10
              periodSeconds: 30
    ---
    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-spec: slb.s1.small
        service.beta.kubernetes.io/alicloud-loadbalancer-address-type: internet
      name: sc-consumer-slb
    spec:
      ports:
        - port: 80
          protocol: TCP
          targetPort: 18091
      selector:
        app: sc-consumer
      type: LoadBalancer
    status:
      loadBalancer: {}
    # Disable the empty list protection feature for the sc-consumer-empty application.
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sc-consumer-empty
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sc-consumer-empty
      template:
        metadata:
          labels:
            msePilotCreateAppName: sc-consumer-empty
            app: sc-consumer-empty
        spec:
          containers:
          - env:
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            - name: spring.cloud.nacos.discovery.server-addr
              value: nacos-server:8848
            image: registry.cn-hangzhou.aliyuncs.com/mse-demo-hz/demo:sc-consumer-0.1
            imagePullPolicy: Always
            name: sc-consumer-empty
            ports:
            - containerPort: 18091
            livenessProbe:
              tcpSocket:
                port: 18091
              initialDelaySeconds: 10
              periodSeconds: 30
    ---
    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-spec: slb.s1.small
        service.beta.kubernetes.io/alicloud-loadbalancer-address-type: internet
      name: sc-consumer-empty-slb
    spec:
      ports:
        - port: 80
          protocol: TCP
          targetPort: 18091
      selector:
        app: sc-consumer-empty
      type: LoadBalancer
    status:
      loadBalancer: {}
    # sc-provider
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sc-provider
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sc-provider
      template:
        metadata:
          labels:
            msePilotCreateAppName: sc-provider
            app: sc-provider
        spec:
          containers:
          - env:
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            - name: spring.cloud.nacos.discovery.server-addr
              value: nacos-server:8848
            image: registry.cn-hangzhou.aliyuncs.com/mse-demo-hz/demo:sc-provider-0.3
            imagePullPolicy: Always
            name: sc-provider
            ports:
            - containerPort: 18084
            livenessProbe:
              tcpSocket:
                port: 18084
              initialDelaySeconds: 10
              periodSeconds: 30
    # Nacos Server
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nacos-server
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nacos-server
      template:
        metadata:
          labels:
            app: nacos-server
        spec:
          containers:
          - env:
            - name: MODE
              value: standalone
            image: nacos/nacos-server:v2.2.0
            imagePullPolicy: Always
            name: nacos-server
          dnsPolicy: ClusterFirst
          restartPolicy: Always
    
    # The configurations for the nacos-server service.
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nacos-server
    spec:
      ports:
      - port: 8848
        protocol: TCP
        targetPort: 8848
      selector:
        app: nacos-server
      type: ClusterIP
                            

Enable the empty list protection feature

  1. Log on to the MSE console, and select a region in the top navigation bar.

  2. In the left-side navigation pane, choose Microservices Governance > Application Governance. On the page that appears, click the resource card of the application that you want to manage.

  3. On the application details page, click Traffic management in the left-side navigation pane, and click the Push-through protection tab.

  4. Turn on the Push-through protection switch.

View the empty list protection events

  1. Log on to the MSE console, and select a region in the top navigation bar.

  2. In the left-side navigation pane, choose Microservices Governance > Application Governance. On the page that appears, click the resource card of the application that you want to manage.

  3. On the application details page, click Traffic management in the left-side navigation pane, and click the Push-through protection tab.

  4. In the upper-right corner of the page, view protection events in the Protection event section.

Verify the feature

  1. Run the vi curl.sh command to write a test script.

    # Call the URL passed as the first argument in a loop
    # and print each response with a timestamp.
    while :
    do
            result=$(curl -s "$1")
            echo "$(date +%F-%T) $result"

            sleep 0.1
    done
  2. Execute the script to perform the test.

    1. Run the sh curl.sh {sc-consumer-empty-slb}:18091/user/rest command to execute the script. The following result is returned:

      2022-01-19-11:58:12 Hello from [18084]10.116.0.142!
      2022-01-19-11:58:12 Hello from [18084]10.116.0.142!
      2022-01-19-11:58:12 Hello from [18084]10.116.0.142!
      2022-01-19-11:58:13 Hello from [18084]10.116.0.142!
      2022-01-19-11:58:13 Hello from [18084]10.116.0.142!
      2022-01-19-11:58:13 Hello from [18084]10.116.0.142!
    2. Keep the script running. The MSE console displays the metric data.

    3. Run the sh curl.sh {sc-consumer-slb}:18091/user/rest command to execute the script. The following result is returned:

      2022-01-19-11:58:13 Hello from [18084]10.116.0.142!
      2022-01-19-11:58:13 Hello from [18084]10.116.0.142!
      2022-01-19-11:58:13 Hello from [18084]10.116.0.142!
      2022-01-19-11:58:14 Hello from [18084]10.116.0.142!
      2022-01-19-11:58:14 Hello from [18084]10.116.0.142!
      2022-01-19-11:58:14 Hello from [18084]10.116.0.142!
    4. Keep the script running. The MSE console displays the metric data.

  3. Remove all coredns pods to simulate a DNS resolution failure.

    The application is disconnected from the Nacos registry, and the service list becomes empty.

  4. Scale the coredns component back out to 2 pods to simulate DNS service recovery.
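
Steps 3 and 4 can also be performed from the command line. The following sketch assumes that CoreDNS runs as a Deployment named coredns in the kube-system namespace, which is common in Kubernetes clusters but should be confirmed for your cluster, and that kubectl is configured to access the cluster. The function name scale_coredns and the KUBECTL variable are introduced here for illustration only.

```shell
#!/bin/bash
# Assumption: CoreDNS is a Deployment named "coredns" in the "kube-system" namespace.
# KUBECTL can be overridden, for example to point at a different kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

scale_coredns() {
    # $1: desired number of coredns replicas
    "$KUBECTL" scale deployment coredns --namespace kube-system --replicas="$1"
}

# Step 3: simulate a DNS resolution failure.
#   scale_coredns 0
# Step 4: simulate DNS service recovery.
#   scale_coredns 2
```

The calls are left as comments so that loading the script does not change the cluster by accident.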

Verify the result

During the preceding test, the sc-consumer-empty application, for which empty list protection is disabled, continuously returns errors. The sc-consumer-empty application recovers only after the sc-provider application is restarted.

2022-01-19-12:02:37 {"timestamp":"2022-01-19T04:02:37.597+0000","status":500,"error":"Internal Server Error","message":"com.netflix.client.ClientException: Load balancer does not have available server for client: mse-service-provider","path":"/user/feign"}
2022-01-19-12:02:37 {"timestamp":"2022-01-19T04:02:37.799+0000","status":500,"error":"Internal Server Error","message":"com.netflix.client.ClientException: Load balancer does not have available server for client: mse-service-provider","path":"/user/feign"}
2022-01-19-12:02:37 {"timestamp":"2022-01-19T04:02:37.993+0000","status":500,"error":"Internal Server Error","message":"com.netflix.client.ClientException: Load balancer does not have available server for client: mse-service-provider","path":"/user/feign"}

In contrast, the sc-consumer application, for which empty list protection is enabled, encounters no errors during the test.
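
To compare the two consumers by number of errors, you can save the script output to a file and count the error responses. The following sketch assumes that the output of curl.sh was redirected to a log file and that the loop was stopped manually. The function name count_errors and the log file name are hypothetical. It counts the lines that contain "status":500, as in the sample error output above.

```shell
#!/bin/bash
# Count the lines on standard input that contain an HTTP 500 error body.
count_errors() {
    # grep -c prints the number of matching lines; it exits with status 1
    # when the count is 0, so "|| true" keeps the function's status at 0.
    grep -c '"status":500' || true
}

# Example usage (the log file name is hypothetical):
#   sh curl.sh {sc-consumer-empty-slb}:18091/user/rest > consumer-empty.log
#   count_errors < consumer-empty.log
```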

References

If you use an MSE Nacos instance as the service registry, you can also enable the empty list protection feature of Nacos for additional protection. For more information, see Empty list protection.