In a microservice framework, service calls fail or degrade if consumers cannot detect abnormal application instances of a provider. This in turn affects the performance, and even the availability, of the services that the consumers provide. The outlier instance removal feature monitors the availability of application instances, removes abnormal instances from the list of instances that receive calls, and adds them back after they recover. This helps ensure successful service calls and improves service stability and quality of service (QoS).

Background information

Consider a system that includes Applications A, B, C, and D, where Application A calls Applications B, C, and D. If instances of Application B, C, or D become abnormal and Application A does not identify them, some of the calls initiated by Application A fail. In this example, Application B has one abnormal instance, and Applications C and D each have two abnormal instances. If Applications B, C, and D have a large number of abnormal instances, the service performance and availability of Application A may be affected.

To ensure the service performance and availability of Application A, you can configure an outlier instance removal policy. After the policy is configured, Enterprise Distributed Application Service (EDAS) monitors the instance status of Applications B, C, and D, removes abnormal instances, and adds recovered instances back to ensure successful service calls.

The following list describes the process of outlier instance removal:

  1. EDAS detects whether Applications B, C, and D have abnormal instances. Then, EDAS determines whether to remove the abnormal instances from the applications based on the configured Upper limit of instance removal ratio parameter.
  2. EDAS does not distribute the call requests of Application A to the removed instances.
  3. EDAS detects whether the abnormal instances are recovered based on the configured Recovery detection unit time parameter.
  4. Each time a detection finds that the instances are still abnormal, the detection interval increases linearly by the value of the Recovery detection unit time parameter, which is 0.5 minutes by default. After the number of detections reaches the value of the Maximum cumulative number of times not restored parameter, EDAS continues to detect whether the abnormal instances are recovered at the maximum detection interval.
  5. After the abnormal instances are recovered, they are added to the instance lists of the applications to continue processing call requests. The detection interval is reset to the value of the Recovery detection unit time parameter, such as 0.5 minutes.
Note
  • If the provider has a large number of abnormal instances and the ratio of abnormal instances exceeds the value of the Upper limit of instance removal ratio parameter, EDAS removes only as many instances as the configured upper limit allows.
  • If the provider has only one instance available, this instance is not removed even if the error rate exceeds the configured limit.
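
The recovery detection schedule and the removal limit described above can be illustrated with a minimal Python sketch. This is not EDAS code. The function names, the rounding down of the removal limit, the rule that at least one instance always stays in service, and the example value of 40 for the Maximum cumulative number of times not restored parameter are all assumptions made for illustration. Only the default of 0.5 minutes for the Recovery detection unit time parameter comes from the text.

```python
def recovery_check_interval(failed_checks: int,
                            max_failed_checks: int,
                            unit_minutes: float = 0.5) -> float:
    """Minutes to wait before the next recovery detection.

    The interval grows by one unit per failed detection and stops growing
    once failed_checks reaches max_failed_checks (assumed reading of the
    "Maximum cumulative number of times not restored" parameter).
    """
    steps = min(max(failed_checks, 1), max_failed_checks)
    return steps * unit_minutes


def removable_instances(total: int, abnormal: int,
                        max_removal_percent: int) -> int:
    """Number of abnormal instances that may be removed.

    Assumptions of this sketch: the upper limit is rounded down, and at
    least one instance always remains in the instance list.
    """
    if total <= 1:
        return 0  # a single remaining instance is never removed
    cap = total * max_removal_percent // 100
    return min(abnormal, cap, total - 1)


if __name__ == "__main__":
    # Hypothetical policy: unit time 0.5 minutes, maximum count 40.
    for n in (1, 2, 3, 40, 41):
        print(n, recovery_check_interval(n, max_failed_checks=40))
    print(removable_instances(total=10, abnormal=7, max_removal_percent=50))
```

With these assumed values, the interval grows from 0.5 minutes to a maximum of 20 minutes, and a provider with 10 instances, 7 of them abnormal, and a 50% removal limit has 5 instances removed.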

Create an outlier instance removal policy

Verify the result

The outlier instance removal feature is enabled after you configure and create an outlier instance removal policy. To check whether the policy takes effect, go to the details page of the application for which you configured outlier instance removal and view the application monitoring information. For example, in the topology, you can check whether call requests are still forwarded to abnormal instances and whether the per-minute error rate of application calls is higher than the value of the Lower error rate limit parameter.
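
If you export call records for offline analysis, the per-minute error rate check can be sketched as follows. This is an illustrative example, not an EDAS API. The record format (a timestamp in seconds plus a success flag) and the function name are hypothetical.

```python
from collections import defaultdict


def minutes_over_limit(calls, lower_error_rate_limit_percent):
    """Return the minute buckets whose error rate exceeds the limit.

    calls: iterable of (epoch_seconds, succeeded) tuples (hypothetical format).
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for timestamp, succeeded in calls:
        minute = int(timestamp // 60)
        totals[minute] += 1
        if not succeeded:
            errors[minute] += 1
    # Compare errors / total > limit / 100 without floating-point division.
    return sorted(
        minute for minute in totals
        if errors[minute] * 100 > lower_error_rate_limit_percent * totals[minute]
    )


if __name__ == "__main__":
    sample = [(0, True), (10, False), (20, False),
              (61, True), (62, True), (63, False), (64, True)]
    print(minutes_over_limit(sample, lower_error_rate_limit_percent=50))
```

In the sample, the first minute has two failed calls out of three and is flagged, whereas the second minute has one failed call out of four and is not.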