In a microservice framework, an application provides services and can be deployed on multiple instances. Some of these instances may become abnormal. If a consumer is unaware of the abnormal instances and continues to call them, the calls may fail, which degrades service performance and availability. The outlier instance removal feature monitors the availability of instances and dynamically removes abnormal instances from, and restores recovered instances to, the set of instances that receive calls. This helps ensure successful service calls and improves service stability and performance.
In the following figure, a system includes Applications A, B, C, and D, where Application A calls Applications B, C, and D. If instances of Application B, C, or D become abnormal and Application A does not identify them, some of the calls initiated by Application A fail. In this example, Application B has one abnormal instance, and Applications C and D each have two abnormal instances. If Applications B, C, and D have a large number of abnormal instances, the service performance and availability of Application A may be affected.
To ensure service performance and availability, you can configure an outlier instance removal policy. After the policy is configured, Enterprise Distributed Application Service (EDAS) monitors the instance status of Applications B, C, and D, and dynamically adds or removes instances to ensure successful service calls.
The following list describes the outlier instance removal process:
- EDAS detects whether Applications B, C, and D have abnormal instances. Then, EDAS determines whether to remove the abnormal instances from the applications based on the configured Upper Limit of Instance Removal Ratio.
- EDAS does not distribute the call requests of Application A to the removed instances.
- EDAS detects whether the abnormal instances are recovered based on the configured Recovery Detection Unit Time.
- The detection interval increases linearly with the number of detections, in steps of Recovery Detection Unit Time, which is 0.5 minutes by default. After the number of detections reaches the value of Maximum Cumulative Number of Times Not Restored, EDAS continues to check for recovery at the maximum detection interval.
- After the abnormal instances are recovered, they are added back to the instance lists of the applications to continue processing call requests. The detection interval is reset to the value of Recovery Detection Unit Time, for example, 0.5 minutes.
- If the provider has a large number of abnormal instances and the ratio of abnormal instances exceeds the configured Upper Limit of Instance Removal Ratio, EDAS removes only as many instances as the upper limit allows.
- If the provider has only one instance available, this instance is not removed even if the error rate exceeds the configured limit.
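The recovery detection schedule described above can be sketched as follows. This is a minimal illustration, not the EDAS implementation: the function name `recovery_check_interval_ms` is hypothetical, it assumes the first check happens one unit after removal, and the `max_count` value of 4 in the usage lines is an arbitrary example because the default of Maximum Cumulative Number of Times Not Restored is not stated here.

```python
def recovery_check_interval_ms(check_number: int, max_count: int,
                               unit_ms: int = 30_000) -> int:
    """Return the wait time before the given recovery check.

    check_number: 1-based index of the recovery check since the instance
                  was removed (assumption: the first check waits one unit).
    max_count:    stands in for "Maximum Cumulative Number of Times Not
                  Restored"; after this many checks the interval stops growing.
    unit_ms:      "Recovery Detection Unit Time" (30000 ms by default).
    """
    if check_number < 1 or max_count < 1 or unit_ms <= 0:
        raise ValueError("check_number, max_count, and unit_ms must be positive")
    # Linear growth by one unit per check, capped at max_count units.
    return unit_ms * min(check_number, max_count)


# Hypothetical example: cap the growth after 4 unsuccessful checks.
intervals = [recovery_check_interval_ms(n, max_count=4) for n in range(1, 7)]
print(intervals)  # [30000, 60000, 90000, 120000, 120000, 120000]
```

When an instance recovers, the schedule starts again from `check_number=1`, which corresponds to resetting the interval to Recovery Detection Unit Time.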
Create a policy for outlier instance removal
- Log on to the EDAS console.
- In the left-side navigation pane, choose .
- In the left-side navigation tree of Service Mesh, click Outlier Instance Removal.
- On the Outlier Instance Removal page, select a region and a namespace. Then, click Create an outlier instance removal policy.
- In the Create Outlier Instance Removal Policy wizard, configure the parameters in the Basic information step and click Next.
The following table describes the parameters in the Basic information step.
| Parameter | Description |
| --- | --- |
| Namespace | Select a region and a namespace. |
| Policy name | Enter a name for the policy. The name can be a maximum of 64 characters in length. |
| Framework | Select Service Mesh. |
- In the Select effective application step of the Create Outlier Instance Removal Policy wizard, select the specific application and click the > icon to add the application to the Selected Applications section. Then, click Next step.
After the specific application is selected, all abnormal application instances that are called by the application are removed. Call requests from the application on which the created policy takes effect are not sent to the removed instances.
- In the Configure policies step of the Create Outlier Instance Removal Policy wizard, configure the parameters and click Next step.
The following table describes the parameters in the Configure policies step.
| Parameter | Description |
| --- | --- |
| Exception type | Default value: 5xx Error. You cannot change the value of this parameter. |
| Proportion of Largest Instances | Enter the upper limit for the proportion of abnormal instances that can be removed. If the limit is reached, no more abnormal instances are removed. The number of instances that can be removed is calculated as the number of instances multiplied by this proportion, rounded down to the nearest integer. For example, if an application has 6 instances and you set this parameter to 60%, the result is 6 × 60% = 3.6, which is rounded down to 3. If the calculation result is less than 1, no instances are removed. |
| Recovery detection unit time | Specify the unit interval, in milliseconds, at which EDAS detects whether abnormal instances are recovered. After abnormal instances are removed, EDAS linearly increases the detection interval by the specified unit interval. Default value: 30000 ms, which is equal to 0.5 minutes. |
| Number of Consecutive Errors | Specify the threshold for the number of consecutive errors during requests. If the threshold is reached, the instance is removed. |
| Maximum Connections | Specify the maximum number of connections that are supported by a service. Default value: 1024. |
| Maximum Pending Requests | Specify the maximum number of pending requests that are supported by a service. Default value: 1024. |
| Maximum Requests for a Single Connection | Specify the maximum number of requests that are supported on a single connection of a service. Default value: 1024. |
- In the Create Confirm step of the Create Outlier Instance Removal Policy wizard, check the settings and click Create.
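The removal limit configured in the Configure policies step can be illustrated with a short calculation. The sketch below is not EDAS code: the function name `removable_instances` is hypothetical, it assumes that Proportion of Largest Instances is the same setting as Upper Limit of Instance Removal Ratio, and it interprets the "only one instance available" rule as never removing the last remaining instance.

```python
def removable_instances(total: int, abnormal: int, max_ratio_percent: int) -> int:
    """Return how many abnormal instances may be removed.

    total:             number of instances of the provider application
    abnormal:          number of instances currently detected as abnormal
    max_ratio_percent: "Proportion of Largest Instances", as a whole percentage
    """
    if total < 0 or not 0 <= abnormal <= total:
        raise ValueError("abnormal must be between 0 and total")
    if not 0 <= max_ratio_percent <= 100:
        raise ValueError("max_ratio_percent must be between 0 and 100")
    # Integer arithmetic rounds down, e.g. 6 * 60 // 100 == 3 (3.6 rounded down).
    cap = total * max_ratio_percent // 100
    removable = min(abnormal, cap)
    # Assumption: always keep at least one instance in service.
    return min(removable, max(total - 1, 0))


print(removable_instances(total=6, abnormal=5, max_ratio_percent=60))   # 3
print(removable_instances(total=2, abnormal=1, max_ratio_percent=40))   # 0
print(removable_instances(total=1, abnormal=1, max_ratio_percent=100))  # 0
```

The second call returns 0 because 2 × 40% = 0.8 is less than 1, and the third because the only instance is kept even though it is abnormal.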
Verify the result
After you configure and submit an outlier instance removal policy, the outlier instance removal feature is enabled. To determine whether the policy takes effect, go to the details page of an application for which the policy is configured and view the monitoring information. In the topology, check whether requests are still forwarded to abnormal instances. You can also check whether Error Rate per Minute of the application is higher than the configured Lower Error Rate.
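If you export request records yourself, a comparable per-minute error rate can be computed offline and compared with a threshold. The sketch below rests on assumptions: the record format (epoch seconds, HTTP status code), the function name `error_rate_per_minute`, and the threshold of 0.3 are all hypothetical, and the console may compute Error Rate per Minute differently.

```python
from collections import defaultdict


def error_rate_per_minute(requests):
    """Map each minute bucket to the fraction of requests that returned 5xx.

    requests: iterable of (epoch_seconds, http_status) pairs (assumed format).
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for timestamp, status in requests:
        minute = int(timestamp // 60)  # bucket requests by whole minute
        totals[minute] += 1
        if 500 <= status <= 599:       # matches the 5xx Error exception type
            errors[minute] += 1
    return {minute: errors[minute] / totals[minute] for minute in sorted(totals)}


# Hypothetical sample: four requests in minute 0, two in minute 1.
sample = [(0, 200), (10, 503), (30, 200), (59, 500), (60, 200), (61, 200)]
rates = error_rate_per_minute(sample)
print(rates)  # {0: 0.5, 1: 0.0}

# Minutes whose error rate exceeds an example threshold of 0.3.
print([minute for minute, rate in rates.items() if rate > 0.3])  # [0]
```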
What to do next
On the Outlier Instance Removal page, you can click Edit or Delete in the Operation column to manage the policies.