Spring Cloud Alibaba Seven-Day Training Camp (5): Service Circuit Breaking and Throttling - Alibaba Cloud Developer Community

Document directory
  • Spring Cloud Alibaba seven-day training camp (1) basic knowledge
  • Spring Cloud Alibaba seven-day training camp (2) distributed configuration
  • Spring Cloud Alibaba seven-day training camp (3) service registration and discovery
  • Spring Cloud Alibaba seven-day training camp (4) calling distributed services
  • Spring Cloud Alibaba seven-day training camp (5) service circuit breaking and throttling
  • Spring Cloud Alibaba seven-day training camp (6) distributed message (event) driven
  • Spring Cloud Alibaba seven-day training camp (7) distributed transactions

Why throttling and degradation are required

Unstable situations often occur in production environments, for example:

  • The system exceeds its maximum load and the load rises sharply. The system crashes, and users cannot place orders.
  • A "dark horse" hot product causes cache breakdown, the database is overwhelmed, and normal traffic is squeezed out.
  • The caller is dragged down by an unstable service and its thread pool fills up, causing the entire call chain to hang.

These unstable scenarios may lead to serious consequences. You may ask: how can we keep user access even and smooth? How can we prevent the impact of excessive traffic or service instability? This is where we need a key tool for microservice stability: high-availability traffic protection. Its main methods are traffic control and circuit breaking with degradation, both of which are important for ensuring the stability of microservices.

Why is traffic control required?

Traffic is random and unpredictable. One second may be calm, and the next may bring a traffic peak (for example, at 00:00 on Double 11). However, the capacity of our system is always limited. If a burst of traffic exceeds what the system can bear, requests may go unprocessed, accumulated requests are processed slowly, and CPU usage and load rise sharply until the system crashes. Therefore, we need to limit such bursts and process as many requests as possible while ensuring that the service is not overwhelmed. This is traffic control.

Why is circuit breaking and degradation required?

A service often calls other modules, such as another remote service, a database, or a third-party API. For example, a payment may require a remote call to an API provided by UnionPay, and querying the price of a product may require a database query. However, the stability of the dependency cannot be guaranteed. If the dependency is unstable and its response time increases, the response time of the calling method also increases and threads pile up. Eventually, the caller's own thread pool may be exhausted and the caller itself becomes unavailable.

Modern microservice architectures are distributed and consist of many services. Services call each other and form complex call chains, which amplify the preceding problem. If one link in a complex chain is unstable, the failure may cascade layer by layer until the entire chain is unavailable. Therefore, we need to apply circuit breaking and degradation to unstable weak dependencies, temporarily cutting off unstable calls so that a local failure does not cause an overall avalanche.

Sentinel: a powerful tool for safeguarding high availability

Sentinel is a high-availability protection component for distributed service architectures, open-sourced by Alibaba. It takes traffic as its entry point and helps developers ensure the stability of microservices along multiple dimensions, including traffic control, traffic shaping, circuit breaking and degradation, system adaptive protection, and hotspot protection. Over the past 10 years, Sentinel has handled the core traffic scenarios of Alibaba's Double 11 shopping festival, such as flash sales (seckill), cold start, message peak shaving and valley filling, adaptive traffic control, and real-time circuit breaking of unavailable downstream services. It supports multiple languages, such as Java, Go, and C++, and provides Istio/Envoy global throttling support to give Service Mesh high-availability protection.

[Figure: Technical highlights of Sentinel]

Common scenarios:

  • On the service provider side, we need to protect the provider from traffic peaks. Traffic is usually controlled based on the provider's service capacity, or limited for specific callers. We can evaluate the capacity of core interfaces through stress testing in advance and configure QPS-mode throttling. When the number of requests per second exceeds the configured threshold, the excess requests are automatically rejected.
  • To avoid being dragged down by unstable services when calling them, we need to isolate unstable dependencies and break the circuit on the service consumer side. Methods include semaphore isolation, degradation by exception ratio, and degradation by RT (response time).
  • When the system has been at a low water level for a long time and traffic suddenly increases, pulling the system straight to a high water level may crush it instantly. In this case, we can use Sentinel's WarmUp flow control mode to let the passing traffic increase slowly and reach the threshold gradually within a certain period of time, instead of admitting all traffic at once. This gives the cold system time to warm up and prevents it from being crushed.
  • Use Sentinel's uniform queuing mode for "peak shaving and valley filling": request spikes are spread evenly over a period of time, which keeps the system load within its processing capacity while processing as many requests as possible.
  • Use Sentinel's gateway flow control feature to protect traffic at the gateway entrance or to limit the frequency of API calls.
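To make the uniform queuing mode more concrete, here is a minimal Java sketch of the idea: each request is assigned the next free time slot, and it is rejected if it would have to wait too long. The class name PacingLimiter, its method names, and the values (10 QPS, a maximum queueing time of 250 ms) are assumptions made for illustration. It is a simplified single-threaded model, not Sentinel's implementation, and timestamps are passed in explicitly so that the behavior is easy to follow.

```java
/**
 * Illustrative sketch of "uniform queuing": passed requests are spaced at a fixed interval.
 * Hypothetical class, single-threaded, not Sentinel's actual implementation.
 */
final class PacingLimiter {
    private final long intervalMs;     // minimum spacing between two passed requests
    private final long maxQueueingMs;  // longest time a request is allowed to wait
    private long latestPassedMs = -1;  // time slot given to the most recently admitted request

    PacingLimiter(int qps, long maxQueueingMs) {
        this.intervalMs = 1000L / qps;
        this.maxQueueingMs = maxQueueingMs;
    }

    /**
     * Returns how many milliseconds the request arriving at nowMs has to wait
     * before it may proceed, or -1 if the wait would be too long and it is rejected.
     */
    long tryAcquire(long nowMs) {
        long expectedMs = latestPassedMs < 0 ? nowMs : latestPassedMs + intervalMs;
        if (expectedMs <= nowMs) {
            latestPassedMs = nowMs;    // the next slot is already free, pass immediately
            return 0;
        }
        long waitMs = expectedMs - nowMs;
        if (waitMs > maxQueueingMs) {
            return -1;                 // the queue is too long, reject
        }
        latestPassedMs = expectedMs;   // reserve the next slot for this request
        return waitMs;
    }

    public static void main(String[] args) {
        PacingLimiter limiter = new PacingLimiter(10, 250); // 10 QPS: one request per 100 ms
        // A spike: five requests arrive in the same millisecond (t = 0).
        for (int i = 0; i < 5; i++) {
            System.out.println("request " + i + " -> wait " + limiter.tryAcquire(0) + " ms");
        }
    }
}
```

With these numbers, the spike is spread out: the first three requests are admitted at 0, 100, and 200 ms, and the last two are rejected because they would have to wait longer than 250 ms.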

Sentinel has a rich open-source ecosystem. Soon after being open-sourced, Sentinel was included in the CNCF Landscape and became one of the officially recommended throttling and degradation components in Spring Cloud. The community provides out-of-the-box integrations for common microservice frameworks such as Spring Cloud, Dubbo, gRPC, and Quarkus. Sentinel also supports the reactive ecosystem, including asynchronous, reactive architectures built on Reactor and Spring WebFlux. Sentinel is gradually covering API Gateway and Service Mesh scenarios and playing a greater role in cloud-native architectures.

The original Spring Cloud Netflix suite includes its own circuit-breaking component, Hystrix, an open-source component from Netflix that provides circuit breaking and isolation. However, since November 2018 Hystrix has no longer been actively developed and is in maintenance mode. In the same year, Spring Cloud Alibaba (SCA) was open-sourced and provided a one-stop solution. By default, it integrates Sentinel with Spring Web, RestTemplate, FeignClient, and Spring WebFlux. In the Spring Cloud ecosystem, Sentinel not only fills the gaps left by Hystrix in Servlet, RestTemplate, and API Gateway, but is also fully compatible with the way Hystrix is used for throttling and degradation in FeignClient, and it supports flexible configuration and adjustment of throttling and degradation rules at runtime. SCA also integrates the API gateway flow control module provided by Sentinel, which seamlessly supports flow control and degradation for Spring Cloud Gateway and Zuul.

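Spring Cloud Alibaba Sentinel service throttling/circuit breaking

It's time to get hands-on! We use a sample project to implement throttling and circuit breaking in Spring Cloud. The sample project consists of four modules, including the dubbo-provider, web-api-demo, and demo-gateway applications used below. The gateway defines its routes as follows: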

spring:
  cloud:
    gateway:
      enabled: true
      discovery:
        locator:
          # convert route IDs to lowercase
          lower-case-service-id: true
      routes:
        - id: foo-service-route
          uri: http://localhost:9669/
          predicates:
            - Path=/demo/**
        - id: httpbin-route
          uri: https://httpbin.org
          predicates:
            - Path=/httpbin/**
          filters:
            - RewritePath=/httpbin/(?<segment>.*), /$\{segment}

This route configuration includes two routes:

  • foo-service-route: routes requests whose path starts with /demo/ to the backend service at localhost:9669, which is our Web service. All API calls in this example go through this route, for example localhost:8090/demo/time.
  • httpbin-route: a sample route that forwards requests whose path starts with /httpbin/ to https://httpbin.org. For example, localhost:8090/httpbin/json is mapped to https://httpbin.org/json.

In addition, our environment includes the Sentinel console, which can be accessed directly and to which each service connects. Address: TODO

Next, we connect the services to SCA Sentinel step by step and configure throttling and degradation rules in the console or through a Nacos dynamic data source.

spring-cloud-alibaba-dependencies configuration

First, we import the latest version of spring-cloud-alibaba-dependencies into the dependencyManagement section of the project's parent pom. This way, we do not need to specify a version number when introducing SCA-related dependencies.


Connecting services to SCA Sentinel

First, we add the Spring Cloud Alibaba Sentinel starter (spring-cloud-starter-alibaba-sentinel) as a dependency to the three service modules.

The starter automatically configures Sentinel's adapter modules. With only a little configuration, an application can integrate Sentinel and connect to the Sentinel console.

For Dubbo services, we also need an additional Dubbo adapter module. Sentinel provides an out-of-the-box integration module for Apache Dubbo, sentinel-apache-dubbo-adapter. Once this dependency is added, Dubbo calls are automatically instrumented and counted (both provider and consumer are supported).

We add the adapter dependency to the pom files of the web-api-demo and dubbo-provider applications, so that the Dubbo consumer/provider interfaces of the two applications are automatically counted by Sentinel.

For gateways such as Spring Cloud Gateway and Zuul 1.x, we also need the additional SCA dependency spring-cloud-alibaba-sentinel-gateway.

This dependency automatically adds Sentinel-related configuration to the gateway so that the API gateway integrates Sentinel automatically. We add it to the pom file of the demo-gateway application so that our gateway application is connected to Sentinel.

After the dependencies are introduced, only a little configuration is needed to connect to the Sentinel console. In application.properties, we configure the application name and the address of the console to connect to, taking web-api-demo as an example.

Everyone is probably familiar with spring.application.name. Spring Cloud Alibaba Sentinel automatically uses this value as the appName of the connecting application. The spring.cloud.sentinel.transport.dashboard property specifies the address and port of the console to connect to.

After completing the preceding configuration, we start the dubbo-provider, web-api-demo, and demo-gateway applications in sequence and request localhost:8090/demo/time through the gateway entrance to obtain the current time. Once the services have been called, the three applications appear in the Sentinel console, and the monitoring page shows their traffic, which indicates that the integration succeeded.

On the cluster point link page of each application, you can see the tracking points of the current application. For example, the Web application shows its Web URLs and Dubbo consumer calls.

Throttling rules

The following is a simple flow control rule. On the cluster point link page of the Dubbo provider, we configure a throttling rule for the service call com.alibaba.csp.sentinel.demo.dubbo.FooService:getCurrentTime(boolean). The resource only appears on this page after it has been called at least once. We configure a throttling rule with a QPS threshold of 1, which means that this service method can be called at most once per second. Calls beyond the threshold are rejected directly.

Click the Add button to add the rule. We can then request localhost:8090/demo/time repeatedly in the browser (not too slowly) and see the throttling exception message. The default throttling logic of the Dubbo provider is to throw an exception. The exception message is returned by Dubbo and displayed on Spring's default error page.

On the real-time monitoring page, you can view the passed and rejected requests in real time.

We can also configure throttling rules on the Web API to observe the effect. The default throttling logic of the Spring Web adapter is to return a default prompt (Blocked by Sentinel) with status code 429. The following sections describe how to customize the flow control handling logic.
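The effect of the QPS = 1 rule used in this section can be pictured with a small counter that resets every second. The following Java sketch is only a mental model under stated assumptions: the class name QpsLimiter and its method are hypothetical, it uses a plain fixed one-second window, and it is not how Sentinel collects its statistics internally.

```java
/**
 * Fixed one-second window counter (illustrative only).
 * Hypothetical class, not Sentinel's implementation.
 */
final class QpsLimiter {
    private final int threshold;      // maximum number of calls allowed per second
    private long windowStartMs = -1;  // start of the current one-second window
    private int passedInWindow;       // calls already admitted in this window

    QpsLimiter(int threshold) {
        this.threshold = threshold;
    }

    /** Returns true if the call arriving at nowMs may pass, false if it is rejected. */
    synchronized boolean tryPass(long nowMs) {
        long currentWindowMs = nowMs - (nowMs % 1000);
        if (currentWindowMs != windowStartMs) {
            windowStartMs = currentWindowMs;  // a new second has started, reset the counter
            passedInWindow = 0;
        }
        if (passedInWindow >= threshold) {
            return false;                     // over the threshold, reject directly
        }
        passedInWindow++;
        return true;
    }

    public static void main(String[] args) {
        QpsLimiter limiter = new QpsLimiter(1);
        long[] arrivalsMs = {1000, 1200, 1900, 2000};
        for (long t : arrivalsMs) {
            System.out.println(t + " ms -> " + (limiter.tryPass(t) ? "pass" : "blocked"));
        }
    }
}
```

In this run, the calls at 1000 ms and 2000 ms pass, while the calls at 1200 ms and 1900 ms fall into an already used second and are blocked, which matches what we see when refreshing the browser quickly.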

After learning the basic usage of throttling, you may ask: do I need to configure throttling rules for every interface in production? What if the threshold is set incorrectly? In fact, throttling and degradation rules should be configured together with capacity planning and dependency analysis. We can use stress testing tools such as JMeter or Alibaba Cloud PTS to run full-link stress tests against our services, learn the maximum capacity of each service, and use the maximum capacity of the core interfaces as their QPS thresholds.

Gateway throttling rules

Sentinel provides throttling tailored to API gateways. It supports throttling by gateway route (such as the foo-service-route defined in the gateway) or by custom API group, and it supports throttling by request attribute, such as a header. You can define API groups in the Sentinel console. An API group can be regarded as a combination of URL matching patterns. For example, you can define an API group called my_api so that requests whose path matches /foo/** or /baz/** all belong to my_api. Throttling can then be applied to this custom API group.

Let's configure a gateway flow control rule in the console. The console pages for an API gateway differ from those for ordinary applications because they are customized for gateway scenarios. Sentinel gateway throttling rules can extract request attributes of a route, including the remote IP, headers, URL parameters, and cookies. They can automatically count and limit hotspot values, or limit one specific value (for example, a particular uid).

We configure a gateway throttling rule by request attribute for the foo-service-route route. This rule limits each hot value of the uid URL parameter to at most 2 requests per minute.

After the rule is saved, we can send requests to the backend service with different uid parameters (even though the backend does not use them), such as localhost:8090/demo/time?uid=xxx. We can observe that a throttling page is returned once a given uid is requested more than twice per minute.

For more information about how to configure Sentinel gateway throttling, see the Sentinel gateway flow control documentation.
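The per-uid rule above (at most 2 requests per minute for each uid value) boils down to keeping one counter per parameter value. The Java sketch below illustrates that idea under stated assumptions: the class PerKeyLimiter and its method are hypothetical names, the window is a plain fixed window, and every key is kept forever, which a production implementation would not do. It is not Sentinel's gateway flow control code.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * One fixed-window counter per parameter value (illustrative only).
 * Hypothetical class, not Sentinel's gateway flow control implementation.
 */
final class PerKeyLimiter {
    private final int maxPerWindow;  // allowed requests per key and window
    private final long windowMs;     // window length in milliseconds
    // key -> {start of the key's current window, requests admitted in that window}
    private final Map<String, long[]> counters = new HashMap<>();

    PerKeyLimiter(int maxPerWindow, long windowMs) {
        this.maxPerWindow = maxPerWindow;
        this.windowMs = windowMs;
    }

    /** Returns true if the request carrying this key may pass at nowMs. */
    synchronized boolean tryPass(String key, long nowMs) {
        long windowStart = nowMs - (nowMs % windowMs);
        long[] state = counters.computeIfAbsent(key, k -> new long[] {windowStart, 0});
        if (state[0] != windowStart) {
            state[0] = windowStart;  // a new window has started for this key
            state[1] = 0;
        }
        if (state[1] >= maxPerWindow) {
            return false;            // this key has used up its quota
        }
        state[1]++;
        return true;
    }

    public static void main(String[] args) {
        PerKeyLimiter limiter = new PerKeyLimiter(2, 60_000); // 2 requests per minute per uid
        System.out.println("uid=a: " + limiter.tryPass("a", 0));
        System.out.println("uid=a: " + limiter.tryPass("a", 1_000));
        System.out.println("uid=a: " + limiter.tryPass("a", 2_000));  // third request in the minute
        System.out.println("uid=b: " + limiter.tryPass("b", 2_000));  // different uid, own quota
        System.out.println("uid=a: " + limiter.tryPass("a", 60_000)); // next minute
    }
}
```

Each uid has its own quota, so blocking one uid does not affect requests that carry a different uid.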

Circuit breaking and degradation rules

Circuit breaking and degradation is usually used to automatically cut off calls to unstable services, so that the caller is not dragged down and failures do not cascade. The rules are usually configured on the caller side. For weak dependencies, a predefined fallback value is returned while the circuit is open, which ensures that the core path is not affected by unstable side paths.

Sentinel provides the following circuit-breaking strategies:

  • Slow call ratio (SLOW_REQUEST_RATIO): the ratio of slow calls is used as the threshold. You need to set the maximum allowed RT (response time), and a request whose response time exceeds this value is counted as a slow call. When the number of requests in a statistical interval (statIntervalMs, 1s by default) reaches the configured minimum number of requests and the ratio of slow calls exceeds the threshold, requests are rejected automatically for the following circuit-breaking duration. After that duration, the circuit breaker enters the probing state (HALF-OPEN). If the response time of the next request is within the configured slow call RT, circuit breaking ends. If it exceeds the slow call RT, the circuit opens again.
  • Error ratio (ERROR_RATIO): when the number of requests in a statistical interval reaches the configured minimum number of requests and the ratio of exceptions exceeds the threshold, requests are rejected automatically for the following circuit-breaking duration. After that duration, the circuit breaker enters the probing state (HALF-OPEN). If the next request completes successfully (without an error), circuit breaking ends. Otherwise, the circuit opens again. The threshold range of the error ratio is [0.0, 1.0], which represents 0% to 100%.
  • Error count (ERROR_COUNT): when the number of exceptions in a statistical interval exceeds the threshold, the circuit opens automatically. After the circuit-breaking duration, the circuit breaker enters the probing state (HALF-OPEN). If the next request completes successfully (without an error), circuit breaking ends. Otherwise, the circuit opens again.

Next, we configure a slow call circuit-breaking rule for the Dubbo consumer in the Web application and simulate slow calls to observe the effect. In web-api-demo, we configure a circuit-breaking rule for calls to the com.alibaba.csp.sentinel.demo.dubbo.FooService service.

The statistical interval configured in the console is 1s by default. In this rule, the slow call RT is set to 50 ms, so a call whose response time exceeds 50 ms is recorded as a slow call. When the number of requests in the statistical interval is greater than or equal to 5 and the ratio of slow calls exceeds the configured threshold (80%), the circuit breaker trips. The circuit-breaking duration is 5s. After it elapses, one request is allowed through as a probe. If that request completes normally, calls resume. Otherwise, the circuit stays open.
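The rule described above can be read as a small state machine with the states CLOSED, OPEN, and HALF_OPEN. The Java sketch below walks through it with the same numbers (50 ms slow call RT, 80% ratio, at least 5 requests, 1s statistical interval, 5s circuit-breaking duration). The class SlowRatioBreaker and its methods are hypothetical names chosen for illustration, and timestamps are passed in explicitly. It is a simplified single-threaded model, not Sentinel's implementation.

```java
/**
 * Simplified slow-call-ratio circuit breaker (illustrative only).
 * Hypothetical class, not Sentinel's implementation.
 */
final class SlowRatioBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final long maxRtMs;          // calls slower than this count as slow calls
    private final double slowRatio;      // ratio of slow calls that trips the breaker
    private final int minRequests;       // minimum requests per interval before evaluating
    private final long statIntervalMs;   // length of the statistical interval
    private final long openMs;           // how long the circuit stays open

    private State state = State.CLOSED;
    private long windowStartMs;          // start of the current statistical interval
    private long total;                  // completed calls in the interval
    private long slow;                   // slow calls in the interval
    private long retryAtMs;              // earliest time a probe request is allowed

    SlowRatioBreaker(long maxRtMs, double slowRatio, int minRequests,
                     long statIntervalMs, long openMs) {
        this.maxRtMs = maxRtMs;
        this.slowRatio = slowRatio;
        this.minRequests = minRequests;
        this.statIntervalMs = statIntervalMs;
        this.openMs = openMs;
    }

    State state() {
        return state;
    }

    /** Returns true if a call may be made at nowMs. */
    boolean tryPass(long nowMs) {
        if (state == State.CLOSED) {
            return true;
        }
        if (state == State.OPEN && nowMs >= retryAtMs) {
            state = State.HALF_OPEN;     // let exactly one probe request through
            return true;
        }
        return false;                    // OPEN, or HALF_OPEN with the probe still in flight
    }

    /** Records a completed call that took rtMs and finished at nowMs. */
    void onComplete(long nowMs, long rtMs) {
        boolean isSlow = rtMs > maxRtMs;
        if (state == State.HALF_OPEN) {  // result of the probe decides the next state
            if (isSlow) {
                open(nowMs);
            } else {
                close(nowMs);
            }
            return;
        }
        if (state == State.OPEN) {
            return;                      // late result of a call admitted before tripping
        }
        if (nowMs - windowStartMs >= statIntervalMs) {
            windowStartMs = nowMs;       // start a new statistical interval
            total = 0;
            slow = 0;
        }
        total++;
        if (isSlow) {
            slow++;
        }
        if (total >= minRequests && (double) slow / total > slowRatio) {
            open(nowMs);
        }
    }

    private void open(long nowMs) {
        state = State.OPEN;
        retryAtMs = nowMs + openMs;
    }

    private void close(long nowMs) {
        state = State.CLOSED;
        windowStartMs = nowMs;
        total = 0;
        slow = 0;
    }

    public static void main(String[] args) {
        SlowRatioBreaker breaker = new SlowRatioBreaker(50, 0.8, 5, 1000, 5000);
        for (long t = 0; t <= 40; t += 10) {
            breaker.tryPass(t);
            breaker.onComplete(t, 120);  // five slow calls in a row
        }
        System.out.println("after slow calls: " + breaker.state());
        System.out.println("call allowed at 100 ms? " + breaker.tryPass(100));
        System.out.println("probe allowed at 5040 ms? " + breaker.tryPass(5040));
        breaker.onComplete(5060, 20);    // the probe is fast
        System.out.println("after fast probe: " + breaker.state());
    }
}
```

Modeling HALF_OPEN as a separate state keeps the recovery logic simple: a single probe decides whether the circuit closes or opens again for another full duration.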

In our example, /demo/time accepts a slow request parameter to simulate slow calls. When slow is set to true, the request takes more than 100 ms. We can use a stress testing tool such as ab, or a script, to send requests to localhost:8090/demo/time?slow=true in batches and observe the circuit-breaking response.

If we keep simulating slow calls, we can observe that one request is let through every 5s after the circuit opens, but because that request is still a slow call, the circuit opens again and never recovers. To see recovery, wait for a while after the circuit breaker trips, manually send a normal request without slow=true, and then send further requests. The circuit breaker then recovers.

Note that even if the caller has introduced circuit breaking and degradation, we still need to configure request timeouts on the HTTP or RPC client as a last line of protection.
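As an illustration of client-side timeouts, the following sketch uses the HTTP client built into the JDK (java.net.http, Java 11 and later). The class name TimeoutConfig and the timeout values (1 second to connect, 500 ms for the response) are assumptions chosen for the example, and the demo project may use a different HTTP or RPC client with its own timeout settings.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical helper that shows where timeouts are set on the JDK HTTP client. */
final class TimeoutConfig {

    /** Client that gives up on establishing a connection after one second. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(1))
                .build();
    }

    /** Request that times out if no response arrives within 500 ms. */
    static HttpRequest timeRequest() {
        return HttpRequest.newBuilder(URI.create("http://localhost:8090/demo/time"))
                .timeout(Duration.ofMillis(500))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Only prints the configured values; no request is sent here.
        System.out.println("connect timeout: " + newClient().connectTimeout().orElseThrow());
        System.out.println("request timeout: " + timeRequest().timeout().orElseThrow());
    }
}
```

With a request timeout in place, a hanging dependency releases the calling thread after a bounded time even when no circuit-breaking rule has tripped yet.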

Custom tracking in annotation mode

The tracking points we have seen so far are all created automatically by Sentinel's adapter modules. In some cases, automatic tracking does not meet our needs, and we want to limit traffic at a specific piece of business logic. Can we do this? Of course! Sentinel provides two ways to define custom tracking points: the SphU API and the @SentinelResource annotation. The former is the most general, but the code is more complex and more tightly coupled. The annotation is less invasive but is restricted in the scenarios it supports. Here, we add the annotation to the DemoService of the Web application to track a local service.

In DemoService, we have implemented a simple greeting service:

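import org.springframework.stereotype.Service;

@Service // the class must be a Spring bean for the annotation used below to take effect
public class DemoService {
    public String bonjour(String name) {
        return "Bonjour, " + name;
    }
}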

Next, we add the @SentinelResource annotation to the bonjour method. The value of the annotation is the name (resourceName) of the tracking point, which is displayed on the cluster point link and monitoring pages.

@SentinelResource(value = "DemoService#bonjour")
public String bonjour(String name) {
    return "Bonjour, " + name;
}

After adding the annotation, we call the /demo/bonjour/{name} API through the gateway and can then see the custom tracking point DemoService#bonjour.

Adding the annotation is only the first step. In production, we want some fallback logic to run when these custom tracking points are throttled, instead of throwing an exception straight to the caller. For this we can write a fallback function:

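public String bonjourFallback(Throwable t) {
    // BlockException (and its subclasses) means the call was blocked by a Sentinel rule.
    if (BlockException.isBlockException(t)) {
        return "Blocked by Sentinel: " + t.getClass().getSimpleName();
    }
    // Anything else is a business exception thrown by the method itself.
    return "Oops, failed: " + t.getClass().getCanonicalName();
}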

Our fallback function accepts a Throwable parameter, from which the exception information can be obtained. The fallback of the Sentinel annotation catches both business exceptions and throttling exceptions (that is, BlockException and its subclasses). We can handle them in the fallback logic (for example, by logging) and return a fallback value.

Tip: the Sentinel annotation imposes requirements on the method signatures of fallback and blockHandler functions. For more information, see the Sentinel annotation support documentation.

After implementing the fallback function, we specify it in the @SentinelResource annotation:

@SentinelResource(value = "DemoService#bonjour", defaultFallback = "bonjourFallback")
public String bonjour(String name) {
    return "Bonjour, " + name;
}

In this way, when our custom DemoService#bonjour resource is throttled or its circuit is broken, the request goes to the fallback logic and returns the fallback result instead of throwing an exception. We can configure a throttling rule with QPS = 1 and then check the return values of requests sent in quick succession:

$ curl http://localhost:8090/demo/bonjour/Sentinel
Bonjour, Sentinel
$ curl http://localhost:8090/demo/bonjour/Sentinel
Blocked by Sentinel: FlowException

Tip: to use the @SentinelResource annotation, the class must be managed by Spring (that is, it must be a Spring bean), the method must not be invoked internally from within the same class (such calls bypass the proxy), and the method must not be private. The Sentinel annotation relies on the dynamic proxy mechanism of Spring AOP.
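Why internal calls are not intercepted can be shown without Spring at all. The sketch below uses a plain JDK dynamic proxy as a stand-in for the proxy that Spring AOP creates. The names SelfInvocationDemo, Greeter, PlainGreeter, and bonjourTwice are hypothetical, and Spring may use a different proxying technique internally, but the effect on self-invocation is the same: a call made through "this" never reaches the interceptor.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Illustrates proxy-based interception and why self-invocation bypasses it. */
final class SelfInvocationDemo {

    interface Greeter {
        String bonjour(String name);
        String bonjourTwice(String name);
    }

    static final class PlainGreeter implements Greeter {
        public String bonjour(String name) {
            return "Bonjour, " + name;
        }

        // Internal calls go through "this", not through the proxy.
        public String bonjourTwice(String name) {
            return bonjour(name) + " / " + bonjour(name);
        }
    }

    /** Wraps the target in a JDK dynamic proxy that records every intercepted method name. */
    static Greeter proxied(Greeter target, List<String> intercepted) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            intercepted.add(method.getName());  // a real aspect would apply its rules here
            return method.invoke(target, methodArgs);
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(), new Class<?>[] {Greeter.class}, handler);
    }

    public static void main(String[] args) {
        List<String> intercepted = new ArrayList<>();
        Greeter greeter = proxied(new PlainGreeter(), intercepted);
        greeter.bonjour("Sentinel");       // goes through the proxy
        greeter.bonjourTwice("Sentinel");  // the two inner bonjour calls are not intercepted
        System.out.println(intercepted);
    }
}
```

Only the two calls made on the proxy object are recorded. The two bonjour calls made inside bonjourTwice are invisible to the interceptor, which is why an annotated method called from within its own class is not protected.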

Configure custom flow control processing logic

The various Sentinel adapters support custom flow control handling logic. Taking the Spring Web adapter as an example, we only need to provide a custom BlockExceptionHandler implementation and register it as a bean to customize the handling for Web tracking points. BlockExceptionHandler is defined as follows:

public interface BlockExceptionHandler {

    // Handle the blocked request here, for example redirect to a page or write a custom response.
    void handle(HttpServletRequest request, HttpServletResponse response, BlockException e) throws Exception;
}

Our Web application provides an example of custom flow control handling for Web tracking points:

@Configuration
public class SentinelWebConfig {

    @Bean
    public BlockExceptionHandler sentinelBlockExceptionHandler() {
        return (request, response, e) -> {
            // 429 Too Many Requests
            response.setStatus(429);

            // Write the type of the blocking exception to the response body.
            PrintWriter out = response.getWriter();
            out.print("Oops, blocked by Sentinel: " + e.getClass().getSimpleName());
            out.flush();
        };
    }
}

The handler writes the type of the blocking exception to the response and returns status code 429. You can configure a redirect or a custom response based on your business needs.

For annotation-based tracking, as mentioned in the previous section, you can specify a fallback function to handle throttling and business exceptions, so we do not repeat it here. For the Dubbo adapter, you can register provider/consumer fallbacks through DubboAdapterGlobalConfig to provide custom handling logic. For the Spring Cloud Gateway adapter, you can register a custom BlockRequestHandler implementation to customize the handling for gateway throttling.

Support for other Spring Cloud components

Spring Cloud Alibaba Sentinel also supports other commonly used Spring Cloud components, including RestTemplate and Feign. Due to limited space, we do not walk through them here. For more information, see the Spring Cloud Alibaba documentation.

How to select a throttling degradation component

At this point, you may ask: what are the advantages and disadvantages of Sentinel compared with similar products (such as Hystrix)? Is it necessary to migrate to Sentinel? How can we migrate quickly? The following is a comparison between Sentinel and other fault-tolerance components:

[Comparison table: Sentinel, Hystrix, resilience4j]


This article described the importance of throttling and degradation as high-availability protection methods and introduced the core features and principles of Sentinel. Through hands-on practice, we learned how to quickly integrate SCA Sentinel to apply throttling and degradation to microservices. Sentinel has many more advanced features to explore, such as hotspot protection and cluster flow control. For more information, see the official Sentinel documentation.

Is throttling protection unnecessary if the service is small? Is circuit-breaking protection unnecessary because the microservice architecture is simple? In fact, the need has nothing to do with request volume or architectural complexity. Often it is the failure of a very marginal service that affects the overall business and causes huge losses. We need to design for failure: plan capacity and sort out strong and weak dependencies during normal times, configure throttling and degradation rules sensibly, and put protection in place in advance instead of fixing problems after they occur online.

We also provide AHAS Sentinel, the enterprise edition of Sentinel, on Alibaba Cloud, which offers out-of-the-box enterprise-level high-availability protection. Compared with the open-source version, AHAS also provides additional professional capabilities.

You are welcome to try the enterprise edition of Sentinel on the cloud, and you are also welcome to contribute to the community and help it evolve.

If you have any questions about the document, please leave a message in the comment area!
