API Gateway: Circuit breaker plug-ins

Last Updated: Jul 10, 2023

1. Overview

API Gateway provides a circuit breaker for each API to protect the API when its backend performs abnormally. By default, the circuit breaker trips if the backend of an API times out 1,000 times within 30 seconds. The circuit breaker then stays open for 90 seconds, during which all requests to the API receive the following error: Status=503, X-Ca-Error-Code=D503CB. After 90 seconds, the circuit breaker enters the half-open state and allows a limited number of concurrent API requests to pass through. If these requests succeed, the circuit breaker closes and API requests are handled as expected again.

You can also bind a plug-in of the Circuit Breaker type to an API to customize the configurations of its circuit breaker. Take note that circuit breaker plug-ins take effect only for APIs in dedicated instances. You can customize the following configurations of a circuit breaker:

  • The condition under which the circuit breaker trips. You can specify that the circuit breaker trips after the number of occurrences of timeout, long response, or another specified error at the backend reaches a threshold within a specified period of time.

  • The time window during which the number of occurrences of timeout, the number of occurrences of long response, or the number of occurrences of a specified error, at the backend is checked by the circuit breaker to determine whether to trip.

  • The period of time during which the circuit breaker stays open after it trips.

  • The backend to which API requests are directed when the circuit breaker is open.
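The default behavior described at the start of this section can be sketched as a small state machine. This is only an illustration under stated assumptions, not the API Gateway implementation: the class name CircuitBreaker and its methods are hypothetical, the limit on concurrent trial requests in the half-open state is not modeled, and reopening the breaker when a trial request times out is an assumption that the text above does not confirm.

```python
import time


class CircuitBreaker:
    """Hypothetical closed/open/half-open breaker driven by a timeout counter."""

    def __init__(self, threshold=1000, window=30, open_seconds=90, clock=time.monotonic):
        self.threshold = threshold        # timeouts needed to trip (default from the text)
        self.window = window              # counting window, in seconds
        self.open_seconds = open_seconds  # how long the breaker stays open
        self.clock = clock                # injectable so the sketch can be tested
        self.state = "closed"
        self.timeouts = []                # timestamps of recent backend timeouts
        self.opened_at = None

    def allow_request(self):
        """Return True if a request may be forwarded to the backend."""
        if self.state == "open":
            if self.clock() - self.opened_at < self.open_seconds:
                return False              # caller would answer 503 / D503CB
            self.state = "half_open"      # open period elapsed: allow trial requests
        return True

    def record(self, timed_out):
        """Record the outcome of a request that was forwarded to the backend."""
        now = self.clock()
        if self.state == "open":
            return                        # nothing is forwarded while open
        if self.state == "half_open":
            if timed_out:
                self._trip(now)           # assumption: a failed trial reopens the breaker
            else:
                self.state = "closed"     # trial succeeded: resume normal handling
                self.timeouts.clear()
            return
        if timed_out:
            self.timeouts.append(now)
            # Keep only the timeouts that fall inside the counting window.
            self.timeouts = [t for t in self.timeouts if now - t < self.window]
            if len(self.timeouts) >= self.threshold:
                self._trip(now)

    def _trip(self, now):
        self.state = "open"
        self.opened_at = now
        self.timeouts.clear()
```

A caller would invoke allow_request() before forwarding each request and record() after the backend responds or times out.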

2. Configurations

Circuit breaker plug-ins take effect only for APIs in dedicated instances. If you bind a circuit breaker plug-in to an API in a shared instance, the circuit breaker still uses the default configurations.

2.1 Specify that the circuit breaker trips after the number of occurrences of timeout at the backend reaches a threshold

When you configure a circuit breaker plug-in, you can specify that the circuit breaker trips after the number of occurrences of timeout at the backend reaches a threshold within a specified period of time. For example, if the backend timeout period configured for an API is 10 seconds and the backend returns no response within 10 seconds, one occurrence of timeout is counted.

timeoutThreshold: 15         # The threshold of the number of occurrences of timeout at the backend.
windowInSeconds: 30          # The time window during which the number of occurrences of timeout at the backend is checked by the circuit breaker to determine whether to trip.
openTimeoutSeconds: 15       # The period of time during which the circuit breaker stays open after it trips.
downgradeBackend:            # The backend to which API requests are directed when the circuit breaker is open.
  type: mock
  statusCode: 418

In the preceding code snippet, you can specify the following parameters:

  • timeoutThreshold: the threshold of the number of occurrences of timeout at the backend. If this threshold is reached, the circuit breaker trips. The maximum value of this parameter is 5000. We recommend that you specify an appropriate value. If the value is too small, the circuit breaker trips frequently, after only a few timeouts.

  • windowInSeconds: the time window during which the number of occurrences of timeout at the backend is checked by the circuit breaker to determine whether to trip. Valid values: 10 to 90. Unit: seconds.

  • openTimeoutSeconds: the period of time during which the circuit breaker stays open after it trips. Valid values: 15 to 300. Unit: seconds.

  • downgradeBackend: optional. The backend to which API requests are directed when the circuit breaker is open.
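The value ranges listed above can be checked before a plug-in is saved. The following is a minimal sketch, assuming the configuration has already been parsed into a Python dict. The function name validate_timeout_rule is hypothetical, and the lower bound of 1 for timeoutThreshold is an assumption because the text above states only the maximum.

```python
def validate_timeout_rule(config):
    """Return a list of problems found in a timeout-based rule (empty if none)."""
    problems = []

    threshold = config.get("timeoutThreshold")
    # Assumption: the lower bound of 1 is not stated in the documentation.
    if not isinstance(threshold, int) or not 1 <= threshold <= 5000:
        problems.append("timeoutThreshold must be an integer from 1 to 5000")

    window = config.get("windowInSeconds")
    if not isinstance(window, int) or not 10 <= window <= 90:
        problems.append("windowInSeconds must be an integer from 10 to 90")

    open_seconds = config.get("openTimeoutSeconds")
    if not isinstance(open_seconds, int) or not 15 <= open_seconds <= 300:
        problems.append("openTimeoutSeconds must be an integer from 15 to 300")

    # downgradeBackend is optional, so its absence is not reported.
    return problems
```

For the sample in this section (timeoutThreshold 15, windowInSeconds 30, openTimeoutSeconds 15), the function returns an empty list.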

2.2 Specify that the circuit breaker trips after the number of occurrences of long response at the backend reaches a threshold

When you configure a circuit breaker plug-in, you can specify that the circuit breaker trips after the number of occurrences of long response at the backend reaches a threshold within a specified period of time. The backend response time is the duration between when API Gateway sends a request to the backend and when API Gateway receives a response from the backend.

---
errorThreshold: 10                            # The threshold of the number of occurrences of long response at the backend.
windowInSeconds: 60                           # The time window during which the number of occurrences of long response at the backend is checked by the circuit breaker to determine whether to trip.
openTimeoutSeconds: 120                       # The period of time during which the circuit breaker stays open after it trips.
errorCondition: "$LatencyMilliSeconds > 500"  # The conditional expression that is used to determine whether the backend response is counted as a long response. In this example, a backend response that takes longer than 500 ms is counted as a long response.
downgradeBackend:               # The backend to which API requests are directed when the circuit breaker is open.
  type: mock
  statusCode: 403

In the preceding code snippet, you can specify the following parameters:

  • errorThreshold: the threshold of the number of occurrences of long response.

  • windowInSeconds: the time window during which the number of occurrences of long response at the backend is checked by the circuit breaker to determine whether to trip. Valid values: 10 to 90. Unit: seconds.

  • openTimeoutSeconds: the period of time during which the circuit breaker stays open after it trips. Valid values: 15 to 300. Unit: seconds.

  • errorCondition: the conditional expression that is used to determine whether the backend response is counted as a long response. You can use the $LatencyMilliSeconds and $LatencySeconds variables. The unit of $LatencyMilliSeconds is milliseconds. The unit of $LatencySeconds is seconds.

  • downgradeBackend: optional. The backend to which API requests are directed when the circuit breaker is open.

2.3 Specify that the circuit breaker trips after the number of occurrences of a specified error at the backend reaches a threshold

When you configure a circuit breaker plug-in, you can specify that the circuit breaker trips after the number of occurrences of a specified error at the backend reaches a threshold within a specified period of time.

errorCondition: "$StatusCode == 503"  # The conditional expression that specifies the error whose number of occurrences is checked by the circuit breaker to determine whether to trip.
errorThreshold: 1000                  # The threshold of the number of occurrences of the specified error.
windowInSeconds: 30                   # The time window during which the number of occurrences of the specified error at the backend is checked by the circuit breaker to determine whether to trip.
openTimeoutSeconds: 15                # The period of time during which the circuit breaker stays open after it trips.
downgradeBackend:                     # The backend to which API requests are directed when the circuit breaker is open.
  type: "HTTP"
  address: "http://api.foo.com"
  path: "/system-busy.json"
  method: GET

In the preceding code snippet, you can specify the following parameters:

  • errorCondition: the conditional expression that specifies the error whose number of occurrences is checked by the circuit breaker to determine whether to trip. You can use the $StatusCode and $LatencySeconds variables.

    • If you specify the conditional expression $StatusCode = 503 or $StatusCode = 504, the circuit breaker counts the backend responses whose HTTP status code is 503 or 504.

    • If you specify the conditional expression $LatencySeconds > 30, the circuit breaker counts the backend responses that take longer than 30 seconds.

  • errorThreshold: the threshold of the number of occurrences of the specified error.

  • windowInSeconds: the time window during which the number of occurrences of the specified error at the backend is checked by the circuit breaker to determine whether to trip. Valid values: 10 to 90. Unit: seconds.

  • openTimeoutSeconds: the period of time during which the circuit breaker stays open after it trips. Valid values: 15 to 300. Unit: seconds.

  • downgradeBackend: optional. The backend to which API requests are directed when the circuit breaker is open.
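How errorCondition, errorThreshold, and windowInSeconds work together can be illustrated with a small counter. This is a sketch under assumptions, not the gateway's algorithm: the class name ErrorWindow is hypothetical, the conditional expression is represented by a plain Python function instead of the plug-in's expression syntax, and a sliding window is assumed because the text above does not say how the window is measured.

```python
from collections import deque


class ErrorWindow:
    """Hypothetical counter of responses that match an error condition."""

    def __init__(self, condition, error_threshold, window_in_seconds):
        self.condition = condition              # stand-in for errorCondition
        self.error_threshold = error_threshold  # stand-in for errorThreshold
        self.window_in_seconds = window_in_seconds
        self.hits = deque()                     # timestamps of matching responses

    def observe(self, now, status_code, latency_seconds):
        """Record one backend response; return True if the threshold is reached."""
        if self.condition(status_code, latency_seconds):
            self.hits.append(now)
        # Drop matches that have fallen out of the window.
        while self.hits and now - self.hits[0] >= self.window_in_seconds:
            self.hits.popleft()
        return len(self.hits) >= self.error_threshold


# Stand-in for the expression "$StatusCode = 503 or $StatusCode = 504".
window = ErrorWindow(lambda status, latency: status in (503, 504), 3, 30)
```

A latency-based condition such as $LatencySeconds > 30 would be written as `lambda status, latency: latency > 30` in this sketch.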

2.4 Accurate status control

The API Gateway service is deployed on multiple nodes in a cluster to ensure high availability and performance. By default, each service node calculates and saves the circuit breaker status independently. As a result, the circuit breaker status may be inaccurate from a global perspective. If you require an accurate circuit breaker status, add the useGlobalState field to the plug-in configuration. Example:

---
timeoutThreshold: 15    # The threshold of the number of occurrences of timeout at the backend.
windowInSeconds: 30     # The time window during which the number of occurrences of timeout at the backend is checked by the circuit breaker to determine whether to trip.
openTimeoutSeconds: 15  # The period of time during which the circuit breaker stays open after it trips.
useGlobalState: true    # Accurate status control is enabled.
downgradeBackend:       # The backend to which API requests are directed when the circuit breaker is open.
  type: mock
  statusCode: 302
  body: |
    <result>
      <errorCode>I'm a teapot</errorCode>
    </result>

The default value of useGlobalState is false. If you set it to true, the circuit breaker status is accurate across nodes at the cost of some service performance. The performance loss does not compromise the promised queries per second (QPS) or service level agreement (SLA) metrics of the current instance.
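The difference between per-node status and global status can be illustrated with a toy calculation. This is a sketch only: the function tripped_nodes and the node names are hypothetical, and treating global state as the sum of per-node counts is an assumption, because this topic does not describe how nodes share state.

```python
def tripped_nodes(timeouts_per_node, timeout_threshold, use_global_state):
    """Return the sorted names of nodes whose circuit breaker would trip."""
    if use_global_state:
        # Assumption: all nodes evaluate the same cluster-wide count.
        total = sum(timeouts_per_node.values())
        return sorted(timeouts_per_node) if total >= timeout_threshold else []
    # Default: each node compares only its own count with the threshold.
    return sorted(
        node for node, count in timeouts_per_node.items()
        if count >= timeout_threshold
    )


# 17 timeouts in total, but no single node reaches the threshold of 15.
counts = {"node-a": 6, "node-b": 5, "node-c": 6}
```

With these counts, no breaker trips when each node counts independently, whereas all nodes trip when the counts are combined.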

2.5 Specify that the circuit breaker trips after the percentage of requests with a specified error or timeout reaches a threshold

API Gateway trips the circuit breaker when any of the following thresholds is reached:

  • errorThreshold: the threshold of the number of occurrences of the specific error at the backend. This field is used in combination with a conditional expression.

  • timeoutThreshold: the threshold of the number of occurrences of timeout at the backend.

  • errorThresholdByPercent: the threshold of the percentage of the number of requests in which the specific error occurs to the total number of requests in a time window.

  • timeoutThresholdByPercent: the threshold of the percentage of the number of requests in which timeout occurs to the total number of requests in a time window.

Example:

---
windowInSeconds: 30                  # The time window during which the circuit breaker determines whether to trip. Valid values: 10 to 90. Unit: seconds.
openTimeoutSeconds: 15               # The period of time during which the circuit breaker stays open after it trips. Valid values: 15 to 300. Unit: seconds.
errorThreshold: 90                   # The threshold of the number of occurrences of the specified error.
timeoutThreshold: 90                 # The threshold of the number of occurrences of timeout.
errorThresholdByPercent: 20          # The threshold of the percentage of requests in which the specified error occurs to the total number of requests.
timeoutThresholdByPercent: 20        # The threshold of the percentage of requests in which timeout occurs to the total number of requests.
errorCondition: "$StatusCode = 500"  # The error condition.
downgradeBackend:
  type: mock
  statusCode: 418
  body: |
    <result>
      <errorCode>I'm a teapot</errorCode>
    </result>

Important

If you use a percentage threshold, at least 100 requests must be received in a time window. Otherwise, the percentage rule does not take effect.

In this example, the errorThreshold: 90 and timeoutThreshold: 90 settings specify that the circuit breaker trips if the number of occurrences of the specified error or the number of occurrences of timeout exceeds 90 in a time window.

The errorThresholdByPercent: 20 and timeoutThresholdByPercent: 20 settings specify that the circuit breaker trips if the requests in which the specified error occurs, or the requests that time out, exceed 20% of all requests in a time window.
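The way count thresholds, percentage thresholds, and the 100-request minimum from the Important note interact can be sketched as follows. The function should_trip is hypothetical and is not the gateway's code. It assumes that a threshold is met when the value reaches it, whereas the explanation above says "exceeds", so the behavior at the exact boundary is an assumption.

```python
MIN_REQUESTS_FOR_PERCENT = 100  # percentage rules need at least 100 requests in the window


def should_trip(total, errors, timeouts, config):
    """Decide whether the breaker trips for one time window.

    total, errors, and timeouts are request counts observed in the window.
    config uses the field names from the example in this section.
    """
    never = float("inf")
    # Count-based rules apply regardless of traffic volume.
    if errors >= config.get("errorThreshold", never):
        return True
    if timeouts >= config.get("timeoutThreshold", never):
        return True
    # Percentage-based rules are ignored when traffic is too low.
    if total >= MIN_REQUESTS_FOR_PERCENT:
        if errors * 100 >= config.get("errorThresholdByPercent", never) * total:
            return True
        if timeouts * 100 >= config.get("timeoutThresholdByPercent", never) * total:
            return True
    return False


example = {
    "errorThreshold": 90,
    "timeoutThreshold": 90,
    "errorThresholdByPercent": 20,
    "timeoutThresholdByPercent": 20,
}
```

With the example configuration, 40 errors out of 50 requests do not trip the breaker because fewer than 100 requests were received, while 50 errors out of 200 requests (25%) do.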

2.6 Throttle requests when the circuit breaker trips

Once the circuit breaker trips, a temporary throttling configuration is added to the API, and all traffic is throttled based on this configuration when the circuit breaker is open or half open. Example:

---
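windowInSeconds: 10                         # The time window during which the number of occurrences of the specified error at the backend is checked by the circuit breaker. Valid values: 10 to 90. Unit: seconds.
openTimeoutSeconds: 15                      # The period of time during which the circuit breaker stays open after it trips.
errorThreshold: 3                           # The threshold of the number of occurrences of the specified error.
errorCondition: "$LatencyMilliSeconds > 1"  # The conditional expression that specifies the error to count.
downgradeTrafficLimit:                      # The throttling rule that applies when the circuit breaker is open or half open.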
  limit: 2
  period: MINUTE
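The effect of a rule such as limit: 2 with period: MINUTE can be illustrated with a simple fixed-window limiter. This is a sketch under assumptions: the class DowngradeLimiter is hypothetical, a fixed window aligned to multiples of the period is assumed, and only the MINUTE period from the example is mapped because this section lists no other period values.

```python
PERIOD_SECONDS = {"MINUTE": 60}  # only the period shown in the example above


class DowngradeLimiter:
    """Hypothetical fixed-window limiter for traffic while the breaker is open."""

    def __init__(self, limit, period):
        self.limit = limit
        self.period_seconds = PERIOD_SECONDS[period]
        self.current_window = None  # index of the window being counted
        self.count = 0              # requests admitted in that window

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) is admitted."""
        window = int(now // self.period_seconds)
        if window != self.current_window:
            # A new period has started: reset the counter.
            self.current_window = window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


limiter = DowngradeLimiter(limit=2, period="MINUTE")
```

In this sketch, the first two requests in a minute are admitted, further requests in the same minute are rejected, and the counter resets when the next minute starts.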

3. Configure the backend to which API requests are directed when the circuit breaker is open

You can set the downgradeBackend parameter to specify the backend to which API requests are directed when the circuit breaker is open. The configurations of the backend must be consistent with the API specification files that are imported to API Gateway. For more information, see Import Swagger files to create APIs. The following samples show the types of backends that you can configure:

  • HTTP

---
backend:
  type: HTTP
  address: "http://10.10.100.2:8000"
  path: "/users/{userId}"
  method: GET
  timeout: 7000

  • HTTP-VPC

---
backend:
  type: HTTP-VPC
  vpcAccessName: vpcAccess1
  path: "/users/{userId}"
  method: GET
  timeout: 10000

  • Function Compute

---
backend:
  type: FC
  fcRegion: cn-shanghai
  serviceName: fcService
  functionName: fcFunction
  arn: "acs:ram::111111111:role/aliyunapigatewayaccessingfcrole"

  • MOCK

---
backend:
  type: MOCK
  mockResult: "mock result sample"
  mockStatusCode: 200
  mockHeaders:
    - name: Content-Type
      value: text/plain
    - name: Content-Language
      value: zh-CN

4. Error codes

| Error code | HTTP status code | Message | Description |
| --- | --- | --- | --- |
| D503BB | 503 | Backend circuit breaker busy | The error message returned because the API is protected by its circuit breaker. |
| D503CB | 503 | Backend circuit breaker open, ${Reason} | The error message returned because the circuit breaker of the API is open. Test API calls after you check the backend performance of the API. |

5. Limits

  • Circuit breaker plug-ins take effect only for APIs in dedicated instances.

  • Each conditional expression can contain a maximum of 512 characters.

  • Each plug-in can contain a maximum of 50 KB of metadata.