API Gateway: Configure policies and plug-ins

Last Updated: Dec 04, 2025

AI Gateway lets you add policies and configure plug-ins for Agent APIs to improve their security, performance, and maintainability.

Procedure

  1. Go to the Instance page in the AI Gateway console and select the region where your instance is located.

  2. On the target instance page, click Agent API. In the Agent API list, click the name of the target API to open its details page.

  3. Select the Policies and Plug-ins tab and click Enable Policy/Plug-in.

  4. In the Enable Policy/Plug-in panel, select Policy or Plug-in.

Policy configuration

Concurrency control

Concurrency control rules count the total number of requests being processed by the gateway. When this number reaches a specified threshold, the gateway immediately blocks traffic. You can set this threshold to the maximum number of concurrent requests that your backend service can handle. This protects the availability of your backend service during periods of high concurrency.

Procedure

On the Add Policy tab, click the Concurrency Control card. In the Add Policy: Concurrency Control panel, configure the parameters.

Configuration item

Description

Enable

If enabled, the concurrency control rule takes effect.

Overall Concurrency Threshold

Set the Overall Concurrency Threshold, which is the maximum number of requests that the gateway processes at the same time.

Web Fallback Behavior

Return Specified Content

HTTP Status Code

Set the HTTP Status Code. The default value is 429.

Return Content-type

Set Return Content-type to Plain Text or JSON.

HTTP Response Body

Enter the response body text.

Redirect To A Specific Page

Redirect URL

Enter the Redirect URL.
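
The following Python sketch illustrates the idea behind concurrency control with the Return Specified Content fallback. It is not the gateway's implementation. The `ConcurrencyGate` class and its method names are hypothetical and exist only for this illustration.

```python
import threading


class ConcurrencyGate:
    """Rejects new requests once `threshold` requests are already in flight."""

    def __init__(self, threshold, status=429, body="Too many concurrent requests"):
        self.threshold = threshold  # Overall Concurrency Threshold
        self.status = status        # fallback HTTP status code (429 by default)
        self.body = body            # fallback response body (hypothetical text)
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_enter(self):
        """Return True and count the request, or False if the gate is full."""
        with self._lock:
            if self._in_flight >= self.threshold:
                return False
            self._in_flight += 1
            return True

    def leave(self):
        """Mark one in-flight request as finished."""
        with self._lock:
            self._in_flight -= 1

    def handle(self, backend):
        """Call `backend()` if admitted, otherwise return the fallback response."""
        if not self.try_enter():
            return self.status, self.body
        try:
            return 200, backend()
        finally:
            self.leave()
```

In this sketch, a request that arrives while `threshold` requests are still being processed receives the fallback status and body instead of reaching the backend.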

Traffic shaping

Traffic shaping rules monitor the queries per second (QPS) of an API. When the QPS reaches a specified threshold, the gateway immediately blocks traffic. This prevents sudden traffic spikes from overwhelming the backend service and ensures high availability.

Procedure

On the Add Policy tab, click the Traffic Shaping card. In the Add Policy: Traffic Shaping panel, configure the parameters.

Configuration item

Description

Enable

If enabled, the traffic shaping rule takes effect.

Overall QPS Threshold

Set the Overall QPS Threshold, which is the maximum number of queries per second that the gateway allows for the API.

Web Fallback Behavior

Return Specified Content

HTTP Status Code

Set the HTTP Status Code. The default value is 429.

Return Content-type

Set Return Content-type to Plain Text or JSON.

HTTP Response Body

Enter the response body text.

Redirect To A Specific Page

Redirect URL

Enter the Redirect URL.
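
As a rough illustration of a QPS threshold, the following Python sketch counts requests in fixed one-second windows and rejects requests once the threshold is reached. The counting algorithm that the gateway actually uses is not described here, so the fixed-window approach is an assumption. The `QpsLimiter` name and its `clock` parameter are hypothetical.

```python
import time


class QpsLimiter:
    """Fixed-window counter: at most `qps_threshold` requests per second."""

    def __init__(self, qps_threshold, clock=time.monotonic):
        self.qps_threshold = qps_threshold  # Overall QPS Threshold
        self._clock = clock                 # injectable for testing
        self._window_start = None
        self._count = 0

    def allow(self):
        """Return True if the request is admitted, False if it is blocked."""
        now = self._clock()
        # Start a new one-second window when the previous one has elapsed.
        if self._window_start is None or now - self._window_start >= 1.0:
            self._window_start = now
            self._count = 0
        if self._count >= self.qps_threshold:
            return False
        self._count += 1
        return True
```

A blocked request would then receive the configured fallback, such as the default 429 status code or a redirect.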

Circuit breaking policy

Circuit breaking rules monitor the response time or error rate of an API. When a threshold is reached, the gateway immediately trips the circuit. For a specified period, the gateway stops calling the unstable resource. This prevents the backend service from being affected and ensures its high availability. After the specified time, the gateway resumes calls to the resource.

Procedure

On the Add Policy tab, click the Circuit Breaking card. In the Add Policy: Circuit Breaking panel, configure the parameters.

Configuration item

Description

Enable

If enabled, the circuit breaking rule takes effect.

Statistics Window Duration

The length of the time window for statistics. The value can be from 1 second to 120 minutes.

Minimum Number Of Requests

The minimum number of requests required to trigger circuit breaking. If the number of requests in the current statistics window is less than this value, the rule is not triggered, even if the circuit breaking conditions are met.

Threshold Type

Select Slow Call Ratio (%) or Error Ratio (%) as the threshold.

  1. If you select Slow Call Ratio (%), you must also set Slow Call RT, which is the maximum allowed response time. A request is counted as a slow call if its response time is greater than this value. Set the slow call ratio that triggers circuit breaking in Circuit Breaking Ratio Threshold. After the rule is enabled, if the number of requests in the statistics window reaches the minimum number of requests and the slow call ratio exceeds the threshold, requests are automatically blocked for the circuit breaking duration. After the circuit breaking duration, the circuit breaker enters a probing recovery state. If the response time of the next request is less than Slow Call RT, circuit breaking ends. If it is greater than Slow Call RT, the circuit is broken again.

  2. If you select Error Ratio (%), set the error ratio that triggers circuit breaking in Circuit Breaking Ratio Threshold. After the rule is enabled, if the number of requests in the statistics window reaches the minimum number of requests and the error ratio exceeds the threshold, requests are automatically blocked for the circuit breaking duration.

Slow Call RT

Set the allowed Slow Call RT (maximum response time).

Circuit Breaking Ratio Threshold

The slow call ratio or error ratio threshold that triggers circuit breaking, depending on the selected Threshold Type. The value can be from 0 to 100, which represents 0% to 100%.

Circuit Breaking Duration (s)

The duration for which the circuit remains broken after being triggered. After a resource enters the circuit breaking state, requests fail fast during the configured circuit breaking duration.

Web Fallback Behavior

Return Specified Content

HTTP Status Code

Set the HTTP Status Code. The default value is 429.

Return Content-type

Set Return Content-type to Plain Text or JSON.

HTTP Response Body

Enter the response body text.

Redirect To A Specific Page

Redirect URL

Enter the Redirect URL.
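
The slow call ratio behavior described above can be sketched as a small state machine. This Python example is a simplified illustration, not the gateway's implementation. The `SlowCallBreaker` class, its parameter names, and the state names are hypothetical. The sketch treats the minimum number of requests as "at least", which is an assumption.

```python
import time


class SlowCallBreaker:
    """Trips when the slow call ratio in a statistics window exceeds a threshold."""

    def __init__(self, window_s, min_requests, slow_rt_s, ratio_threshold,
                 break_s, clock=time.monotonic):
        self.window_s = window_s                # Statistics Window Duration
        self.min_requests = min_requests        # Minimum Number Of Requests
        self.slow_rt_s = slow_rt_s              # Slow Call RT, in seconds here
        self.ratio_threshold = ratio_threshold  # 0 to 100 (%)
        self.break_s = break_s                  # Circuit Breaking Duration (s)
        self._clock = clock
        self._state = "closed"
        self._opened_at = 0.0
        self._window_start = clock()
        self._total = 0
        self._slow = 0

    def allow(self):
        """Return True if a request may be sent to the backend."""
        if self._state == "open":
            if self._clock() - self._opened_at < self.break_s:
                return False         # fail fast during the breaking duration
            self._state = "probing"  # let the next request through as a probe
        return True

    def record(self, rt_s):
        """Record the response time of a completed request."""
        now = self._clock()
        slow = rt_s > self.slow_rt_s
        if self._state == "open":
            return                   # ignore late results while the circuit is open
        if self._state == "probing":
            if slow:
                self._trip(now)      # probe was slow: break the circuit again
            else:
                self._reset(now)     # probe was fast: end circuit breaking
            return
        if now - self._window_start >= self.window_s:
            self._reset(now)         # start a new statistics window
        self._total += 1
        self._slow += slow
        ratio = 100.0 * self._slow / self._total
        if self._total >= self.min_requests and ratio > self.ratio_threshold:
            self._trip(now)

    def _trip(self, now):
        self._state, self._opened_at = "open", now

    def _reset(self, now):
        self._state = "closed"
        self._window_start, self._total, self._slow = now, 0, 0
```

An error ratio breaker would follow the same structure, with `record` taking a success or failure flag instead of a response time.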

IP blacklist and whitelist policy

The IP blacklist and whitelist policy controls client access to services based on a pre-configured list of allowed (whitelist) or denied (blacklist) IP addresses.

Procedure

On the Add Policy tab, click the IP Blacklist/Whitelist card. In the Add Policy: IP Blacklist/Whitelist panel, configure the parameters.

Parameter

Description

Enable

If enabled, the IP blacklist and whitelist policy takes effect.

Name

A custom name used to distinguish and manage multiple policies.

Notes

A description of the policy for easy identification and management.

Type

Specifies whether the list is a whitelist or a blacklist.

  • Whitelist: Allows access only from specified IP addresses. All other IP addresses are denied by default.

  • Blacklist: Blocks access from specific IP addresses. All other IP addresses are allowed by default.

IP Address/CIDR Block

Configure the list of IP addresses or CIDR blocks to allow or deny. Multiple entries are supported. Use a format such as 192.168.1.1/24.
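
To make the whitelist and blacklist semantics concrete, the following Python sketch checks a client address against a list of IP addresses and CIDR blocks by using the standard `ipaddress` module. The `is_allowed` function and the `"whitelist"` and `"blacklist"` strings are hypothetical names for this illustration, not gateway parameters.

```python
import ipaddress


def is_allowed(client_ip, list_type, entries):
    """Apply a whitelist or blacklist of IP addresses and CIDR blocks."""
    ip = ipaddress.ip_address(client_ip)
    # strict=False accepts entries with host bits set, such as 192.168.1.1/24.
    networks = [ipaddress.ip_network(e, strict=False) for e in entries]
    matched = any(ip in net for net in networks)
    if list_type == "whitelist":
        return matched        # only listed addresses are allowed
    if list_type == "blacklist":
        return not matched    # only listed addresses are denied
    raise ValueError(f"unknown list type: {list_type}")
```

For example, with the entries `192.168.1.1/24` and `10.0.0.5`, a whitelist admits `192.168.1.77` but denies `172.16.0.1`, and a blacklist does the opposite.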

Timeout policy

AI Gateway provides API-level timeout settings. You can configure the maximum time the gateway waits for a response from a backend service for a specific API. If the gateway does not receive a response from the backend service within the specified time, it returns an HTTP status code of 504 (Gateway Timeout) to the client.

Procedure

On the Add Policy tab, click the Timeout card. In the Add Policy: Timeout panel, configure the parameters.

Note

After you configure and enable the timeout policy, verify that the timeout rule works as expected.

Parameter

Description

Enable

Specifies whether to enable the timeout policy.

  • Enable: The gateway API timeout policy takes effect.

  • Disable: The gateway API timeout policy is disabled.

Timeout Period

Set the timeout period for the current API in seconds.

Note

If you set this parameter to 0 or disable the timeout policy, the gateway waits indefinitely for a response.
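
The timeout behavior can be illustrated with a short Python sketch. It runs a backend call in a worker thread and returns 504 if no result arrives in time, treating a timeout period of 0 as no limit. The `call_with_timeout` and `slow_backend` names are hypothetical, and the sketch does not reflect how the gateway is implemented internally.

```python
import concurrent.futures
import time


def call_with_timeout(backend, timeout_s):
    """Return (status, body); 504 if `backend` exceeds `timeout_s` seconds."""
    # A timeout period of 0 means the caller waits indefinitely.
    limit = None if timeout_s == 0 else timeout_s
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(backend)
    try:
        return 200, future.result(timeout=limit)
    except concurrent.futures.TimeoutError:
        return 504, "Gateway Timeout"
    finally:
        # Do not block on a backend call that is still running.
        pool.shutdown(wait=False)


def slow_backend():
    """Hypothetical backend that takes 0.3 seconds to answer."""
    time.sleep(0.3)
    return "late"
```

Calling `call_with_timeout(slow_backend, 0.05)` yields the 504 result, while `call_with_timeout(slow_backend, 0)` waits for the backend to finish.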

Retry policy

AI Gateway provides API-level retry settings that allow you to automatically retry failed requests. You can configure the conditions that trigger a retry, such as a connection failure, an unavailable backend service, or a specific HTTP status code.

API retry conditions

When the backend service returns a 5xx error, AI Gateway automatically retries the failed request based on the configured number of retries.

  • The retry conditions for the HTTP Protocol are as follows:

    • 5xx: If the backend service returns any 5xx response, if the connection is lost or reset, or if a read timeout occurs, AI Gateway attempts to retry the failed request.

      Note

      5xx includes the conditions for connect-failure and refused-stream.

    • reset: If the connection is lost or reset, or if a read timeout occurs, AI Gateway attempts to retry the failed request.

    • connect-failure: If a connection to the backend service cannot be established, AI Gateway attempts to retry the failed request.

    • refused-stream: If the backend service resets the stream with a REFUSED_STREAM error code, AI Gateway attempts to retry the failed request.

    • retriable-status-codes: If the HTTP status code of the backend service response matches one of the specified retry status codes, AI Gateway attempts to retry the request.

      Note

      You can use retry status codes only if you specify retriable-status-codes in the retry conditions.

  • The retry conditions for the gRPC Protocol are as follows:

    • cancelled: If the gRPC status code in the response header from the backend gRPC service is cancelled, AI Gateway attempts to retry the request.

    • deadline-exceeded: If the gRPC status code in the response header from the backend gRPC service is deadline-exceeded, AI Gateway attempts to retry the request.

    • internal: If the gRPC status code in the response header from the backend gRPC service is internal, AI Gateway attempts to retry the request.

    • resource-exhausted: If the gRPC status code in the response header from the backend gRPC service is resource-exhausted, AI Gateway attempts to retry the request.

    • unavailable: If the gRPC status code in the response header from the backend gRPC service is unavailable, AI Gateway attempts to retry the request.
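
The HTTP retry conditions listed above can be read as a decision function. The following Python sketch is one possible interpretation, not the gateway's matching logic. The `should_retry` function and its `failure` labels are hypothetical. The sketch treats `"reset"` as covering a lost or reset connection and a read timeout.

```python
def should_retry(conditions, failure=None, status=None, retry_status_codes=()):
    """Decide whether one failed HTTP attempt matches the retry conditions.

    failure: None, "connect-failure", "refused-stream", or "reset".
    status:  HTTP status code returned by the backend, if any.
    """
    if "5xx" in conditions:
        # 5xx covers any 5xx response plus reset, connect-failure,
        # and refused-stream failures.
        if failure is not None or (status is not None and 500 <= status <= 599):
            return True
    if failure is not None and failure in conditions:
        return True
    # Retry status codes apply only with the retriable-status-codes condition.
    if "retriable-status-codes" in conditions and status in retry_status_codes:
        return True
    return False
```

For example, `should_retry({"reset"}, status=503)` returns `False` because a 503 response is not a connection reset, while `should_retry({"5xx"}, status=503)` returns `True`.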

Procedure

On the Add Policy tab, click the Retry card. In the Add Policy: Retry panel, configure the parameters.

Note

After you configure and enable the retry policy, verify that the retry rule works as expected.

Parameter

Description

Enable

Specifies whether to enable the retry policy.

  • Enable: The gateway API retry policy takes effect.

  • Disable: The gateway API retry policy does not take effect.

    When the retry policy is disabled, the gateway still applies a default internal retry configuration. By default, the number of retries is 2 and the retry conditions are connect-failure, refused-stream, unavailable, cancelled, non_idempotent, and retriable-status-codes.

Number Of Retries

The maximum number of retries for a failed request. You can set this parameter to an integer from 0 to 10. We recommend that you set this parameter to 0, 1, or 2.

If you set this parameter to 0, failed requests are not retried.

Retry Conditions

Select the appropriate conditions. You can select multiple conditions.

Retry Status Codes

Retry the request for responses with specific HTTP status codes. You can configure multiple HTTP status codes.

Important

You can configure Retry Status Codes only if you specify retriable-status-codes for Retry Conditions.
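
To show how Number Of Retries bounds the total number of attempts, the following Python sketch retries a call until it succeeds or the retry budget is used up. The `call_with_retries` function and its parameters are hypothetical names for this illustration.

```python
def call_with_retries(attempt, max_retries, is_retriable):
    """Call `attempt()` once, then retry up to `max_retries` more times.

    attempt:      callable that returns an HTTP status code.
    is_retriable: callable that decides whether a status should be retried.
    Returns (final_status, number_of_attempts_made).
    """
    if not 0 <= max_retries <= 10:
        raise ValueError("max_retries must be an integer from 0 to 10")
    attempts = 0
    while True:
        status = attempt()
        attempts += 1
        # Stop on a non-retriable result or when the retry budget is spent.
        if not is_retriable(status) or attempts > max_retries:
            return status, attempts
```

With `max_retries` set to 2, a backend that fails twice and then succeeds is called three times in total. With `max_retries` set to 0, the first result is returned without any retry.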

Header modification policy

The header modification feature lets you modify the headers in the original request before it is forwarded to the backend service, or in the response from the backend service before it is returned to the client.

Procedure

On the Add Policy tab, click the Header Modification card. In the Add Policy: Header Modification panel, configure the parameters.

Configuration item

Description

Enable

Specifies whether to enable the header modification policy.

  • Enable: If enabled, the gateway controls the request and response headers.

  • Disable: If disabled, the gateway does not control the request and response headers.

Header Type

Select the header type.

  • Request: Modifies the request header.

  • Response: Modifies the response header.

Operation Type

Select the operation type.

  • Add: Adds a header to the request or response.

    Note

    If the header to be added already exists, the new header value is appended to the existing value, separated by a comma (,).

  • Modify: Modifies a specified header in the request or response.

    Note

    • If the specified header does not exist, it is added with the specified header key and value.

    • If the specified header exists, its value is overwritten.

  • Delete: Deletes a specified header from the request or response.

Header Key

Enter the name of the request or response header.

Header Value

Enter the value of the request or response header.
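
The Add, Modify, and Delete semantics described above can be sketched in Python as follows. The `apply_header_rule` function is a hypothetical helper. Matching header names without regard to case is an assumption based on HTTP conventions, not a documented gateway behavior.

```python
def apply_header_rule(headers, operation, key, value=None):
    """Return a copy of `headers` with one Add, Modify, or Delete rule applied."""
    result = dict(headers)
    # Assumption: header names are matched case-insensitively.
    existing = next((k for k in result if k.lower() == key.lower()), None)
    if operation == "Add":
        if existing is None:
            result[key] = value
        else:
            # Append to the existing value, separated by a comma.
            result[existing] = f"{result[existing]},{value}"
    elif operation == "Modify":
        # Overwrite the header, or add it if it does not exist.
        result[existing if existing is not None else key] = value
    elif operation == "Delete":
        if existing is not None:
            del result[existing]
    else:
        raise ValueError(f"unknown operation: {operation}")
    return result
```

For example, applying Add with the key `X-Env` and the value `blue` to `{"X-Env": "prod"}` produces `{"X-Env": "prod,blue"}`, while Modify with the same key and value produces `{"X-Env": "blue"}`.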

Plug-in configuration

  1. Click the Add Plug-in tab.

  2. In the Quick Navigation section, select a plug-in type or search for a plug-in by name, and then click the card for the desired plug-in:

    • If the plug-in is not installed, click Install and Configure in the pop-up window. Then, configure the plug-in rules and enable it.

    • If the plug-in is already installed, configure its rules and enable it in the pop-up window.

  3. Click OK. You are returned to the More Policies and Plug-ins page, where you can view the attachment and enabled status of the plug-in for the API.