AI Gateway lets you add policies and configure plugins at the API level to improve the security, performance, and maintainability of your APIs.
Policy configuration changes take effect immediately. You do not need to republish the API.
Procedure
- Go to the Instance page of the AI Gateway console. In the top navigation bar, select the region where your target instance is located, and then click the target instance ID.
- In the left-side navigation pane, click Model API, and then click the target API Name to go to the API Details page.
- Click the Policies and Plug-ins tab. In the More policies and plugins section, select where you want to configure the policy or plugin (Inbound Processing/Outbound Processing), and then click Enable Policy/Plug-in.
- In the Enable Policy/Plug-in panel, select and configure a policy or plugin. For more information, see Policy configurations and Plugin configurations.
Policy configurations
Concurrency control
Concurrency control protects your backend service by limiting the number of simultaneous requests. The gateway counts the requests it is processing, and when this count reaches a specified threshold, it immediately blocks subsequent traffic to prevent the backend service from being overloaded and to ensure its availability.
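The counting behavior described above can be illustrated with a minimal Python sketch. This is not the gateway's implementation; the class name `ConcurrencyLimiter` and its methods are hypothetical and only show the idea of rejecting requests once a threshold of in-flight requests is reached.

```python
import threading


class ConcurrencyLimiter:
    """Hypothetical limiter: rejects new requests once `max_concurrent` are in flight."""

    def __init__(self, max_concurrent: int) -> None:
        self.max_concurrent = max_concurrent
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Return True and count the request, or False if the threshold is reached."""
        with self._lock:
            if self._in_flight >= self.max_concurrent:
                return False
            self._in_flight += 1
            return True

    def release(self) -> None:
        """Call when a request finishes so that a new one can be admitted."""
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1
```

A caller would invoke `try_acquire()` before forwarding a request and `release()` once the response has been returned.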
Traffic shaping
Traffic shaping monitors an API's queries per second (QPS). When the QPS reaches a specified threshold, the gateway immediately blocks further requests. This prevents sudden traffic spikes from overwhelming the backend service and ensures high availability.
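As a rough illustration of a QPS threshold, the following Python sketch uses a fixed one-second window. The windowing strategy is an assumption made for brevity; the gateway may measure QPS differently. The class name `QpsLimiter` and the injectable `clock` parameter are hypothetical.

```python
import time
from typing import Callable


class QpsLimiter:
    """Hypothetical fixed-window limiter: at most `max_qps` requests per one-second window."""

    def __init__(self, max_qps: int, clock: Callable[[], float] = time.monotonic) -> None:
        self.max_qps = max_qps
        self._clock = clock  # injectable so the behavior can be tested without sleeping
        self._window_start = clock()
        self._count = 0

    def allow(self) -> bool:
        now = self._clock()
        # Start a new window once one second has elapsed.
        if now - self._window_start >= 1.0:
            self._window_start = now
            self._count = 0
        if self._count >= self.max_qps:
            return False  # threshold reached: block the request
        self._count += 1
        return True
```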
Circuit breaking policy
A circuit breaking policy protects your backend service by monitoring its response time or error ratio. When a specified threshold is exceeded, the gateway "trips the circuit" and stops sending requests to the service for a set duration. This action prevents cascading failures and ensures high availability. After the duration passes, the gateway cautiously resumes sending requests to the service.
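The sketch below illustrates the error-ratio variant of this pattern in Python. It is a simplified model under stated assumptions, not the gateway's implementation: the class `CircuitBreaker`, the `min_requests` parameter, and the single-probe "half-open" behavior used to model the cautious resumption are all hypothetical.

```python
import time
from typing import Callable


class CircuitBreaker:
    """Hypothetical breaker that trips when the error ratio exceeds a threshold."""

    def __init__(self, error_ratio_threshold: float, min_requests: int,
                 open_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.error_ratio_threshold = error_ratio_threshold
        self.min_requests = min_requests  # assumed: do not judge on too few samples
        self.open_seconds = open_seconds
        self._clock = clock
        self._state = "closed"
        self._total = 0
        self._errors = 0
        self._opened_at = 0.0

    def allow_request(self) -> bool:
        if self._state == "open":
            if self._clock() - self._opened_at < self.open_seconds:
                return False
            self._state = "half_open"  # duration passed: let one probe through
            return True
        if self._state == "half_open":
            return False  # a probe is already outstanding
        return True

    def record(self, success: bool) -> None:
        if self._state == "open":
            return  # ignore late results while the circuit is open
        if self._state == "half_open":
            # The probe result decides whether to close or re-open the circuit.
            if success:
                self._reset("closed")
            else:
                self._reset("open")
            return
        self._total += 1
        if not success:
            self._errors += 1
        ratio = self._errors / self._total
        if self._total >= self.min_requests and ratio > self.error_ratio_threshold:
            self._reset("open")

    def _reset(self, state: str) -> None:
        self._state = state
        self._total = 0
        self._errors = 0
        if state == "open":
            self._opened_at = self._clock()
```

A response-time trigger would follow the same structure, with `record` comparing a measured latency against a threshold instead of counting errors.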
IP blacklist and whitelist policy
The IP blacklist and whitelist policy controls client access to your services based on a pre-configured list of IP addresses that are either allowed (whitelist) or denied (blacklist).
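The matching logic can be sketched in Python with the standard `ipaddress` module. The function name `is_allowed`, the `mode` strings, and the support for CIDR entries are assumptions for illustration; check the console for the entry formats the policy actually accepts.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, entries: Iterable[str], mode: str) -> bool:
    """Hypothetical check of a client IP against a whitelist or blacklist.

    `entries` may contain single addresses or CIDR blocks (assumed format).
    """
    addr = ipaddress.ip_address(client_ip)
    matched = any(addr in ipaddress.ip_network(entry, strict=False) for entry in entries)
    if mode == "whitelist":
        return matched  # only listed addresses may pass
    if mode == "blacklist":
        return not matched  # listed addresses are denied
    raise ValueError(f"unknown mode: {mode}")
```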
Timeout policy
AI Gateway lets you set API-level timeouts, defining the maximum time the gateway waits for a response from a backend service. If no response is received within this period, the gateway returns an HTTP 504 Gateway Timeout status code to the client.
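To make the timeout behavior concrete, here is a small Python sketch that waits for a backend call for a limited time and otherwise answers with status 504. The helper names `call_with_timeout` and `slow_backend` are hypothetical, and a thread pool merely stands in for the gateway's upstream connection handling.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Tuple


def call_with_timeout(backend: Callable[[], str], timeout_seconds: float) -> Tuple[int, str]:
    """Return (200, body) if `backend` answers in time, otherwise (504, message)."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(backend)
    try:
        return 200, future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return 504, "Gateway Timeout"
    finally:
        # Do not wait for the slow call; its worker thread finishes in the background.
        executor.shutdown(wait=False)


def slow_backend() -> str:
    """Hypothetical backend that takes 0.3 seconds to answer."""
    time.sleep(0.3)
    return "late response"
```

For example, `call_with_timeout(slow_backend, 0.05)` gives up before the backend answers and returns the 504 result.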
Retry policy
AI Gateway can automatically retry failed requests at the API level. You can configure which conditions trigger a retry, such as a connection failure, an unavailable backend service, or a specific HTTP status code.
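The retry conditions can be illustrated with the following Python sketch. It assumes a hypothetical `send` callable that returns a `(status, body)` pair and raises `ConnectionError` on a connection failure; the default set of retryable status codes is an example, not the gateway's default.

```python
from typing import Callable, FrozenSet, Tuple

# Example status codes that trigger a retry (assumed for illustration).
DEFAULT_RETRY_STATUS: FrozenSet[int] = frozenset({502, 503, 504})


def call_with_retries(send: Callable[[], Tuple[int, str]], max_retries: int,
                      retry_on_status: FrozenSet[int] = DEFAULT_RETRY_STATUS) -> Tuple[int, str]:
    """Call `send`, retrying up to `max_retries` times on configured conditions."""
    attempts = 0
    while True:
        attempts += 1
        try:
            status, body = send()
        except ConnectionError:
            if attempts > max_retries:
                raise  # retries exhausted: surface the connection failure
            continue
        if status in retry_on_status and attempts <= max_retries:
            continue  # retryable status and retries remain
        return status, body
```

With `max_retries=2`, the backend is called at most three times: the original attempt plus two retries.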
Header modification policy
The header modification feature lets you modify request headers before they are sent to the backend service, and response headers before they are returned to the client.
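As an illustration, header modification can be modeled as a list of rules applied to a header map. The rule actions `set`, `add`, and `remove` and the function `apply_header_rules` are assumptions for this sketch; the actions the console actually offers may differ.

```python
from typing import Dict, Iterable, Tuple

Rule = Tuple[str, str, str]  # (action, header name, value); value is ignored for "remove"


def apply_header_rules(headers: Dict[str, str], rules: Iterable[Rule]) -> Dict[str, str]:
    """Return a new header map with the hypothetical rules applied in order."""
    # Header names are case-insensitive, so normalize them to lowercase.
    result = {name.lower(): value for name, value in headers.items()}
    for action, name, value in rules:
        key = name.lower()
        if action == "set":
            result[key] = value  # overwrite or create
        elif action == "add":
            # Append to an existing value, or create the header if it is missing.
            result[key] = f"{result[key]}, {value}" if key in result else value
        elif action == "remove":
            result.pop(key, None)
        else:
            raise ValueError(f"unknown action: {action}")
    return result
```

The same function could be applied to request headers during inbound processing and to response headers during outbound processing.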
Plugin configurations
- Click the Add Plug-in tab.
- In the Quick Navigation section, find the desired plugin by type or by searching for its name, and then click the plugin card:
  - If the plugin is not installed, click Install and Configure in the dialog box that appears. Then, configure the plugin rules and enable the plugin.
  - If the plugin is already installed, configure the plugin rules and enable the plugin in the dialog box that appears.
- Click OK. You are redirected to the API attachment list, where you can view the plugin's attachment status.
In the attachment list, the Inbound Processing section shows jwt-auth (disabled) and key-auth (enabled). The Outbound Processing section shows key-auth (enabled) and jwt-auth (disabled). The backend service address is test.static. The request flows from the front-end API, through Inbound Processing, to the backend service, and then through Outbound Processing.