API Gateway - Model API throttling policy adds request count and concurrency dimensions
Oct 28 2025
Target customers: All users who use the model proxy.
New features/specifications: The Model API throttling policy now supports two additional dimensions: the number of requests and concurrency (the number of parallel requests). In the text scenario, you can configure throttling based on the number of requests, concurrency, and the number of tokens. In all other scenarios, you can configure throttling based on the number of requests and concurrency. API-level throttling is also added: you can configure overall request count and concurrency thresholds for an API.
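
To illustrate how the three dimensions and the API-level thresholds could interact, here is a minimal Python sketch. It is not the API Gateway implementation or its configuration format: the names `ThrottlePolicy`, `Throttle`, and `admit`, the fixed 60-second counting window, and the rule that a request must pass every applicable limiter are all assumptions made for this example. The sketch is also not thread-safe.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThrottlePolicy:
    """Hypothetical policy. A value of None means the dimension is not limited."""
    max_requests: Optional[int] = None     # requests per window
    max_concurrency: Optional[int] = None  # requests in flight at the same time
    max_tokens: Optional[int] = None       # tokens per window (text scenario only)
    window_seconds: float = 60.0           # assumed window length


class Throttle:
    """Fixed-window counters for one policy (illustrative only)."""

    def __init__(self, policy, clock=time.monotonic):
        self.policy = policy
        self.clock = clock
        self.window_start = clock()
        self.requests = 0
        self.tokens = 0
        self.in_flight = 0

    def _roll_window(self):
        # Reset the per-window counters once the window has elapsed.
        if self.clock() - self.window_start >= self.policy.window_seconds:
            self.window_start = self.clock()
            self.requests = 0
            self.tokens = 0

    def allows(self, tokens=0):
        """Return True if one more request with `tokens` fits every limit."""
        self._roll_window()
        p = self.policy
        if p.max_requests is not None and self.requests + 1 > p.max_requests:
            return False
        if p.max_concurrency is not None and self.in_flight + 1 > p.max_concurrency:
            return False
        if p.max_tokens is not None and self.tokens + tokens > p.max_tokens:
            return False
        return True

    def record(self, tokens=0):
        """Count an admitted request."""
        self.requests += 1
        self.in_flight += 1
        self.tokens += tokens

    def release(self):
        """Call when an admitted request finishes, freeing a concurrency slot."""
        if self.in_flight > 0:
            self.in_flight -= 1


def admit(limiters, tokens=0):
    """Admit a request only if every limiter (for example the API-level one
    and the scenario-level one) allows it; nothing is counted on rejection."""
    if all(limiter.allows(tokens) for limiter in limiters):
        for limiter in limiters:
            limiter.record(tokens)
        return True
    return False
```

In this sketch, an API-level limiter would be created with only `max_requests` and `max_concurrency`, while a text-scenario limiter could additionally set `max_tokens`. Passing both to `admit` rejects a request as soon as either threshold would be exceeded.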