New Features

API Gateway - Model API throttling policy adds request-count and parallelism dimensions

Oct 28 2025

API Gateway
The Model API supports throttling by number of requests, parallelism, and number of tokens. Throttling is available for the text, embedding, rerank, and multimodal scenarios.
Content

Target customers: All users who use the model proxy.

New features/specifications: The Model API throttling policy adds two dimensions: the number of requests and parallelism (the number of concurrent requests). For the text scenario, you can throttle by number of requests, parallelism, and number of tokens. For all other scenarios, you can throttle by number of requests and parallelism. API-level throttling is also added: you can configure overall request-count and parallelism thresholds for an API.
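The dimension rules above can be illustrated with a minimal Python sketch. It is not the API Gateway configuration format or SDK: the names `ThrottlePolicy`, `Usage`, `validate`, and `admits`, the scenario strings, and the idea of per-window counters reset by the caller are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Scenario names are illustrative labels, not official identifiers.
SCENARIOS = {"text", "embedding", "rerank", "multimodal"}


@dataclass
class ThrottlePolicy:
    """Hypothetical policy shape; None means the dimension is not limited."""
    max_requests: Optional[int] = None      # requests per window
    max_parallelism: Optional[int] = None   # concurrent in-flight requests
    max_tokens: Optional[int] = None        # tokens per window (text only)


@dataclass
class Usage:
    """Counters for the current window; the caller is assumed to reset them."""
    requests: int = 0
    in_flight: int = 0
    tokens: int = 0


def validate(policy: ThrottlePolicy, scenario: str) -> None:
    """Reject policies that set a token limit outside the text scenario."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario}")
    if scenario != "text" and policy.max_tokens is not None:
        raise ValueError("token limits apply to the text scenario only")


def admits(policy: ThrottlePolicy, usage: Usage, tokens: int = 0) -> bool:
    """Return True if one more request fits within every configured limit."""
    if policy.max_requests is not None and usage.requests + 1 > policy.max_requests:
        return False
    if policy.max_parallelism is not None and usage.in_flight + 1 > policy.max_parallelism:
        return False
    if policy.max_tokens is not None and usage.tokens + tokens > policy.max_tokens:
        return False
    return True


text_policy = ThrottlePolicy(max_requests=100, max_parallelism=10, max_tokens=5000)
validate(text_policy, "text")
# The request below would push the token count past the limit, so it is rejected.
print(admits(text_policy, Usage(requests=5, in_flight=2, tokens=4900), tokens=200))  # False
```

Under the same assumptions, API-level throttling could be modeled as a second `ThrottlePolicy` with only `max_requests` and `max_parallelism` set, with a request admitted only when both the API-level and the scenario-level checks pass.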

