Alibaba Cloud Model Studio: Update deployment throttling

Last Updated: Jun 06, 2026

Update the throttling settings for a specified deployment.

Prerequisites

Update model deployment settings

Note

For the model unit deployment method, only some models support modifying the rpm and tpm settings.

Endpoint

PUT https://dashscope-intl.aliyuncs.com/api/v1/deployments/{deployed_model}/update

Request example

Use the following command to update the throttling settings for a specified deployment:

curl -X PUT "https://dashscope-intl.aliyuncs.com/api/v1/deployments/{deployed_model}/update" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "rpm_limit": 1000,
    "tpm_limit": 200
}'
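The same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official SDK call: the helper names `build_update_request` and `send_update` are hypothetical, and the endpoint, headers, and body fields are taken from the curl command above.

```python
import json
import urllib.request

# Endpoint prefix copied from the curl example above.
BASE_URL = "https://dashscope-intl.aliyuncs.com/api/v1/deployments"


def build_update_request(deployed_model, api_key, limits):
    """Build (but do not send) the PUT request that updates throttling.

    `limits` is a dict such as {"rpm_limit": 1000, "tpm_limit": 200}.
    """
    return urllib.request.Request(
        url=f"{BASE_URL}/{deployed_model}/update",
        data=json.dumps(limits).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send_update(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To run it, pass the deployment ID and the value of the `DASHSCOPE_API_KEY` environment variable to `build_update_request`, then hand the result to `send_update`. Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for non-2xx status codes, so error responses need separate handling.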

Request parameters

| Parameter | Type | In | Required | Description |
| --- | --- | --- | --- | --- |
| deployed_model | String | path | Yes | The unique identifier for the deployment. Obtain it by calling the Create a deployment or List deployments operation. |
| rpm_limit | Number | body | At least one of rpm_limit and tpm_limit is required. | The maximum number of requests per minute (rpm). |
| tpm_limit | Number | body | At least one of rpm_limit and tpm_limit is required. | The maximum number of tokens per minute (tpm). |
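The rule that at least one body parameter is required can be checked on the client before the request is sent. The following is a minimal sketch under that assumption. The function name `build_throttling_payload` is hypothetical, and the sketch does not enforce any value range, because this page does not state one.

```python
def build_throttling_payload(rpm_limit=None, tpm_limit=None):
    """Return the request body, including only the limits that were supplied."""
    payload = {}
    for name, value in (("rpm_limit", rpm_limit), ("tpm_limit", tpm_limit)):
        if value is None:
            continue  # Omitted parameters are left out of the body.
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a number")
        payload[name] = value
    if not payload:
        # The API requires at least one of the two limits.
        raise ValueError("Specify at least one of rpm_limit or tpm_limit")
    return payload
```

For example, `build_throttling_payload(rpm_limit=1000)` yields a body that changes only the request limit.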

Response example

A successful request returns a response similar to the following:

{
    "request_id": "1d121fd9-876c-40ad-bc40-a9e68ef3b986",
    "output": {
        "deployed_model": "qwen-plus-2025-12-01-b6d61c71",
        "gmt_create": "2026-01-07T13:52:44",
        "gmt_modified": "2026-01-07T14:01:41",
        "status": "PENDING",
        "model_name": "qwen-plus-2025-12-01",
        "base_model": "qwen-plus-2025-12-01",
        "base_capacity": 4,
        "capacity": 4,
        "ready_capacity": 0,
        "workspace_id": "llm-8v53e*******",
        "charge_type": "post_paid",
        "creator": "16542902******",
        "modifier": "16542902********",
        "plan": "mu",
        "model_unit_spec": "MU1",
        "enable_thinking": true,
        "max_context_length": 1,
        "rpm_limit": 1000,
        "tpm_limit": 200
    }
}

Response parameters

For response parameter descriptions, see Create a model deployment task.
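After an update, a client may want to confirm that the response reports the limits it asked for. The sketch below assumes that the response always carries `status`, `rpm_limit`, and `tpm_limit` under `output`, as in the example above. The function name `check_update` is hypothetical.

```python
import json


def check_update(response_text, requested):
    """Return (status, mismatches) for an update response.

    `requested` is the dict that was sent as the request body.
    """
    output = json.loads(response_text)["output"]
    # Collect every requested limit that the response does not echo back.
    mismatches = {
        name: {"requested": value, "returned": output.get(name)}
        for name, value in requested.items()
        if output.get(name) != value
    }
    return output["status"], mismatches
```

An empty `mismatches` dict means that every requested limit appears unchanged in the response. The returned status (`PENDING` in the example above) is passed through as-is.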

Errors

Response example

{
    "request_id": "ca218d57-b91b-46b2-bd35-c41c6287bcf4",
    "message": "Model: qwen-plus-20230703-cx7f not found!",
    "code": "NotFound"
}

Response parameters

| Parameter | Type | Description |
| --- | --- | --- |
| request_id | String | The unique ID of the request. |
| code | String | The error code. |
| message | String | The error message. |

If a request fails, the response may contain one of the following errors.

| Error code | Error message | Cause |
| --- | --- | --- |
| NotFound | Model: xxx not found! | You specified a non-existent model when creating a deployment, or when querying, updating, or deleting a deployment. |
| Conflict | Deployed model xxx already exists, please specify a suffix. | The specified suffix is already in use by another deployment. |
| InvalidParameter | Invalid capacity (xx), capacity must be larger than or equal to 0 and multiples of 1 and less than 1000! | You specified an invalid capacity value when creating or updating a deployment. |
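An error body can be turned into an exception that keeps the `code`, `message`, and `request_id` fields available for logging. This is a minimal sketch: the class `DeploymentApiError` and the function `raise_for_error` are hypothetical names, and the sketch assumes that only failed responses carry a top-level `code` field, which matches the examples on this page.

```python
import json


class DeploymentApiError(Exception):
    """Raised when the response body carries an error code."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def raise_for_error(body_text):
    """Return the parsed body, or raise DeploymentApiError for an error body."""
    body = json.loads(body_text)
    # Assumption: a top-level "code" field appears only in error responses.
    if "code" in body:
        raise DeploymentApiError(
            body["code"], body.get("message", ""), body.get("request_id", "")
        )
    return body
```

A caller can then branch on `error.code`, for example treating `NotFound` as a sign that the `deployed_model` path value is wrong.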