The deployment mode determines the computing region for model inference and the location where static data is stored. Choose the appropriate deployment mode to optimize network latency and ensure data processing complies with applicable boundaries.
Compare deployment modes
The deployment mode determines the available computing power and the execution region for model inference. The region also determines where static data is stored. Currently, each deployment mode is pre-bound to its region, and the two cannot be configured independently.
To reduce network latency and improve model response speed, select a deployment mode that corresponds to a region near your primary users or business applications:
Deployment mode | Bound region (data storage) | Model inference computing scope |
Global | US (Virginia) | Global dynamic scheduling |
International | Singapore | Global dynamic scheduling (excluding Mainland China) |
United States | US (Virginia) | Limited to the US |
Mainland China | China (Beijing) | Limited to Mainland China |
The Global and International deployment modes involve cross-border computation, and you are responsible for ensuring that cross-border data processing is lawful. Cross-region inference requests are received by the frontend endpoint of the selected region. Static data generated during a model invocation, such as prompt inputs and model outputs, is processed only transiently and is not persistently stored in the region where the compute node resides. Data is encrypted in transit.
Usage
Models in Global deployment mode
Before using these models, configure the request address, API key, and model name:
Request address (base URL): The global deployment mode is bound to the US (Virginia) region. Use the dashscope-us.aliyuncs.com domain name. The following are examples of request addresses. For other APIs, see the corresponding documentation:
OpenAI Chat Completions API: https://dashscope-us.aliyuncs.com/compatible-mode/v1
DashScope: https://dashscope-us.aliyuncs.com/api/v1
API key: Get an API key from Key Management (Virginia).
Model name: See Model list and choose a model for the global deployment mode.
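As an illustration of how the three settings fit together, here is a minimal sketch of a Chat Completions call against the Global-mode address. It rests on several assumptions that are not stated on this page: the `/chat/completions` path follows the usual OpenAI convention, `DASHSCOPE_API_KEY` is an environment variable holding a key from Key Management (Virginia), and the helper names `chat_body` and `send_chat` are hypothetical. The model name is left to the caller because it must come from the Model list.

```shell
#!/bin/sh
# Base URL for the Global deployment mode (OpenAI-compatible), from this section.
BASE_URL="https://dashscope-us.aliyuncs.com/compatible-mode/v1"

# Build a minimal Chat Completions request body.
# $1 = model name (from the Model list), $2 = user prompt.
# Note: this sketch does not JSON-escape its arguments, so avoid double
# quotes and backslashes in the model name and prompt.
chat_body() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' "$1" "$2"
}

# Send the request. Nothing is sent until this function is called.
# Assumes the OpenAI-style /chat/completions path and that
# DASHSCOPE_API_KEY is set in the environment.
send_chat() {
  curl --silent --show-error "${BASE_URL}/chat/completions" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer ${DASHSCOPE_API_KEY}" \
    --data "$(chat_body "$1" "$2")"
}
```

To try it, call `send_chat "<model-name>" "Hello"` with a model name taken from the Model list for the global deployment mode.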
Models in International deployment mode
Before using these models, configure the request address, API key, and model name:
Request address (base URL): The international deployment mode is bound to the Singapore region. Use the dashscope-intl.aliyuncs.com domain name. The following are examples of request addresses. For other APIs, see the corresponding documentation:
OpenAI Chat Completions API: https://dashscope-intl.aliyuncs.com/compatible-mode/v1
DashScope: https://dashscope-intl.aliyuncs.com/api/v1
API key: Get an API key from Key Management (Singapore).
Model name: See Model list and choose a model for the international deployment mode.
Models in United States deployment mode
Before using these models, configure the request address, API key, and model name:
Request address (base URL): The US deployment mode is bound to the US (Virginia) region. Use the dashscope-us.aliyuncs.com domain name. The following are examples of request addresses. For other APIs, see the corresponding documentation:
OpenAI Chat Completions API: https://dashscope-us.aliyuncs.com/compatible-mode/v1
DashScope: https://dashscope-us.aliyuncs.com/api/v1
API key: Get an API key from Key Management (Virginia).
Model name: See Model list and choose a model for the US deployment mode (with the -us suffix).
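Because the Global and US modes share the same domain, a small guard on the model name can catch a mismatched model before a request is sent. The sketch below checks only for the `-us` suffix mentioned above. The function name `has_us_suffix` and the example model names are hypothetical, and the check is no substitute for consulting the Model list.

```shell
#!/bin/sh
# Succeed (exit status 0) if the model name ends with the -us suffix
# and has at least one character before it; fail otherwise.
has_us_suffix() {
  case "$1" in
    ?*-us) return 0 ;;
    *)     return 1 ;;
  esac
}

# Example with placeholder model names (not real model identifiers).
for name in "example-model-us" "example-model"; do
  if has_us_suffix "$name"; then
    echo "$name: looks like a US deployment mode model"
  else
    echo "$name: no -us suffix"
  fi
done
```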
Models in Mainland China deployment mode
Before using these models, configure the request address, API key, and model name:
Request address (base URL): The Mainland China deployment mode is bound to the China (Beijing) region. Use the dashscope.aliyuncs.com domain name. The following are examples of request addresses. For other APIs, see the corresponding documentation:
OpenAI Chat Completions API: https://dashscope.aliyuncs.com/compatible-mode/v1
DashScope: https://dashscope.aliyuncs.com/api/v1
API key: Get an API key from Key Management (Beijing).
Model name: See Model list and choose a model for the Mainland China deployment mode.
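The four configurations above differ only in the service domain, so a script can derive both request addresses from the deployment mode. The following is a minimal sketch of that idea. The mode keywords (`global`, `international`, `us`, `china`) and the function names are hypothetical labels chosen for this example, not API parameters. The domains and paths are the ones listed in the sections above.

```shell
#!/bin/sh
# Map a deployment mode keyword to its service domain.
# The keywords are illustrative labels, not values the API understands.
dashscope_domain() {
  case "$1" in
    global|us)     echo "dashscope-us.aliyuncs.com" ;;
    international) echo "dashscope-intl.aliyuncs.com" ;;
    china)         echo "dashscope.aliyuncs.com" ;;
    *) echo "unknown deployment mode: $1" >&2; return 1 ;;
  esac
}

# Base URL for the OpenAI-compatible API in the given mode.
openai_base_url() {
  domain=$(dashscope_domain "$1") || return 1
  echo "https://${domain}/compatible-mode/v1"
}

# Base URL for the DashScope API in the given mode.
dashscope_base_url() {
  domain=$(dashscope_domain "$1") || return 1
  echo "https://${domain}/api/v1"
}

# Example: print the OpenAI-compatible base URL for the International mode.
openai_base_url international
```

Note that the Global and US modes resolve to the same domain, so the model name is what distinguishes them.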
Asynchronous tasks
For asynchronous tasks, such as image generation and video generation, all subsequent operations on a task (for example, querying its result) must use the same service domain name and API key that were used to create it. Otherwise, an error occurs.
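One way to honor this constraint in a script is to set the service domain once and derive every task URL from it. The sketch below assumes a task-creation response of the form `{"output":{"task_id":"..."}}` and the task query path `/api/v1/tasks/{task_id}`. The helper names are hypothetical, and `DASHSCOPE_API_KEY` is assumed to hold the key that created the task.

```shell
#!/bin/sh
# Set the service domain once; every call for this task reuses it.
DASHSCOPE_DOMAIN="dashscope-us.aliyuncs.com"

# Extract task_id from a task-creation response without requiring jq.
# Assumes the response contains a single "task_id":"..." field.
extract_task_id() {
  printf '%s\n' "$1" |
    sed -n 's/.*"task_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Build the query URL on the same domain that created the task.
task_query_url() {
  echo "https://${DASHSCOPE_DOMAIN}/api/v1/tasks/$1"
}

# Query the task with the same API key used at creation time.
# Nothing is sent until this function is called.
query_task() {
  curl --silent --show-error -X GET "$(task_query_url "$1")" \
    --header "Authorization: Bearer ${DASHSCOPE_API_KEY}"
}
```

Because the domain lives in a single variable, switching deployment modes means changing one line, and the create and query calls cannot drift apart.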
The following example shows how to create an image generation task and query its result in global deployment mode:
# Create task (global deployment mode, service domain name dashscope-us.aliyuncs.com)
curl --location 'https://dashscope-us.aliyuncs.com/api/v1/services/aigc/image-generation/generation' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'X-DashScope-Async: enable' \
--data '{
"model": "wan2.6-t2i",
"input": {
"messages": [
{
"role": "user",
"content": [
{
"text": "A flower shop with exquisite windows, beautiful wooden doors, and flowers."
}
]
}
]
},
"parameters": {
"n": 1
}
}'
# Response example: {"output":{"task_id":"abc123..."},"request_id":"..."}
# Query task (must use the same service domain name and API key; replace {task_id} with the task_id from the response)
curl -X GET https://dashscope-us.aliyuncs.com/api/v1/tasks/{task_id} \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"
# [Error] Querying with a different service domain name causes an error
curl -X GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id} \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"
Region information
A region is the physical location of the node where you access Alibaba Cloud Model Studio services. The region IDs are:
Singapore: ap-southeast-1
US (Virginia): us-east-1
China (Beijing): cn-beijing
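Combined with the bound regions in the deployment mode comparison, these IDs give a direct lookup from deployment mode to region ID. The sketch below encodes that lookup. The mode keywords and the function name `region_id_for_mode` are hypothetical labels for this example only.

```shell
#!/bin/sh
# Look up the region ID bound to a deployment mode.
# Global and United States are both bound to US (Virginia).
region_id_for_mode() {
  case "$1" in
    global|us)     echo "us-east-1" ;;
    international) echo "ap-southeast-1" ;;
    china)         echo "cn-beijing" ;;
    *) echo "unknown deployment mode: $1" >&2; return 1 ;;
  esac
}

# Example: print the region ID for the International mode.
region_id_for_mode international
```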
The platform features supported in each region are:
Section | Feature | Singapore | US (Virginia) | China (Beijing) |
Usage | Real-time inference | | | |
Usage | Batch inference | | | |
Usage | Playground | | | |
Management | Monitoring | | | |
Management | Alerting | | | |
Management | Transmission security | | | |
Management | Permission management | | | |
Optimization | Fine-tuning | | | |
References
Model list: View supported models and context information for each deployment mode.
Model invocation pricing: View price differences for each deployment mode.
Rate limits: View RPM and TPM limits for each deployment mode.
Get an API key: Create and manage API keys for each deployment mode.