Alibaba Cloud Model Studio: How to choose a deployment mode

Last Updated: Feb 13, 2026

The deployment mode determines the computing region for model inference and the location where static data is stored. Choose a deployment mode that minimizes network latency and keeps data processing within the geographic boundaries that apply to you.

Compare deployment modes

Each deployment mode is bound to one region. The mode determines the computing power available for model inference and where inference runs, and the bound region determines where static data is stored. Currently, a deployment mode and its region are pre-bound and cannot be configured independently.

To reduce network latency and improve model response speed, select a deployment mode that corresponds to a region near your primary users or business applications:

| Deployment mode | Bound region (data storage) | Model inference computing scope |
| --- | --- | --- |
| Global | US (Virginia) | Global dynamic scheduling |
| International | Singapore | Global dynamic scheduling (excluding Mainland China) |
| United States | US (Virginia) | Limited to the US |
| Mainland China | China (Beijing) | Limited to Mainland China |

Important

The Global and International deployment modes involve cross-border computation, and you are responsible for ensuring that cross-border data processing is lawful. Cross-region inference requests are received by the frontend endpoint of the selected region. Static data generated during a model invocation, such as prompt inputs and model outputs, is processed only transiently and is not persistently stored in the region where the compute node resides. Data is encrypted in transit.
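The table and the note above can be encoded as a small lookup, which may help when a deployment script needs to flag modes that involve cross-border computation. The following is a minimal sketch in POSIX shell. The function names `mode_region` and `is_cross_border` and the lowercase mode keys are hypothetical and are not part of Model Studio.

```shell
# Hypothetical helpers: the mode keys (global, international, us, cn) are
# illustrative labels, not official identifiers.

# Print the region a deployment mode is bound to (where static data is stored).
mode_region() {
  case "$1" in
    global|us)     echo "US (Virginia)" ;;
    international) echo "Singapore" ;;
    cn)            echo "China (Beijing)" ;;
    *)             echo "unknown deployment mode: $1" >&2; return 1 ;;
  esac
}

# Succeed (exit status 0) if the mode uses global dynamic scheduling,
# which means inference may run outside the bound region.
is_cross_border() {
  case "$1" in
    global|international) return 0 ;;
    *)                    return 1 ;;
  esac
}

for mode in global international us cn; do
  if is_cross_border "$mode"; then
    note="cross-border computation, check legal requirements"
  else
    note="no cross-border computation"
  fi
  echo "$mode -> $(mode_region "$mode") ($note)"
done
```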

Usage

Models in Global deployment mode

Before using these models, configure the request address, API key, and model name:

  • Request address (base URL): The Global deployment mode is bound to the US (Virginia) region, so use the dashscope-us.aliyuncs.com domain name. Example request addresses are listed below. For other APIs, see the corresponding documentation:

    • OpenAI Chat Completions API: https://dashscope-us.aliyuncs.com/compatible-mode/v1

    • DashScope: https://dashscope-us.aliyuncs.com/api/v1

  • API key: Go to the Key Management (Virginia) page.

  • Model name: See Model list and choose a model for the Global deployment mode.

Models in International deployment mode

Before using these models, configure the request address, API key, and model name:

  • Request address (base URL): The International deployment mode is bound to the Singapore region, so use the dashscope-intl.aliyuncs.com domain name. Example request addresses are listed below. For other APIs, see the corresponding documentation:

    • OpenAI Chat Completions API: https://dashscope-intl.aliyuncs.com/compatible-mode/v1

    • DashScope: https://dashscope-intl.aliyuncs.com/api/v1

  • API key: Go to the Key Management (Singapore) page.

  • Model name: See Model list and choose a model for the International deployment mode.

Models in United States deployment mode

Before using these models, configure the request address, API key, and model name:

  • Request address (base URL): The United States deployment mode is bound to the US (Virginia) region, so use the dashscope-us.aliyuncs.com domain name. Example request addresses are listed below. For other APIs, see the corresponding documentation:

    • OpenAI Chat Completions API: https://dashscope-us.aliyuncs.com/compatible-mode/v1

    • DashScope: https://dashscope-us.aliyuncs.com/api/v1

  • API key: Go to the Key Management (Virginia) page.

  • Model name: See Model list and choose a model for the United States deployment mode. These model names carry the -us suffix.

Models in Mainland China deployment mode

Before using these models, configure the request address, API key, and model name:

  • Request address (base URL): The Mainland China deployment mode is bound to the China (Beijing) region, so use the dashscope.aliyuncs.com domain name. Example request addresses are listed below. For other APIs, see the corresponding documentation:

    • OpenAI Chat Completions API: https://dashscope.aliyuncs.com/compatible-mode/v1

    • DashScope: https://dashscope.aliyuncs.com/api/v1

  • API key: Go to the Key Management (Beijing) page.

  • Model name: See Model list and choose a model for the Mainland China deployment mode.
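The four configurations above differ only in the service domain name, so a script can derive the base URL from the deployment mode instead of hard-coding it. The following is a minimal sketch. The function name `base_url`, the mode keys, and the API style keys are hypothetical labels chosen for illustration. The domain names and paths are the ones listed above.

```shell
# Hypothetical helper: maps a deployment mode and an API style to a base URL.
# Mode keys (global, international, us, cn) and API style keys
# (openai, dashscope) are illustrative labels, not official identifiers.
base_url() {
  case "$1" in
    global|us)     domain="dashscope-us.aliyuncs.com" ;;
    international) domain="dashscope-intl.aliyuncs.com" ;;
    cn)            domain="dashscope.aliyuncs.com" ;;
    *)             echo "unknown deployment mode: $1" >&2; return 1 ;;
  esac
  case "$2" in
    openai)    echo "https://$domain/compatible-mode/v1" ;;
    dashscope) echo "https://$domain/api/v1" ;;
    *)         echo "unknown API style: $2" >&2; return 1 ;;
  esac
}

base_url international openai   # prints https://dashscope-intl.aliyuncs.com/compatible-mode/v1
base_url cn dashscope           # prints https://dashscope.aliyuncs.com/api/v1
```

Note that the Global and United States modes resolve to the same domain name and the same Key Management page, so in this sketch the model name is what distinguishes the two.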

Asynchronous tasks

For asynchronous tasks, such as image generation and video generation, every subsequent operation on a task must use the same service domain name and API key that were used to create the task. Otherwise, an error occurs.

The following example shows how to create an image generation task and query its result in the Global deployment mode:

# Create task (global deployment mode, service domain name dashscope-us.aliyuncs.com)
curl --location 'https://dashscope-us.aliyuncs.com/api/v1/services/aigc/image-generation/generation' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'X-DashScope-Async: enable' \
--data '{
    "model": "wan2.6-t2i",
    "input": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "text": "A flower shop with exquisite windows, beautiful wooden doors, and flowers."
                    }
                ]
            }
        ]
    },
    "parameters": {
        "n": 1
    }
}'

# Response example: {"output":{"task_id":"abc123..."},"request_id":"..."}

# Query task (must use the same service domain name and API key as the create request).
# Replace {task_id} with the task_id value returned in the create response.
curl -X GET "https://dashscope-us.aliyuncs.com/api/v1/tasks/{task_id}" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"

# [Error] Querying through a different service domain name (here dashscope.aliyuncs.com) causes an error
curl -X GET "https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"
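One way to avoid mixing domain names is to keep the service domain name in a single variable and build every task URL from it. The following is a minimal sketch under stated assumptions: the names `DASHSCOPE_DOMAIN`, `extract_task_id`, and `task_url` are hypothetical, the sample response only mimics the shape of the response example above, and the `sed` expression assumes that `task_id` is a plain JSON string without escaped quotes.

```shell
# Keep the service domain name in one place so that task creation and
# every follow-up query use the same one.
DASHSCOPE_DOMAIN="dashscope-us.aliyuncs.com"

# Extract the task_id value from a create-task response with sed,
# so that no separate JSON tool is required.
extract_task_id() {
  printf '%s\n' "$1" |
    sed -n 's/.*"task_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Build the query URL for a task on the pinned domain.
task_url() {
  echo "https://$DASHSCOPE_DOMAIN/api/v1/tasks/$1"
}

# Sample response in the documented shape (the IDs are made up).
response='{"output":{"task_id":"abc123"},"request_id":"req-1"}'
task_id=$(extract_task_id "$response")
task_url "$task_id"   # prints https://dashscope-us.aliyuncs.com/api/v1/tasks/abc123

# To query for real, reuse the same domain and the same API key:
# curl -X GET "$(task_url "$task_id")" \
#   --header "Authorization: Bearer $DASHSCOPE_API_KEY"
```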

Region information

A region is the physical location of the node where you access Alibaba Cloud Model Studio services. The region IDs are:

  • Singapore: ap-southeast-1

  • US (Virginia): us-east-1

  • China (Beijing): cn-beijing
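When tooling is keyed by region ID rather than by deployment mode, the region IDs above can be mapped to the service domain names given in the Usage section. The following is a minimal sketch. The function name `region_domain` is hypothetical, and the pairing follows the region that each domain name is bound to in this document.

```shell
# Hypothetical helper: print the service domain name for a region ID.
region_domain() {
  case "$1" in
    ap-southeast-1) echo "dashscope-intl.aliyuncs.com" ;;  # Singapore
    us-east-1)      echo "dashscope-us.aliyuncs.com" ;;    # US (Virginia)
    cn-beijing)     echo "dashscope.aliyuncs.com" ;;       # China (Beijing)
    *)              echo "unknown region ID: $1" >&2; return 1 ;;
  esac
}

for region in ap-southeast-1 us-east-1 cn-beijing; do
  echo "$region -> $(region_domain "$region")"
done
```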

The platform features supported in each region are:

| Section | Feature | Singapore | US (Virginia) | China (Beijing) |
| --- | --- | --- | --- | --- |
| Usage | Real-time inference | Supported | Supported | Supported |
| Usage | Batch inference | Supported | Not supported | Supported |
| Usage | Playground | Supported | Supported | Supported |
| Management | Monitoring | Supported | Supported | Supported |
| Management | Alerting | Supported | Not supported | Supported |
| Management | Transmission security | Supported | Supported | Supported |
| Management | Permission management | Supported | Supported | Supported |
| Optimization | Fine-tuning | Supported | Supported | Supported |
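A script that depends on a platform feature can check the table above before it selects a region. The following is a minimal sketch that encodes the table as of this page's last update. The function name `feature_supported` and the hyphenated feature keys are hypothetical labels, not official identifiers.

```shell
# Hypothetical helper: exit status 0 if the feature is listed as supported in
# the region, 1 if it is listed as not supported, 2 for unknown input.
feature_supported() {
  case "$1" in
    ap-southeast-1|us-east-1|cn-beijing) ;;
    *) echo "unknown region ID: $1" >&2; return 2 ;;
  esac
  case "$2" in
    real-time-inference|playground|monitoring|transmission-security|permission-management|fine-tuning)
      return 0 ;;
    batch-inference|alerting)
      # Per the table, these two are not supported in US (Virginia).
      if [ "$1" = "us-east-1" ]; then return 1; fi
      return 0 ;;
    *) echo "unknown feature: $2" >&2; return 2 ;;
  esac
}

for region in ap-southeast-1 us-east-1 cn-beijing; do
  if feature_supported "$region" batch-inference; then
    echo "$region: batch inference supported"
  else
    echo "$region: batch inference not supported"
  fi
done
```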
