Before calling Model Studio, select a region, service deployment scope, and access domain:
Region: Determines the access point and data storage location. Choose a nearby region to reduce latency.
Service deployment scope: Determines the inference execution location. If you have data residency requirements, select a scope with specific geographic boundaries. Otherwise, select the Global scope for a larger inference resource pool.
Access domain: Affects concurrency limits, timeouts, and other service guarantees. Each region has its own dedicated access domain.
A complete model invocation works as follows:
Your application sends a request to the selected region (such as Singapore) through the access domain. Request data is stored in that region.
The region forwards the request to an inference node within the service deployment scope (transient data is not persisted; all transmissions are encrypted).
The inference result is returned to the region for storage and then sent back to your application. Your static data always remains in the selected region.
Select region and service deployment scope
Choose a region and service deployment scope based on your scenario:
Scenario | Region | Service deployment scope |
No data residency restrictions. Maximize the inference resource pool through cross-region scheduling (ensure cross-border compliance independently). | US (Virginia) | Global (any available node, including within and outside China) |
No data residency restrictions. Maximize the inference resource pool through cross-region scheduling (ensure cross-border compliance independently). | Germany (Frankfurt) | Global (any available node, including within and outside China) |
No data residency restrictions. Maximize the inference resource pool through cross-region scheduling (ensure cross-border compliance independently). | Japan (Tokyo) | Global (any available node, including within and outside China) |
No data residency restrictions. Maximize the inference resource pool through cross-region scheduling (ensure cross-border compliance independently). | China (Hong Kong) | Global (any available node, including within and outside China) |
Data must not pass through the Chinese mainland (cross-region inference scheduling; ensure cross-border compliance independently) | Singapore | International (global nodes excluding the Chinese mainland) |
Data must stay within the Chinese mainland | China (Beijing) | Chinese mainland (inference restricted to China) |
Data must stay within China (Hong Kong) | China (Hong Kong) | China (Hong Kong) (inference restricted to Hong Kong) |
Data must stay within the US | US (Virginia) | United States (inference restricted to the US) |
Data must stay within the EU | Germany (Frankfurt) | EU (inference restricted to the EU) |
Data must stay within Japan | Japan (Tokyo) | Japan (inference restricted to Japan) |
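The table above can also be expressed as a small lookup, which may help when validating deployment configuration in code. This is a minimal sketch: the requirement keys (`none`, `no_mainland`, and so on) and the function name are hypothetical labels chosen for illustration, not Model Studio identifiers.

```python
# Hypothetical requirement keys mapped to the (region, scope) pairs in the table above.
REGION_SCOPE_OPTIONS = {
    "none": [
        ("US (Virginia)", "Global"),
        ("Germany (Frankfurt)", "Global"),
        ("Japan (Tokyo)", "Global"),
        ("China (Hong Kong)", "Global"),
    ],
    "no_mainland": [("Singapore", "International")],
    "mainland_only": [("China (Beijing)", "Chinese mainland")],
    "hong_kong_only": [("China (Hong Kong)", "China (Hong Kong)")],
    "us_only": [("US (Virginia)", "United States")],
    "eu_only": [("Germany (Frankfurt)", "EU")],
    "japan_only": [("Japan (Tokyo)", "Japan")],
}


def region_scope_options(requirement):
    """Return the (region, scope) pairs that satisfy a data residency requirement."""
    try:
        return REGION_SCOPE_OPTIONS[requirement]
    except KeyError:
        raise ValueError(f"Unknown data residency requirement: {requirement!r}")


if __name__ == "__main__":
    for region, scope in region_scope_options("eu_only"):
        print(f"{region}: {scope}")
```

When there are no residency restrictions, several regions qualify, so the lookup returns a list and leaves the latency-based choice to the caller.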
Select access domain
Model Studio provides three types of access domains for model inference APIs: workspace-dedicated, DashScope, and trial, covering scenarios from trial exploration to enterprise-grade production. We recommend using the workspace-dedicated domain. The key differences are as follows:
Comparison | Workspace-dedicated domain (recommended) | DashScope domain (existing domain) | Trial domain |
Domain format |
|
Using Singapore region as an example |
|
Use case | Recommended for production environments. Offers higher concurrency capacity and network isolation, ensuring stable and low-latency access under heavy traffic. | For existing integrations. We recommend migrating to the workspace-dedicated domain (see Migrate to workspace-dedicated domain). | Quick trial and feature validation. Not recommended for production environments. |
Authentication scope | Access to the current workspace only | Access to all workspaces | Access to all workspaces |
Rate limits | RPM and TPM vary by model | RPM and TPM vary by model | RPM is 1000; TPM varies by model |
Request timeout | 3600 seconds | 600 seconds | 600 seconds |
Protocol support | HTTP, SSE, WebSocket, WebRTC | HTTP, SSE, WebSocket | HTTP, SSE |
SLA | 99.9% | 99.9% | Not provided |
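To make the comparison concrete, the timeout, protocol, and SLA rows above can be captured in a small data structure and filtered by what an application needs. This is only an illustrative sketch: the class, field, and function names are hypothetical, and the values are copied from the table rather than read from any API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessDomain:
    name: str
    timeout_seconds: int
    protocols: frozenset
    has_sla: bool


# Values copied from the comparison table above.
ACCESS_DOMAINS = [
    AccessDomain("workspace-dedicated", 3600,
                 frozenset({"HTTP", "SSE", "WebSocket", "WebRTC"}), True),
    AccessDomain("DashScope", 600,
                 frozenset({"HTTP", "SSE", "WebSocket"}), True),
    AccessDomain("trial", 600,
                 frozenset({"HTTP", "SSE"}), False),
]


def suitable_domains(protocol, min_timeout_seconds=0, require_sla=False):
    """Return names of domain types that meet the given requirements."""
    return [
        d.name
        for d in ACCESS_DOMAINS
        if protocol in d.protocols
        and d.timeout_seconds >= min_timeout_seconds
        and (d.has_sla or not require_sla)
    ]


if __name__ == "__main__":
    # Requests that may run for 30 minutes need more than the 600-second limit.
    print(suitable_domains("HTTP", min_timeout_seconds=1800))
```

For example, an application that needs WebRTC or requests longer than 600 seconds is left with only the workspace-dedicated domain.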
Regional access information
Each region has its own access domain, API Key, and model list. These cannot be used across regions.
Region | Region ID | Workspace-dedicated domain | DashScope domain | Trial domain | API Key | Model list |
China (Beijing) |
|
|
|
| ||
Singapore |
|
|
|
| ||
Germany (Frankfurt) |
|
| Not supported | Not yet supported | ||
Japan (Tokyo) |
|
| Not supported | Not yet supported | ||
China (Hong Kong) |
|
|
|
| ||
US (Virginia) |
| Not yet supported |
| Not yet supported |
Germany (Frankfurt), Japan (Tokyo), and China (Hong Kong) regions use Workspaces to separate service deployment scopes. Before making API calls, go to the Workspace Management page to create a workspace and select a service deployment scope: Germany (Frankfurt) (Global/EU), Japan (Tokyo) (Global/Japan), China (Hong Kong) (Global/Hong Kong).
US (Virginia) region: Use model names with the -us suffix (such as qwen-plus-us) to restrict inference to the US. Without the suffix, inference defaults to the Global scope.
China (Beijing) and Singapore regions each support only one service deployment scope (Chinese mainland and International, respectively), so no selection is needed.
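The US (Virginia) naming rule can be sketched as a small helper that appends the suffix when US-only inference is required. The function name is hypothetical, and the sketch assumes a -us variant exists for the model you pass in; confirm this against the model list for the region.

```python
US_SUFFIX = "-us"


def us_model_name(base_model, restrict_to_us):
    """Return the model name to use in the US (Virginia) region.

    With restrict_to_us=True the -us suffix is appended (once), which the
    text above describes as restricting inference to the US. Otherwise the
    name is returned unchanged and inference uses the Global scope.
    """
    if restrict_to_us and not base_model.endswith(US_SUFFIX):
        return base_model + US_SUFFIX
    return base_model


if __name__ == "__main__":
    print(us_model_name("qwen-plus", restrict_to_us=True))
```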
Migrate to workspace-dedicated domain
Migrating from a DashScope domain or trial domain to a workspace-dedicated domain requires only two steps, with no changes to your business logic code:
Obtain the workspace-dedicated domain:
Option 1: In the popup that appears after you create an API Key, copy the API Host.
Option 2: On the Workspace Management page, copy the content in the API Host column.
Replace the domain in the Base URL: Replace the original domain with the workspace-dedicated domain. The following examples use the China (Beijing) region, where llm-xxx is the workspace ID:
OpenAI compatible: Replace https://dashscope.aliyuncs.com/compatible-mode/v1 with https://llm-xxx.cn-beijing.maas.aliyuncs.com/compatible-mode/v1
DashScope: Replace https://dashscope.aliyuncs.com/api/v1 with https://llm-xxx.cn-beijing.maas.aliyuncs.com/api/v1
Anthropic compatible: Replace https://dashscope.aliyuncs.com/apps/anthropic with https://llm-xxx.cn-beijing.maas.aliyuncs.com/apps/anthropic
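Because only the host changes and the path stays the same, the replacement can be done mechanically. The following is a minimal sketch using the Python standard library; the function name is hypothetical, and llm-xxx remains a placeholder for your own workspace ID.

```python
from urllib.parse import urlsplit, urlunsplit


def migrate_base_url(base_url, workspace_host):
    """Swap the host of an existing Base URL for the workspace-dedicated host.

    The scheme and path (for example /compatible-mode/v1) are kept as-is.
    """
    parts = urlsplit(base_url)
    return urlunsplit(
        (parts.scheme, workspace_host, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    # llm-xxx is a placeholder workspace ID, as in the examples above.
    host = "llm-xxx.cn-beijing.maas.aliyuncs.com"
    for old in (
        "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "https://dashscope.aliyuncs.com/api/v1",
        "https://dashscope.aliyuncs.com/apps/anthropic",
    ):
        print(migrate_base_url(old, host))
```

The resulting string is then passed as the Base URL to whichever SDK or HTTP client the application already uses; the rest of the calling code stays unchanged.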
Feature availability by region
Feature | Singapore | US (Virginia) | China (Beijing) | China (Hong Kong) | Germany (Frankfurt) | Japan (Tokyo) |
Real-time inference | ||||||
Batch inference | ||||||
Playground | ||||||
Monitoring (standard) | ||||||
Monitoring (advanced) | ||||||
Alerting | ||||||
Transmission security | ||||||
Permissions | ||||||
Fine-tuning |
References
Recommended models: Models and context lengths by region
Model inference pricing: Pricing by region
Rate limiting: RPM and TPM limits
Obtain an API key: Create and manage keys
Base URL Overview: Model API access URLs