When you activate Alibaba Cloud Model Studio (Singapore region) for the first time, you automatically receive a free quota for each model.
A free quota is available only for International Edition (Singapore) models. China Mainland Edition (Beijing) models do not have a free quota.
Rules
Validity period
The free quota for new users is typically valid for 30 to 90 days, starting from the date you activate Model Studio or your model request is approved. After the validity period expires or the free quota is exhausted, continued use of the model inference service will incur fees.
Starting from 3:00 UTC on September 8, 2025, the validity period of the free quota for new users who activate Model Studio for the first time will be adjusted to 90 days. Users who activated the service before this date are not affected. For more information, see Validity period change for new user free quota.
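The validity rule above can be sketched as a small date calculation. This is only an illustration: the function name `quota_expiry` is hypothetical, and it assumes the validity period is counted as whole 24-hour days from the activation (or approval) time. For activations before the change, the validity period (30 to 90 days) must be supplied by the caller because it is not fixed in this document.

```python
from datetime import datetime, timedelta, timezone

# Cut-over stated in the text: 3:00 UTC on September 8, 2025.
POLICY_CHANGE = datetime(2025, 9, 8, 3, 0, tzinfo=timezone.utc)
NEW_VALIDITY_DAYS = 90


def quota_expiry(activated_at, validity_days=None):
    """Estimate when a new-user free quota expires (hypothetical helper).

    activated_at must be a timezone-aware datetime. For activations at or
    after the cut-over, 90 days is used. For earlier activations, pass the
    validity period (30 to 90 days) that applies to your model.
    """
    if activated_at.tzinfo is None:
        raise ValueError("activated_at must be timezone-aware")
    if activated_at >= POLICY_CHANGE:
        return activated_at + timedelta(days=NEW_VALIDITY_DAYS)
    if validity_days is None:
        raise ValueError("validity_days is required for activations before the change")
    return activated_at + timedelta(days=validity_days)


# Example: an account activated on September 10, 2025 (after the change).
print(quota_expiry(datetime(2025, 9, 10, tzinfo=timezone.utc)).date())
```

The console remains the authoritative source for the actual expiry date.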
Scope
The free quota for new users covers only the fees for real-time model inference (calls). It does not cover the fees for Batch calls and context cache.
Notes
An Alibaba Cloud account and its RAM users share the free quota.
For example, the total free quota for qwen-max is 1,000,000 tokens. If the Alibaba Cloud account uses 100,000 tokens and a RAM user uses 200,000 tokens, the remaining free quota for qwen-max is 700,000 tokens.
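The shared-quota arithmetic in this example can be expressed as a short sketch. The function and the dictionary keys are illustrative names, not part of any Model Studio API.

```python
def remaining_quota(total_tokens, usage_by_identity):
    """Return the free quota left after summing usage across the account and its RAM users."""
    used = sum(usage_by_identity.values())
    # The remaining quota cannot drop below zero; excess usage is billed instead.
    return max(total_tokens - used, 0)


usage = {"alibaba_cloud_account": 100_000, "ram_user": 200_000}
print(remaining_quota(1_000_000, usage))  # 700000
```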
Get the quota
Go to Model Studio - Singapore region. After you read and agree to the Terms of Service, Model Studio is automatically activated and you receive a free inference quota. The free quota is available only in the Singapore region. The Beijing region does not provide one.
If the Terms of Service do not appear, this indicates that you have already activated Model Studio and received the free quota.
View the remaining quota
After you activate Alibaba Cloud Model Studio, go to the model list page (Singapore) in the console. Click the target model to view the remaining quota on its product page.
As shown in the following figure, 24,098/1,000,000 indicates that 24,098 tokens remain out of a total of 1,000,000 tokens.

Use the quota
Real-time calls to models in the Singapore region automatically use your free quota. For more information, see Getting started with Model Studio.
Free quota only
By default, you are charged for usage after your free quota is exhausted. If you enable the Free Quota Only feature, calls made after the quota is exhausted fail with an `AllocationQuota.FreeTierOnly` error, which prevents you from incurring extra charges.
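A client can treat this error as a signal to stop sending requests rather than retrying. The sketch below is a minimal illustration under stated assumptions: `ModelCallError` and `call_with_quota_guard` are hypothetical names, and the way your SDK or HTTP client exposes the error code may differ. Only the error code string `AllocationQuota.FreeTierOnly` comes from this document.

```python
FREE_TIER_ONLY_CODE = "AllocationQuota.FreeTierOnly"


class ModelCallError(Exception):
    """Hypothetical exception that carries the error code returned by the service."""

    def __init__(self, code, message=""):
        super().__init__(f"{code}: {message}")
        self.code = code


def call_with_quota_guard(call_model, *args, **kwargs):
    """Run call_model and return (result, None), or (None, reason) if the free quota is used up.

    Any other error is re-raised unchanged.
    """
    try:
        return call_model(*args, **kwargs), None
    except ModelCallError as err:
        if err.code == FREE_TIER_ONLY_CODE:
            return None, "free quota exhausted; Free Quota Only blocked the call"
        raise


def fake_exhausted_call(prompt):
    """Stand-in for a real model call; simulates the exhausted-quota error."""
    raise ModelCallError(FREE_TIER_ONLY_CODE, "free tier only")


result, reason = call_with_quota_guard(fake_exhausted_call, "hello")
print(result, reason)
```

In a real integration, `call_model` would be the function that sends the request, and the guard would map whatever error object that client raises onto the same check.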
How to enable
Take Qwen3-Coder-Plus as an example. Go to the Qwen3-Coder-Plus details page (Singapore region) and turn on Free Quota Only.

If the switch is not displayed for the model, the model's free quota is exhausted or has expired, or the model does not offer a free quota.
How to disable
This feature is disabled by default. If you have enabled Free Quota Only, you can disable it only after the console shows that the free quota is exhausted.
The free quota displayed in the console is updated hourly and is not real-time data.
FAQ
Are there notifications when the free quota is used up?
Currently, there is no notification mechanism.
What happens when the free quota is used up?
If you do not enable the Free Quota Only feature, ongoing model calls will complete and will not be interrupted when the free quota is exhausted. Tokens that exceed the free quota are billed based on the input/output costs specified in Models and pricing. The resulting charges are automatically deducted from your Alibaba Cloud account on a pay-as-you-go basis. This may result in an overdue payment on your account.
If your account has an overdue payment, you cannot call other models, even if they still have a free quota.
Before you call a model, check the model's free quota and use budget management.
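To make the overage behavior concrete, the sketch below splits the tokens of a single call into the part covered by the free quota and the part that would be billed. It is a simplified illustration: `split_usage` is a hypothetical helper, it does not distinguish input from output tokens, and it does not apply any price. Actual charges depend on the rates in Models and pricing.

```python
def split_usage(remaining_free_tokens, tokens_used):
    """Return (covered, billable) token counts for one call (hypothetical helper)."""
    if remaining_free_tokens < 0 or tokens_used < 0:
        raise ValueError("token counts must be non-negative")
    covered = min(remaining_free_tokens, tokens_used)
    billable = tokens_used - covered
    return covered, billable


# Example: 24,098 free tokens remain and a call consumes 30,000 tokens.
covered, billable = split_usage(24_098, 30_000)
print(covered, billable)  # 24098 5902
```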
Why am I being charged?
Possible reasons include the following:
You used a model that does not have a free quota. Free quotas are tracked per model and are not shared between models. For example, qwen-max and qwen-max-latest each have their own free quota.
The free quota does not cover fees for OpenAI-compatible Batch calls.
The free quota data in the console is updated hourly. Therefore, the console may show a remaining free quota when it has actually been exhausted, resulting in charges. Check the latest free quota status again later.
To confirm your billing details, see "How to check which model incurred charges?" and "How do I view model call records?" in this FAQ.
How to check which model incurred charges?
One hour after you call a model, on the Bill Details page, select a Billing Month. Then, set Commodity Name to Model Studio Foundation Model Inference, and click Search. In the Instance ID column, you can view the models that incurred costs.

How do I view model call records?
One hour after you call a model, go to the Model Observation (Singapore or Beijing) page. Set the query conditions, such as the time range and workspace. Then, in the Models area, find the target model and click Monitor in the Actions column to view the model's call statistics. For more information, see the Model Observation document.
Data is updated hourly. During peak periods, updates may be delayed by an hour or more.

How to avoid charges?
After the free quota is exhausted, charges are automatically deducted from your Alibaba Cloud account balance. You can manage the risk of charges in the following ways:
Delete the API key (Singapore or Beijing). After you delete the API key, you can no longer use an API to call models in Model Studio, which prevents you from incurring further model call fees.

Set a spending limit alert. You will receive a notification email if your spending in the current month exceeds the alert threshold.

I still have a remaining quota. Why did the call fail?
Check if your Alibaba Cloud account has an overdue payment. If your account has an overdue payment, you cannot call models, even if they still have a free quota.
Why can't I see the free quota and its validity period?
If the Free Quota column displays No Free Quota, or the Free Quota area is not displayed, the free quota for the model in your account has expired.
The Beijing region does not have a free quota.
