Activate Model Studio in the Singapore region to receive free quota for each model.
Free quota is only available for models in the Singapore region. Other regions do not offer free quota.
Rules
Validity period
Free quota is valid for 30 to 90 days from activation or model approval. After expiration or depletion, continued inference incurs charges.
Starting from 3:00 UTC on September 8, 2025, the validity period for first-time activations is adjusted to 90 days. Users who activated the service before this date are not affected. For more information, see Validity period change for new user free quota.
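As a rough illustration of the validity rule, the sketch below computes when a quota would expire from its start time and validity length. The function names and the `validity_days` parameter are hypothetical, chosen only for this example; the actual validity of each model's quota is the one shown in the console.

```python
from datetime import datetime, timedelta, timezone

# Cutover stated above: first-time activations from this instant get 90 days.
CUTOVER = datetime(2025, 9, 8, 3, 0, tzinfo=timezone.utc)


def is_after_cutover(activated_at: datetime) -> bool:
    """Return True if a first-time activation falls under the 90-day rule."""
    return activated_at >= CUTOVER


def quota_expiry(start: datetime, validity_days: int) -> datetime:
    """Return the instant a free quota expires (hypothetical helper).

    `start` is the activation or model approval time; `validity_days`
    must fall in the documented 30 to 90 day range.
    """
    if not 30 <= validity_days <= 90:
        raise ValueError("validity is documented as 30 to 90 days")
    return start + timedelta(days=validity_days)


activated = datetime(2025, 9, 8, 3, 0, tzinfo=timezone.utc)
if is_after_cutover(activated):
    print(quota_expiry(activated, 90).isoformat())
```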
Scope
Free quota only offsets real-time inference costs. It does not offset fees for:
Custom models (fine-tuned and deployed models)
Notes
Free quota is shared across the account and all RAM users.
Example: Total quota for qwen-max is 1,000,000 tokens. If the account uses 100,000 tokens and a RAM user uses 200,000 tokens, the remaining quota is 700,000 tokens.
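The arithmetic in this example can be sketched as follows. The function name and the identity labels are hypothetical; the point is only that usage from the account and every RAM user is subtracted from one shared total.

```python
def remaining_quota(total: int, usage_by_identity: dict[str, int]) -> int:
    """Return the tokens left in a quota shared by the account and its RAM users."""
    used = sum(usage_by_identity.values())
    # The remaining quota never goes below zero; usage beyond it is billed.
    return max(total - used, 0)


usage = {"account": 100_000, "ram_user": 200_000}
print(remaining_quota(1_000_000, usage))  # 700000
```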
Get your free quota
Go to the Model Studio console (Singapore region) and accept the Terms of Service to activate the service and receive your free quota. Free quota is available only in the Singapore region.
If the Terms of Service prompt doesn't appear, you have already activated the service and received your free quota.
View remaining quota
View your remaining free quota using either of the following methods.
Method 1: Usage page
On the Model Usage page, click the Free Quota tab to view remaining quota and validity period for all models.
Method 2: Models page
After you activate Model Studio, go to the Models page (Singapore) in the console. Click the target model to view the remaining quota on its product page.
24,098/1,000,000: 24,098 tokens remaining of 1,000,000 total.

Use your quota
Real-time calls (Singapore region) automatically use free quota. For more information, see Getting started.
Prevent overage charges
By default, calls continue after the quota is exhausted and incur charges. Enable Free Quota Only to block calls once the quota is depleted. Blocked calls return the error AllocationQuota.FreeTierOnly.
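Client code can treat this error as a signal to stop sending requests rather than retrying. The sketch below is a minimal illustration under stated assumptions: `ModelCallError`, its `code` attribute, and the `call_model` callable are hypothetical stand-ins for whatever SDK or HTTP client you use. Only the error code string comes from this page.

```python
FREE_TIER_ONLY = "AllocationQuota.FreeTierOnly"


class ModelCallError(Exception):
    """Hypothetical exception carrying the error code returned by a failed call."""

    def __init__(self, code: str, message: str = ""):
        super().__init__(f"{code}: {message}")
        self.code = code


def run_until_quota_depleted(prompts, call_model):
    """Call the model for each prompt and stop once the free quota is depleted.

    `call_model` is any callable that returns a result or raises
    ModelCallError. Errors other than FREE_TIER_ONLY are re-raised.
    """
    results = []
    for prompt in prompts:
        try:
            results.append(call_model(prompt))
        except ModelCallError as err:
            if err.code == FREE_TIER_ONLY:
                break  # Quota depleted with Free Quota Only enabled: stop calling.
            raise
    return results
```

Stopping instead of retrying matters here because the error is not transient: calls stay blocked until Free Quota Only is disabled.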
How to enable
Method 1: Usage page
For a single model:
On the Model Usage page in the console, click the Free Quota tab.
Find the target model in the list and turn on the Free Quota Only switch in the Actions column. (This switch is only available for models that still have a free quota.)
In batch:
On the Model Usage page in the console, click the Free Quota tab.
Click Free Quota Only Batch Operation and select Batch Enable from the drop-down menu.
Check the target models and click Batch Enable. To enable this feature for all eligible models that do not have it enabled, click Enable for All Models.
In the confirmation dialog box, click Enable Free Quota Only.

Method 2: Models page
Take Qwen3-Coder-Plus as an example. Go to the Qwen3-Coder-Plus product page (Singapore region) and turn on the Free Quota Only switch.

If the switch isn't displayed, the quota is exhausted, expired, or the model doesn't offer free quota.
How to disable
Free Quota Only is disabled by default. After you enable it, you can disable it only when the console shows that the quota is exhausted.
The console updates quota data every minute. You may need to manually refresh the page to see the latest status.
FAQ
Are there notifications when the free quota runs out?
Yes. The system sends notifications when your remaining free quota drops to 20% or fully runs out. You will receive notifications via internal messages and email.
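If you also want to track these thresholds in your own tooling, the check can be sketched as below. This assumes that "drops to 20%" means at or below 20% of the total quota; the function name and return labels are hypothetical and do not correspond to any Model Studio API.

```python
from typing import Optional


def notification_state(remaining: int, total: int) -> Optional[str]:
    """Classify remaining quota against the 20% and depletion thresholds."""
    if total <= 0:
        raise ValueError("total quota must be positive")
    if remaining <= 0:
        return "depleted"
    # Integer comparison avoids floating-point rounding at exactly 20%.
    if remaining * 5 <= total:
        return "low"
    return None


print(notification_state(24_098, 1_000_000))   # low
print(notification_state(700_000, 1_000_000))  # None
```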
What happens when the free quota is used up?
If Free Quota Only is not enabled, calls continue and excess tokens are billed according to the Models pricing. Charges are deducted from your account balance and may put the account into overdue status.
Overdue status blocks all calls, even with remaining quota.
Before calling models, check quota and set up budget management.
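The rules on this page for a single real-time call can be summarized in a small decision function. This is a simplified sketch, not the billing system's actual logic: the function name and outcome labels are hypothetical, and it ignores the case where one call only partly fits in the remaining quota.

```python
def call_outcome(overdue: bool, remaining_quota: int, free_quota_only: bool) -> str:
    """Return what happens to a call for a model in the Singapore region."""
    if overdue:
        # Overdue status blocks all calls, even with remaining quota.
        return "blocked_overdue"
    if remaining_quota > 0:
        # Real-time calls automatically use free quota first.
        return "free"
    if free_quota_only:
        # The call fails with AllocationQuota.FreeTierOnly.
        return "blocked_free_tier_only"
    # Default behavior: the call continues and is billed.
    return "billed"


print(call_outcome(overdue=False, remaining_quota=0, free_quota_only=False))  # billed
```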
How can I view free quota consumption records or bills?
Consumption records are generated a few minutes after a model invocation ends. To query them, follow these steps:
On the Bill Details page, select the billing month. Then, set Product Name to Alibaba Cloud Model Studio. Click Search.
Click the icon in the upper-right corner of the bill list, find Usage Details, select Deduct Charge Usage, and click OK.
Find the bill item where Line Item Type is Free Quotas. Deduct Charge Usage indicates the usage deducted by the free quota.
Why am I being charged?
Possible reasons:
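You used a model that has no free quota remaining. Each model has its own quota. For example, qwen-max and qwen-max-latest have separate quotas, so remaining quota on one does not cover calls to the other.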
Free quota doesn't cover OpenAI compatible - Batch (file input) fees.
Free quota data in the console updates every minute. Manually refresh the page to view the latest data. Otherwise, the console might still display a remaining free quota for the model, even if the quota has already been exhausted.
To confirm billing, see How can I check which model incurred charges? and How can I view model call records?.
How can I check which model incurred charges?
Several minutes after calling a model, on the Bill Details page, select Billing Month, set Commodity Name to Model Studio Foundation Model Inference, and click Search. View charged models in the Instance ID column.

How can I view model call records?
One hour after you call a model, go to the Monitoring (Singapore or Beijing) page. Set the query conditions, such as the time range and workspace. Then, in the Models area, find the target model and click Monitor in the Actions column to view the model's call statistics. For more information, see the Monitoring document.
Data is updated hourly. During peak periods, the data may be delayed by about an hour.

How can I avoid unexpected charges?
After the quota is exhausted, charges are deducted from your balance. To reduce unexpected charges:
Go to the API-Key (Singapore) or API-Key (Beijing) page and delete all API keys to prevent further calls and charges.

Set a spending limit alert to receive email notifications when monthly spending exceeds the threshold.

Why did my call fail even though I have remaining quota?
An overdue balance blocks all calls, even with remaining quota.
Why can't I see my free quota and its validity period?
If the quota column shows No free quota or the Free Quota area is missing, the quota has expired.
The Beijing region does not offer a free quota.