We offer a free quota for each model to new users who activate Alibaba Cloud Model Studio for the first time.
Rules
Validity period
The validity period of the free quota is usually 30 to 180 days, starting from the activation date of Model Studio. After the free quota expires or is used up, using model inference services incurs charges.
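The validity rule can be sketched as a small date calculation. This is only an illustration: the function names, the 90-day period, and the choice to treat the last day as still valid are assumptions, not documented behavior. The actual validity period of each model is shown in the console.

```python
from datetime import date, timedelta


def quota_expiry(activation: date, validity_days: int) -> date:
    """Return the date on which the free quota stops being valid."""
    return activation + timedelta(days=validity_days)


def quota_is_usable(today: date, activation: date,
                    validity_days: int, remaining_tokens: int) -> bool:
    """The quota is usable only while it is both unexpired and not used up.

    Assumption: the expiry date itself still counts as valid.
    """
    not_expired = today <= quota_expiry(activation, validity_days)
    return not_expired and remaining_tokens > 0


# Hypothetical example: Model Studio activated on 2025-01-01, 90-day validity.
activation_date = date(2025, 1, 1)
print(quota_expiry(activation_date, 90))
print(quota_is_usable(date(2025, 2, 1), activation_date, 90, 500_000))
```

Either condition failing (expired, or no tokens left) means later calls are billed.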
Validity scope
The free quota can only offset fees incurred by model inference (model calling). It cannot offset fees incurred by batch calling and context cache.
Notes
The Alibaba Cloud account and its RAM users share the free quota.
For example: The total free quota for qwen-max is 1 million tokens. If the Alibaba Cloud account consumes 100,000 tokens and a RAM user consumes 200,000 tokens, the remaining free quota for qwen-max is 700,000 tokens.
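The arithmetic in this example can be written out as a short sketch. The function name and the dictionary keys are illustrative only. The point is that consumption by the account and by every RAM user is subtracted from the same per-model total.

```python
def remaining_shared_quota(total_tokens: int, consumed: dict[str, int]) -> int:
    """Tokens left in a per-model quota shared by an account and its RAM users."""
    used = sum(consumed.values())
    # The quota cannot go below zero; usage beyond it is billed instead.
    return max(total_tokens - used, 0)


# Figures from the qwen-max example above.
consumption = {
    "alibaba_cloud_account": 100_000,
    "ram_user": 200_000,
}
print(remaining_shared_quota(1_000_000, consumption))  # 700000
```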
Claim free quota
When you activate Model Studio for the first time, you automatically get the free quotas for various models.
View remaining quota
(For first-time users) You must activate Model Studio before you can view the quota details. Go to the Model Studio console and click Activate Now.
After activating Model Studio, click the target model on the Models page.
As shown in the figure below, 24,098/1,000,000 indicates a remaining quota of 24,098 tokens out of a total of 1,000,000 tokens.
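If you copy this value out of the console, the remaining/total format is easy to read programmatically. The sketch below is a hypothetical helper for that display string, not part of any Model Studio API.

```python
def parse_quota_display(display: str) -> tuple[int, int]:
    """Split a 'remaining/total' string such as '24,098/1,000,000' into integers."""
    remaining_text, total_text = display.split("/")
    remaining = int(remaining_text.replace(",", "").strip())
    total = int(total_text.replace(",", "").strip())
    return remaining, total


remaining, total = parse_quota_display("24,098/1,000,000")
print(remaining, total)                      # 24098 1000000
print(f"{remaining / total:.1%} remaining")  # 2.4% remaining
```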
Use free quota
The free quota will be automatically deducted when you call the model in real time.
FAQ
Will I be notified when the free quota is used up?
No. Currently, there is no notification mechanism.
What happens when the free quota is used up?
Model calls that have already started will not be interrupted. Tokens exceeding the quota will be charged according to the input or output price detailed in Models. The charges will be automatically deducted from your Alibaba Cloud account as pay-as-you-go bills. This may result in overdue payments.
If your account has overdue payments, you cannot call other models even if they still have free quotas.
We recommend that you check the free quota for a model before using it, and manage your budget.
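The behavior described in this answer can be sketched as follows. This is a simplified model under stated assumptions: it treats input and output tokens as one pool, and the function name is hypothetical. The actual deduction order and the prices are determined by Model Studio and listed in Models.

```python
def apply_call(remaining_quota: int, tokens_used: int) -> tuple[int, int]:
    """Return (quota left after the call, tokens billed as pay-as-you-go).

    Assumption: the call is never interrupted; tokens beyond the quota
    are billed rather than rejected.
    """
    covered = min(remaining_quota, tokens_used)
    billable = tokens_used - covered
    return remaining_quota - covered, billable


# A call that uses 30,000 tokens when only 24,098 free tokens remain.
quota_left, billed_tokens = apply_call(24_098, 30_000)
print(quota_left, billed_tokens)  # 0 5902
```

Running this kind of estimate before a large job shows whether the job is likely to exceed the free quota.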
Why am I being charged?
Possible causes:
You are using a model whose free quota has run out. Free quotas are not shared between models. For example, qwen-max and qwen-max-latest each have their own free quota.
You are using batch calling or context cache. The free quota cannot offset these fees.
The free quota details displayed in the console are updated hourly. Therefore, even if the console shows remaining free quota, the quota may already be exhausted. Check the free quota status again later.
For more details, check which model incurred fees and review the call records.
How do I check which model incurred fees?
One hour after you use the model, go to Bill Details. Select a Billing Cycle and set Product Details to Model Studio Foundation Model Inference. Click Search and view the model that incurred fees in the Instance ID column.
How do I check model call records?
One hour after you use the model, go to Model Observation, find the target model in the Models list, and click Monitor to view the statistics. For more information, see Model observation.
The data is updated hourly and may be delayed by an hour or more during peak hours.
How do I avoid charges?
After the free quota runs out, fees are automatically deducted from your Alibaba Cloud account. However, you can take the following steps to manage risks:
Delete existing API keys. After you delete all API keys, you cannot call models from Model Studio, so no fees will be incurred.
Set spending alerts. When the monthly consumption amount exceeds the threshold, you will receive a notification.
Why did the call fail when I still have free quota?
Check whether your Alibaba Cloud account has overdue payments. If it does, you cannot call any model, even one that still has free quota.