This topic describes the billing rules and pricing for model training and model deployment on Alibaba Cloud Model Studio.
Training billing
Text generation models – Qwen
For the training workflow, see Qwen model fine-tuning. After training completes, deploy the new model before evaluating or calling it.
Method | Billed by training tokens |
Formula | Model training fee = (Total tokens in training data + Total tokens in mixed training data) × Number of epochs × Training unit price (minimum billing unit: 1 token). The estimated training fee is shown at the bottom of the model training console. Click Computing Details to view the total number of training tokens, number of epochs, and training unit price. |
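The Qwen training formula can be sketched as a small calculation. This is a minimal illustration, not an official calculator: the function name `estimate_qwen_training_fee` is hypothetical, the unit price in the example is made up, and the sketch assumes the training unit price is expressed per token (convert first if the console quotes it per 1,000 tokens).

```python
from decimal import Decimal


def estimate_qwen_training_fee(training_tokens, mixed_tokens, epochs, unit_price_per_token):
    """Estimate the fee as (training + mixed tokens) x epochs x unit price.

    Assumption: unit_price_per_token is a per-token price passed as a string
    or Decimal so that no floating-point rounding is introduced.
    """
    if training_tokens < 0 or mixed_tokens < 0 or epochs < 1:
        raise ValueError("token counts must be >= 0 and epochs must be >= 1")
    billed_tokens = (training_tokens + mixed_tokens) * epochs
    return billed_tokens * Decimal(str(unit_price_per_token))


# Hypothetical inputs: 1,000,000 training tokens, 200,000 mixed tokens,
# 3 epochs, and an invented price of 0.000002 per token.
fee = estimate_qwen_training_fee(1_000_000, 200_000, 3, "0.000002")
print(fee)  # 7.200000
```

The console's Computing Details view remains the authoritative source for the token count, epoch count, and unit price used in the real bill.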
Video generation models – Wan
For the training workflow, see Model fine-tuning. After training completes, deploy the new model before calling it.
Method | Billed by training tokens |
Formula | Model training fee = Total training tokens × Training unit price (Billing unit: per 1,000 tokens) |
Model | Code | Training price (per 1K tokens) |
Image-to-video (first frame) | wan2.6-i2v | $0.08 |
Image-to-video (first frame) | wan2.5-i2v-preview | $0.05 |
Image-to-video (first frame) | wan2.2-i2v-flash | $0.03 |
Image-to-video (first and last frames) | wan2.2-kf2v-flash | $0.03 |
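As a rough illustration of the Wan formula, the sketch below multiplies the token count (in thousands) by the per-1K price listed above. The names `WAN_TRAINING_PRICE_PER_1K` and `estimate_wan_training_fee` are hypothetical, and the sketch assumes fractional thousands are billed proportionally; the source does not say whether usage is rounded up to a full 1,000 tokens.

```python
from decimal import Decimal

# Training prices per 1,000 tokens, copied from the table above.
WAN_TRAINING_PRICE_PER_1K = {
    "wan2.6-i2v": Decimal("0.08"),
    "wan2.5-i2v-preview": Decimal("0.05"),
    "wan2.2-i2v-flash": Decimal("0.03"),
    "wan2.2-kf2v-flash": Decimal("0.03"),
}


def estimate_wan_training_fee(model_code, total_training_tokens):
    """Estimate the fee as total training tokens x price per 1K tokens.

    Assumption: no rounding to whole 1,000-token units is applied.
    """
    price = WAN_TRAINING_PRICE_PER_1K[model_code]
    return Decimal(total_training_tokens) / Decimal(1000) * price


# Example: 250,000 training tokens on wan2.6-i2v.
wan_fee = estimate_wan_training_fee("wan2.6-i2v", 250_000)
print(wan_fee)  # 20.00
```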
Deployment billing
Text generation models – Qwen
Billing by usage duration (PTU)
Cost = Usage duration × (Input Tokens Per Minute (TPM) unit price × Input TPM + Output TPM unit price × Output TPM)
- Subscription orders take effect immediately and expire at 23:59 on day N (orders placed after 22:00 extend to day N+1).
- After a subscription order expires, the service stops after a 2-hour grace period. Resources are retained for 14 hours before release.
- Subscription orders cannot be terminated early.
- For pay-as-you-go, if your account has an overdue payment, deployed resources remain active and continue to be billed for 24 hours before they are automatically released.
When the input exceeds the max token limit or the purchased TPM quota, calls automatically switch to pay-as-you-go. Performance may degrade, workspace rate limits apply, and costs follow standard pay-as-you-go rates.
- In this case, the API response header includes: x-dashscope-ptu-overflow:true.
- View TPM statistics: Monitoring (Beijing).
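A client can watch for the overflow header described above to detect calls that fell back to pay-as-you-go. The following is a minimal sketch: the helper name `is_ptu_overflow` is hypothetical, and it assumes the response headers are available as a mapping of header names to values, which depends on the HTTP client in use. Header names are matched case-insensitively, as is standard for HTTP.

```python
def is_ptu_overflow(headers):
    """Return True if the response headers flag a PTU overflow.

    `headers` is assumed to be a dict-like mapping of header names to values.
    """
    for name, value in headers.items():
        if name.lower() == "x-dashscope-ptu-overflow":
            return str(value).strip().lower() == "true"
    # Header absent: the call stayed within the purchased quota.
    return False


# Example with hand-written header mappings (not real API responses).
overflowed = is_ptu_overflow({"X-DashScope-PTU-Overflow": "true"})
normal = is_ptu_overflow({"Content-Type": "application/json"})
print(overflowed, normal)  # True False
```

Logging this flag per request makes it easier to see how often traffic exceeds the purchased quota before deciding whether to buy more TPM.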
Model | Type | Context window (input + output tokens) | Max input tokens | Pay-as-you-go, hourly: input (per 10k TPM) | Pay-as-you-go, hourly: output (per 1k TPM) | Subscription, daily: input (per 10k TPM) | Subscription, daily: output (per 1k TPM) |
Qwen3-Max-2025-09-23 | Instruct | 128,000 | 128,000 | $1.11 | $0.45 | $13.32 | $5.40 |
Qwen-Plus-2025-12-01 | Instruct | 128,000 | 128,000 | $0.28 | $0.07 | $3.36 | $0.84 |
Qwen-Plus-2025-12-01 | Thinking | 128,000 | 128,000 | $0.28 | $0.28 | $3.36 | $3.36 |
Qwen-Flash-2025-07-28 | Instruct/Thinking | 128,000 | 128,000 | $0.06 | $0.06 | $0.72 | $0.72 |
Qwen3-VL-Plus-2025-09-23 | Instruct/Thinking | 128,000 | 128,000 | $0.35 | $0.35 | $4.20 | $4.20 |
DeepSeek-v3.2 | Instruct/Thinking | 128,000 | 64,000 | $1.04 | $0.16 | $12.48 | $1.92 |
Model types:
- Instruct: The model runs in non-thinking mode after deployment.
- Thinking: The model runs in thinking mode after deployment.
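To make the PTU cost formula concrete, the sketch below applies it to the Qwen3-Max-2025-09-23 prices listed above. The function name `ptu_cost` is hypothetical. The sketch assumes that usage duration is counted in hours for pay-as-you-go and in days for subscription (following the hourly and daily price columns), and that input TPM is priced in units of 10,000 and output TPM in units of 1,000. How partial hours or partial units are rounded is not stated in this topic, so no rounding is applied.

```python
from decimal import Decimal


def ptu_cost(duration, input_tpm, output_tpm, input_price_per_10k, output_price_per_1k):
    """Cost = duration x (input price x input TPM units + output price x output TPM units).

    Prices are passed as strings or Decimals to avoid floating-point error.
    """
    input_units = Decimal(input_tpm) / Decimal(10_000)   # priced per 10k input TPM
    output_units = Decimal(output_tpm) / Decimal(1_000)  # priced per 1k output TPM
    per_period = (Decimal(input_price_per_10k) * input_units
                  + Decimal(output_price_per_1k) * output_units)
    return Decimal(duration) * per_period


# Qwen3-Max-2025-09-23 with 20,000 input TPM and 3,000 output TPM.
hourly_total = ptu_cost(5, 20_000, 3_000, "1.11", "0.45")   # 5 hours, pay-as-you-go
daily_total = ptu_cost(2, 20_000, 3_000, "13.32", "5.40")   # 2 days, subscription
print(hourly_total, daily_total)  # 17.85 85.68
```

In this example the pay-as-you-go rate works out to $3.57 per hour and the subscription rate to $42.84 per day, so the subscription is cheaper only when the deployment runs for roughly 12 hours or more per day.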
FAQ
Q: When does billing for model deployment start?
A: Billing starts when the model status changes to Running. No charges apply while the status is Deploying, Overdue Payment, or Deployment Failed.
Q: Am I charged if I cancel a training job?
A: Yes. If you cancel training manually, you are charged for all tokens processed before cancellation. Training jobs interrupted by system errors or other non-user causes are not charged.
Q: How do I view invocation statistics for a deployed model?
A: Visit the Model Monitoring (Singapore), Model Monitoring (Virginia), or Model Monitoring (Beijing) page.