Alibaba Cloud Model Studio offers a wide variety of models. This topic describes all supported models in Model Studio.
Flagship model
Flagship models |
Best inference performance |
Balanced performance, speed and cost |
Fast speed and low cost |
Maximum context (Tokens) | 32,768 | 131,072 | 1,008,192 |
Minimum input price (Million tokens) | $1.6 | $0.4 | $0.05 |
Minimum output price (Million tokens) | $6.4 | $1.2 | $0.2 |
Model overview
Category | Model | Description |
Text generation | ||
Video generation | Generates video based on a single sentence, showcasing a wide range of artistic styles and cinematic-quality visuals | |
| ||
Embedding | Converts text into numerical representations, suitable for search, clustering, recommendation, and classification tasks. |
Text generation-Qwen
The commercial models of the Qwen series, boasts the latest capabilities and enhancements over its open source counterpart.
QwQ
QwQ reasoning model, trained based on Qwen2.5, has made significant improvements in reasoning capabilities by reinforcement learning. Its performance against core mathematic and coding metrics (AIME 24/25, LiveCodeBench) and general metrics (IFEval, LiveBench, etc.) have reached the level of DeepSeek-R1. Usage instructions
Name | Version | Context | Maximum input | Maximum CoT | Maximum response | Input price | Output price | Free quota |
(Tokens) | (Million tokens) | |||||||
qwq-plus | Stable | 131,072 | 98,304 | 32,768 | 8,192 | $0.8 | $2.4 | 1 million tokens Valid for 180 days after activation |
Qwen-Max
Qwen-Max provides the best inference performance among Qwen models, especially for complex and multi-step tasks. Usage instructions | API reference | Try online
Name | Version | Context | Maximum input | Maximum output | Input price | Output price | Free quota |
(Tokens) | (Million tokens) | ||||||
qwen-max Also qwen-max-2025-01-25 | Stable | 32,768 | 30,720 | 8,192 | $1.6 Batch: Half price | $6.4 Batch: Half price | 1 million tokens each Valid for 180 days after activation |
qwen-max-latest Always the latest snapshot | Latest | $1.6 | $6.4 | ||||
qwen-max-2025-01-25 Also qwen-max-0125 or Qwen2.5-Max | Snapshot |
Qwen-Plus
Qwen-Plus provides a balanced combination of performance, speed, and cost, ideal for moderately complex tasks. Usage instructions | API reference | Try online | Deep thinking
Name | Version | Context | Maximum input | Maximum output | Input price | Output price | Free quota |
(Tokens) | (Million tokens) | ||||||
qwen-plus Also qwen-plus-2025-01-25 | Stable | 131,072 | 129,024 | 8,192 | $0.4 Batch: Half price | $1.2 Batch: Half price | 1 million tokens each Valid for 180 days after activation |
qwen-plus-latest Always the latest snapshot | Latest | 16,384 CoT: 38,912 | $0.4 | Thinking: $8 Non-Thinking: $1.2 | |||
qwen-plus-2025-04-28 Also qwen-plus-0428, Qwen3 | Snapshot | ||||||
qwen-plus-2025-01-25 Also qwen-plus-0125 | Snapshot | 8,192 | $1.2 |
The latest qwen-plus-2025-04-28 model is capable of responding in both thinking and non-thinking modes, allowing you to switch between the two using the enable_thinking
parameter. In addition to this, the model's capabilities have been significantly enhanced:
Reasoning capability: The model has significantly outperformed QwQ and non-reasoning models of the same size in evaluations of mathematics, coding, and logical reasoning, reaching SOTA performance at its size.
Human preference following: Its abilities in creative writing, role-playing, multi-turn conversation, and instruction following have greatly improved, surpassing general capabilities of models of similar size.
Agent capability: The model achieves industry-leading levels in both thinking and non-thinking modes, enabling precise external tool invocation.
Multilingual capability: The model supports over 100 languages and dialects, with marked improvements in multilingual translation, instruction comprehension, and common sense reasoning abilities.
Response format fixes: Previous issues with response formats in earlier versions, such as anomalous Markdown, mid-text truncation, and incorrect boxed outputs, have been fixed.
Qwen-Turbo
Qwen-Turbo provides fast speed and low cost, suitable for simple tasks. Usage instructions | API reference | Try online | Deep thinking
Name | Version | Context | Maximum input | Maximum output | Input price | Output price | Free quota |
(Tokens) | (Million tokens) | ||||||
qwen-turbo Also qwen-turbo-2024-11-01 | Stable | 1,008,192 | 1,000,000 | 8,192 | $0.05 Batch: Half price | $0.2 Batch: Half price | 1 million tokens each Valid for 180 days after activation |
qwen-turbo-latest Always the latest snapshot | Latest | Non-Thinking 1,000,000 Thinking 131,072 | Thinking 129,024 Non-Thinking 1,000,000 | 16,384 CoT: 38,912 | $0.05 | Thinking: $1 Non-Thinking: $0.2 | |
qwen-turbo-2025-04-28 Also qwen-turbo-0428, Qwen3 | Snapshot | ||||||
qwen-turbo-2024-11-01 Also qwen-turbo-1101 | 1,008,192 | 1,000,000 | 8,192 | $0.2 |
The latest qwen-turbo-2025-04-28 model is capable of responding in both thinking and non-thinking modes, allowing you to switch between the two using the enable_thinking
parameter. In addition to this, the model's capabilities have been significantly enhanced:
Reasoning capability: The model has significantly outperformed QwQ and non-reasoning models of the same size in evaluations of mathematics, coding, and logical reasoning, reaching SOTA performance at its size.
Human preference following: Its abilities in creative writing, role-playing, multi-turn conversation, and instruction following have greatly improved, surpassing general capabilities of models of similar size.
Agent capability: The model achieves industry-leading levels in both thinking and non-thinking modes, enabling precise external tool invocation.
Multilingual capability: The model supports over 100 languages and dialects, with marked improvements in multilingual translation, instruction comprehension, and common sense reasoning abilities.
Response format fixes: Previous issues with response formats in earlier versions, such as anomalous Markdown, mid-text truncation, and incorrect boxed outputs, have been fixed.
QVQ
QVQ is a visual reasoning model that supports visual input and chain-of-thought output. It shows stronger capabilities in mathematics, coding, visual analysis, creativity, and general tasks. Usage instructions
Name | Version | Context | Maximum input | Maximum CoT | Maximum response | Input price | Output price | Free quota |
(Tokens) | (Million tokens) | |||||||
qvq-max Also qvq-max-2025-03-25 | Stable | 122,880 | 98,304 Up to 163,84 per image | 16,384 | 8,192 | Time-limited free trial After the free quota runs out, you cannot access this model. Please stay tuned for updates. | 1 million tokens each Valid for 180 days after activation | |
qvq-max-latest Always the latest snapshot | Latest | |||||||
qvq-max-2025-03-25 Also qvq-max-0325 | Snapshot |
Qwen-VL
Qwen-VL is a text generation model that can understand and process images. The model performs OCR operations and provides further functionalities, such as summarizing and reasoning. For example, it can extract product attributes from photos, and solving problems from images. Usage instructions | API reference | Try online
Qwen-VL is billed based on the total number of input and output tokens.
Image token calculation rule: Every 28 × 28 pixels count as 1 token. Each image converts to at least 4 tokens. For more information, see Visual understanding.
Name | Version | Context | Maximum input | Maximum output | Input price | Output price | Free quota |
(Tokens) | (Million tokens) | ||||||
qwen-vl-max Enhanced capabilities of visual reasoning and instruction following compared with qwen-vl-plus. Best for complex tasks. | Stable | 32,768 | 30,720 Up to 16,384 tokens per image | 2,048 | $0.8 | $3.2 | 1 million tokens each Valid for 180 days after activation |
qwen-vl-plus Enhanced detail and text recognition capabilities, supporting images with over one million pixel resolution and any aspect ratio. Exceptional performance for various visual tasks. | Stable | $0.21 | $0.63 |
Text generation - Qwen - open source
In the model name, 'xxb' indicates the parameter scale. For example, 'qwen2-72b-instruct' has 72 billion parameters.
Model Studio facilitates the use of open source Qwen models without the need for local deployment. Qwen3 and Qwen2.5 are most recommended among the open source models.
Qwen3
Qwen3 is capable of responding in both thinking and non-thinking modes, allowing you to switch between the two using the enable_thinking
parameter. In addition to this, the model's capabilities have been significantly enhanced:
Reasoning capability: The model has significantly outperformed QwQ and non-reasoning models of the same size in evaluations of mathematics, coding, and logical reasoning, reaching SOTA performance at its size.
Human preference following: Its abilities in creative writing, role-playing, multi-turn conversation, and instruction following have greatly improved, surpassing general capabilities of models of similar size.
Agent capability: The model achieves industry-leading levels in both thinking and non-thinking modes, enabling precise external tool invocation.
Multilingual capability: The model supports over 100 languages and dialects, with marked improvements in multilingual translation, instruction comprehension, and common sense reasoning abilities.
Response format fixes: Previous issues with response formats in earlier versions, such as anomalous Markdown, mid-text truncation, and incorrect boxed outputs, have been fixed.
Open source Qwen3 does not support non-stream output in either thinking or non-thinking mode.
Open source Qwen3 is charged at the non-thinking price if it does not output the thinking process under the thinking mode.
Thinking mode | Non-thinking mode | Usage instructions
Name | Mode | Contect | Maximum input | Maximum CoT | Maximum response | Input price | Output price | Free quota |
(Tokens) | (Million tokens) | |||||||
qwen3-235b-a22b | Non-Thinking | 131,072 | 129,024 | - | 16,384 | $0.7 | $2.8 | 1 million tokens each Valid for 180 days after activation |
Thinking | 38,912 | $8.4 | ||||||
qwen3-32b | Non-Thinking | - | $0.7 | $2.8 | ||||
Thinking | 38,912 | $8.4 | ||||||
qwen3-30b-a3b | Non-Thinking | - | $0.2 | $0.8 | ||||
Thinking | 38,912 | $2.4 | ||||||
qwen3-14b | Non-Thinking | - | 8,192 | $0.35 | $1.4 | |||
Thinking | 38,912 | $4.2 | ||||||
qwen3-8b | Non-Thinking | - | $0.18 | $0.7 | ||||
Thinking | 38,912 | $2.1 | ||||||
qwen3-4b | Non-Thinking | - | $0.11 | $0.42 | ||||
Thinking | 38,912 | $1.26 | ||||||
qwen3-1.7b | Non-Thinking | 32,768 | 30,720 | - | $0.42 | |||
Thinking | 28,672 | 30,720 (CoT+Response) | $1.26 | |||||
qwen3-0.6b | Non-Thinking | 30,720 | - | $0.42 | ||||
Thinking | 28,672 | 30,720 (CoT+Response) | $1.26 |
Qwen2.5
Qwen2.5 is the latest series of the Qwen LLM. For Qwen2.5, we have launched a series of base and instruct models with parameter sizes ranging from 7 billion to 72 billion. Qwen2.5 has made the following improvements over Qwen2:
Qwen2.5 is pre-trained on our latest large-scale dataset containing 18 trillion tokens.
Thanks to our expert models in specific fields, Qwen2.5 has significantly increased knowledge and greatly improved coding and maths capabilities.
Qwen2.5 has shown significant improvements in following instructions, generating long texts (over 8K tokens), understanding structured data (such as tables), and generating structured outputs (especially JSON). It supports more diversified system prompts, enhancing its role-playing and conditional setting as a chatbot.
Qwen2.5 supports over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
Usage instructions | API reference | Try online
Name | Context | Maximum input | Maximum output | Input price | Output price |
(Tokens) | (Million tokens) | ||||
qwen2.5-14b-instruct-1m | 1,008,192 | 1,000,000 | 8,192 | Time-limited free trial | |
qwen2.5-7b-instruct-1m | |||||
qwen2.5-72b-instruct | 131,072 | 129,024 | |||
qwen2.5-32b-instruct | |||||
qwen2.5-14b-instruct | |||||
qwen2.5-7b-instruct |
Qwen2
The open-source Qwen2 models. Usage instructions | API reference | Try online
Name | Context | Maximum input | Maximum output | Input price | Output price |
(Tokens) | (Million tokens) | ||||
qwen2-72b-instruct Deprecated | 131,072 | 128,000 | 6,144 | Time-limited free trial | |
qwen2-57b-a14b-instruct Deprecated | 65,536 | 63,488 | |||
qwen2-7b-instruct Deprecated | 131,072 | 128,000 |
Qwen1.5
The open-source Qwen1.5 models. Usage instructions | API reference | Try online
Name | Context | Maximum input | Maximum output | Input price | Output price |
(Tokens) | (Million tokens) | ||||
qwen1.5-110b-chat Deprecated | 8,000 | 6,000 | 2,000 | Time-limited free trial | |
qwen1.5-72b-chat Deprecated | |||||
qwen1.5-32b-chat Deprecated | |||||
qwen1.5-14b-chat Deprecated | |||||
qwen1.5-7b-chat Deprecated |
Qwen-Omni
Qwen-Omni is a omni-modal understanding and generation model trained on Qwen2.5. It can understand text, image, audio, and video swiftly. It can also generate text and voice simultaneously in stream. Usage instructions | API reference
Name | Context | Maximum input | Maximum output | Free quota |
(Tokens) | ||||
qwen2.5-omni-7b | 32,768 | 30,720 | 2,048 | 1 million tokens (regardless of modality) Valid for 180 days after activation |
After the free quota runs out, you cannot access qwen2.5-omni-7b. Please stay tuned for updates.
Qwen-VL - open source
The open-source version of Qwen-VL. Usage instructions | API reference
Qwen2.5-VL has made the following improvements over Qwen2-VL:
Richer perception of the world: Qwen2.5-VL is good at recognizing common objects such as flowers, birds, fish, and insects, as well as analyzing text, charts, icons, graphics, and layouts within images.
Long video understanding: Qwen2.5-VL can understand videos of up to 10 minutes. It can also pinpoint video segments to capture events.
Visual locating: Qwen2.5-VL can accurately locate objects in images by generating bounding boxes (coordinates for the top-left and bottom-right corners) or points (coordinates for the center of the bounding box). It can provide stable JSON outputs for these coordinates.
Structured output: Qwen2.5-VL supports structured output for data such as invoices, forms, and tables, suitable in finance, business, among other scenarios.
Name | Context | Maximum input | Maximum output | Input price | Output price | Free quota |
(Tokens) | (Million tokens) | |||||
qwen2.5-vl-72b-instruct | 131,072 | 129,024 Up to 16,384 per image | 8,192 | Time-limited free trial | ||
qwen2.5-vl-32b-instruct | Time-limited free trial After the free quota runs out, you cannot access the model. Stay tuned for future updates. | 1 million tokens Valid for 180 days after activation | ||||
qwen2.5-vl-7b-instruct | Time-limited free trial | |||||
qwen2.5-vl-3b-instruct |
Video generation - Wan
Text-to-video
Wan Text-to-video can generate a video based on a single sentence, showcasing a wide range of artistic styles and cinematic-quality visuals. API reference | Try online
Name | Description | Unit price | Free quota |
wan2.1-t2v-turbo | Faster generation with balanced performance. | $0.036/second | 200 seconds for each Valid for 180 days after activation |
wan2.1-t2v-plus | Richer details and more textured visuals. | $0.10/second |
Sample input | Output video |
Prompt: A kitten running in the moonlight |
Image-to-video: first frame
Wan Image-to-video takes an input image as the first frame, then generates the subsequent video content based on a prompt. The resulting video features a wide range of artistic styles and cinematic-quality visuals. API reference | Try online
Name | Description | Unit price | Free quota |
wan2.1-i2v-turbo | Faster generation, taking only one-third of the time of the Plus model, offering better cost-effectiveness. | $0.036/second | 200 seconds for each Valid for 180 days after activation |
wan2.1-i2v-plus | Rich details and enhanced texture. | $0.10/second |
Sample input | Output video |
Prompt: A cat running on the grass. Input image: | Output video: Takes the input image as the first frame, then generates the subsequent video content based on a prompt. Model: wanx2.1-i2v-turbo |
Image-to-video: first and last frame
Wan Image-to-video can generate a smooth and fluid dynamic video based on the first and last frame images along with a prompt. The video showcases a wide range of artistic styles and cinematic-quality visuals. API reference | Try online
Name | Unit price | Free quota |
wan2.1-kf2v-plus | $0.10/second | 200 seconds Valid for 180 days after activation |
Sample input | Output video | ||
First frame | Last frame | Prompt | |
Realistic style, a black kitten curiously looking at the sky, the camera gradually rises from eye level, finally looking down at the kitten's curious eyes. |
Text embedding
Converts text into numerical representations, suitable for search, clustering, recommendation, and classification tasks. Billed based on the number of input tokens. API reference
Name | Vector dimensions | Maximum rows | Maximum tokens per row | Supported languages | Price (Million input tokens) | Free quota |
text-embedding-v3 | 1,024 (default), 768 or 512 | 10 | 8,192 | Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, Russian, and more than 50 other languages | $0.07 | 500,000 tokens Valid for 180 days after activation |