Alibaba Cloud Model Studio offers a wide variety of models. This topic describes all supported models in Model Studio.
Flagship models
Flagship models | Qwen-Max | Qwen-Plus | Qwen-Turbo |
Model name for API calls (Stable version) | qwen-max | qwen-plus | qwen-turbo |
Maximum context (tokens) | 32,768 | 131,072 | 1,000,000 |
Input unit price (per 1,000 tokens) | $0.0016 | $0.0004 | $0.00005 |
Output unit price (per 1,000 tokens) | $0.0064 | $0.0012 | $0.0002 |
Qwen-Max provides the best inference performance among Qwen models, especially for complex tasks.
Qwen-Plus provides a balanced combination of performance, speed, and cost.
Qwen-Turbo is fast and inexpensive, which makes it suitable for simple tasks.
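To make the per-1,000-token prices easier to compare, the following minimal sketch estimates the charge for a single request. The `PRICES` table and the `estimate_cost` helper are hypothetical names introduced for illustration; the prices are copied from the flagship table, and the sketch ignores any free quota.

```python
from decimal import Decimal

# USD prices per 1,000 tokens as (input, output), copied from the flagship table.
PRICES = {
    "qwen-max": (Decimal("0.0016"), Decimal("0.0064")),
    "qwen-plus": (Decimal("0.0004"), Decimal("0.0012")),
    "qwen-turbo": (Decimal("0.00005"), Decimal("0.0002")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD charge for one request (hypothetical helper).

    Free quota is not taken into account.
    """
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / 1000  # prices are quoted per 1,000 tokens
```

For example, `estimate_cost("qwen-plus", 10_000, 2_000)` evaluates to 0.0064 USD. `Decimal` is used instead of `float` so that very small prices are not affected by binary rounding.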
For more details and models, see the table below.
Model overview
Category | Model | Description |
Text generation | ||
Embedding | Text embedding | Converts text into numerical representations, suitable for search, clustering, recommendation, and classification tasks. |
Text generation - Qwen
The commercial models of the Qwen series offer the latest capabilities and enhancements over their open source counterparts.
Qwen-Max
Qwen-Max provides the best inference performance among Qwen models, especially for complex and multi-step tasks.
Name | Version | Context (tokens) | Maximum input (tokens) | Maximum output (tokens) | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) | Free quota |
qwen-max | Stable | 32,768 | 30,720 | 8,192 | $0.0016 | $0.0064 | 1 million tokens |
qwen-max-latest | Latest | 32,768 | 30,720 | 8,192 | $0.0016 | $0.0064 | 1 million tokens |
qwen-max-2025-01-25 (also qwen-max-0125 or Qwen2.5-Max) | Snapshot | 32,768 | 30,720 | 8,192 | $0.0016 | $0.0064 | 1 million tokens |
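The context, maximum input, and maximum output limits interact: for qwen-max, the input and output limits add up to more than the context window. The following sketch shows one way to work out how much output room a prompt leaves. It assumes, which the table does not state, that input and output together must fit in the context window. `Limits` and `output_budget` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    context: int     # total context window, in tokens
    max_input: int   # maximum input, in tokens
    max_output: int  # maximum output, in tokens


# Values copied from the qwen-max row of the table.
QWEN_MAX = Limits(context=32_768, max_input=30_720, max_output=8_192)


def output_budget(limits: Limits, input_tokens: int) -> int:
    """Return the output tokens still available for a prompt of this size.

    Assumption: input + output must not exceed the context window.
    Returns 0 when the prompt is larger than the input limit.
    """
    if input_tokens > limits.max_input:
        return 0
    return min(limits.max_output, limits.context - input_tokens)
```

Under that assumption, a 1,000-token prompt leaves the full 8,192 output tokens, while a prompt at the 30,720-token input limit leaves 2,048.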
Qwen-Plus
Qwen-Plus provides a balanced combination of performance, speed, and cost, ideal for moderately complex tasks.
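Name | Version | Context (tokens) | Maximum input (tokens) | Maximum output (tokens) | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) | Free quota |
qwen-plus | Stable | 131,072 | 129,024 | 8,192 | $0.0004 | $0.0012 | 1 million tokens |
qwen-plus-latest | Latest | 131,072 | 129,024 | 8,192 | $0.0004 | $0.0012 | 1 million tokens |
qwen-plus-2025-01-25 (also qwen-plus-0125) | Snapshot | 131,072 | 129,024 | 8,192 | $0.0004 | $0.0012 | 1 million tokens |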
Qwen-Turbo
Qwen-Turbo is fast and inexpensive, which makes it suitable for simple tasks.
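Name | Version | Context (tokens) | Maximum input (tokens) | Maximum output (tokens) | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) | Free quota |
qwen-turbo | Stable | 1,000,000 | 1,000,000 | 8,192 | $0.00005 | $0.0002 | 1 million tokens |
qwen-turbo-latest | Latest | 1,000,000 | 1,000,000 | 8,192 | $0.00005 | $0.0002 | 1 million tokens |
qwen-turbo-2024-11-01 (also qwen-turbo-1101) | Snapshot | 1,000,000 | 1,000,000 | 8,192 | $0.00005 | $0.0002 | 1 million tokens |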
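The Qwen-Max, Qwen-Plus, and Qwen-Turbo tables list three kinds of names: a Stable name, a "-latest" name, and a dated Snapshot name with a shorter alias. The sketch below maps the short aliases to the dated names and classifies a name by its suffix. `SNAPSHOT_ALIASES`, `canonical_name`, and `version_kind` are hypothetical helpers, the classification only applies to names from these three tables, and the `Qwen2.5-Max` alias is left out because the tables do not say whether it can be used as an API model name.

```python
import re

# Short aliases from the tables, mapped to the dated snapshot names.
SNAPSHOT_ALIASES = {
    "qwen-max-0125": "qwen-max-2025-01-25",
    "qwen-plus-0125": "qwen-plus-2025-01-25",
    "qwen-turbo-1101": "qwen-turbo-2024-11-01",
}

# A snapshot name ends with a date such as -2025-01-25.
_DATED_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def canonical_name(model: str) -> str:
    """Replace a short snapshot alias with its dated name; leave other names as-is."""
    return SNAPSHOT_ALIASES.get(model, model)


def version_kind(model: str) -> str:
    """Classify a commercial Qwen model name as Stable, Latest, or Snapshot."""
    name = canonical_name(model)
    if name.endswith("-latest"):
        return "Latest"
    if _DATED_SUFFIX.search(name):
        return "Snapshot"
    return "Stable"
```

Pinning a dated snapshot name is the usual choice when reproducible behavior matters more than receiving updates.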
Qwen-VL
Qwen-VL combines text generation with visual understanding. It can perform OCR as well as advanced summarization and reasoning tasks. For example, it can extract attributes from product images or solve problems based on images.
Rules for converting images to tokens: A 512×512 pixel image is roughly equivalent to 334 tokens, and images of other resolutions are converted proportionally. The smallest unit is a 28×28 pixel block, which corresponds to one token. If the height or width of an image is not a multiple of 28, it is rounded up to the nearest multiple. A single image requires at least 4 tokens.
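The conversion rules can be sketched in two ways, and the results differ slightly. The proportional rule (image area divided by one 28×28 unit) gives 334 tokens for a 512×512 image, while rounding each side up to a multiple of 28 first gives 19 × 19 = 361. The text does not say which figure is used for billing, so the sketch below shows both. The function names are hypothetical, and per-image caps are not modeled.

```python
import math

PATCH = 28      # side of the smallest unit, in pixels
MIN_TOKENS = 4  # minimum number of tokens for a single image


def tokens_by_area(width: int, height: int) -> int:
    """Proportional estimate: image area divided by the area of one unit."""
    return max(MIN_TOKENS, (width * height) // (PATCH * PATCH))


def tokens_by_grid(width: int, height: int) -> int:
    """Round each side up to a multiple of 28, then count the 28x28 units."""
    columns = math.ceil(width / PATCH)
    rows = math.ceil(height / PATCH)
    return max(MIN_TOKENS, columns * rows)
```

Treat both numbers as estimates; the token usage reported in the API response is the authoritative figure.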
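Name | Version | Context (tokens) | Maximum input (tokens) | Maximum output (tokens) | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) |
qwen-vl-max | Stable | 7,500 | 6,000 (up to 1,280 per image) | 1,500 | Time-limited free trial | Time-limited free trial |
qwen-vl-plus | Stable | 7,500 | 6,000 (up to 1,280 per image) | 1,500 | Time-limited free trial | Time-limited free trial |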
qwen-vl-max: Compared with qwen-vl-plus, this model further improves visual reasoning and instruction following, and delivers the best performance across a broader range of complex tasks.
qwen-vl-plus: This model significantly improves detail and text recognition. It supports images with resolutions above one million pixels and a wide range of aspect ratios, and delivers strong performance across a variety of visual tasks.
Text generation - Qwen - open source
In the model name, 'xxb' denotes the parameter scale. For instance, 'qwen2-72b-instruct' refers to a scale of 72 billion parameters.
Model Studio lets you use open source Qwen models without deploying them locally. Qwen2 is the most recommended of the open source models.
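The naming convention described above can be read mechanically. The following sketch extracts the parameter scale, in billions, from a model name by looking for the first hyphen-separated segment of the form `<number>b`. `parameter_scale_billion` is a hypothetical helper, not part of any SDK.

```python
import re
from typing import Optional

# A segment such as "72b" or "7b": a number followed by the letter b.
_SCALE = re.compile(r"(\d+(?:\.\d+)?)b")


def parameter_scale_billion(model: str) -> Optional[float]:
    """Return the parameter scale in billions, or None if the name has none."""
    for segment in model.split("-"):
        match = _SCALE.fullmatch(segment)
        if match:
            return float(match.group(1))
    return None
```

For example, `parameter_scale_billion("qwen2-72b-instruct")` returns `72.0`, and a name without a scale segment, such as `qwen-max`, returns `None`.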
Qwen2.5
Qwen2.5 is the latest series of the Qwen LLM. For Qwen2.5, we have launched a series of base and instruct models with parameter sizes ranging from 7 billion to 72 billion. Qwen2.5 has made the following improvements over Qwen2:
Qwen2.5 is pre-trained on our latest large-scale dataset containing 18 trillion tokens.
Thanks to our expert models in specific fields, Qwen2.5 has significantly more knowledge and much stronger coding and math capabilities.
Qwen2.5 has shown significant improvements in following instructions, generating long texts (over 8K tokens), understanding structured data (such as tables), and generating structured outputs (especially JSON). It handles a wider variety of system prompts, which improves role-playing and condition setting for chatbots.
Qwen2.5 supports over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
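Name | Context (tokens) | Maximum input (tokens) | Maximum output (tokens) | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) |
qwen2.5-14b-instruct-1m | 1,000,000 | 1,000,000 | 8,192 | Time-limited free trial | Time-limited free trial |
qwen2.5-7b-instruct-1m | 1,000,000 | 1,000,000 | 8,192 | Time-limited free trial | Time-limited free trial |
qwen2.5-72b-instruct | 131,072 | 129,024 | 8,192 | Time-limited free trial | Time-limited free trial |
qwen2.5-32b-instruct | 131,072 | 129,024 | 8,192 | Time-limited free trial | Time-limited free trial |
qwen2.5-14b-instruct | 131,072 | 129,024 | 8,192 | Time-limited free trial | Time-limited free trial |
qwen2.5-7b-instruct | 131,072 | 129,024 | 8,192 | Time-limited free trial | Time-limited free trial |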
Qwen2
Alibaba Cloud's open source Qwen2 model.
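Name | Context (tokens) | Maximum input (tokens) | Maximum output (tokens) | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) |
qwen2-72b-instruct | 131,072 | 128,000 | 6,144 | Time-limited free trial | Time-limited free trial |
qwen2-57b-a14b-instruct | 65,536 | 63,488 | 6,144 | Time-limited free trial | Time-limited free trial |
qwen2-7b-instruct | 131,072 | 128,000 | 6,144 | Time-limited free trial | Time-limited free trial |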
Qwen1.5
Alibaba Cloud's open source Qwen1.5 model.
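Name | Context (tokens) | Maximum input (tokens) | Maximum output (tokens) | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) |
qwen1.5-110b-chat | 8,000 | 6,000 | 2,000 | Time-limited free trial | Time-limited free trial |
qwen1.5-72b-chat | 8,000 | 6,000 | 2,000 | Time-limited free trial | Time-limited free trial |
qwen1.5-32b-chat | 8,000 | 6,000 | 2,000 | Time-limited free trial | Time-limited free trial |
qwen1.5-14b-chat | 8,000 | 6,000 | 2,000 | Time-limited free trial | Time-limited free trial |
qwen1.5-7b-chat | 8,000 | 6,000 | 2,000 | Time-limited free trial | Time-limited free trial |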
Qwen-VL - open source
The open source versions of the Qwen-VL models.
Qwen2.5-VL has made the following improvements over Qwen2-VL:
Significant improvements in capabilities related to instruction following, maths calculations, code generation, and structured outputs (JSON outputs).
Supports unified parsing of visual content such as text, charts, and layouts within images. Enhanced ability to precisely locate visual elements. Supports the representation of detection boxes (coordinates of the top-left and bottom-right corners of a rectangle) as well as point coordinates (the center point of the rectangle).
With powerful positioning and reasoning abilities, the model's agent capabilities have been greatly enhanced, allowing integration with devices like smartphones, computers, and robots to perform automated operations based on visual environments and text instructions.
Supports understanding of long video files (up to 10 minutes) with the ability to pinpoint events to the second, and can comprehend the sequence and speed of events.
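The two location formats mentioned above are related: a point coordinate is the center of the corresponding detection box. The sketch below converts a box given by its top-left and bottom-right corners into that center point. The `(x1, y1, x2, y2)` tuple layout and the `box_center` name are assumptions made for illustration; the exact output format of the model is not specified here.

```python
from typing import Tuple

# Assumed layout: (x1, y1) is the top-left corner, (x2, y2) the bottom-right.
Box = Tuple[float, float, float, float]


def box_center(box: Box) -> Tuple[float, float]:
    """Return the center point of a detection box."""
    x1, y1, x2, y2 = box
    if x2 < x1 or y2 < y1:
        raise ValueError("expected (x1, y1) top-left and (x2, y2) bottom-right")
    return ((x1 + x2) / 2, (y1 + y2) / 2)
```

For example, the box `(10, 20, 110, 220)` has the center point `(60.0, 120.0)`.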
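Name | Context (tokens) | Maximum input (tokens) | Maximum output (tokens) | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) |
qwen2.5-vl-72b-instruct | 131,072 | 129,024 (up to 16,384 per image) | 8,192 | Time-limited free trial | Time-limited free trial |
qwen2.5-vl-7b-instruct | 131,072 | 129,024 (up to 16,384 per image) | 8,192 | Time-limited free trial | Time-limited free trial |
qwen2.5-vl-3b-instruct | 131,072 | 129,024 (up to 16,384 per image) | 8,192 | Time-limited free trial | Time-limited free trial |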
Text embedding
Converts text into numerical representations, suitable for search, clustering, recommendation, and classification tasks. Billed based on the number of input tokens.
Name | Vector dimension | Maximum number of rows | Maximum tokens per row | Supported languages | Unit price (per 1,000 tokens) | Free quota |
text-embedding-v3 | 1,024, 768, or 512 | 6 | 8,192 | Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, Russian, and over 50 other languages. | Time-limited free trial | 500,000 tokens |
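Because a request accepts at most 6 rows, a longer list of texts has to be split before it is sent. The following sketch groups texts into batches of at most 6 and validates the requested vector dimension against the values in the table. `batch_rows` and `check_dimension` are hypothetical helpers. The 8,192-token limit per row is not checked, because that would require the model's tokenizer.

```python
from typing import Iterator, List

MAX_ROWS = 6                           # maximum number of rows per request
ALLOWED_DIMENSIONS = (1024, 768, 512)  # vector dimensions listed in the table


def batch_rows(texts: List[str], size: int = MAX_ROWS) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` texts."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


def check_dimension(dimension: int) -> int:
    """Return the dimension if the table lists it, otherwise raise ValueError."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unsupported vector dimension: {dimension}")
    return dimension
```

With 14 input texts, `batch_rows` yields three batches of 6, 6, and 2 rows, each of which can be sent as a separate request.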