Choose the right text generation model for AI agents, chatbots, and document processing.
OpenClaw, Claude Code, or Hermes
qwen3.6-plus is ideal for large codebases, offering a balance of performance and cost, full tool calling support, and a 1 million-token context window. Coding Plan users can also select glm-5 or MiniMax-M2.5. All these models are optimized for agent workflows.
Use cases
For chatbots, content generation, summarization, and document processing, we recommend qwen3.6-plus. It offers a good balance of performance and cost, a 1 million-token context window, and built-in tools. Once you've confirmed its performance meets your needs, try qwen3.6-flash to reduce costs. Its performance is close to that of the flagship model, and it supports the same context length and features. If you require the most powerful reasoning capabilities, select qwen3.6-max-preview, which has a higher cost.
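As a starting point, here is a minimal sketch of calling qwen3.6-plus through an OpenAI-compatible SDK. The base URL and API key environment variable are placeholders, so substitute the endpoint and credentials for your account and region.

```python
import os
from openai import OpenAI  # assumes the OpenAI Python SDK is installed

# Placeholder endpoint; replace with the base URL for your region.
client = OpenAI(
    api_key=os.getenv("API_KEY"),
    base_url="https://example.com/compatible-mode/v1",
)

# qwen3.6-plus: balanced cost and performance with a 1 million-token context window.
response = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the key obligations in this contract: ..."},
    ],
)
print(response.choices[0].message.content)
```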
Context window
1 million tokens is roughly equivalent to 750,000 English words or about 8 to 10 novels.
- For long documents or large codebases: qwen3.6-plus or qwen3.6-flash (1 million tokens).
- For standard tasks: a context window of 128k to 256k tokens is typically sufficient.
For context window details, visit the Models page.
Thinking mode
This mode enables step-by-step reasoning, which is ideal for scenarios like multi-step mathematical calculations, code debugging, architecture planning, or legal cross-referencing.
Use the enable_thinking parameter to turn this mode on. In the Responses API, the reasoning.effort parameter enables or disables thinking mode and controls its depth. All Qwen3 and later models support this feature; most operate in a hybrid mode that can be toggled per request.
See Deep thinking.
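For illustration, here is a minimal sketch of turning on thinking mode. It assumes OpenAI-compatible access where the enable_thinking parameter is passed through extra_body and the reasoning text is streamed back in a separate field; check Deep thinking for the exact request shape in your SDK.

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("API_KEY"),
    base_url="https://example.com/compatible-mode/v1",  # placeholder endpoint
)

# enable_thinking turns on step-by-step reasoning for hybrid models.
# Passing it via extra_body is an assumption for OpenAI-compatible access;
# a native SDK may accept it as a top-level parameter instead.
stream = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[{"role": "user", "content": "Debug this recursive function: ..."}],
    extra_body={"enable_thinking": True},
    stream=True,  # reasoning output is typically consumed as a stream
)

for chunk in stream:
    delta = chunk.choices[0].delta
    # The reasoning text may arrive in a separate field from the final answer.
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        print(reasoning, end="")
    elif delta.content:
        print(delta.content, end="")
```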
Function calling and built-in tools
These features allow the model to perform actions, such as querying the weather, searching a database, or booking a meeting.
- Function calling (custom tools that you define and the model calls): supported by all general-purpose models.
- Built-in tools (such as web search, code interpreter, and web scraping): ready to use with no complex configuration.
See Tool calling.
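Below is a minimal function-calling sketch. The get_weather tool, its schema, and the model name are illustrative examples rather than built-in definitions, and the request follows the OpenAI-compatible tools format.

```python
import json
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("API_KEY"),
    base_url="https://example.com/compatible-mode/v1",  # placeholder endpoint
)

# A hypothetical custom tool the model may choose to call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[{"role": "user", "content": "What is the weather in Singapore?"}],
    tools=tools,
)

# If the model decided to call the tool, the arguments arrive as a JSON string.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```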
Structured output
This feature ensures that the model returns valid JSON, for example, when extracting names and addresses from text.
See Structured output.
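For example, the following sketch requests JSON when extracting a name and address. It assumes the OpenAI-compatible response_format parameter with type json_object, and the prompt explicitly asks for JSON, as such modes typically require.

```python
import json
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("API_KEY"),
    base_url="https://example.com/compatible-mode/v1",  # placeholder endpoint
)

response = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[
        {
            "role": "user",
            "content": 'Extract the name and address as JSON with keys "name" and "address": '
                       "Jane Doe, 42 Example Road, Springfield.",
        }
    ],
    response_format={"type": "json_object"},  # ask the model to return valid JSON
)

print(json.loads(response.choices[0].message.content))
```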
Batch inference
Batch inference reduces costs, making it ideal for high-volume scenarios that are not latency-sensitive.
See Batch inference.
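A rough sketch of submitting a batch job through an OpenAI-compatible Batch API is shown below. The JSONL request format, the endpoint path, and the model name are assumptions that may differ in your environment; see Batch inference for the authoritative workflow.

```python
import json
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("API_KEY"),
    base_url="https://example.com/compatible-mode/v1",  # placeholder endpoint
)

# Each JSONL line is one chat completion request with its own custom_id.
request = {
    "custom_id": "doc-1",
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
        "model": "qwen3.6-flash",
        "messages": [{"role": "user", "content": "Summarize: ..."}],
    },
}
with open("requests.jsonl", "w") as f:
    f.write(json.dumps(request) + "\n")

# Upload the request file, then create the batch job.
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
job = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(job.id, job.status)  # poll the job and download results once it completes
```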
Recommended models
International
| Model | Context | Thinking mode | Function calling | Built-in tools | Structured output | Batch calling |
| --- | --- | --- | --- | --- | --- | --- |
|  | 256k |  |  |  |  |  |
|  | 1M |  |  |  |  |  |
|  | 1M |  |  |  |  |  |
Global
| Model | Context | Thinking mode | Function calling | Built-in tools | Structured output | Batch calling |
| --- | --- | --- | --- | --- | --- | --- |
|  | 256k |  |  |  |  |  |
|  | 1M |  |  |  |  |  |
|  | 1M |  |  |  |  |  |
US
| Model | Context | Thinking mode | Function calling | Built-in tools | Structured output | Batch calling |
| --- | --- | --- | --- | --- | --- | --- |
|  | 1M |  |  |  |  |  |
|  | 1M |  |  |  |  |  |
Chinese mainland
| Model | Context | Thinking mode | Function calling | Built-in tools | Structured output | Batch calling |
| --- | --- | --- | --- | --- | --- | --- |
|  | 256k |  |  |  |  |  |
|  | 1M |  |  |  |  |  |
|  | 1M |  |  |  |  |  |
|  | 256k |  |  |  |  |  |
|  | 1M |  |  |  |  |  |
|  | 1M |  |  |  |  |  |
|  | 198k |  |  |  |  |  |
|  | 192k |  |  |  |  |  |
China (Hong Kong) and EU
| Model | Context | Thinking mode | Function calling | Built-in tools | Structured output | Batch calling |
| --- | --- | --- | --- | --- | --- | --- |
|  | 1M |  |  |  |  |  |
|  | 1M |  |  |  |  |  |
All models
Qwen3.6
| Model ID | Context | Max output | Thinking budget | Function calling | Built-in tools | Structured output | Batch calling | Coding Plan |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 256k | 64k | 128k |  |  |  |  |  |
|  | 1M | 64k | 80k |  |  |  |  | (Pro only) |
|  | 1M | 64k | 80k |  |  |  |  |  |
|  | 1M | 64k | 128k |  |  |  |  |  |
|  | 1M | 64k | 128k |  |  |  |  |  |
Qwen3.5
Global and Chinese mainland
| Model ID | Context | Max output | Thinking budget | Function calling | Built-in tools | Structured output | Batch calling | Coding Plan |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 1M | 64k | 80k |  |  |  |  |  |
|  | 1M | 64k | 80k |  |  |  |  |  |
|  | 1M | 64k | 80k |  |  |  |  |  |
|  | 1M | 64k | 80k |  |  |  |  |  |
|  | 256k | 64k | 80k |  |  |  |  |  |
|  | 256k | 64k | 80k |  |  |  |  |  |
|  | 256k | 64k | 80k |  |  |  |  |  |
|  | 256k | 64k | 80k |  |  |  |  |  |
China (Hong Kong) and EU
| Model ID | Context | Max output | Thinking budget | Function calling | Built-in tools | Structured output | Batch calling | Coding Plan |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 1M | 64k | 80k |  |  |  |  |  |
|  | 1M | 64k | 80k |  |  |  |  |  |
Third-party models
| Model ID | Context | Max output | Thinking budget | Function calling | Built-in tools | Structured output | Batch calling | Coding Plan |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 1M | 384k (shared) |  |  |  |  |  |  |
|  | 1M | 384k (shared) |  |  |  |  |  |  |
|  | 198k | 128k | 128k |  |  |  |  |  |
|  | 256k | 96k | 80k |  |  |  |  |  |
|  | 192k | 32k (includes chain-of-thought) |  |  |  |  |  |  |
Legacy and other models
For new projects, use the Qwen3.6 or Qwen3.5 series. The following models are legacy and no longer recommended. Visit the Models page to view detailed model parameters, such as context window and billing.