
Alibaba's Latest Thinking Model Excels at Adaptive Tool Use

This article introduces Qwen3-Max-Thinking, Alibaba’s latest reasoning model that excels in adaptive tool use and advanced test-time scaling to outperform leading AI systems.

Alibaba has unveiled its latest reasoning model, Qwen3-Max-Thinking. By significantly scaling up its parameter count (to over 1 trillion) for reinforcement learning, Qwen3-Max-Thinking delivers substantial performance gains across multiple dimensions, including factual knowledge, complex reasoning, instruction following, alignment with human preferences, and agent capabilities.

On 19 established benchmarks, the model demonstrates performance competitive with leading models such as Claude Opus 4.5, Gemini 3 Pro, and GPT-5.2-Thinking-xhigh. The benchmarks cover science, math, and coding problems, as well as expert-level questions across broad subjects that are answered with the help of search tools.

The standout capabilities of Qwen3-Max-Thinking stem from two distinct innovations. The first is adaptive tool use: the model retrieves information and invokes its built-in code interpreter on demand, which improves the user experience by removing the need for manual tool selection. The second is a set of advanced test-time scaling techniques, which significantly boost reasoning performance and enable the model to surpass other leading models on critical reasoning benchmarks.

Unlike earlier approaches that required users to manually choose tools before each task, Qwen3-Max-Thinking dynamically selects and uses its integrated Search, Memory, and Code Interpreter capabilities during conversations, eliminating the need for explicit tool specification. This capability was achieved through extensive training on diverse tasks using both rule-based and model-based feedback, following initial fine-tuning for tool use.

Notably, the model’s search and memory tools effectively reduce hallucinations, enhance access to real-time information, and enable more personalized responses tailored to individual user needs. Additionally, the built-in Code Interpreter empowers users to execute code snippets or apply computational reasoning to solve complex problems more efficiently.
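To make the idea of adaptive tool use concrete, here is a minimal toy sketch in Python. It is not Qwen3-Max-Thinking's implementation: in the real model the decision is learned through training, whereas here a hypothetical keyword rule (`choose_tool`) stands in for it. The tool names, the stub `search` and `memory` functions, and the small arithmetic-only `code_interpreter` are all assumptions made for illustration.

```python
import ast
import operator
from typing import Callable, Dict, Optional, Tuple

# Arithmetic operators the toy "code interpreter" is allowed to evaluate.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def code_interpreter(expr: str) -> str:
    """Toy stand-in for a code interpreter: evaluates + - * / on numbers only."""
    def ev(node: ast.AST):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return str(ev(ast.parse(expr, mode="eval").body))


# Hypothetical tool registry; search and memory are stubs, not real services.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: "stub search result",
    "memory": lambda query: "stub remembered preference",
    "code_interpreter": code_interpreter,
}


def choose_tool(message: str) -> Optional[Tuple[str, str]]:
    """Keyword rule standing in for the model's learned choice (an assumption).
    Returns (tool_name, tool_input), or None when no tool is needed."""
    text = message.lower()
    if text.startswith("compute "):
        return ("code_interpreter", message[len("compute "):])
    if "latest" in text or "today" in text:
        return ("search", message)
    if "my " in text:
        return ("memory", message)
    return None


def answer(message: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """The caller never names a tool; the choice is made per message."""
    choice = choose_tool(message)
    if choice is None:
        return "direct answer (no tool needed)"
    name, tool_input = choice
    return f"[{name}] {tools[name](tool_input)}"
```

The point of the sketch is the shape of the interface: the caller passes only a message, and the decision about whether to search, recall, or compute happens inside `answer`.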

Furthermore, the team introduced an experience-cumulative, multi-round test-time scaling strategy. This mechanism distills key insights from prior interaction rounds, allowing the model to avoid re-deriving known conclusions and instead focus on resolving remaining uncertainties. As a result, this approach achieves higher context efficiency than naively referencing raw interaction histories, while consistently outperforming the standard method (parallel sampling plus aggregation) at similar token costs.
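The contrast between the two strategies can be sketched with a toy example. The Python code below is an illustration under stated assumptions, not the team's actual method: `Solver` is a hypothetical model interface, `distill` is a placeholder that keeps only the first sentence of each round, and `toy_solver` is a deterministic stub whose "progress" simply equals the number of notes it is given.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical model interface: prompt in, (reasoning text, final answer) out.
Solver = Callable[[str], Tuple[str, str]]


def parallel_sample_and_aggregate(solve: Solver, question: str, n: int) -> str:
    """Baseline: n independent attempts on the same prompt, then a majority vote."""
    answers = [solve(question)[1] for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]


def distill(reasoning: str, max_chars: int = 120) -> str:
    """Placeholder distillation: keep only the first sentence of a round.
    A real system would presumably use the model itself to summarize."""
    return reasoning.split(".")[0].strip()[:max_chars]


def multi_round_with_experience(solve: Solver, question: str, rounds: int) -> str:
    """Sequential rounds; each prompt carries short notes, not the raw history."""
    notes: List[str] = []
    answer = ""
    for _ in range(rounds):
        prompt = question
        if notes:
            prompt += "\n\nKnown so far:\n" + "\n".join(f"- {note}" for note in notes)
        reasoning, answer = solve(prompt)
        notes.append(distill(reasoning))  # carry forward the insight only
    return answer


def toy_solver(prompt: str) -> Tuple[str, str]:
    """Deterministic stub: 'progress' equals the number of notes in the prompt."""
    known = prompt.count("\n- ")
    return (f"Settled step {known + 1}. Remaining detail omitted", f"answer-{known}")
```

With this stub, every parallel sample starts from scratch and returns the same first-round answer, while the multi-round loop advances one step per round because each prompt includes the distilled notes from earlier rounds. Real models are stochastic, so the sketch shows only the control flow and says nothing about actual accuracy or token cost.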

Qwen3-Max-Thinking is now live in Qwen Chat, where users can interact with the model and benefit from its adaptive tool-use capabilities. The model’s API is also available on Alibaba’s generative AI development platform, Model Studio.


This article was originally published on Alizila and was written by Crystal Liu.
