Visual reasoning models first output their thinking process and then provide an answer. This makes them suitable for complex visual analysis tasks, such as solving math problems, analyzing chart data, or understanding complex videos.
Showcase
The component above is for demonstration purposes only and does not send a real request.
Model availability
Qwen3-VL
Hybrid-thinking models: qwen3-vl-plus series, qwen3-vl-flash series
Thinking-only models: qwen3-vl-235b-a22b-thinking, qwen3-vl-32b-thinking, qwen3-vl-30b-a3b-thinking, qwen3-vl-8b-thinking
QVQ
Thinking-only models: qvq-max series, qvq-plus series
Usage guide
Thinking process: Alibaba Cloud Model Studio provides two types of visual reasoning models: hybrid-thinking and thinking-only.
Hybrid-thinking models: Control their thinking behavior using the
enable_thinkingparameter:Set to
trueto enable thinking. The model will first output its thinking process and then the final response.Set to
falseto disable thinking. The model will generate the response directly.
Thinking-only models: These models always think before responding, and this behavior cannot be disabled.
Output method: Visual reasoning models include a detailed thinking process. To avoid timeouts due to long responses, use streaming output.
Qwen3-VL supports both streaming and non-streaming methods.
QVQ supports only streaming output.
System prompt recommendations:
For single-turn or simple conversations: For the best inference results, do not set a
System Message. Pass instructions, such as model role settings and output format requirements, through theUser Message.For complex applications such as building agents or implementing tool calls: Use a
System Messageto define the model's role, capabilities, and behavioral framework to ensure stability and reliability.
Getting started
Prerequisites
You have created an API key and exported the API key as an environment variable.
If you call the model using an SDK, install the latest version of the SDK. The DashScope Python SDK must be version 1.24.6 or later, and the DashScope Java SDK must be version 2.21.10 or later.
The following examples show how to call the qvq-max model to solve a math problem in an image and print the thinking process and final response separately using streaming output.
OpenAI compatible
Python
from openai import OpenAI
import os
# Initialize the OpenAI client
client = OpenAI(
# If you are using a model in the China (Beijing) region, you need to use an API key for that region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
# If the environment variable is not configured, replace this with your Model Studio API key: api_key="sk-xxx"
api_key = os.getenv("DASHSCOPE_API_KEY"),
# If you are using a model in the China (Beijing) region, replace the base_url with: https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)
reasoning_content = "" # Define the full thinking process
answer_content = "" # Define the full response
is_answering = False # Check if the thinking process has ended and the response has started
# Create a chat completion request
completion = client.chat.completions.create(
model="qvq-max", # This example uses qvq-max. You can replace it with another model name as needed.
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
},
},
{"type": "text", "text": "How do I solve this problem?"},
],
},
],
stream=True,
# Uncomment the following to return token usage in the last chunk
# stream_options={
# "include_usage": True
# }
)
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")
for chunk in completion:
# If chunk.choices is empty, print the usage
if not chunk.choices:
print("\nUsage:")
print(chunk.usage)
else:
delta = chunk.choices[0].delta
# Print the thinking process
if hasattr(delta, 'reasoning_content') and delta.reasoning_content != None:
print(delta.reasoning_content, end='', flush=True)
reasoning_content += delta.reasoning_content
else:
# Start responding
if delta.content != "" and is_answering is False:
print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
is_answering = True
# Print the response process
print(delta.content, end='', flush=True)
answer_content += delta.content
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(reasoning_content)
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(answer_content)Node.js
import OpenAI from "openai";
import process from 'process';
// Initialize the OpenAI client
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY, // Read from environment variable. If you are using a model in the China (Beijing) region, you need to use an API key for that region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
// If you are using a model in the China (Beijing) region, replace the baseURL with: https://dashscope.aliyuncs.com/compatible-mode/v1
baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});
let reasoningContent = '';
let answerContent = '';
let isAnswering = false;
let messages = [
{
role: "user",
content: [
{ type: "image_url", image_url: { "url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg" } },
{ type: "text", text: "Solve this problem" },
]
}]
async function main() {
try {
const stream = await openai.chat.completions.create({
model: 'qvq-max',
messages: messages,
stream: true
});
console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');
for await (const chunk of stream) {
if (!chunk.choices?.length) {
console.log('\nUsage:');
console.log(chunk.usage);
continue;
}
const delta = chunk.choices[0].delta;
// Handle the thinking process
if (delta.reasoning_content) {
process.stdout.write(delta.reasoning_content);
reasoningContent += delta.reasoning_content;
}
// Handle the formal response
else if (delta.content) {
if (!isAnswering) {
console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
isAnswering = true;
}
process.stdout.write(delta.content);
answerContent += delta.content;
}
}
} catch (error) {
console.error('Error:', error);
}
}
main();HTTP
# ======= IMPORTANT =======
# If you are using a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
# If you are using a model in the China (Beijing) region, you need to use an API key for that region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
# === Delete this comment before execution ===
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qvq-max",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
}
},
{
"type": "text",
"text": "Solve this problem"
}
]
}
],
"stream":true,
"stream_options":{"include_usage":true}
}'DashScope
When calling the QVQ model using DashScope:
The
incremental_outputparameter defaults totrueand cannot be set tofalse. Only incremental streaming output is supported.The
result_formatparameter defaults to"message"and cannot be set to"text".
Python
import os
import dashscope
from dashscope import MultiModalConversation
# If you are using a model in the China (Beijing) region, replace the base_http_api_url with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
messages = [
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "How do I solve this problem?"}
]
}
]
response = MultiModalConversation.call(
# If you are using a model in the China (Beijing) region, you need to use an API key for that region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
# If the environment variable is not configured, replace the following line with your Model Studio API key: api_key="sk-xxx",
api_key=os.getenv('DASHSCOPE_API_KEY'),
model="qvq-max", # This example uses qvq-max. You can replace it with another model name as needed.
messages=messages,
stream=True,
)
# Define the full thinking process
reasoning_content = ""
# Define the full response
answer_content = ""
# Check if the thinking process has ended and the response has started
is_answering = False
print("=" * 20 + "Thinking process" + "=" * 20)
for chunk in response:
# If both the thinking process and the response are empty, ignore
message = chunk.output.choices[0].message
reasoning_content_chunk = message.get("reasoning_content", None)
if (chunk.output.choices[0].message.content == [] and
reasoning_content_chunk == ""):
pass
else:
# If it is currently the thinking process
if reasoning_content_chunk != None and chunk.output.choices[0].message.content == []:
print(chunk.output.choices[0].message.reasoning_content, end="")
reasoning_content += chunk.output.choices[0].message.reasoning_content
# If it is currently the response
elif chunk.output.choices[0].message.content != []:
if not is_answering:
print("\n" + "=" * 20 + "Full response" + "=" * 20)
is_answering = True
print(chunk.output.choices[0].message.content[0]["text"], end="")
answer_content += chunk.output.choices[0].message.content[0]["text"]
# To print the full thinking process and response, uncomment and run the following code
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(f"{reasoning_content}")
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(f"{answer_content}")Java
// DashScope SDK version >= 2.19.0
import java.util.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.exception.InputRequiredException;
import java.lang.System;
import com.alibaba.dashscope.utils.Constants;
public class Main {
static {
// If you are using a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1
Constants.baseHttpApiUrl="https://dashscope-intl.aliyuncs.com/api/v1";
}
private static final Logger logger = LoggerFactory.getLogger(Main.class);
private static StringBuilder reasoningContent = new StringBuilder();
private static StringBuilder finalContent = new StringBuilder();
private static boolean isFirstPrint = true;
private static void handleGenerationResult(MultiModalConversationResult message) {
String re = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
String reasoning = Objects.isNull(re)?"":re; // Default value
List<Map<String, Object>> content = message.getOutput().getChoices().get(0).getMessage().getContent();
if (!reasoning.isEmpty()) {
reasoningContent.append(reasoning);
if (isFirstPrint) {
System.out.println("====================Thinking process====================");
isFirstPrint = false;
}
System.out.print(reasoning);
}
if (Objects.nonNull(content) && !content.isEmpty()) {
Object text = content.get(0).get("text");
finalContent.append(content.get(0).get("text"));
if (!isFirstPrint) {
System.out.println("\n====================Full response====================");
isFirstPrint = true;
}
System.out.print(text);
}
}
public static MultiModalConversationParam buildMultiModalConversationParam(MultiModalMessage Msg) {
return MultiModalConversationParam.builder()
// If you are using a model in the China (Beijing) region, you need to use an API key for that region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
// If the environment variable is not configured, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
// This example uses qvq-max. You can replace it with another model name as needed.
.model("qvq-max")
.messages(Arrays.asList(Msg))
.incrementalOutput(true)
.build();
}
public static void streamCallWithMessage(MultiModalConversation conv, MultiModalMessage Msg)
throws NoApiKeyException, ApiException, InputRequiredException, UploadFileException {
MultiModalConversationParam param = buildMultiModalConversationParam(Msg);
Flowable<MultiModalConversationResult> result = conv.streamCall(param);
result.blockingForEach(message -> {
handleGenerationResult(message);
});
}
public static void main(String[] args) {
try {
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMsg = MultiModalMessage.builder()
.role(Role.USER.getValue())
.content(Arrays.asList(Collections.singletonMap("image", "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"),
Collections.singletonMap("text", "Solve this problem")))
.build();
streamCallWithMessage(conv, userMsg);
// Print the final result
// if (reasoningContent.length() > 0) {
// System.out.println("\n====================Full response====================");
// System.out.println(finalContent.toString());
// }
} catch (ApiException | NoApiKeyException | UploadFileException | InputRequiredException e) {
logger.error("An exception occurred: {}", e.getMessage());
}
System.exit(0);
}
}HTTP
curl
# ======= IMPORTANT =======
# If you are using a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# If you are using a model in the China (Beijing) region, you need to use an API key for that region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
# === Delete this comment before execution ===
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-H 'X-DashScope-SSE: enable' \
-d '{
"model": "qvq-max",
"input":{
"messages":[
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "Solve this problem"}
]
}
]
}
}'Core capabilities
Enable or disable the thinking process
For scenarios that require a detailed reasoning process, such as solving problems or analyzing reports, you can enable the thinking process using the enable_thinking parameter. The following example shows how to enable the thinking process.
The enable_thinking parameter is supported only by the qwen3-vl-plus and qwen3-vl-flash series models.
OpenAI compatible
The enable_thinking and thinking_budget parameters are not standard OpenAI parameters. The method for passing them varies across SDKs for different languages:
Python SDK: You must pass them through the
extra_bodydictionary.Node.js SDK: You can pass them directly as top-level parameters.
import os
from openai import OpenAI
client = OpenAI(
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
# The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with: https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)
reasoning_content = "" # Define the full thinking process
answer_content = "" # Define the full response
is_answering = False # Check if the thinking process has ended and the response has started
enable_thinking = True
# Create a chat completion request
completion = client.chat.completions.create(
model="qwen3-vl-plus",
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
},
},
{"type": "text", "text": "How do I solve this problem?"},
],
},
],
stream=True,
# The enable_thinking parameter enables the thinking process.
# For qwen3-vl-plus and qwen3-vl-flash, you can use enable_thinking to enable or disable thinking. For models with the 'thinking' suffix, such as qwen3-vl-235b-a22b-thinking, enable_thinking can only be set to true. This parameter does not apply to other Qwen-VL models.
extra_body={
'enable_thinking': enable_thinking
},
# Uncomment the following to return token usage in the last chunk
# stream_options={
# "include_usage": True
# }
)
if enable_thinking:
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")
for chunk in completion:
# If chunk.choices is empty, print the usage
if not chunk.choices:
print("\nUsage:")
print(chunk.usage)
else:
delta = chunk.choices[0].delta
# Print the thinking process
if hasattr(delta, 'reasoning_content') and delta.reasoning_content != None:
print(delta.reasoning_content, end='', flush=True)
reasoning_content += delta.reasoning_content
else:
# Start responding
if delta.content != "" and is_answering is False:
print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
is_answering = True
# Print the response process
print(delta.content, end='', flush=True)
answer_content += delta.content
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(reasoning_content)
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(answer_content)import OpenAI from "openai";
// Initialize the OpenAI client
const openai = new OpenAI({
// The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
// If you have not configured an environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx"
apiKey: process.env.DASHSCOPE_API_KEY,
// The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
});
let reasoningContent = '';
let answerContent = '';
let isAnswering = false;
let enableThinking = true;
let messages = [
{
role: "user",
content: [
{ type: "image_url", image_url: { "url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg" } },
{ type: "text", text: "Solve this problem" },
]
}]
async function main() {
try {
const stream = await openai.chat.completions.create({
model: 'qwen3-vl-plus',
messages: messages,
stream: true,
// Note: In the Node.js SDK, non-standard parameters like enableThinking are passed as top-level properties and do not need to be in extra_body.
enable_thinking: enableThinking
});
if (enableThinking){console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');}
for await (const chunk of stream) {
if (!chunk.choices?.length) {
console.log('\nUsage:');
console.log(chunk.usage);
continue;
}
const delta = chunk.choices[0].delta;
// Handle the thinking process
if (delta.reasoning_content) {
process.stdout.write(delta.reasoning_content);
reasoningContent += delta.reasoning_content;
}
// Handle the formal response
else if (delta.content) {
if (!isAnswering) {
console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
isAnswering = true;
}
process.stdout.write(delta.content);
answerContent += delta.content;
}
}
} catch (error) {
console.error('Error:', error);
}
}
main();# ======= IMPORTANT =======
# The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with: https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# === Delete this comment before execution ===
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen3-vl-plus",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
}
},
{
"type": "text",
"text": "Solve this problem"
}
]
}
],
"stream":true,
"stream_options":{"include_usage":true},
"enable_thinking": true
}'DashScope
import os
import dashscope
from dashscope import MultiModalConversation
# If you are using a model in the Singapore region, uncomment the following line
# dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"
enable_thinking=True
messages = [
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "How do I solve this problem?"}
]
}
]
response = MultiModalConversation.call(
# If you have not configured an environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx",
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv('DASHSCOPE_API_KEY'),
model="qwen3-vl-plus",
messages=messages,
stream=True,
# The enable_thinking parameter enables the thinking process.
# For qwen3-vl-plus and qwen3-vl-flash, you can use enable_thinking to enable or disable thinking. For models with the 'thinking' suffix, such as qwen3-vl-235b-a22b-thinking, enable_thinking can only be set to true. This parameter does not apply to other Qwen-VL models.
enable_thinking=enable_thinking
)
# Define the full thinking process
reasoning_content = ""
# Define the full response
answer_content = ""
# Check if the thinking process has ended and the response has started
is_answering = False
if enable_thinking:
print("=" * 20 + "Thinking process" + "=" * 20)
for chunk in response:
# If both the thinking process and the response are empty, ignore
message = chunk.output.choices[0].message
reasoning_content_chunk = message.get("reasoning_content", None)
if (chunk.output.choices[0].message.content == [] and
reasoning_content_chunk == ""):
pass
else:
# If it is currently the thinking process
if reasoning_content_chunk != None and chunk.output.choices[0].message.content == []:
print(chunk.output.choices[0].message.reasoning_content, end="")
reasoning_content += chunk.output.choices[0].message.reasoning_content
# If it is currently the response
elif chunk.output.choices[0].message.content != []:
if not is_answering:
print("\n" + "=" * 20 + "Full response" + "=" * 20)
is_answering = True
print(chunk.output.choices[0].message.content[0]["text"], end="")
answer_content += chunk.output.choices[0].message.content[0]["text"]
# To print the full thinking process and response, uncomment and run the following code
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(f"{reasoning_content}")
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(f"{answer_content}")// DashScope SDK version >= 2.21.10
import java.util.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.exception.InputRequiredException;
import java.lang.System;
import com.alibaba.dashscope.utils.Constants;
public class Main {
static {Constants.baseHttpApiUrl="https://dashscope-intl.aliyuncs.com/api/v1";}
private static final Logger logger = LoggerFactory.getLogger(Main.class);
private static StringBuilder reasoningContent = new StringBuilder();
private static StringBuilder finalContent = new StringBuilder();
private static boolean isFirstPrint = true;
private static void handleGenerationResult(MultiModalConversationResult message) {
String re = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
String reasoning = Objects.isNull(re)?"":re; // Default value
List<Map<String, Object>> content = message.getOutput().getChoices().get(0).getMessage().getContent();
if (!reasoning.isEmpty()) {
reasoningContent.append(reasoning);
if (isFirstPrint) {
System.out.println("====================Thinking process====================");
isFirstPrint = false;
}
System.out.print(reasoning);
}
if (Objects.nonNull(content) && !content.isEmpty()) {
Object text = content.get(0).get("text");
finalContent.append(content.get(0).get("text"));
if (!isFirstPrint) {
System.out.println("\n====================Full response====================");
isFirstPrint = true;
}
System.out.print(text);
}
}
public static MultiModalConversationParam buildMultiModalConversationParam(MultiModalMessage Msg) {
return MultiModalConversationParam.builder()
// If you have not configured an environment variable, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
// The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("qwen3-vl-plus")
.messages(Arrays.asList(Msg))
.enableThinking(true)
.incrementalOutput(true)
.build();
}
public static void streamCallWithMessage(MultiModalConversation conv, MultiModalMessage Msg)
throws NoApiKeyException, ApiException, InputRequiredException, UploadFileException {
MultiModalConversationParam param = buildMultiModalConversationParam(Msg);
Flowable<MultiModalConversationResult> result = conv.streamCall(param);
result.blockingForEach(message -> {
handleGenerationResult(message);
});
}
public static void main(String[] args) {
try {
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMsg = MultiModalMessage.builder()
.role(Role.USER.getValue())
.content(Arrays.asList(Collections.singletonMap("image", "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"),
Collections.singletonMap("text", "Solve this problem")))
.build();
streamCallWithMessage(conv, userMsg);
// Print the final result
// if (reasoningContent.length() > 0) {
// System.out.println("\n====================Full response====================");
// System.out.println(finalContent.toString());
// }
} catch (ApiException | NoApiKeyException | UploadFileException | InputRequiredException e) {
logger.error("An exception occurred: {}", e.getMessage());
}
System.exit(0);
}
}# ======= IMPORTANT =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with: https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# === Delete this comment before execution ===
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-H 'X-DashScope-SSE: enable' \
-d '{
"model": "qwen3-vl-plus",
"input":{
"messages":[
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "Solve this problem"}
]
}
]
},
"parameters":{
"enable_thinking": true,
"incremental_output": true
}
}'Limit the thinking length
To prevent the model from generating an overly long thinking process, use the thinking_budget parameter to limit the maximum number of tokens generated for the thinking process. If the thinking process exceeds this limit, the content is truncated, and the model immediately starts generating the final answer. The default value of thinking_budget is the model's maximum chain-of-thought length. For more information, see Model list.
The thinking_budget parameter is supported only by Qwen3-VL (thinking mode).
OpenAI compatible
The thinking_budget parameter is not a standard OpenAI parameter. If you use the OpenAI Python SDK, you must pass it through extra_body.
import os
from openai import OpenAI
client = OpenAI(
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
# The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with: https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)
reasoning_content = "" # Define the full thinking process
answer_content = "" # Define the full response
is_answering = False # Check if the thinking process has ended and the response has started
enable_thinking = True
# Create a chat completion request
completion = client.chat.completions.create(
model="qwen3-vl-plus",
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
},
},
{"type": "text", "text": "How do I solve this problem?"},
],
},
],
stream=True,
# The enable_thinking parameter enables the thinking process. The thinking_budget parameter sets the maximum number of tokens for the reasoning process.
# For qwen3-vl-plus and qwen3-vl-flash, you can use enable_thinking to enable or disable thinking. For models with the 'thinking' suffix, such as qwen3-vl-235b-a22b-thinking, enable_thinking can only be set to true. This parameter does not apply to other Qwen-VL models.
extra_body={
'enable_thinking': enable_thinking,
"thinking_budget": 81920},
# Uncomment the following to return token usage in the last chunk
# stream_options={
# "include_usage": True
# }
)
if enable_thinking:
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")
for chunk in completion:
# If chunk.choices is empty, print the usage
if not chunk.choices:
print("\nUsage:")
print(chunk.usage)
else:
delta = chunk.choices[0].delta
# Print the thinking process
if hasattr(delta, 'reasoning_content') and delta.reasoning_content != None:
print(delta.reasoning_content, end='', flush=True)
reasoning_content += delta.reasoning_content
else:
# Start responding
if delta.content != "" and is_answering is False:
print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
is_answering = True
# Print the response process
print(delta.content, end='', flush=True)
answer_content += delta.content
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(reasoning_content)
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(answer_content)import OpenAI from "openai";
// Initialize the OpenAI client
const openai = new OpenAI({
// The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
// If you have not configured an environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx"
apiKey: process.env.DASHSCOPE_API_KEY,
// The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
});
let reasoningContent = '';
let answerContent = '';
let isAnswering = false;
let enableThinking = true;
let messages = [
{
role: "user",
content: [
{ type: "image_url", image_url: { "url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg" } },
{ type: "text", text: "Solve this problem" },
]
}]
async function main() {
try {
const stream = await openai.chat.completions.create({
model: 'qwen3-vl-plus',
messages: messages,
stream: true,
// Note: In the Node.js SDK, non-standard parameters like enableThinking are passed as top-level properties and do not need to be in extra_body.
enable_thinking: enableThinking,
thinking_budget: 81920
});
if (enableThinking){console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');}
for await (const chunk of stream) {
if (!chunk.choices?.length) {
console.log('\nUsage:');
console.log(chunk.usage);
continue;
}
const delta = chunk.choices[0].delta;
// Handle the thinking process
if (delta.reasoning_content) {
process.stdout.write(delta.reasoning_content);
reasoningContent += delta.reasoning_content;
}
// Handle the formal response
else if (delta.content) {
if (!isAnswering) {
console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
isAnswering = true;
}
process.stdout.write(delta.content);
answerContent += delta.content;
}
}
} catch (error) {
console.error('Error:', error);
}
}
main();# ======= IMPORTANT =======
# The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with: https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# === Delete this comment before execution ===
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen3-vl-plus",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
}
},
{
"type": "text",
"text": "Solve this problem"
}
]
}
],
"stream":true,
"stream_options":{"include_usage":true},
"enable_thinking": true,
"thinking_budget": 81920
}'DashScope
import os
import dashscope
from dashscope import MultiModalConversation
# If you are using a model in the Singapore region, uncomment the following line
# dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"
enable_thinking=True
messages = [
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "How do I solve this problem?"}
]
}
]
response = MultiModalConversation.call(
# If you have not configured an environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx",
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv('DASHSCOPE_API_KEY'),
model="qwen3-vl-plus",
messages=messages,
stream=True,
# The enable_thinking parameter enables the thinking process.
# For qwen3-vl-plus and qwen3-vl-flash, you can use enable_thinking to enable or disable thinking. For models with the 'thinking' suffix, such as qwen3-vl-235b-a22b-thinking, enable_thinking can only be set to true. This parameter does not apply to other Qwen-VL models.
enable_thinking=enable_thinking,
# The thinking_budget parameter sets the maximum number of tokens for the reasoning process.
thinking_budget=81920,
)
# Define the full thinking process
reasoning_content = ""
# Define the full response
answer_content = ""
# Check if the thinking process has ended and the response has started
is_answering = False
if enable_thinking:
print("=" * 20 + "Thinking process" + "=" * 20)
for chunk in response:
# If both the thinking process and the response are empty, ignore
message = chunk.output.choices[0].message
reasoning_content_chunk = message.get("reasoning_content", None)
if (chunk.output.choices[0].message.content == [] and
reasoning_content_chunk == ""):
pass
else:
# If it is currently the thinking process
if reasoning_content_chunk != None and chunk.output.choices[0].message.content == []:
print(chunk.output.choices[0].message.reasoning_content, end="")
reasoning_content += chunk.output.choices[0].message.reasoning_content
# If it is currently the response
elif chunk.output.choices[0].message.content != []:
if not is_answering:
print("\n" + "=" * 20 + "Full response" + "=" * 20)
is_answering = True
print(chunk.output.choices[0].message.content[0]["text"], end="")
answer_content += chunk.output.choices[0].message.content[0]["text"]
# To print the full thinking process and response, uncomment and run the following code
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(f"{reasoning_content}")
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(f"{answer_content}")// DashScope SDK version >= 2.21.10
import java.util.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.exception.InputRequiredException;
import java.lang.System;
import com.alibaba.dashscope.utils.Constants;
public class Main {
static {Constants.baseHttpApiUrl="https://dashscope-intl.aliyuncs.com/api/v1";}
private static final Logger logger = LoggerFactory.getLogger(Main.class);
private static StringBuilder reasoningContent = new StringBuilder();
private static StringBuilder finalContent = new StringBuilder();
private static boolean isFirstPrint = true;
private static void handleGenerationResult(MultiModalConversationResult message) {
String re = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
String reasoning = Objects.isNull(re)?"":re; // Default value
List<Map<String, Object>> content = message.getOutput().getChoices().get(0).getMessage().getContent();
if (!reasoning.isEmpty()) {
reasoningContent.append(reasoning);
if (isFirstPrint) {
System.out.println("====================Thinking process====================");
isFirstPrint = false;
}
System.out.print(reasoning);
}
if (Objects.nonNull(content) && !content.isEmpty()) {
Object text = content.get(0).get("text");
finalContent.append(content.get(0).get("text"));
if (!isFirstPrint) {
System.out.println("\n====================Full response====================");
isFirstPrint = true;
}
System.out.print(text);
}
}
public static MultiModalConversationParam buildMultiModalConversationParam(MultiModalMessage Msg) {
return MultiModalConversationParam.builder()
// If you have not configured an environment variable, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
// The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("qwen3-vl-plus")
.messages(Arrays.asList(Msg))
.enableThinking(true)
.thinkingBudget(81920)
.incrementalOutput(true)
.build();
}
public static void streamCallWithMessage(MultiModalConversation conv, MultiModalMessage Msg)
throws NoApiKeyException, ApiException, InputRequiredException, UploadFileException {
MultiModalConversationParam param = buildMultiModalConversationParam(Msg);
Flowable<MultiModalConversationResult> result = conv.streamCall(param);
result.blockingForEach(message -> {
handleGenerationResult(message);
});
}
public static void main(String[] args) {
try {
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMsg = MultiModalMessage.builder()
.role(Role.USER.getValue())
.content(Arrays.asList(Collections.singletonMap("image", "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"),
Collections.singletonMap("text", "Solve this problem")))
.build();
streamCallWithMessage(conv, userMsg);
// Print the final result
// if (reasoningContent.length() > 0) {
// System.out.println("\n====================Full response====================");
// System.out.println(finalContent.toString());
// }
} catch (ApiException | NoApiKeyException | UploadFileException | InputRequiredException e) {
logger.error("An exception occurred: {}", e.getMessage());
}
System.exit(0);
}
}# ======= IMPORTANT =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with: https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# === Delete this comment before execution ===
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-H 'X-DashScope-SSE: enable' \
-d '{
"model": "qwen3-vl-plus",
"input":{
"messages":[
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "Solve this problem"}
]
}
]
},
"parameters":{
"enable_thinking": true,
"incremental_output": true,
"thinking_budget": 81920
}
}'More examples
In addition to their reasoning capabilities, visual reasoning models also have all the features of visual understanding models. You can combine these features to handle more complex scenarios:
Billing
Total cost = (Input tokens × Input price per token) + (Output tokens × Output price per token).
The thinking process (
reasoning_content) is part of the output content and is billed as output tokens. If a model in thinking mode does not output a thinking process, it is billed at the non-thinking mode price.For information about how to calculate tokens for images or videos, see Images.
API reference
For the input and output parameters of visual reasoning models, see Qwen API reference.
Error codes
If a call fails, see Error messages for troubleshooting.