Visual reasoning models first output their thinking process and then provide an answer. This makes them suitable for complex visual analysis tasks, such as solving math problems, analyzing chart data, or understanding complex videos.
Showcase
The component above is for demonstration purposes only and does not send a real request.
Availability
Supported regions
Supported models
Global
In the global deployment mode, the endpoint and data storage are located in the US (Virginia) region, and model inference compute resources are dynamically scheduled worldwide.
Hybrid-thinking models: qwen3-vl-plus, qwen3-vl-plus-2025-09-23, qwen3-vl-flash, qwen3-vl-flash-2025-10-15
Thinking-only models: qwen3-vl-235b-a22b-thinking, qwen3-vl-32b-thinking, qwen3-vl-30b-a3b-thinking, qwen3-vl-8b-thinking
International
In the international deployment mode, the endpoint and data storage are located in the Singapore region, and model inference compute resources are dynamically scheduled globally, excluding Mainland China.
Qwen3-VL
Hybrid-thinking models: qwen3-vl-plus, qwen3-vl-plus-2025-12-19, qwen3-vl-plus-2025-09-23, qwen3-vl-flash, qwen3-vl-flash-2025-10-15
Thinking-only models: qwen3-vl-235b-a22b-thinking, qwen3-vl-32b-thinking, qwen3-vl-30b-a3b-thinking, qwen3-vl-8b-thinking
QVQ
Thinking-only models: qvq-max series, qvq-plus series
US
In the US deployment mode, the endpoint and data storage are located in the US (Virginia) region, and model inference compute resources are limited to the United States.
Hybrid-thinking models: qwen3-vl-flash-us, qwen3-vl-flash-2025-10-15-us
Mainland China
In Mainland China deployment mode, the endpoint and data storage are located in the Beijing region, and model inference compute resources are limited to Mainland China.
Qwen3-VL
Hybrid-thinking models: qwen3-vl-plus, qwen3-vl-plus-2025-12-19, qwen3-vl-plus-2025-09-23, qwen3-vl-flash, qwen3-vl-flash-2025-10-15
Thinking-only models: qwen3-vl-235b-a22b-thinking, qwen3-vl-32b-thinking, qwen3-vl-30b-a3b-thinking, qwen3-vl-8b-thinking
QVQ
Thinking-only models: qvq-max series, qvq-plus series
Usage guide
Thinking process: Model Studio provides two types of visual reasoning models: hybrid-thinking and thinking-only.
Hybrid-thinking models: You can control their thinking behavior using the enable_thinking parameter:
Set to true to enable thinking. The model first outputs its thinking process and then the final response.
Set to false to disable thinking. The model generates the response directly.
Thinking-only models: These models always generate a thinking process before providing a response, and this behavior cannot be disabled.
Output method: Because visual reasoning models include a detailed thinking process, we recommend using streaming output to prevent timeouts caused by long responses.
The Qwen3-VL series supports both streaming and non-streaming methods.
The QVQ series supports only streaming output.
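Both streaming interfaces emit chunks whose delta carries either a reasoning_content fragment (the thinking process) or a content fragment (the answer). The accumulation logic used throughout the examples below can be sketched without a live API call; StreamAccumulator and FakeDelta are illustrative names for this sketch, not part of any SDK:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FakeDelta:
    """Stand-in for a streamed chat-completion delta (no API call needed)."""
    reasoning_content: Optional[str] = None
    content: Optional[str] = None

class StreamAccumulator:
    """Collects the thinking process and the final answer from streamed deltas."""
    def __init__(self) -> None:
        self.reasoning = ""
        self.answer = ""

    def feed(self, delta) -> None:
        # A chunk carries either a reasoning fragment or an answer fragment.
        if getattr(delta, "reasoning_content", None):
            self.reasoning += delta.reasoning_content
        elif getattr(delta, "content", None):
            self.answer += delta.content

acc = StreamAccumulator()
for d in [FakeDelta(reasoning_content="Let x = 2. "),
          FakeDelta(reasoning_content="Then x + 3 = 5."),
          FakeDelta(content="The answer is 5.")]:
    acc.feed(d)

print(acc.reasoning)  # Let x = 2. Then x + 3 = 5.
print(acc.answer)     # The answer is 5.
```

The same split drives the "Thinking process" and "Full response" sections printed by the SDK examples in this topic.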
System prompt recommendations:
For single-turn or simple conversations: For the best inference results, do not set a System Message. Pass instructions, such as model role settings and output format requirements, through the User Message.
For complex applications such as building agents or implementing tool calls: Use a System Message to define the model's role, capabilities, and behavioral framework to ensure its stability and reliability.
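In the OpenAI-compatible message format, these two recommendations translate into message lists like the following sketch (the URLs and instruction text are placeholders, not from a real application):

```python
# Simple, single-turn request: put role and format instructions in the user message.
simple_messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/problem.jpg"}},
            {"type": "text",
             "text": "You are a math tutor. Solve the problem in the image "
                     "and give the final answer on its own line."},
        ],
    }
]

# Agent or tool-calling application: define the role and behavioral framework
# in a system message, and keep the user message to the task itself.
agent_messages = [
    {
        "role": "system",
        "content": "You are a data-analysis agent. Think step by step, call tools "
                   "when needed, and always answer in JSON.",
    },
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/chart.jpg"}},
            {"type": "text", "text": "Summarize the trend in this chart."},
        ],
    },
]

print(simple_messages[0]["role"], agent_messages[0]["role"])  # user system
```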
Getting started
Prerequisites
You have created an API key and exported it as an environment variable.
If you call the model using an SDK, install the latest version of the SDK. The DashScope Python SDK must be version 1.24.6 or later, and the DashScope Java SDK must be version 2.21.10 or later.
The following examples demonstrate how to call the qvq-max model to solve a math problem from an image. These examples use streaming output to print the thinking process and the final response separately.
OpenAI compatible
Python
from openai import OpenAI
import os
# Initialize the OpenAI client
client = OpenAI(
# API keys differ by region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
# If you have not configured an environment variable, replace the following with your Model Studio API key: api_key="sk-xxx"
api_key = os.getenv("DASHSCOPE_API_KEY"),
# The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)
reasoning_content = "" # Define the full thinking process
answer_content = "" # Define the full response
is_answering = False # Check if the thinking process has ended and the response has started
# Create a chat completion request
completion = client.chat.completions.create(
model="qvq-max", # This example uses qvq-max. You can replace it with another model name as needed.
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
},
},
{"type": "text", "text": "How do I solve this problem?"},
],
},
],
stream=True,
# Uncomment the following to return token usage in the last chunk
# stream_options={
# "include_usage": True
# }
)
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")
for chunk in completion:
# If chunk.choices is empty, print the usage
if not chunk.choices:
print("\nUsage:")
print(chunk.usage)
else:
delta = chunk.choices[0].delta
# Print the thinking process
        if hasattr(delta, 'reasoning_content') and delta.reasoning_content is not None:
print(delta.reasoning_content, end='', flush=True)
reasoning_content += delta.reasoning_content
else:
# Start responding
if delta.content != "" and is_answering is False:
print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
is_answering = True
# Print the response process
print(delta.content, end='', flush=True)
answer_content += delta.content
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(reasoning_content)
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(answer_content)
Node.js
import OpenAI from "openai";
import process from 'process';
// Initialize the OpenAI client
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY, // Read from environment variable. API keys differ by region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
// The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});
let reasoningContent = '';
let answerContent = '';
let isAnswering = false;
let messages = [
{
role: "user",
content: [
{ type: "image_url", image_url: { "url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg" } },
{ type: "text", text: "Solve this problem" },
]
}]
async function main() {
try {
const stream = await openai.chat.completions.create({
model: 'qvq-max',
messages: messages,
stream: true
});
console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');
for await (const chunk of stream) {
if (!chunk.choices?.length) {
console.log('\nUsage:');
console.log(chunk.usage);
continue;
}
const delta = chunk.choices[0].delta;
// Handle the thinking process
if (delta.reasoning_content) {
process.stdout.write(delta.reasoning_content);
reasoningContent += delta.reasoning_content;
}
// Handle the formal response
else if (delta.content) {
if (!isAnswering) {
console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
isAnswering = true;
}
process.stdout.write(delta.content);
answerContent += delta.content;
}
}
} catch (error) {
console.error('Error:', error);
}
}
main();
HTTP
# ======= IMPORTANT =======
# The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
# API keys differ by region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
# === Delete this comment before execution ===
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qvq-max",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
}
},
{
"type": "text",
"text": "Solve this problem"
}
]
}
],
"stream":true,
"stream_options":{"include_usage":true}
}'
DashScope
When calling the QVQ model using DashScope:
The incremental_output parameter defaults to true and cannot be set to false. Only incremental streaming output is supported.
The result_format parameter defaults to "message" and cannot be set to "text".
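Because result_format is fixed to "message", each streamed chunk's message holds content as a list of parts (for example [{"text": "..."}]) alongside an optional reasoning_content string. A hypothetical helper, extract_chunk_text, shows one way to split such a chunk; it mirrors the checks against an empty content list in the Python example below:

```python
def extract_chunk_text(message: dict) -> tuple:
    """Split a "message"-format chunk into (reasoning, answer) text fragments."""
    reasoning = message.get("reasoning_content") or ""
    parts = message.get("content") or []
    # Join the text of every content part; thinking chunks have an empty list here.
    answer = "".join(part.get("text", "") for part in parts if isinstance(part, dict))
    return reasoning, answer

# A thinking chunk followed by an answer chunk, shaped like the streamed output
print(extract_chunk_text({"reasoning_content": "Reading the figure...", "content": []}))
# -> ('Reading the figure...', '')
print(extract_chunk_text({"content": [{"text": "The answer is 5."}]}))
# -> ('', 'The answer is 5.')
```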
Python
import os
import dashscope
from dashscope import MultiModalConversation
# The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
messages = [
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "How do I solve this problem?"}
]
}
]
response = MultiModalConversation.call(
# API keys differ by region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
# If the environment variable is not configured, replace the following line with your Model Studio API key: api_key="sk-xxx",
api_key=os.getenv('DASHSCOPE_API_KEY'),
model="qvq-max", # This example uses qvq-max. You can replace it with another model name as needed.
messages=messages,
stream=True,
)
# Define the full thinking process
reasoning_content = ""
# Define the full response
answer_content = ""
# Check if the thinking process has ended and the response has started
is_answering = False
print("=" * 20 + "Thinking process" + "=" * 20)
for chunk in response:
# If both the thinking process and the response are empty, ignore
message = chunk.output.choices[0].message
reasoning_content_chunk = message.get("reasoning_content", None)
if (chunk.output.choices[0].message.content == [] and
reasoning_content_chunk == ""):
pass
else:
# If it is currently the thinking process
        if reasoning_content_chunk is not None and chunk.output.choices[0].message.content == []:
print(chunk.output.choices[0].message.reasoning_content, end="")
reasoning_content += chunk.output.choices[0].message.reasoning_content
# If it is currently the response
elif chunk.output.choices[0].message.content != []:
if not is_answering:
print("\n" + "=" * 20 + "Full response" + "=" * 20)
is_answering = True
print(chunk.output.choices[0].message.content[0]["text"], end="")
answer_content += chunk.output.choices[0].message.content[0]["text"]
# To print the full thinking process and response, uncomment and run the following code
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(f"{reasoning_content}")
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(f"{answer_content}")
Java
// DashScope SDK version >= 2.19.0
import java.util.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.exception.InputRequiredException;
import java.lang.System;
import com.alibaba.dashscope.utils.Constants;
public class Main {
static {
// The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/api/v1
Constants.baseHttpApiUrl="https://dashscope-intl.aliyuncs.com/api/v1";
}
private static final Logger logger = LoggerFactory.getLogger(Main.class);
private static StringBuilder reasoningContent = new StringBuilder();
private static StringBuilder finalContent = new StringBuilder();
private static boolean isFirstPrint = true;
private static void handleGenerationResult(MultiModalConversationResult message) {
String re = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
String reasoning = Objects.isNull(re) ? "" : re; // Default to an empty string
List<Map<String, Object>> content = message.getOutput().getChoices().get(0).getMessage().getContent();
if (!reasoning.isEmpty()) {
reasoningContent.append(reasoning);
if (isFirstPrint) {
System.out.println("====================Thinking process====================");
isFirstPrint = false;
}
System.out.print(reasoning);
}
if (Objects.nonNull(content) && !content.isEmpty()) {
Object text = content.get(0).get("text");
finalContent.append(content.get(0).get("text"));
if (!isFirstPrint) {
System.out.println("\n====================Full response====================");
isFirstPrint = true;
}
System.out.print(text);
}
}
public static MultiModalConversationParam buildMultiModalConversationParam(MultiModalMessage Msg) {
return MultiModalConversationParam.builder()
// API keys differ by region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
// If you have not configured an environment variable, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
// This example uses qvq-max. You can replace it with another model name as needed.
.model("qvq-max")
.messages(Arrays.asList(Msg))
.incrementalOutput(true)
.build();
}
public static void streamCallWithMessage(MultiModalConversation conv, MultiModalMessage Msg)
throws NoApiKeyException, ApiException, InputRequiredException, UploadFileException {
MultiModalConversationParam param = buildMultiModalConversationParam(Msg);
Flowable<MultiModalConversationResult> result = conv.streamCall(param);
result.blockingForEach(message -> {
handleGenerationResult(message);
});
}
public static void main(String[] args) {
try {
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMsg = MultiModalMessage.builder()
.role(Role.USER.getValue())
.content(Arrays.asList(Collections.singletonMap("image", "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"),
Collections.singletonMap("text", "Solve this problem")))
.build();
streamCallWithMessage(conv, userMsg);
// Print the final result
// if (reasoningContent.length() > 0) {
// System.out.println("\n====================Full response====================");
// System.out.println(finalContent.toString());
// }
} catch (ApiException | NoApiKeyException | UploadFileException | InputRequiredException e) {
logger.error("An exception occurred: {}", e.getMessage());
}
System.exit(0);
}
}
HTTP
curl
# ======= IMPORTANT =======
# The following is the base URL for the Singapore region. If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# API keys differ by region. To obtain one, see https://bailian.console.alibabacloud.com/?tab=model#/api-key
# === Delete this comment before execution ===
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-H 'X-DashScope-SSE: enable' \
-d '{
"model": "qvq-max",
"input":{
"messages":[
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "Solve this problem"}
]
}
]
}
}'
Core capabilities
Enable or disable the thinking process
For scenarios that require a detailed thinking process, such as solving problems or analyzing reports, you can enable the thinking mode using the enable_thinking parameter. The following example shows how to do this.
The enable_thinking parameter is supported only by the qwen3-vl-plus and qwen3-vl-flash series models.
OpenAI compatible
The enable_thinking and thinking_budget parameters are not standard OpenAI parameters. The method for passing these parameters varies by programming language:
Python SDK: You must pass them through the extra_body dictionary.
Node.js SDK: You can pass them directly as top-level parameters.
Python
import os
from openai import OpenAI
client = OpenAI(
# API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
# The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base_url with https://dashscope-us.aliyuncs.com/compatible-mode/v1
# If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)
reasoning_content = "" # Define the full thinking process
answer_content = "" # Define the full response
is_answering = False # Check if the thinking process has ended and the response has started
enable_thinking = True
# Create a chat completion request
completion = client.chat.completions.create(
model="qwen3-vl-plus",
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
},
},
{"type": "text", "text": "How do I solve this problem?"},
],
},
],
stream=True,
# The enable_thinking parameter enables the thinking process. The thinking_budget parameter sets the maximum number of tokens for the reasoning process.
# For qwen3-vl-plus and qwen3-vl-flash, you can use enable_thinking to enable or disable thinking. For models with the 'thinking' suffix, such as qwen3-vl-235b-a22b-thinking, enable_thinking can only be set to true. This parameter does not apply to other Qwen-VL models.
extra_body={
'enable_thinking': enable_thinking
},
# Uncomment the following to return token usage in the last chunk
# stream_options={
# "include_usage": True
# }
)
if enable_thinking:
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")
for chunk in completion:
# If chunk.choices is empty, print the usage
if not chunk.choices:
print("\nUsage:")
print(chunk.usage)
else:
delta = chunk.choices[0].delta
# Print the thinking process
        if hasattr(delta, 'reasoning_content') and delta.reasoning_content is not None:
print(delta.reasoning_content, end='', flush=True)
reasoning_content += delta.reasoning_content
else:
# Start responding
if delta.content != "" and is_answering is False:
print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
is_answering = True
# Print the response process
print(delta.content, end='', flush=True)
answer_content += delta.content
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(reasoning_content)
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(answer_content)
Node.js
import OpenAI from "openai";
// Initialize the OpenAI client
const openai = new OpenAI({
// API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
// If you have not configured an environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx"
apiKey: process.env.DASHSCOPE_API_KEY,
// The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base_url with https://dashscope-us.aliyuncs.com/compatible-mode/v1
// If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
});
let reasoningContent = '';
let answerContent = '';
let isAnswering = false;
let enableThinking = true;
let messages = [
{
role: "user",
content: [
{ type: "image_url", image_url: { "url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg" } },
{ type: "text", text: "Solve this problem" },
]
}]
async function main() {
try {
const stream = await openai.chat.completions.create({
model: 'qwen3-vl-plus',
messages: messages,
stream: true,
// Note: In the Node.js SDK, non-standard parameters such as enable_thinking are passed as top-level properties and do not need to be wrapped in extra_body.
enable_thinking: enableThinking
});
if (enableThinking){console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');}
for await (const chunk of stream) {
if (!chunk.choices?.length) {
console.log('\nUsage:');
console.log(chunk.usage);
continue;
}
const delta = chunk.choices[0].delta;
// Handle the thinking process
if (delta.reasoning_content) {
process.stdout.write(delta.reasoning_content);
reasoningContent += delta.reasoning_content;
}
// Handle the formal response
else if (delta.content) {
if (!isAnswering) {
console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
isAnswering = true;
}
process.stdout.write(delta.content);
answerContent += delta.content;
}
}
} catch (error) {
console.error('Error:', error);
}
}
main();
HTTP
# ======= IMPORTANT =======
# The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base_url with https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completions
# If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
# API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# === Delete this comment before execution ===
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen3-vl-plus",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
}
},
{
"type": "text",
"text": "Solve this problem"
}
]
}
],
"stream":true,
"stream_options":{"include_usage":true},
"enable_thinking": true
}'
DashScope
Python
import os
import dashscope
from dashscope import MultiModalConversation
# The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base_url with https://dashscope-us.aliyuncs.com/api/v1
# If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"
enable_thinking = True
messages = [
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "How do I solve this problem?"}
]
}
]
response = MultiModalConversation.call(
# If you have not configured an environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx",
# API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv('DASHSCOPE_API_KEY'),
model="qwen3-vl-plus",
messages=messages,
stream=True,
# The enable_thinking parameter enables the thinking process.
# For qwen3-vl-plus and qwen3-vl-flash, you can use enable_thinking to enable or disable thinking. For models with the 'thinking' suffix, such as qwen3-vl-235b-a22b-thinking, enable_thinking can only be set to true. This parameter does not apply to other Qwen-VL models.
enable_thinking=enable_thinking
)
# Define the full thinking process
reasoning_content = ""
# Define the full response
answer_content = ""
# Check if the thinking process has ended and the response has started
is_answering = False
if enable_thinking:
print("=" * 20 + "Thinking process" + "=" * 20)
for chunk in response:
# If both the thinking process and the response are empty, ignore
message = chunk.output.choices[0].message
reasoning_content_chunk = message.get("reasoning_content", None)
if (chunk.output.choices[0].message.content == [] and
reasoning_content_chunk == ""):
pass
else:
# If it is currently the thinking process
        if reasoning_content_chunk is not None and chunk.output.choices[0].message.content == []:
print(chunk.output.choices[0].message.reasoning_content, end="")
reasoning_content += chunk.output.choices[0].message.reasoning_content
# If it is currently the response
elif chunk.output.choices[0].message.content != []:
if not is_answering:
print("\n" + "=" * 20 + "Full response" + "=" * 20)
is_answering = True
print(chunk.output.choices[0].message.content[0]["text"], end="")
answer_content += chunk.output.choices[0].message.content[0]["text"]
# To print the full thinking process and response, uncomment and run the following code
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(f"{reasoning_content}")
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(f"{answer_content}")
Java
// DashScope SDK version >= 2.21.10
import java.util.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.exception.InputRequiredException;
import java.lang.System;
import com.alibaba.dashscope.utils.Constants;
public class Main {
// The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base_url with https://dashscope-us.aliyuncs.com/api/v1
// If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/api/v1
static {Constants.baseHttpApiUrl="https://dashscope-intl.aliyuncs.com/api/v1";}
private static final Logger logger = LoggerFactory.getLogger(Main.class);
private static StringBuilder reasoningContent = new StringBuilder();
private static StringBuilder finalContent = new StringBuilder();
private static boolean isFirstPrint = true;
private static void handleGenerationResult(MultiModalConversationResult message) {
String re = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
String reasoning = Objects.isNull(re) ? "" : re; // Default to an empty string
List<Map<String, Object>> content = message.getOutput().getChoices().get(0).getMessage().getContent();
if (!reasoning.isEmpty()) {
reasoningContent.append(reasoning);
if (isFirstPrint) {
System.out.println("====================Thinking process====================");
isFirstPrint = false;
}
System.out.print(reasoning);
}
if (Objects.nonNull(content) && !content.isEmpty()) {
Object text = content.get(0).get("text");
finalContent.append(content.get(0).get("text"));
if (!isFirstPrint) {
System.out.println("\n====================Full response====================");
isFirstPrint = true;
}
System.out.print(text);
}
}
public static MultiModalConversationParam buildMultiModalConversationParam(MultiModalMessage Msg) {
return MultiModalConversationParam.builder()
// If you have not configured an environment variable, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
// API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("qwen3-vl-plus")
.messages(Arrays.asList(Msg))
.enableThinking(true)
.incrementalOutput(true)
.build();
}
public static void streamCallWithMessage(MultiModalConversation conv, MultiModalMessage Msg)
throws NoApiKeyException, ApiException, InputRequiredException, UploadFileException {
MultiModalConversationParam param = buildMultiModalConversationParam(Msg);
Flowable<MultiModalConversationResult> result = conv.streamCall(param);
result.blockingForEach(message -> {
handleGenerationResult(message);
});
}
public static void main(String[] args) {
try {
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMsg = MultiModalMessage.builder()
.role(Role.USER.getValue())
.content(Arrays.asList(Collections.singletonMap("image", "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"),
Collections.singletonMap("text", "Solve this problem")))
.build();
streamCallWithMessage(conv, userMsg);
// Print the final result
// if (reasoningContent.length() > 0) {
// System.out.println("\n====================Full response====================");
// System.out.println(finalContent.toString());
// }
} catch (ApiException | NoApiKeyException | UploadFileException | InputRequiredException e) {
logger.error("An exception occurred: {}", e.getMessage());
}
System.exit(0);
}
}
HTTP
# ======= IMPORTANT =======
# API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base_url with https://dashscope-us.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# === Delete this comment before execution ===
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-H 'X-DashScope-SSE: enable' \
-d '{
"model": "qwen3-vl-plus",
"input":{
"messages":[
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "Solve this problem"}
]
}
]
},
"parameters":{
"enable_thinking": true,
"incremental_output": true
}
}'
Limit thinking length
To prevent the model from generating an overly long thinking process, use the thinking_budget parameter to limit the maximum number of tokens generated for the thinking process. If the thinking process exceeds this limit, the content is truncated, and the model immediately starts generating the final answer. The default value of thinking_budget is the model's maximum chain-of-thought length. See Model list.
The thinking_budget parameter is supported only by Qwen3-VL (thinking mode).
OpenAI compatible
The thinking_budget parameter is not a standard OpenAI parameter. If you use the OpenAI Python SDK, you must pass it through extra_body.
import os
from openai import OpenAI

client = OpenAI(
    # API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base_url with https://dashscope-us.aliyuncs.com/compatible-mode/v1
    # If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)

reasoning_content = ""  # The full thinking process
answer_content = ""     # The full response
is_answering = False    # Whether the thinking process has ended and the response has started
enable_thinking = True

# Create a chat completion request
completion = client.chat.completions.create(
    model="qwen3-vl-plus",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
                    },
                },
                {"type": "text", "text": "How do I solve this problem?"},
            ],
        },
    ],
    stream=True,
    # The enable_thinking parameter enables the thinking process. The thinking_budget parameter sets the maximum number of tokens for the reasoning process.
    # For qwen3-vl-plus and qwen3-vl-flash, you can use enable_thinking to enable or disable thinking. For models with the 'thinking' suffix, such as qwen3-vl-235b-a22b-thinking, enable_thinking can only be set to true. This parameter does not apply to other Qwen-VL models.
    extra_body={
        "enable_thinking": enable_thinking,
        "thinking_budget": 81920
    },
    # Uncomment the following to return token usage in the last chunk
    # stream_options={
    #     "include_usage": True
    # }
)

if enable_thinking:
    print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    # If chunk.choices is empty, print the usage
    if not chunk.choices:
        print("\nUsage:")
        print(chunk.usage)
    else:
        delta = chunk.choices[0].delta
        # Print the thinking process
        if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
            print(delta.reasoning_content, end="", flush=True)
            reasoning_content += delta.reasoning_content
        else:
            # Start responding
            if delta.content != "" and not is_answering:
                print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
                is_answering = True
            # Print the response
            print(delta.content, end="", flush=True)
            answer_content += delta.content

# To print the full thinking process and response, uncomment and run the following code
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(reasoning_content)
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(answer_content)
import OpenAI from "openai";
// Initialize the OpenAI client
const openai = new OpenAI({
    // API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    // If you have not configured an environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    // The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the baseURL with https://dashscope-us.aliyuncs.com/compatible-mode/v1
    // If you are using a model in the Beijing region, replace the baseURL with https://dashscope.aliyuncs.com/compatible-mode/v1
    baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
});

let reasoningContent = '';
let answerContent = '';
let isAnswering = false;
let enableThinking = true;

const messages = [
    {
        role: "user",
        content: [
            { type: "image_url", image_url: { "url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg" } },
            { type: "text", text: "Solve this problem" },
        ]
    }
];

async function main() {
    try {
        const stream = await openai.chat.completions.create({
            model: 'qwen3-vl-plus',
            messages: messages,
            stream: true,
            // Note: In the Node.js SDK, non-standard parameters such as enable_thinking are passed as top-level properties and do not need to be wrapped in extra_body.
            enable_thinking: enableThinking,
            thinking_budget: 81920
        });
        if (enableThinking) {
            console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');
        }
        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log('\nUsage:');
                console.log(chunk.usage);
                continue;
            }
            const delta = chunk.choices[0].delta;
            // Handle the thinking process
            if (delta.reasoning_content) {
                process.stdout.write(delta.reasoning_content);
                reasoningContent += delta.reasoning_content;
            }
            // Handle the formal response
            else if (delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();
# ======= IMPORTANT =======
# The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base_url with https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completions
# If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
# API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# === Delete this comment before execution ===
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen3-vl-plus",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"
}
},
{
"type": "text",
"text": "Solve this problem"
}
]
}
],
"stream":true,
"stream_options":{"include_usage":true},
"enable_thinking": true,
"thinking_budget": 81920
}'
DashScope
import os
import dashscope
from dashscope import MultiModalConversation

# The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base_url with https://dashscope-us.aliyuncs.com/api/v1
# If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"

enable_thinking = True
messages = [
    {
        "role": "user",
        "content": [
            {"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
            {"text": "How do I solve this problem?"}
        ]
    }
]

response = MultiModalConversation.call(
    # If you have not configured an environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx",
    # API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model="qwen3-vl-plus",
    messages=messages,
    stream=True,
    # The enable_thinking parameter enables the thinking process.
    # For qwen3-vl-plus and qwen3-vl-flash, you can use enable_thinking to enable or disable thinking. For models with the 'thinking' suffix, such as qwen3-vl-235b-a22b-thinking, enable_thinking can only be set to true. This parameter does not apply to other Qwen-VL models.
    enable_thinking=enable_thinking,
    # The thinking_budget parameter sets the maximum number of tokens for the reasoning process.
    thinking_budget=81920,
)

reasoning_content = ""  # The full thinking process
answer_content = ""     # The full response
is_answering = False    # Whether the thinking process has ended and the response has started

if enable_thinking:
    print("=" * 20 + "Thinking process" + "=" * 20)

for chunk in response:
    message = chunk.output.choices[0].message
    reasoning_content_chunk = message.get("reasoning_content", None)
    # If both the thinking process and the response are empty, skip the chunk
    if message.content == [] and reasoning_content_chunk == "":
        continue
    # If the chunk belongs to the thinking process
    if reasoning_content_chunk is not None and message.content == []:
        print(message.reasoning_content, end="")
        reasoning_content += message.reasoning_content
    # If the chunk belongs to the response
    elif message.content != []:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20)
            is_answering = True
        print(message.content[0]["text"], end="")
        answer_content += message.content[0]["text"]

# To print the full thinking process and response, uncomment and run the following code
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(reasoning_content)
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(answer_content)
// DashScope SDK version >= 2.21.10
import java.util.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.utils.Constants;

public class Main {
    // The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base URL with https://dashscope-us.aliyuncs.com/api/v1
    // If you are using a model in the Beijing region, replace the base URL with https://dashscope.aliyuncs.com/api/v1
    static { Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1"; }

    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static final StringBuilder reasoningContent = new StringBuilder();
    private static final StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(MultiModalConversationResult message) {
        String re = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String reasoning = Objects.isNull(re) ? "" : re; // Default to an empty string
        List<Map<String, Object>> content = message.getOutput().getChoices().get(0).getMessage().getContent();
        if (!reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }
        if (Objects.nonNull(content) && !content.isEmpty()) {
            Object text = content.get(0).get("text");
            finalContent.append(text);
            if (!isFirstPrint) {
                System.out.println("\n====================Full response====================");
                isFirstPrint = true;
            }
            System.out.print(text);
        }
    }

    public static MultiModalConversationParam buildMultiModalConversationParam(MultiModalMessage msg) {
        return MultiModalConversationParam.builder()
                // If you have not configured an environment variable, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
                // API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen3-vl-plus")
                .messages(Arrays.asList(msg))
                .enableThinking(true)
                .thinkingBudget(81920)
                .incrementalOutput(true)
                .build();
    }

    public static void streamCallWithMessage(MultiModalConversation conv, MultiModalMessage msg)
            throws NoApiKeyException, ApiException, InputRequiredException, UploadFileException {
        MultiModalConversationParam param = buildMultiModalConversationParam(msg);
        Flowable<MultiModalConversationResult> result = conv.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            MultiModalConversation conv = new MultiModalConversation();
            MultiModalMessage userMsg = MultiModalMessage.builder()
                    .role(Role.USER.getValue())
                    .content(Arrays.asList(
                            Collections.singletonMap("image", "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"),
                            Collections.singletonMap("text", "Solve this problem")))
                    .build();
            streamCallWithMessage(conv, userMsg);
            // To print the final result, uncomment the following code
            // if (reasoningContent.length() > 0) {
            //     System.out.println("\n====================Full response====================");
            //     System.out.println(finalContent.toString());
            // }
        } catch (ApiException | NoApiKeyException | UploadFileException | InputRequiredException e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}
# ======= IMPORTANT =======
# API keys differ by region. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base URL for the Singapore region. If you are using a model in the US (Virginia) region, replace the base_url with https://dashscope-us.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# If you are using a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# === Delete this comment before execution ===
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-H 'X-DashScope-SSE: enable' \
-d '{
"model": "qwen3-vl-plus",
"input":{
"messages":[
{
"role": "user",
"content": [
{"image": "https://img.alicdn.com/imgextra/i1/O1CN01gDEY8M1W114Hi3XcN_!!6000000002727-0-tps-1024-406.jpg"},
{"text": "Solve this problem"}
]
}
]
},
"parameters":{
"enable_thinking": true,
"incremental_output": true,
"thinking_budget": 81920
}
}'
More examples
In addition to their reasoning capabilities, visual reasoning models have all the features of visual understanding models. You can combine these features to handle more complex scenarios.
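For instance, video understanding can be combined with thinking mode using the same request shape as the DashScope examples above: the thinking parameters stay in `parameters`, while the user message carries a `video` item holding a list of frame URLs. The following sketch only builds the request body; the frame URLs and the helper function name are placeholders, and the `video` content item is an assumption based on the DashScope multimodal message format rather than part of the examples above.

```python
# A DashScope-style request body combining thinking mode with video input.
# Hypothetical helper; the frame URLs are placeholders, not real resources.
def build_video_thinking_request(frame_urls, question, thinking_budget=4096):
    return {
        "model": "qwen3-vl-plus",
        "input": {
            "messages": [
                {
                    "role": "user",
                    "content": [
                        # A video passed as an ordered list of frame image URLs
                        {"video": list(frame_urls)},
                        {"text": question},
                    ],
                }
            ]
        },
        "parameters": {
            "enable_thinking": True,      # turn on the thinking process
            "incremental_output": True,   # stream deltas instead of cumulative text
            "thinking_budget": thinking_budget,
        },
    }

body = build_video_thinking_request(
    ["https://example.com/frame1.jpg", "https://example.com/frame2.jpg"],
    "What trend does the chart in this video show?",
)
print(body["parameters"]["enable_thinking"])  # True
```

The body can then be POSTed to the multimodal-generation endpoint shown in the curl examples above, with the same `Authorization` and `X-DashScope-SSE` headers.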
Billing
Total cost = (Input tokens × Input price per token) + (Output tokens × Output price per token).
The thinking process (reasoning_content) is part of the output content and is billed as output tokens. If a model in thinking mode does not output a thinking process, the request is billed at the non-thinking mode price. For information about how to calculate tokens for images or videos, see Visual understanding.
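The formula above can be sketched as a small Python helper. The per-token prices below are placeholders, not real list prices; substitute the prices for your model and region from the pricing page.

```python
def total_cost(input_tokens, output_tokens, input_price_per_1k, output_price_per_1k):
    """Total cost = (input tokens x input price) + (output tokens x output price).

    Prices are expressed per 1,000 tokens. Note that the thinking process
    (reasoning_content) counts toward output_tokens.
    """
    return (input_tokens / 1000) * input_price_per_1k \
         + (output_tokens / 1000) * output_price_per_1k

# Hypothetical prices: $0.002 per 1K input tokens, $0.008 per 1K output tokens.
cost = total_cost(input_tokens=1500, output_tokens=3000,
                  input_price_per_1k=0.002, output_price_per_1k=0.008)
print(f"${cost:.4f}")  # 1.5 * 0.002 + 3 * 0.008 = $0.0270
```

Because the thinking process is billed as output, lowering thinking_budget is one way to bound the output-token portion of the cost.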
API reference
For the input and output parameters, see Qwen.
Error codes
If a call fails, see Error messages for troubleshooting.