The Qwen3 and QwQ (based on Qwen2.5) models have powerful reasoning capabilities. They first output their thinking process and then the response content.
Model overview
Qwen3
Qwen3 supports thinking and non-thinking modes, allowing you to switch between the two using the enable_thinking
parameter. In addition to this, the model's capabilities have been significantly enhanced:
Reasoning capability: The model significantly outperforms QwQ and same-size non-thinking models on mathematics, coding, and logical reasoning evaluations, reaching SOTA performance at its size.
Human preference following: Its abilities in creative writing, role-playing, multi-turn conversation, and instruction following have greatly improved, surpassing the general capabilities of models of similar size.
Agent capability: The model achieves industry-leading levels in both thinking and non-thinking modes, enabling precise invocation of external tools.
Multilingual capability: The model supports over 100 languages and dialects, with marked improvements in multilingual translation, instruction comprehension, and common-sense reasoning.
Response format fixes: Issues with response formats in earlier versions, such as anomalous Markdown, mid-text truncation, and incorrect boxed outputs, have been fixed.
When enable_thinking is enabled, there is a very small probability that the reasoning content is not output.
The thinking mode only supports incremental (streaming) output.
Commercial models
Only the latest and 0428 versions of Qwen-Plus and Qwen-Turbo belong to the Qwen3 series and support the thinking mode.
By default, the thinking mode is not enabled for the commercial models. You must first set enable_thinking to true.
Qwen-Plus
Name | Version | Context window (tokens) | Maximum input (tokens) | Maximum CoT (tokens) | Maximum response (tokens) | Input price (per million tokens) | Output price (per million tokens) | Free quota |
--- | --- | --- | --- | --- | --- | --- | --- | --- |
qwen-plus-latest (always same performance as the latest snapshot) | Latest | 131,072 | 98,304 | 38,912 | 16,384 | $0.4 | $8 | 1 million tokens each; valid for 180 days after activation |
qwen-plus-2025-04-28 (also qwen-plus-0428) | Snapshot | | | | | | | |
Qwen-Turbo
Name | Version | Context window (tokens) | Maximum input (tokens) | Maximum CoT (tokens) | Maximum response (tokens) | Input price (per million tokens) | Output price (per million tokens) | Free quota |
--- | --- | --- | --- | --- | --- | --- | --- | --- |
qwen-turbo-latest (always same performance as the latest snapshot) | Latest | 131,072 | 98,304 | 38,912 | 16,384 | $0.05 | $1 | 1 million tokens each; valid for 180 days after activation |
qwen-turbo-2025-04-28 (also qwen-turbo-0428) | Snapshot | | | | | | | |
Open source models
The thinking mode is enabled by default for open source models. To disable it, set enable_thinking to false.
Open source Qwen3 only supports streaming output in both thinking and non-thinking modes.
Name | Context window (tokens) | Maximum input (tokens) | Maximum CoT (tokens) | Maximum response (tokens) | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) | Free quota |
--- | --- | --- | --- | --- | --- | --- |
qwen3-235b-a22b | 131,072 | 98,304 | 38,912 | 16,384 | $0.7 | $8.4 | 1 million tokens each; valid for 180 days after activation |
qwen3-32b | | | | | | | |
qwen3-30b-a3b | | | | | $0.2 | $2.4 | |
qwen3-14b | | | | 8,192 | $0.35 | $4.2 | |
qwen3-8b | | | | | $0.18 | $2.1 | |
qwen3-4b | | | | | $0.11 | $1.26 | |
qwen3-1.7b | 32,768 | 28,672 | 30,720 (CoT+Response) | | | | |
qwen3-0.6b | | | | | | | |
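For the open source models, the enable_thinking switch works in the opposite direction from the commercial models: thinking is on unless you turn it off. As a minimal sketch (pure parameter assembly, no network call; the helper name build_request is hypothetical, and qwen3-8b is just one of the sizes in the table above), the keyword arguments you would pass to client.chat.completions.create() with the OpenAI SDK could look like:

```python
def build_request(model: str, prompt: str, thinking: bool) -> dict:
    """Assemble kwargs for client.chat.completions.create() (OpenAI SDK).

    Open source Qwen3 enables the thinking mode by default, so pass
    thinking=False to turn it off via enable_thinking. Streaming stays
    on because open source Qwen3 only supports streaming output.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # enable_thinking is not a standard OpenAI parameter, so it goes in extra_body
        "extra_body": {"enable_thinking": thinking},
        "stream": True,
    }

# Example (assumes a configured client as in the samples below):
# completion = client.chat.completions.create(**build_request("qwen3-8b", "Hello", thinking=False))
```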
QwQ (based on Qwen2.5)
The QwQ reasoning model, trained on Qwen2.5, substantially improves reasoning capability through reinforcement learning. Its performance on core mathematics and coding benchmarks (AIME 24/25, LiveCodeBench) and on general benchmarks (IFEval, LiveBench, and others) reaches the level of DeepSeek-R1.
Reasoning cannot be disabled.
Only streaming output is supported.
Commercial models
Name | Version | Context window (tokens) | Maximum input (tokens) | Maximum CoT (tokens) | Maximum response (tokens) | Input price (per million tokens) | Output price (per million tokens) | Free quota |
--- | --- | --- | --- | --- | --- | --- | --- | --- |
qwq-plus | Stable | 131,072 | 98,304 | 32,768 | 8,192 | $0.8 | $2.4 | 1 million tokens; valid for 180 days after activation |
For information about rate limiting, see Rate limits.
Get started
Prerequisites: You must have obtained an API key and configured it as an environment variable. To use the SDKs, install the OpenAI or DashScope SDK. The DashScope SDK for Java must be version 2.19.4 or later.
Run the following code to call a deep thinking model in streaming mode. Get the thinking process from the returned reasoning_content and the response from the returned content.
OpenAI
Python
Sample code
from openai import OpenAI
import os

# Initialize the OpenAI client
client = OpenAI(
    # If environment variables are not configured, replace with the Model Studio API Key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Who are you"}]

completion = client.chat.completions.create(
    model="qwen-plus-2025-04-28",  # Replace with other deep thinking models as needed
    messages=messages,
    # enable_thinking turns on the thinking process; this parameter is ignored by QwQ models
    extra_body={"enable_thinking": True},
    stream=True,
    # stream_options={
    #     "include_usage": True
    # },
)

reasoning_content = ""  # Complete reasoning process
answer_content = ""     # Complete response
is_answering = False    # Whether the response phase has started

print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")

for chunk in completion:
    if not chunk.choices:
        print("\nUsage:")
        print(chunk.usage)
        continue
    delta = chunk.choices[0].delta
    # Collect the reasoning content
    if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
        if not is_answering:
            print(delta.reasoning_content, end="", flush=True)
        reasoning_content += delta.reasoning_content
    # Content received: the response phase has started
    if hasattr(delta, "content") and delta.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
            is_answering = True
        print(delta.content, end="", flush=True)
        answer_content += delta.content
Sample response
====================Thinking Process====================
Alright, the user asked "Who are you?" and I need to provide an accurate and friendly response. First, I should confirm my identity: I am Qwen, developed by the Tongyi Lab under Alibaba Group. Next, I should explain my main functions, such as answering questions, creating text, logical reasoning, etc. At the same time, the tone should remain warm and approachable, avoiding overly technical jargon so that the user feels comfortable. It’s also important not to use complex terms and to ensure the response is concise and clear. Additionally, it might be helpful to include some interactive elements, inviting the user to ask more questions to promote further communication. Lastly, I need to check for any missing key information, such as my name "Qwen", along with my association to Alibaba Group and Tongyi Lab. This ensures the response is comprehensive and meets user expectations.
====================Complete Response====================
Hello! I am Qwen, a super-large-scale language model independently developed by the Tongyi Lab under Alibaba Group. I can answer questions, create text, perform logical reasoning, programming, and more, aiming to provide users with high-quality information and services. You can call me Qwen. Is there anything I can assist you with?
Node.js
Sample code
import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY, // Read from environment variables
    baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = '';
let answerContent = '';
let isAnswering = false;

async function main() {
    try {
        const messages = [{ role: 'user', content: 'Who are you?' }];
        const stream = await openai.chat.completions.create({
            // Replace with other Qwen3 models or QwQ models as needed
            model: 'qwen-plus-2025-04-28',
            messages,
            stream: true,
            // enable_thinking turns on the reasoning process; this parameter is ignored by QwQ models
            enable_thinking: true
        });
        console.log('\n' + '='.repeat(20) + 'Thinking Process' + '='.repeat(20) + '\n');
        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log('\nUsage:');
                console.log(chunk.usage);
                continue;
            }
            const delta = chunk.choices[0].delta;
            // Collect the reasoning content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }
            // Content received: the response phase has started
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Complete Response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();
Sample response
====================Thinking Process====================
Okay, the user asked "Who are you," and I need to respond with my identity. Firstly, I should clearly state that I am Qwen, a large-scale language model developed by Alibaba Cloud. Next, I can mention my main functions, such as answering questions, generating text, logical reasoning, etc. It's also important to highlight my multilingual support, including Chinese and English, so the user knows I can handle requests in different languages. Additionally, it might be helpful to explain my application scenarios, such as assistance in learning, work, and daily life. However, the user's question is quite direct, so detailed information may not be necessary; it's best to keep it concise and clear. It's important to maintain a friendly tone and invite further questions from the user. Check if there is any missing crucial information, like my version or the latest updates, but the user might not need such detailed information. Finally, ensure the answer is accurate and free of errors.
====================Complete Response====================
I am Qwen, a large-scale language model independently developed by Tongyi Lab. I can handle various tasks such as answering questions, generating text, logical reasoning, programming, and supporting multiple languages including Chinese and English. If you have any questions or need help, feel free to let me know at any time!
HTTP
Sample code
curl
For Qwen3, set enable_thinking to true to enable reasoning. enable_thinking is not effective for QwQ.
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen-plus-2025-04-28",
"messages": [
{
"role": "user",
"content": "Who are you?"
}
],
"stream": true,
"stream_options": {
"include_usage": true
},
"enable_thinking": true
}'
Sample response
data: {"choices":[{"delta":{"content":null,"role":"assistant","reasoning_content":""},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1745485391,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}
.....
data: {"choices":[{"finish_reason":"stop","delta":{"content":"","reasoning_content":null},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1745485391,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}
data: {"choices":[],"object":"chat.completion.chunk","usage":{"prompt_tokens":10,"completion_tokens":360,"total_tokens":370},"created":1745485391,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}
data: [DONE]
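On the client side, the chunks above can be separated into thinking and answer text with plain SSE line parsing. A hedged sketch (split_stream is a hypothetical helper, not part of any SDK; field names follow the sample chunks above):

```python
import json

def split_stream(sse_lines):
    """Separate reasoning and answer text from OpenAI-compatible SSE lines.

    Each payload line looks like 'data: {...}'; the final sentinel is
    'data: [DONE]'. Chunks carry the thinking process in
    delta.reasoning_content and the answer in delta.content.
    """
    reasoning, answer = [], []
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip SSE ids / comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("reasoning_content"):
                reasoning.append(delta["reasoning_content"])
            if delta.get("content"):
                answer.append(delta["content"])
    return "".join(reasoning), "".join(answer)
```

Feeding the raw response lines through such a helper yields the concatenated thinking process and the concatenated answer, mirroring what the SDK samples do incrementally.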
DashScope
When using DashScope to call Qwen3:
incremental_output must be true.
result_format must be "message".
When using DashScope to call QwQ:
incremental_output must be true.
result_format defaults to "message".
Python
Sample code
import os
import dashscope
from dashscope import Generation

dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1/"

messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If the environment variable is not set, replace the line below with the Model Studio API Key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # Replace with other deep thinking models as needed
    model="qwen-plus-2025-04-28",
    messages=messages,
    result_format="message",
    # Enable deep thinking; this parameter is ignored by QwQ models
    enable_thinking=True,
    stream=True,
    incremental_output=True,
)

reasoning_content = ""  # Complete reasoning process
answer_content = ""     # Complete response
is_answering = False    # Whether the response phase has started

print("=" * 20 + "Thinking Process" + "=" * 20)

for chunk in completion:
    message = chunk.output.choices[0].message
    # Ignore chunks where both the reasoning process and the response are empty
    if message.content == "" and message.reasoning_content == "":
        continue
    # Still in the reasoning process
    if message.reasoning_content != "" and message.content == "":
        print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content
    # In the response phase
    elif message.content != "":
        if not is_answering:
            print("\n" + "=" * 20 + "Complete Response" + "=" * 20)
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content

# To print the complete reasoning process and complete response, uncomment the lines below
# print("=" * 20 + "Complete Thinking Process" + "=" * 20 + "\n")
# print(f"{reasoning_content}")
# print("=" * 20 + "Complete Response" + "=" * 20 + "\n")
# print(f"{answer_content}")
Sample response
====================Thinking Process====================
Alright, the user asked, "Who are you?" and I need to answer this question. First, I should clarify my identity, namely Qwen, a large-scale language model developed by Alibaba Cloud. Next, I need to explain my functions and purposes, such as answering questions, generating text, logical reasoning, etc. Moreover, I should emphasize my goal of being a helpful assistant to users, providing help and support.
When expressing, I should keep it conversational and avoid technical jargon or complex sentences. Adding some friendly terms, like "Hello there~", can make the conversation more natural. Additionally, it’s important to ensure the information is accurate and doesn't miss key points, such as my developer, main functions, and usage scenarios.
I should also consider possible follow-up questions from the user, like specific application examples or technical details, so I can plant subtle hints in the response to guide further questions. For example, mentioning "Whether it's everyday inquiries or professional domain questions, I'm here to assist," offers a comprehensive yet inviting approach.
Finally, I need to check if the response flows smoothly and doesn’t contain repetitive or redundant information, making sure it's concise and clear. While keeping a balance between friendliness and professionalism, I should ensure the user feels both welcomed and assured.
====================Complete Response====================
Hello there~ I'm Qwen, a large-scale language model developed by Alibaba Cloud. I can answer questions, generate text, perform logical reasoning, and even handle programming tasks, aiming to provide help and support to users. Whether it's everyday inquiries or professional domain questions, I'm here to assist. Is there anything I can help you with?
Java
Sample code
// DashScope SDK version >= 2.19.4
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import com.alibaba.dashscope.utils.Constants;

public class Main {
    static {
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
    }

    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();
        if (!reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking Process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }
        if (!content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Complete Response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }

    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If the environment variable is not set, replace the line below with the Model Studio API Key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen-plus-2025-04-28")
                .enableThinking(true)
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }

    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
            // Print the final result
            // if (reasoningContent.length() > 0) {
            //     System.out.println("\n====================Complete Response====================");
            //     System.out.println(finalContent.toString());
            // }
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}
Sample response
====================Thinking Process====================
Alright, the user asked "Who are you?" and I need to respond based on previous settings. First and foremost, my role is Qwen, a large-scale language model under Alibaba Group. The answer should be conversational and easy to understand.
The user might be new to interacting with me or wants to confirm my identity. I should first directly state who I am and then briefly explain my functions and purposes, such as answering questions, creating text, programming, etc. It's also important to mention my multilingual support so the user knows I can handle requests in different languages.
Additionally, according to guidelines, I should maintain a personable approach, so the tone should be friendly, possibly using emojis to increase friendliness. Moreover, I might need to guide the user toward further questions or using my features, like asking if they need any help.
It's crucial to avoid complex jargon and lengthy explanations. Check for any missing key points, like multilingual support and specific abilities. Ensure the response meets all requirements, including being conversational and concise.
====================Complete Response====================
Hello! I'm Qwen, a large-scale language model under Alibaba Group. I can answer questions, create text like writing stories, official documents, emails, scripts, perform logical reasoning, programming, and more. I can also express opinions, play games, etc. I am proficient in multiple languages, including but not limited to Chinese, English, German, French, Spanish, and more. Is there anything I can assist you with?
HTTP
Sample code
curl
For Qwen3, set enable_thinking to true to enable reasoning. enable_thinking is not effective for QwQ.
curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
"model": "qwen-plus-2025-04-28",
"input":{
"messages":[
{
"role": "user",
"content": "Who are you?"
}
]
},
"parameters":{
"enable_thinking": true,
"incremental_output": true,
"result_format": "message"
}
}'
Sample response
id:1
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"Well","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":14,"input_tokens":11,"output_tokens":3},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
id:2
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":", ","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":15,"input_tokens":11,"output_tokens":4},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
id:3
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"the user","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":16,"input_tokens":11,"output_tokens":5},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
id:4
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"asks","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":17,"input_tokens":11,"output_tokens":6},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
id:5
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"“","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":18,"input_tokens":11,"output_tokens":7},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
......
id:358
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"help","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":373,"input_tokens":11,"output_tokens":362},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
id:359
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":", ","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":374,"input_tokens":11,"output_tokens":363},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
id:360
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"please","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":375,"input_tokens":11,"output_tokens":364},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
id:361
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"tell","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":376,"input_tokens":11,"output_tokens":365},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
id:362
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"me","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":377,"input_tokens":11,"output_tokens":366},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
id:363
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"!","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":378,"input_tokens":11,"output_tokens":367},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
id:364
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"","role":"assistant"},"finish_reason":"stop"}]},"usage":{"total_tokens":378,"input_tokens":11,"output_tokens":367},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
Multi-round conversation
By default, the API does not store your conversation history. The multi-round conversation feature gives the model the ability to "remember" past interactions, for scenarios such as follow-up questions and information gathering. You will receive both reasoning_content and content from the model. To build the context, include only content, as {'role': 'assistant', 'content': concatenated streaming output content}; reasoning_content is not required.
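The rule above can be sketched as a small helper (the name append_turn is hypothetical, not part of any SDK): only the concatenated content is folded back into the context, and the reasoning is deliberately dropped.

```python
def append_turn(messages, answer_chunks, reasoning_chunks=None):
    """Fold one completed model turn back into the conversation context.

    Only the answer (content) is kept; reasoning_content must NOT be sent
    back, so reasoning_chunks is accepted and deliberately discarded.
    """
    messages.append({
        "role": "assistant",
        "content": "".join(answer_chunks),
    })
    return messages

# Usage: after streaming finishes, fold the collected chunks into the context
messages = [{"role": "user", "content": "Hello"}]
append_turn(messages, ["Hi! ", "How can I help?"], reasoning_chunks=["..."])
```

The next user message is then appended to the same messages list before the next API call, exactly as in the full samples below.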
OpenAI
Implement multi-round conversation through OpenAI SDK or OpenAI-compatible HTTP method.
Python
Sample code
from openai import OpenAI
import os

# Initialize the OpenAI client
client = OpenAI(
    # If the environment variable is not set, replace the following with the Model Studio API Key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

messages = []
conversation_idx = 1

while True:
    # Reset per-round state
    reasoning_content = ""  # Complete reasoning process for this round
    answer_content = ""     # Complete response for this round
    is_answering = False    # Whether the response phase has started
    print("=" * 20 + f"Round {conversation_idx} Conversation" + "=" * 20)
    conversation_idx += 1
    user_msg = {"role": "user", "content": input("Please enter your message: ")}
    messages.append(user_msg)
    # Create the chat completion request
    completion = client.chat.completions.create(
        # Replace with other deep thinking models as needed
        model="qwen-plus-2025-04-28",
        messages=messages,
        # enable_thinking turns on the reasoning process; this parameter is ignored by QwQ models
        extra_body={"enable_thinking": True},
        stream=True,
        # stream_options={
        #     "include_usage": True
        # }
    )
    print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")
    for chunk in completion:
        # If chunk.choices is empty, print the usage
        if not chunk.choices:
            print("\nUsage:")
            print(chunk.usage)
        else:
            delta = chunk.choices[0].delta
            # Print the reasoning process
            if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
                print(delta.reasoning_content, end="", flush=True)
                reasoning_content += delta.reasoning_content
            else:
                # The response phase has started
                if delta.content != "" and is_answering is False:
                    print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
                    is_answering = True
                # Print the response
                print(delta.content, end="", flush=True)
                answer_content += delta.content
    # Add only the model's response content to the context
    messages.append({"role": "assistant", "content": answer_content})
    print("\n")
Node.js
Sample code
import OpenAI from "openai";
import process from 'process';
import readline from 'readline/promises';

// Initialize the readline interface
const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
});

// Initialize the OpenAI client
const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY, // Retrieve from environment variables
    baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = '';
let answerContent = '';
let isAnswering = false;
let messages = [];
let conversationIdx = 1;

async function main() {
    while (true) {
        console.log("=".repeat(20) + `Round ${conversationIdx} Conversation` + "=".repeat(20));
        conversationIdx++;
        // Read user input
        const userInput = await rl.question("Please enter your message: ");
        messages.push({ role: 'user', content: userInput });
        // Reset per-round state
        reasoningContent = '';
        answerContent = '';
        isAnswering = false;
        try {
            const stream = await openai.chat.completions.create({
                // Replace with other deep thinking models as needed
                model: 'qwen-plus-2025-04-28',
                messages: messages,
                // enable_thinking turns on the reasoning process; this parameter is ignored by QwQ models
                enable_thinking: true,
                stream: true,
                // stream_options: {
                //     include_usage: true
                // }
            });
            console.log("\n" + "=".repeat(20) + "Thinking Process" + "=".repeat(20) + "\n");
            for await (const chunk of stream) {
                if (!chunk.choices?.length) {
                    console.log('\nUsage:');
                    console.log(chunk.usage);
                    continue;
                }
                const delta = chunk.choices[0].delta;
                // Handle the reasoning process
                if (delta.reasoning_content) {
                    process.stdout.write(delta.reasoning_content);
                    reasoningContent += delta.reasoning_content;
                }
                // Handle the formal response
                if (delta.content) {
                    if (!isAnswering) {
                        console.log('\n' + "=".repeat(20) + "Complete Response" + "=".repeat(20) + "\n");
                        isAnswering = true;
                    }
                    process.stdout.write(delta.content);
                    answerContent += delta.content;
                }
            }
            // Add the complete response to the message history
            messages.push({ role: 'assistant', content: answerContent });
            console.log("\n");
        } catch (error) {
            console.error('Error:', error);
        }
    }
}

// Start the program
main().catch(console.error);
HTTP
Sample code
curl
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen-plus-2025-04-28",
"messages": [
{
"role": "user",
"content": "Hello"
},
{
"role": "assistant",
"content": "Hello! Nice to meet you, how can I help you?"
},
{
"role": "user",
"content": "Who are you?"
}
],
"stream": true,
"stream_options": {
"include_usage": true
},
"enable_thinking": true
}'
DashScope
Implement multi-round conversation through DashScope SDK or HTTP method.
Python
Sample code
import os
import dashscope

dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1/"

messages = []
conversation_idx = 1

while True:
    print("=" * 20 + f"Round {conversation_idx} Conversation" + "=" * 20)
    conversation_idx += 1
    user_msg = {"role": "user", "content": input("Please enter your message: ")}
    messages.append(user_msg)
    response = dashscope.Generation.call(
        # If the environment variable is not set, replace the following with the Model Studio API Key: api_key="sk-xxx",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        # qwen-plus-2025-04-28 is used as an example; replace with other deep thinking models as needed
        model="qwen-plus-2025-04-28",
        messages=messages,
        # enable_thinking turns on the reasoning process; this parameter is ignored by QwQ models
        enable_thinking=True,
        result_format="message",
        stream=True,
        incremental_output=True,
    )
    reasoning_content = ""  # Complete reasoning process for this round
    answer_content = ""     # Complete response for this round
    is_answering = False    # Whether the response phase has started
    print("=" * 20 + "Thinking Process" + "=" * 20)
    for chunk in response:
        message = chunk.output.choices[0].message
        # Ignore chunks where both the reasoning process and the response are empty
        if message.content == "" and message.reasoning_content == "":
            continue
        # Still in the reasoning process
        if message.reasoning_content != "" and message.content == "":
            print(message.reasoning_content, end="", flush=True)
            reasoning_content += message.reasoning_content
        # In the response phase
        elif message.content != "":
            if not is_answering:
                print("\n" + "=" * 20 + "Complete Response" + "=" * 20)
                is_answering = True
            print(message.content, end="", flush=True)
            answer_content += message.content
    # Add only the model's response content to the context
    messages.append({"role": "assistant", "content": answer_content})
    print("\n")
    # To print the complete reasoning process and complete response, uncomment the lines below
    # print("=" * 20 + "Complete Thinking Process" + "=" * 20 + "\n")
    # print(f"{reasoning_content}")
    # print("=" * 20 + "Complete Response" + "=" * 20 + "\n")
    # print(f"{answer_content}")
Java
Sample code
// Version of dashscope SDK >= 2.19.4
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import java.util.List;
import com.alibaba.dashscope.protocol.Protocol;

public class Main {
    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();
        if (!reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking Process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }
        if (!content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Complete Response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }

    private static GenerationParam buildGenerationParam(List<Message> messages) {
        return GenerationParam.builder()
                // If the environment variable is not set, please replace the following with the Model Studio API Key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                // qwen-plus-2025-04-28 is used as an example here; you can switch to other model names as needed
                .model("qwen-plus-2025-04-28")
                .enableThinking(true)
                .messages(messages)
                .incrementalOutput(true)
                .build();
    }

    public static void streamCallWithMessage(Generation gen, List<Message> messages)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(messages);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-intl.aliyuncs.com/api/v1");
            Message userMsg1 = Message.builder()
                    .role(Role.USER.getValue())
                    .content("Hello")
                    .build();
            Message assistantMsg = Message.builder()
                    .role(Role.ASSISTANT.getValue())
                    .content("Hello! Nice to meet you, is there anything I can assist you with?")
                    .build();
            Message userMsg2 = Message.builder()
                    .role(Role.USER.getValue())
                    .content("Who are you?")
                    .build();
            List<Message> messages = Arrays.asList(userMsg1, assistantMsg, userMsg2);
            streamCallWithMessage(gen, messages);
            // Print the final result
            // if (reasoningContent.length() > 0) {
            //     System.out.println("\n====================Complete Response====================");
            //     System.out.println(finalContent.toString());
            // }
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}
HTTP
Sample code
curl
curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "qwen-plus-2025-04-28",
    "input":{
        "messages":[
            {
                "role": "user",
                "content": "Hello"
            },
            {
                "role": "assistant",
                "content": "Hello! Nice to meet you, how can I help you?"
            },
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters":{
        "enable_thinking": true,
        "incremental_output": true,
        "result_format": "message"
    }
}'
Limit thinking length
Deep thinking models may produce lengthy reasoning processes, which increases wait times and token consumption. To control this, set the thinking_budget
parameter to limit the length of the reasoning process.
If the number of reasoning tokens exceeds thinking_budget
, the reasoning content is truncated and the final response begins immediately.
Only Qwen3 supports this parameter.
OpenAI
Python
Sample code
from openai import OpenAI
import os

# Initialize OpenAI client
client = OpenAI(
    # If the environment variable is not set, please replace the following with the Model Studio API Key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
messages = [{"role": "user", "content": "Who are you?"}]
completion = client.chat.completions.create(
    model="qwen-plus-2025-04-28",  # You can switch to other deep thinking models as needed
    messages=messages,
    # The enable_thinking parameter initiates the reasoning process, and the thinking_budget parameter sets the maximum number of tokens for the reasoning process. Both parameters are ineffective for QwQ models.
    extra_body={
        "enable_thinking": True,
        "thinking_budget": 50
    },
    stream=True,
    # stream_options={
    #     "include_usage": True
    # },
)
reasoning_content = ""  # Complete reasoning process
answer_content = ""  # Complete response
is_answering = False  # Whether entering the response phase
print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")
for chunk in completion:
    if not chunk.choices:
        print("\nUsage:")
        print(chunk.usage)
        continue
    delta = chunk.choices[0].delta
    # Collect only reasoning content
    if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
        if not is_answering:
            print(delta.reasoning_content, end="", flush=True)
        reasoning_content += delta.reasoning_content
    # Receive content and start responding
    if hasattr(delta, "content") and delta.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
            is_answering = True
        print(delta.content, end="", flush=True)
        answer_content += delta.content
Sample response
====================Thinking Process====================
Alright, the user asked "Who are you," and I need to provide a clear and friendly response. First, I should clarify my identity as Qwen, developed by Tongyi Lab under Alibaba Group. Next, I should explain my main functions, such as answering questions, generating text, logical reasoning, etc., aimed at helping and providing convenience to users.
====================Complete Response====================
I am Qwen, a large-scale language model developed by Tongyi Lab under Alibaba Group. I am capable of answering questions, generating text, performing logical reasoning, programming, and more, all aimed at providing help and convenience to users. Is there anything I can assist you with?
Node.js
Sample code
import OpenAI from "openai";
import process from 'process';

// Initialize OpenAI client
const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY, // Retrieve from environment variables
    baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = '';
let answerContent = '';
let isAnswering = false;

async function main() {
    try {
        const messages = [{ role: 'user', content: 'Who are you?' }];
        const stream = await openai.chat.completions.create({
            // qwen-plus-2025-04-28 is used as an example here; you can switch to other deep thinking models as needed
            model: 'qwen-plus-2025-04-28',
            messages,
            stream: true,
            // The enable_thinking parameter initiates the reasoning process, and the thinking_budget parameter sets the maximum number of tokens for the reasoning process. Both parameters are ineffective for QwQ models.
            enable_thinking: true,
            thinking_budget: 50
        });
        console.log('\n' + '='.repeat(20) + 'Thinking Process' + '='.repeat(20) + '\n');
        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log('\nUsage:');
                console.log(chunk.usage);
                continue;
            }
            const delta = chunk.choices[0].delta;
            // Collect only reasoning content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }
            // Receive content and start responding
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Complete Response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();
Sample response
====================Thinking Process====================
Alright, the user asked "Who are you?" and I need to provide a clear and accurate response. First, I should introduce my identity as Qwen, developed by Tongyi Lab under Alibaba Group. Next, I should explain my main functions, such as answering questions.
====================Complete Response====================
I am Qwen, a large-scale language model independently developed by Tongyi Lab under Alibaba Group. I am capable of answering questions, generating text, performing logical reasoning, programming, and handling various tasks. If you have any questions or need assistance, feel free to let me know anytime!
HTTP
Sample code
curl
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus-2025-04-28",
    "messages": [
        {
            "role": "user",
            "content": "Who are you"
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "enable_thinking": true,
    "thinking_budget": 50
}'
Sample response
data: {"choices":[{"delta":{"content":null,"role":"assistant","reasoning_content":""},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1745485391,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}
.....
data: {"choices":[{"finish_reason":"stop","delta":{"content":"","reasoning_content":null},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1745485391,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}
data: {"choices":[],"object":"chat.completion.chunk","usage":{"prompt_tokens":10,"completion_tokens":360,"total_tokens":370},"created":1745485391,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}
data: [DONE]
DashScope
Python
Sample code
import os
from dashscope import Generation
import dashscope

dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1/"
messages = [{"role": "user", "content": "Who are you?"}]
completion = Generation.call(
    # If the environment variable is not set, please replace the following with the Model Studio API Key: api_key = "sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # You can switch to other deep thinking models as needed
    model="qwen-plus-2025-04-28",
    messages=messages,
    result_format="message",
    # Enable deep thinking; this parameter is ineffective for QwQ models
    enable_thinking=True,
    # Set the maximum number of tokens for the reasoning process; this parameter is ineffective for QwQ models
    thinking_budget=50,
    stream=True,
    incremental_output=True,
)
# Define complete reasoning process
reasoning_content = ""
# Define complete response
answer_content = ""
# Determine whether the reasoning process has finished and response has started
is_answering = False
print("=" * 20 + "Thinking Process" + "=" * 20)
for chunk in completion:
    # Ignore if both reasoning process and response are empty
    if (
        chunk.output.choices[0].message.content == ""
        and chunk.output.choices[0].message.reasoning_content == ""
    ):
        pass
    else:
        # If currently in reasoning process
        if (
            chunk.output.choices[0].message.reasoning_content != ""
            and chunk.output.choices[0].message.content == ""
        ):
            print(chunk.output.choices[0].message.reasoning_content, end="", flush=True)
            reasoning_content += chunk.output.choices[0].message.reasoning_content
        # If currently in response
        elif chunk.output.choices[0].message.content != "":
            if not is_answering:
                print("\n" + "=" * 20 + "Complete Response" + "=" * 20)
                is_answering = True
            print(chunk.output.choices[0].message.content, end="", flush=True)
            answer_content += chunk.output.choices[0].message.content
# If you need to print the complete reasoning process and complete response, please uncomment the code below and run it
# print("=" * 20 + "Complete Thinking Process" + "=" * 20 + "\n")
# print(f"{reasoning_content}")
# print("=" * 20 + "Complete Response" + "=" * 20 + "\n")
# print(f"{answer_content}")
Sample response
====================Thinking Process====================
Alright, the user asked "Who are you?" and I need to provide a clear and friendly response. First, I should introduce my identity, namely Qwen, developed by Tongyi Lab under Alibaba Group. Next, I should explain my main functions, such as answering questions.
====================Complete Response====================
I am Qwen, a large-scale language model independently developed by Tongyi Lab under Alibaba Group. I am capable of answering questions, generating text, performing logical reasoning, programming, and more, aiming to provide comprehensive, accurate, and useful information and assistance to users. Is there anything I can help you with?
Java
Sample code
// Version of dashscope SDK >= 2.19.4
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import com.alibaba.dashscope.utils.Constants;

public class Main {
    static {
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
    }
    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();
        if (!reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking Process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }
        if (!content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Complete Response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }

    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If the environment variable is not set, please replace the following with the Model Studio API Key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen-plus-2025-04-28")
                .enableThinking(true)
                .thinkingBudget(50)
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }

    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
            // Print the final result
            // if (reasoningContent.length() > 0) {
            //     System.out.println("\n====================Complete Response====================");
            //     System.out.println(finalContent.toString());
            // }
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}
Sample response
====================Thinking Process====================
Alright, the user asked "Who are you?" and I need to provide a clear and friendly response. First, I should introduce my identity, namely Qwen, developed by Tongyi Lab under Alibaba Group. Next, I should explain my main functions, such as answering questions, generating text, logical reasoning, programming, etc., to offer comprehensive, accurate, and helpful information and assistance to users.
====================Complete Response====================
I am Qwen, a large-scale language model independently developed by Tongyi Lab under Alibaba Group. I am capable of answering questions, generating text, performing logical reasoning, programming, and more, aiming to provide comprehensive, accurate, and useful information and assistance to users. Is there anything I can help you with?
HTTP
Sample code
curl
curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "qwen-plus-2025-04-28",
    "input":{
        "messages":[
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters":{
        "enable_thinking": true,
        "thinking_budget": 50,
        "incremental_output": true,
        "result_format": "message"
    }
}'
Sample response
id:1
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"Well","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":14,"output_tokens":3,"input_tokens":11,"output_tokens_details":{"reasoning_tokens":1}},"request_id":"2ce91085-3602-9c32-9c8b-fe3d583a2c38"}
id:2
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":", ","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":15,"output_tokens":4,"input_tokens":11,"output_tokens_details":{"reasoning_tokens":2}},"request_id":"2ce91085-3602-9c32-9c8b-fe3d583a2c38"}
......
id:133
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"!","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":149,"output_tokens":138,"input_tokens":11,"output_tokens_details":{"reasoning_tokens":50}},"request_id":"2ce91085-3602-9c32-9c8b-fe3d583a2c38"}
id:134
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"","role":"assistant"},"finish_reason":"stop"}]},"usage":{"total_tokens":149,"output_tokens":138,"input_tokens":11,"output_tokens_details":{"reasoning_tokens":50}},"request_id":"2ce91085-3602-9c32-9c8b-fe3d583a2c38"}
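The `output_tokens_details.reasoning_tokens` field in the final usage payload lets you confirm that the reasoning process was capped at the configured budget. A minimal sketch, parsing the last `data:` event shown in the sample response above (no API call involved):

```python
import json

# The final SSE event from the sample response above
event = ('data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"",'
         '"role":"assistant"},"finish_reason":"stop"}]},"usage":{"total_tokens":149,'
         '"output_tokens":138,"input_tokens":11,"output_tokens_details":{"reasoning_tokens":50}},'
         '"request_id":"2ce91085-3602-9c32-9c8b-fe3d583a2c38"}')

payload = json.loads(event.removeprefix("data:"))
usage = payload["usage"]
reasoning_tokens = usage["output_tokens_details"]["reasoning_tokens"]

# With thinking_budget=50, the reasoning content stops at the budget
assert reasoning_tokens <= 50
print(f"reasoning tokens: {reasoning_tokens}, total output tokens: {usage['output_tokens']}")
```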
Function calling
Despite their reasoning capability, deep thinking models cannot interact with the outside world on their own. Function calling lets the model invoke external tools to perform tasks such as weather queries, database queries, and sending emails.
After completing the thinking process, the Qwen3 and QwQ models output tool calling information. The tool_choice
parameter can only be set to "auto" (the default, meaning the model selects tools on its own) or "none" (forcing the model not to select any tools).
OpenAI
Python
Sample code
import os
from openai import OpenAI

# Initialize OpenAI client, configuring Alibaba Cloud Model Studio Service
client = OpenAI(
    # If the environment variable is not set, please replace the following with the Model Studio API Key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # Read API key from environment variable
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

# Define available tools list
tools = [
    # Tool 1: Get the current time
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Useful for knowing the current time.",
            "parameters": {}  # No parameters needed
        }
    },
    # Tool 2: Get the weather of a specified city
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Useful for querying the weather of a specified city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City or district, e.g., Beijing, Hangzhou, Yuhang District, etc."
                    }
                },
                "required": ["location"]  # Required parameter
            }
        }
    }
]

messages = [{"role": "user", "content": input("Please enter your question: ")}]
completion = client.chat.completions.create(
    # qwen-plus-2025-04-28 is used as an example here; you can switch to other deep thinking models
    model="qwen-plus-2025-04-28",
    messages=messages,
    extra_body={
        # Enable deep thinking; this parameter is ineffective for QwQ models
        "enable_thinking": True
    },
    tools=tools,
    parallel_tool_calls=True,
    stream=True,
    # Uncomment if you want to retrieve token consumption information
    # stream_options={
    #     "include_usage": True
    # }
)

reasoning_content = ""  # Define complete reasoning process
answer_content = ""  # Define complete response
tool_info = []  # Store tool invocation information
is_answering = False  # Determine whether the reasoning process has finished and response has started
print("=" * 20 + "Thinking Process" + "=" * 20)
for chunk in completion:
    if not chunk.choices:
        # Handle usage information
        print("\n" + "=" * 20 + "Usage" + "=" * 20)
        print(chunk.usage)
    else:
        delta = chunk.choices[0].delta
        # Handle AI's thought process (chain reasoning)
        if hasattr(delta, 'reasoning_content') and delta.reasoning_content is not None:
            reasoning_content += delta.reasoning_content
            print(delta.reasoning_content, end="", flush=True)  # Real-time output of the thought process
        # Handle final response content
        else:
            if not is_answering:  # Print title when entering the response phase for the first time
                is_answering = True
                print("\n" + "=" * 20 + "Response Content" + "=" * 20)
            if delta.content is not None:
                answer_content += delta.content
                print(delta.content, end="", flush=True)  # Stream output of response content
            # Handle tool invocation information (support parallel tool calls)
            if delta.tool_calls is not None:
                for tool_call in delta.tool_calls:
                    index = tool_call.index  # Tool call index, used for parallel calls
                    # Dynamically expand tool information storage list
                    while len(tool_info) <= index:
                        tool_info.append({})
                    # Collect tool call ID (used for subsequent function calls)
                    if tool_call.id:
                        tool_info[index]['id'] = tool_info[index].get('id', '') + tool_call.id
                    # Collect function name (used for subsequent routing to specific functions)
                    if tool_call.function and tool_call.function.name:
                        tool_info[index]['name'] = tool_info[index].get('name', '') + tool_call.function.name
                    # Collect function parameters (in JSON string format, need subsequent parsing)
                    if tool_call.function and tool_call.function.arguments:
                        tool_info[index]['arguments'] = tool_info[index].get('arguments', '') + tool_call.function.arguments

print("\n" + "=" * 19 + "Tool Invocation Information" + "=" * 19)
if not tool_info:
    print("No tool invocation")
else:
    print(tool_info)
Sample response
Enter "weather of the four municipalities".
====================Thinking Process====================
Alright, the user asked about the "weather of the four municipalities." First, I need to clarify which four municipalities these are. According to China's administrative regions, the municipalities are Beijing, Shanghai, Tianjin, and Chongqing. Therefore, the user wants to know the weather conditions of these four cities.
Next, I need to check the available tools. Among the provided tools, there is the `get_current_weather` function, with the parameter `location` being a string type. Each city must be queried individually, as the function can check only one location at a time. Thus, I need to call this function once for each municipality.
Then, I need to consider how to generate the correct tool invocation. Each call should include the city name as a parameter. For example, the first call is for Beijing, the second is for Shanghai, and so on. Ensure that the parameter name is `location` and the value is the correct city name.
Additionally, the user might want the weather information for each city, so it is important to ensure that each function call is correct and flawless. It might require calling four times consecutively, once for each city. However, according to the tool usage rules, it may need processing in multiple steps, or generating multiple calls at once. But according to the example, perhaps only one function is called at a time, so it might need to be done gradually.
Finally, confirm if there are any other factors to consider, such as the correctness of parameters, the accuracy of city names, and whether potential errors, like non-existent cities or unavailable API, need handling. But for now, the four municipalities are clear, and there shouldn't be any issues.
====================Complete Response====================
===================Tool Invocation Information===================
[{'id': 'call_767af2834c12488a8fe6e3', 'name': 'get_current_weather', 'arguments': '{"location": "Beijing"}'}, {'id': 'call_2cb05a349c89437a947ada', 'name': 'get_current_weather', 'arguments': '{"location": "Shanghai"}'}, {'id': 'call_988dd180b2ca4b0a864ea7', 'name': 'get_current_weather', 'arguments': '{"location": "Tianjin"}'}, {'id': 'call_4e98c57ea96a40dba26d12', 'name': 'get_current_weather', 'arguments': '{"location": "Chongqing"}'}]
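After the stream ends, each entry in `tool_info` can be parsed and routed to a local implementation, and the results sent back to the model as `tool` messages in a second request. The sketch below covers only the dispatch step; the tool bodies are placeholder stubs, and the follow-up `client.chat.completions.create` call is omitted:

```python
import json
from datetime import datetime

# Placeholder implementations of the two tools declared above
def get_current_time():
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

def get_current_weather(location):
    # Stub: replace with a real weather lookup
    return f"Sunny, 25°C in {location}"

TOOL_REGISTRY = {
    "get_current_time": get_current_time,
    "get_current_weather": get_current_weather,
}

def dispatch_tool_calls(tool_info):
    """Run each collected tool call and build the `tool` messages for the next request."""
    tool_messages = []
    for call in tool_info:
        func = TOOL_REGISTRY[call["name"]]
        # Arguments arrive as a JSON string; tools without parameters may have no arguments
        args = json.loads(call["arguments"]) if call.get("arguments") else {}
        result = func(**args)
        tool_messages.append({
            "role": "tool",
            "content": result,
            "tool_call_id": call["id"],
        })
    return tool_messages

# Example using one entry shaped like the sample run above
sample = [{"id": "call_767af2834c12488a8fe6e3",
           "name": "get_current_weather",
           "arguments": '{"location": "Beijing"}'}]
print(dispatch_tool_calls(sample))
```

Per the OpenAI-compatible protocol, the assistant message that carries the `tool_calls` must also be appended to `messages` before the `tool` messages; the second request then returns the model's natural-language summary of the tool results.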
Node.js
Sample code
import OpenAI from "openai";
import readline from 'node:readline/promises';
import { stdin as input, stdout as output } from 'node:process';

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
});

const tools = [
    {
        type: "function",
        function: {
            name: "get_current_time",
            description: "Useful for knowing the current time.",
            parameters: {}
        }
    },
    {
        type: "function",
        function: {
            name: "get_current_weather",
            description: "Useful for querying the weather of a specified city.",
            parameters: {
                type: "object",
                properties: {
                    location: {
                        type: "string",
                        description: "City or district, e.g., Beijing, Hangzhou, Yuhang District, etc."
                    }
                },
                required: ["location"]
            }
        }
    }
];

async function main() {
    const rl = readline.createInterface({ input, output });
    const question = await rl.question("Please enter your question: ");
    rl.close();

    const messages = [{ role: "user", content: question }];
    let reasoningContent = "";
    let answerContent = "";
    const toolInfo = [];
    let isAnswering = false;

    console.log("=".repeat(20) + "Thinking Process" + "=".repeat(20));
    try {
        const stream = await openai.chat.completions.create({
            // qwen-plus-2025-04-28 is used as an example here; you can switch to other deep thinking models
            model: "qwen-plus-2025-04-28",
            messages,
            // Enable deep thinking; this parameter is ineffective for QwQ models
            enable_thinking: true,
            tools,
            stream: true,
            parallel_tool_calls: true
        });
        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log("\n" + "=".repeat(20) + "Usage" + "=".repeat(20));
                console.log(chunk.usage);
                continue;
            }
            const delta = chunk.choices[0]?.delta;
            if (!delta) continue;
            // Handle thought process
            if (delta.reasoning_content) {
                reasoningContent += delta.reasoning_content;
                process.stdout.write(delta.reasoning_content);
            }
            // Handle response content
            else {
                if (!isAnswering) {
                    isAnswering = true;
                    console.log("\n" + "=".repeat(20) + "Response Content" + "=".repeat(20));
                }
                if (delta.content) {
                    answerContent += delta.content;
                    process.stdout.write(delta.content);
                }
                // Handle tool invocation
                if (delta.tool_calls) {
                    for (const toolCall of delta.tool_calls) {
                        const index = toolCall.index;
                        // Ensure array length is sufficient
                        while (toolInfo.length <= index) {
                            toolInfo.push({});
                        }
                        // Update tool ID
                        if (toolCall.id) {
                            toolInfo[index].id = (toolInfo[index].id || "") + toolCall.id;
                        }
                        // Update function name
                        if (toolCall.function?.name) {
                            toolInfo[index].name = (toolInfo[index].name || "") + toolCall.function.name;
                        }
                        // Update parameters
                        if (toolCall.function?.arguments) {
                            toolInfo[index].arguments = (toolInfo[index].arguments || "") + toolCall.function.arguments;
                        }
                    }
                }
            }
        }
        console.log("\n" + "=".repeat(19) + "Tool Invocation Information" + "=".repeat(19));
        console.log(toolInfo.length ? toolInfo : "No tool invocation");
    } catch (error) {
        console.error("Error occurred:", error);
    }
}

main();
Sample response
Enter "weather of the four municipalities".
Please enter your question: weather of the four municipalities
====================Thinking Process====================
Alright, the user asked about the weather in the four municipalities. First, I need to clarify which these four municipalities are in China. They are Beijing, Shanghai, Tianjin, and Chongqing, right? Next, I need to call the weather query function for each city.
The user's question likely requires me to separately obtain the weather for these four cities. Each city requires calling the get_current_weather function, with the parameter being the city's name. I need to ensure the parameters are correct, such as the full names of the municipalities, like "Beijing", "Shanghai", "Tianjin", and "Chongqing."
Then, I need to call the weather API for these four cities sequentially. Each call needs a separate tool_call. The user likely wants the current weather for each city, so each call must be accurate and correct. Careful attention to the correct spelling and names of each city is necessary to avoid errors. For example, Chongqing might be abbreviated as "Chongqing" sometimes, so using the full name is recommended in the parameters.
Now, I need to generate four tool_calls, each corresponding to a municipality. Check the correctness of each parameter and arrange them in order. This way, the user will receive weather data for the four municipalities.
====================Complete Response====================
===================Tool Invocation Information===================
json
[
    {
        "id": "call_21dc802e717f491298d1b2",
        "name": "get_current_weather",
        "arguments": "{\"location\": \"Beijing\"}"
    },
    {
        "id": "call_2cd3be1d2f694c4eafd4e5",
        "name": "get_current_weather",
        "arguments": "{\"location\": \"Shanghai\"}"
    },
    {
        "id": "call_48cf3f78e02940bd9085e4",
        "name": "get_current_weather",
        "arguments": "{\"location\": \"Tianjin\"}"
    },
    {
        "id": "call_e230a2b4c64f4e658d223e",
        "name": "get_current_weather",
        "arguments": "{\"location\": \"Chongqing\"}"
    }
]
HTTP
Sample code
curl
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus-2025-04-28",
    "messages": [
        {
            "role": "user",
            "content": "How is the weather in Hangzhou?"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_time",
                "description": "Useful for knowing the current time.",
                "parameters": {}
            }
        },
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Useful for querying the weather of a specified city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location":{
                            "type": "string",
                            "description": "City or district, e.g., Beijing, Hangzhou, Yuhang District, etc."
                        }
                    },
                    "required": ["location"]
                }
            }
        }
    ],
    "enable_thinking": true,
    "stream": true
}'
DashScope
Python
Sample code
import dashscope

dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1/"

tools = [
    # Tool 1: Get the current time
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Useful for knowing the current time.",
            "parameters": {}  # No parameters needed since obtaining current time doesn't require input
        }
    },
    # Tool 2: Get the weather for a specified city
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Useful for querying the weather of a specified city.",
            "parameters": {
                "type": "object",
                "properties": {
                    # Location must be provided when querying weather, hence parameter is set as location
                    "location": {
                        "type": "string",
                        "description": "City or district, e.g., Beijing, Hangzhou, Yuhang District, etc."
                    }
                },
                "required": ["location"]
            }
        }
    }
]

# Define question
messages = [{"role": "user", "content": input("Please enter your question: ")}]
completion = dashscope.Generation.call(
    # qwen-plus-2025-04-28 is used as an example here; you can switch to other deep thinking models
    model="qwen-plus-2025-04-28",
    messages=messages,
    enable_thinking=True,
    tools=tools,
    parallel_tool_calls=True,
    stream=True,
    incremental_output=True,
    result_format="message"
)

reasoning_content = ""
answer_content = ""
tool_info = []
is_answering = False
print("=" * 20 + "Thinking Process" + "=" * 20)
for chunk in completion:
    if chunk.status_code == 200:
        msg = chunk.output.choices[0].message
        # Handle thought process
        if 'reasoning_content' in msg and msg.reasoning_content:
            reasoning_content += msg.reasoning_content
            print(msg.reasoning_content, end="", flush=True)
        # Handle response content
        if 'content' in msg and msg.content:
            if not is_answering:
                is_answering = True
                print("\n" + "=" * 20 + "Response Content" + "=" * 20)
            answer_content += msg.content
            print(msg.content, end="", flush=True)
        # Handle tool invocation
        if 'tool_calls' in msg and msg.tool_calls:
            for tool_call in msg.tool_calls:
                index = tool_call['index']
                while len(tool_info) <= index:
                    tool_info.append({'id': '', 'name': '', 'arguments': ''})  # Initialize all fields
                # Incrementally update tool ID
                if 'id' in tool_call:
                    tool_info[index]['id'] += tool_call.get('id', '')
# Incrementally update function information
if 'function' in tool_call:
func = tool_call['function']
# Incrementally update function name
if 'name' in func:
tool_info[index]['name'] += func.get('name', '')
# Incrementally update parameters
if 'arguments' in func:
tool_info[index]['arguments'] += func.get('arguments', '')
print(f"\n"+"="*19+"Tool Invocation Information"+"="*19)
if not tool_info:
print("No tool invocation")
else:
print(tool_info)
Sample response
Enter "weather of the four municipalities".
Please enter your question: weather of the four municipalities
====================Thinking Process====================
Alright, the user asked about the weather in the four municipalities. First, I need to confirm which cities these are: China's four municipalities are Beijing, Shanghai, Tianjin, and Chongqing. Next, the user needs the weather information for each city, so I need to call the weather query function.
However, the user didn't specify the exact city names, only mentioning the four municipalities, so I need to name each municipality and query them separately, confirming that Beijing, Shanghai, Tianjin, and Chongqing are correctly identified.
Next, I need to check the available tools; the user's provided function is `get_current_weather`, with `location` as the parameter. Therefore, I need to call this function for each municipality, providing the corresponding city name as the parameter. For example, the first call will have `location` set to Beijing, the second to Shanghai, the third to Tianjin, and the fourth to Chongqing.
It might be necessary to note that sometimes municipalities like Chongqing might require more specific districts, but the user might only need city-level weather information. Using the municipality's name directly should be fine for this task. Following that, I need to generate four separate function calls, each corresponding to one municipality, so the user will receive weather information for the four cities.
Finally, ensure each call's parameter is correct and nothing is missed. This way, the user's query will receive a complete response.
===================Tool Invocation Information===================
[{'id': 'call_2f774ed97b0e4b24ab10ec', 'name': 'get_current_weather', 'arguments': '{"location": "Beijing"}'}, {'id': 'call_dc3b05b88baa48c58bc33a', 'name': 'get_current_weather', 'arguments': '{"location": "Shanghai"}'}, {'id': 'call_249b2de2f73340cdb46cbc', 'name': 'get_current_weather', 'arguments': '{"location": "Tianjin"}'}, {'id': 'call_833333634fda49d1b39e87', 'name': 'get_current_weather', 'arguments': '{"location": "Chongqing"}'}]
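Because incremental output streams each tool_call in fragments, the `id`, `name`, and `arguments` fields must be concatenated across chunks before the `arguments` string becomes valid JSON. The accumulation logic in the sample code above can be exercised locally with simulated fragments (the chunk contents here are made up for illustration):

```python
import json

# Simulated streaming fragments for a single tool_call (index 0).
chunks = [
    {"index": 0, "id": "call_abc123",
     "function": {"name": "get_current_weather", "arguments": "{\"location\":"}},
    {"index": 0, "function": {"arguments": " \"Beijing\"}"}},
]

tool_info = []
for tool_call in chunks:
    index = tool_call["index"]
    while len(tool_info) <= index:
        tool_info.append({"id": "", "name": "", "arguments": ""})
    tool_info[index]["id"] += tool_call.get("id", "")
    func = tool_call.get("function", {})
    tool_info[index]["name"] += func.get("name", "")
    tool_info[index]["arguments"] += func.get("arguments", "")

# Only after all fragments are joined does arguments parse as JSON.
args = json.loads(tool_info[0]["arguments"])
print(args)  # {'location': 'Beijing'}
```

Parsing `arguments` before the stream finishes would raise a `json.JSONDecodeError`, which is why the sample accumulates strings first and parses only at the end.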
Java
Sample code
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.JsonUtils;
import com.alibaba.dashscope.tools.ToolFunction;
import com.alibaba.dashscope.tools.FunctionDefinition;
import io.reactivex.Flowable;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.lang.System;
import com.github.victools.jsonschema.generator.Option;
import com.github.victools.jsonschema.generator.OptionPreset;
import com.github.victools.jsonschema.generator.SchemaGenerator;
import com.github.victools.jsonschema.generator.SchemaGeneratorConfig;
import com.github.victools.jsonschema.generator.SchemaGeneratorConfigBuilder;
import com.github.victools.jsonschema.generator.SchemaVersion;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import com.alibaba.dashscope.utils.Constants;
public class Main {
private static final Logger logger = LoggerFactory.getLogger(Main.class);
private static ObjectNode jsonSchemaWeather;
private static ObjectNode jsonSchemaTime;
static {
Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
}
static class TimeTool {
public String call() {
LocalDateTime now = LocalDateTime.now();
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
return "Current time: " + now.format(formatter) + ".";
}
}
static class WeatherTool {
private String location;
public WeatherTool(String location) {
this.location = location;
}
public String call() {
return location + " is sunny today.";
}
}
static {
SchemaGeneratorConfigBuilder configBuilder = new SchemaGeneratorConfigBuilder(
SchemaVersion.DRAFT_2020_12, OptionPreset.PLAIN_JSON);
SchemaGeneratorConfig config = configBuilder
.with(Option.EXTRA_OPEN_API_FORMAT_VALUES)
.without(Option.FLATTENED_ENUMS_FROM_TOSTRING)
.build();
SchemaGenerator generator = new SchemaGenerator(config);
jsonSchemaWeather = generator.generateSchema(WeatherTool.class);
jsonSchemaTime = generator.generateSchema(TimeTool.class);
}
private static void handleGenerationResult(GenerationResult message) {
System.out.println(JsonUtils.toJson(message));
}
public static void streamCallWithMessage(Generation gen, Message userMsg)
throws NoApiKeyException, ApiException, InputRequiredException {
GenerationParam param = buildGenerationParam(userMsg);
Flowable<GenerationResult> result = gen.streamCall(param);
result.blockingForEach(message -> handleGenerationResult(message));
}
private static GenerationParam buildGenerationParam(Message userMsg) {
FunctionDefinition fdWeather = buildFunctionDefinition(
"get_current_weather", "Get the weather of a specified location", jsonSchemaWeather);
FunctionDefinition fdTime = buildFunctionDefinition(
"get_current_time", "Get the current time", jsonSchemaTime);
return GenerationParam.builder()
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("qwen-plus-2025-04-28")
.enableThinking(true)
.messages(Arrays.asList(userMsg))
.resultFormat(GenerationParam.ResultFormat.MESSAGE)
.incrementalOutput(true)
.tools(Arrays.asList(
ToolFunction.builder().function(fdWeather).build(),
ToolFunction.builder().function(fdTime).build()))
.build();
}
private static FunctionDefinition buildFunctionDefinition(
String name, String description, ObjectNode schema) {
return FunctionDefinition.builder()
.name(name)
.description(description)
.parameters(JsonUtils.parseString(schema.toString()).getAsJsonObject())
.build();
}
public static void main(String[] args) {
try {
Generation gen = new Generation();
Message userMsg = Message.builder()
.role(Role.USER.getValue())
.content("Please tell me the weather in Hangzhou")
.build();
streamCallWithMessage(gen, userMsg);
} catch (ApiException | NoApiKeyException | InputRequiredException e) {
logger.error("An exception occurred: {}", e.getMessage());
}
System.exit(0);
}
}
Sample response
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":6,"total_tokens":244},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"Well, the user want to"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":12,"total_tokens":250},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"know the weather in Hangzhou. I"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":16,"total_tokens":254},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"should first check whether I have"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":22,"total_tokens":260},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"related tools. Check the provided"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":28,"total_tokens":266},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"tools, I find get_current"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":34,"total_tokens":272},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"_weather. Its parameter is location"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":38,"total_tokens":276},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":". So I should call"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":43,"total_tokens":281},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"this function, and the parameter"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":48,"total_tokens":286},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"is Hangzhou. No other"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":52,"total_tokens":290},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"tool is needed. Because"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":56,"total_tokens":294},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"the user only asks about"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":60,"total_tokens":298},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"weather. Then, construct"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":64,"total_tokens":302},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"tool_call and fill in"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":68,"total_tokens":306},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"the name and parameter"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":73,"total_tokens":311},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":". Make sure the parameter is"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":78,"total_tokens":316},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"a JSON object and location is"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":82,"total_tokens":320},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"a string. Return"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":88,"total_tokens":326},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"after checking."}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":106,"total_tokens":344},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"","tool_calls":[{"type":"function","id":"call_ecc41296dccc47baa01567","function":{"name":"get_current_weather","arguments":"{\"location\": \"Hangzhou"}}]}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":108,"total_tokens":346},"output":{"choices":[{"finish_reason":"tool_calls","message":{"role":"assistant","content":"","reasoning_content":"","tool_calls":[{"type":"function","id":"","function":{"arguments":"\"}"}}]}}]}}
HTTP
Sample code
curl
curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
"model": "qwen-plus-2025-04-28",
"input": {
"messages": [
{
"role": "user",
"content": "Weather in Hangzhou"
}
]
},
"parameters": {
"result_format": "message",
"enable_thinking": true,
"incremental_output": true,
"tools": [{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Useful for knowing the current time.",
"parameters": {}
}
},{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful for querying the weather of a specified city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City or district, e.g., Beijing, Hangzhou, Yuhang District, etc."
}
},
"required": ["location"]
}
}
}]
}
}'
After getting the function calling information, you can refer to Run tool functions and LLM summarizing tool function output (optional).
Enable/disable thinking mode
In addition to the enable_thinking parameter, Qwen3 provides a convenient way to control the thinking mode dynamically through prompts. When enable_thinking is true, append /no_think to a prompt to turn off the thinking mode for subsequent responses. To turn it back on in a multi-round conversation, append /think to the latest prompt.
In multi-round conversations, the model follows the most recent /think or /no_think command.
If Qwen3 does not output a thinking process, the output tokens are charged at the non-thinking price.
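Since the model follows the most recent command, a small helper that appends the desired suffix to the latest user prompt is enough to toggle the mode per turn. A minimal sketch (the helper name is illustrative):

```python
def with_thinking_switch(prompt: str, think: bool) -> str:
    """Append /think or /no_think so the model toggles thinking mode for this turn."""
    suffix = "/think" if think else "/no_think"
    return f"{prompt} {suffix}"

# Turn 1: skip thinking; turn 2: re-enable it for a harder question.
messages = [
    {"role": "user", "content": with_thinking_switch("Who are you?", think=False)},
]
print(messages[0]["content"])  # Who are you? /no_think
```

The resulting messages list is passed to the model exactly as in the samples below; only the prompt text changes.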
OpenAI
Python
Sample code
from openai import OpenAI
import os
# Initialize OpenAI client
client = OpenAI(
# If the environment variable is not configured, please replace with Model Studio API key: api_key="sk-xxx"
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
# Add /no_think to the prompt, which will turn off the thinking mode even if enable_thinking is set to true.
messages = [{"role": "user", "content": "Who are you/no_think"}]
completion = client.chat.completions.create(
model="qwen-plus-2025-04-28", # You can replace with other Qwen3 models as needed
messages=messages,
# The enable_thinking parameter initiates the thinking process, but it is ineffective for the QwQ model
extra_body={"enable_thinking": True},
stream=True,
# stream_options={
# "include_usage": True
# },
)
reasoning_content = "" # Complete reasoning process
answer_content = "" # Complete response
is_answering = False # Indicates whether the response phase has started
print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")
for chunk in completion:
if not chunk.choices:
print("\nUsage:")
print(chunk.usage)
continue
delta = chunk.choices[0].delta
# Only collect reasoning content
if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
if not is_answering:
print(delta.reasoning_content, end="", flush=True)
reasoning_content += delta.reasoning_content
# Receive content and begin to respond
if hasattr(delta, "content") and delta.content:
if not is_answering:
print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
is_answering = True
print(delta.content, end="", flush=True)
answer_content += delta.content
Sample response
====================Thinking Process====================
====================Complete Response====================
I am Qwen, an ultra-large-scale language model independently developed by the Tongyi Lab under Alibaba Group. I can assist you in answering questions, creating text, performing logical reasoning, coding, and other tasks. If you have any questions or need help, feel free to ask me anytime!
Node.js
Sample code
import OpenAI from "openai";
import process from 'process';
// Initialize OpenAI client
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY, // Read from environment variable
baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});
let reasoningContent = '';
let answerContent = '';
let isAnswering = false;
async function main() {
try {
// Add /no_think to the prompt, which will turn off the thinking mode even if enable_thinking is set to true.
const messages = [{ role: 'user', content: 'Who are you/no_think' }];
const stream = await openai.chat.completions.create({
// You can replace with other Qwen3 models as needed
model: 'qwen-plus-2025-04-28',
messages,
stream: true,
// The enable_thinking parameter initiates the Thinking Process, but it is ineffective for the QwQ model
enable_thinking: true
});
console.log('\n' + '='.repeat(20) + 'Thinking Process' + '='.repeat(20) + '\n');
for await (const chunk of stream) {
if (!chunk.choices?.length) {
console.log('\nUsage:');
console.log(chunk.usage);
continue;
}
const delta = chunk.choices[0].delta;
// Only collect reasoning content
if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
if (!isAnswering) {
process.stdout.write(delta.reasoning_content);
}
reasoningContent += delta.reasoning_content;
}
// Receive content and begin to respond
if (delta.content !== undefined && delta.content) {
if (!isAnswering) {
console.log('\n' + '='.repeat(20) + 'Complete Response' + '='.repeat(20) + '\n');
isAnswering = true;
}
process.stdout.write(delta.content);
answerContent += delta.content;
}
}
} catch (error) {
console.error('Error:', error);
}
}
main();
Sample response
====================Thinking Process====================
====================Complete Response====================
I am Qwen, an ultra-large-scale language model independently developed by Tongyi Lab under Alibaba Group. I can assist with answering questions, creating text (such as stories, official documents, emails, scripts), logical reasoning, programming, and more. Additionally, I can express opinions and play games. If you have any questions or need help, feel free to ask me anytime!
HTTP
Sample code
curl
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen-plus-2025-04-28",
"messages": [
{
"role": "user",
"content": "Who are you /no_think"
}
],
"stream": true,
"stream_options": {
"include_usage": true
},
"enable_thinking": true
}'
Sample response
data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"finish_reason":null,"delta":{"content":"I"},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" am Qwen,","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" a large-scale language","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" model independently developed by","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" the Tongyi Lab","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" under Alibaba Group.","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" I am capable of","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" answering questions, creating","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" text such as stories","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":", official documents,","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" emails, scripts,","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" performing logical reasoning,","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" coding, and more","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":". I can also","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" express opinions and play","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" games. If you","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" have any questions or","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" need assistance, feel","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" free to let me","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"delta":{"content":" know anytime!","reasoning_content":null},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[{"finish_reason":"stop","delta":{"content":"","reasoning_content":null},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: {"choices":[],"object":"chat.completion.chunk","usage":{"prompt_tokens":15,"completion_tokens":80,"total_tokens":95,"completion_tokens_details":{"reasoning_tokens":0}},"created":1746689786,"system_fingerprint":null,"model":"qwen-plus-2025-04-28","id":"chatcmpl-284e4638-e77b-9663-84f5-c46778baa018"}
data: [DONE]
DashScope
Python
Sample code
import os
from dashscope import Generation
import dashscope
dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1/"
# Add the /no_think suffix to the prompt, which will turn off the thinking mode even if enable_thinking is set to true.
messages = [{"role": "user", "content": "Who are you? /no_think"}]
completion = Generation.call(
# If the environment variable is not configured, please replace the following line with your Model Studio API key: api_key = "sk-xxx",
api_key=os.getenv("DASHSCOPE_API_KEY"),
# You can replace with other Qwen3 models as needed
model="qwen-plus-2025-04-28",
messages=messages,
result_format="message",
enable_thinking=True,
stream=True,
incremental_output=True,
)
# Define complete Thinking Process
reasoning_content = ""
# Define complete response
answer_content = ""
# Determine whether the Thinking Process has ended and the response has begun
is_answering = False
print("=" * 20 + "Thinking Process" + "=" * 20)
for chunk in completion:
# Ignore if both Thinking Process and response are empty
if (
chunk.output.choices[0].message.content == ""
and chunk.output.choices[0].message.reasoning_content == ""
):
pass
else:
# If currently in Thinking Process
if (
chunk.output.choices[0].message.reasoning_content != ""
and chunk.output.choices[0].message.content == ""
):
print(chunk.output.choices[0].message.reasoning_content, end="", flush=True)
reasoning_content += chunk.output.choices[0].message.reasoning_content
# If currently in response
elif chunk.output.choices[0].message.content != "":
if not is_answering:
print("\n" + "=" * 20 + "Complete Response" + "=" * 20)
is_answering = True
print(chunk.output.choices[0].message.content, end="", flush=True)
answer_content += chunk.output.choices[0].message.content
# If you need to print the complete thinking process and complete response, uncomment the following lines and run
# print("=" * 20 + "Complete Thinking Process" + "=" * 20 + "\n")
# print(f"{reasoning_content}")
# print("=" * 20 + "Complete Response" + "=" * 20 + "\n")
# print(f"{answer_content}")
Sample response
====================Thinking Process====================
====================Complete Response====================
Hello! I'm Qwen, and I'm really excited to tell you about myself! Think of me as your friendly AI companion, always ready to learn and help out. Whether you need help with coding, want to dive into some creative writing, or just have questions about any topic under the sun, I'm here to explore it all with you.
I love tackling challenges - from solving complex math problems to having deep conversations about philosophy. And don't get me started on my creative side! I can help you craft stories, poems, or any written content you can imagine. What makes me special is how I can switch between different modes to best suit our conversation - kind of like a Swiss Army knife for your curiosity!
Want to have a casual chat or dive deep into some serious learning? I'm equally comfortable with both! Let's embark on this journey of discovery together - what would you like to explore first?
Java
Sample code
// Version of dashscope SDK >= 2.19.4
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.Constants;
import io.reactivex.Flowable;
import java.lang.System;
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class Main {
private static final Logger logger = LoggerFactory.getLogger(Main.class);
private static StringBuilder reasoningContent = new StringBuilder();
private static StringBuilder finalContent = new StringBuilder();
private static boolean isFirstPrint = true;
private static void handleGenerationResult(GenerationResult message) {
String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
String content = message.getOutput().getChoices().get(0).getMessage().getContent();
        if (reasoning != null && !reasoning.isEmpty()) {
reasoningContent.append(reasoning);
if (isFirstPrint) {
System.out.println("====================Thinking Process====================");
isFirstPrint = false;
}
System.out.print(reasoning);
}
        if (content != null && !content.isEmpty()) {
finalContent.append(content);
if (!isFirstPrint) {
System.out.println("\n====================Complete Response====================");
isFirstPrint = true;
}
System.out.print(content);
}
}
private static GenerationParam buildGenerationParam(Message userMsg) {
return GenerationParam.builder()
            // If the environment variable is not configured, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
// This uses the qwen-plus-2025-04-28 model; you can replace it with other Qwen3 models as needed
.model("qwen-plus-2025-04-28")
.enableThinking(true)
.incrementalOutput(true)
.resultFormat("message")
.messages(Arrays.asList(userMsg))
.build();
}
public static void streamCallWithMessage(Generation gen, Message userMsg)
throws NoApiKeyException, ApiException, InputRequiredException {
GenerationParam param = buildGenerationParam(userMsg);
Flowable<GenerationResult> result = gen.streamCall(param);
result.blockingForEach(message -> handleGenerationResult(message));
}
public static void main(String[] args) {
try {
Generation gen = new Generation("http", "https://dashscope-intl.aliyuncs.com/api/v1");
// Add /no_think to the prompt, which will turn off the thinking mode even if enable_thinking is set to true.
Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?/no_think").build();
streamCallWithMessage(gen, userMsg);
// Print final result
// if (reasoningContent.length() > 0) {
// System.out.println("\n====================Complete Response====================");
// System.out.println(finalContent.toString());
// }
} catch (ApiException | NoApiKeyException | InputRequiredException e) {
logger.error("An exception occurred: {}", e.getMessage());
}
System.exit(0);
}
}
Sample response
I am Qwen, an ultra-large-scale language model independently developed by Tongyi Lab under Alibaba Group. I can help you answer questions, create texts (such as stories, official documents, emails, scripts), perform logical reasoning, programming, and more. Additionally, I can express opinions and play games. If you have any questions or need assistance, feel free to ask me anytime!
HTTP
Sample code
curl
curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
"model": "qwen-plus-2025-04-28",
"input":{
"messages":[
{
"role": "user",
"content": "Who are you /no_think"
}
]
},
"parameters":{
"enable_thinking": true,
"incremental_output": true,
"result_format": "message"
}
}'
Sample response
id:1
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"I","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":20,"output_tokens":5,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:2
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" am a large-scale","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":24,"output_tokens":9,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:3
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" language model independently developed","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":28,"output_tokens":13,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:4
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" by the Tongyi","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":32,"output_tokens":17,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:5
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" Lab under Alibaba Group","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":36,"output_tokens":21,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:6
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":". My name is","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":40,"output_tokens":25,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:7
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" Qwen. I","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":44,"output_tokens":29,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:8
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" am capable of answering","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":48,"output_tokens":33,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:9
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" questions, creating text","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":52,"output_tokens":37,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:10
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" such as stories,","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":56,"output_tokens":41,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:11
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" official documents, emails","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":60,"output_tokens":45,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:12
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":", scripts, performing","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":64,"output_tokens":49,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:13
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" logical reasoning, coding","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":68,"output_tokens":53,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:14
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":", and more.","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":72,"output_tokens":57,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:15
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" I can also express","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":76,"output_tokens":61,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:16
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" opinions and play games","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":80,"output_tokens":65,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:17
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":". If you have","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":84,"output_tokens":69,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:18
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" any questions or need","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":88,"output_tokens":73,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:19
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" assistance, feel free","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":92,"output_tokens":77,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:20
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" to ask me anytime","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":96,"output_tokens":81,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:21
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"!","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":97,"output_tokens":82,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
id:22
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"","role":"assistant"},"finish_reason":"stop"}]},"usage":{"total_tokens":97,"output_tokens":82,"input_tokens":15,"output_tokens_details":{"reasoning_tokens":0}},"request_id":"a3e7bd75-db44-9356-96fc-c69b5aa97b80"}
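Each data: line in the stream above carries one incremental fragment in choices[0].message.content; the full reply is obtained by concatenating those fragments in order. A minimal sketch of that reassembly is below. The regex-based field extraction is purely illustrative; a production client should parse each data: payload with a JSON library such as Jackson or Gson.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SseAssembler {
    // Naive extraction of the "content" field from each SSE data: line.
    // Illustrative only; a real client should parse the JSON properly.
    private static final Pattern CONTENT =
            Pattern.compile("\"content\":\"(.*?)\",\"reasoning_content\"");

    public static String assemble(String[] sseLines) {
        StringBuilder full = new StringBuilder();
        for (String line : sseLines) {
            if (!line.startsWith("data:")) {
                continue; // skip id:, event:, and comment lines
            }
            Matcher m = CONTENT.matcher(line);
            if (m.find()) {
                full.append(m.group(1)); // append this chunk's fragment
            }
        }
        return full.toString();
    }

    public static void main(String[] args) {
        String[] chunks = {
                "id:1",
                "event:result",
                "data:{\"output\":{\"choices\":[{\"message\":{\"content\":\"I\",\"reasoning_content\":\"\",\"role\":\"assistant\"},\"finish_reason\":\"null\"}]}}",
                "data:{\"output\":{\"choices\":[{\"message\":{\"content\":\" am Qwen.\",\"reasoning_content\":\"\",\"role\":\"assistant\"},\"finish_reason\":\"stop\"}]}}"
        };
        System.out.println(assemble(chunks)); // prints "I am Qwen."
    }
}
```

The same approach applies to reasoning_content: accumulate it separately to keep the thinking process apart from the final response, as the Java sample above does with two StringBuilder instances.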
Usage notes
To achieve the best reasoning performance, do not set a System Message. Specify the purpose, output format, and other requirements in the User Message instead.
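Because a System Message is discouraged, system-style requirements can be folded into the user turn itself. A minimal sketch of this, assuming a hypothetical helper (not part of the SDK) that simply prepends the requirements to the question:

```java
public class PromptFolding {
    // Prepend system-style requirements to the user question so the whole
    // instruction travels in a single User Message, per the usage note above.
    static String foldInstructions(String requirements, String question) {
        return requirements + "\n\n" + question;
    }

    public static void main(String[] args) {
        String content = foldInstructions(
                "Answer in English, in at most three sentences.",
                "Who are you?");
        // Pass this string as the content of the User Message.
        System.out.println(content);
    }
}
```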
FAQ
Q: How do I disable the thinking process?
Q: How do I purchase tokens after my free quota runs out?
Q: Can I include images or documents in my questions?
Q: How do I view token usage and the number of API calls?
API references
For the input and output parameters, see Qwen.
Error codes
If a call fails and an error message is returned, see Error messages.