Large language models may struggle with tasks that require real-time data or complex math. Function calling solves this by allowing models to invoke external tools to answer questions they cannot handle on their own.
How it works
Function calling uses a multi-step interaction between your application and the model to call external tools and generate a response.
1. Make the first model call
   Your application sends the user's question and a list of available tools to the model.
2. Receive the tool call instruction (tool name and input parameters)
   If the model needs an external tool, it returns a JSON instruction that specifies which function to run and what parameters to pass.
   If no tool is needed, the model returns a response in natural language.
3. Run the tool in your application
   Your application runs the specified tool and captures its output.
4. Make the second model call
   Your application adds the tool output to the model's context (the messages array) and calls the model again.
5. Receive the final response from the model
   The model combines the tool output and the original user question to generate a reply in natural language.
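The message flow above can be sketched without any network access. This is a minimal illustration only: `fake_model` is a hypothetical stand-in for the model API, and the fixed tool result is an assumption made to keep the sketch deterministic. It shows how the messages array grows across the two model calls.

```python
import json


def get_current_weather(arguments):
    # Stand-in tool: returns a fixed string instead of calling a weather service.
    return f"{arguments['location']} is sunny today."


def fake_model(messages, tools):
    """Hypothetical stand-in for the model API, used only to show the message flow."""
    last = messages[-1]
    if last["role"] == "tool":
        # Final response: the tool output is already in context.
        return {
            "role": "assistant",
            "content": f"Here is what I found: {last['content']}",
            "tool_calls": None,
        }
    # Tool call instruction: tool name plus JSON-encoded parameters.
    return {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": json.dumps({"location": "Singapore"}),
                },
            }
        ],
    }


tools = [{"type": "function", "function": {"name": "get_current_weather"}}]
messages = [{"role": "user", "content": "What is the weather in Singapore?"}]

reply = fake_model(messages, tools)  # first model call
messages.append(reply)
if reply["tool_calls"]:
    call = reply["tool_calls"][0]
    args = json.loads(call["function"]["arguments"])
    result = get_current_weather(args)  # the application runs the tool
    messages.append({"role": "tool", "tool_call_id": call["id"], "content": result})
    reply = fake_model(messages, tools)  # second model call
    messages.append(reply)

print(reply["content"])
```

In a real application, `fake_model` would be replaced by a request to the model, and the messages array would be sent with each request.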
The following diagram illustrates the workflow:
Supported models
Qwen
- Text generation models
  - Qwen-Max: Qwen3.7-Max series, Qwen3.6-Max series, Qwen3-Max series, Qwen-Max series
  - Qwen-Plus: Qwen3.7-Plus series, Qwen3.6-Plus series, Qwen3.5-Plus series, Qwen-Plus series
  - Qwen-Flash: Qwen3.6-Flash series, Qwen3.5-Flash series, Qwen-Flash series
  - Qwen-Coder: Qwen3-Coder series, Qwen2.5-Coder series, Qwen-Coder series
  - Qwen-Turbo: Qwen-Turbo series
  - Qwen3.6 open source series
  - Qwen3.5 open source series
  - Qwen3 open source series
  - Qwen2.5 open source series
- Multimodal models
  - Qwen-VL: Qwen3-VL-Plus series, Qwen3-VL-Flash series
  - Qwen-Omni: Qwen3.5-Omni-Plus series, Qwen3.5-Omni-Flash series, Qwen3-Omni-Flash series
  - Qwen-Omni-Realtime: Qwen3.5-Omni-Plus-Realtime series, Qwen3.5-Omni-Flash-Realtime series
  - Qwen3-VL open source series
DeepSeek
- deepseek-v4-pro
- deepseek-v4-flash
- deepseek-v3.2
- deepseek-v3.2-exp (non-thinking mode)
- deepseek-v3.1 (non-thinking mode)
- deepseek-r1
- deepseek-r1-0528
- deepseek-v3
GLM
- glm-5.1
- glm-5
- glm-4.7
- glm-4.6
Kimi
- kimi-k2.6
- kimi-k2.5
- kimi-k2-thinking
- Moonshot-Kimi-K2-Instruct
MiniMax
- MiniMax-M2.5
Getting started
Get an API key and set it as an environment variable. (This method is scheduled for deprecation and will be merged into the API key configuration.) If you use the OpenAI SDK or the DashScope SDK, install the SDK first.
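Before running the examples, it can help to confirm that the key is visible to your process. The following is a minimal sketch; `load_api_key` is a hypothetical helper, not part of either SDK, and it assumes the variable name `DASHSCOPE_API_KEY` used throughout this page.

```python
import os


def load_api_key(env_var="DASHSCOPE_API_KEY"):
    """Return the API key from the environment, or fail early with a clear message."""
    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set.")
    return key
```

Calling `load_api_key()` once at startup surfaces a missing key before the first request is sent, instead of as an authentication error later.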
This example uses function calling for weather lookups.
OpenAI compatible
from openai import OpenAI
from datetime import datetime
import json
import os
import random

client = OpenAI(
    # API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
    # If you have not set the environment variable, replace the line below with: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # For Beijing region, use: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1
    base_url="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1",
)

# Simulate user question
USER_QUESTION = "What is the weather in Singapore?"

# Define tool list
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Useful when you want to check the weather in a specific city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City or county, such as Singapore or New York.",
                    }
                },
                "required": ["location"],
            },
        },
    },
]


# Simulate weather lookup tool
def get_current_weather(arguments):
    weather_conditions = ["sunny", "cloudy", "rainy"]
    random_weather = random.choice(weather_conditions)
    location = arguments["location"]
    return f"{location} is {random_weather} today."


# Wrap model response function
def get_response(messages):
    completion = client.chat.completions.create(
        model="qwen3.6-plus",
        extra_body={"enable_thinking": False},
        messages=messages,
        tools=tools,
    )
    return completion


messages = [{"role": "user", "content": USER_QUESTION}]
response = get_response(messages)
assistant_output = response.choices[0].message
if assistant_output.content is None:
    assistant_output.content = ""
messages.append(assistant_output)

# If no tool is needed, output content directly
if assistant_output.tool_calls is None:
    print(f"No weather tool needed. Direct reply: {assistant_output.content}")
else:
    # Enter tool calling loop
    while assistant_output.tool_calls is not None:
        tool_call = assistant_output.tool_calls[0]
        tool_call_id = tool_call.id
        func_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)
        print(f"Calling tool [{func_name}] with arguments: {arguments}")
        # Execute tool
        tool_result = get_current_weather(arguments)
        # Build tool response
        tool_message = {
            "role": "tool",
            "tool_call_id": tool_call_id,
            "content": tool_result,  # Keep original tool output
        }
        print(f"Tool returned: {tool_message['content']}")
        messages.append(tool_message)
        # Call model again to summarize tool result into natural language
        response = get_response(messages)
        assistant_output = response.choices[0].message
        if assistant_output.content is None:
            assistant_output.content = ""
        messages.append(assistant_output)
    print(f"Final assistant reply: {assistant_output.content}")
import OpenAI from 'openai';
// Initialize client
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY,
// For Beijing region, use: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1
baseURL: "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1",
});
// Define tool list
const tools = [
{
type: "function",
function: {
name: "get_current_weather",
description: "Useful when you want to check the weather in a specific city.",
parameters: {
type: "object",
properties: {
location: {
type: "string",
description: "City or county, such as Singapore or New York.",
},
},
required: ["location"],
},
},
},
];
// Simulate weather lookup tool
const getCurrentWeather = (args) => {
const weatherConditions = ["sunny", "cloudy", "rainy"];
const randomWeather = weatherConditions[Math.floor(Math.random() * weatherConditions.length)];
const location = args.location;
return `${location} is ${randomWeather} today.`;
};
// Wrap model response function
const getResponse = async (messages) => {
const response = await openai.chat.completions.create({
model: "qwen3.6-plus",
enable_thinking: false,
messages: messages,
tools: tools,
});
return response;
};
const main = async () => {
const input = "What is the weather in Singapore?";
let messages = [
{
role: "user",
content: input,
}
];
let response = await getResponse(messages);
let assistantOutput = response.choices[0].message;
// Ensure content is not null
if (!assistantOutput.content) assistantOutput.content = "";
messages.push(assistantOutput);
// Check if tool call is needed
if (!assistantOutput.tool_calls) {
console.log(`No weather tool needed. Direct reply: ${assistantOutput.content}`);
} else {
// Enter tool calling loop
while (assistantOutput.tool_calls) {
const toolCall = assistantOutput.tool_calls[0];
const toolCallId = toolCall.id;
const funcName = toolCall.function.name;
const funcArgs = JSON.parse(toolCall.function.arguments);
console.log(`Calling tool [${funcName}] with arguments:`, funcArgs);
// Execute tool
const toolResult = getCurrentWeather(funcArgs);
// Build tool response
const toolMessage = {
role: "tool",
tool_call_id: toolCallId,
content: toolResult,
};
console.log(`Tool returned: ${toolMessage.content}`);
messages.push(toolMessage);
// Call model again to summarize into natural language
response = await getResponse(messages);
assistantOutput = response.choices[0].message;
if (!assistantOutput.content) assistantOutput.content = "";
messages.push(assistantOutput);
}
console.log(`Final assistant reply: ${assistantOutput.content}`);
}
};
// Start program
main().catch(console.error);
DashScope
import os
from dashscope import MultiModalConversation
import dashscope
import json
import random

# For Beijing region, use: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1'

# 1. Define tool list
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Useful when you want to check the weather in a specific city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City or county, such as Singapore or New York.",
                    }
                },
                "required": ["location"],
            },
        },
    }
]


# 2. Simulate weather lookup tool
def get_current_weather(arguments):
    weather_conditions = ["sunny", "cloudy", "rainy"]
    random_weather = random.choice(weather_conditions)
    location = arguments["location"]
    return f"{location} is {random_weather} today."


# 3. Wrap model response function
def get_response(messages):
    response = MultiModalConversation.call(
        # API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
        # If you have not set the environment variable, replace the line below with: api_key="sk-xxx"
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        # This example uses multimodal model qwen3.6-plus. For text-only models like qwen3.6-max-preview or qwen-plus, use the text-generation interface. See https://www.alibabacloud.com/help/model-studio/qwen-api-via-dashscope
        model="qwen3.6-plus",
        enable_thinking=False,
        messages=messages,
        tools=tools,
        result_format="message",
    )
    return response


# 4. Initialize conversation history
messages = [
    {
        "role": "user",
        "content": [{"text": "What is the weather in Singapore?"}]
    }
]

# 5. First model call
response = get_response(messages)
assistant_output = response.output.choices[0].message
messages.append(assistant_output)

# 6. Check if tool call is needed
if "tool_calls" not in assistant_output or not assistant_output["tool_calls"]:
    print(f"No tool needed. Direct reply: {assistant_output['content']}")
else:
    # 7. Enter tool calling loop
    # Loop condition: keep going while the latest model reply includes a tool call request
    while "tool_calls" in assistant_output and assistant_output["tool_calls"]:
        tool_call = assistant_output["tool_calls"][0]
        # Parse tool call info
        func_name = tool_call["function"]["name"]
        arguments = json.loads(tool_call["function"]["arguments"])
        tool_call_id = tool_call.get("id")  # Get tool_call_id
        print(f"Calling tool [{func_name}] with arguments: {arguments}")
        # Execute corresponding tool function
        tool_result = get_current_weather(arguments)
        # Build tool response
        tool_message = {
            "role": "tool",
            "content": tool_result,
            "tool_call_id": tool_call_id
        }
        print(f"Tool returned: {tool_message['content']}")
        messages.append(tool_message)
        # Call model again to reply based on tool result
        response = get_response(messages)
        assistant_output = response.output.choices[0].message
        messages.append(assistant_output)
    # 8. Output final natural-language reply
    content = assistant_output["content"]
    if isinstance(content, list) and content:
        content = content[0].get("text", "") if isinstance(content[0], dict) else str(content[0])
    print(f"Final assistant reply: {content}")
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.protocol.Protocol;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.tools.FunctionDefinition;
import com.alibaba.dashscope.tools.ToolCallBase;
import com.alibaba.dashscope.tools.ToolCallFunction;
import com.alibaba.dashscope.tools.ToolFunction;
import com.alibaba.dashscope.utils.JsonUtils;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
public class Main {
/**
* Extract plain text from MultiModalMessage's content.
* Content format is List<Map<String, String>>, e.g.: [{text=weather is sunny}]
*/
@SuppressWarnings("unchecked")
public static String getTextContent(Object content) {
if (content instanceof List) {
for (Object item : (List<?>) content) {
if (item instanceof Map) {
Object text = ((Map<String, Object>) item).get("text");
if (text != null) return text.toString();
}
}
}
return content != null ? content.toString() : "";
}
/**
* Local implementation of the tool.
* @param arguments JSON string containing tool parameters passed by the model.
* @return Result string after tool execution.
*/
public static String getCurrentWeather(String arguments) {
try {
// The model provides parameters in JSON format, so we parse them manually.
ObjectMapper objectMapper = new ObjectMapper();
JsonNode argsNode = objectMapper.readTree(arguments);
String location = argsNode.get("location").asText();
// Use random results to simulate real API calls or business logic.
List<String> weatherConditions = Arrays.asList("sunny", "cloudy", "rainy");
String randomWeather = weatherConditions.get(new Random().nextInt(weatherConditions.size()));
return location + " is " + randomWeather + " today.";
} catch (Exception e) {
// Exception handling to ensure robustness.
return "Unable to parse location parameter.";
}
}
public static void main(String[] args) {
try {
// Register our tool with the model.
String weatherParamsSchema =
"{\"type\":\"object\",\"properties\":{\"location\":{\"type\":\"string\",\"description\":\"City or county, such as Singapore or New York. \"}},\"required\":[\"location\"]}";
FunctionDefinition weatherFunction = FunctionDefinition.builder()
.name("get_current_weather") // Unique identifier for the tool, must match local implementation.
.description("Useful when you want to check the weather in a specific city.") // Clear description helps the model decide when to use this tool.
.parameters(JsonUtils.parseString(weatherParamsSchema).getAsJsonObject())
.build();
// For Beijing region, use: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/api/v1
MultiModalConversation conv = new MultiModalConversation(Protocol.HTTP.getValue(), "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1");
String userInput = "What is the weather in Singapore?";
List<MultiModalMessage> messages = new ArrayList<>();
messages.add(MultiModalMessage.builder().role(Role.USER.getValue())
.content(Arrays.asList(Collections.singletonMap("text", userInput))).build());
// First model call. Send user request and tool list to the model.
MultiModalConversationParam param = MultiModalConversationParam.builder()
.model("qwen3.6-plus") // This example uses multimodal model qwen3.6-plus. For text-only models like qwen3.6-max-preview or qwen-plus, use the text-generation interface. See https://www.alibabacloud.com/help/model-studio/qwen-api-via-dashscope
.enableThinking(false)
.apiKey(System.getenv("DASHSCOPE_API_KEY")) // Get API key from environment variable. API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
.messages(messages) // Pass current conversation history.
.tools(Arrays.asList(ToolFunction.builder().function(weatherFunction).build())) // Pass available tools.
.build();
MultiModalConversationResult result = conv.call(param);
MultiModalMessage assistantOutput = result.getOutput().getChoices().get(0).getMessage();
messages.add(assistantOutput); // Add model's first reply to conversation history.
// Check model reply to see if it requests a tool call.
if (assistantOutput.getToolCalls() == null || assistantOutput.getToolCalls().isEmpty()) {
// Case A: Model replies directly without calling a tool.
System.out.println("No weather tool needed. Direct reply: " + getTextContent(assistantOutput.getContent()));
} else {
// Case B: Model decides to call a tool.
// Use a while loop to handle multiple tool calls in sequence.
while (assistantOutput.getToolCalls() != null && !assistantOutput.getToolCalls().isEmpty()) {
ToolCallBase toolCall = assistantOutput.getToolCalls().get(0);
// Parse tool call details (function name and arguments) from model reply.
ToolCallFunction functionCall = (ToolCallFunction) toolCall;
String funcName = functionCall.getFunction().getName();
String arguments = functionCall.getFunction().getArguments();
System.out.println("Calling tool [" + funcName + "] with arguments: " + arguments);
// Execute corresponding Java method by tool name.
String toolResult = getCurrentWeather(arguments);
// Build a message with role "tool" containing the tool result.
MultiModalMessage toolMessage = MultiModalMessage.builder()
.role("tool")
.toolCallId(toolCall.getId())
.content(Arrays.asList(Collections.singletonMap("text", toolResult)))
.build();
System.out.println("Tool returned: " + toolResult);
messages.add(toolMessage); // Add tool result to conversation history.
// Call model again.
param.setMessages((List) messages);
result = conv.call(param);
assistantOutput = result.getOutput().getChoices().get(0).getMessage();
messages.add(assistantOutput);
}
// Print final reply generated by the model after summarizing.
System.out.println("Final assistant reply: " + getTextContent(assistantOutput.getContent()));
}
} catch (NoApiKeyException | UploadFileException e) {
System.err.println("Error: " + e.getMessage());
} catch (Exception e) {
e.printStackTrace();
}
}
}
Output:
Calling tool [get_current_weather] with arguments: {'location': 'Singapore'}
Tool returned: Singapore is cloudy today.
Final assistant reply: Today's weather in Singapore is cloudy.
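The loops in the examples above continue for as long as the model keeps requesting tools. As a hedged sketch, the loop can be capped so that a misbehaving exchange does not run indefinitely. `run_tool_loop`, `call_model`, `run_tool`, and `max_rounds` are hypothetical names introduced for illustration, and messages are modeled as plain dictionaries rather than SDK objects.

```python
import json


def run_tool_loop(call_model, run_tool, messages, max_rounds=5):
    """Call the model until it stops requesting tools or max_rounds is reached.

    call_model(messages) returns an assistant message dict.
    run_tool(name, arguments) returns the tool output as a string.
    """
    reply = call_model(messages)
    messages.append(reply)
    rounds = 0
    while reply.get("tool_calls"):
        if rounds >= max_rounds:
            raise RuntimeError(f"Model still requested tools after {max_rounds} rounds.")
        rounds += 1
        # Answer every tool call in this reply, not only the first one.
        for call in reply["tool_calls"]:
            arguments = json.loads(call["function"]["arguments"])
            output = run_tool(call["function"]["name"], arguments)
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": output}
            )
        reply = call_model(messages)
        messages.append(reply)
    return reply
```

The cap of five rounds is arbitrary; choose a limit that fits how many sequential tool calls your application expects.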
Usage
Function calling supports two methods to provide tool information:
- Method 1: Pass tools using the tools parameter (recommended)
  The steps, described in the rest of this section, are: Define tools, Create the messages array, Invoke function calling, Run tool functions, and Summarize tool output with the model.
- Method 2: Pass tools via the system message
  We recommend using the tools parameter because the server automatically selects the best prompt template for it. To pass tools through the system message with Qwen models, see Pass tool information via system message.
The following sections demonstrate function calling with the tools parameter using the OpenAI-compatible method.
Assume that your application handles two types of questions: weather lookups and time lookups.
1. Define tools
Tools connect the model to external services.
1.1. Create tool functions
You can create two tool functions: one for weather lookup and one for time lookup.
- Weather lookup tool
  This tool accepts an arguments parameter. The value for arguments must be in the format {"location": "location to query"}. The tool returns a string in the format "{location} is {weather} today.".
  To simplify this demo, the weather tool does not make real API calls. It randomly selects from sunny, cloudy, or rainy. In a production environment, you can replace it with a real service, such as the Amap Weather API.
- Time lookup tool
  No input parameters are required. The output format is "Current time: {retrieved time}.".
  If you are using Node.js, you can run npm install date-fns to install the date-fns package.
## Step 1: Define tool functions
# Import random module
import random
from datetime import datetime


# Simulate weather lookup tool. Example output: "Beijing is rainy today."
def get_current_weather(arguments):
    # List of possible weather conditions
    weather_conditions = ["sunny", "cloudy", "rainy"]
    # Pick a random weather condition
    random_weather = random.choice(weather_conditions)
    # Extract location from JSON
    location = arguments["location"]
    # Return formatted weather info
    return f"{location} is {random_weather} today."


# Lookup current time. Example output: "Current time: 2024-04-15 17:15:18."
def get_current_time():
    # Get current date and time
    current_datetime = datetime.now()
    # Format date and time
    formatted_time = current_datetime.strftime('%Y-%m-%d %H:%M:%S')
    # Return formatted current time
    return f"Current time: {formatted_time}."


# Test tool functions and print results. Remove these four lines before running later steps.
print("Test tool outputs:")
print(get_current_weather({"location": "Shanghai"}))
print(get_current_time())
print("\n")
// Step 1: Define tool functions
// Import time formatting package
import { format } from 'date-fns';
function getCurrentWeather(args) {
// List of possible weather conditions
const weatherConditions = ["sunny", "cloudy", "rainy"];
// Pick a random weather condition
const randomWeather = weatherConditions[Math.floor(Math.random() * weatherConditions.length)];
// Extract location from JSON
const location = args.location;
// Return formatted weather info
return `${location} is ${randomWeather} today.`;
}
function getCurrentTime() {
// Get current date and time
const currentDatetime = new Date();
// Format date and time
const formattedTime = format(currentDatetime, 'yyyy-MM-dd HH:mm:ss');
// Return formatted current time
return `Current time: ${formattedTime}.`;
}
// Test tool functions and print results. Remove these four lines before running later steps.
console.log("Test tool outputs:")
console.log(getCurrentWeather({location:"Shanghai"}));
console.log(getCurrentTime());
console.log("\n")
Output:
Test tool outputs:
Shanghai is cloudy today.
Current time: 2025-01-08 20:21:45.
1.2. Create the tools array
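The model needs the same details as a person to select the right tool: what it does, when to use it, and what inputs it needs. You provide this information as a JSON object with a type field set to "function" and a function object that holds the tool's name, description, and parameters (a JSON Schema object that describes the inputs).
For the weather lookup tool, the name is get_current_weather, the description states that it checks the weather in a specific city, and parameters declares one required string property, location.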
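As a sketch of this format, the weather tool's entry could be assembled with a small helper. `make_function_tool` is a hypothetical function written for this illustration, not part of any SDK, and it always emits an object-typed `parameters` schema, which is an assumption about how you may want to normalize tools that take no input.

```python
def make_function_tool(name, description, properties=None, required=None):
    """Build one entry of the tools array (hypothetical helper for illustration)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties or {},
                "required": required or [],
            },
        },
    }


weather_tool = make_function_tool(
    name="get_current_weather",
    description="Useful when you want to check the weather in a specific city.",
    properties={
        "location": {
            "type": "string",
            "description": "City or county, such as Beijing, Hangzhou, or Yuhang District.",
        }
    },
    required=["location"],
)
```

The resulting `weather_tool` dictionary has the same shape as the weather entry in the tools array used in this walkthrough.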
You can define the tools array in your code before you invoke function calling. The array holds each tool's name, description, and parameter definition, and is passed as a parameter in your request.
# Paste this code after Step 1
## Step 2: Create tools array
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Useful when you want to know the current time.",
            "parameters": {}
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Useful when you want to check the weather in a specific city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City or county, such as Beijing, Hangzhou, or Yuhang District.",
                    }
                },
                "required": ["location"]
            }
        }
    }
]
tool_name = [tool["function"]["name"] for tool in tools]
print(f"Created {len(tools)} tools: {tool_name}\n")
// Paste this code after Step 1
// Step 2: Create tools array
const tools = [
{
type: "function",
function: {
name: "get_current_time",
description: "Useful when you want to know the current time.",
parameters: {}
}
},
{
type: "function",
function: {
name: "get_current_weather",
description: "Useful when you want to check the weather in a specific city.",
parameters: {
type: "object",
properties: {
location: {
type: "string",
description: "City or county, such as Beijing, Hangzhou, or Yuhang District.",
}
},
required: ["location"]
}
}
}
];
const toolNames = tools.map(tool => tool.function.name);
console.log(`Created ${tools.length} tools: ${toolNames.join(', ')}\n`);
2. Create the messages array
Function calling uses the messages array to send instructions and context to the model. The messages array must include a system message and a user message.
System message
The tools and their usage are already described in the Create the tools array step. However, adding a clear rule in the system message often improves tool call accuracy. For this scenario, you can set the system prompt to:
You are a helpful assistant. If the user asks about weather, call the 'get_current_weather' function.
If the user asks about time, call the 'get_current_time' function.
Reply in a friendly tone.
User message
The user message holds the user's question. If the user asks, "What is the weather in Shanghai?", the messages array is as follows:
# Step 3: Create messages array
# Paste this code after Step 2
# User message example for text-generation models
messages = [
    {
        "role": "system",
        "content": """You are a helpful assistant. If the user asks about weather, call the 'get_current_weather' function. If the user asks about time, call the 'get_current_time' function. Reply in a friendly tone.""",
    },
    {
        "role": "user",
        "content": "What is the weather in Shanghai?"
    }
]
# User message example for multimodal models
# messages = [
#     {
#         "role": "system",
#         "content": """You are a helpful assistant. If the user asks about weather, call the 'get_current_weather' function. If the user asks about time, call the 'get_current_time' function. Reply in a friendly tone.""",
#     },
#     {
#         "role": "user",
#         "content": [
#             {"type": "image_url", "image_url": {"url": "https://img.alicdn.com/imgextra/i2/O1CN01FbTJon1ErXVGMRdsN_!!6000000000405-0-tps-1024-683.jpg"}},
#             {"type": "text", "text": "Based on the location in the image, what is the current weather?"},
#         ],
#     },
# ]
print("messages array created\n")
// Step 3: Create messages array
// Paste this code after Step 2
const messages = [
{
role: "system",
content: "You are a helpful assistant. If the user asks about weather, call the 'get_current_weather' function. If the user asks about time, call the 'get_current_time' function. Reply in a friendly tone.",
},
{
role: "user",
content: "What is the weather in Shanghai?"
}
];
// User message example for multimodal models
// const messages = [{
//   role: "user",
//   content: [{type: "image_url", image_url: {"url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"}},
//     {type: "text", text: "What scene is shown in the image?"}]
// }];
console.log("messages array created\n");
Because the available tools cover weather and time, you can also ask about the current time.
3. Invoke function calling
Pass the tools and messages arrays to the model to invoke function calling. The model decides whether to call a tool. If it does, it returns the tool's name and parameters.
For the models that support function calling, see Supported models.
# Step 4: Invoke function calling
# Paste this code after Step 3
from openai import OpenAI
import os

client = OpenAI(
    # API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
    # If you have not set the environment variable, replace the line below with: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # For Beijing region, use: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1
    base_url="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1",
)


def function_calling():
    completion = client.chat.completions.create(
        # This example uses qwen3.6-plus. Replace with another model as needed. Model list: https://www.alibabacloud.com/help/model-studio/getting-started/models
        model="qwen3.6-plus",
        extra_body={"enable_thinking": False},
        messages=messages,
        tools=tools
    )
    print("Response object:")
    print(completion.choices[0].message.model_dump_json())
    print("\n")
    return completion


print("Invoking function calling...")
completion = function_calling()
// Step 4: Invoke function calling
// Paste this code after Step 3
import OpenAI from "openai";
const openai = new OpenAI(
{
// API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
// If you have not set the environment variable, replace the line below with: apiKey: "sk-xxx",
apiKey: process.env.DASHSCOPE_API_KEY,
// For Beijing region, use: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1
baseURL: "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1"
}
);
async function functionCalling() {
const completion = await openai.chat.completions.create({
model: "qwen3.6-plus", // This example uses qwen3.6-plus. Replace with another model as needed. Model list: https://www.alibabacloud.com/help/model-studio/getting-started/models
enable_thinking: false,
messages: messages,
tools: tools
});
console.log("Response object:");
console.log(JSON.stringify(completion.choices[0].message));
console.log("\n");
return completion;
}
const completion = await functionCalling();
Because the user asked about Shanghai's weather, the model returns the tool function name "get_current_weather" and the input parameters "{\"location\": \"Shanghai\"}".
{
"content": "",
"refusal": null,
"role": "assistant",
"audio": null,
"function_call": null,
"tool_calls": [
{
"id": "call_6596dafa2a6a46f7a217da",
"function": {
"arguments": "{\"location\": \"Shanghai\"}",
"name": "get_current_weather"
},
"type": "function",
"index": 0
}
]
}
If the model decides that no tool is needed, it replies directly through the content parameter. For example, when you enter "Hello", the tool_calls parameter is empty:
{
"content": "Hello! How can I help you? I'm great at answering questions about weather and time.",
"refusal": null,
"role": "assistant",
"audio": null,
"function_call": null,
"tool_calls": null
}
If the tool_calls parameter is empty, you can return the content directly and skip the remaining steps.
To force the model to call a specific tool every time, see Force tool calling.
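The branch between a tool call and a direct reply can be sketched as follows. This is an illustration under assumptions: the assistant message is modeled as a plain dictionary shaped like the JSON responses shown above, and `extract_tool_calls` is a hypothetical helper name.

```python
import json


def extract_tool_calls(message):
    """Return (name, arguments) pairs, or an empty list when the model replied directly."""
    calls = message.get("tool_calls") or []
    return [
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in calls
    ]


# Shaped like the tool call response shown above.
tool_reply = {
    "content": "",
    "role": "assistant",
    "tool_calls": [
        {
            "id": "call_6596dafa2a6a46f7a217da",
            "function": {
                "arguments": '{"location": "Shanghai"}',
                "name": "get_current_weather",
            },
            "type": "function",
            "index": 0,
        }
    ],
}

# Shaped like the direct reply shown above.
direct_reply = {
    "content": "Hello! How can I help you?",
    "role": "assistant",
    "tool_calls": None,
}

for reply in (tool_reply, direct_reply):
    calls = extract_tool_calls(reply)
    if calls:
        print("Tool calls requested:", calls)
    else:
        print("Direct reply:", reply["content"])
```

Treating both `None` and an empty list as "no tool call" keeps the check the same whether the field is null or empty.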
4. Run tool functions
Running tool functions turns the model's decision into action.
Your application, not the model, runs the tool functions.
The model only outputs strings, so you must parse the tool name and input parameters before you run a tool function.
- Tool function
  You can create a mapping table function_mapper that maps tool names to actual functions.
- Input parameters
  Function calling returns parameters as a JSON string. You must parse the string into a JSON object to extract the values.
After parsing, you can pass the parameters to the tool function, run it, and capture the output.
# Step 5: Run tool functions
# Paste this code after Step 4
import json

print("Running tool function...")
# Get function name and input parameters from response
function_name = completion.choices[0].message.tool_calls[0].function.name
arguments_string = completion.choices[0].message.tool_calls[0].function.arguments
# Parse parameters string using json module
arguments = json.loads(arguments_string)
# Create function mapping table
function_mapper = {
    "get_current_weather": get_current_weather,
    "get_current_time": get_current_time
}
# Get function object
function = function_mapper[function_name]
# If parameters are empty, call function directly
if arguments == {}:
    function_output = function()
# Otherwise, call function with parameters
else:
    function_output = function(arguments)
# Print tool output
print(f"Tool function output: {function_output}\n")
// Step 5: Run tool functions
// Paste this code after Step 4
console.log("Running tool function...");
const function_name = completion.choices[0].message.tool_calls[0].function.name;
const arguments_string = completion.choices[0].message.tool_calls[0].function.arguments;
// Parse parameters string using JSON module
const args = JSON.parse(arguments_string);
// Create function mapping table
const functionMapper = {
"get_current_weather": getCurrentWeather,
"get_current_time": getCurrentTime
};
// Get function object
const func = functionMapper[function_name];
// If parameters are empty, call function directly
let functionOutput;
if (Object.keys(args).length === 0) {
functionOutput = func();
} else {
// Otherwise, call function with parameters
functionOutput = func(args);
}
// Print tool output
console.log(`Tool function output: ${functionOutput}\n`);
Output:
Shanghai is cloudy today.
In real applications, many tools perform actions, such as sending emails or uploading files, not just data lookups. These tools often do not return a string. To help the model understand their status, you can add status messages, such as "Email sent" or "Operation failed".
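One way to produce such status messages is to wrap action tools so that they always hand a string back to the model. This is a minimal sketch: `run_action_tool` and `send_email` are hypothetical names, and the email function only validates its input rather than sending anything.

```python
def run_action_tool(action, *args, success_message="Operation succeeded", **kwargs):
    """Run a tool that performs an action and return a status string for the model."""
    try:
        action(*args, **kwargs)
    except Exception as exc:
        # Report the failure as text so the model can explain it to the user.
        return f"Operation failed: {exc}"
    return success_message


def send_email(to, body):
    # Hypothetical action tool: checks the address, then pretends to send.
    if "@" not in to:
        raise ValueError(f"invalid address {to!r}")


print(run_action_tool(send_email, "a@example.com", "hi", success_message="Email sent"))
print(run_action_tool(send_email, "not-an-address", "hi", success_message="Email sent"))
```

The first call prints "Email sent" and the second prints "Operation failed: invalid address 'not-an-address'". Either string can be used as the content of the tool message.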
5. Summarize tool output with the model
Tool outputs follow fixed formats that may sound stiff. To receive replies in natural language, you can submit the tool output to the model along with the user's original question.
-
Add the assistant message
After you invoke function calling, retrieve the assistant message from completion.choices[0].message and add it to the messages array.
-
Add the tool message
Add the tool output to the messages array in the following format: {"role": "tool", "content": "the output of the tool", "tool_call_id": completion.choices[0].message.tool_calls[0].id}
Note
-
Make sure that the tool output is a string.
-
The tool_call_id is a unique identifier that the system generates for each tool call request. A model may request several tools at once. When you return multiple tool results to the model, the tool_call_id matches each tool output to the call that requested it.
# Step 6: Submit tool output to LLM
# Paste this code after Step 5
messages.append(completion.choices[0].message)
print("Added assistant message")
messages.append({"role": "tool", "content": function_output, "tool_call_id": completion.choices[0].message.tool_calls[0].id})
print("Added tool message\n")

// Step 6: Submit tool output to LLM
// Paste this code after Step 5
messages.push(completion.choices[0].message);
console.log("Added assistant message")
messages.push({
"role": "tool",
"content": functionOutput,
"tool_call_id": completion.choices[0].message.tool_calls[0].id
});
console.log("Added tool message\n");
The messages array is now as follows:
[
System Message -- Rules for calling tools
User Message -- User's question
Assistant Message -- Tool call info from model
Tool Message -- Tool output (If using parallel tool calls, there may be multiple tool messages)
]
After you update the messages array, you can run the following code:
# Step 7: LLM summarizes tool output
# Paste this code after Step 6
print("Summarizing tool output...")
completion = function_calling()

// Step 7: LLM summarizes tool output
// Paste this code after Step 6
console.log("Summarizing tool output...");
const completion_1 = await functionCalling();
The reply is in the content field: "Today's weather in Shanghai is cloudy. Let me know if you have more questions."
{
"content": "Today's weather in Shanghai is cloudy. Let me know if you have more questions.",
"refusal": null,
"role": "assistant",
"audio": null,
"function_call": null,
"tool_calls": null
}
You have now completed a full function calling flow.
Advanced usage
Specify tool calling behavior
Parallel tool calling
A single-city weather lookup requires one tool call. However, some questions require multiple calls, such as "What is the weather in Beijing and Shanghai?" or "What is the weather in Hangzhou and what time is it?" By default, the response to such a question contains only one tool call. For "What is the weather in Beijing and Shanghai?", the response is as follows:
{
"content": "",
"refusal": null,
"role": "assistant",
"audio": null,
"function_call": null,
"tool_calls": [
{
"id": "call_61a2bbd82a8042289f1ff2",
"function": {
"arguments": "{\"location\": \"Beijing\"}",
"name": "get_current_weather"
},
"type": "function",
"index": 0
}
]
}
The response contains only the input parameters for Beijing. To include all cities, you can set the parallel_tool_calls request parameter to true when you invoke function calling. The response then contains all tool functions and their input parameters.
You can use parallel tool calling when tasks have no dependencies. If tasks depend on each other (for example, the input of tool A depends on the output of tool B), you must call tools one at a time (serial tool calling). See Getting started.
def function_calling():
    completion = client.chat.completions.create(
        model="qwen3.6-plus",  # This example uses qwen3.6-plus. Replace with another model as needed
        extra_body={"enable_thinking": False},
        messages=messages,
        tools=tools,
        # New parameter
        parallel_tool_calls=True
    )
    print("Response object:")
    print(completion.choices[0].message.model_dump_json())
    print("\n")
    return completion

print("Invoking function calling...")
completion = function_calling()

async function functionCalling() {
const completion = await openai.chat.completions.create({
model: "qwen3.6-plus", // This example uses qwen3.6-plus. Replace with another model as needed
enable_thinking: false,
messages: messages,
tools: tools,
parallel_tool_calls: true
});
console.log("Response object:");
console.log(JSON.stringify(completion.choices[0].message));
console.log("\n");
return completion;
}
const completion = await functionCalling();
The tool_calls array in the response now includes parameters for both Beijing and Shanghai:
{
"content": "",
"role": "assistant",
"tool_calls": [
{
"function": {
"name": "get_current_weather",
"arguments": "{\"location\": \"Beijing\"}"
},
"index": 0,
"id": "call_c2d8a3a24c4d4929b26ae2",
"type": "function"
},
{
"function": {
"name": "get_current_weather",
"arguments": "{\"location\": \"Shanghai\"}"
},
"index": 1,
"id": "call_dc7f2f678f1944da9194cd",
"type": "function"
}
]
}
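Each entry in tool_calls needs its own tool message. The following sketch shows one way to do this, assuming the assistant message has been converted to a plain dictionary (for example, with model_dump()). The helper `build_tool_messages` and the stand-in weather tool are illustrative names, not SDK functions.

```python
import json


def get_current_weather(arguments):
    """Stand-in weather tool; a real tool would query a weather service."""
    return f"{arguments['location']} is cloudy today."


def build_tool_messages(assistant_message, function_mapper):
    """Run every requested tool and return one tool message per call."""
    tool_messages = []
    for tool_call in assistant_message.get("tool_calls") or []:
        name = tool_call["function"]["name"]
        arguments = json.loads(tool_call["function"]["arguments"])
        output = function_mapper[name](arguments)
        tool_messages.append(
            {
                "role": "tool",
                "content": output,
                # The id ties this output to the call that requested it.
                "tool_call_id": tool_call["id"],
            }
        )
    return tool_messages


# Assistant message shaped like the parallel response shown above.
assistant_message = {
    "content": "",
    "role": "assistant",
    "tool_calls": [
        {
            "function": {"name": "get_current_weather", "arguments": "{\"location\": \"Beijing\"}"},
            "index": 0,
            "id": "call_c2d8a3a24c4d4929b26ae2",
            "type": "function",
        },
        {
            "function": {"name": "get_current_weather", "arguments": "{\"location\": \"Shanghai\"}"},
            "index": 1,
            "id": "call_dc7f2f678f1944da9194cd",
            "type": "function",
        },
    ],
}

results = build_tool_messages(assistant_message, {"get_current_weather": get_current_weather})
for message in results:
    print(message)
```

Append the assistant message first and then every tool message to the messages array before you call the model again.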
Force tool calling
Model output is non-deterministic and may sometimes invoke the wrong tool. To manually specify a strategy for certain questions, such as forcing a specific tool or preventing tool use, you can modify the tool_choice parameter. The default value of the tool_choice parameter is "auto", which means the model autonomously decides how to perform tool calling.
When you summarize tool output with the model, you must remove the tool_choice parameter. Otherwise, the API still returns tool call information.
-
Force a specific tool
To force a specific tool for certain questions, you can set tool_choice to {"type": "function", "function": {"name": "the_function_to_call"}}. The model skips tool selection and returns only parameters.
For example, if your application handles only weather questions, you can update the function_calling code as follows:
def function_calling():
    completion = client.chat.completions.create(
        model="qwen3.6-plus",
        extra_body={"enable_thinking": False},
        messages=messages,
        tools=tools,
        tool_choice={"type": "function", "function": {"name": "get_current_weather"}}
    )
    print(completion.model_dump_json())

function_calling()

async function functionCalling() {
    const response = await openai.chat.completions.create({
        model: "qwen3.6-plus",
        enable_thinking: false,
        messages: messages,
        tools: tools,
        tool_choice: {"type": "function", "function": {"name": "get_current_weather"}}
    });
    console.log("Response object:");
    console.log(JSON.stringify(response.choices[0].message));
    console.log("\n");
    return response;
}
const response = await functionCalling();
No matter what question you ask, the tool function in the response is always get_current_weather.
Before you use this strategy, you must confirm that the question matches the chosen tool. Otherwise, the results may be unexpected.
-
Block all tools
To block tool calls for all questions, you can set tool_choice to "none" or omit the tools parameter. The response then has a content field, and the tool_calls parameter is always empty.
For example, if none of your questions require tools, you can update the function_calling code as follows:
def function_calling():
    completion = client.chat.completions.create(
        model="qwen3.6-plus",
        extra_body={"enable_thinking": False},
        messages=messages,
        tools=tools,
        tool_choice="none"
    )
    print(completion.model_dump_json())

function_calling()

async function functionCalling() {
    const completion = await openai.chat.completions.create({
        model: "qwen3.6-plus",
        enable_thinking: false,
        messages: messages,
        tools: tools,
        tool_choice: "none"
    });
    console.log("Response object:");
    console.log(JSON.stringify(completion.choices[0].message));
    console.log("\n");
    return completion;
}
const completion = await functionCalling();
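The three strategies can be collected in one place. The sketch below is only an illustration: `build_tool_choice` and `request_kwargs` are hypothetical helpers, and the mode names "auto", "none", and "force" are labels chosen for this example. It also drops tool_choice for the summarizing call, as required above.

```python
def build_tool_choice(mode, function_name=None):
    """Return a tool_choice value for 'auto', 'none', or 'force' mode."""
    if mode in ("auto", "none"):
        return mode
    if mode == "force":
        if not function_name:
            raise ValueError("function_name is required when mode is 'force'")
        return {"type": "function", "function": {"name": function_name}}
    raise ValueError(f"unknown mode: {mode}")


def request_kwargs(tools, summarizing=False, mode="auto", function_name=None):
    """Build the tool-related keyword arguments for one model call."""
    kwargs = {"tools": tools}
    # tool_choice is left out for the summarizing call and for the default mode.
    if not summarizing and mode != "auto":
        kwargs["tool_choice"] = build_tool_choice(mode, function_name)
    return kwargs


print(request_kwargs([], mode="force", function_name="get_current_weather"))
print(request_kwargs([], summarizing=True, mode="force", function_name="get_current_weather"))
```

The resulting dictionary could be unpacked into the create call, for example client.chat.completions.create(model=..., messages=messages, **kwargs).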
Multi-turn conversations
A user might ask, "What is the weather in Beijing?" in the first round and "What about Shanghai?" in the second round. Without the context from the first round, the model cannot determine which tool to call. In multi-turn conversations, you can keep the messages array after each round, add the new user message, and invoke function calling again. The messages structure is as follows:
[
System Message -- Rules for calling tools
User Message -- User's question
Assistant Message -- Tool call info from model
Tool Message -- Tool output
Assistant Message -- Model summary of tool call
User Message -- User's second question
]
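The structure above can be reproduced with a small loop. This sketch replaces the real model with a stub (`fake_model`) so that it runs offline; `run_turn` and `run_tool` are also hypothetical names. In a real application, `call_model` would wrap the chat completions call and return the assistant message as a dictionary.

```python
import json


def fake_model(messages):
    """Stub model: requests the weather tool, then repeats the tool output."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "assistant", "content": last["content"]}
    city = "Shanghai" if "Shanghai" in last["content"] else "Beijing"
    return {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "id": f"call_{len(messages)}",
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": json.dumps({"location": city}),
                },
            }
        ],
    }


def run_tool(tool_call):
    """Stand-in tool runner that only knows the weather tool."""
    location = json.loads(tool_call["function"]["arguments"])["location"]
    return f"{location} is sunny today."


def run_turn(messages, user_text, call_model, tool_runner):
    """Append one user turn, resolve any tool calls, and return the reply text."""
    messages.append({"role": "user", "content": user_text})
    assistant = call_model(messages)
    messages.append(assistant)
    while assistant.get("tool_calls"):
        for tool_call in assistant["tool_calls"]:
            messages.append(
                {"role": "tool", "content": tool_runner(tool_call), "tool_call_id": tool_call["id"]}
            )
        assistant = call_model(messages)
        messages.append(assistant)
    return assistant["content"]


messages = [{"role": "system", "content": "You are a helpful assistant."}]
first_reply = run_turn(messages, "What is the weather in Beijing?", fake_model, run_tool)
second_reply = run_turn(messages, "What about Shanghai?", fake_model, run_tool)
print([message["role"] for message in messages])
```

Because the same messages list is reused, the second call sees the whole first round, including the tool call and its output.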
Streaming output
Streaming output retrieves tool names and parameters in real time:
-
Tool call parameters: Returned in chunks as a data stream.
-
Tool function name: Returned in the first chunk of the streaming response.
from openai import OpenAI
import os
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
# For Beijing region, use: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1
base_url="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1",
)
tools = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful when you want to check the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City or county, such as Beijing, Hangzhou, or Yuhang District.",
}
},
"required": ["location"],
},
},
},
]
stream = client.chat.completions.create(
model="qwen3.6-plus",
extra_body={"enable_thinking": False},
messages=[{"role": "user", "content": "What is the weather in Hangzhou?"}],
tools=tools,
stream=True
)
for chunk in stream:
    delta = chunk.choices[0].delta
    print(delta.tool_calls)

import { OpenAI } from "openai";
const openai = new OpenAI(
{
// API keys vary by region. To obtain an API key, visit https://www.alibabacloud.com/help/en/model-studio/get-api-key
// If you have not configured an environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx",
apiKey: process.env.DASHSCOPE_API_KEY,
// If you use a model in the Beijing region, you must replace the baseURL with: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1
baseURL: "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1"
}
);
const tools = [
{
"type": "function",
"function": {
"name": "getCurrentWeather",
"description": "Useful for querying the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city or district, such as Beijing, Hangzhou, or Yuhang District."
}
},
"required": ["location"]
}
}
}
];
const stream = await openai.chat.completions.create({
model: "qwen3.6-plus",
enable_thinking: false,
messages: [{ role: "user", content: "Weather in Beijing" }],
tools: tools,
stream: true,
});
for await (const chunk of stream) {
const delta = chunk.choices[0].delta;
console.log(delta.tool_calls);
}
Output:
[ChoiceDeltaToolCall(index=0, id='call_8f08d2b0fc0c4d8fab7123', function=ChoiceDeltaToolCallFunction(arguments='{"location":', name='get_current_weather'), type='function')]
[ChoiceDeltaToolCall(index=0, id='', function=ChoiceDeltaToolCallFunction(arguments=' "Hangzhou"}', name=None), type='function')]
None
Join the parameter chunks (arguments):
tool_calls = {}
for response_chunk in stream:
    delta_tool_calls = response_chunk.choices[0].delta.tool_calls
    if delta_tool_calls:
        for tool_call_chunk in delta_tool_calls:
            call_index = tool_call_chunk.index
            tool_call_chunk.function.arguments = tool_call_chunk.function.arguments or ""
            if call_index not in tool_calls:
                tool_calls[call_index] = tool_call_chunk
            else:
                tool_calls[call_index].function.arguments += tool_call_chunk.function.arguments
print(tool_calls[0].model_dump_json())

const toolCalls = {};
for await (const responseChunk of stream) {
const deltaToolCalls = responseChunk.choices[0]?.delta?.tool_calls;
if (deltaToolCalls) {
for (const toolCallChunk of deltaToolCalls) {
const index = toolCallChunk.index;
toolCallChunk.function.arguments = toolCallChunk.function.arguments || "";
if (!toolCalls[index]) {
toolCalls[index] = { ...toolCallChunk };
if (!toolCalls[index].function) {
toolCalls[index].function = { name: '', arguments: '' };
}
}
else if (toolCallChunk.function?.arguments) {
toolCalls[index].function.arguments += toolCallChunk.function.arguments;
}
}
}
}
console.log(JSON.stringify(toolCalls[0]));
Output:
{"index":0,"id":"call_16c72bef988a4c6c8cc662","function":{"arguments":"{\"location\": \"Hangzhou\"}","name":"get_current_weather"},"type":"function"}
When you summarize tool output with the model, the assistant message that you add to the messages array must use the following format. Replace the element in tool_calls with the joined tool call from the preceding output.
{
"content": "",
"refusal": None,
"role": "assistant",
"audio": None,
"function_call": None,
"tool_calls": [
{
"id": "call_xxx",
"function": {
"arguments": '{"location": "xx"}',
"name": "get_current_weather",
},
"type": "function",
"index": 0,
}
],
}
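As a sketch of how the streamed fragments can be turned into such an assistant message, the code below works on plain dictionaries. It assumes each streamed tool call delta has been converted to a dictionary first; `merge_tool_call_deltas` and `build_assistant_message` are illustrative names, not SDK functions.

```python
import json


def merge_tool_call_deltas(deltas):
    """Merge streamed tool call fragments into complete tool calls, ordered by index."""
    merged = {}
    for delta in deltas:
        entry = merged.setdefault(
            delta["index"],
            {
                "id": "",
                "type": "function",
                "index": delta["index"],
                "function": {"name": "", "arguments": ""},
            },
        )
        function = delta.get("function") or {}
        # The id and name arrive once; later chunks may carry empty values.
        entry["id"] = entry["id"] or delta.get("id") or ""
        entry["function"]["name"] = entry["function"]["name"] or function.get("name") or ""
        # Argument fragments are concatenated in arrival order.
        entry["function"]["arguments"] += function.get("arguments") or ""
    return [merged[index] for index in sorted(merged)]


def build_assistant_message(deltas):
    """Wrap the merged tool calls in an assistant message for the messages array."""
    return {"role": "assistant", "content": "", "tool_calls": merge_tool_call_deltas(deltas)}


# Fragments shaped like the streaming output shown earlier.
deltas = [
    {
        "index": 0,
        "id": "call_8f08d2b0fc0c4d8fab7123",
        "type": "function",
        "function": {"name": "get_current_weather", "arguments": "{\"location\":"},
    },
    {
        "index": 0,
        "id": "",
        "type": "function",
        "function": {"name": None, "arguments": " \"Hangzhou\"}"},
    },
]

assistant_message = build_assistant_message(deltas)
print(json.dumps(assistant_message))
```

Keying the merge on index keeps the fragments of different calls apart when the model streams several tool calls in one response.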
Tool calling with Responses API
The preceding examples use the OpenAI Chat Completions and DashScope APIs. If you use the OpenAI Responses API, the overall tool calling process is the same, but the interface format is different:
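| Dimension | Chat Completions | Responses API |
|---|---|---|
| Tool definition format | name, description, and parameters are nested under a "function" key | name, description, and parameters are at the top level of the tool object |
| Tool call output | response.choices[0].message.tool_calls | Items in response.output where type is function_call |
| Return tool result | {"role": "tool", "content": ..., "tool_call_id": ...} | {"type": "function_call_output", "call_id": ..., "output": ...} |
| Final reply | response.choices[0].message.content | response.output_text |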
from openai import OpenAI
import json
import os
import random
# Initialize client
client = OpenAI(
# If you have not set the environment variable, replace the line below with: api_key="sk-xxx",
# API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1",
)
# Simulate user question
USER_QUESTION = "What is the weather in Singapore?"
# Define tool list
tools = [
{
"type": "function",
"name": "get_current_weather",
"description": "Useful when you want to check the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City or county, such as Singapore or London.",
}
},
"required": ["location"],
},
}
]
# Simulate weather lookup tool
def get_current_weather(arguments):
    weather_conditions = ["sunny", "cloudy", "rainy"]
    random_weather = random.choice(weather_conditions)
    location = arguments["location"]
    return f"{location} is {random_weather} today."

# Wrap model response function
def get_response(input_data):
    response = client.responses.create(
        model="qwen3.6-plus",
        extra_body={"enable_thinking": False},
        input=input_data,
        tools=tools,
    )
    return response

# Maintain conversation context
conversation = [{"role": "user", "content": USER_QUESTION}]
response = get_response(conversation)
function_calls = [item for item in response.output if item.type == "function_call"]
# If no tool is needed, output content directly
if not function_calls:
    print(f"Final assistant reply: {response.output_text}")
else:
    # Enter tool calling loop
    while function_calls:
        for fc in function_calls:
            func_name = fc.name
            arguments = json.loads(fc.arguments)
            print(f"Calling tool [{func_name}] with arguments: {arguments}")
            # Execute tool
            tool_result = get_current_weather(arguments)
            print(f"Tool returned: {tool_result}")
            # Append tool call and result to context
            conversation.append(
                {
                    "type": "function_call",
                    "name": fc.name,
                    "arguments": fc.arguments,
                    "call_id": fc.call_id,
                }
            )
            conversation.append(
                {
                    "type": "function_call_output",
                    "call_id": fc.call_id,
                    "output": tool_result,
                }
            )
        # Call model again with full context
        response = get_response(conversation)
        function_calls = [
            item for item in response.output if item.type == "function_call"
        ]
    print(f"Final assistant reply: {response.output_text}")

import OpenAI from "openai";
// Initialize client
const openai = new OpenAI({
// API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
// If you have not set the environment variable, replace the line below with: apiKey: "sk-xxx",
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL:
"https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1",
});
// Define tool list
const tools = [
{
type: "function",
name: "get_current_weather",
description: "Useful when you want to check the weather in a specific city.",
parameters: {
type: "object",
properties: {
location: {
type: "string",
description: "City or county, such as Singapore or London.",
},
},
required: ["location"],
},
},
];
// Simulate weather lookup tool
const getCurrentWeather = (args) => {
const weatherConditions = ["sunny", "cloudy", "rainy"];
const randomWeather =
weatherConditions[Math.floor(Math.random() * weatherConditions.length)];
const location = args.location;
return `${location} is ${randomWeather} today.`;
};
// Wrap model response function
const getResponse = async (inputData) => {
const response = await openai.responses.create({
model: "qwen3.6-plus",
enable_thinking: false,
input: inputData,
tools: tools,
});
return response;
};
const main = async () => {
const userQuestion = "What is the weather in Singapore?";
// Maintain conversation context
const conversation = [{ role: "user", content: userQuestion }];
let response = await getResponse(conversation);
let functionCalls = response.output.filter(
(item) => item.type === "function_call"
);
// If no tool is needed, output content directly
if (functionCalls.length === 0) {
console.log(`Final assistant reply: ${response.output_text}`);
} else {
// Enter tool calling loop
while (functionCalls.length > 0) {
for (const fc of functionCalls) {
const funcName = fc.name;
const args = JSON.parse(fc.arguments);
console.log(`Calling tool [${funcName}] with arguments:`, args);
// Execute tool
const toolResult = getCurrentWeather(args);
console.log(`Tool returned: ${toolResult}`);
// Append tool call and result to context
conversation.push({
type: "function_call",
name: fc.name,
arguments: fc.arguments,
call_id: fc.call_id,
});
conversation.push({
type: "function_call_output",
call_id: fc.call_id,
output: toolResult,
});
}
// Call model again with full context
response = await getResponse(conversation);
functionCalls = response.output.filter(
(item) => item.type === "function_call"
);
}
console.log(`Final assistant reply: ${response.output_text}`);
}
};
// Start program
main().catch(console.error);
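If you already have tools and tool messages in the Chat Completions format, the format differences between the two APIs can be bridged with small converters. The following is a sketch based only on the formats used in the examples in this topic; `to_responses_tool` and `to_function_call_output` are hypothetical helper names.

```python
def to_responses_tool(chat_tool):
    """Flatten a Chat Completions tool definition into the Responses API shape."""
    function = chat_tool["function"]
    return {
        "type": "function",
        "name": function["name"],
        "description": function.get("description", ""),
        "parameters": function.get("parameters", {"type": "object", "properties": {}}),
    }


def to_function_call_output(tool_message):
    """Convert a Chat Completions tool message into a function_call_output item."""
    return {
        "type": "function_call_output",
        "call_id": tool_message["tool_call_id"],
        "output": tool_message["content"],
    }


chat_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Useful when you want to check the weather in a specific city.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}

responses_tool = to_responses_tool(chat_tool)
print(responses_tool)
print(to_function_call_output({"role": "tool", "content": "Singapore is sunny today.", "tool_call_id": "call_1"}))
```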
Tool calling with omni-modal models
Omni-modal models, including the Qwen-Omni series and the Qwen-Omni-Realtime series, support tool calling. The calling methods for these two series are different.
Qwen-Omni series
The Qwen3.5-Omni-Plus, Qwen3.5-Omni-Flash, and Qwen3-Omni-Flash series support tool calling through an OpenAI compatible interface. During the tool information retrieval phase, they differ from other models in the following ways:
-
Streaming output is required: Qwen-Omni supports only streaming output. Therefore, you must set stream=True when you retrieve tool information.
-
Text-only output is recommended: When you retrieve tool information, such as the function name and parameters, only text matters. To avoid unnecessary audio, you can set modalities=["text"]. If the output includes both text and audio, you can skip the audio chunks when you retrieve tool information.
See Non-real-time (Qwen-Omni).
from openai import OpenAI
import os
client = OpenAI(
# API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
# For Beijing region, use: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1
base_url="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1",
)
tools = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful when you want to check the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City or county, such as Beijing, Hangzhou, or Yuhang District.",
}
},
"required": ["location"],
},
},
},
]
completion = client.chat.completions.create(
model="qwen3.5-omni-plus",
messages=[{"role": "user", "content": "What is the weather in Hangzhou?"}],
# Set output modalities. Options: ["text"] or ["text","audio"]. Recommended: ["text"]
modalities=["text"],
# stream must be True, or it fails
stream=True,
tools=tools
)
for chunk in completion:
    # If output includes audio, change the condition to: if chunk.choices and not hasattr(chunk.choices[0].delta, "audio"):
    if chunk.choices:
        delta = chunk.choices[0].delta
        print(delta.tool_calls)

import { OpenAI } from "openai";
const openai = new OpenAI(
{
// API keys vary by region. To obtain an API key, visit: https://www.alibabacloud.com/help/en/model-studio/get-api-key
// If the environment variable is not configured, replace the following line with your Model Studio API key, for example: apiKey: "sk-xxx",
apiKey: process.env.DASHSCOPE_API_KEY,
// If you use a model in the China (Beijing) region, you must replace the baseURL with: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1
baseURL: "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1"
}
);
const tools = [
{
"type": "function",
"function": {
"name": "getCurrentWeather",
"description": "This is useful for querying the weather in a specified city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city or district, such as Beijing, Hangzhou, or Yuhang District."
}
},
"required": ["location"]
}
}
}
];
const stream = await openai.chat.completions.create({
model: "qwen3-omni-flash",
messages: [
{
"role": "user",
"content": "Weather in Hangzhou"
}],
stream: true,
// Set the modality for the output. Valid values are ["text"] and ["text", "audio"]. We recommend that you set this to ["text"].
modalities: ["text"],
tools:tools
});
for await (const chunk of stream) {
// If the output contains audio, replace the conditional statement with: if (chunk.choices?.length && chunk.choices[0].delta && !('audio' in chunk.choices[0].delta))
if (chunk.choices?.length){
const delta = chunk.choices[0].delta;
console.log(delta.tool_calls);
}
}
Output:
[ChoiceDeltaToolCall(index=0, id='call_391c8e5787bc4972a388aa', function=ChoiceDeltaToolCallFunction(arguments=None, name='get_current_weather'), type='function')]
[ChoiceDeltaToolCall(index=0, id='call_391c8e5787bc4972a388aa', function=ChoiceDeltaToolCallFunction(arguments=' {"location": "Hangzhou"}', name=None), type='function')]
None
To join the parameter parts (arguments), see the Streaming output section.
Qwen-Omni-Realtime series
The Qwen3.5-Omni-Plus-Realtime and Qwen3.5-Omni-Flash-Realtime series support tool calling for voice conversations. You can call them through the DashScope SDK or the native WebSocket protocol.
Workflow:
After you establish a WebSocket connection, you can pass the tool definitions through session.update to start the following interaction flow:
Phase 1: Speech input and tool calling
-
The user asks a question by voice. The client captures the audio and sends it to the server (corresponding to the append_audio() method). After the server's Voice Activity Detection (VAD) detects the end of speech, the server performs model inference and determines that a tool needs to be called.
-
The server returns the tool call information to the client (corresponding to the response.function_call_arguments.done event), including the function name (name), function parameters (arguments), and call ID (call_id). Example:
{
  "type": "response.function_call_arguments.done",
  "response_id": "resp_JnTOsWXlFhKcFohZbtfz6",
  "item_id": "item_Rhcms7CauTNsQprV5S4Hr",
  "output_index": 0,
  "name": "get_current_weather",
  "call_id": "call_2be200f4cafe419b9530dd",
  "arguments": "{\"location\": \"Hangzhou\"}"
}
-
The client executes the corresponding tool function locally based on the function name and parameters, and gets the result.
Phase 2: Client sends back tool results and triggers the final response
-
The client sends the tool execution result back to the server (corresponding to the conversation.item.create event), including the call ID (call_id) and execution result (output). Example:
{
  "type": "conversation.item.create",
  "item": {
    "type": "function_call_output",
    "call_id": "call_2be200f4cafe419b9530dd",
    "output": "The weather in Hangzhou today is sunny, with a temperature of 25°C and a light breeze."
  }
}
-
The client then sends a response.create event to trigger the server to generate the final voice response based on the tool execution result.
-
The client receives the voice and text returned by the server (corresponding to the response.audio.delta and response.audio_transcript.delta events) and plays the voice response to the user.
The Qwen-Omni-Realtime series does not support the tool_choice and parallel_tool_calls parameters.
See Qwen-Omni-Realtime, Client events, and Server events.
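The client side of this exchange can be sketched as a function that maps the server event to the two client events described above. This is a protocol illustration on plain dictionaries, not the SDK implementation: `handle_function_call_done` is a hypothetical name, only the event fields shown in the examples above are used, and sending the events over the WebSocket connection is left out.

```python
import json


def get_current_weather(location):
    """Stand-in weather tool; a real tool would query a weather service."""
    return f"The weather in {location} today is sunny."


def handle_function_call_done(event, tool_functions):
    """Map a response.function_call_arguments.done event to the client events to send back."""
    arguments = json.loads(event["arguments"])
    tool = tool_functions.get(event["name"])
    # Report an unknown tool as text so the model can still respond.
    output = tool(**arguments) if tool else f"Tool not found: {event['name']}"
    return [
        {
            "type": "conversation.item.create",
            "item": {
                "type": "function_call_output",
                "call_id": event["call_id"],
                "output": output,
            },
        },
        # Ask the server to generate the final response from the tool result.
        {"type": "response.create"},
    ]


server_event = {
    "type": "response.function_call_arguments.done",
    "name": "get_current_weather",
    "call_id": "call_2be200f4cafe419b9530dd",
    "arguments": "{\"location\": \"Hangzhou\"}",
}

client_events = handle_function_call_done(server_event, {"get_current_weather": get_current_weather})
for client_event in client_events:
    print(json.dumps(client_event))
```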
DashScope Python SDK
import os
import uuid
import threading
import traceback
import json
import base64
import signal
import sys
import time
from typing import Dict, Any, Optional, List
import pyaudio
import queue
import contextlib
import dashscope
from dashscope.audio.qwen_omni import *
# ==================== Constant definitions ====================
VOICE = 'Tina'
MODEL = "qwen3.5-omni-plus-realtime"
# For the Beijing region, replace WS_URL with: wss://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/api-ws/v1/realtime
WS_URL = "wss://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api-ws/v1/realtime"
# Configure the API key. If the environment variable is not set, replace the line below with dashscope.api_key = "sk-xxx"
dashscope.api_key = os.getenv('DASHSCOPE_API_KEY')
AUDIO_SAMPLE_RATE = 16000
AUDIO_CHUNK_SIZE = 3200
OUTPUT_AUDIO_SAMPLE_RATE = 24000
# ==================== Tool definitions ====================
def get_train_price(src: str, dst: str) -> str:
    """Query train ticket prices"""
    return f"The train ticket price from {src} to {dst} is 100-200 CNY."

def get_flight_price(src: str, dst: str) -> str:
    """Query flight ticket prices"""
    return f"The flight ticket price from {src} to {dst} is 200-300 USD."

def get_current_weather(location: str) -> str:
    """Query weather for a specified city"""
    return f"The weather in {location} today is hazy turning to sunny, with a temperature of 4/-4°C and a light breeze."

# Unified OpenAI-format tool definitions
TOOLS = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful when you want to check the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City or county, such as Beijing, Hangzhou, or Yuhang District.",
}
},
"required": ["location"],
},
},
},
{
"type": "function",
"function": {
"name": "get_flight_price",
"description": "Useful when you want to query flight ticket prices.",
"parameters": {
"type": "object",
"properties": {
"src": {
"type": "string",
"description": "The departure city of the flight, such as Beijing or Hangzhou.",
},
"dst": {
"type": "string",
"description": "The arrival city of the flight, such as Beijing or Hangzhou.",
},
},
"required": ["src", "dst"],
},
},
},
{
"type": "function",
"function": {
"name": "get_train_price",
"description": "Useful when you want to query train ticket prices.",
"parameters": {
"type": "object",
"properties": {
"src": {
"type": "string",
"description": "The departure city of the train, such as Beijing or Hangzhou.",
},
"dst": {
"type": "string",
"description": "The arrival city of the train, such as Beijing or Hangzhou.",
},
},
"required": ["src", "dst"],
},
},
},
]
# Mapping from tool names to functions
TOOL_FUNCTIONS = {
"get_current_weather": get_current_weather,
"get_flight_price": get_flight_price,
"get_train_price": get_train_price,
}
# ==================== Tool call handling ====================
def handle_tool_call(tool_call_response: Dict[str, Any]) -> Dict[str, Any]:
    """
    Handle tool call requests
    Args:
        tool_call_response: Tool call information including name, arguments, and call_id
    Returns:
        Updated tool call response including the output field
    """
    try:
        function_name = tool_call_response['name']
        tool_call_arguments = json.loads(tool_call_response['arguments'])
        print(f'[Tool Call] Start processing: name={function_name}, args={tool_call_arguments}')
        # Find the corresponding function
        if function_name not in TOOL_FUNCTIONS:
            tool_call_response['output'] = f"Client did not find the tool: {function_name}"
            print(f'[Tool Call] Error: Tool not found {function_name}')
            return tool_call_response
        # Call the function
        func = TOOL_FUNCTIONS[function_name]
        result = func(**tool_call_arguments)
        tool_call_response['output'] = result
        print(f'[Tool Call] Complete: {result}')
        return tool_call_response
    except Exception as e:
        error_msg = f"Tool call failed: {str(e)}"
        tool_call_response['output'] = error_msg
        print(f'[Tool Call] Exception: {error_msg}')
        traceback.print_exc()
        return tool_call_response

def send_tool_call_response(conversation: OmniRealtimeConversation, response: Dict[str, Any]) -> None:
    """Send tool call response to the server"""
    conversation.create_item({
        "id": 'item_' + uuid.uuid4().hex,
        "type": "function_call_output",
        "call_id": response['call_id'],
        "output": response["output"],
    })

# ==================== PCM audio player ====================
class PCMPlayer:
    """
    PCM audio player
    Uses a dual-thread architecture for real-time audio playback:
    - Decoder thread: Decodes base64-encoded audio data into raw PCM data
    - Player thread: Writes PCM data to the audio output device
    Supports dynamic addition of audio data, cancellation of playback, saving audio files, etc.
    """
    def __init__(self, pya: pyaudio.PyAudio, sample_rate=24000, chunk_size_ms=100, save_file=False):
        """
        Initialize the PCM player
        Args:
            pya: pyaudio.PyAudio instance
            sample_rate: Audio sampling rate (Hz), default 24000
            chunk_size_ms: Audio chunk size (milliseconds), affects playback cancellation latency, default 100ms
            save_file: Whether to save the played audio to a file (result.pcm), default False
        """
        self.pya = pya
        self.sample_rate = sample_rate
        self.chunk_size_bytes = chunk_size_ms * sample_rate * 2 // 1000
        self.player_stream = pya.open(format=pyaudio.paInt16,
                                      channels=1,
                                      rate=sample_rate,
                                      output=True)
        self.raw_audio_buffer: queue.Queue = queue.Queue()
        self.b64_audio_buffer: queue.Queue = queue.Queue()
        self.status_lock = threading.Lock()
        self.status = 'playing'
        # Set every attribute that the worker threads read before starting them
        self.complete_event: Optional[threading.Event] = None
        self.save_file = save_file
        if self.save_file:
            self.out_file = open('result.pcm', 'wb')
        self.decoder_thread = threading.Thread(target=self.decoder_loop)
        self.player_thread = threading.Thread(target=self.player_loop)
        self.decoder_thread.start()
        self.player_thread.start()

    def decoder_loop(self):
        """Decoder thread: Decodes base64 audio data into raw PCM data"""
        while self.status != 'stop':
            recv_audio_b64 = None
            with contextlib.suppress(queue.Empty):
                recv_audio_b64 = self.b64_audio_buffer.get(timeout=0.1)
            if recv_audio_b64 is None:
                continue
            recv_audio_raw = base64.b64decode(recv_audio_b64)
            # push raw audio data into queue by chunk
            for i in range(0, len(recv_audio_raw), self.chunk_size_bytes):
                chunk = recv_audio_raw[i:i + self.chunk_size_bytes]
                self.raw_audio_buffer.put(chunk)
                if self.save_file:
                    self.out_file.write(chunk)

    def player_loop(self):
        """Player thread: Writes PCM data to the audio output device"""
        while self.status != 'stop':
            recv_audio_raw = None
            with contextlib.suppress(queue.Empty):
                recv_audio_raw = self.raw_audio_buffer.get(timeout=0.1)
            if recv_audio_raw is None:
                if self.complete_event:
                    self.complete_event.set()
                continue
            # write chunk to pyaudio audio player, wait until finish playing this chunk.
            self.player_stream.write(recv_audio_raw)

    def cancel_playing(self):
        """Cancel playback: Clear all buffer queues"""
        self.b64_audio_buffer.queue.clear()
        self.raw_audio_buffer.queue.clear()

    def add_data(self, data):
        """Add base64-encoded audio data to the playback queue"""
        self.b64_audio_buffer.put(data)

    def wait_for_complete(self):
        """Wait for playback to complete"""
        self.complete_event = threading.Event()
        self.complete_event.wait()
        self.complete_event = None

    def shutdown(self):
        """Shut down the player and release resources"""
        self.status = 'stop'
        self.decoder_thread.join()
        self.player_thread.join()
        self.player_stream.close()
        if self.save_file:
            self.out_file.close()

# ==================== Audio manager ====================
class AudioManager:
"""Manages audio input and output resources"""
def __init__(self):
self.pya: Optional[pyaudio.PyAudio] = None
self.mic_stream: Optional[pyaudio.Stream] = None
self.player: Optional[PCMPlayer] = None
def initialize(self) -> None:
"""Initialize audio devices"""
print('Initializing audio devices...')
self.pya = pyaudio.PyAudio()
self.mic_stream = self.pya.open(
format=pyaudio.paInt16,
channels=1,
rate=AUDIO_SAMPLE_RATE,
input=True
)
self.player = PCMPlayer(self.pya, sample_rate=OUTPUT_AUDIO_SAMPLE_RATE)
print('Audio devices initialized')
def read_audio_chunk(self) -> Optional[bytes]:
"""Read an audio data chunk"""
if not self.mic_stream:
return None
try:
return self.mic_stream.read(AUDIO_CHUNK_SIZE, exception_on_overflow=False)
except Exception as e:
print(f'[Error] Failed to read audio data: {e}')
return None
def cleanup(self) -> None:
"""Clean up audio resources (safe to call more than once)"""
print('Cleaning up audio resources...')
if self.player:
self.player.shutdown()
self.player = None
if self.mic_stream:
self.mic_stream.close()
self.mic_stream = None
if self.pya:
self.pya.terminate()
self.pya = None
print('Audio resources cleaned up')
# ==================== Callback handler ====================
class OmniCallback(OmniRealtimeCallback):
"""Omni real-time conversation callback handler"""
def __init__(self, audio_manager: AudioManager):
self.audio_manager = audio_manager
self.tool_calls: Dict[str, Dict[str, Any]] = {}
self.all_response_text: str = ''
self.last_package_time: float = 0
self.is_first_text: bool = True
self.is_first_audio: bool = True
self.conversation: Optional[OmniRealtimeConversation] = None
def set_conversation(self, conversation: OmniRealtimeConversation) -> None:
"""Set the conversation instance reference"""
self.conversation = conversation
def on_open(self) -> None:
"""Callback on connection establishment"""
print('Connection established')
self.audio_manager.initialize()
self.last_package_time = time.time() * 1000
self.is_first_text = True
self.is_first_audio = True
self.tool_calls = {}
self.all_response_text = ''
def on_close(self, close_status_code: int, close_msg: str) -> None:
"""Callback on connection closure"""
print(f'Connection closed: code={close_status_code}, msg={close_msg}')
self.audio_manager.cleanup()
sys.exit(0)
def on_event(self, response: Dict[str, Any]) -> None:
"""Handle event callbacks"""
try:
event_type = response.get('type', '')
# Session created
if event_type == 'session.created':
print(f'Session started: {response["session"]["id"]}')
# Speech-to-text completed
elif event_type == 'conversation.item.input_audio_transcription.completed':
print(f'User question: {response.get("transcript", "")}')
# Incremental text response
elif event_type in ('response.audio_transcript.delta', 'response.text.delta'):
if self.is_first_text:
self.is_first_text = False
latency = time.time() * 1000 - self.last_package_time
print(f'Time to first token (VAD end): {latency:.0f} ms')
text = response.get('delta', '')
self.all_response_text += text
# Incremental audio response
elif event_type == 'response.audio.delta':
if self.is_first_audio:
self.is_first_audio = False
latency = time.time() * 1000 - self.last_package_time
print(f'Time to first audio chunk (VAD end): {latency:.0f} ms')
audio_interval = time.time() * 1000 - self.last_package_time
print(f'Audio interval: {audio_interval:.0f} ms')
self.last_package_time = time.time() * 1000
recv_audio_b64 = response.get('delta', '')
if self.audio_manager.player:
self.audio_manager.player.add_data(recv_audio_b64)
# VAD detected speech start
elif event_type == 'input_audio_buffer.speech_started':
print('====== VAD detected speech start ======')
if self.audio_manager.player:
self.audio_manager.player.cancel_playing()
# VAD detected speech end
elif event_type == 'input_audio_buffer.speech_stopped':
print('====== VAD detected speech end ======')
self.last_package_time = time.time() * 1000
self.is_first_text = True
self.is_first_audio = True
self.tool_calls = {}
# Function call arguments done
elif event_type == 'response.function_call_arguments.done':
print('====== Received tool call request ======')
call_id = response.get('call_id', '')
self.tool_calls[call_id] = response.copy()
self.tool_calls[call_id]['processed'] = False
# Response done
elif event_type == 'response.done':
print('====== Response done ======')
print(f'Full response: {self.all_response_text}')
if self.conversation:
response_id = self.conversation.get_last_response_id()
text_delay = self.conversation.get_last_first_text_delay()
audio_delay = self.conversation.get_last_first_audio_delay()
# Print detailed metrics only when all are available
if response_id is not None and text_delay is not None and audio_delay is not None:
print(f'[Metric] Response ID: {response_id}, '
f'Time to first token: {text_delay:.0f}ms, '
f'Time to first audio chunk: {audio_delay:.0f}ms')
else:
print('[Metric] Metric info not available (might be a response after a tool call)')
self.all_response_text = ''
except Exception as e:
print(f'[Error] Exception in event handling: {e}')
traceback.print_exc()
def process_pending_tool_calls(self) -> bool:
"""
Process pending tool calls
Returns:
Whether there are new tool calls to respond to
"""
has_pending = False
# Iterate over a snapshot: the callback thread may add tool calls while this loop runs
for call_id, tool_call in list(self.tool_calls.items()):
if not tool_call.get('processed', False):
has_pending = True
tool_call['processed'] = True
# Handle the tool call
result = handle_tool_call(tool_call)
# Send the result to the server
if self.conversation:
send_tool_call_response(self.conversation, result)
return has_pending
# ==================== Main program ====================
def main():
"""Main function"""
print('Initializing Omni real-time conversation...')
# Create audio manager
audio_manager = AudioManager()
# Create callback handler
callback = OmniCallback(audio_manager)
# Create conversation instance
conversation = OmniRealtimeConversation(
api_key=dashscope.api_key,
url=WS_URL,
model=MODEL,
callback=callback,
)
# Set conversation reference in callback
callback.set_conversation(conversation)
# Establish connection
conversation.connect()
# Configure session parameters
omni_output_modalities = [MultiModality.AUDIO, MultiModality.TEXT]
conversation.update_session(
output_modalities=omni_output_modalities,
voice=VOICE,
input_audio_format=AudioFormat.PCM_16000HZ_MONO_16BIT,
output_audio_format=AudioFormat.PCM_24000HZ_MONO_16BIT,
enable_input_audio_transcription=True,
enable_turn_detection=True,
turn_detection_type='server_vad',
tools=TOOLS,
)
# Set signal handler
def signal_handler(sig, frame):
print('\nCtrl+C received, stopping...')
conversation.close()
audio_manager.cleanup()
print('Omni real-time conversation stopped')
sys.exit(0)
signal.signal(signal.SIGINT, signal_handler)
print("Press Ctrl+C to stop the conversation...\n")
# Main loop: continuously send audio and check for tool calls
try:
while True:
# Process pending tool calls
has_tool_calls = callback.process_pending_tool_calls()
if has_tool_calls:
print("*** Tool call complete, creating new response ***")
conversation.create_response(
instructions=None,
output_modalities=omni_output_modalities
)
print('====== Tool call processing complete ======\n')
# Read and send audio data
audio_data = audio_manager.read_audio_chunk()
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode('ascii')
conversation.append_audio(audio_b64)
else:
break
except KeyboardInterrupt:
signal_handler(signal.SIGINT, None)
except Exception as e:
print(f'[Error] Main loop exception: {e}')
traceback.print_exc()
finally:
conversation.close()
audio_manager.cleanup()
if __name__ == '__main__':
main()
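The chunk-size arithmetic used by PCMPlayer can be checked in isolation. The following is a minimal sketch, assuming 16-bit mono PCM as in the example above. The helper names chunk_size_bytes and split_pcm are hypothetical and are not part of the DashScope SDK.

```python
def chunk_size_bytes(chunk_size_ms: int, sample_rate: int, sample_width: int = 2) -> int:
    """Return the number of bytes of mono PCM audio that cover chunk_size_ms."""
    # samples per millisecond * bytes per sample, using integer arithmetic
    return chunk_size_ms * sample_rate * sample_width // 1000


def split_pcm(raw: bytes, size: int) -> list:
    """Split raw PCM bytes into chunks of at most `size` bytes."""
    return [raw[i:i + size] for i in range(0, len(raw), size)]


if __name__ == "__main__":
    size = chunk_size_bytes(100, 24000)      # 100 ms at 24 kHz, 16-bit mono
    chunks = split_pcm(bytes(12000), size)   # 250 ms of silence
    print(size, [len(c) for c in chunks])    # prints: 4800 [4800, 4800, 2400]
```

The player thread blocks while it writes one chunk, so a smaller chunk_size_ms lets cancel_playing take effect sooner, at the cost of more queue operations.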
DashScope Java SDK
import com.alibaba.dashscope.audio.omni.*;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import javax.sound.sampled.*;
import java.nio.ByteBuffer;
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;
public class Main {
public static void main(String[] args) {
try {
// Initialize components
AudioPlayer audioPlayer = new AudioPlayer();
ToolRegistry toolRegistry = new ToolRegistry();
ConversationHandler handler = new ConversationHandler(audioPlayer, toolRegistry);
// Create and configure the session
OmniRealtimeParam param = OmniRealtimeParam.builder()
.model("qwen3.5-omni-plus-realtime")
.apikey(System.getenv("DASHSCOPE_API_KEY"))
// For the Beijing region, replace the URL with: wss://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/api-ws/v1/realtime
.url("wss://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api-ws/v1/realtime")
.build();
OmniRealtimeConversation conversation = new OmniRealtimeConversation(param, handler);
conversation.connect();
// Configure session parameters
configureSession(conversation, toolRegistry);
// Start audio capture
startAudioCapture(conversation, handler);
// Clean up resources
cleanup(conversation, audioPlayer);
} catch (NoApiKeyException e) {
System.err.println("API KEY not found: Please set the environment variable DASHSCOPE_API_KEY");
} catch (Exception e) {
e.printStackTrace();
}
}
private static void configureSession(OmniRealtimeConversation conversation, ToolRegistry toolRegistry) {
HashMap<String, Object> additionalConfig = new HashMap<>();
additionalConfig.put("tools", toolRegistry.buildToolsDefinition());
conversation.updateSession(OmniRealtimeConfig.builder()
.modalities(Arrays.asList(OmniRealtimeModality.AUDIO, OmniRealtimeModality.TEXT))
.voice("Tina")
.enableTurnDetection(true)
.enableInputAudioTranscription(true)
.parameters(additionalConfig)
.build());
System.out.println("Tool calling is enabled. Start speaking (Press Ctrl+C to exit)...");
}
private static void startAudioCapture(OmniRealtimeConversation conversation, ConversationHandler handler)
throws LineUnavailableException {
AudioFormat format = new AudioFormat(16000, 16, 1, true, false);
TargetDataLine mic = AudioSystem.getTargetDataLine(format);
mic.open(format);
mic.start();
ByteBuffer buffer = ByteBuffer.allocate(3200);
while (!handler.getShouldStop().get()) {
int bytesRead = mic.read(buffer.array(), 0, buffer.capacity());
if (bytesRead > 0) {
// Send only the bytes that were actually read, not the whole buffer
conversation.appendAudio(Base64.getEncoder().encodeToString(Arrays.copyOf(buffer.array(), bytesRead)));
// Check and process pending tool calls
if (handler.hasPendingToolCalls()) {
System.out.println("*** create response after call tools");
handler.processPendingToolCalls(conversation);
conversation.createResponse(null, Arrays.asList(OmniRealtimeModality.AUDIO, OmniRealtimeModality.TEXT));
System.out.println("======TOOL CALL END======");
}
}
try {
Thread.sleep(20);
} catch (InterruptedException ignored) {}
}
mic.close();
}
private static void cleanup(OmniRealtimeConversation conversation, AudioPlayer audioPlayer) {
try {
conversation.close(1000, "Normal exit");
audioPlayer.close();
} catch (Exception e) {
e.printStackTrace();
}
}
/**
* AudioPlayer - Responsible for sequential playback of audio data
*/
static class AudioPlayer {
private final SourceDataLine line;
private final Queue<byte[]> audioQueue = new ConcurrentLinkedQueue<>();
private final Thread playerThread;
private final AtomicBoolean shouldStop = new AtomicBoolean(false);
public AudioPlayer() throws LineUnavailableException {
AudioFormat format = new AudioFormat(24000, 16, 1, true, false);
line = AudioSystem.getSourceDataLine(format);
line.open(format);
line.start();
playerThread = new Thread(this::playLoop, "AudioPlayer");
playerThread.start();
}
private void playLoop() {
while (!shouldStop.get()) {
byte[] audio = audioQueue.poll();
if (audio != null) {
line.write(audio, 0, audio.length);
} else {
try {
Thread.sleep(10);
} catch (InterruptedException ignored) {}
}
}
}
public void play(String base64Audio) {
audioQueue.add(Base64.getDecoder().decode(base64Audio));
}
public void close() {
shouldStop.set(true);
try {
playerThread.join(1000);
} catch (InterruptedException ignored) {}
line.drain();
line.close();
}
}
/**
* ToolRegistry - Manages available tools and their implementations
*/
static class ToolRegistry {
private final Map<String, Function<JsonObject, String>> tools = new ConcurrentHashMap<>();
private final Map<String, JsonObject> pendingToolCalls = new ConcurrentHashMap<>();
public ToolRegistry() {
registerDefaultTools();
}
private void registerDefaultTools() {
registerTool("get_current_weather", this::getCurrentWeather);
registerTool("get_flight_price", this::getFlightPrice);
registerTool("get_train_price", this::getTrainPrice);
}
public void registerTool(String name, Function<JsonObject, String> handler) {
tools.put(name, handler);
}
/**
* Build tool definitions (OpenAI format)
*/
public List<Map<String, Object>> buildToolsDefinition() {
List<Map<String, Object>> definitions = new ArrayList<>();
definitions.add(createFunctionDefinition(
"get_current_weather",
"Useful when you want to check the weather in a specific city.",
createParamsSchema(
Collections.singletonMap("location",
createProperty("string", "City or county, such as Beijing, Hangzhou, or Yuhang District.")),
Collections.singletonList("location")
)
));
Map<String, Object> flightProps = new HashMap<>();
flightProps.put("src", createProperty("string", "The departure city of the flight, such as Beijing or Hangzhou."));
flightProps.put("dst", createProperty("string", "The arrival city of the flight, such as Beijing or Hangzhou."));
definitions.add(createFunctionDefinition(
"get_flight_price",
"Useful when you want to query flight ticket prices.",
createParamsSchema(flightProps, Arrays.asList("src", "dst"))
));
Map<String, Object> trainProps = new HashMap<>();
trainProps.put("src", createProperty("string", "The departure city of the train, such as Beijing or Hangzhou."));
trainProps.put("dst", createProperty("string", "The arrival city of the train, such as Beijing or Hangzhou."));
definitions.add(createFunctionDefinition(
"get_train_price",
"Useful when you want to query train ticket prices.",
createParamsSchema(trainProps, Arrays.asList("src", "dst"))
));
return definitions;
}
private Map<String, Object> createFunctionDefinition(String name, String description, Map<String, Object> parameters) {
Map<String, Object> function = new HashMap<>();
function.put("name", name);
function.put("description", description);
function.put("parameters", parameters);
Map<String, Object> tool = new HashMap<>();
tool.put("type", "function");
tool.put("function", function);
return tool;
}
private Map<String, Object> createParamsSchema(Map<String, Object> properties, List<String> required) {
Map<String, Object> schema = new HashMap<>();
schema.put("type", "object");
schema.put("properties", properties);
schema.put("required", required);
return schema;
}
private Map<String, Object> createProperty(String type, String description) {
Map<String, Object> prop = new HashMap<>();
prop.put("type", type);
prop.put("description", description);
return prop;
}
/**
* Add a tool call to the pending queue
*/
public void addPendingToolCall(String callId, JsonObject toolCall) {
pendingToolCalls.put(callId, toolCall);
}
/**
* Check if there are pending tool calls
*/
public boolean hasPendingToolCalls() {
return !pendingToolCalls.isEmpty();
}
/**
* Process all pending tool calls
*/
public void processPendingToolCalls(OmniRealtimeConversation conversation) {
if (pendingToolCalls.isEmpty()) {
return;
}
for (Map.Entry<String, JsonObject> entry : pendingToolCalls.entrySet()) {
String callId = entry.getKey();
JsonObject toolCall = entry.getValue();
String result = executeTool(toolCall);
sendToolResult(conversation, callId, result);
// Remove only the processed entry so that a call added concurrently is not dropped
pendingToolCalls.remove(callId);
}
}
private String executeTool(JsonObject toolCall) {
String functionName = toolCall.get("name").getAsString();
JsonObject arguments = new Gson().fromJson(
toolCall.get("arguments").getAsString(),
JsonObject.class
);
System.out.println("[Tool Call] start handling: " + functionName + ", args: " + arguments);
Function<JsonObject, String> handler = tools.get(functionName);
if (handler == null) {
return "Client did not find this tool, call failed.";
}
String result = handler.apply(arguments);
System.out.println("[Tool Call] response: " + result);
return result;
}
private void sendToolResult(OmniRealtimeConversation conversation, String callId, String output) {
JsonObject item = new JsonObject();
item.addProperty("id", "item_" + UUID.randomUUID().toString().replace("-", ""));
item.addProperty("type", "function_call_output");
item.addProperty("call_id", callId);
item.addProperty("output", output);
conversation.createItem(item);
}
// ===== Tool implementations =====
private String getCurrentWeather(JsonObject args) {
String location = args.get("location").getAsString();
return "The weather in " + location + " today is hazy turning to sunny, with a temperature of 4/-4°C and a light breeze.";
}
private String getFlightPrice(JsonObject args) {
String src = args.get("src").getAsString();
String dst = args.get("dst").getAsString();
return "The flight ticket price from " + src + " to " + dst + " is 200-300 USD.";
}
private String getTrainPrice(JsonObject args) {
String src = args.get("src").getAsString();
String dst = args.get("dst").getAsString();
return "The train ticket price from " + src + " to " + dst + " is 100-200 CNY.";
}
}
/**
* ConversationHandler - Handles WebSocket events
*/
static class ConversationHandler extends OmniRealtimeCallback {
private final AudioPlayer audioPlayer;
private final ToolRegistry toolRegistry;
private final AtomicBoolean shouldStop = new AtomicBoolean(false);
private final AtomicReference<StringBuilder> responseTextRef = new AtomicReference<>(new StringBuilder());
private long lastPackageTime = 0;
private boolean isFirstText = true;
private boolean isFirstAudio = true;
public ConversationHandler(AudioPlayer audioPlayer, ToolRegistry toolRegistry) {
this.audioPlayer = audioPlayer;
this.toolRegistry = toolRegistry;
}
public AtomicBoolean getShouldStop() {
return shouldStop;
}
@Override
public void onOpen() {
System.out.println("Connection established");
}
@Override
public void onClose(int code, String reason) {
System.out.println("Connection closed");
shouldStop.set(true);
}
@Override
public void onEvent(JsonObject message) {
String type = message.get("type").getAsString();
switch (type) {
case "session.created":
handleSessionCreated(message);
break;
case "conversation.item.input_audio_transcription.completed":
handleTranscriptionCompleted(message);
break;
case "response.audio_transcript.delta":
case "response.text.delta":
handleTextDelta(message);
break;
case "response.audio.delta":
handleAudioDelta(message);
break;
case "input_audio_buffer.speech_started":
handleSpeechStarted();
break;
case "input_audio_buffer.speech_stopped":
handleSpeechStopped();
break;
case "response.function_call_arguments.done":
handleFunctionCall(message);
break;
case "response.done":
handleResponseDone();
break;
default:
break;
}
}
private void handleSessionCreated(JsonObject message) {
String sessionId = message.get("session").getAsJsonObject().get("id").getAsString();
System.out.println("start session: " + sessionId);
}
private void handleTranscriptionCompleted(JsonObject message) {
System.out.println("question: " + message.get("transcript").getAsString());
}
private void handleTextDelta(JsonObject message) {
if (isFirstText) {
isFirstText = false;
System.out.println("first text latency from vad end: " +
(System.currentTimeMillis() - lastPackageTime) + " ms");
}
String text = message.get("delta").getAsString();
responseTextRef.get().append(text);
}
private void handleAudioDelta(JsonObject message) {
if (isFirstAudio) {
isFirstAudio = false;
System.out.println("first audio latency from vad end: " +
(System.currentTimeMillis() - lastPackageTime) + " ms");
}
System.out.println("audio interval: " + (System.currentTimeMillis() - lastPackageTime) + " ms");
lastPackageTime = System.currentTimeMillis();
audioPlayer.play(message.get("delta").getAsString());
}
private void handleSpeechStarted() {
System.out.println("======VAD Speech Start======");
}
private void handleSpeechStopped() {
System.out.println("======VAD Speech End======");
lastPackageTime = System.currentTimeMillis();
isFirstText = true;
isFirstAudio = true;
}
private void handleFunctionCall(JsonObject message) {
System.out.println("======TOOL CALL======");
String callId = message.get("call_id").getAsString();
toolRegistry.addPendingToolCall(callId, message);
}
private void handleResponseDone() {
System.out.println("======RESPONSE DONE======");
System.out.println("all response text: " + responseTextRef.get());
responseTextRef.set(new StringBuilder());
}
/**
* Check if there are pending tool calls
*/
public boolean hasPendingToolCalls() {
return toolRegistry.hasPendingToolCalls();
}
/**
* Process all pending tool calls
*/
public void processPendingToolCalls(OmniRealtimeConversation conversation) {
toolRegistry.processPendingToolCalls(conversation);
}
}
}
WebSocket (Python)
import asyncio
import json
import base64
import os
import pyaudio
import websockets
# ==================== Constant definitions ====================
API_KEY = os.getenv("DASHSCOPE_API_KEY")
# For the Beijing region, replace with:
# wss://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/api-ws/v1/realtime
URL = "wss://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api-ws/v1/realtime"
MODEL = "qwen3.5-omni-plus-realtime"
VOICE = "Ethan"
# ==================== Tool definitions ====================
def get_current_weather(location):
"""Query weather for a specified city"""
return f"The weather in {location} today is hazy turning to sunny, with a temperature of 4/-4°C and a light breeze."
def get_flight_price(src, dst):
"""Query flight ticket prices"""
return f"The flight ticket price from {src} to {dst} is 200-300 USD."
def get_train_price(src, dst):
"""Query train ticket prices"""
return f"The train ticket price from {src} to {dst} is 100-200 CNY."
# Mapping from tool names to functions
TOOL_FUNCTIONS = {
"get_current_weather": get_current_weather,
"get_flight_price": get_flight_price,
"get_train_price": get_train_price,
}
TOOLS = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful when you want to check the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City or county, such as Beijing, Hangzhou, or Yuhang District.",
}
},
"required": ["location"],
},
},
},
{
"type": "function",
"function": {
"name": "get_flight_price",
"description": "Useful when you want to query flight ticket prices.",
"parameters": {
"type": "object",
"properties": {
"src": {
"type": "string",
"description": "The departure city of the flight, such as Beijing or Hangzhou.",
},
"dst": {
"type": "string",
"description": "The arrival city of the flight, such as Beijing or Hangzhou.",
},
},
"required": ["src", "dst"],
},
},
},
{
"type": "function",
"function": {
"name": "get_train_price",
"description": "Useful when you want to query train ticket prices.",
"parameters": {
"type": "object",
"properties": {
"src": {
"type": "string",
"description": "The departure city of the train, such as Beijing or Hangzhou.",
},
"dst": {
"type": "string",
"description": "The arrival city of the train, such as Beijing or Hangzhou.",
},
},
"required": ["src", "dst"],
},
},
},
]
# ==================== Tool call handling ====================
def handle_tool_call(name, arguments_str):
"""
Handle tool call requests
Args:
name: Tool function name
arguments_str: JSON-formatted parameter string
Returns:
Tool execution result string
"""
try:
arguments = json.loads(arguments_str)
print(f'[Tool Call] Start processing: name={name}, args={arguments}')
func = TOOL_FUNCTIONS.get(name)
if func is None:
result = f"Client did not find the tool: {name}"
print(f'[Tool Call] Error: {result}')
return result
result = func(**arguments)
print(f'[Tool Call] Complete: {result}')
return result
except Exception as e:
error_msg = f"Tool call failed: {str(e)}"
print(f'[Tool Call] Exception: {error_msg}')
return error_msg
# ==================== Main program ====================
async def main():
"""Main function: Establish WebSocket connection and conduct voice conversation"""
pya = pyaudio.PyAudio()
speaker = pya.open(format=pyaudio.paInt16, channels=1, rate=24000, output=True)
# Establish WebSocket connection
headers = {
"Authorization": f"bearer {API_KEY}",
"X-DashScope-OmniRealtime": "true",
}
async with websockets.connect(
f"{URL}?model={MODEL}", additional_headers=headers,
) as ws:
await ws.recv()
# Configure session parameters
await ws.send(json.dumps({
"type": "session.update",
"session": {
"modalities": ["text", "audio"],
"voice": VOICE,
"input_audio_format": "pcm16",
"output_audio_format": "pcm16",
"instructions": "You are a personal assistant named Xiaoyun",
"turn_detection": {"type": "server_vad"},
"input_audio_transcription": {"model": "qwen3-asr-flash-realtime"},
"tools": TOOLS,
},
}))
await ws.recv()
# Audio capture coroutine
async def send_audio():
mic = pya.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True)
try:
while True:
data = mic.read(3200, exception_on_overflow=False)
await ws.send(json.dumps({
"type": "input_audio_buffer.append",
"audio": base64.b64encode(data).decode(),
}))
await asyncio.sleep(0.01)
finally:
# Always release the microphone, and let cancellation propagate normally
mic.close()
pending = {}
all_response_text = ""
send_task = asyncio.create_task(send_audio())
print("Tool calling enabled. Speak into the microphone (Ctrl+C to exit)...")
# Event handling loop
async for raw in ws:
msg = json.loads(raw)
t = msg["type"]
# Session created
if t == "session.created":
print(f"Session started: {msg['session']['id']}")
# Play audio
elif t == "response.audio.delta":
speaker.write(base64.b64decode(msg["delta"]))
# Incremental text response
elif t in ("response.audio_transcript.delta", "response.text.delta"):
all_response_text += msg.get("delta", "")
# User speech-to-text
elif t == "conversation.item.input_audio_transcription.completed":
print(f"[User] {msg['transcript']}")
# VAD detected speech start
elif t == "input_audio_buffer.speech_started":
print("====== VAD detected speech start ======")
# VAD detected speech end
elif t == "input_audio_buffer.speech_stopped":
print("====== VAD detected speech end ======")
# Received tool call request
elif t == "response.function_call_arguments.done":
print("====== Received tool call request ======")
pending[msg["call_id"]] = {
"name": msg["name"],
"arguments": msg["arguments"],
}
# Response done
elif t == "response.done":
if pending:
# Execute pending tool calls
for cid, info in pending.items():
result = handle_tool_call(info["name"], info["arguments"])
# Send tool execution result
await ws.send(json.dumps({
"type": "conversation.item.create",
"item": {
"type": "function_call_output",
"call_id": cid,
"output": result,
},
}))
pending.clear()
# Trigger the server to continue generating the response
await ws.send(json.dumps({
"type": "response.create",
"response": {"modalities": ["text", "audio"]},
}))
print("====== Tool call processing complete ======")
else:
# Normal response complete, print full response
if all_response_text:
print(f"[Model] {all_response_text}")
all_response_text = ""
send_task.cancel()
speaker.close()
pya.terminate()
asyncio.run(main())
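The two client events that the loop above sends after a tool call can be built by small pure functions, which makes them easy to test without a live connection. This is a minimal sketch: the event shapes are copied from the example above, while the helper names (build_tool_output_event, build_response_create_event, build_tool_reply) and the run_tool callback are hypothetical.

```python
import json


def build_tool_output_event(call_id: str, output: str) -> str:
    """Serialize a conversation.item.create event that carries one tool result."""
    return json.dumps({
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": call_id,
            "output": output,
        },
    })


def build_response_create_event(modalities=("text", "audio")) -> str:
    """Serialize the response.create event that asks the server to continue."""
    return json.dumps({
        "type": "response.create",
        "response": {"modalities": list(modalities)},
    })


def build_tool_reply(pending: dict, run_tool) -> list:
    """Return the events to send for all pending calls, ending with response.create.

    pending maps call_id -> {"name": ..., "arguments": ...};
    run_tool(name, arguments_str) returns the tool output as a string.
    """
    events = [
        build_tool_output_event(call_id, run_tool(info["name"], info["arguments"]))
        for call_id, info in pending.items()
    ]
    if events:
        # Ask for a new response only once, after every tool output is queued
        events.append(build_response_create_event())
    return events
```

In the event loop, each returned string could be passed to ws.send in order, with handle_tool_call supplied as run_tool.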
Tool calling with deep thinking models
Deep thinking models reason before they output tool calls, which improves the explainability and reliability of decisions.
-
Reasoning process
The model analyzes the user's intent, identifies the required tools, checks that the parameters are valid, and plans the call strategy.
-
Tool calling
The model outputs one or more function calls in a structured format.
Parallel tool calling is supported.
Streaming example for deep thinking models:
For text-generation deep thinking models, see Deep thinking. For multimodal deep thinking models, see Image and video understanding and Non-real-time (Qwen-Omni).
The tool_choice parameter supports only "auto" (default; the model chooses freely) or "none" (disables all tools).
OpenAI compatible
Python
Example code
import os
from openai import OpenAI
# Initialize OpenAI client with Alibaba Cloud DashScope service
client = OpenAI(
# API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
# If you have not set the environment variable, replace the line below with: api_key="sk-xxx",
api_key=os.getenv("DASHSCOPE_API_KEY"), # Read API key from environment variable
base_url="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1",
)
# Define available tools
tools = [
# Tool 1: Get current time
{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Useful when you want to know the current time.",
"parameters": {} # No parameters needed
}
},
# Tool 2: Get weather for a specific city
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful when you want to check the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City or county, such as Beijing, Hangzhou, or Yuhang District."
}
},
"required": ["location"] # Required parameter
}
}
}
]
messages = [{"role": "user", "content": input("Enter your question: ")}]
# Example message for multimodal models
# messages = [{
# "role": "user",
# "content": [
# {"type": "image_url","image_url": {"url": "https://img.alicdn.com/imgextra/i4/O1CN014CJhzi20NOzo7atOC_!!6000000006837-2-tps-2048-1365.png"}},
# {"type": "text", "text": "Based on the location in the image, what is the current weather?"}]
# }]
completion = client.chat.completions.create(
# This example uses qwen3.6-plus. You can replace it with another deep thinking model.
model="qwen3.6-plus",
messages=messages,
extra_body={
# Enable deep thinking. This parameter has no effect on qwen3-30b-a3b-thinking-2507, qwen3-235b-a22b-thinking-2507, or QwQ models.
"enable_thinking": True
},
tools=tools,
parallel_tool_calls=True,
stream=True,
# Uncomment to get token usage info
# stream_options={
# "include_usage": True
# }
)
reasoning_content = "" # Full reasoning process
answer_content = "" # Full reply
tool_info = [] # Store tool call info
is_answering = False # Flag to track if reasoning ended and reply started
print("="*20+"Reasoning process"+"="*20)
for chunk in completion:
    if not chunk.choices:
        # Handle usage stats
        print("\n"+"="*20+"Usage"+"="*20)
        print(chunk.usage)
    else:
        delta = chunk.choices[0].delta
        # Process AI's reasoning (chain-of-thought)
        if hasattr(delta, 'reasoning_content') and delta.reasoning_content is not None:
            reasoning_content += delta.reasoning_content
            print(delta.reasoning_content,end="",flush=True) # Print reasoning in real time
        # Process final reply content
        else:
            if not is_answering: # Print title on first reply chunk
                is_answering = True
                print("\n"+"="*20+"Reply content"+"="*20)
            if delta.content is not None:
                answer_content += delta.content
                print(delta.content,end="",flush=True) # Stream reply content
            # Process tool call info (supports parallel tool calls)
            if delta.tool_calls is not None:
                for tool_call in delta.tool_calls:
                    index = tool_call.index # Tool call index for parallel calls
                    # Dynamically expand tool info list
                    while len(tool_info) <= index:
                        tool_info.append({})
                    # Collect tool call ID (for later function routing)
                    if tool_call.id:
                        tool_info[index]['id'] = tool_info[index].get('id', '') + tool_call.id
                    # Collect function name (for later routing)
                    if tool_call.function and tool_call.function.name:
                        tool_info[index]['name'] = tool_info[index].get('name', '') + tool_call.function.name
                    # Collect function arguments (JSON string, needs parsing later)
                    if tool_call.function and tool_call.function.arguments:
                        tool_info[index]['arguments'] = tool_info[index].get('arguments', '') + tool_call.function.arguments
print("\n"+"="*19+"Tool call info"+"="*19)
if not tool_info:
    print("No tool calls")
else:
    print(tool_info)
Response
If you input "Weather in the four municipalities", the following result is returned:
====================Reasoning process====================
The user asked about the weather in the four municipalities. First, I need to identify which cities those are. According to China's administrative divisions, the municipalities are Beijing, Shanghai, Tianjin, and Chongqing. So the user wants the weather for these four cities.
Next, I need to check the available tools. The provided tool is get_current_weather, with a location parameter of type string. Each city requires a separate query because the function can only handle one location at a time. Therefore, I need to call this function once for each municipality.
Then, I need to consider how to generate correct tool calls. Each call should include the city name as the parameter. For example, the first call is for Beijing, the second for Shanghai, and so on. Ensure the parameter name is location and the value is the correct city name.
Also, the user likely wants the weather for each city, so I need to ensure each function call is correct. Maybe I need to make four calls, one for each city. But according to the tool usage rules, I may need to process them step-by-step or generate multiple calls at once. Based on the example, I might need to make multiple calls.
Finally, confirm other factors, like whether parameters are correct, city names are accurate, and error handling is needed—for example, if a city doesn't exist or the API is unavailable. But the four municipalities are clear, so it should be fine.
====================Reply content====================
===================Tool call info===================
[{'id': 'call_767af2834c12488a8fe6e3', 'name': 'get_current_weather', 'arguments': '{"location": "Beijing"}'}, {'id': 'call_2cb05a349c89437a947ada', 'name': 'get_current_weather', 'arguments': '{"location": "Shanghai"}'}, {'id': 'call_988dd180b2ca4b0a864ea7', 'name': 'get_current_weather', 'arguments': '{"location": "Tianjin"}'}, {'id': 'call_4e98c57ea96a40dba26d12', 'name': 'get_current_weather', 'arguments': '{"location": "Chongqing"}'}]
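The collected entries still hold each call's arguments as a JSON string. The following is a minimal sketch of the next step, parsing the arguments and routing every call to a local function. The function bodies, the TOOL_REGISTRY name, and the shape of the returned tool messages are assumptions for illustration (the message shape follows the OpenAI-compatible convention); adapt them to your own tools.

```python
import json
from datetime import datetime

# Hypothetical local implementations; the names match the tools declared in the request.
def get_current_weather(location: str) -> str:
    return f"{location} is sunny today."

def get_current_time() -> str:
    return datetime.now().strftime("Current time: %Y-%m-%d %H:%M:%S.")

# Hypothetical registry that maps tool names to callables.
TOOL_REGISTRY = {
    "get_current_weather": get_current_weather,
    "get_current_time": get_current_time,
}

def run_tool_calls(tool_info):
    """Parse each accumulated call and return one 'tool' message per call."""
    tool_messages = []
    for call in tool_info:
        func = TOOL_REGISTRY.get(call.get("name"))
        if func is None:
            output = f"Unknown tool: {call.get('name')}"
        else:
            try:
                # Arguments arrive as a JSON string; an empty string means no arguments.
                args = json.loads(call.get("arguments") or "{}")
                output = func(**args)
            except (json.JSONDecodeError, TypeError) as exc:
                output = f"Invalid arguments: {exc}"
        tool_messages.append(
            {"role": "tool", "tool_call_id": call.get("id"), "content": output}
        )
    return tool_messages

sample = [
    {"id": "call_1", "name": "get_current_weather", "arguments": '{"location": "Beijing"}'},
    {"id": "call_2", "name": "get_current_weather", "arguments": '{"location": "Shanghai"}'},
]
for message in run_tool_calls(sample):
    print(message)
```

Returning an error string instead of raising lets the model see what went wrong when the messages are sent back in the second call.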
Node.js
Example code
import OpenAI from "openai";
import readline from 'node:readline/promises';
import { stdin as input, stdout as output } from 'node:process';
const openai = new OpenAI({
// API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1"
});
const tools = [
{
type: "function",
function: {
name: "get_current_time",
description: "Useful when you want to know the current time.",
parameters: {}
}
},
{
type: "function",
function: {
name: "get_current_weather",
description: "Useful when you want to check the weather in a specific city.",
parameters: {
type: "object",
properties: {
location: {
type: "string",
description: "City or county, such as Beijing, Hangzhou, or Yuhang District."
}
},
required: ["location"]
}
}
}
];
async function main() {
  const rl = readline.createInterface({ input, output });
  const question = await rl.question("Enter your question: ");
  rl.close();
  const messages = [{ role: "user", content: question }];
  // Example message for multimodal models
  // const messages= [{
  //   role: "user",
  //   content: [{type: "image_url", image_url: {url: "https://img.alicdn.com/imgextra/i2/O1CN01FbTJon1ErXVGMRdsN_!!6000000000405-0-tps-1024-683.jpg"}},
  //             {type: "text", text: "What is the weather at the location in the image?"}]
  // }];
  let reasoningContent = "";
  let answerContent = "";
  const toolInfo = [];
  let isAnswering = false;
  console.log("=".repeat(20) + "Reasoning process" + "=".repeat(20));
  try {
    const stream = await openai.chat.completions.create({
      // This example uses qwen3.6-plus. Replace with another deep thinking model.
      model: "qwen3.6-plus",
      messages,
      // Enable deep thinking. This parameter has no effect on qwen3-30b-a3b-thinking-2507, qwen3-235b-a22b-thinking-2507, or QwQ models.
      enable_thinking: true,
      tools,
      stream: true,
      parallel_tool_calls: true
    });
    for await (const chunk of stream) {
      if (!chunk.choices?.length) {
        console.log("\n" + "=".repeat(20) + "Usage" + "=".repeat(20));
        console.log(chunk.usage);
        continue;
      }
      const delta = chunk.choices[0]?.delta;
      if (!delta) continue;
      // Process reasoning
      if (delta.reasoning_content) {
        reasoningContent += delta.reasoning_content;
        process.stdout.write(delta.reasoning_content);
      }
      // Process reply content
      else {
        if (!isAnswering) {
          isAnswering = true;
          console.log("\n" + "=".repeat(20) + "Reply content" + "=".repeat(20));
        }
        if (delta.content) {
          answerContent += delta.content;
          process.stdout.write(delta.content);
        }
        // Process tool calls
        if (delta.tool_calls) {
          for (const toolCall of delta.tool_calls) {
            const index = toolCall.index;
            // Ensure array length is sufficient
            while (toolInfo.length <= index) {
              toolInfo.push({});
            }
            // Update tool ID
            if (toolCall.id) {
              toolInfo[index].id = (toolInfo[index].id || "") + toolCall.id;
            }
            // Update function name
            if (toolCall.function?.name) {
              toolInfo[index].name = (toolInfo[index].name || "") + toolCall.function.name;
            }
            // Update parameters
            if (toolCall.function?.arguments) {
              toolInfo[index].arguments = (toolInfo[index].arguments || "") + toolCall.function.arguments;
            }
          }
        }
      }
    }
    console.log("\n" + "=".repeat(19) + "Tool call info" + "=".repeat(19));
    console.log(toolInfo.length ? toolInfo : "No tool calls");
  } catch (error) {
    console.error("Error:", error);
  }
}
main();
Response
If you input "Weather in the four municipalities", the following result is returned:
Enter your question: Weather in the four municipalities
====================Reasoning process====================
The user asked about the weather in the four municipalities. First, I need to identify which cities those are. Beijing, Shanghai, Tianjin, and Chongqing, right? Next, I need to call the weather lookup function for each city.
But the user's question may require me to get the weather for each city separately. Each city needs its own get_current_weather call, with the city name as the parameter. I need to ensure the parameters are correct, like using full names such as "Beijing Municipality", "Shanghai Municipality", "Tianjin Municipality", and "Chongqing Municipality".
Then, I need to generate four tool calls, one for each municipality. Check that each parameter is correct and that there are no errors. For example, make sure the city names are spelled correctly. Since the four municipalities are clear, it should be fine.
====================Reply content====================
===================Tool call info===================
[
{
id: 'call_21dc802e717f491298d1b2',
name: 'get_current_weather',
arguments: '{"location": "Beijing"}'
},
{
id: 'call_2cd3be1d2f694c4eafd4e5',
name: 'get_current_weather',
arguments: '{"location": "Shanghai"}'
},
{
id: 'call_48cf3f78e02940bd9085e4',
name: 'get_current_weather',
arguments: '{"location": "Tianjin"}'
},
{
id: 'call_e230a2b4c64f4e658d223e',
name: 'get_current_weather',
arguments: '{"location": "Chongqing"}'
}
]
HTTP
Example code
curl
curl -X POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3.6-plus",
"messages": [
{
"role": "user",
"content": "How is the weather in Hangzhou?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Useful when you want to know the current time.",
"parameters": {}
}
},
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful when you want to check the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location":{
"type": "string",
"description": "City or county, such as Beijing, Hangzhou, or Yuhang District."
}
},
"required": ["location"]
}
}
}
],
"enable_thinking": true,
"stream": true
}'
DashScope
Python
Example code
import dashscope
from dashscope import MultiModalConversation
# For Beijing region, use: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1/"
tools = [
# Tool 1: Get current time
{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Useful when you want to know the current time.",
"parameters": {} # No input needed, so parameters is empty
}
},
# Tool 2: Get weather for a specific city
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful when you want to check the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
# Weather lookup needs a location, so set parameter to location
"location": {
"type": "string",
"description": "City or county, such as Beijing, Hangzhou, or Yuhang District."
}
},
"required": ["location"]
}
}
}
]
# Define question
messages = [{"role": "user", "content": [{"text": input("Enter your question: ")}]}]
# Example message for multimodal models
# messages = [
# {
# "role": "user",
# "content": [
# {"image": "https://img.alicdn.com/imgextra/i2/O1CN01FbTJon1ErXVGMRdsN_!!6000000000405-0-tps-1024-683.jpg"},
# {"text": "What is the weather at the location in the image?"}]
# }]
completion = MultiModalConversation.call(
# This example uses qwen3.6-plus. Replace with another deep thinking model.
model="qwen3.6-plus",
messages=messages,
enable_thinking=True,
tools=tools,
parallel_tool_calls=True,
stream=True,
incremental_output=True,
result_format="message"
)
reasoning_content = ""
answer_content = ""
tool_info = []
is_answering = False
print("="*20+"Reasoning process"+"="*20)
for chunk in completion:
    if chunk.status_code == 200:
        msg = chunk.output.choices[0].message
        # Process reasoning
        if 'reasoning_content' in msg and msg.reasoning_content:
            reasoning_content += msg.reasoning_content
            print(msg.reasoning_content, end="", flush=True)
        # Process reply content
        if 'content' in msg and msg.content:
            if not is_answering:
                is_answering = True
                print("\n"+"="*20+"Reply content"+"="*20)
            answer_content += msg.content
            print(msg.content, end="", flush=True)
        # Process tool calls
        if 'tool_calls' in msg and msg.tool_calls:
            for tool_call in msg.tool_calls:
                index = tool_call['index']
                while len(tool_info) <= index:
                    tool_info.append({'id': '', 'name': '', 'arguments': ''}) # Initialize all fields
                # Incrementally update tool ID
                if 'id' in tool_call:
                    tool_info[index]['id'] += tool_call.get('id', '')
                # Incrementally update function info
                if 'function' in tool_call:
                    func = tool_call['function']
                    # Incrementally update function name
                    if 'name' in func:
                        tool_info[index]['name'] += func.get('name', '')
                    # Incrementally update parameters
                    if 'arguments' in func:
                        tool_info[index]['arguments'] += func.get('arguments', '')
print("\n"+"="*19+"Tool call info"+"="*19)
if not tool_info:
    print("No tool calls")
else:
    print(tool_info)
Response
If you enter "Weather in the four municipalities", the following response is returned:
Enter your question: Weather in the four municipalities
====================Reasoning process====================
The user asked about the weather in the four municipalities. First, I need to confirm which cities those are. Beijing, Shanghai, Tianjin, and Chongqing, right? Next, the user needs the weather for each city, so I need to call the weather lookup function.
But the problem is that the user did not specify exact city names. I need to clarify the four municipalities: Beijing, Shanghai, Tianjin, and Chongqing. Then, I need to call the get_current_weather function for each city, passing the city name as the parameter. For example, the first call is for Beijing, the second for Shanghai, the third for Tianjin, and the fourth for Chongqing.
I need to ensure each call's parameters are correct and that no city is missed. Finally, I need to generate four independent function calls, one for each municipality. That way, the user gets the weather for all four cities.
===================Tool call info===================
[{'id': 'call_2f774ed97b0e4b24ab10ec', 'name': 'get_current_weather', 'arguments': '{"location": "Beijing"}'}, {'id': 'call_dc3b05b88baa48c58bc33a', 'name': 'get_current_weather', 'arguments': '{"location": "Shanghai"}'}, {'id': 'call_249b2de2f73340cdb46cbc', 'name': 'get_current_weather', 'arguments': '{"location": "Tianjin"}'}, {'id': 'call_833333634fda49d1b39e87', 'name': 'get_current_weather', 'arguments': '{"location": "Chongqing"}'}]
Java
Example code
// dashscope SDK version >= 2.19.4
import java.util.Arrays;
import com.alibaba.dashscope.exception.UploadFileException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.Constants;
import com.alibaba.dashscope.utils.JsonUtils;
import com.alibaba.dashscope.tools.ToolFunction;
import com.alibaba.dashscope.tools.FunctionDefinition;
import io.reactivex.Flowable;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.lang.System;
import com.github.victools.jsonschema.generator.Option;
import com.github.victools.jsonschema.generator.OptionPreset;
import com.github.victools.jsonschema.generator.SchemaGenerator;
import com.github.victools.jsonschema.generator.SchemaGeneratorConfig;
import com.github.victools.jsonschema.generator.SchemaGeneratorConfigBuilder;
import com.github.victools.jsonschema.generator.SchemaVersion;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Collections;
public class Main {
private static final Logger logger = LoggerFactory.getLogger(Main.class);
private static ObjectNode jsonSchemaWeather;
private static ObjectNode jsonSchemaTime;
static {Constants.baseHttpApiUrl="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1";}
static class TimeTool {
public String call() {
LocalDateTime now = LocalDateTime.now();
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
return "Current time: " + now.format(formatter) + ".";
}
}
static class WeatherTool {
private String location;
public WeatherTool(String location) {
this.location = location;
}
public String call() {
return location + " is sunny today.";
}
}
static {
SchemaGeneratorConfigBuilder configBuilder = new SchemaGeneratorConfigBuilder(
SchemaVersion.DRAFT_2020_12, OptionPreset.PLAIN_JSON);
SchemaGeneratorConfig config = configBuilder
.with(Option.EXTRA_OPEN_API_FORMAT_VALUES)
.without(Option.FLATTENED_ENUMS_FROM_TOSTRING)
.build();
SchemaGenerator generator = new SchemaGenerator(config);
jsonSchemaWeather = generator.generateSchema(WeatherTool.class);
jsonSchemaTime = generator.generateSchema(TimeTool.class);
}
private static void handleGenerationResult(GenerationResult message) {
System.out.println(JsonUtils.toJson(message));
}
// Create tool calling method for text-generation models
public static void streamCallWithMessage(Generation gen, Message userMsg)
throws NoApiKeyException, ApiException, InputRequiredException {
GenerationParam param = buildGenerationParam(userMsg);
Flowable<GenerationResult> result = gen.streamCall(param);
result.blockingForEach(message -> handleGenerationResult(message));
}
// Build parameters for text-generation models with tool calling support
private static GenerationParam buildGenerationParam(Message userMsg) {
FunctionDefinition fdWeather = buildFunctionDefinition(
"get_current_weather", "Get weather for a specified location", jsonSchemaWeather);
FunctionDefinition fdTime = buildFunctionDefinition(
"get_current_time", "Get current time", jsonSchemaTime);
return GenerationParam.builder()
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("qwen3.6-plus")
.enableThinking(true)
.messages(Arrays.asList(userMsg))
.resultFormat(GenerationParam.ResultFormat.MESSAGE)
.incrementalOutput(true)
.tools(Arrays.asList(
ToolFunction.builder().function(fdWeather).build(),
ToolFunction.builder().function(fdTime).build()))
.build();
}
// Create tool calling method for multimodal models
public static void streamCallWithMultiModalMessage(MultiModalConversation conv, MultiModalMessage userMsg)
throws NoApiKeyException, ApiException, UploadFileException {
MultiModalConversationParam param = buildMultiModalConversationParam(userMsg);
Flowable<MultiModalConversationResult> result = conv.streamCall(param);
result.blockingForEach(message -> System.out.println(JsonUtils.toJson(message)));
}
// Build parameters for multimodal models with tool calling support
private static MultiModalConversationParam buildMultiModalConversationParam(MultiModalMessage userMsg) {
FunctionDefinition fdWeather = buildFunctionDefinition(
"get_current_weather", "Get weather for a specified location", jsonSchemaWeather);
FunctionDefinition fdTime = buildFunctionDefinition(
"get_current_time", "Get current time", jsonSchemaTime);
return MultiModalConversationParam.builder()
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("qwen3-vl-plus") // Use multimodal model Qwen3-VL
.enableThinking(true)
.messages(Arrays.asList(userMsg))
.tools(Arrays.asList( // Configure tool list
ToolFunction.builder().function(fdWeather).build(),
ToolFunction.builder().function(fdTime).build()))
.build();
}
private static FunctionDefinition buildFunctionDefinition(
String name, String description, ObjectNode schema) {
return FunctionDefinition.builder()
.name(name)
.description(description)
.parameters(JsonUtils.parseString(schema.toString()).getAsJsonObject())
.build();
}
public static void main(String[] args) {
try {
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMsg = MultiModalMessage.builder().role(Role.USER.getValue())
.content(Arrays.asList(Collections.singletonMap("text", "Tell me the weather in Hangzhou"))).build();
try {
streamCallWithMultiModalMessage(conv, userMsg);
} catch (UploadFileException e) {
throw new RuntimeException(e);
}
// To use text-generation models for tool calling, uncomment the lines below
// Generation gen = new Generation();
// Message userMessage = Message.builder()
// .role(Role.USER.getValue())
// .content("Tell me the weather in Hangzhou")
// .build();
// try {
// streamCallWithMessage(gen, userMessage);
// } catch (InputRequiredException e) {
// throw new RuntimeException(e);
// }
} catch (ApiException | NoApiKeyException e) {
logger.error("An exception occurred: {}", e.getMessage());
}
System.exit(0);
}
}
Response
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":6,"total_tokens":244},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"OK, the user wants"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":12,"total_tokens":250},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"to know the weather in Hangzhou. I"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":16,"total_tokens":254},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"need to determine if there are any"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":22,"total_tokens":260},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"relevant tools available. Looking at the"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":28,"total_tokens":266},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"provided tools, I see a get_current"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":34,"total_tokens":272},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"_weather function, with a location"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":38,"total_tokens":276},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"parameter. So I should call"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":43,"total_tokens":281},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"this function, setting the parameter"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":48,"total_tokens":286},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"to Hangzhou. No other tools are needed,"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":52,"total_tokens":290},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"since the user only asked about"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":56,"total_tokens":294},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"weather. Next, construct the"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":60,"total_tokens":298},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"tool_call, filling in the name and"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":64,"total_tokens":302},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"parameters. Make sure the parameter is"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":68,"total_tokens":306},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"a JSON object, with location as a"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":73,"total_tokens":311},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"string. After checking for errors,"}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":78,"total_tokens":316},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"return."}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":106,"total_tokens":344},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":"","reasoning_content":"","tool_calls":[{"type":"function","id":"call_ecc41296dccc47baa01567","function":{"name":"get_current_weather","arguments":"{\"location\": \"Hangzhou"}}]}}]}}
{"requestId":"4edb81cd-4647-9d5d-88f9-a4f30bc6d8dd","usage":{"input_tokens":238,"output_tokens":108,"total_tokens":346},"output":{"choices":[{"finish_reason":"tool_calls","message":{"role":"assistant","content":"","reasoning_content":"","tool_calls":[{"type":"function","id":"","function":{"arguments":"\"}"}}]}}]}}
HTTP
Example code
curl
# ======= Important note =======
# For text-generation models, use: https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1/services/aigc/text-generation/generation
# API keys differ by region. Get your API key: https://www.alibabacloud.com/help/model-studio/get-api-key
# For Beijing region, use: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# === Delete this comment before running ===
curl -X POST "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
"model": "qwen3.6-plus",
"input":{
"messages":[
{
"role": "user",
"content": [{"text": "What is the weather in Hangzhou?"}]
}
]
},
"parameters": {
"enable_thinking": true,
"incremental_output": true,
"result_format": "message",
"tools": [{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Useful when you want to know the current time.",
"parameters": {}
}
},{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful when you want to check the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City or county, such as Beijing, Hangzhou, or Yuhang District."
}
},
"required": ["location"]
}
}
}]
}
}'
Going live
Test tool calling accuracy
-
Build an evaluation system:
Create a test dataset that mirrors real-world business scenarios. Define clear metrics, such as tool selection accuracy, parameter extraction accuracy, and end-to-end success rate.
-
Optimize prompts:
Address issues identified during testing, such as incorrect tool selection or invalid parameters, by refining system prompts, tool descriptions, and parameter descriptions. This is the core optimization method.
-
Upgrade models:
If prompt engineering no longer improves performance, upgrade to a more powerful model, such as qwen3.6-plus.
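The metrics above can be computed from a labeled test set. The following is a minimal sketch, assuming each test case records the expected and predicted tool call plus a flag for whether the final reply was judged correct. The Case fields and the evaluate function are hypothetical names, not part of any SDK.

```python
from dataclasses import dataclass

@dataclass
class Case:
    """One evaluation record (hypothetical structure)."""
    expected_tool: str
    expected_args: dict
    predicted_tool: str
    predicted_args: dict
    final_ok: bool  # whether the final reply was judged correct

def evaluate(cases):
    total = len(cases)
    tool_hits = sum(c.predicted_tool == c.expected_tool for c in cases)
    # Parameter accuracy is measured only on cases where the right tool was chosen.
    arg_hits = sum(
        c.predicted_tool == c.expected_tool and c.predicted_args == c.expected_args
        for c in cases
    )
    return {
        "tool_selection_accuracy": tool_hits / total if total else 0.0,
        "parameter_accuracy": arg_hits / tool_hits if tool_hits else 0.0,
        "end_to_end_success_rate": sum(c.final_ok for c in cases) / total if total else 0.0,
    }

cases = [
    Case("get_current_weather", {"location": "Hangzhou"}, "get_current_weather", {"location": "Hangzhou"}, True),
    Case("get_current_weather", {"location": "Beijing"}, "get_current_weather", {"location": "Beijing"}, True),
    Case("get_current_weather", {"location": "Yuhang District"}, "get_current_weather", {"location": "Hangzhou"}, False),
    Case("get_current_time", {}, "get_current_weather", {"location": "Beijing"}, False),
]
print(evaluate(cases))
```

Tracking the three numbers separately shows whether a regression comes from tool selection, from parameter extraction, or from the final reply.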
Control tool count dynamically
When your application integrates dozens or hundreds of tools, providing the full tool library to the model can cause the following problems:
-
Performance degradation: It becomes much harder for the model to select the correct tool from a large list.
-
Increased cost and latency: Long tool descriptions consume many input tokens. This increases costs and slows down responses.
The solution is to add a tool routing or retrieval layer before calling the model. This layer filters the full library of tools based on the user's query to create a small, relevant subset. Only this subset is then provided to the model.
The main methods for implementing tool routing are as follows:
-
Semantic search
Tool descriptions (description) are pre-processed into vectors using an embedding model and then stored in a vector database. When a user sends a query, the query is also converted into a vector, and a similarity search retrieves the top-K most relevant tools.
-
Hybrid search
Combine semantic search ("fuzzy match") with classic keyword matching or metadata tagging ("exact match"). Add tags or keywords to tools and search using both vector search and keyword filters. This improves retrieval precision for high-frequency or specific scenarios.
-
Lightweight model router
For more complex routing logic, use a smaller, faster, and less expensive model, such as Qwen-Flash, as a router model. This model reads the user's query and outputs a list of relevant tool names.
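The semantic search method can be sketched as follows. To stay self-contained, this example replaces the embedding model and the vector database with a bag-of-words vector and an in-memory list, so the scores are only a rough stand-in for real embedding similarity. The embed, cosine, and route_tools names and the search_documents tool are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def route_tools(query, tools, top_k=2):
    """Return up to top_k tools whose descriptions are most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(t["function"]["description"])), t) for t in tools]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for score, tool in scored[:top_k] if score > 0]

tools = [
    {"type": "function", "function": {"name": "get_current_time",
        "description": "Useful when you want to know the current time."}},
    {"type": "function", "function": {"name": "get_current_weather",
        "description": "Useful when you want to check the weather in a specific city."}},
    {"type": "function", "function": {"name": "search_documents",  # hypothetical tool
        "description": "Search internal documents by keyword."}},
]
selected = route_tools("What is the weather in Hangzhou?", tools, top_k=1)
print([t["function"]["name"] for t in selected])
```

In production, the tool vectors would be computed once and stored, so only the query needs to be embedded per request.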
Best practices
-
Keep candidate sets small: Regardless of the method used, limit the number of tools passed to the main model to 20 or fewer. This balances cognitive load, cost, latency, and accuracy.
-
Use layered filtering: Build a funnel-style routing strategy. For example, first use inexpensive keyword or rule-based matching to filter out clearly irrelevant tools, and then perform a semantic search on the remaining tools.
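A funnel of this kind might look like the sketch below. It assumes each tool in the catalog carries hypothetical tags and description fields, uses a tag match as the inexpensive first stage, and uses word overlap as a placeholder for the semantic ranking stage. MAX_TOOLS reflects the limit of 20 suggested above.

```python
MAX_TOOLS = 20  # upper bound on tools passed to the main model

def keyword_stage(query, catalog):
    """Cheap first pass: keep tools whose tags overlap the query words."""
    words = set(query.lower().split())
    return [tool for tool in catalog if words & set(tool["tags"])]

def rank_stage(query, candidates, limit=MAX_TOOLS):
    """Second pass: rank survivors. Word overlap stands in for semantic similarity."""
    words = set(query.lower().split())
    ranked = sorted(
        candidates,
        key=lambda tool: len(words & set(tool["description"].lower().split())),
        reverse=True,
    )
    return ranked[:limit]

def select_tools(query, catalog):
    survivors = keyword_stage(query, catalog)
    # Fall back to the whole catalog if the cheap filter removed everything.
    return rank_stage(query, survivors or catalog)

# Hypothetical catalog: two weather tools and thirty unrelated ones.
catalog = [
    {"name": "get_weather_alerts", "tags": ["weather"], "description": "list severe alerts"},
    {"name": "get_current_weather", "tags": ["weather", "forecast"],
     "description": "check the weather in a specific city"},
] + [
    {"name": f"report_{i}", "tags": ["finance"], "description": "generate finance report"}
    for i in range(30)
]
print([tool["name"] for tool in select_tools("weather in hangzhou today", catalog)])
```

The fallback keeps the model usable when no tag matches, at the price of sending an unfiltered but still capped list.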
Tool security principles
When you give a model the ability to execute tools, security is the top priority. The core principles are least privilege and human confirmation.
-
Principle of least privilege: The tool set must adhere to the principle of least privilege. By default, tools should have read-only permissions, such as for looking up weather information or searching documents. Avoid granting write permissions that can change state or manipulate resources.
-
Isolate dangerous tools: Never provide the model with direct access to dangerous tools, such as arbitrary code execution (code interpreter), file system operations (fs.delete), destructive database operations (db.drop_table), or money transfers (payment.transfer).
-
Require human confirmation: For all high-privilege or irreversible actions, require human review and approval. The model can draft a request, but a human must click the final Execute button. For example, the model can prepare an email, but the user must click Send.
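These principles can be enforced in the code that executes tool calls, rather than left to the model. The following is a minimal sketch, assuming a hypothetical send_email tool and a confirm callback supplied by your user interface. Read-only tools run directly, tools that change state wait for approval, and anything else is rejected.

```python
# Hypothetical tool implementations.
def get_current_weather(location):
    return f"{location} is sunny today."

def send_email(to, body):
    return f"Email sent to {to}."

REGISTRY = {"get_current_weather": get_current_weather, "send_email": send_email}

READ_ONLY_TOOLS = {"get_current_weather"}   # safe to run without review
CONFIRM_REQUIRED_TOOLS = {"send_email"}     # changes state, needs human approval

def execute_tool(name, args, confirm):
    """Run a tool only if policy allows it; confirm(name, args) returns True or False."""
    if name in READ_ONLY_TOOLS:
        return REGISTRY[name](**args)
    if name in CONFIRM_REQUIRED_TOOLS:
        if confirm(name, args):
            return REGISTRY[name](**args)
        return "Action cancelled by the user."
    # Deny by default: tools outside both lists never run.
    return f"Tool '{name}' is not permitted."

print(execute_tool("get_current_weather", {"location": "Hangzhou"}, confirm=lambda n, a: False))
print(execute_tool("send_email", {"to": "a@example.com", "body": "Hi"}, confirm=lambda n, a: False))
print(execute_tool("db.drop_table", {"table": "users"}, confirm=lambda n, a: True))
```

Because the default branch denies, adding a new tool to the registry does not expose it until it is placed in one of the two lists.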
User experience optimization
Function calling involves multiple steps. A failure at any point can negatively impact the user experience.
Handle tool execution failures
Tool execution can fail. Use the following strategies to handle these failures:
-
Set maximum retries: Set a reasonable retry limit, such as three, to avoid long waits or wasted resources from repeated failures.
-
Provide fallback messages: If retries fail or errors are unrecoverable, provide users with clear and helpful messages. For example: "Sorry, I cannot retrieve that information right now. The service might be busy. Please try again later."
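Both strategies fit in one small wrapper around the tool call. The following is a minimal sketch. The call_with_retries name, its parameters, and the fixed delay between attempts are assumptions, and the fallback text is the example message from above.

```python
import time

FALLBACK_MESSAGE = (
    "Sorry, I cannot retrieve that information right now. "
    "The service might be busy. Please try again later."
)

def call_with_retries(tool, args, max_retries=3, delay_s=0.0):
    """Run tool(**args); retry up to max_retries times, then return a fallback."""
    for attempt in range(max_retries + 1):  # first try plus the retries
        try:
            return tool(**args)
        except Exception as exc:
            print(f"Attempt {attempt + 1} failed: {exc}")
            if attempt < max_retries:
                time.sleep(delay_s)
    return FALLBACK_MESSAGE

def unavailable_weather(location):
    """Hypothetical tool that always fails, used to demonstrate the fallback."""
    raise ConnectionError("weather service unreachable")

print(call_with_retries(unavailable_weather, {"location": "Hangzhou"}))
```

Returning the fallback as a normal string lets the caller pass it to the model or show it to the user without extra error handling.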
Handle delays
High latency can lower user satisfaction. Mitigate this with frontend interaction and backend optimization.
-
Set timeouts: Set a reasonable timeout for each step of the function calling process. If a timeout occurs, stop the action and provide feedback to the user.
-
Provide instant feedback: When a function call is initiated, display a message, such as "Looking up the weather..." or "Searching for information...", to inform the user that the system is working.
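A timeout and a progress message can be combined as in the sketch below, which runs the tool in a worker thread. The run_with_timeout name, the notify callback, and the timeout text are assumptions. Note that a Python thread cannot be forcibly stopped, so the tool itself should also set timeouts on its network calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

TIMEOUT_MESSAGE = "The request timed out. Please try again later."

def run_with_timeout(tool, args, timeout_s, notify, status="Looking up the weather..."):
    """Show a status message, then wait at most timeout_s seconds for the tool."""
    notify(status)  # instant feedback before the slow work starts
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(tool, **args)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return TIMEOUT_MESSAGE
    finally:
        # Do not block on a worker that is still running.
        executor.shutdown(wait=False)

def slow_weather(location):
    """Hypothetical tool that responds too slowly."""
    time.sleep(0.3)
    return f"{location} is sunny today."

print(run_with_timeout(slow_weather, {"location": "Hangzhou"}, timeout_s=0.05, notify=print))
```

Passing notify as a callback keeps the function independent of the frontend; a web application could push the status over its own channel instead of printing it.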
Billing
In addition to the tokens in the messages array, tool descriptions are also counted and billed as input tokens.
Error codes
If the model call fails and returns an error message, see Error codes for resolution.