Alibaba Cloud Model Studio: OpenAI Responses API reference

Last Updated: Jan 27, 2026

This topic describes how to call Qwen using the OpenAI-compatible Responses API, the input and output parameters, and code samples.

Advantages over the OpenAI Chat Completions API:

  • Built-in tools: Includes web search, code interpreter, and web extractor. Use these tools together (tools=[{"type": "web_search"}, {"type": "code_interpreter"}, {"type": "web_extractor"}]) for the best performance on complex tasks. See Call built-in tools.

  • More flexible input: Supports passing a string directly as model input, and is also compatible with message arrays in the Chat Completions format.

  • Simplified context management: Pass the previous_response_id from the previous response instead of manually building the complete message history array.

Compatibility notes and limitations

This API is designed to be compatible with OpenAI to reduce migration costs. However, there are differences in parameters, features, and specific behaviors.

Core principle: Only the parameters explicitly listed in this document are processed. Any OpenAI parameters that are not mentioned are ignored.

Key differences:

  • Multimodal input not supported: Currently, only qwen3-max-2026-01-23 is supported, and the input parameter accepts only text, not images or other file types.

  • Unsupported parameters: Some OpenAI Responses API parameters are not supported, such as instructions and background. Currently, only synchronous calls are supported.

  • Additional parameters: This API also supports extra parameters, such as enable_thinking. For specific usage, see the description of the corresponding parameter.

This API is currently supported only in the Singapore region.

Singapore

base_url for SDK: https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses

Request body

Basic call

Python

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace the following line with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="Hello, please introduce yourself in one sentence."
)

# Get model response
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    // If the environment variable is not set, replace the following line with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3-max-2026-01-23",
        input: "Hello, please introduce yourself in one sentence."
    });

    // Get model response
    console.log(response.output_text);
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3-max-2026-01-23",
    "input": "Hello, please introduce yourself in one sentence."
}'

Streaming output

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

stream = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="Please briefly introduce artificial intelligence.",
    stream=True
)

print("Receiving stream output:")
for event in stream:
    if event.type == 'response.output_text.delta':
        print(event.delta, end='', flush=True)
    elif event.type == 'response.completed':
        print("\nStream completed")
        print(f"Total tokens: {event.response.usage.total_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const stream = await openai.responses.create({
        model: "qwen3-max-2026-01-23",
        input: "Please briefly introduce artificial intelligence.",
        stream: true
    });

    console.log("Receiving stream output:");
    for await (const event of stream) {
        if (event.type === 'response.output_text.delta') {
            process.stdout.write(event.delta);
        } else if (event.type === 'response.completed') {
            console.log("\nStream completed");
            console.log(`Total tokens: ${event.response.usage.total_tokens}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
--no-buffer \
-d '{
    "model": "qwen3-max-2026-01-23",
    "input": "Please briefly introduce artificial intelligence.",
    "stream": true
}'

Multi-turn conversation

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

# First round
response1 = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="My name is John, please remember it."
)
print(f"First response: {response1.output_text}")

# Second round - use previous_response_id to link context
# The response id expires in 7 days
response2 = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="Do you remember my name?",
    previous_response_id=response1.id
)
print(f"Second response: {response2.output_text}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    // First round
    const response1 = await openai.responses.create({
        model: "qwen3-max-2026-01-23",
        input: "My name is John, please remember it."
    });
    console.log(`First response: ${response1.output_text}`);

    // Second round - use previous_response_id to link context
    // The response id expires in 7 days
    const response2 = await openai.responses.create({
        model: "qwen3-max-2026-01-23",
        input: "Do you remember my name?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);
}

main();

Call built-in tools

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="Find the Alibaba Cloud website and extract key information",
    # For best results, enable the built-in tools
    tools=[
        {"type": "web_search"},
        {"type": "code_interpreter"},
        {"type": "web_extractor"}
    ],
    extra_body={"enable_thinking": True}
)

# Uncomment the line below to see the intermediate output
# print(response.output)
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3-max-2026-01-23",
        input: "Find the Alibaba Cloud website and extract key information",
        tools: [
            { type: "web_search" },
            { type: "code_interpreter" },
            { type: "web_extractor" }
        ],
        enable_thinking: true
    });

    for (const item of response.output) {
        if (item.type === "reasoning") {
            console.log("Model is thinking...");
        } else if (item.type === "web_search_call") {
            console.log(`Search query: ${item.action.query}`);
        } else if (item.type === "web_extractor_call") {
            console.log("Extracting web content...");
        } else if (item.type === "message") {
            console.log(`Response: ${item.content[0].text}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3-max-2026-01-23",
    "input": "Find the Alibaba Cloud website and extract key information",
    "tools": [
        {
            "type": "web_search"
        },
        {
            "type": "code_interpreter"
        },
        {
            "type": "web_extractor"
        }
    ],
    "enable_thinking": true
}'

model string (Required)

The model name. Currently, only qwen3-max-2026-01-23 is supported.

input string or array (Required)

The model input. The following formats are supported:

  • string: Plain text input, such as "Hello".

  • array: A message array that is arranged in conversational order.

Array message types

System Message object (Optional)

A system message. It sets the model's role, tone, task objectives, or constraints. If a system message is used, it must be the first element in the message array.

Properties

role string (Required)

The role of the message. The value must be system.

content string (Required)

The system instruction content. It defines the model's role, behavior, response style, and task constraints.

User Message object (Required)

A user message. It passes questions, instructions, or context to the model.

Properties

role string (Required)

The role of the message. The value must be user.

content string (Required)

The text content that is input by the user.

Assistant Message object (Optional)

An assistant message. It contains previous responses from the model and is used to provide context in a multi-turn conversation.

Properties

role string (Required)

The role of the message. The value must be assistant.

content string (Required)

The text content of the assistant's reply.
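
The following is a minimal Python sketch of passing a message array as input, reusing the client from the basic call above. The conversation content is illustrative only.

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What is its population?"}
]

response = client.responses.create(
    model="qwen3-max-2026-01-23",
    input=messages
)
print(response.output_text)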

previous_response_id string (Optional)

The unique ID of the previous response. A response id is valid for 7 days. You can use this parameter to create a multi-turn conversation: the server automatically retrieves the input and output of that turn and uses them as context. If both an input message array and previous_response_id are provided, the new messages in input are appended to the historical context.

stream boolean (Optional) Defaults to: false

Specifies whether to enable streaming output. If this parameter is set to true, the model response data is streamed back to the client in real time.

tools array (Optional)

An array of tools that the model can call when it generates a response. It supports a mix of built-in tools and user-defined function tools.

For the best response, we recommend that you enable the code_interpreter, web_search, and web_extractor tools together.

Properties

web_search

A web search tool that allows the model to search for the latest information on the Internet. For more information, see Web search.

Properties

type string (Required)

The value must be web_search.

Example: [{"type": "web_search"}]

web_extractor

A web page extraction tool that allows the model to access and extract web page content. It must be used with the web_search tool. For more information, see Web scraping.

This tool applies only to the thinking mode of qwen3-max-2026-01-23.

Properties

type string (Required)

The value must be web_extractor.

Example: [{"type": "web_search"}, {"type": "web_extractor"}]

code_interpreter

A code interpreter tool that allows the model to execute code and return results for data analytics. For more information, see Code interpreter.

This tool applies only to the thinking mode of qwen3-max-2026-01-23.

Properties

type string (Required)

The value must be code_interpreter.

Example: [{"type": "code_interpreter"}]

Custom tool function

A user-defined function tool that allows the model to call functions that you define. When the model decides to call a tool, the response returns a function_call type output. For more information, see Function Calling.

Properties

type string (Required)

The value must be function.

name string (Required)

The tool name. The name can contain only letters, numbers, underscores (_), and hyphens (-). The maximum length is 64 tokens.

description string (Required)

A description of the tool that helps the model decide when and how to call the tool.

parameters object (Optional)

A description of the tool's parameters. The description must be a valid JSON Schema. If parameters is empty, the tool takes no input parameters (for example, a tool that queries the current time).

To improve the accuracy of tool calls, we recommend that you pass the parameters.

Example:

[{
  "type": "function",
  "name": "get_weather",
  "description": "Get weather information for a specified city",
  "parameters": {
    "type": "object",
    "properties": {
      "city": {
        "type": "string",
        "description": "The name of the city"
      }
    },
    "required": ["city"]
  }
}]
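
As a sketch of how such a tool is used, the following Python code (reusing the client from the basic call) passes the get_weather definition above and reads the resulting function_call output item. get_weather itself is a hypothetical tool; executing it and returning its result to the model are not shown.

import json

tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get weather information for a specified city",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "The name of the city"}
        },
        "required": ["city"]
    }
}]

response = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="What is the weather in Singapore?",
    tools=tools
)

# If the model decides to call the tool, the output contains a function_call item.
for item in response.output:
    if item.type == "function_call":
        args = json.loads(item.arguments)  # arguments is a JSON string
        print(f"Call {item.name} with {args}, call_id: {item.call_id}")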

tool_choice string or object (Optional) Defaults to: auto

Controls how the model selects and calls tools. This parameter supports two formats: a string and an object.

String format

  • auto: The model automatically decides whether to call a tool.

  • none: Prevents the model from calling any tool.

  • required: Forces the model to call a tool. This is available only when there is a single tool in the tools list.

Object format

Restricts the set of tools available to the model. The model can select and call tools only from the specified list.

Properties

mode string (Required)

  • auto: The model automatically decides whether to call a tool.

  • required: Forces the model to call a tool. This is available only when there is a single tool in the tools list.

tools array (Required)

A list of tool definitions that the model is allowed to call.

[
  { "type": "function", "name": "get_weather" }
]

type string (Required)

The type of allowed tool configuration. The value must be allowed_tools.
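
For example, the following Python sketch restricts the model to the get_weather function tool defined in the earlier function tool example; the object follows the Properties described above, and the prompt is illustrative.

response = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="What is the weather in Singapore?",
    tools=tools,  # the function tool list from the earlier function tool example
    tool_choice={
        "type": "allowed_tools",
        "mode": "auto",
        "tools": [
            {"type": "function", "name": "get_weather"}
        ]
    }
)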

temperature float (Optional)

The sampling temperature. This parameter controls the diversity of the generated text.

A higher temperature results in more diverse text. A lower temperature results in more deterministic text.

Value range: [0, 2)

Both temperature and top_p control the diversity of the generated text. We recommend that you set only one of them. For more information, see Text generation model overview.

top_p float (Optional)

The probability threshold for nucleus sampling. This parameter controls the diversity of the generated text.

A higher top_p value results in more diverse text. A lower top_p value results in more deterministic text.

Value range: (0, 1.0]

Both temperature and top_p control the diversity of the generated text. We recommend that you set only one of them. For more information, see Text generation model overview.

enable_thinking boolean (Optional) Defaults to: false

Specifies whether to enable the thinking mode. If this parameter is enabled, the model thinks before it replies. The thinking process is returned through an output item of the reasoning type.

If the thinking mode is enabled, we recommend that you enable the built-in tools to achieve the best model performance on complex tasks.

Valid values:

  • true: Enables the thinking mode. This must be enabled when you use the code_interpreter or web_extractor tools.

  • false: Disables the thinking mode.

This parameter is not a standard OpenAI parameter. The Python SDK passes it using extra_body={"enable_thinking": True}. The Node.js SDK and curl use enable_thinking: true directly as a top-level parameter.
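
The following Python sketch (reusing the earlier client) enables thinking mode through extra_body and reads the reasoning summary from the reasoning output item; the prompt is illustrative.

response = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="Compare bubble sort and quicksort in two sentences.",
    extra_body={"enable_thinking": True}  # Python SDK: pass as extra_body
)

for item in response.output:
    if item.type == "reasoning":
        # Each summary element has type "summary_text" and a text field
        for part in item.summary:
            print("Reasoning summary:", part.text)
    elif item.type == "message":
        print("Answer:", item.content[0].text)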

Response object (non-streaming output)

{
    "created_at": 1769408284,
    "id": "351e34cc-5f75-483b-b948-35be954dbxxx",
    "model": "qwen3-max-2026-01-23",
    "object": "response",
    "output": [
        {
            "content": [
                {
                    "annotations": [],
                    "text": "Hello! I am Qwen, a large-scale language model developed by Tongyi Lab. I can answer questions, create text, write code, and express opinions. I am committed to providing you with accurate, useful, and friendly help.",
                    "type": "output_text"
                }
            ],
            "id": "msg_59a7339e-77d0-4451-8f51-75fb8dbefxxx",
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": false,
    "status": "completed",
    "tool_choice": "auto",
    "tools": [],
    "usage": {
        "input_tokens": 39,
        "input_tokens_details": {
            "cached_tokens": 0
        },
        "output_tokens": 46,
        "output_tokens_details": {
            "reasoning_tokens": 0
        },
        "total_tokens": 85
    }
}

id string

The unique identifier for this response. It is a UUID string and is valid for 7 days. You can use this identifier in the previous_response_id parameter to create a multi-turn conversation.

created_at integer

The Unix timestamp, in seconds, at which this response was created.

object string

The object type. The value is response.

status string

The status of the response generation. Valid values:

  • completed

  • failed

  • in_progress

  • cancelled

  • queued

  • incomplete

model string

The ID of the model that is used to generate the response.

output array

An array of output items that are generated by the model. The type and order of elements in the array depend on the model's response.

Array element properties

type string

The type of the output item. Valid values:

  • message: Message type. Contains the final reply content generated by the model.

  • reasoning: Reasoning type. This type is returned when the thinking mode (enable_thinking: true) is enabled. Reasoning tokens are counted in output_tokens_details.reasoning_tokens and are billed as reasoning tokens.

  • function_call: Function call type. This type is returned when a user-defined function tool is used. You need to handle the function call and return the result.

  • web_search_call: Search call type. This type is returned when the web_search tool is used.

  • code_interpreter_call: Code execution type. This type is returned when the code_interpreter tool is used. It must be used with enable_thinking: true.

  • web_extractor_call: Web page extraction type. This type is returned when the web_extractor tool is used. It must be used with the web_search tool.

id string

The unique identifier for the output item. This field is included in all types of output items.

role string

The role of the message. The value is assistant. This field exists only when the type is message.

status string

The status of the output item. Valid values are completed and in_progress. This field exists when the type is message, function_call, web_search_call, code_interpreter_call, or web_extractor_call.

name string

The function name. This field exists only when the type is function_call.

arguments string

The arguments for the function call, in JSON string format. This field exists only when the type is function_call. Parse the JSON string before use, for example with JSON.parse() in Node.js or json.loads() in Python.

call_id string

The unique identifier for the function call. This field exists only when the type is function_call. When you return the function call result, you must use this ID to associate the request with the response.

content array

An array of message content. This field exists only when the type is message.

Array element properties

type string

The content type. The value is output_text.

text string

The text content that is generated by the model.

annotations array

An array of text annotations. This is usually an empty array.

summary array

An array of reasoning summaries. This field exists only when the type is reasoning. Each element contains a type field with a value of summary_text and a text field that contains the summary text.

action object

The search action information. This field exists only when the type is web_search_call.

Properties

query string

The search query keyword.

type string

The search type. The value is search.

sources array

A list of search sources. Each element contains a type field with a value of url and a url field that contains the web page URL.

code string

The code that is generated and executed by the model. This field exists only when the type is code_interpreter_call.

outputs array

An array of code execution outputs. This field exists only when the type is code_interpreter_call. Each element contains a type field with a value of logs and a logs field that contains the code execution log.

container_id string

The identifier for the code interpreter container. This field exists only when the type is code_interpreter_call. It is used to associate multiple code executions within the same session.

goal string

A description of the extraction goal that explains what information needs to be extracted from the web page. This field exists only when the type is web_extractor_call.

output string

A summary of the content that is extracted from the web page. This field exists only when the type is web_extractor_call.

urls array

A list of URLs of the extracted web pages. This field exists only when the type is web_extractor_call.

usage object

The token consumption information for this request.

Properties

input_tokens integer

The number of input tokens.

output_tokens integer

The number of tokens that are output by the model.

total_tokens integer

The total number of tokens consumed. This is the sum of input_tokens and output_tokens.

input_tokens_details object

The fine-grained categorization of input tokens.

Properties

cached_tokens integer

The number of tokens that hit the cache. For more information, see Context cache.

output_tokens_details object

The fine-grained categorization of output tokens.

Properties

reasoning_tokens integer

The number of tokens in the thinking process.

x_tools object

Statistical information about tool usage. If built-in tools are used, this field includes the number of calls for each tool.

Examples:

  • web_search tool: {"web_search": {"count": 1}}

  • code_interpreter tool: {"code_interpreter": {"count": 1}}

  • web_extractor tool: {"web_extractor": {"count": 1}}

error object

The error object that is returned when the model fails to generate a response. This field is null on success.

tools array

The tools parameter echoed back from the request. The structure is the same as the tools parameter in the request body.

tool_choice string

The tool_choice parameter echoed back from the request. Valid values are auto, none, and required.

Response chunk object (streaming output)

Basic call

// response.created - Response created
{"response":{"id":"428c90e9-9cd6-90a6-9726-c02b08ebexxx","created_at":1769082930,"object":"response","status":"queued",...},"sequence_number":0,"type":"response.created"}

// response.in_progress - Response in progress
{"response":{"id":"428c90e9-9cd6-90a6-9726-c02b08ebexxx","status":"in_progress",...},"sequence_number":1,"type":"response.in_progress"}

// response.output_item.added - New output item added
{"item":{"id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","content":[],"role":"assistant","status":"in_progress","type":"message"},"output_index":0,"sequence_number":2,"type":"response.output_item.added"}

// response.content_part.added - New content block added
{"content_index":0,"item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","output_index":0,"part":{"annotations":[],"text":"","type":"output_text","logprobs":null},"sequence_number":3,"type":"response.content_part.added"}

// response.output_text.delta - Incremental text (triggered multiple times)
{"content_index":0,"delta":"Artificial intelligence","item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","logprobs":[],"output_index":0,"sequence_number":4,"type":"response.output_text.delta"}
{"content_index":0,"delta":"(AI) refers to the technology and science","item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","logprobs":[],"output_index":0,"sequence_number":6,"type":"response.output_text.delta"}

// response.output_text.done - Text completed
{"content_index":0,"item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","logprobs":[],"output_index":0,"sequence_number":53,"text":"Artificial intelligence (AI) refers to the technology and science that simulates human intelligent behavior by computer systems...","type":"response.output_text.done"}

// response.content_part.done - Content block completed
{"content_index":0,"item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","output_index":0,"part":{"annotations":[],"text":"...full text...","type":"output_text","logprobs":null},"sequence_number":54,"type":"response.content_part.done"}

// response.output_item.done - Output item completed
{"item":{"id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","content":[{"annotations":[],"text":"...full text...","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"},"output_index":0,"sequence_number":55,"type":"response.output_item.done"}

// response.completed - Response completed (includes full response and usage)
{"response":{"id":"428c90e9-9cd6-90a6-9726-c02b08ebexxx","created_at":1769082930,"model":"qwen3-max-2026-01-23","object":"response","output":[...],"status":"completed","usage":{"input_tokens":37,"output_tokens":243,"total_tokens":280,...}},"sequence_number":56,"type":"response.completed"}

Call built-in tools

id:1
event:response.created
:HTTP_STATUS/200
data:{"sequence_number":0,"type":"response.created","response":{"output":[],"parallel_tool_calls":false,"created_at":1769435906,"tool_choice":"auto","model":"","id":"863df8d9-cb29-4239-a54f-3e15a2427xxx","tools":[],"object":"response","status":"queued"}}

id:2
event:response.in_progress
:HTTP_STATUS/200
data:{"sequence_number":1,"type":"response.in_progress","response":{"output":[],"parallel_tool_calls":false,"created_at":1769435906,"tool_choice":"auto","model":"","id":"863df8d9-cb29-4239-a54f-3e15a2427xxx","tools":[],"object":"response","status":"in_progress"}}

id:3
event:response.output_item.added
:HTTP_STATUS/200
data:{"sequence_number":2,"item":{"summary":[],"type":"reasoning","id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx"},"output_index":0,"type":"response.output_item.added"}

id:4
event:response.reasoning_summary_text.delta
:HTTP_STATUS/200
data:{"delta":"The user wants me to:\n1. Search for the Alibaba Cloud official website.\n2. Extract key information from the home page.\n\nI need to first search for the URL of the Alibaba Cloud website, then use the web_extractor tool to access the site and extract key information.","sequence_number":3,"output_index":0,"type":"response.reasoning_summary_text.delta","item_id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx","summary_index":0}

id:14
event:response.reasoning_summary_text.done
:HTTP_STATUS/200
data:{"sequence_number":13,"text":"The user wants me to:\n1. Search for the Alibaba Cloud official website.\n2. Extract key information from the home page.\n\nI need to first search for the URL of the Alibaba Cloud website, then use the web_extractor tool to access the site and extract key information.","output_index":0,"type":"response.reasoning_summary_text.done","item_id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx","summary_index":0}

id:15
event:response.output_item.done
:HTTP_STATUS/200
data:{"sequence_number":14,"item":{"summary":[{"type":"summary_text","text":"The user wants me to:\n1. Search for the Alibaba Cloud official website.\n2. Extract key information from the home page.\n\nI need to first search for the URL of the Alibaba Cloud website, then use the web_extractor tool to access the site and extract key information."}],"type":"reasoning","id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx"},"output_index":1,"type":"response.output_item.done"}

id:16
event:response.output_item.added
:HTTP_STATUS/200
data:{"sequence_number":15,"item":{"action":{"type":"search","query":"Web search"},"id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx","type":"web_search_call","status":"in_progress"},"output_index":1,"type":"response.output_item.added"}

id:17
event:response.web_search_call.in_progress
:HTTP_STATUS/200
data:{"sequence_number":16,"output_index":1,"type":"response.web_search_call.in_progress","item_id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx"}

id:19
event:response.web_search_call.completed
:HTTP_STATUS/200
data:{"sequence_number":18,"output_index":1,"type":"response.web_search_call.completed","item_id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx"}

id:20
event:response.output_item.done
:HTTP_STATUS/200
data:{"sequence_number":19,"item":{"action":{"sources":[{"type":"url","url":"https://cn.aliyun.com/"},{"type":"url","url":"https://www.aliyun.com/"}],"type":"search","query":"Web search"},"id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx","type":"web_search_call","status":"completed"},"output_index":1,"type":"response.output_item.done"}

id:33
event:response.output_item.added
:HTTP_STATUS/200
data:{"sequence_number":32,"item":{"urls":["https://cn.aliyun.com/"],"goal":"Extract key information from the Alibaba Cloud home page, including the following: company positioning/profile, core products and services, main business sections, special features/solutions, latest news/events, free trial/promotional information, navigation menu structure, etc.","id":"msg_8c2cf651-48a5-460c-aa7a-bea5b09b4xxx","type":"web_extractor_call","status":"in_progress"},"output_index":3,"type":"response.output_item.added"}

id:34
event:response.output_item.done
:HTTP_STATUS/200
data:{"sequence_number":33,"item":{"output":"The useful information in https://cn.aliyun.com/ for user goal Extract key information from the Alibaba Cloud home page, including the following: company positioning/profile, core products and services, main business sections, special features/solutions, latest news/events, free trial/promotional information, navigation menu structure, etc. as follows: \n\nEvidence in page: \n## Tongyi large model, the first choice for enterprises to embrace the AI era\n\n## A complete product system to create a cloud of technological innovation for enterprises\n\nAll cloud products## Relying on the coordinated development of large models and cloud computing to make AI within reach\n\nAll AI solutions\n\nSummary: \nAlibaba Cloud positions itself as a leading enterprise AI solution provider centered around the Tongyi large model...","urls":["https://cn.aliyun.com/"],"goal":"Extract key information from the Alibaba Cloud home page, including the following: company positioning/profile, core products and services, main business sections, special features/solutions, latest news/events, free trial/promotional information, navigation menu structure, etc.","id":"msg_8c2cf651-48a5-460c-aa7a-bea5b09b4xxx","type":"web_extractor_call","status":"completed"},"output_index":3,"type":"response.output_item.done"}

id:50
event:response.output_item.added
:HTTP_STATUS/200
data:{"sequence_number":50,"item":{"content":[{"type":"text","text":""}],"type":"message","id":"msg_final","role":"assistant"},"output_index":5,"type":"response.output_item.added"}

id:51
event:response.output_text.delta
:HTTP_STATUS/200
data:{"delta":"I have found the Alibaba Cloud official website and extracted the key information from the home page:\n\n","sequence_number":51,"output_index":5,"type":"response.output_text.delta"}

id:60
event:response.completed
:HTTP_STATUS/200
data:{"type":"response.completed","response":{"id":"863df8d9-cb29-4239-a54f-3e15a2427xxx","status":"completed","usage":{"input_tokens":45,"output_tokens":320,"total_tokens":365}}}

Streaming output returns a series of JSON objects. Each object includes a type field to identify the event type and a sequence_number field to indicate the event order. The response.completed event marks the end of the stream.

type string

The event type identifier. Valid values:

  • response.created: Triggered when the response is created. The status is queued.

  • response.in_progress: Triggered when response processing begins. The status changes to in_progress.

  • response.output_item.added: Triggered when a new output item, such as a message or a web_extractor_call, is added to the output array. If the item.type is web_extractor_call, this indicates that the web extractor tool call has started.

  • response.content_part.added: Triggered when a new content block is added to the content array of an output item.

  • response.output_text.delta: Triggered for incremental text generation. This event is triggered multiple times, and the delta field contains the new text fragment.

  • response.output_text.done: Triggered when text generation is complete. The text field contains the complete text.

  • response.content_part.done: Triggered when a content block is complete. The part object contains the complete content block.

  • response.output_item.done: Triggered when an output item is completely generated. The item object contains the complete output item. If the item.type is web_extractor_call, this indicates that the web extractor tool call is complete.

  • response.reasoning_summary_text.delta: (If the thinking mode is enabled) Incremental text for the reasoning summary. The delta field contains the new summary fragment.

  • response.reasoning_summary_text.done: (If the thinking mode is enabled) The reasoning summary is complete. The text field contains the complete summary.

  • response.web_search_call.in_progress / searching / completed: (If you use the web_search tool) Search status change events.

  • response.code_interpreter_call.in_progress / interpreting / completed: (If you use the code_interpreter tool) Code execution status change events.

  • Note: If you use the web_extractor tool, there is no dedicated event type identifier. The web extractor tool call is communicated through the general response.output_item.added and response.output_item.done events. It is identified by the item.type field, which has a value of web_extractor_call.

  • response.completed: Triggered when response generation is complete. The response object contains the complete response, including usage information. This event marks the end of the stream.

sequence_number integer

The event serial number. It starts from 0 and increments. You can use it to ensure that the client processes events in the correct order.

response object

The response object. It appears in the response.created, response.in_progress, and response.completed events. In the response.completed event, it contains the complete response data, including output and usage. Its structure is consistent with the response object of a non-streaming response.

item object

The output item object. It appears in the response.output_item.added and response.output_item.done events. In an added event, it is an initial skeleton where the content is an empty array. In a done event, it is a complete object.

Properties

id string

The unique identifier for the output item, such as msg_xxx.

type string

The type of the output item. Valid values: message, reasoning, etc.

role string

The role of the message. The value is assistant. This field exists only when the type is message.

status string

The generation status. In an added event, the status is in_progress. In a done event, the status is completed.

content array

An array of message content. In an added event, it is an empty array []. In a done event, it contains complete content block objects that have the same structure as the part object.

part object

The content block object. It appears in the response.content_part.added and response.content_part.done events.

Properties

type string

The content block type. The value is output_text.

text string

The text content. In an added event, it is an empty string. In a done event, it is the complete text.

annotations array

An array of text annotations. This is usually an empty array.

logprobs object | null

The log probability information for tokens. Currently, this is null.

delta string

The incremental text content. It appears in the response.output_text.delta event and contains the new text fragment. The client should concatenate all delta fragments to obtain the complete text.

text string

The complete text content. It appears in the response.output_text.done event and contains the complete text of the content block. You can use it to verify the concatenated delta result.

item_id string

The unique identifier for the output item. It is used to associate related events for the same output item.

output_index integer

The index position of the output item in the output array.

content_index integer

The index position of the content block in the content array.

summary_index integer

The index of the summary array. It appears in the response.reasoning_summary_text.delta and response.reasoning_summary_text.done events.
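
The following is a minimal Python sketch (reusing the client from the earlier examples) of handling these streaming events when thinking mode and built-in tools are enabled; the prompt is illustrative, and only a subset of the event types described above is handled.

stream = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="Find the Alibaba Cloud website and extract key information",
    tools=[{"type": "web_search"}, {"type": "web_extractor"}],
    stream=True,
    extra_body={"enable_thinking": True}
)

for event in stream:
    if event.type == "response.reasoning_summary_text.delta":
        print(event.delta, end="", flush=True)  # thinking summary fragments
    elif event.type == "response.web_search_call.completed":
        print("\n[web_search call completed]")
    elif event.type == "response.output_item.done" and event.item.type == "web_extractor_call":
        print("[web_extractor call completed]")  # identified by item.type; no dedicated event type
    elif event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)  # final answer fragments
    elif event.type == "response.completed":
        print(f"\nTotal tokens: {event.response.usage.total_tokens}")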

FAQ

Q: How do I pass context in a multi-turn conversation?

A: To start a new turn in the conversation, pass the id from the previous response as previous_response_id.

Q: Why are some fields in the response examples not described in this document?

A: If you use the official OpenAI SDK, it may output additional fields, usually null, based on its own model structure. These fields are defined by the OpenAI protocol but are not currently supported by our service. You need to focus only on the fields that are described in this document.