Alibaba Cloud Model Studio: OpenAI Responses API Compatibility

Updated: Mar 28, 2026

The Qwen models on Alibaba Cloud Model Studio support an OpenAI-compatible Responses API. As the evolution of the Chat Completions API, the Responses API provides agent-native capabilities in a more streamlined way.

Advantages over the OpenAI Chat Completions API:

  • Built-in tools: web search, web page extraction, code interpreter, text-to-image search, image-to-image search, and more are built in, giving better results on complex tasks. For details, see Call built-in tools.

  • More flexible input: accepts a plain string as model input, and is also compatible with Chat-format message arrays.

  • Simplified context management: pass the previous_response_id of the previous response instead of manually building the full message history array.

For input and output parameter descriptions, see the OpenAI Responses API reference.

Prerequisites

You must first obtain an API key and configure the API key as an environment variable (that topic is being retired and merged into Configure API Key). If you call the API through the OpenAI SDK, you must also install the SDK.
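Before creating the client, you can verify that the environment variable is available. A minimal Python sketch (the variable name DASHSCOPE_API_KEY matches the examples below):

import os

api_key = os.getenv("DASHSCOPE_API_KEY")
if not api_key:
    raise RuntimeError("DASHSCOPE_API_KEY is not set; export it or pass api_key explicitly.")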

Supported models

qwen3-max, qwen3-max-2026-01-23, qwen3.5-plus, qwen3.5-plus-2026-02-15, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash

Endpoints

Singapore

base_url for SDK calls: https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses

China (Beijing)

base_url for SDK calls: https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

HTTP endpoint: POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses

Code examples

Basic call

The simplest way to call the API: send a single message and get the model's reply.

Python

import os
from openai import OpenAI

client = OpenAI(
    # If environment variable is not set, replace with: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.5-plus",
    input="What can you do?"
)

# Get model response
# print(response.model_dump_json())
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    // If environment variable is not set, replace with: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "What can you do?"
    });

    // Get model response
    console.log(response.output_text);
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "What can you do?"
}'

Sample response

The following is the complete response returned by the API.
{
    "created_at": 1771226624,
    "id": "bf0d5c2e-f14b-9ad7-bc0d-ee0c8c9ee2d8",
    "model": "qwen3-max-2026-01-23",
    "object": "response",
    "output": [
        {
            "content": [
                {
                    "annotations": [],
                    "text": "Hi there!  I'm actually quite ......",
                    "type": "output_text"
                }
            ],
            "id": "msg_1e17fdb2-5fc3-4c78-a9e9-cbd78eb043f0",
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": false,
    "status": "completed",
    "tool_choice": "auto",
    "tools": [],
    "usage": {
        "input_tokens": 37,
        "input_tokens_details": {
            "cached_tokens": 0
        },
        "output_tokens": 220,
        "output_tokens_details": {
            "reasoning_tokens": 0
        },
        "total_tokens": 257,
        "x_details": [
            {
                "input_tokens": 37,
                "output_tokens": 220,
                "total_tokens": 257,
                "x_billing_type": "response_api"
            }
        ]
    }
}

Multi-turn conversation

Use the previous_response_id parameter to automatically link context without manually building the message history. A response id is currently valid for 7 days.

previous_response_id must be the top-level id of the previous response (for example f0dbb153-117f-9bbf-8176-5284b47f3xxx, in UUID format), not the id of a message inside the output array (for example msg_56c860c4-3ad8-4a96-8553-d2f94c259xxx).

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

# First round
response1 = client.responses.create(
    model="qwen3.5-plus",
    input="My name is John, please remember it."
)
print(f"First response: {response1.output_text}")

# Second round - use previous_response_id to link context
# The response id expires in 7 days
response2 = client.responses.create(
    model="qwen3.5-plus",
    input="Do you remember my name?",
    previous_response_id=response1.id
)
print(f"Second response: {response2.output_text}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    // First round
    const response1 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "My name is John, please remember it."
    });
    console.log(`First response: ${response1.output_text}`);

    // Second round - use previous_response_id to link context
    // The response id expires in 7 days
    const response2 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "Do you remember my name?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);
}

main();

curl

# First round
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "My name is John, please remember it."
}'

# Second round - use the id from first response as previous_response_id
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Do you remember my name?",
    "previous_response_id": "response_id_from_first_round"
}'

Sample response for the second round

{
  "id": "f0dbb153-117f-9bbf-8176-5284b47f3xxx",
  "created_at": 1769173209.0,
  "model": "qwen3.5-plus",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "id": "msg_56c860c4-3ad8-4a96-8553-d2f94c259xxx",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Yes, John! I remember your name. How can I assist you today?",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 78,
    "output_tokens": 16,
    "total_tokens": 94,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}

Note: The input_tokens for the second round is 78, which includes the context of the first round, and the model successfully remembers the name "John".
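To confirm that the first-round context was carried over, you can inspect the usage object of the second response. A minimal sketch that reuses the response2 object from the Python example above:

# Inspect token usage of the second round (reuses `response2` from the example above)
usage = response2.usage
print(f"Input tokens (includes first-round context): {usage.input_tokens}")
print(f"Output tokens: {usage.output_tokens}")
print(f"Total tokens: {usage.total_tokens}")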

Streaming output

With streaming output, you receive content in real time as the model generates it, which suits long-text generation scenarios.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

stream = client.responses.create(
    model="qwen3.5-plus",
    input="Please briefly introduce artificial intelligence.",
    stream=True
)

print("Receiving stream output:")
for event in stream:
    # print(event.model_dump_json())  # Uncomment to see raw event response
    if event.type == 'response.output_text.delta':
        print(event.delta, end='', flush=True)
    elif event.type == 'response.completed':
        print("\nStream completed")
        print(f"Total tokens: {event.response.usage.total_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const stream = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "Please briefly introduce artificial intelligence.",
        stream: true
    });

    console.log("Receiving stream output:");
    for await (const event of stream) {
        // console.log(JSON.stringify(event));  // Uncomment to see raw event response
        if (event.type === 'response.output_text.delta') {
            process.stdout.write(event.delta);
        } else if (event.type === 'response.completed') {
            console.log("\nStream completed");
            console.log(`Total tokens: ${event.response.usage.total_tokens}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Please briefly introduce artificial intelligence.",
    "stream": true
}'

Sample response

{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"","object":"response","output":[],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"queued","text":null,"top_logprobs":null,"truncation":null,"usage":null,"user":null},"sequence_number":0,"type":"response.created"}
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"","object":"response","output":[],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"in_progress","text":null,"top_logprobs":null,"truncation":null,"usage":null,"user":null},"sequence_number":1,"type":"response.in_progress"}
{"item":{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[],"role":"assistant","status":"in_progress","type":"message"},"output_index":0,"sequence_number":2,"type":"response.output_item.added"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","output_index":0,"part":{"annotations":[],"text":"","type":"output_text","logprobs":null},"sequence_number":3,"type":"response.content_part.added"}
{"content_index":0,"delta":"人工智慧","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":4,"type":"response.output_text.delta"}
{"content_index":0,"delta":"(Art","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":5,"type":"response.output_text.delta"}
{"content_index":0,"delta":"ificial Intelligence,","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":6,"type":"response.output_text.delta"}
{"content_index":0,"delta":"簡稱 AI)","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":7,"type":"response.output_text.delta"}
... (intermediate events omitted) ...
{"content_index":0,"delta":"領域,正在深刻改變我們的","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":38,"type":"response.output_text.delta"}
{"content_index":0,"delta":"生活和工作方式","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":39,"type":"response.output_text.delta"}
{"content_index":0,"delta":"。","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":40,"type":"response.output_text.delta"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":41,"text":"人工智慧(Artificial Intelligence,簡稱 AI)是指由電腦系統類比人類智能行為的技術和科學。xxxx","type":"response.output_text.done"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","output_index":0,"part":{"annotations":[],"text":"人工智慧(Artificial Intelligence,簡稱 AI)是指由電腦系統類比人類智能行為的技術和科學。xxx","type":"output_text","logprobs":null},"sequence_number":42,"type":"response.content_part.done"}
{"item":{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[{"annotations":[],"text":"人工智慧(Artificial Intelligence,簡稱 AI)是指由電腦系統類比人類智能行為的技術和科學。它旨在讓機器能夠執行通常需要人類智能才能完成的任務,例如:\n\n- **學習**(如通過資料訓練模型)  \n- **推理**(如邏輯判斷和問題求解)  \n- **感知**(如識別映像、語音或文字)  \n- **理解語言**(如自然語言處理)  \n- **決策**(如在複雜環境中做出最優選擇)\n\n人工智慧可分為**弱人工智慧**(專註於特定任務,如語音助手、推薦系統)和**強人工智慧**(具備類似人類的通用智能,目前尚未實現)。\n\n當前,AI 已廣泛應用於醫學、金融、交通、教育、娛樂等多個領域,正在深刻改變我們的生活和工作方式。","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"},"output_index":0,"sequence_number":43,"type":"response.output_item.done"}
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"qwen3.5-plus","object":"response","output":[{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[{"annotations":[],"text":"人工智慧(Artificial Intelligence,簡稱 AI)是xxxxxx","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"}],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"completed","text":null,"top_logprobs":null,"truncation":null,"usage":{"input_tokens":37,"input_tokens_details":{"cached_tokens":0},"output_tokens":166,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":203},"user":null},"sequence_number":44,"type":"response.completed"}

Deep thinking

When deep thinking mode is enabled, the model thinks before replying, and the thinking content is returned in output items of type reasoning. This suits problems that require complex reasoning.

The thinking_budget parameter for limiting the maximum thinking length is not supported.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.5-plus",
    input="9.9和9.11誰大?",
    extra_body={"enable_thinking": True}
)

# Process the output
for item in response.output:
    if item.type == "reasoning":
        print("=== Thinking process ===")
        for summary in item.summary:
            print(summary.text)
    elif item.type == "message":
        print("\n=== Final answer ===")
        print(item.content[0].text)

# Check the number of reasoning tokens
print(f"\nReasoning tokens: {response.usage.output_tokens_details.reasoning_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "9.9和9.11誰大?",
        enable_thinking: true
    });

    for (const item of response.output) {
        if (item.type === "reasoning") {
            console.log("=== Thinking process ===");
            for (const summary of item.summary) {
                console.log(summary.text);
            }
        } else if (item.type === "message") {
            console.log("\n=== Final answer ===");
            console.log(item.content[0].text);
        }
    }

    // Check the number of reasoning tokens
    console.log(`\nReasoning tokens: ${response.usage.output_tokens_details.reasoning_tokens}`);
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "9.9和9.11誰大?",
    "enable_thinking": true
}'

Sample response

{
    "created_at": 1774498317,
    "id": "resp_xxx",
    "model": "qwen3.5-plus",
    "object": "response",
    "output": [
        {
            "id": "msg_xxx",
            "summary": [
                {
                    "text": "Thinking Process:\n\n1.  **Analyze the Request:**\n    *   Question: \"9.9 和 9.11 誰大?\" (Which is larger, 9.9 or 9.11?)\n    *   Context: The user is asking a simple mathematical comparison question.\n    *   Current Date: Thursday, March 26, 2026 (provided in the system prompt).\n    *   Knowledge Cutoff: 2026 (provided in the system prompt).\n\n2.  **Evaluate the Numbers:**\n    *   Number A: 9.9\n    *   Number B: 9.11\n    *   These are decimal numbers.\n    *   Compare the integer part: Both are 9.\n    *   Compare the tenths place (first decimal digit):\n        *   9.9 has 9 in the tenths place.\n        *   9.11 has 1 in the tenths place.\n    *   Since 9 > 1, 9.9 is greater than 9.11.\n\n3.  **Consider Potential Ambiguities:**\n    *   Could this be version numbering? (e.g., software versions). In versioning, 9.11 is often \"newer\" or \"higher\" than 9.9. However, mathematically, 9.9 > 9.11.\n    *   Could this be dates? (September 9th vs September 11th). 11th is later.\n    *   Standard interpretation: Without context, decimal numbers are assumed to be mathematical values.\n    *   Common pitfall: Some people mistakenly treat decimals like whole numbers (where 11 > 9), leading them to think 9.11 > 9.9. This is a known cognitive bias or misconception in elementary math.\n    *   Decision: Provide the mathematical answer clearly, but perhaps acknowledge the versioning context if relevant (though usually, for this specific question, it's a math test). Given the simplicity, stick to the mathematical truth first.\n\n4.  **Formulate the Answer:**\n    *   Direct answer: 9.9 is larger.\n    *   Explanation: Compare place values. 9.9 = 9.90, 9.11 = 9.11. 90 hundredths > 11 hundredths.\n    *   Tone: Helpful, clear.\n\n5.  **Drafting the Response (Internal Monologue/Draft):**\n    *   Okay, mathematically 9.9 is bigger. 9.9 is 9 and 9 tenths. 9.11 is 9 and 11 hundredths (or 1 tenth and 1 hundredth). 9 tenths is more than 1 tenth.\n    *   So, 9.9 > 9.11.\n    *   I should answer in Chinese since the question is in Chinese.\n    *   \"9.9 更大。\" (9.9 is larger.)\n    *   Add a brief explanation to avoid confusion. \"因為 9.9 等於 9.90,而 9.90 大於 9.11。\" (Because 9.9 equals 9.90, and 9.90 is greater than 9.11.)\n\n6.  **Refining based on \"Version Number\" possibility:**\n    *   Sometimes this is a trick question about software versions. In semver, 9.11 > 9.9.\n    *   However, usually, when asked plainly \"which is larger\", it refers to numerical value.\n    *   I will provide the numerical answer as primary, maybe add a note about versions if it feels necessary. But keeping it simple is better. Most likely, this is testing the decimal comparison logic.\n    *   Let's stick to the mathematical value as the default.\n\n7.  **Final Polish:**\n    *   Answer: 9.9 大。\n    *   Reasoning: 比較小數時,先看整數部分,再看十分位。9.9 的十分位是 9,9.11 的十分位是 1。9 > 1,所以 9.9 > 9.11。\n\n8.  **Output Generation:** (Matching the thought process)\n    *   State clearly which is larger.\n    *   Explain why.\n\n    *Self-Correction on Date/Time:* The system prompt mentions the current date is 2026. This doesn't affect the math question, but I should ensure I don't reference anything post-2026 if I were adding context (not needed here).\n\n    *Final Decision:* Just answer the math question directly.\n\n    \"從數值上看,9.9 更大。\" (Numerically, 9.9 is larger.)\n    Explain: 9.9 = 9.90, 9.11 = 9.11. 90 > 11.\n\n    Wait, is there any other interpretation?\n    - Dates? 9/9 vs 9/11. 11th is later.\n    - Versions? 
9.11 is newer.\n    - But \"誰大\" (who is big/larger) usually implies magnitude.\n    - I will answer based on numerical magnitude but briefly mention the version context if it adds value without confusing. Actually, better to be decisive. Numerical value is the standard interpretation for \"who is larger\" with numbers.\n\n    Let's go with the numerical answer.\n\n    **Plan:**\n    1. State 9.9 is larger numerically.\n    2. Explain the decimal comparison.\n    3. (Optional but helpful) Mention that if it's version numbers, 9.11 might be considered \"higher\". But prioritize the math.\n    Actually, for a simple query like this, over-explaining versions might be confusing. I'll stick to the math but acknowledge the common confusion.\n\n    *Revised Plan:*\n    1. Direct answer: 9.9 is larger (mathematically).\n    2. Explanation: Place value.\n    3. Note: Unless it's a version number.\n\n    Let's keep it concise.\n\n    \"在數學數值上,9.9 更大。\"\n    \"因為 9.9 等於 9.90,比 9.11 大。\"\n    \"如果是版本號碼,則 9.11 版本通常高於 9.9 版本。\"\n    This covers both bases.\n\n    Wait, looking at common AI training, this is often a \"trap\" question to see if the AI treats decimals like integers. The correct mathematical answer is crucial.\n\n    Okay, final output structure:\n    1. Mathematical answer.\n    2. Brief explanation.\n    3. Contextual note (versions).\n\n    Let's write it.\n    \"從數學數值上看,9.9 更大。\"\n    \"比較小數時,先看整數部分(都是 9),再看十分位:9.9 的十分位是 9,而 9.11 的十分位是 1。因為 9 > 1,所以 9.9 > 9.11。\"\n    \"不過,如果是軟體版本號碼,通常 9.11 版會比 9.9 版更新。\"\n\n    This is balanced.\n\n    Wait, I need to check the instruction again. \"Please remember the current actual time...\". This is just system context. I don't need to mention the date in the answer unless relevant. It's not relevant here.\n\n    Okay, ready to generate.\n\n    Wait, one more check. Is there any cultural context? In some contexts, people might say 9.11 is bigger because 11 > 9. This is the misconception I need to correct.\n    So, emphasizing the mathematical rule is key.\n\n    Final check:\n    9.9 = 9 + 9/10 = 9.90\n    9.11 = 9 + 1/10 + 1/100 = 9.11\n    9.90 > 9.11. Correct.\n\n    Response will be in Chinese.\n    \"從數值大小來看,9.9 更大。\"\n    Explain why.\n    Mention version exception.\n    Done.\n",
                    "type": "summary_text"
                }
            ],
            "type": "reasoning"
        },
        {
            "content": [
                {
                    "annotations": [],
                    "text": "從**數學數值**上看,**9.9 更大**。\n\n原因如下:\n比較小數時,先看整數部分,都是 9;再看十分位(小數點後第一位):\n*   9.9 的十分位是 **9**\n*   9.11 的十分位是 **1**\n\n因為 9 大於 1,所以 **9.9 > 9.11**(可以把 9.9 看作 9.90 來比較)。\n\n**注意**:如果是**軟體版本號碼**,通常 9.11 版會比 9.9 版更新(更高),但在純數字大小比較中,9.9 更大。",
                    "type": "output_text"
                }
            ],
            "id": "msg_xxx",
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": false,
    "status": "completed",
    "tool_choice": "auto",
    "tools": [],
    "usage": {
        "input_tokens": 57,
        "input_tokens_details": {
            "cached_tokens": 0
        },
        "output_tokens": 2018,
        "output_tokens_details": {
            "reasoning_tokens": 1861
        },
        "total_tokens": 2075,
        "x_details": [
            {
                "input_tokens": 57,
                "output_tokens": 2018,
                "output_tokens_details": {
                    "reasoning_tokens": 1861
                },
                "total_tokens": 2075,
                "x_billing_type": "response_api"
            }
        ]
    }
}

Call built-in tools

Enabling built-in tools gives better results on complex tasks. The web page extraction and code interpreter tools are currently free for a limited time. For the supported tools, see Tool calling.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.5-plus",
    input="Find the Alibaba Cloud website and extract key information",
    # For best results, enable all the built-in tools
    tools=[
        {"type": "web_search"},
        {"type": "code_interpreter"},
        {"type": "web_extractor"}
    ],
    extra_body={"enable_thinking": True}
)

# Uncomment the line below to see the intermediate output
# print(response.output)
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "Find the Alibaba Cloud website and extract key information",
        tools: [
            { type: "web_search" },
            { type: "code_interpreter" },
            { type: "web_extractor" }
        ],
        enable_thinking: true
    });

    for (const item of response.output) {
        if (item.type === "reasoning") {
            console.log("Model is thinking...");
        } else if (item.type === "web_search_call") {
            console.log(`Search query: ${item.action.query}`);
        } else if (item.type === "web_extractor_call") {
            console.log("Extracting web content...");
        } else if (item.type === "message") {
            console.log(`Response: ${item.content[0].text}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Find the Alibaba Cloud website and extract key information",
    "tools": [
        {
            "type": "web_search"
        },
        {
            "type": "code_interpreter"
        },
        {
            "type": "web_extractor"
        }
    ],
    "enable_thinking": true
}'

Sample response

{
    "id": "69258b21-5099-9d09-92e8-8492b1955xxx",
    "object": "response",
    "status": "completed",
    "output": [
        {
            "type": "reasoning",
            "summary": [
                {
                    "type": "summary_text",
                    "text": "使用者要求找阿里雲官網並提取資訊..."
                }
            ]
        },
        {
            "type": "web_search_call",
            "status": "completed",
            "action": {
                "query": "阿里雲官網",
                "type": "search",
                "sources": [
                    {
                        "type": "url",
                        "url": "https://cn.aliyun.com/"
                    },
                    {
                        "type": "url",
                        "url": "https://www.alibabacloud.com/zh"
                    }
                ]
            }
        },
        {
            "type": "reasoning",
            "summary": [
                {
                    "type": "summary_text",
                    "text": "搜尋結果顯示阿里雲官網URL..."
                }
            ]
        },
        {
            "type": "web_extractor_call",
            "status": "completed",
            "goal": "提取阿里雲官網首頁的關鍵資訊",
            "output": "通義大模型、完整產品體系、AI解決方案...",
            "urls": [
                "https://cn.aliyun.com/"
            ]
        },
        {
            "type": "message",
            "role": "assistant",
            "status": "completed",
            "content": [
                {
                    "type": "output_text",
                    "text": "阿里雲官網關鍵資訊:通義大模型,雲端運算服務..."
                }
            ]
        }
    ],
    "usage": {
        "input_tokens": 40836,
        "output_tokens": 2106,
        "total_tokens": 42942,
        "output_tokens_details": {
            "reasoning_tokens": 677
        },
        "x_tools": {
            "web_extractor": {
                "count": 1
            },
            "web_search": {
                "count": 1
            }
        }
    }
}

Session cache

Overview

Session cache is a caching mode designed for multi-turn conversations with the Responses API. Unlike explicit caching, which requires you to add cache_control markers manually, session caching is handled automatically on the server side: you only toggle it through an HTTP header and call the API as you would in a normal multi-turn conversation.

When you hold a multi-turn conversation with previous_response_id and session cache is enabled, the server automatically caches the conversation context, reducing inference latency and cost.

Usage

Add the following header to your requests to toggle session cache:

  • x-dashscope-session-cache: enable: turns session cache on.

  • x-dashscope-session-cache: disable: turns session cache off; if the model supports it, implicit caching is used instead.

When using an SDK, pass this header through the default_headers parameter (Python) or defaultHeaders (Node.js); when using curl, pass it with the -H option.

Supported models

qwen3-max, qwen3.5-plus, qwen3.5-flash, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash

Session cache applies only to the Responses API (OpenAI-compatible Responses); it does not apply to the Chat Completions API.

Code examples

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
    # Enable session cache via default_headers
    default_headers={"x-dashscope-session-cache": "enable"}
)

# Build a long text of more than 1024 tokens to ensure cache creation is triggered (if the first request stays below 1024 tokens, the cache is created once the accumulated conversation context exceeds 1024 tokens)
long_context = "人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。" * 50

# First round
response1 = client.responses.create(
    model="qwen3.5-plus",
    input=long_context + "\n\n基於以上背景知識,請簡短介紹機器學習中的隨機森林演算法。",
)
print(f"第一輪迴複: {response1.output_text}")

# Second round: link the context via previous_response_id; caching is handled automatically on the server
response2 = client.responses.create(
    model="qwen3.5-plus",
    input="它和 GBDT 有什麼主要區別?",
    previous_response_id=response1.id,
)
print(f"第二輪迴複: {response2.output_text}")

# Check cache hits
usage = response2.usage
print(f"Input tokens: {usage.input_tokens}")
print(f"Cached tokens: {usage.input_tokens_details.cached_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
    // Enable session cache via defaultHeaders
    defaultHeaders: {"x-dashscope-session-cache": "enable"}
});

// Build a long text of more than 1024 tokens to ensure cache creation is triggered (if the first request stays below 1024 tokens, the cache is created once the accumulated conversation context exceeds 1024 tokens)
const longContext = "人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。".repeat(50);

async function main() {
    // First round
    const response1 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: longContext + "\n\n基於以上背景知識,請簡短介紹機器學習中的隨機森林演算法,包括基本原理和應用情境。"
    });
    console.log(`First response: ${response1.output_text}`);

    // Second round: link the context via previous_response_id; caching is handled automatically on the server
    const response2 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "它和 GBDT 有什麼主要區別?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);

    // Check cache hits
    console.log(`Input tokens: ${response2.usage.input_tokens}`);
    console.log(`Cached tokens: ${response2.usage.input_tokens_details.cached_tokens}`);
}

main();

curl

# First round
# Replace input with a long text of more than 1024 tokens to ensure cache creation is triggered
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.5-plus",
    "input": "人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。人工智慧是電腦科學的一個重要分支,致力於研究和開發能夠類比、延伸和擴充人類智能的理論、方法、技術及應用系統。\n\n基於以上背景知識,請簡短介紹機器學習中的隨機森林演算法,包括基本原理和應用情境。"
}'

# Second round - use the id returned by the first round as previous_response_id
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.5-plus",
    "input": "它和 GBDT 有什麼主要區別?",
    "previous_response_id": "第一輪返回的響應id"
}'

Sample response for the second round (cache hit)

In the result of the second round, the usage.input_tokens_details.cached_tokens field indicates the number of tokens served from the cache.

{
    "id": "145584fd-3dce-4890-99dc-e3896d7f5a42",
    "created_at": 1772440976.0,
    "error": null,
    "incomplete_details": null,
    "instructions": null,
    "metadata": null,
    "model": "qwen3.5-plus",
    "object": "response",
    "output": [
        {
            "id": "msg_62a4e323-d78c-46c7-8469-2ad50f8af4b1",
            "summary": [
                {
                    "text": "Thinking Process:\n\n1.  **Analyze the Request:**\n    *   User asks for the main differences between Random Forest (RF) and GBDT (Gradient Boosting Decision Tree).\n    *   Context: The previous turn provided a definition of AI (repeated many times) and then a brief introduction to Random Forest.\n    *   Constraint: Keep it concise (implied by \"簡短介紹\" in the first turn, though this turn asks for \"主要區別\" - main differences). The tone should be informative and professional.\n    *   Current Date: Monday, March 02, 2026.\n    *   Knowledge Cutoff: 2026.\n\n2.  **Recall Knowledge about RF vs. GBDT:**\n    *   **Bagging vs. Boosting:** RF is Bagging (Parallel), GBDT is Boosting (Sequential).\n    *   **Bias/Variance:** RF reduces variance, GBDT reduces bias.\n    *   **Sample/Feature Selection:** RF uses bootstrap sampling (row sampling) and feature subsampling. GBDT usually uses all samples (weighted) and often all features (though variants like XGBoost/LightGBM introduce feature sampling).\n    *   **Tree Structure:** RF trees are usually deep and independent. GBDT trees are usually shallow (weak learners) and dependent (each corrects the previous).\n    *   **Sensitivity to Outliers:** GBDT is more sensitive.\n    *   **Parallelization:** RF is easily parallelized. GBDT is inherently sequential (though tree construction within a step can be parallelized).\n\n3.  **Draft the Response:**\n    *   Structure: Core difference (Bagging vs. Boosting), Training method, Performance characteristics, Sensitivity.\n    *   Language: Chinese.\n    *   Tone: Professional, clear.\n\n4.  **Refine the Content (Self-Correction/Optimization):**\n    *   Make sure to highlight the *ensemble strategy* (Bagging vs. Boosting) as the fundamental difference.\n    *   Mention parallel vs. sequential training.\n    *   Mention bias/variance trade-off.\n    *   Keep it brief but comprehensive enough for a \"main differences\" question.\n\n5.  **Final Polish (incorporating constraints):**\n    *   Ensure no hallucination.\n    *   Ensure alignment with the 2026 knowledge cutoff (nothing has changed fundamentally in RF vs. GBDT theory).\n    *   Check against the repeated text in the first prompt (ignore the repetition, focus on the actual question).\n\n6.  **Drafting the response (internal monologue):**\n    *   Start with the ensemble type: RF = Bagging, GBDT = Boosting.\n    *   Training: RF = Parallel, independent trees. GBDT = Sequential, dependent trees (fitting residuals).\n    *   Goal: RF = Reduce variance. GBDT = Reduce bias.\n    *   Sensitivity: GBDT more sensitive to outliers.\n    *   Summary table or bullet points for clarity.\n\n7.  **Final Output Generation:** (Matches the drafted points).\n\n    *   **整合方式**:RF 是 Bagging,GBDT 是 Boosting。\n    *   **訓練過程**:RF 並行,GBDT 串列。\n    *   **側重點**:RF 降方差,GBDT 降偏差。\n    *   **異常值**:GBDT 更敏感。\n\n    Let's write it out clearly.\n\n    *   **核心整合策略**:隨機森林基於 **Bagging** 思想,樹之間相互獨立;GBDT 基於 **Boosting** 思想,樹之間呈強依賴關係(後一棵樹擬合前一棵樹的殘差)。\n    *   **訓練方式**:隨機森林支援**並行訓練**,效率較高;GBDT 必須**串列訓練**,難以並行(儘管工程實現如 XGBoost 做了最佳化)。\n    *   **模型目標**:隨機森林主要降低**方差**(防止過擬合);GBDT 主要降低**偏差**(提升擬合能力)。\n    *   **異常值敏感度**:GBDT 對異常值更敏感,隨機森林相對魯棒。\n\n    Looks good. Keep it concise.\n\n8.  **Final Review:** Does this answer the user's question directly? Yes. Is it accurate? Yes. Is it appropriate for the context? 
Yes.\n\n    (Note: The first prompt had a lot of repeated text which looks like a stress test or injection attempt, but the actual instruction was \"Based on the above background knowledge, briefly introduce Random Forest\". The second prompt is a follow-up. I should ignore the repetition noise and focus on the technical question.)\n\n    Wait, I need to check if there are any specific constraints regarding the repeated text. The user just pasted the same sentence many times. It doesn't change the factual knowledge about RF vs GBDT. I will proceed with the technical answer.\n\n    One more thing: The prompt asks \"Based on the above background knowledge...\" in the first turn. The second turn just asks \"What are the main differences with GBDT?\". I should answer directly.\n\n    Let's formulate the response.\n\n    **Structure:**\n    1.  Ensemble Strategy (Bagging vs Boosting)\n    2.  Training Process (Parallel vs Serial)\n    3.  Bias/Variance Focus\n    4.  Sensitivity\n\n    Ready to write.cw",
                    "type": "summary_text"
                }
            ],
            "type": "reasoning",
            "content": null,
            "encrypted_content": null,
            "status": null
        },
        {
            "id": "msg_560e34a6-1bdf-42ae-993e-590b38249146",
            "content": [
                {
                    "annotations": [],
                    "text": "隨機森林(Random Forest)與 GBDT(Gradient Boosting Decision Tree)雖然都是基於決策樹的整合演算法,但存在以下主要區別:\n\n1.  **整合策略不同**\n    *   **隨機森林**:基於 **Bagging** 思想。每棵樹獨立訓練,彼此之間沒有依賴關係。\n    *   **GBDT**:基於 **Boosting** 思想。樹之間呈強依賴關係,後一棵樹旨在擬合前一棵樹預測結果的殘差(負梯度)。\n\n2.  **訓練方式不同**\n    *   **隨機森林**:支援**並行訓練**,因為樹之間獨立,計算效率通常較高。\n    *   **GBDT**:必須**串列訓練**,因為後一棵樹依賴前一棵樹的輸出,難以天然並行(儘管工程實現如 XGBoost 在特徵粒度上做了並行最佳化)。\n\n3.  **最佳化目標不同**\n    *   **隨機森林**:主要通過平均多個模型來降低**方差**(Variance),防止過擬合,提升穩定性。\n    *   **GBDT**:主要通過逐步修正錯誤來降低**偏差**(Bias),提升模型的擬合能力和精度。\n\n4.  **對異常值的敏感度**\n    *   **隨機森林**:相對魯棒,對異常值不敏感。\n    *   **GBDT**:對異常值較為敏感,因為異常值會產生較大的殘差,影響後續樹的擬合方向。\n\n總結來說,隨機森林勝在穩定和並行效率,而 GBDT 通常在精度上表現更優,但調參更複雜且訓練較慢。",
                    "type": "output_text",
                    "logprobs": null
                }
            ],
            "role": "assistant",
            "status": "completed",
            "type": "message",
            "phase": null
        }
    ],
    "parallel_tool_calls": false,
    "temperature": null,
    "tool_choice": "auto",
    "tools": [],
    "top_p": null,
    "background": null,
    "completed_at": null,
    "conversation": null,
    "max_output_tokens": null,
    "max_tool_calls": null,
    "previous_response_id": null,
    "prompt": null,
    "prompt_cache_key": null,
    "prompt_cache_retention": null,
    "reasoning": null,
    "safety_identifier": null,
    "service_tier": null,
    "status": "completed",
    "text": null,
    "top_logprobs": null,
    "truncation": null,
    "usage": {
        "input_tokens": 1524,
        "input_tokens_details": {
            "cached_tokens": 1305
        },
        "output_tokens": 1534,
        "output_tokens_details": {
            "reasoning_tokens": 1187
        },
        "total_tokens": 3058,
        "x_details": [
            {
                "input_tokens": 1524,
                "output_tokens": 1534,
                "output_tokens_details": {
                    "reasoning_tokens": 1187
                },
                "prompt_tokens_details": {
                    "cache_creation": {
                        "ephemeral_5m_input_tokens": 213
                    },
                    "cache_creation_input_tokens": 213,
                    "cache_type": "ephemeral",
                    "cached_tokens": 1305
                },
                "total_tokens": 3058,
                "x_billing_type": "response_api"
            }
        ]
    },
    "user": null
}

The input_tokens for the second round is 1524, of which cached_tokens is 1305, indicating that the first-round context was served from the cache, which effectively reduces inference latency and cost.

Billing

Session cache is billed under the same rules as explicit caching (a cost-estimation sketch follows this list):

  • Cache creation: billed at 125% of the standard input token price.

  • Cache hit: billed at 10% of the standard input token price.

    The number of cache-hit tokens is reported in the usage.input_tokens_details.cached_tokens field.
  • Other tokens: tokens that neither hit nor create the cache are billed at the standard price.
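The following is a minimal Python sketch of how these rules combine into an input-token cost estimate. The unit price is a hypothetical placeholder, and the token counts are taken from the sample response above (cache-creation tokens appear under usage.x_details[0].prompt_tokens_details.cache_creation_input_tokens in that sample):

# Hypothetical per-token input price; replace with the actual price of your model
UNIT_PRICE = 0.000002

input_tokens = 1524           # usage.input_tokens
cached_tokens = 1305          # usage.input_tokens_details.cached_tokens
cache_creation_tokens = 213   # x_details[0].prompt_tokens_details.cache_creation_input_tokens

other_tokens = input_tokens - cached_tokens - cache_creation_tokens
estimated_cost = (cache_creation_tokens * 1.25
                  + cached_tokens * 0.10
                  + other_tokens) * UNIT_PRICE
print(f"Estimated input-token cost: {estimated_cost:.6f}")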

Constraints

  • The minimum cacheable prompt length is 1024 tokens.

  • A cache entry is valid for 5 minutes; the timer resets on every hit.

  • Applies only to the Responses API and must be used with the previous_response_id parameter for multi-turn conversations.

  • Session cache is mutually exclusive with explicit and implicit caching; when it is enabled, the other two modes do not take effect.

Migrate from Chat Completions to the Responses API

If you currently use the OpenAI Chat Completions API, you can migrate to the Responses API by following these steps. The Responses API offers a more concise interface and more powerful features while remaining compatible with Chat Completions.

1. Update the endpoint path and base_url

You need to update both of the following:

  • Endpoint path: change /v1/chat/completions to /v1/responses

  • base_url:

    • China (Beijing): change https://dashscope.aliyuncs.com/compatible-mode/v1 to https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

    • Singapore: change https://dashscope-intl.aliyuncs.com/compatible-mode/v1 to https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

Python

# Chat Completions API
completion = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(completion.choices[0].message.content)

# Responses API - can use the same message format
response = client.responses.create(
    model="qwen3.5-plus",
    input=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.output_text)

# Responses API - or use a more concise format
response = client.responses.create(
    model="qwen3.5-plus",
    input="Hello!"
)
print(response.output_text)

Node.js

// Chat Completions API
const completion = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Hello!" }
    ]
});
console.log(completion.choices[0].message.content);

// Responses API - can use the same message format
const response = await client.responses.create({
    model: "qwen3.5-plus",
    input: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Hello!" }
    ]
});
console.log(response.output_text);

// Responses API - or use a more concise format
const response2 = await client.responses.create({
    model: "qwen3.5-plus",
    input: "Hello!"
});
console.log(response2.output_text);

curl

# Chat Completions API
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
}'

# Responses API - use a more concise format
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Hello!"
}'

2. Update response handling

The response structure of the Responses API is different. Use the output_text convenience property to get the text output, or access the details through the output array.
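A minimal Python sketch of the two access patterns described above, assuming response is a Responses API result such as the one returned in the basic call example:

# Shortcut: aggregated text of all output_text parts
print(response.output_text)

# Detailed access: walk the output array and read each message part
for item in response.output:
    if item.type == "message":
        for part in item.content:
            if part.type == "output_text":
                print(part.text)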

Response comparison

# Chat Completions Response
{
  "id": "chatcmpl-416b0ea5-e362-9fec-97c5-0a60b5d7xxx",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Hello! I'm happy to see you~  How can I help you?",
        "refusal": null,
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1769416269,
  "model": "qwen3.5-plus",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 14,
    "prompt_tokens": 22,
    "total_tokens": 36,
    "prompt_tokens_details": {
      "cached_tokens": 0
    }
  }
}
# Responses API Response
{
  "id": "d69c735d-0f5e-4b6c-9c2a-8cab5eb14xxx",
  "created_at": 1769416269.0,
  "model": "qwen3.5-plus",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "id": "msg_3426d3e5-8da7-4dd8-a6a5-7c2cd866xxx",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Hello! Today is Monday, January 26, 2026. How can I help you? ",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 34,
    "output_tokens": 25,
    "total_tokens": 59,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}

3. Simplify multi-turn conversation management

With Chat Completions you must manually manage the message history array, while the Responses API provides the previous_response_id parameter to automatically link context. A response id is currently valid for 7 days.

Python

# Chat Completions - manual message history management
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
]
res1 = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=messages
)

# Manually add response to history
messages.append(res1.choices[0].message)
messages.append({"role": "user", "content": "What is its population?"})

res2 = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=messages
)
# Responses API - automatic linking with previous_response_id
res1 = client.responses.create(
    model="qwen3.5-plus",
    input="What is the capital of France?"
)

# Just pass the previous response ID
res2 = client.responses.create(
    model="qwen3.5-plus",
    input="What is its population?",
    previous_response_id=res1.id
)

Node.js

// Chat Completions - manual message history management
let messages = [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" }
];
const res1 = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages
});

// Manually add response to history
messages = messages.concat([res1.choices[0].message]);
messages.push({ role: "user", content: "What is its population?" });

const res2 = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages
});
// Responses API - automatic linking with previous_response_id
const res1 = await client.responses.create({
    model: "qwen3.5-plus",
    input: "What is the capital of France?"
});

// Just pass the previous response ID
const res2 = await client.responses.create({
    model: "qwen3.5-plus",
    input: "What is its population?",
    previous_response_id: res1.id
});

4. Use built-in tools

The Responses API has a range of built-in tools, so you do not need to implement them yourself; simply specify them in the tools parameter. The code interpreter and web page extraction tools are currently free for a limited time. For details, see Tool calling.

Python

# Chat Completions - need to implement tool functions yourself
def web_search(query):
    # Need to implement web search logic yourself
    import requests
    r = requests.get(f"https://api.example.com/search?q={query}")
    return r.json().get("results", [])

completion = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=[{"role": "user", "content": "Who is the current president of France?"}],
    functions=[{
        "name": "web_search",
        "description": "Search the web for information",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"]
        }
    }]
)
# Responses API - use built-in tools directly
response = client.responses.create(
    model="qwen3.5-plus",
    input="Who is the current president of France?",
    tools=[{"type": "web_search"}]  # Enable web search directly
)
print(response.output_text)

Node.js

// Chat Completions - need to implement tool functions yourself
async function web_search(query) {
    const fetch = (await import('node-fetch')).default;
    const res = await fetch(`https://api.example.com/search?q=${query}`);
    const data = await res.json();
    return data.results;
}

const completion = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages: [{ role: "user", content: "Who is the current president of France?" }],
    functions: [{
        name: "web_search",
        description: "Search the web for information",
        parameters: {
            type: "object",
            properties: { query: { type: "string" } },
            required: ["query"]
        }
    }]
});
// Responses API - use built-in tools directly
const response = await client.responses.create({
    model: "qwen3.5-plus",
    input: "Who is the current president of France?",
    tools: [{ type: "web_search" }]  // Enable web search directly
});
console.log(response.output_text);

curl

# Chat Completions - need to implement tools yourself
# Example of calling an external search API
curl https://api.example.com/search \
  -G \
  --data-urlencode "q=current president of France" \
  --data-urlencode "key=$SEARCH_API_KEY"
# Responses API - use built-in tools directly
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Who is the current president of France?",
    "tools": [{"type": "web_search"}]
}'

FAQ

Q: How do I pass the context of a multi-turn conversation?

A: When sending a new round of the conversation, pass the id returned in the previous successful model response as the previous_response_id parameter.

Q: Why can't I print output_text?

A: Some versions of the OpenAI Python SDK (such as 1.99.2) mistakenly removed this property. Update the SDK to the latest version to avoid this error.
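If you cannot upgrade immediately, a minimal workaround sketch is to rebuild the text from the output array yourself (this helper is not part of the SDK):

def get_output_text(response):
    # Concatenate all output_text parts from message items
    parts = []
    for item in response.output:
        if item.type == "message":
            for part in item.content:
                if part.type == "output_text":
                    parts.append(part.text)
    return "".join(parts)

print(get_output_text(response))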