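How to use deep thinking models - Alibaba Cloud Model Studio

Deep thinking models reason before they generate a response, which improves accuracy on complex tasks such as logical reasoning and numerical calculation. This topic describes how to call deep thinking models such as Qwen and DeepSeek.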

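Implementation guide

Alibaba Cloud Model Studio provides APIs for a range of deep thinking models. These models run in one of two modes: hybrid thinking mode or thinking-only mode.

Hybrid thinking mode: The enable_thinking parameter controls whether thinking mode is enabled.

If set to true, the model reasons before responding.
If set to false, the model responds directly.

OpenAI compatible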

# Import dependencies and create the client...
completion = client.chat.completions.create(
    model="qwen-plus",  # Select a model
    messages=[{"role": "user", "content": "Who are you"}],
    # enable_thinking is not a standard OpenAI parameter, so pass it through extra_body
    extra_body={"enable_thinking": True},
    # Call in streaming output mode
    stream=True,
    # Include token usage in the last data packet of the streamed response
    stream_options={
        "include_usage": True
    }
)

DashScope

# Import dependencies...

response = Generation.call(
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # You can replace this with another deep thinking model as needed
    model="qwen-plus",
    messages=messages,
    result_format="message",
    enable_thinking=True,
    stream=True,
    incremental_output=True
)

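Thinking-only mode: The model always reasons before responding, and this behavior cannot be disabled. The request format is the same as in hybrid thinking mode, except that the enable_thinking parameter is not needed.

The thinking process is returned in the reasoning_content field, and the response is returned in the content field. Because deep thinking models reason before responding, response times are longer. Most of these models support only streaming output, so the examples in this topic use streaming calls.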
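The split between the two fields can be sketched without a network call. The following is a minimal illustration, assuming each streamed delta exposes optional reasoning_content and content attributes as described above. The split_stream helper and the fake deltas are hypothetical and are not part of any SDK.

```python
from types import SimpleNamespace


def split_stream(deltas):
    """Accumulate reasoning text and answer text from an iterable of deltas."""
    reasoning_parts, answer_parts = [], []
    for delta in deltas:
        # Either field may be missing, None, or an empty string on a given chunk.
        reasoning = getattr(delta, "reasoning_content", None)
        content = getattr(delta, "content", None)
        if reasoning:
            reasoning_parts.append(reasoning)
        if content:
            answer_parts.append(content)
    return "".join(reasoning_parts), "".join(answer_parts)


# Hypothetical deltas standing in for a real stream.
fake_deltas = [
    SimpleNamespace(reasoning_content="The user ", content=None),
    SimpleNamespace(reasoning_content="asks who I am.", content=None),
    SimpleNamespace(reasoning_content=None, content="I am "),
    SimpleNamespace(reasoning_content=None, content="Qwen."),
]
reasoning, answer = split_stream(fake_deltas)
print(reasoning)
print(answer)
```

Keeping the two buffers separate makes it easy to show or log the thinking process independently of the final answer.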

Supported models

Qwen3

Commercial edition
- Qwen-Max series (hybrid thinking mode, disabled by default): qwen3-max, qwen3-max-2026-01-23, qwen3-max-preview
- Qwen-Plus series (hybrid thinking mode, disabled by default): qwen-plus, qwen-plus-latest, qwen-plus-2025-04-28, and later snapshot models
- Qwen-Flash series (hybrid thinking mode, disabled by default): qwen-flash, qwen-flash-2025-07-28, and later snapshot models
- Qwen-Turbo series (hybrid thinking mode, disabled by default): qwen-turbo, qwen-turbo-latest, qwen-turbo-2025-04-28, and later snapshot models
Open source edition
- Hybrid thinking mode, enabled by default: qwen3-235b-a22b, qwen3-32b, qwen3-30b-a3b, qwen3-14b, qwen3-8b, qwen3-4b, qwen3-1.7b, qwen3-0.6b
- Thinking-only mode: qwen3-next-80b-a3b-thinking, qwen3-235b-a22b-thinking-2507, qwen3-30b-a3b-thinking-2507

QwQ (based on Qwen2.5)

Thinking-only mode: qwq-plus, qwq-plus-latest, qwq-plus-2025-03-05, qwq-32b

DeepSeek (Beijing region)

Hybrid thinking mode, disabled by default: deepseek-v3.2, deepseek-v3.2-exp, deepseek-v3.1
Thinking-only mode: deepseek-r1, deepseek-r1-0528, deepseek-r1 distilled models

Kimi (Beijing region)

Thinking-only mode: kimi-k2-thinking

For details about model names, context, pricing, and snapshot versions, see "Model list". For details about rate limits, see "Rate limits".
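Because the default differs by model (disabled for the commercial hybrid models, enabled for the open source hybrid models, always on for thinking-only models), a caller may prefer to set enable_thinking explicitly. The sketch below assumes a small hand-copied subset of the list above. The two sets and the thinking_params helper are hypothetical and are not an official registry.

```python
# Partial subset of the models listed above, copied by hand for illustration.
HYBRID_MODELS = {"qwen-plus", "qwen-flash", "qwen-turbo", "qwen3-max", "qwen3-32b", "deepseek-v3.1"}
THINKING_ONLY_MODELS = {"qwq-plus", "qwq-32b", "deepseek-r1", "kimi-k2-thinking"}


def thinking_params(model: str, want_thinking: bool) -> dict:
    """Return the extra request parameters to send for the given model."""
    if model in THINKING_ONLY_MODELS:
        # Thinking cannot be disabled, so the parameter is not needed.
        return {}
    if model in HYBRID_MODELS:
        # Set the flag explicitly instead of relying on the per-model default.
        return {"enable_thinking": want_thinking}
    raise ValueError(f"Model not in the local table: {model}")


print(thinking_params("qwen-plus", True))
print(thinking_params("qwq-plus", True))
```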

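Quick start

Prerequisites: Complete "Create an API key" and "Set the API key as an environment variable". To use an SDK, also complete "Install the OpenAI or DashScope SDK" (the DashScope SDK for Java must be version 2.19.4 or later).

Run the following code to call the qwen-plus model in thinking mode with streaming output.

OpenAI compatible

Python

Sample code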

from openai import OpenAI
import os

# Initialize the OpenAI client
client = OpenAI(
    # If the environment variable is not set, replace the following with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following is the base_url for the Singapore region. To use models in the Virginia region, replace base_url with https://dashscope-us.aliyuncs.com/compatible-mode/v1
    # To use models in the Beijing region, replace base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Who are you"}]

completion = client.chat.completions.create(
    model="qwen-plus",  # You can replace this with another deep thinking model as needed
    messages=messages,
    extra_body={"enable_thinking": True},
    stream=True,
    stream_options={
        "include_usage": True
    },
)

reasoning_content = ""  # Full thinking process
answer_content = ""  # Full response
is_answering = False  # Whether the response phase has started
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    if not chunk.choices:
        print("\nUsage:")
        print(chunk.usage)
        continue

    delta = chunk.choices[0].delta

    # Collect only the thinking content
    if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
        if not is_answering:
            print(delta.reasoning_content, end="", flush=True)
        reasoning_content += delta.reasoning_content

    # Once content arrives, the response phase has started
    if hasattr(delta, "content") and delta.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
            is_answering = True
        print(delta.content, end="", flush=True)
        answer_content += delta.content

Response

====================Thinking process====================

Okay, the user asked "Who are you", so I need to provide an accurate and friendly response. First, I need to confirm my identity, which is Qwen, developed by the Tongyi Lab under Alibaba Group. Next, I should explain my main functions, such as answering questions, creating text, and logical reasoning. At the same time, I should maintain a friendly tone and avoid being too technical to make the user feel at ease. I also need to avoid using complex terminology to ensure the response is concise and clear. In addition, I might need to add some interactive elements, inviting the user to ask questions to encourage further communication. Finally, I will check if I have missed any important information, such as my Chinese name "Tongyi Qianwen" and English name "Qwen", along with my parent company and lab. I need to ensure the response is comprehensive and meets the user's expectations.
====================Full response====================

Hello! I am Qwen, an ultra-large language model independently developed by the Tongyi Lab under Alibaba Group. I can answer questions, create text, perform logical reasoning, and write code, with the goal of providing users with high-quality information and services. You can call me Qwen, or just Tongyi Qianwen. How can I help you?

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY, // Read from the environment variable
    // The following is the baseURL for the Singapore region. To use models in the Virginia region, replace baseURL with https://dashscope-us.aliyuncs.com/compatible-mode/v1
    // To use models in the Beijing region, replace baseURL with https://dashscope.aliyuncs.com/compatible-mode/v1
    baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = '';
let answerContent = '';
let isAnswering = false;

async function main() {
    try {
        const messages = [{ role: 'user', content: 'Who are you' }];
        const stream = await openai.chat.completions.create({
            model: 'qwen-plus',
            messages,
            stream: true,
            enable_thinking: true
        });
        console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');

        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log('\nUsage:');
                console.log(chunk.usage);
                continue;
            }

            const delta = chunk.choices[0].delta;
            
            // Collect only the thinking content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }

            // Once content arrives, the response phase has started
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();

Response

====================Thinking process====================

Okay, the user asked "Who are you", so I need to answer about my identity. First, I should clearly state that I am Qwen, an ultra-large language model developed by Alibaba Cloud. Next, I can mention my main functions, such as answering questions, creating text, and logical reasoning. I should also emphasize my multilingual support, including Chinese and English, so the user knows I can handle requests in different languages. In addition, I might need to explain my application scenarios, such as helping with study, work, and daily life. However, the user's question is quite direct, so I probably do not need to provide too much detailed information. I should keep it concise and clear. At the same time, I need to ensure a friendly tone and invite the user to ask further questions. I will check if I have missed any important information, such as my version or latest updates, but the user probably does not need that much detail. Finally, I will confirm that the response is accurate and contains no errors.
====================Full response====================

I am Qwen, an ultra-large language model independently developed by the Tongyi Lab under Alibaba Group. I can perform various tasks such as answering questions, creating text, logical reasoning, and coding. I support multiple languages, including Chinese and English. If you have any questions or need help, feel free to let me know!

HTTP

Sample code

curl

# ======= Important =======
# The following is the URL for the Singapore region. To use models in the Beijing region, replace the URL with https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
# To use models in the Virginia region, replace the URL with https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completions
# === Delete this comment before running ===
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus",
    "messages": [
        {
            "role": "user", 
            "content": "Who are you"
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "enable_thinking": true
}'

Response

data: {"choices":[{"delta":{"content":null,"role":"assistant","reasoning_content":""},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1745485391,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}

.....

data: {"choices":[{"finish_reason":"stop","delta":{"content":"","reasoning_content":null},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1745485391,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}

data: {"choices":[],"object":"chat.completion.chunk","usage":{"prompt_tokens":10,"completion_tokens":360,"total_tokens":370},"created":1745485391,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}

data: [DONE]
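A raw stream like the one above can be reassembled by reading each data: line until the [DONE] marker. The following is a minimal sketch, assuming the chunk layout shown in the response above. The parse_sse_lines helper and the abbreviated sample lines are hypothetical.

```python
import json


def parse_sse_lines(lines):
    """Rebuild reasoning, answer, and usage from OpenAI-compatible SSE lines."""
    reasoning, answer, usage = "", "", None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and anything that is not a data line
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        if not chunk["choices"]:
            # The last chunk has an empty choices list and carries the usage.
            usage = chunk.get("usage")
            continue
        delta = chunk["choices"][0]["delta"]
        reasoning += delta.get("reasoning_content") or ""
        answer += delta.get("content") or ""
    return reasoning, answer, usage


# Abbreviated, hypothetical lines modeled on the response above.
sample = [
    'data: {"choices":[{"delta":{"content":null,"reasoning_content":"Hmm"},"index":0}],"usage":null}',
    'data: {"choices":[{"delta":{"content":"Hello","reasoning_content":null},"index":0}],"usage":null}',
    'data: {"choices":[],"usage":{"prompt_tokens":10,"completion_tokens":360,"total_tokens":370}}',
    "data: [DONE]",
]
reasoning, answer, usage = parse_sse_lines(sample)
print(reasoning, answer, usage)
```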

DashScope

Python

Sample code

import os
from dashscope import Generation
import dashscope

# The following is the base_url for the Singapore region. To use models in the Virginia region, replace base_url with https://dashscope-us.aliyuncs.com/api/v1
# To use models in the Beijing region, replace base_url with https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"

messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="qwen-plus",
    messages=messages,
    result_format="message",
    enable_thinking=True,
    stream=True,
    incremental_output=True,
)

# Holds the full thinking process.
reasoning_content = ""
# Holds the full response.
answer_content = ""
# Tracks whether the thinking process has ended and the response has started.
is_answering = False

print("=" * 20 + "Thinking process" + "=" * 20)

for chunk in completion:
    # Ignore chunks in which both the thinking process and the response are empty.
    if (
        chunk.output.choices[0].message.content == ""
        and chunk.output.choices[0].message.reasoning_content == ""
    ):
        pass
    else:
        # The chunk belongs to the thinking process.
        if (
            chunk.output.choices[0].message.reasoning_content != ""
            and chunk.output.choices[0].message.content == ""
        ):
            print(chunk.output.choices[0].message.reasoning_content, end="", flush=True)
            reasoning_content += chunk.output.choices[0].message.reasoning_content
        # The chunk belongs to the response.
        elif chunk.output.choices[0].message.content != "":
            if not is_answering:
                print("\n" + "=" * 20 + "Full response" + "=" * 20)
                is_answering = True
            print(chunk.output.choices[0].message.content, end="", flush=True)
            answer_content += chunk.output.choices[0].message.content

# To print the full thinking process and response, uncomment and run the following code.
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(f"{reasoning_content}")
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(f"{answer_content}")

Response

====================Thinking process====================
Okay, the user is asking, "Who are you?" I need to answer this question. First, I must clarify my identity: I am Qwen, a large-scale language model developed by Alibaba Cloud. Next, I need to explain my functions and uses, such as answering questions, creating text, and logical reasoning. I should also emphasize that my goal is to be a helpful assistant to the user, providing help and support.

When responding, I should maintain a conversational tone and avoid technical jargon or complex sentences. I can add friendly phrases, like "Hello there!~", to make the conversation more natural. Also, I must ensure the information is accurate and does not omit key points, such as my developer, main functions, and use cases.

I also need to consider potential follow-up questions from the user, such as specific application examples or technical details. So, I can subtly plant seeds in my response to encourage further questions. For example, mentioning "Whether it's a question about daily life or a professional topic, I can do my best to help" is both comprehensive and open-ended.

Finally, I will check if the response is fluent and free of repetitive or redundant information, ensuring it is concise and clear. I will also maintain a balance between being friendly and professional, so the user finds me both approachable and reliable.
====================Full response====================
Hello there!~ I am Qwen, a large-scale language model developed by Alibaba Cloud. I can answer questions, create text, perform logical reasoning, write code, and more, with the goal of providing help and support to users. Whether you have questions about daily life or professional topics, I will do my best to assist. How can I help you?

Java

Sample code

// DashScope SDK version >= 2.19.4
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import com.alibaba.dashscope.utils.Constants;

public class Main {
    static {
        // The following is the base URL for the Singapore region. To use models in the Virginia region, replace it with https://dashscope-us.aliyuncs.com/api/v1
        // To use models in the Beijing region, replace it with https://dashscope.aliyuncs.com/api/v1
        Constants.baseHttpApiUrl="https://dashscope-intl.aliyuncs.com/api/v1";
    }
    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();

        if (!reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }

        if (!content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Full response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If the environment variable is not set, replace the next line with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen-plus")
                .enableThinking(true)
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
//            Print the final result.
//            if (reasoningContent.length() > 0) {
//                System.out.println("\n====================Full response====================");
//                System.out.println(finalContent.toString());
//            }
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}

Response

====================Thinking process====================
Okay, the user is asking "Who are you?", and I need to answer based on my predefined settings. First, my role is Qwen, a large-scale language model from Alibaba Group. I should keep the tone conversational, simple, and easy to understand.

The user might be new to me or wants to confirm my identity. I should first state who I am directly, then briefly explain my functions and uses, such as answering questions, creating text, and coding. I should also mention my multilingual support so the user knows I can handle requests in different languages.

Also, according to the guidelines, I should maintain a human-like persona, so the tone should be friendly. I might use emojis to add a touch of warmth. At the same time, I might need to guide the user to ask more questions or use my features, for example, by asking what they need help with.

I need to be careful not to use complex terminology and avoid being verbose. I will check for any missed key points, such as multilingual support and specific capabilities. I must ensure the response meets all requirements, including being conversational and concise.
====================Full response====================
Hello! I am Qwen, a large-scale language model from Alibaba Group. I can answer questions and create text, such as stories, official documents, emails, and playbooks. I can also perform logical reasoning, write code, express opinions, and play games. I am proficient in multiple languages, including but not limited to Chinese, English, German, French, and Spanish. Is there anything I can help you with?

HTTP

Sample code

curl

# ======= Important =======
# The following is the URL for the Singapore region. To use models in the Beijing region, replace the URL with https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation
# To use models in the Virginia region, replace the URL with https://dashscope-us.aliyuncs.com/api/v1/services/aigc/text-generation/generation
# === Delete this comment before running ===
curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "qwen-plus",
    "input":{
        "messages":[      
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters":{
        "enable_thinking": true,
        "incremental_output": true,
        "result_format": "message"
    }
}'

Response

id:1
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"Hmm","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":14,"input_tokens":11,"output_tokens":3},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:2
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":",","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":15,"input_tokens":11,"output_tokens":4},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:3
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"the user","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":16,"input_tokens":11,"output_tokens":5},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:4
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":" asks","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":17,"input_tokens":11,"output_tokens":6},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:5
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":" '","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":18,"input_tokens":11,"output_tokens":7},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
......

id:358
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"help","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":373,"input_tokens":11,"output_tokens":362},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:359
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":",","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":374,"input_tokens":11,"output_tokens":363},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:360
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" feel free","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":375,"input_tokens":11,"output_tokens":364},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:361
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" to","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":376,"input_tokens":11,"output_tokens":365},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:362
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" let me know","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":377,"input_tokens":11,"output_tokens":366},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:363
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"!","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":378,"input_tokens":11,"output_tokens":367},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:364
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"","role":"assistant"},"finish_reason":"stop"}]},"usage":{"total_tokens":378,"input_tokens":11,"output_tokens":367},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
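The event stream above can be folded into a final result by reading only the data: lines. The following is a minimal sketch, assuming incremental_output is true so that each event carries only the new text, and that non-final events use the string "null" for finish_reason as shown above. The parse_dashscope_events helper and the abbreviated sample are hypothetical.

```python
import json


def parse_dashscope_events(lines):
    """Rebuild reasoning, answer, and usage from DashScope SSE lines."""
    reasoning, answer, usage = "", "", None
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip id:, event:, and :HTTP_STATUS lines
        event = json.loads(line[len("data:"):])
        choice = event["output"]["choices"][0]
        message = choice["message"]
        reasoning += message.get("reasoning_content") or ""
        answer += message.get("content") or ""
        # In the sample above, every event reports running token totals.
        usage = event.get("usage")
        if choice.get("finish_reason") == "stop":
            break
    return reasoning, answer, usage


# Abbreviated lines modeled on events 1, 358, and 364 above (request_id omitted).
sample = [
    "id:1",
    "event:result",
    ":HTTP_STATUS/200",
    'data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"Hmm","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":14,"input_tokens":11,"output_tokens":3}}',
    'data:{"output":{"choices":[{"message":{"content":"help","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":373,"input_tokens":11,"output_tokens":362}}',
    'data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"","role":"assistant"},"finish_reason":"stop"}]},"usage":{"total_tokens":378,"input_tokens":11,"output_tokens":367}}',
]
reasoning, answer, usage = parse_dashscope_events(sample)
print(reasoning, answer, usage)
```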

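Core features

Switching between thinking and non-thinking modes

Enabling thinking mode usually improves response quality, but it also increases response latency and cost. With a model that supports hybrid thinking mode, you can switch between thinking and non-thinking modes dynamically based on the complexity of the question, without changing models.

For tasks that do not require complex reasoning, such as everyday chat or simple Q&A, set enable_thinking to false to disable thinking mode.
For tasks that require complex reasoning, such as logical reasoning, code generation, or solving math problems, set enable_thinking to true to enable thinking mode.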
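One way to apply this guidance is to decide the flag per request. The sketch below assumes the application already labels each request with a task type. The labels, should_think, and build_request are hypothetical names chosen for illustration.

```python
# Hypothetical task labels; a real application would define its own.
REASONING_TASKS = {"logical_reasoning", "code_generation", "math"}


def should_think(task_type: str) -> bool:
    """Enable thinking only for tasks that need complex reasoning."""
    return task_type in REASONING_TASKS


def build_request(model: str, user_text: str, task_type: str) -> dict:
    """Build keyword arguments for an OpenAI-compatible streaming call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": True,
        # enable_thinking is non-standard, so it travels in extra_body.
        "extra_body": {"enable_thinking": should_think(task_type)},
    }


chat_request = build_request("qwen-plus", "Good morning!", "chat")
math_request = build_request("qwen-plus", "Prove that 17 is prime.", "math")
print(chat_request["extra_body"], math_request["extra_body"])
```

With the OpenAI Python SDK, the resulting dictionary could be unpacked into the call, for example client.chat.completions.create(**math_request).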

OpenAI compatible

Important

enable_thinking is not a standard OpenAI parameter. With the OpenAI Python SDK, pass it through extra_body. With the Node.js SDK, pass it as a top-level parameter.
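Both styles are expected to produce the same JSON request body, because the Python SDK adds the extra_body keys to the body it sends. The sketch below is a simplified model of that merge, not the SDK's actual code. The to_request_body helper is hypothetical.

```python
import json


def to_request_body(standard_params: dict, extra_body: dict) -> dict:
    """Simplified model: extra_body keys are added to the top level of the body."""
    return {**standard_params, **extra_body}


standard = {
    "model": "qwen-plus",
    "messages": [{"role": "user", "content": "Who are you"}],
    "stream": True,
}

# Python SDK style: the non-standard flag is passed through extra_body.
python_style = to_request_body(standard, {"enable_thinking": True})

# Node.js or curl style: the flag already sits at the top level.
top_level_style = {**standard, "enable_thinking": True}

print(json.dumps(python_style, sort_keys=True) == json.dumps(top_level_style, sort_keys=True))
```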

Python

Sample code

from openai import OpenAI
import os

# Initialize the OpenAI client
client = OpenAI(
    # If the environment variable is not set, replace the following with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following is the base_url for the Singapore region. To use models in the Virginia region, replace base_url with https://dashscope-us.aliyuncs.com/compatible-mode/v1
    # To use models in the Beijing region, replace base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Who are you"}]
completion = client.chat.completions.create(
    model="qwen-plus",
    messages=messages,
    # Set enable_thinking through extra_body to turn on the thinking process
    extra_body={"enable_thinking": True},
    stream=True,
    stream_options={
        "include_usage": True
    },
)

reasoning_content = ""  # Full thinking process
answer_content = ""  # Full response
is_answering = False  # Whether the response phase has started
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    if not chunk.choices:
        print("\n" + "=" * 20 + "Token consumption" + "=" * 20 + "\n")
        print(chunk.usage)
        continue

    delta = chunk.choices[0].delta

    # Collect only the thinking content
    if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
        if not is_answering:
            print(delta.reasoning_content, end="", flush=True)
        reasoning_content += delta.reasoning_content

    # Once content arrives, the response phase has started
    if hasattr(delta, "content") and delta.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
            is_answering = True
        print(delta.content, end="", flush=True)
        answer_content += delta.content

Response

====================Thinking process====================

Okay, the user is asking 'Who are you'. I need to figure out what they want to know. They might be interacting with me for the first time or want to confirm my identity. I should start by introducing myself as Qwen, developed by Tongyi Lab. Then, I should explain my capabilities, such as answering questions, creating text, and programming, so the user understands how I can help. I should also mention that I support multiple languages, so international users know they can communicate in different languages. Finally, I should be friendly and invite them to ask more questions to encourage further interaction. I need to be concise and clear, avoiding too much technical jargon to make it easy for the user to understand. The user probably wants a quick overview of my abilities, so I'll focus on my functions and uses. I should also check if I've missed any information, like mentioning Alibaba Group or more technical details. However, the user probably just needs basic information, not an in-depth explanation. I'll make sure my response is friendly and professional, and encourages the user to keep asking questions.
====================Full response====================

I am Qwen, a large-scale language model developed by Tongyi Lab. I can help you answer questions, create text, write code, and express ideas. I support conversations in multiple languages. How can I help you?
====================Token consumption====================

CompletionUsage(completion_tokens=221, prompt_tokens=10, total_tokens=231, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=172, rejected_prediction_tokens=None), prompt_tokens_details=PromptTokensDetails(audio_tokens=None, cached_tokens=0))

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    // If the environment variable is not set, replace the following with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    // The following is the baseURL for the Singapore region. To use models in the Virginia region, replace baseURL with https://dashscope-us.aliyuncs.com/compatible-mode/v1
    // To use models in the Beijing region, replace baseURL with https://dashscope.aliyuncs.com/compatible-mode/v1
    baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = ''; // Full thinking process
let answerContent = ''; // Full response
let isAnswering = false; // Whether the response phase has started

async function main() {
    try {
        const messages = [{ role: 'user', content: 'Who are you' }];
        
        const stream = await openai.chat.completions.create({
            model: 'qwen-plus',
            messages,
            // In the Node.js SDK, non-standard parameters such as enable_thinking are passed as top-level properties and do not need to go in extra_body.
            enable_thinking: true,
            stream: true,
            stream_options: {
                include_usage: true
            },
        });

        console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');

        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log('\n' + '='.repeat(20) + 'Token consumption' + '='.repeat(20) + '\n');
                console.log(chunk.usage);
                continue;
            }

            const delta = chunk.choices[0].delta;
            
            // Collect only the thinking content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }

            // Once content arrives, the response phase has started
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();

Response

====================Thinking process====================

Okay, the user is asking 'Who are you'. I need to figure out what they want to know. They might be interacting with me for the first time or want to confirm my identity. I should start by introducing my name and identity, such as Qwen, with the English name Qwen. Then I should state that I am a large-scale language model independently developed by Tongyi Lab under Alibaba Group. Next, I should mention my capabilities, such as answering questions, creating text, programming, and expressing opinions, so the user understands my purpose. I should also mention that I support multiple languages, which international users will find useful. Finally, I should invite them to ask questions and maintain a friendly and open attitude. I need to use simple and easy-to-understand language, avoiding too much technical jargon. The user might need help or just be curious, so the response should be cordial and encourage further interaction. Additionally, I might need to consider if the user has deeper needs, such as testing my abilities or seeking specific help, but the initial response should focus on basic information and guidance. I will keep the tone conversational and avoid complex sentences to make the information more effective.
====================Full response====================

Hello! I am Qwen, a large-scale language model independently developed by Tongyi Lab under Alibaba Group. I can help you answer questions, create text (such as stories, official documents, emails, and playbooks), perform logical reasoning, write code, and even express opinions and play games. I support multiple languages, including but not limited to Chinese, English, German, French, and Spanish.

If you have any questions or need help, feel free to ask me anytime!
====================Token consumption====================

{
  prompt_tokens: 10,
  completion_tokens: 288,
  total_tokens: 298,
  completion_tokens_details: { reasoning_tokens: 188 },
  prompt_tokens_details: { cached_tokens: 0 }
}

HTTP

Sample code

curl

# ======= Important =======
# The base_url below is for the Singapore region. To use a model in the Beijing region, replace it with https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
# To use a model in the Virginia region, replace it with https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completions
# === Delete this comment before running ===
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus",
    "messages": [
        {
            "role": "user", 
            "content": "Who are you"
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "enable_thinking": true
}'

DashScope

Python

Sample code

import os
from dashscope import Generation
import dashscope
# The base_url below is for the Singapore region. To use a model in the Virginia region, replace base_url with https://dashscope-us.aliyuncs.com/api/v1
# To use a model in the Beijing region, replace base_url with https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1/"

# Initialize the request parameters
messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If you have not set the environment variable, replace the next line with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="qwen-plus",
    messages=messages,
    result_format="message",  # Return results in message format
    enable_thinking=True,     # Enable the thinking process
    stream=True,              # Enable streaming output
    incremental_output=True,  # Enable incremental output
)

reasoning_content = ""  # Full thinking process
answer_content = ""     # Full response
is_answering = False    # Whether the answer phase has started

print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    message = chunk.output.choices[0].message
    
    # Collect the thinking content only
    if message.reasoning_content:
        if not is_answering:
            print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content

    # Once content arrives, the answer phase begins
    if message.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content

print("\n" + "=" * 20 + "Token consumption" + "=" * 20 + "\n")
print(chunk.usage)
# After the loop ends, reasoning_content and answer_content hold the complete text.
# Add any follow-up processing here as needed.
# print(f"\n\nFull thinking process:\n{reasoning_content}")
# print(f"\nFull response:\n{answer_content}")

Response

====================Thinking process====================

Okay, the user is asking 'Who are you?'. I need to figure out what they want to know. They might be interacting with me for the first time or want to confirm my identity. First, I should introduce myself as Qwen and state that I am a large-scale language model developed by Tongyi Lab. Next, I might need to explain my capabilities, such as answering questions, creating text, and programming, so the user understands my purpose. I should also mention that I support multiple languages, so international users know they can communicate in different languages. Finally, I should be friendly and invite them to ask questions to encourage further interaction. I need to use simple and easy-to-understand language, avoiding too much technical jargon. The user might have deeper needs, such as testing my abilities or seeking help, so providing specific examples like writing stories, official documents, or emails would be better. I should also ensure the response is well-structured, perhaps by listing my functions, but a natural transition might be better than using bullets. Additionally, I should emphasize that I am an AI assistant without personal consciousness and all my answers are based on training data to avoid misunderstandings. I might need to check if I've missed any important information, such as multimodal capabilities or recent updates, but based on previous responses, I probably don't need to go too deep. In short, the response should be comprehensive yet concise, friendly, and helpful, making the user feel understood and supported.
====================Full response====================

I am Qwen, a large-scale language model independently developed by Tongyi Lab under Alibaba Group. I can help you with:

1. **Answer questions**: Whether it's academic questions, general knowledge, or domain-specific issues, I can try to provide an answer.
2. **Create text**: I can help you write stories, official documents, emails, playbooks, and more.
3. **Logical reasoning**: I can help you with logical reasoning and problem-solving.
4. **Programming**: I can understand and generate code in various programming languages.
5. **Multilingual support**: I support multiple languages, including but not limited to Chinese, English, German, French, and Spanish.

If you have any questions or need help, feel free to ask me anytime!
====================Token consumption====================

{"input_tokens": 11, "output_tokens": 405, "total_tokens": 416, "output_tokens_details": {"reasoning_tokens": 256}, "prompt_tokens_details": {"cached_tokens": 0}}

Java

Sample code

// DashScope SDK version >= 2.19.4
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.Constants;
import io.reactivex.Flowable;
import java.lang.System;
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Main {
    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();

        if (reasoning != null && !reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }

        if (content != null && !content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Full response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If you have not set the environment variable, replace the next line with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen-plus")
                .enableThinking(true)
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            // The base_url below is for the Singapore region. To use a model in the Virginia region, replace base_url with https://dashscope-us.aliyuncs.com/api/v1
            // To use a model in the Beijing region, replace base_url with https://dashscope.aliyuncs.com/api/v1
            Generation gen = new Generation("http", "https://dashscope-intl.aliyuncs.com/api/v1");
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
//             Print the final result.
//            if (reasoningContent.length() > 0) {
//                System.out.println("\n====================Full response====================");
//                System.out.println(finalContent.toString());
//            }
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}

Response

====================Thinking process====================
Okay, the user is asking 'Who are you?'. I need to figure out what they want to know. They might want to know my identity or are testing my response. First, I should clearly state that I am Qwen, a large-scale language model from Alibaba Group. Then, I might need to briefly introduce my capabilities, such as answering questions, creating text, and programming, so the user understands my purpose. I should also mention that I support multiple languages, so international users know they can communicate in different languages. Finally, I should be friendly and invite them to ask questions, which will make them feel welcome and willing to continue the interaction. I need to make sure the answer is not too long but is comprehensive. The user might have follow-up questions, such as my technical details or use cases, but the initial response should be concise and clear. I will ensure I don't use technical jargon so that all users can understand. I will check if I have missed any important information, such as multilingual support and specific examples of my functions. Okay, this should cover the user's needs.
====================Full response====================
I am Qwen, a large-scale language model from Alibaba Group. I can answer questions, create text (such as stories, official documents, emails, and playbooks), perform logical reasoning, write code, express opinions, play games, and more. I support conversations in multiple languages, including but not limited to Chinese, English, German, French, and Spanish. If you have any questions or need help, feel free to ask me anytime!

HTTP

Sample code

curl

# ======= Important =======
# The URL below is for the Singapore region. To use a model in the Beijing region, replace the URL with https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation
# To use a model in the Virginia region, replace the URL with https://dashscope-us.aliyuncs.com/api/v1/services/aigc/text-generation/generation
# === Delete this comment before running ===
curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "qwen-plus",
    "input":{
        "messages":[      
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters":{
        "enable_thinking": true,
        "incremental_output": true,
        "result_format": "message"
    }
}'

Response

id:1
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"Hmm","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":14,"input_tokens":11,"output_tokens":3},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:2
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":",","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":15,"input_tokens":11,"output_tokens":4},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:3
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"the user","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":16,"input_tokens":11,"output_tokens":5},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:4
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":" asks","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":17,"input_tokens":11,"output_tokens":6},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:5
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":" '","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":18,"input_tokens":11,"output_tokens":7},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
......

id:358
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"help","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":373,"input_tokens":11,"output_tokens":362},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:359
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":",","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":374,"input_tokens":11,"output_tokens":363},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:360
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" feel free","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":375,"input_tokens":11,"output_tokens":364},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:361
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" to","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":376,"input_tokens":11,"output_tokens":365},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:362
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":" let me know","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":377,"input_tokens":11,"output_tokens":366},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:363
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"!","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":378,"input_tokens":11,"output_tokens":367},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}

id:364
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"","role":"assistant"},"finish_reason":"stop"}]},"usage":{"total_tokens":378,"input_tokens":11,"output_tokens":367},"request_id":"25d58c29-c47b-9e8d-a0f1-d6c309ec58b1"}
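The event stream above can be consumed without the SDK. The following is a minimal sketch, assuming each payload arrives on a single line prefixed with data: and that incremental_output is enabled so every event carries only the new fragment. The function name collect_dashscope_sse and the abridged sample events are illustrative and not part of any SDK.

```python
import json


def collect_dashscope_sse(lines):
    """Accumulate reasoning text, answer text, and the last usage block from SSE lines."""
    reasoning_parts, answer_parts, usage = [], [], None
    for raw in lines:
        line = raw.strip()
        # Only "data:" lines carry a JSON payload; id, event, and status lines are skipped.
        if not line.startswith("data:"):
            continue
        payload = json.loads(line[len("data:"):])
        message = payload["output"]["choices"][0]["message"]
        reasoning_parts.append(message.get("reasoning_content") or "")
        answer_parts.append(message.get("content") or "")
        usage = payload.get("usage", usage)
    return "".join(reasoning_parts), "".join(answer_parts), usage


# Abridged events in the shape shown above (request_id omitted for brevity).
sample_events = [
    "id:1",
    "event:result",
    ":HTTP_STATUS/200",
    'data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"Hmm","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":14,"input_tokens":11,"output_tokens":3}}',
    "",
    "id:2",
    "event:result",
    ":HTTP_STATUS/200",
    'data:{"output":{"choices":[{"message":{"content":"","reasoning_content":",","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":15,"input_tokens":11,"output_tokens":4}}',
    "",
    "id:363",
    "event:result",
    ":HTTP_STATUS/200",
    'data:{"output":{"choices":[{"message":{"content":"!","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":378,"input_tokens":11,"output_tokens":367}}',
    "",
    "id:364",
    "event:result",
    ":HTTP_STATUS/200",
    'data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"","role":"assistant"},"finish_reason":"stop"}]},"usage":{"total_tokens":378,"input_tokens":11,"output_tokens":367}}',
]

reasoning, answer, usage = collect_dashscope_sse(sample_events)
print("reasoning:", reasoning)
print("answer:", answer)
print("total tokens:", usage["total_tokens"])
```

Because each event repeats the running usage totals, keeping only the last usage block is enough to read the final token counts.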

In addition, the open-source Qwen3 hybrid thinking models, qwen-plus-2025-04-28, and qwen-turbo-2025-04-28 let you control thinking mode dynamically through the prompt. When enable_thinking is true, add /no_think to the prompt to turn thinking off. To turn it back on in a multi-turn conversation, add /think to the latest input prompt. The model follows the most recent /think or /no_think instruction.
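A minimal sketch of the prompt-level switch, assuming the tag may simply be appended to the end of the latest user message (the source does not specify where in the prompt it must appear). The helper with_thinking_switch is hypothetical and only prepares the messages list; pass the result to any of the calls in this topic with enable_thinking set to true.

```python
def with_thinking_switch(messages, think):
    """Return a copy of messages with /think or /no_think appended to the last user turn."""
    tag = "/think" if think else "/no_think"
    updated = [dict(m) for m in messages]  # shallow copies so the caller's list is untouched
    for message in reversed(updated):
        if message["role"] == "user":
            message["content"] = f'{message["content"]} {tag}'
            break
    return updated


history = [
    {"role": "user", "content": "Who are you? /no_think"},
    {"role": "assistant", "content": "I am Qwen."},
    {"role": "user", "content": "What is 17 * 24?"},
]

# Re-enable thinking for the latest turn; the earlier /no_think stays in the history.
switched = with_thinking_switch(history, think=True)
print(switched[-1]["content"])  # What is 17 * 24? /think
```

Since the model follows the most recent instruction, only the latest user turn needs a tag; earlier turns can be left as they are.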

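Limit thinking length

Deep thinking models sometimes produce lengthy reasoning, which increases latency and token consumption. Use the thinking_budget parameter to cap the number of tokens in the reasoning process. When the cap is exceeded, the model immediately generates its response.

The thinking_budget parameter specifies the maximum length of the model's chain of thought. For details, see "Model list".

Important

The thinking_budget parameter is supported by Qwen3 models (in thinking mode) and Kimi models.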

OpenAI compatible

Python

Sample code

from openai import OpenAI
import os

# Initialize the OpenAI client
client = OpenAI(
    # If you have not set the environment variable, replace the next line with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The base_url below is for the Singapore region. To use a model in the Virginia region, replace base_url with https://dashscope-us.aliyuncs.com/compatible-mode/v1
    # To use a model in the Beijing region, replace base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Who are you"}]

completion = client.chat.completions.create(
    model="qwen-plus",
    messages=messages,
    # enable_thinking turns on the thinking process; thinking_budget caps the number of tokens the reasoning process may use.
    extra_body={
        "enable_thinking": True,
        "thinking_budget": 50
        },
    stream=True,
    stream_options={
        "include_usage": True
    },
)

reasoning_content = ""  # Full thinking process
answer_content = ""  # Full response
is_answering = False  # Whether the answer phase has started
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    if not chunk.choices:
        print("\nUsage:")
        print(chunk.usage)
        continue

    delta = chunk.choices[0].delta

    # Collect the thinking content only
    if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
        if not is_answering:
            print(delta.reasoning_content, end="", flush=True)
        reasoning_content += delta.reasoning_content

    # Once content arrives, the answer phase begins
    if hasattr(delta, "content") and delta.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
            is_answering = True
        print(delta.content, end="", flush=True)
        answer_content += delta.content

Response

====================Thinking process====================

Okay, the user asked "Who are you", so I need to give a clear and friendly response. First, I should state my identity, which is Qwen, developed by the Tongyi Lab under Alibaba Group. Next, I should explain my main functions, such as answering
====================Full response====================

I am Qwen, a large-scale language model developed by the Tongyi Lab at Alibaba Group. I can answer questions, create text, perform logical reasoning, and write code, with the goal of providing help and convenience to users. How can I help you?

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY, // Read from the environment variable
    // The base_url below is for the Singapore region. To use a model in the Virginia region, replace base_url with https://dashscope-us.aliyuncs.com/compatible-mode/v1
    // To use a model in the Beijing region, replace base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = '';
let answerContent = '';
let isAnswering = false;


async function main() {
    try {
        const messages = [{ role: 'user', content: 'Who are you' }];
        const stream = await openai.chat.completions.create({
            model: 'qwen-plus',
            messages,
            stream: true,
            // enable_thinking turns on the reasoning process; thinking_budget caps the number of tokens it may use.
            enable_thinking: true,
            thinking_budget: 50
        });
        console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');

        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log('\nUsage:');
                console.log(chunk.usage);
                continue;
            }

            const delta = chunk.choices[0].delta;
            
            // Collect the reasoning content only
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }

            // Once content arrives, the answer phase begins
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();

Response

====================Thinking process====================

Okay, the user asked "Who are you", so I need to give a clear and accurate response. First, I should introduce my identity, which is Qwen, developed by the Tongyi Lab under Alibaba Group. Next, I should explain my main functions, such as answering questions
====================Full response====================

I am Qwen, an ultra-large language model independently developed by the Tongyi Lab under Alibaba Group. I can perform various tasks such as answering questions, creating text, logical reasoning, and coding. If you have any questions or need help, feel free to let me know!

HTTP

Sample code

curl

# ======= Important =======
# The base_url below is for the Singapore region. To use a model in the Beijing region, replace it with https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
# To use a model in the Virginia region, replace it with https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completions
# === Delete this comment before running ===
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus",
    "messages": [
        {
            "role": "user", 
            "content": "Who are you"
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "enable_thinking": true,
    "thinking_budget": 50
}'

Response

data: {"choices":[{"delta":{"content":null,"role":"assistant","reasoning_content":""},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1745485391,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}

.....

data: {"choices":[{"finish_reason":"stop","delta":{"content":"","reasoning_content":null},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1745485391,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}

data: {"choices":[],"object":"chat.completion.chunk","usage":{"prompt_tokens":10,"completion_tokens":360,"total_tokens":370},"created":1745485391,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-e2edaf2c-8aaf-9e54-90e2-b21dd5045503"}

data: [DONE]
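The OpenAI-compatible stream differs from the DashScope one in three ways visible above: fragments live under delta, the usage block arrives in a final chunk whose choices list is empty, and the stream ends with a [DONE] sentinel. A minimal sketch of a parser for this shape follows. The function name collect_openai_stream is hypothetical, and the text fragments in the sample are illustrative because the middle of the real stream is elided above.

```python
import json


def collect_openai_stream(lines):
    """Accumulate reasoning text, answer text, and usage from OpenAI-compatible SSE lines."""
    reasoning, answer, usage = "", "", None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        body = line[len("data:"):].strip()
        if body == "[DONE]":  # end-of-stream sentinel, not JSON
            break
        chunk = json.loads(body)
        if not chunk["choices"]:
            # With include_usage enabled, the last chunk has no choices and carries usage.
            usage = chunk.get("usage")
            continue
        delta = chunk["choices"][0]["delta"]
        # Either field may be null in a given chunk, so fall back to an empty string.
        reasoning += delta.get("reasoning_content") or ""
        answer += delta.get("content") or ""
    return reasoning, answer, usage


# Illustrative chunks in the shape shown above; the fragment text is made up.
sample_stream = [
    'data: {"choices":[{"delta":{"content":null,"role":"assistant","reasoning_content":"Okay"},"index":0,"finish_reason":null}],"usage":null}',
    "",
    'data: {"choices":[{"delta":{"content":"I am Qwen.","reasoning_content":null},"index":0,"finish_reason":null}],"usage":null}',
    "",
    'data: {"choices":[{"finish_reason":"stop","delta":{"content":"","reasoning_content":null},"index":0}],"usage":null}',
    "",
    'data: {"choices":[],"usage":{"prompt_tokens":10,"completion_tokens":360,"total_tokens":370}}',
    "",
    "data: [DONE]",
]

stream_reasoning, stream_answer, stream_usage = collect_openai_stream(sample_stream)
print(stream_reasoning, "|", stream_answer, "|", stream_usage["total_tokens"])
```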

DashScope

Python

Sample code

import os
from dashscope import Generation
import dashscope
# The base_url below is for the Singapore region. To use a model in the Virginia region, replace base_url with https://dashscope-us.aliyuncs.com/api/v1
# To use a model in the Beijing region, replace base_url with https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1/"

messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If you have not set the environment variable, replace the next line with your Model Studio API key: api_key = "sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="qwen-plus",
    messages=messages,
    result_format="message",
    enable_thinking=True,
    # Cap the number of tokens in the reasoning process.
    thinking_budget=50,
    stream=True,
    incremental_output=True,
)

# Accumulates the full thinking process.
reasoning_content = ""
# Accumulates the full response.
answer_content = ""
# Tracks whether thinking has ended and the answer has started.
is_answering = False

print("=" * 20 + "Thinking process" + "=" * 20)

for chunk in completion:
    # Skip chunks where both the thinking process and the response are empty.
    if (
        chunk.output.choices[0].message.content == ""
        and chunk.output.choices[0].message.reasoning_content == ""
    ):
        pass
    else:
        # The chunk belongs to the thinking phase.
        if (
            chunk.output.choices[0].message.reasoning_content != ""
            and chunk.output.choices[0].message.content == ""
        ):
            print(chunk.output.choices[0].message.reasoning_content, end="", flush=True)
            reasoning_content += chunk.output.choices[0].message.reasoning_content
        # The chunk belongs to the answer phase.
        elif chunk.output.choices[0].message.content != "":
            if not is_answering:
                print("\n" + "=" * 20 + "Full response" + "=" * 20)
                is_answering = True
            print(chunk.output.choices[0].message.content, end="", flush=True)
            answer_content += chunk.output.choices[0].message.content

# To print the full thinking process and the full response, uncomment and run the following code.
# print("=" * 20 + "Full thinking process" + "=" * 20 + "\n")
# print(f"{reasoning_content}")
# print("=" * 20 + "Full response" + "=" * 20 + "\n")
# print(f"{answer_content}")

Response

====================Thinking process====================
Okay, the user is asking "Who are you?", so I need to give a clear and friendly response. First, I should introduce my identity, which is Qwen, developed by the Tongyi Lab under Alibaba Group. Next, I should explain my main functions, such as
====================Full response====================
I am Qwen, a large-scale language model independently developed by the Tongyi Lab at Alibaba Group. I can answer questions, create text, perform logical reasoning, and write code, with the goal of providing users with comprehensive, accurate, and useful information and assistance. How can I help you?

Java

Sample code

// DashScope SDK version >= 2.19.4
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import com.alibaba.dashscope.utils.Constants;

public class Main {
    static {
        // The base_url below is for the Singapore region. To use a model in the Virginia region, replace base_url with https://dashscope-us.aliyuncs.com/api/v1
        // To use a model in the Beijing region, replace base_url with https://dashscope.aliyuncs.com/api/v1
        Constants.baseHttpApiUrl="https://dashscope-intl.aliyuncs.com/api/v1";
    }
    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();

        if (reasoning != null && !reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }

        if (content != null && !content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Full response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If you have not set the environment variable, replace the next line with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen-plus")
                .enableThinking(true)
                .thinkingBudget(50)
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
//             Print the final result.
//            if (reasoningContent.length() > 0) {
//                System.out.println("\n====================Full response====================");
//                System.out.println(finalContent.toString());
//            }
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}

Response

====================Thinking process====================
Okay, the user is asking "Who are you?", so I need to give a clear and friendly response. First, I should introduce my identity, which is Qwen, developed by the Tongyi Lab under Alibaba Group. Next, I should explain my main functions, such as
====================Full response====================
I am Qwen, a large-scale language model independently developed by the Tongyi Lab at Alibaba Group. I can answer questions, create text, perform logical reasoning, and write code, with the goal of providing users with comprehensive, accurate, and useful information and assistance. How can I help you?

HTTP

Sample code

curl

# ======= Important =======
# The URL below is for the Singapore region. To use a model in the Beijing region, replace the URL with https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation
# To use a model in the Virginia region, replace the URL with https://dashscope-us.aliyuncs.com/api/v1/services/aigc/text-generation/generation
# === Delete this comment before running ===
curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "qwen-plus",
    "input":{
        "messages":[      
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters":{
        "enable_thinking": true,
        "thinking_budget": 50,
        "incremental_output": true,
        "result_format": "message"
    }
}'

Response

id:1
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"Okay","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":14,"output_tokens":3,"input_tokens":11,"output_tokens_details":{"reasoning_tokens":1}},"request_id":"2ce91085-3602-9c32-9c8b-fe3d583a2c38"}

id:2
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":",","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":15,"output_tokens":4,"input_tokens":11,"output_tokens_details":{"reasoning_tokens":2}},"request_id":"2ce91085-3602-9c32-9c8b-fe3d583a2c38"}

......

id:133
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"!","reasoning_content":"","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":149,"output_tokens":138,"input_tokens":11,"output_tokens_details":{"reasoning_tokens":50}},"request_id":"2ce91085-3602-9c32-9c8b-fe3d583a2c38"}

id:134
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"","reasoning_content":"","role":"assistant"},"finish_reason":"stop"}]},"usage":{"total_tokens":149,"output_tokens":138,"input_tokens":11,"output_tokens_details":{"reasoning_tokens":50}},"request_id":"2ce91085-3602-9c32-9c8b-fe3d583a2c38"}

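Other features

Billing details

The thinking process is billed as output tokens.
Some hybrid thinking models are priced differently in thinking mode and non-thinking mode.
If a model in thinking mode does not output a thinking process, it is billed at the non-thinking mode price.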
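The rules above can be sketched as a small calculation over a DashScope usage block. This is an illustration only: the function estimate_cost and all three prices are hypothetical placeholders, not actual Model Studio rates, and the sketch assumes that output_tokens already includes reasoning_tokens, as the sample responses in this topic suggest.

```python
def estimate_cost(usage, input_price, thinking_output_price, non_thinking_output_price):
    """Estimate the cost of one call. Prices are per one million tokens (placeholders)."""
    details = usage.get("output_tokens_details") or {}
    reasoning_tokens = details.get("reasoning_tokens", 0)
    output_tokens = usage["output_tokens"]
    # If no thinking process was produced, the non-thinking price applies.
    output_price = thinking_output_price if reasoning_tokens > 0 else non_thinking_output_price
    cost = (usage["input_tokens"] * input_price + output_tokens * output_price) / 1_000_000
    return {
        "reasoning_tokens": reasoning_tokens,
        "answer_tokens": output_tokens - reasoning_tokens,
        "cost": cost,
    }


# Usage block in the shape returned by the DashScope Python sample in this topic.
sample_usage = {
    "input_tokens": 11,
    "output_tokens": 405,
    "total_tokens": 416,
    "output_tokens_details": {"reasoning_tokens": 256},
}

estimate = estimate_cost(
    sample_usage,
    input_price=1.0,                # placeholder price
    thinking_output_price=4.0,      # placeholder price
    non_thinking_output_price=2.0,  # placeholder price
)
print(estimate["reasoning_tokens"], estimate["answer_tokens"], estimate["cost"])
```

Separating answer_tokens from reasoning_tokens also makes it easy to see how much of a bill a thinking_budget cap could save.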

FAQ

Q: How do I disable thinking mode?

Whether you can disable thinking mode depends on the model type.

For hybrid thinking models such as qwen-plus and deepseek-v3.2-exp, set the enable_thinking parameter to false to disable it.
For thinking-only models such as qwen3-235b-a22b-thinking-2507 and deepseek-r1, thinking mode cannot be disabled.
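A minimal sketch of how a client might apply this rule when building an OpenAI-compatible request body. The helper build_request_body and the THINKING_ONLY_MODELS set are hypothetical; the set lists only the two thinking-only examples named above and is not a complete catalog.

```python
import json

# Hypothetical lookup, limited to the thinking-only examples named in this FAQ.
THINKING_ONLY_MODELS = {"qwen3-235b-a22b-thinking-2507", "deepseek-r1"}


def build_request_body(model, prompt, thinking):
    """Build a streaming chat request body, adding enable_thinking only where it applies."""
    if model in THINKING_ONLY_MODELS and not thinking:
        raise ValueError(f"{model} is thinking-only; thinking mode cannot be disabled")
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
        "stream_options": {"include_usage": True},
    }
    # Thinking-only models always reason, so the parameter is omitted for them.
    if model not in THINKING_ONLY_MODELS:
        body["enable_thinking"] = thinking
    return body


hybrid_body = build_request_body("qwen-plus", "Who are you", thinking=False)
print(json.dumps(hybrid_body, indent=2))
```

The resulting dictionary matches the JSON body used in the curl samples and can be sent with any HTTP client.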

Q: Which models support non-streaming output?

Deep thinking models need more processing time before they respond, so non-streaming calls run a risk of timing out. We recommend streaming calls. If you need non-streaming output, use one of the supported models below.

Qwen3

Commercial
- Qwen-Max series: qwen3-max-preview
- Qwen-Plus series: qwen-plus
- Qwen-Flash series: qwen-flash, qwen-flash-2025-07-28
- Qwen-Turbo series: qwen-turbo
Open source
- qwen3-next-80b-a3b-thinking, qwen3-235b-a22b-thinking-2507, qwen3-30b-a3b-thinking-2507

DeepSeek (Beijing region)

deepseek-v3.2, deepseek-v3.2-exp, deepseek-r1, deepseek-r1-0528, deepseek-r1 distilled model

Kimi (Beijing region)

kimi-k2-thinking

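Q: How do I purchase tokens after my free quota runs out?

Go to Expenses and Costs to top up your account. To call models, make sure the account has no overdue payments.

After the free quota is used up, model calls are billed automatically. The billing cycle is one hour. To view charges, go to "Billing details".

Q: Can I upload images or documents and ask questions about them?

The models covered in this topic accept text input only. Qwen3-VL and QVQ models support deep thinking over images.

Q: How do I check token consumption and call counts?

One hour after you call a model, open the "Model Observation (Singapore or Beijing)" page. Set query conditions such as the time range and workspace. Then find the target model in the Models area and click Monitor in the Actions column to view its call statistics. For details, see the "Model Observation" documentation.

Data is updated hourly. During peak hours, there may be a delay of about one hour.

API reference

For details on the input and output parameters of deep thinking models, see "Qwen".

Error codes

If an error occurs, see "Error messages" to troubleshoot.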