DeepSeek R1, DeepSeek V3, and DeepSeek V3.1 API - Alibaba Cloud Model Studio

This topic describes how to call DeepSeek models on Alibaba Cloud Model Studio by using the OpenAI-compatible API or the DashScope SDK.

Important

This document applies only to the China (Beijing) region. To use these models, you must use an API key from the China (Beijing) region.

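Model availability

deepseek-v3.2, deepseek-v3.2-exp, and deepseek-v3.1 (a parameter controls whether the model thinks before responding)
These are hybrid thinking models, and thinking mode is disabled by default. deepseek-v3.2 is the first DeepSeek model to integrate thinking with tool use. It supports tool calling in both thinking and non-thinking modes.
Use the enable_thinking parameter to control thinking mode.
deepseek-r1 (always thinks before responding)
- deepseek-r1-0528, released in May 2025, is an upgraded version of deepseek-r1, which was released in January 2025. The new version performs significantly better on complex reasoning tasks. Because it thinks more deeply during inference, its response times are longer.
  deepseek-r1 on Model Studio has been upgraded to version 0528.
- The deepseek-r1-distill models are open-source large language models, such as Qwen and Llama, that were fine-tuned through knowledge distillation on training samples generated by deepseek-r1.
deepseek-v3 (does not think before responding)
Pre-trained on 14.8 trillion tokens, the deepseek-v3 model performs well in long-text processing, code, mathematics, encyclopedic knowledge, and Chinese.
This is the version released on December 26, 2024, not the version released on March 24, 2025.

In thinking mode, the model thinks before it responds. The thinking steps are returned in the reasoning_content field. Compared with non-thinking mode, responses take longer but are of higher quality.

We recommend deepseek-v3.2, the latest DeepSeek model. It has an optional thinking mode, more relaxed rate limits, and a lower price than deepseek-v3.1.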

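Model	Context window (tokens)	Max input (tokens)	Max CoT (tokens)	Max response (tokens)
deepseek-v3.2 (685B full version)	131,072	98,304	32,768	65,536
deepseek-v3.2-exp (685B full version)	131,072	98,304	32,768	65,536
deepseek-v3.1 (685B full version)	131,072	98,304	32,768	65,536
deepseek-r1 (685B full version)	131,072	98,304	32,768	16,384
deepseek-r1-0528 (685B full version)	131,072	98,304	32,768	16,384
deepseek-v3 (671B full version)	131,072	131,072	-	16,384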

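Distilled models

Model	Context window (tokens)	Max input (tokens)	Max CoT (tokens)	Max response (tokens)
deepseek-r1-distill-qwen-1.5b (based on Qwen2.5-Math-1.5B)	32,768	32,768	16,384	16,384
deepseek-r1-distill-qwen-7b (based on Qwen2.5-Math-7B)	32,768	32,768	16,384	16,384
deepseek-r1-distill-qwen-14b (based on Qwen2.5-14B)	32,768	32,768	16,384	16,384
deepseek-r1-distill-qwen-32b (based on Qwen2.5-32B)	32,768	32,768	16,384	16,384
deepseek-r1-distill-llama-8b (based on Llama-3.1-8B)	32,768	32,768	16,384	16,384
deepseek-r1-distill-llama-70b (based on Llama-3.3-70B)	32,768	32,768	16,384	16,384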

Max CoT is the maximum number of tokens for the thinking process in thinking mode.
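As a rough illustration of how these limits might be applied before a request is sent, the sketch below checks a prompt size, a requested max_tokens value, and a thinking budget against the table values for two of the models. TokenLimits, LIMITS, and check_request are hypothetical helpers, not part of any SDK, and the token counts are assumed to come from the caller's own tokenizer or from the usage data of an earlier response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenLimits:
    """Per-model token limits, in tokens (hypothetical container)."""
    context_window: int
    max_input: int
    max_cot: int
    max_response: int


# Values copied from the tables above; extend with other models as needed.
LIMITS = {
    "deepseek-v3.2": TokenLimits(131_072, 98_304, 32_768, 65_536),
    "deepseek-r1-distill-qwen-1.5b": TokenLimits(32_768, 32_768, 16_384, 16_384),
}


def check_request(model, prompt_tokens, max_tokens, thinking_budget=0):
    """Return a list of limit violations; an empty list means the request fits."""
    limits = LIMITS[model]
    problems = []
    if prompt_tokens > limits.max_input:
        problems.append(
            f"prompt has {prompt_tokens} tokens, max input is {limits.max_input}"
        )
    if max_tokens > limits.max_response:
        problems.append(
            f"max_tokens is {max_tokens}, max response is {limits.max_response}"
        )
    if thinking_budget > limits.max_cot:
        problems.append(
            f"thinking_budget is {thinking_budget}, max CoT is {limits.max_cot}"
        )
    return problems


# A small request that stays within every limit.
print(check_request("deepseek-v3.2", prompt_tokens=1_200, max_tokens=4_096, thinking_budget=8_192))  # prints []
```

Returning a list of messages rather than raising on the first problem lets a caller report every violated limit at once.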

The models listed above are not integrated third-party services. All of them are deployed on Model Studio servers.

For information about concurrent request limits, see "DeepSeek rate limits".

Quick start

deepseek-v3.2 is the latest model in the DeepSeek series. Use the enable_thinking parameter to switch between thinking and non-thinking modes. The following code shows how to call the deepseek-v3.2 model in thinking mode.

Before you begin, create an API key and export it as an environment variable. If you call the model through an SDK, install the OpenAI SDK or the DashScope SDK.

OpenAI compatible

Note

The enable_thinking parameter is not a standard OpenAI parameter. In the OpenAI Python SDK, pass it in extra_body. In the Node.js SDK, pass it as a top-level parameter.

Python

Sample code

from openai import OpenAI
import os

# Initialize the OpenAI client
client = OpenAI(
    # If the environment variable is not set, replace the following line with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Who are you"}]
completion = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=messages,
    # Enable thinking mode by setting enable_thinking in extra_body
    extra_body={"enable_thinking": True},
    stream=True,
    stream_options={
        "include_usage": True
    },
)

reasoning_content = ""  # Full thinking process
answer_content = ""  # Full response
is_answering = False  # Whether the response phase has started
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    if not chunk.choices:
        print("\n" + "=" * 20 + "Token usage" + "=" * 20 + "\n")
        print(chunk.usage)
        continue

    delta = chunk.choices[0].delta

    # Collect the thinking content
    if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
        if not is_answering:
            print(delta.reasoning_content, end="", flush=True)
        reasoning_content += delta.reasoning_content

    # Start the response phase once content is received
    if hasattr(delta, "content") and delta.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
            is_answering = True
        print(delta.content, end="", flush=True)
        answer_content += delta.content

Response

====================Thinking process====================

Ah, the user is asking who I am. This is a very common opening question. I need to introduce my identity and functions simply and clearly. I can start with my company background and core capabilities to help the user quickly understand.
I should highlight my free-to-use nature and text-based strengths, but avoid going into too much detail. Finally, I'll guide the conversation with an open-ended question, which is in line with the nature of an assistant.
I'll position myself as an enterprise-level AI assistant, which is both professional and friendly. The emoji in parentheses can add a touch of friendliness.
====================Full response====================

Hello! I am DeepSeek, an AI assistant created by DeepSeek.

I am a text-only model. Although I do not support multimodal recognition, I have a file upload feature that can help you process various files such as images, txt, pdf, ppt, word, and excel, and read text information from them to assist you. I am completely free to use, have a 128K context window, and support web search (you need to manually enable it in the Web/App).

My knowledge is current up to July 2024, and I will help you with enthusiasm and care. You can download my app from the official app store.

Is there anything I can help you with? Whether it's a question about your studies, work, or daily life, I'm happy to assist you! ✨
====================Token usage====================

CompletionUsage(completion_tokens=238, prompt_tokens=5, total_tokens=243, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=93, rejected_prediction_tokens=None), prompt_tokens_details=None)

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    // If the environment variable is not set, replace the following line with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY, 
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = ''; // Full thinking process
let answerContent = ''; // Full response
let isAnswering = false; // Whether the response phase has started

async function main() {
    try {
        const messages = [{ role: 'user', content: 'Who are you' }];
        
        const stream = await openai.chat.completions.create({
            model: 'deepseek-v3.2',
            messages,
            // Note: In the Node.js SDK, non-standard parameters such as enable_thinking are passed as top-level properties and do not need to go in extra_body.
            enable_thinking: true,
            stream: true,
            stream_options: {
                include_usage: true
            },
        });

        console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');

        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log('\n' + '='.repeat(20) + 'Token usage' + '='.repeat(20) + '\n');
                console.log(chunk.usage);
                continue;
            }

            const delta = chunk.choices[0].delta;
            
            // Collect the thinking content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }

            // Start the response phase once content is received
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();

Response

====================Thinking process====================

Ah, the user is asking who I am. This is a very common opening question. I need to introduce my identity and core functions simply and clearly, without going into too much detail.

I can start with my company background and basic positioning, then list a few key capabilities to let the user quickly understand what I can do. I'll end with an open-ended question to make it easy for the user to continue.

I should highlight practical features like being free, having a long context, and file processing. I'll maintain a friendly but restrained tone, without using emojis.
====================Full response====================

Hello! I am DeepSeek, an AI assistant created by DeepSeek.

I am a text-only model with a 128K context window, and I can help you answer questions, engage in conversations, and assist with text-based tasks. Although I do not support multimodal recognition, I can process files you upload, such as images, txt, pdf, ppt, word, and excel, and read text information from them to help you.

I am completely free to use and have no voice function, but you can download my app from the official app store. To use web search, remember to manually enable it in the Web or App.

My knowledge is current up to July 2024, and I will help you with enthusiasm and care. If you have any questions or need assistance, just let me know! I'm happy to help. ✨
====================Token usage====================

{
  prompt_tokens: 5,
  completion_tokens: 243,
  total_tokens: 248,
  completion_tokens_details: { reasoning_tokens: 83 }
}

HTTP

Sample code

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "deepseek-v3.2",
    "messages": [
        {
            "role": "user", 
            "content": "Who are you"
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "enable_thinking": true
}'

DashScope

Python

Sample code

import os
from dashscope import Generation

# Initialize the request parameters
messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If the environment variable is not set, replace the following line with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="deepseek-v3.2",
    messages=messages,
    result_format="message",  # Set the result format to message
    enable_thinking=True,
    stream=True,              # Enable streaming output
    incremental_output=True,  # Enable incremental output
)

reasoning_content = ""  # Full thinking process
answer_content = ""     # Full response
is_answering = False    # Whether the response phase has started

print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    message = chunk.output.choices[0].message
    # Collect the thinking content
    if "reasoning_content" in message:
        if not is_answering:
            print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content

    # Start the response phase once content is received
    if message.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content

print("\n" + "=" * 20 + "Token usage" + "=" * 20 + "\n")
print(chunk.usage)

Response

====================Thinking process====================

Oh, the user is asking who I am. This is a very basic self-introduction question. I need to state my identity and functions concisely and clearly, avoiding complexity. I can start with my company background and core capabilities to help the user quickly understand.
Considering the user might be new, I can add some typical use cases and features, such as being free, having a long context, and file processing. I'll end with an open-ended invitation for help, maintaining a friendly attitude.
No need for too many technical details, the focus should be on ease of use and practicality.
====================Full response====================

Hello! I am DeepSeek, an AI assistant created by DeepSeek.

I am a text-only model. Although I do not support multimodal recognition, I have a file upload feature that can help you process files like images, txt, pdf, ppt, word, and excel by reading the text information for analysis. I am completely free to use, have a 128K context window, and support web search (you need to manually enable it).

My knowledge is current up to July 2024, and I will help you with enthusiasm and care. You can download my app from the official app store.

If you have any questions or need help, just ask! I'm happy to answer your questions and assist with various tasks. ✨
====================Token usage====================

{"input_tokens": 6, "output_tokens": 240, "total_tokens": 246, "output_tokens_details": {"reasoning_tokens": 92}}

Java

Sample code

Important

The DashScope Java SDK must be version 2.19.4 or later.

// The DashScope SDK must be version 2.19.4 or later.
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import java.util.Arrays;

public class Main {
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;
    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();
        if (reasoning != null && !reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }
        if (content != null && !content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Full response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If the environment variable is not set, replace the following line with .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("deepseek-v3.2")
                .enableThinking(true)
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }
    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.err.println("An exception occurred: " + e.getMessage());
        }
    }
}

Response

====================Thinking process====================

Hmm, the user is asking a simple self-introduction question. This is a common query, so I need to state my identity and function clearly and quickly. I'll use a relaxed and friendly tone to introduce myself as DeepSeek-V3, created by DeepSeek. I can also mention the types of help I can provide, such as answering questions, chatting, and tutoring. Finally, I'll add an emoji to be more approachable. I should keep it concise and clear.
====================Full response====================

I am DeepSeek-V3, an intelligent assistant created by DeepSeek! I can help you answer various questions, provide suggestions, look up information, and even chat with you! Feel free to ask me anything about your studies, work, or daily life. How can I help you?

HTTP

Sample code

curl

curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "deepseek-v3.2",
    "input":{
        "messages":[      
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters":{
        "enable_thinking": true,
        "incremental_output": true,
        "result_format": "message"
    }
}'

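Other features

Model	Multi-turn conversation	Function calling	Context cache	Structured output	Partial mode
deepseek-v3.2	Supported	Supported	Supported	Not supported	Not supported
deepseek-v3.2-exp	Supported	Supported (non-thinking mode only)	Not supported	Not supported	Not supported
deepseek-v3.1	Supported	Supported (non-thinking mode only)	Not supported	Not supported	Not supported
deepseek-r1	Supported	Supported	Not supported	Not supported	Not supported
deepseek-r1-0528	Supported	Supported	Not supported	Not supported	Not supported
deepseek-v3	Supported	Supported	Not supported	Not supported	Not supported
Distilled models	Supported	Not supported	Not supported	Not supported	Not supported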
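Since every model in the table supports multi-turn conversation, a minimal sketch of how the message history might be maintained for the OpenAI-compatible API is given here. The add_turn and next_messages helpers are hypothetical. Storing only the final answer in the history, and leaving reasoning_content out, is an assumption of this sketch rather than something stated on this page.

```python
def add_turn(history, user_text, answer_text):
    """Return a new history with one completed user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": answer_text},
    ]


def next_messages(history, user_text):
    """Return the messages list to send for the next user question."""
    return history + [{"role": "user", "content": user_text}]


# First exchange: the answer text would normally be the accumulated
# answer_content from a streamed response.
history = add_turn([], "Who are you", "I am DeepSeek, an AI assistant.")

# Second question: prior turns are sent along with the new user message.
messages = next_messages(history, "What can you do?")
print([m["role"] for m in messages])  # prints ['user', 'assistant', 'user']
```

The resulting list can be passed as the messages argument in the same way as the single-message list in the quick start samples.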

Default parameter values

Model	temperature	top_p	repetition_penalty	presence_penalty	max_tokens	thinking_budget
deepseek-v3.2	1.0	0.95	-	-	65,536	32,768
deepseek-v3.2-exp	0.6	0.95	1.0	-	65,536	32,768
deepseek-v3.1	0.6	0.95	1.0	-	65,536	32,768
deepseek-r1	0.6	0.95	-	1	16,384	32,768
deepseek-r1-0528	0.6	0.95	-	1	16,384	32,768
Distilled models	0.6	0.95	-	1	16,384	16,384
deepseek-v3	0.7	0.6	-	-	16,384	-

A hyphen (-) indicates that the parameter has no default value and cannot be set.
deepseek-r1, deepseek-r1-0528, and the distilled models do not support setting these parameters.
For more information about the parameters, see "OpenAI Chat".
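To make the hyphen rule concrete, the sketch below merges caller overrides into the table defaults for two of the models and rejects any parameter that the table marks with a hyphen. DEFAULTS and resolve_params are hypothetical names used only for illustration, and the service itself may validate parameters differently.

```python
# Defaults copied from the table above; None stands for a hyphen
# (no default value, cannot be set).
DEFAULTS = {
    "deepseek-v3.2": {
        "temperature": 1.0, "top_p": 0.95, "repetition_penalty": None,
        "presence_penalty": None, "max_tokens": 65_536, "thinking_budget": 32_768,
    },
    "deepseek-v3": {
        "temperature": 0.7, "top_p": 0.6, "repetition_penalty": None,
        "presence_penalty": None, "max_tokens": 16_384, "thinking_budget": None,
    },
}


def resolve_params(model, overrides=None):
    """Merge overrides into the model defaults, rejecting unsettable parameters."""
    defaults = DEFAULTS[model]
    overrides = overrides or {}
    for name in overrides:
        if name not in defaults:
            raise KeyError(f"unknown parameter: {name}")
        if defaults[name] is None:
            raise ValueError(f"{name} cannot be set for {model}")
    # Drop hyphen entries so they are never sent with a request.
    resolved = {name: value for name, value in defaults.items() if value is not None}
    resolved.update(overrides)
    return resolved


params = resolve_params("deepseek-v3.2", {"temperature": 0.3})
print(params["temperature"], params["top_p"])  # prints 0.3 0.95
```

Calling resolve_params("deepseek-v3", {"thinking_budget": 1024}) raises ValueError, because the table shows a hyphen for thinking_budget on deepseek-v3.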

Billing

You are billed based on the number of input tokens and output tokens. For pricing details, see "Model list and pricing".

In thinking mode, the CoT is billed as output tokens.
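As a worked example of this rule, the sketch below separates the thinking tokens from the answer tokens in a usage record and estimates a charge. It assumes that completion_tokens already includes reasoning_tokens, as in OpenAI's usage format. The function names are hypothetical, and the prices are placeholders, not real Model Studio prices.

```python
def split_output_tokens(completion_tokens, reasoning_tokens):
    """Split billed output tokens into (thinking tokens, answer tokens)."""
    return reasoning_tokens, completion_tokens - reasoning_tokens


def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate the charge for one call; CoT tokens count as output tokens."""
    return (prompt_tokens * input_price_per_million
            + completion_tokens * output_price_per_million) / 1_000_000


# Usage figures from the Python sample response in the quick start section.
thinking, answer = split_output_tokens(completion_tokens=238, reasoning_tokens=93)
print(thinking, answer)  # prints 93 145

# Placeholder prices per million tokens, NOT real Model Studio prices.
cost = estimate_cost(5, 238, input_price_per_million=1.0, output_price_per_million=2.0)
print(cost)
```

In this sample, 93 of the 238 billed output tokens come from the thinking process, so thinking mode raises the cost of a short answer noticeably.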

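FAQ

Can I upload images or documents and ask questions about them?

DeepSeek models support only text input and do not accept image or document input. Qwen-VL supports image input, and Qwen-Long supports document input.

How do I check token usage and the number of calls?

One hour after you call a model, go to the "Model Observation" page. Set query conditions such as the time range and workspace. In the "Models" area, find the target model and click "Monitor" in the "Actions" column to view its call statistics. For more information, see "Usage and performance monitoring".

Data is updated hourly. During peak periods, it may be delayed by up to about one hour.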

Error codes

If an error occurs, see "Error messages" for solutions.