
Alibaba Cloud Model Studio: DeepSeek

Last Updated: Mar 24, 2026

Call DeepSeek models on Alibaba Cloud Model Studio using an OpenAI-compatible API or the DashScope SDK.

Model availability

  • Hybrid-thinking models (controlled by the enable_thinking parameter): deepseek-v3.2, deepseek-v3.2-exp, and deepseek-v3.1

  • Thinking-only models (always think before responding): deepseek-r1 and deepseek-r1-0528

  • Non-thinking models: deepseek-v3

deepseek-v3.2 is the latest DeepSeek model. It excels at coding and math tasks, offers the lowest pricing, and has more relaxed rate limits. We recommend it as your default choice.

Currently, only the deepseek-v3.2 model is available in the international (Singapore) region. See Model list for details.
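For the hybrid-thinking models, the mode is chosen per request through the enable_thinking parameter. As a minimal sketch (the build_request helper is ours, not part of any SDK), the same model can serve both modes by flipping one flag; with the OpenAI Python SDK the flag rides in extra_body:

```python
# Minimal sketch: build chat-completion kwargs for a hybrid-thinking model.
# The helper name (build_request) is illustrative; enable_thinking is the
# Model Studio parameter, passed via extra_body in the OpenAI Python SDK.

def build_request(prompt: str, thinking: bool) -> dict:
    """Return kwargs for client.chat.completions.create()."""
    return {
        "model": "deepseek-v3.2",
        "messages": [{"role": "user", "content": prompt}],
        # Non-standard parameter: must go in extra_body with the OpenAI SDK.
        "extra_body": {"enable_thinking": thinking},
        "stream": True,
    }

kwargs = build_request("Explain quicksort", thinking=False)
print(kwargs["extra_body"])  # {'enable_thinking': False}
```

Pass the returned dict to `client.chat.completions.create(**kwargs)`; the full streaming example in Getting started below shows how the thinking and answer phases arrive.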

| Model | Size | Context window | Max input | Max CoT | Max response |
| --- | --- | --- | --- | --- | --- |
| deepseek-v3.2 | 685B | 131,072 | 98,304 | 32,768 | 65,536 |
| deepseek-v3.2-exp | 685B | 131,072 | 98,304 | 32,768 | 65,536 |
| deepseek-v3.1 | 685B | 131,072 | 98,304 | 32,768 | 65,536 |
| deepseek-r1 | 685B | 131,072 | 98,304 | 32,768 | 16,384 |
| deepseek-r1-0528 | 685B | 131,072 | 98,304 | 32,768 | 16,384 |
| deepseek-v3 | 671B | 131,072 | 98,304 | - | 16,384 |

All values are in tokens.

Distilled models

| Model | Base model | Context window | Max input | Max CoT | Max response |
| --- | --- | --- | --- | --- | --- |
| deepseek-r1-distill-qwen-1.5b | Qwen2.5-Math-1.5B | 32,768 | 32,768 | 16,384 | 16,384 |
| deepseek-r1-distill-qwen-7b | Qwen2.5-Math-7B | 32,768 | 32,768 | 16,384 | 16,384 |
| deepseek-r1-distill-qwen-14b | Qwen2.5-14B | 32,768 | 32,768 | 16,384 | 16,384 |
| deepseek-r1-distill-qwen-32b | Qwen2.5-32B | 32,768 | 32,768 | 16,384 | 16,384 |
| deepseek-r1-distill-llama-8b | Llama-3.1-8B | 32,768 | 32,768 | 16,384 | 16,384 |
| deepseek-r1-distill-llama-70b | Llama-3.3-70B | 32,768 | 32,768 | 16,384 | 16,384 |

All values are in tokens.

  • Max CoT is the maximum number of tokens the model may spend on the thinking process in thinking mode.

  • The models listed above are not integrated third-party services; they are all deployed on Model Studio servers.

  • For concurrent request limits, see DeepSeek rate limits.
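When a request exceeds the rate limit, the service returns an HTTP 429 error. A common pattern is to retry with exponential backoff. The sketch below uses a placeholder RateLimitError class for illustration; with the OpenAI Python SDK you would catch openai.RateLimitError instead:

```python
import random
import time

# Sketch: retry a model call with exponential backoff when the service
# rejects it with a rate-limit error (HTTP 429). RateLimitError here is
# a placeholder; substitute your SDK's exception (e.g. openai.RateLimitError).

class RateLimitError(Exception):
    pass

def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # Give up after the last attempt.
            # Delays of base_delay * (1, 2, 4, ...), plus jitter to
            # avoid synchronized retries across clients.
            time.sleep(base_delay * (2 ** attempt + random.random()))
```

Wrap your completion call in a zero-argument function (e.g. a lambda) and pass it to call_with_backoff.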

Getting started

deepseek-v3.2 is the latest model in the DeepSeek series. Use the enable_thinking parameter to switch between thinking and non-thinking modes. The following example shows how to call deepseek-v3.2 in thinking mode.

Before you begin, get an API key and export it as an environment variable. If you use an SDK to call the model, install the OpenAI or DashScope SDK.

OpenAI compatible

Note

The enable_thinking parameter is not a standard OpenAI parameter. In the OpenAI Python SDK, pass this parameter in extra_body. In the Node.js SDK, pass it as a top-level parameter.

Python

Sample code

from openai import OpenAI
import os

# Initialize the OpenAI client
client = OpenAI(
    # If the environment variable is not set, replace it with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Who are you"}]
completion = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=messages,
    # Set enable_thinking in extra_body to enable thinking mode
    extra_body={"enable_thinking": True},
    stream=True,
    stream_options={
        "include_usage": True
    },
)

reasoning_content = ""  # Full thinking process
answer_content = ""  # Full response
is_answering = False  # Indicates whether the response phase has started
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    if not chunk.choices:
        print("\n" + "=" * 20 + "Token usage" + "=" * 20 + "\n")
        print(chunk.usage)
        continue

    delta = chunk.choices[0].delta

    # Collect only the thinking content
    if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
        if not is_answering:
            print(delta.reasoning_content, end="", flush=True)
        reasoning_content += delta.reasoning_content

    # Start replying when content is received
    if hasattr(delta, "content") and delta.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
            is_answering = True
        print(delta.content, end="", flush=True)
        answer_content += delta.content

Response

====================Thinking process====================

Ah, the user is asking who I am. This is a very common opening question. I need to introduce my identity and functions simply and clearly. I can start with my company background and core capabilities to help the user quickly understand.  
I should highlight my free-to-use nature and text-based strengths, but avoid going into too much detail. Finally, I'll guide the conversation with an open-ended question, which is in line with the nature of an assistant.  
I'll position myself as an enterprise-level AI assistant, which is both professional and friendly. The emoji in parentheses can add a touch of friendliness.
====================Full response====================

Hello! I am DeepSeek, an AI assistant created by DeepSeek.

I am a text-only model. Although I do not support multimodal recognition, I have a file upload feature that can help you process various files such as images, txt, pdf, ppt, word, and excel, and read text information from them to assist you. I am completely free to use, have a 128K context window, and support web search (you need to manually enable it in the Web/App).

My knowledge is current up to July 2024, and I will help you with enthusiasm and care. You can download my app from the official app store.

Is there anything I can help you with? Whether it's a question about your studies, work, or daily life, I'm happy to assist you! ✨
====================Token usage====================

CompletionUsage(completion_tokens=238, prompt_tokens=5, total_tokens=243, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=93, rejected_prediction_tokens=None), prompt_tokens_details=None)

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    // If the environment variable is not set, replace it with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY, 
    baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = ''; // Full thinking process
let answerContent = ''; // Full response
let isAnswering = false; // Indicates whether the response phase has started

async function main() {
    try {
        const messages = [{ role: 'user', content: 'Who are you' }];
        
        const stream = await openai.chat.completions.create({
            model: 'deepseek-v3.2',
            messages,
            // Note: In the Node.js SDK, non-standard parameters such as enable_thinking are passed as top-level properties and do not need to be placed in extra_body.
            enable_thinking: true,
            stream: true,
            stream_options: {
                include_usage: true
            },
        });

        console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');

        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log('\n' + '='.repeat(20) + 'Token usage' + '='.repeat(20) + '\n');
                console.log(chunk.usage);
                continue;
            }

            const delta = chunk.choices[0].delta;
            
            // Collect only the thinking content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }

            // Start replying when content is received
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Full response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();

Response

====================Thinking process====================

Ah, the user is asking who I am. This is a very common opening question. I need to introduce my identity and core functions simply and clearly, without going into too much detail.

I can start with my company background and basic positioning, then list a few key capabilities to let the user quickly understand what I can do. I'll end with an open-ended question to make it easy for the user to continue.

I should highlight practical features like being free, having a long context, and file processing. I'll maintain a friendly but restrained tone, without using emojis.
====================Full response====================

Hello! I am DeepSeek, an AI assistant created by DeepSeek.

I am a text-only model with a 128K context window, and I can help you answer questions, engage in conversations, and assist with text-based tasks. Although I do not support multimodal recognition, I can process files you upload, such as images, txt, pdf, ppt, word, and excel, and read text information from them to help you.

I am completely free to use and have no voice function, but you can download my app from the official app store. To use web search, remember to manually enable it in the Web or App.

My knowledge is current up to July 2024, and I will help you with enthusiasm and care. If you have any questions or need assistance, just let me know! I'm happy to help. ✨
====================Token usage====================

{
  prompt_tokens: 5,
  completion_tokens: 243,
  total_tokens: 248,
  completion_tokens_details: { reasoning_tokens: 83 }
}

HTTP

Sample code

curl

curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "deepseek-v3.2",
    "messages": [
        {
            "role": "user", 
            "content": "Who are you"
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "enable_thinking": true
}'

DashScope

Python

Sample code

import os
import dashscope
from dashscope import Generation

dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"

# Initialize the request parameters
messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If the environment variable is not set, replace it with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="deepseek-v3.2",
    messages=messages,
    result_format="message",  # Set the result format to message
    enable_thinking=True,
    stream=True,              # Enable streaming output
    incremental_output=True,  # Enable incremental output
)

reasoning_content = ""  # Full thinking process
answer_content = ""     # Full response
is_answering = False    # Indicates whether the response phase has started

print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    message = chunk.output.choices[0].message
    # Collect only the thinking content
    if "reasoning_content" in message:
        if not is_answering:
            print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content

    # Start replying when content is received
    if message.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content

print("\n" + "=" * 20 + "Token usage" + "=" * 20 + "\n")
print(chunk.usage)

Response

====================Thinking process====================

Oh, the user is asking who I am. This is a very basic self-introduction question. I need to state my identity and functions concisely and clearly, avoiding complexity. I can start with my company background and core capabilities to help the user quickly understand.  
Considering the user might be new, I can add some typical use cases and features, such as being free, having a long context, and file processing. I'll end with an open-ended invitation for help, maintaining a friendly attitude.  
No need for too many technical details, the focus should be on ease of use and practicality.
====================Full response====================

Hello! I am DeepSeek, an AI assistant created by DeepSeek.

I am a text-only model. Although I do not support multimodal recognition, I have a file upload feature that can help you process files like images, txt, pdf, ppt, word, and excel by reading the text information for analysis. I am completely free to use, have a 128K context window, and support web search (you need to manually enable it).

My knowledge is current up to July 2024, and I will help you with enthusiasm and care. You can download my app from the official app store.

If you have any questions or need help, just ask! I'm happy to answer your questions and assist with various tasks. ✨
====================Token usage====================

{"input_tokens": 6, "output_tokens": 240, "total_tokens": 246, "output_tokens_details": {"reasoning_tokens": 92}}

Java

Sample code

Important

Use DashScope Java SDK version 2.19.4 or later.

// The DashScope SDK version must be 2.19.4 or later.
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import java.util.Arrays;

public class Main {
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;
    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();
        if (reasoning != null && !reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }
        if (content != null && !content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Full response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If the environment variable is not set, replace it with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("deepseek-v3.2")
                .enableThinking(true)
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }
    public static void main(String[] args) {
        try {
            Generation gen = new Generation("http", "https://dashscope-intl.aliyuncs.com/api/v1");
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.err.println("An exception occurred: " + e.getMessage());
        }
    }
}

Response

====================Thinking process====================

Hmm, the user is asking a simple self-introduction question. This is a common query, so I need to state my identity and function clearly and quickly. I'll use a relaxed and friendly tone to introduce myself as DeepSeek-V3, created by DeepSeek. I can also mention the types of help I can provide, such as answering questions, chatting, and tutoring. Finally, I'll add an emoji to be more approachable. I should keep it concise and clear.
====================Full response====================

I am DeepSeek-V3, an intelligent assistant created by DeepSeek! I can help you answer various questions, provide suggestions, look up information, and even chat with you! Feel free to ask me anything about your studies, work, or daily life. How can I help you?

HTTP

Sample code

curl

curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "deepseek-v3.2",
    "input":{
        "messages":[      
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters":{
        "enable_thinking": true,
        "incremental_output": true,
        "result_format": "message"
    }
}'

Other features

| Model | Multi-turn conversation | Function calling | Context cache | Structured output | Prefix completion |
| --- | --- | --- | --- | --- | --- |
| deepseek-v3.2 | Supported | Supported | Supported | Not supported | Not supported |
| deepseek-v3.2-exp | Supported | Supported (non-thinking mode only) | Not supported | Not supported | Not supported |
| deepseek-v3.1 | Supported | Supported (non-thinking mode only) | Not supported | Not supported | Not supported |
| deepseek-r1 | Supported | Supported | Not supported | Not supported | Not supported |
| deepseek-r1-0528 | Supported | Supported | Not supported | Not supported | Not supported |
| deepseek-v3 | Supported | Supported | Not supported | Not supported | Not supported |
| Distilled models | Supported | Not supported | Not supported | Not supported | Not supported |
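With the OpenAI-compatible endpoint, function calling uses the standard OpenAI tools schema. The sketch below defines a hypothetical get_weather tool (not a real Model Studio tool) to show the shape of the request:

```python
# Sketch of a function-calling tool definition in the OpenAI tools format.
# get_weather is a hypothetical example tool, implemented on your side.

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Pass `tools` to client.chat.completions.create(model="deepseek-v3.2",
# messages=..., tools=tools). If the model decides to call the function,
# the response carries choices[0].message.tool_calls with the tool name
# and JSON-encoded arguments; execute the call and send the result back
# as a message with role "tool".
```

Note that for deepseek-v3.2-exp and deepseek-v3.1, function calling works only with enable_thinking set to false.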

Default parameter values

| Model | temperature | top_p | repetition_penalty | presence_penalty | max_tokens | thinking_budget |
| --- | --- | --- | --- | --- | --- | --- |
| deepseek-v3.2 | 1.0 | 0.95 | - | - | 65,536 | 32,768 |
| deepseek-v3.2-exp | 0.6 | 0.95 | 1.0 | - | 65,536 | 32,768 |
| deepseek-v3.1 | 0.6 | 0.95 | 1.0 | - | 65,536 | 32,768 |
| deepseek-r1 | 0.6 | 0.95 | - | 1 | 16,384 | 32,768 |
| deepseek-r1-0528 | 0.6 | 0.95 | - | 1 | 16,384 | 32,768 |
| Distilled models | 0.6 | 0.95 | - | 1 | 16,384 | 16,384 |
| deepseek-v3 | 0.7 | 0.6 | - | - | 16,384 | - |

  • A hyphen (-) indicates the parameter has no default value and cannot be set.

  • The deepseek-r1, deepseek-r1-0528, and distilled models do not support changing these parameters; the values above are fixed.

  • For more information about parameter definitions, see OpenAI Chat.
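The defaults apply whenever you omit a parameter. To override them for a model that supports it, pass standard OpenAI parameters at the top level and Model Studio-specific ones in extra_body; a sketch of such a request payload:

```python
# Sketch: override the default sampling parameters for deepseek-v3.2.
# temperature and max_tokens are standard OpenAI parameters; enable_thinking
# is Model Studio-specific and goes in extra_body with the OpenAI Python SDK.
request = {
    "model": "deepseek-v3.2",
    "messages": [{"role": "user", "content": "Summarize the CAP theorem"}],
    "temperature": 0.3,   # lower than the 1.0 default for more focused output
    "max_tokens": 1024,   # cap the response well below the 65,536 default
    "extra_body": {"enable_thinking": False},
}
# Pass with: client.chat.completions.create(**request)
```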

Billing

Billing is based on the number of input and output tokens. For pricing details, see Model list and pricing.

In thinking mode, CoT tokens are billed as output tokens.
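Because CoT tokens are already counted inside the output (completion) tokens of the usage payload, estimating a request's cost needs only the prompt and completion totals. A small sketch with placeholder unit prices (not Model Studio's actual rates):

```python
# Sketch: estimate the cost of a request from its usage payload.
# Reasoning (CoT) tokens are already included in completion_tokens,
# so they need no separate line item. Prices below are placeholders.

def estimate_cost(usage: dict, in_price: float, out_price: float) -> float:
    """Prices are per 1,000 tokens."""
    return (usage["prompt_tokens"] * in_price
            + usage["completion_tokens"] * out_price) / 1000

usage = {
    "prompt_tokens": 5,
    "completion_tokens": 238,  # includes 93 reasoning tokens in the sample above
    "total_tokens": 243,
}
print(round(estimate_cost(usage, in_price=0.1, out_price=0.2), 4))  # 0.0481
```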

FAQ

Can I upload images or documents to ask questions?

DeepSeek models accept text input only. They do not support image or document input. Qwen-VL supports image input, and Qwen-Long supports document input.

How do I view token usage and the number of calls?

One hour after calling a model, go to Monitoring, set filters (time range, workspace), locate your model in Models, and click Monitor in Actions to view statistics. For more information, see Usage and performance monitoring.

Data is updated hourly. During peak hours, updates may be delayed by up to one hour.


Error codes

If an error occurs, see Error messages for troubleshooting.