Call DeepSeek models on Alibaba Cloud Model Studio using an OpenAI-compatible API or the DashScope SDK.
Model availability
- Hybrid-thinking models (controlled by the enable_thinking parameter): deepseek-v3.2, deepseek-v3.2-exp, and deepseek-v3.1
- Thinking-only models (always think before responding): deepseek-r1 and deepseek-r1-0528
- Non-thinking models: deepseek-v3
deepseek-v3.2 is the latest DeepSeek model. It excels at coding and math tasks, offers the lowest pricing, and has more relaxed rate limits. We recommend it as your default choice.
Currently, only the deepseek-v3.2 model is available in the international (Singapore) region. See Model list for details.
| Model | Context window (tokens) | Max input (tokens) | Max CoT (tokens) | Max response (tokens) |
| --- | --- | --- | --- | --- |
| deepseek-v3.2 (685B) | 131,072 | 98,304 | 32,768 | 65,536 |
| deepseek-v3.2-exp (685B) | 131,072 | 98,304 | 32,768 | 65,536 |
| deepseek-v3.1 (685B) | 131,072 | 98,304 | 32,768 | 65,536 |
| deepseek-r1 (685B) | 131,072 | 98,304 | 32,768 | 16,384 |
| deepseek-r1-0528 (685B) | 131,072 | 98,304 | 32,768 | 16,384 |
| deepseek-v3 (671B) | 131,072 | 98,304 | - | 16,384 |
Max CoT is the maximum number of tokens for the thinking process in thinking mode.
The models listed above are not integrated third-party services. They are all deployed on Model Studio servers.
For concurrent request limits, see DeepSeek rate limits.
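If your application hits these rate limits, a common client-side pattern is to retry throttled requests with exponential backoff and jitter. The helper below is a generic sketch, not part of either SDK; `call_with_retry` and `is_throttle_error` are illustrative names, and you would supply a predicate that recognizes your SDK's rate-limit exception (for example, `openai.RateLimitError` in the OpenAI Python SDK).

```python
import random
import time

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    # Exponential backoff with full jitter, capped at `cap` seconds
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def call_with_retry(fn, max_attempts: int = 5, is_throttle_error=lambda e: True):
    # Retry `fn` when it raises an error the caller identifies as throttling;
    # re-raise immediately for other errors or once attempts are exhausted.
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as e:
            if not is_throttle_error(e) or attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))
```

You would then wrap the model call, for example `call_with_retry(lambda: client.chat.completions.create(...))`.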
Getting started
deepseek-v3.2 is the latest model in the DeepSeek series. Use the enable_thinking parameter to switch between thinking and non-thinking modes. The following example shows how to call deepseek-v3.2 in thinking mode.
Before you begin, get an API key and export it as an environment variable. If you use an SDK to call the model, install the OpenAI or DashScope SDK.
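For example, on Linux or macOS the setup steps above look like this (replace sk-xxx with your actual key):

```shell
# Set your Model Studio API key as an environment variable
export DASHSCOPE_API_KEY="sk-xxx"

# Install whichever SDK you plan to use
pip install -U openai       # for the OpenAI-compatible examples
pip install -U dashscope    # for the DashScope examples
```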
OpenAI compatible
The enable_thinking parameter is not a standard OpenAI parameter. In the OpenAI Python SDK, pass this parameter in extra_body. In the Node.js SDK, pass it as a top-level parameter.
Python
Sample code
```python
from openai import OpenAI
import os

# Initialize the OpenAI client
client = OpenAI(
    # If the environment variable is not set, replace it with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Who are you"}]

completion = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=messages,
    # Set enable_thinking in extra_body to enable thinking mode
    extra_body={"enable_thinking": True},
    stream=True,
    stream_options={"include_usage": True},
)

reasoning_content = ""  # Full thinking process
answer_content = ""     # Full response
is_answering = False    # Indicates whether the response phase has started

print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    # The final chunk carries only usage statistics, no choices
    if not chunk.choices:
        print("\n" + "=" * 20 + "Token usage" + "=" * 20 + "\n")
        print(chunk.usage)
        continue
    delta = chunk.choices[0].delta
    # Collect the thinking content
    if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
        if not is_answering:
            print(delta.reasoning_content, end="", flush=True)
        reasoning_content += delta.reasoning_content
    # Start printing the response once content arrives
    if hasattr(delta, "content") and delta.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
            is_answering = True
        print(delta.content, end="", flush=True)
        answer_content += delta.content
```

Response
====================Thinking process====================
Ah, the user is asking who I am. This is a very common opening question. I need to introduce my identity and functions simply and clearly. I can start with my company background and core capabilities to help the user quickly understand.
I should highlight my free-to-use nature and text-based strengths, but avoid going into too much detail. Finally, I'll guide the conversation with an open-ended question, which is in line with the nature of an assistant.
I'll position myself as an enterprise-level AI assistant, which is both professional and friendly. The emoji in parentheses can add a touch of friendliness.
====================Full response====================
Hello! I am DeepSeek, an AI assistant created by DeepSeek.
I am a text-only model. Although I do not support multimodal recognition, I have a file upload feature that can help you process various files such as images, txt, pdf, ppt, word, and excel, and read text information from them to assist you. I am completely free to use, have a 128K context window, and support web search (you need to manually enable it in the Web/App).
My knowledge is current up to July 2024, and I will help you with enthusiasm and care. You can download my app from the official app store.
Is there anything I can help you with? Whether it's a question about your studies, work, or daily life, I'm happy to assist you! ✨
====================Token usage====================
CompletionUsage(completion_tokens=238, prompt_tokens=5, total_tokens=243, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=93, rejected_prediction_tokens=None), prompt_tokens_details=None)

Node.js
Sample code
```javascript
import OpenAI from "openai";
import process from "process";

// Initialize the OpenAI client
const openai = new OpenAI({
    // If the environment variable is not set, replace it with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
});

let reasoningContent = "";  // Full thinking process
let answerContent = "";     // Full response
let isAnswering = false;    // Indicates whether the response phase has started

async function main() {
    try {
        const messages = [{ role: "user", content: "Who are you" }];
        const stream = await openai.chat.completions.create({
            model: "deepseek-v3.2",
            messages,
            // In the Node.js SDK, non-standard parameters such as enable_thinking
            // are passed as top-level properties, not in extra_body.
            enable_thinking: true,
            stream: true,
            stream_options: {
                include_usage: true
            },
        });
        console.log("\n" + "=".repeat(20) + "Thinking process" + "=".repeat(20) + "\n");
        for await (const chunk of stream) {
            // The final chunk carries only usage statistics, no choices
            if (!chunk.choices?.length) {
                console.log("\n" + "=".repeat(20) + "Token usage" + "=".repeat(20) + "\n");
                console.log(chunk.usage);
                continue;
            }
            const delta = chunk.choices[0].delta;
            // Collect the thinking content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }
            // Start printing the response once content arrives
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log("\n" + "=".repeat(20) + "Full response" + "=".repeat(20) + "\n");
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error("Error:", error);
    }
}

main();
```

Response
====================Thinking process====================
Ah, the user is asking who I am. This is a very common opening question. I need to introduce my identity and core functions simply and clearly, without going into too much detail.
I can start with my company background and basic positioning, then list a few key capabilities to let the user quickly understand what I can do. I'll end with an open-ended question to make it easy for the user to continue.
I should highlight practical features like being free, having a long context, and file processing. I'll maintain a friendly but restrained tone, without using emojis.
====================Full response====================
Hello! I am DeepSeek, an AI assistant created by DeepSeek.
I am a text-only model with a 128K context window, and I can help you answer questions, engage in conversations, and assist with text-based tasks. Although I do not support multimodal recognition, I can process files you upload, such as images, txt, pdf, ppt, word, and excel, and read text information from them to help you.
I am completely free to use and have no voice function, but you can download my app from the official app store. To use web search, remember to manually enable it in the Web or App.
My knowledge is current up to July 2024, and I will help you with enthusiasm and care. If you have any questions or need assistance, just let me know! I'm happy to help. ✨
====================Token usage====================
{
prompt_tokens: 5,
completion_tokens: 243,
total_tokens: 248,
completion_tokens_details: { reasoning_tokens: 83 }
}

HTTP
Sample code
curl
```shell
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "deepseek-v3.2",
    "messages": [
        {
            "role": "user",
            "content": "Who are you"
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "enable_thinking": true
}'
```

DashScope
Python
Sample code
```python
import os
import dashscope
from dashscope import Generation

dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"

# Initialize the request parameters
messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If the environment variable is not set, replace it with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="deepseek-v3.2",
    messages=messages,
    result_format="message",  # Set the result format to message
    enable_thinking=True,
    stream=True,              # Enable streaming output
    incremental_output=True,  # Enable incremental output
)

reasoning_content = ""  # Full thinking process
answer_content = ""     # Full response
is_answering = False    # Indicates whether the response phase has started

print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    message = chunk.output.choices[0].message
    # Collect the thinking content
    if "reasoning_content" in message:
        if not is_answering:
            print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content
    # Start printing the response once content arrives
    if message.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content

print("\n" + "=" * 20 + "Token usage" + "=" * 20 + "\n")
print(chunk.usage)
```

Response
====================Thinking process====================
Oh, the user is asking who I am. This is a very basic self-introduction question. I need to state my identity and functions concisely and clearly, avoiding complexity. I can start with my company background and core capabilities to help the user quickly understand.
Considering the user might be new, I can add some typical use cases and features, such as being free, having a long context, and file processing. I'll end with an open-ended invitation for help, maintaining a friendly attitude.
No need for too many technical details, the focus should be on ease of use and practicality.
====================Full response====================
Hello! I am DeepSeek, an AI assistant created by DeepSeek.
I am a text-only model. Although I do not support multimodal recognition, I have a file upload feature that can help you process files like images, txt, pdf, ppt, word, and excel by reading the text information for analysis. I am completely free to use, have a 128K context window, and support web search (you need to manually enable it).
My knowledge is current up to July 2024, and I will help you with enthusiasm and care. You can download my app from the official app store.
If you have any questions or need help, just ask! I'm happy to answer your questions and assist with various tasks. ✨
====================Token usage====================
{"input_tokens": 6, "output_tokens": 240, "total_tokens": 246, "output_tokens_details": {"reasoning_tokens": 92}}

Java
Sample code
Use DashScope Java SDK version 2.19.4 or later.
```java
// The DashScope SDK version must be 2.19.4 or later.
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.util.Arrays;

public class Main {
    private static final StringBuilder reasoningContent = new StringBuilder();
    private static final StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();
        if (reasoning != null && !reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }
        if (content != null && !content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Full response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }

    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If the environment variable is not set, replace it with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("deepseek-v3.2")
                .enableThinking(true)
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }

    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            Generation gen = new Generation("http", "https://dashscope-intl.aliyuncs.com/api/v1");
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.err.println("An exception occurred: " + e.getMessage());
        }
    }
}
```

Response
====================Thinking process====================
Hmm, the user is asking a simple self-introduction question. This is a common query, so I need to state my identity and function clearly and quickly. I'll use a relaxed and friendly tone to introduce myself as DeepSeek-V3, created by DeepSeek. I can also mention the types of help I can provide, such as answering questions, chatting, and tutoring. Finally, I'll add an emoji to be more approachable. I should keep it concise and clear.
====================Full response====================
I am DeepSeek-V3, an intelligent assistant created by DeepSeek! I can help you answer various questions, provide suggestions, look up information, and even chat with you! Feel free to ask me anything about your studies, work, or daily life. How can I help you?

HTTP
Sample code
curl
```shell
curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "deepseek-v3.2",
    "input": {
        "messages": [
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters": {
        "enable_thinking": true,
        "incremental_output": true,
        "result_format": "message"
    }
}'
```

Other features
Feature availability varies by model. For deepseek-v3.2-exp and deepseek-v3.1, the features covered here are supported only in non-thinking mode; no such restriction is noted for deepseek-v3.2, deepseek-r1, deepseek-r1-0528, deepseek-v3, or the distilled models.
Default parameter values
Model | temperature | top_p | repetition_penalty | presence_penalty | max_tokens | thinking_budget |
deepseek-v3.2 | 1.0 | 0.95 | - | - | 65,536 | 32,768 |
deepseek-v3.2-exp | 0.6 | 0.95 | 1.0 | - | 65,536 | 32,768 |
deepseek-v3.1 | 0.6 | 0.95 | 1.0 | - | 65,536 | 32,768 |
deepseek-r1 | 0.6 | 0.95 | - | 1 | 16,384 | 32,768 |
deepseek-r1-0528 | 0.6 | 0.95 | - | 1 | 16,384 | 32,768 |
Distilled models | 0.6 | 0.95 | - | 1 | 16,384 | 16,384 |
deepseek-v3 | 0.7 | 0.6 | - | - | 16,384 | - |
A hyphen (-) indicates the parameter has no default value and cannot be set.
The deepseek-r1, deepseek-r1-0528, and distilled models do not support these parameters.
For more information about parameter definitions, see OpenAI Chat.
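If you need to override the defaults above on the OpenAI-compatible endpoint, standard parameters such as temperature and max_tokens are passed normally, while Model Studio extensions go in extra_body in the Python SDK. The sketch below assumes thinking_budget is accepted the same way as enable_thinking; the values shown are illustrative, not recommendations.

```python
# Request parameters overriding the deepseek-v3.2 defaults from the table above.
# enable_thinking and thinking_budget are Model Studio extensions, not standard
# OpenAI parameters, so the Python SDK passes them in extra_body.
request_kwargs = {
    "model": "deepseek-v3.2",
    "temperature": 0.7,   # default is 1.0
    "top_p": 0.9,         # default is 0.95
    "max_tokens": 8192,   # default is 65,536
    "extra_body": {
        "enable_thinking": True,
        "thinking_budget": 4096,  # cap the CoT; default is 32,768
    },
}

# completion = client.chat.completions.create(messages=messages, **request_kwargs)
```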
Billing
Billing is based on the number of input and output tokens. For pricing details, see Model list and pricing.
In thinking mode, CoT tokens are billed as output tokens.
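In the OpenAI-compatible usage payloads shown earlier, completion_tokens already includes the CoT. A quick sketch of how the split works, using the numbers from the first streaming example:

```python
# Usage payload shaped like the OpenAI-compatible streaming example above
usage = {
    "prompt_tokens": 5,
    "completion_tokens": 238,
    "total_tokens": 243,
    "completion_tokens_details": {"reasoning_tokens": 93},
}

# CoT (reasoning) tokens are part of completion_tokens and are billed
# at the output-token rate, even though they do not appear in the answer.
reasoning_tokens = usage["completion_tokens_details"]["reasoning_tokens"]
answer_tokens = usage["completion_tokens"] - reasoning_tokens

print(f"billed output tokens: {usage['completion_tokens']} "
      f"({reasoning_tokens} CoT + {answer_tokens} answer)")
```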
FAQ
Can I upload images or documents to ask questions?
DeepSeek models accept text input only. They do not support image or document input. Qwen-VL supports image input, and Qwen-Long supports document input.
How do I view token usage and the number of calls?
One hour after calling a model, go to Monitoring, set filters (time range, workspace), locate your model in Models, and click Monitor in Actions to view statistics. For more information, see Usage and performance monitoring.
Data is updated hourly. During peak hours, updates may be delayed by up to one hour.
Error codes
If an error occurs, see Error messages for troubleshooting.