This document describes how to call the Kimi model inference service deployed on Alibaba Cloud Model Studio.
Moonshot-Kimi-K2-Instruct and kimi-k2-thinking will be retired on July 9, 2026. We recommend migrating to qwen3.7-plus, qwen3.7-max, or qwen3.6-flash.
Supported regions: China (Beijing), China (Hong Kong), Germany (Frankfurt), and US (Virginia).
Model experience: You can try the Kimi model in the model trial center.
Service endpoints are region-specific. Configure the correct base URL for your region.
OpenAI compatible
US (Virginia)
The base_url for SDK calls is: https://dashscope-us.aliyuncs.com/compatible-mode/v1
HTTP request URL: POST https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completions
Germany (Frankfurt)
The base_url for SDK calls is: https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1
HTTP request URL: POST https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1/chat/completions
When you make a call, replace WorkspaceId with your actual Workspace ID.
China (Beijing)
The base_url for SDK calls is: https://dashscope.aliyuncs.com/compatible-mode/v1
HTTP request URL: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
China (Hong Kong)
The base_url for SDK calls is: https://{WorkspaceId}.cn-hongkong.maas.aliyuncs.com/compatible-mode/v1
HTTP request URL: POST https://{WorkspaceId}.cn-hongkong.maas.aliyuncs.com/compatible-mode/v1/chat/completions
When you make a call, replace WorkspaceId with your actual Workspace ID.
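The region-to-endpoint mapping above can be captured in a small lookup helper. The following is a minimal sketch: the URLs are copied from the list above, while the region labels (such as "china-beijing") and the function name are illustrative assumptions, not official identifiers.

```python
# Base URLs copied from the endpoint list above. The dictionary keys are
# illustrative labels chosen for this sketch, not official region IDs.
COMPATIBLE_BASE_URLS = {
    "us-virginia": "https://dashscope-us.aliyuncs.com/compatible-mode/v1",
    "germany-frankfurt": "https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1",
    "china-beijing": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "china-hongkong": "https://{WorkspaceId}.cn-hongkong.maas.aliyuncs.com/compatible-mode/v1",
}


def compatible_base_url(region, workspace_id=None):
    """Return the OpenAI-compatible base_url for a region label (hypothetical helper)."""
    if region not in COMPATIBLE_BASE_URLS:
        raise ValueError(f"Unknown region label: {region!r}")
    template = COMPATIBLE_BASE_URLS[region]
    if "{WorkspaceId}" in template:
        # Frankfurt and Hong Kong endpoints embed the Workspace ID in the host name.
        if not workspace_id:
            raise ValueError(f"Region {region!r} requires a Workspace ID")
        return template.replace("{WorkspaceId}", workspace_id)
    return template
```

The returned string can be passed as base_url when you create the OpenAI client. For HTTP requests, append /chat/completions to it.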
DashScope
US (Virginia)
The HTTP request URL for text models, such as kimi-k2-thinking, is POST https://dashscope-us.aliyuncs.com/api/v1/services/aigc/text-generation/generation
The HTTP request URL for multimodal models, such as kimi-k2.6 and kimi-k2.5, is POST https://dashscope-us.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
The base_url for SDK calls is:
Python code
dashscope.base_http_api_url = 'https://dashscope-us.aliyuncs.com/api/v1'
Java code
Method 1:
import com.alibaba.dashscope.protocol.Protocol;
Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-us.aliyuncs.com/api/v1");
Method 2:
import com.alibaba.dashscope.utils.Constants;
Constants.baseHttpApiUrl = "https://dashscope-us.aliyuncs.com/api/v1";
Germany (Frankfurt)
The HTTP request URL for text models, such as kimi-k2-thinking, is POST https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1/services/aigc/text-generation/generation
The HTTP request URL for multimodal models, such as kimi-k2.7-code, kimi-k2.6, and kimi-k2.5, is POST https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
When you make a call, replace WorkspaceId with your actual Workspace ID.
The base_url for SDK calls is:
Python code
dashscope.base_http_api_url = 'https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1'
Java code
Method 1:
import com.alibaba.dashscope.protocol.Protocol;
Generation gen = new Generation(Protocol.HTTP.getValue(), "https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1");
Method 2:
import com.alibaba.dashscope.utils.Constants;
Constants.baseHttpApiUrl = "https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1";
China (Hong Kong)
The HTTP request URL for text models, such as kimi-k2-thinking, is POST https://{WorkspaceId}.cn-hongkong.maas.aliyuncs.com/api/v1/services/aigc/text-generation/generation
The HTTP request URL for multimodal models, such as kimi-k2.7-code, kimi-k2.6, and kimi-k2.5, is POST https://{WorkspaceId}.cn-hongkong.maas.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
When you make a call, replace WorkspaceId with your actual Workspace ID.
The base_url for SDK calls is:
Python code
dashscope.base_http_api_url = 'https://{WorkspaceId}.cn-hongkong.maas.aliyuncs.com/api/v1'
Java code
Method 1:
import com.alibaba.dashscope.protocol.Protocol;
Generation gen = new Generation(Protocol.HTTP.getValue(), "https://{WorkspaceId}.cn-hongkong.maas.aliyuncs.com/api/v1");
Method 2:
import com.alibaba.dashscope.utils.Constants;
Constants.baseHttpApiUrl = "https://{WorkspaceId}.cn-hongkong.maas.aliyuncs.com/api/v1";
China (Beijing)
The HTTP request URL for text models, such as kimi-k2-thinking, is POST https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation
The HTTP request URL for multimodal models, such as kimi-k2.7-code, kimi-k2.6, and kimi-k2.5, is POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
You do not need to configure the base_url for SDK calls.
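The DashScope request URLs listed above share one pattern: the regional API base, followed by a service path that depends on whether the model is a text model or a multimodal model. The sketch below illustrates that pattern. The function name is hypothetical, the model set is taken from the lists above, and treating every other model as a text model is an assumption of this sketch. It does not check whether a model is available in a given region.

```python
# Multimodal models named in the endpoint list above.
MULTIMODAL_MODELS = {"kimi-k2.7-code", "kimi-k2.6", "kimi-k2.5"}


def dashscope_request_url(api_base, model):
    """Build the DashScope HTTP request URL for a model (hypothetical helper).

    api_base is the regional base, for example
    "https://dashscope.aliyuncs.com/api/v1".
    """
    # Assumption: any model not listed as multimodal uses the text endpoint.
    service = "multimodal-generation" if model in MULTIMODAL_MODELS else "text-generation"
    return f"{api_base.rstrip('/')}/services/aigc/{service}/generation"
```

For example, dashscope_request_url("https://dashscope.aliyuncs.com/api/v1", "kimi-k2.6") yields the China (Beijing) multimodal URL shown above.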
Prerequisites: You must get an API key and set it as an environment variable. If you use the SDK, you must install the SDK.
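Because the examples in this document read the key from the DASHSCOPE_API_KEY environment variable, a missing variable typically only shows up later as an authentication failure. A minimal sketch of an early check follows. The function name and error message are illustrative and not part of any SDK.

```python
import os


def require_api_key(env=None):
    """Return the API key from DASHSCOPE_API_KEY or fail early (hypothetical helper)."""
    # Accept a mapping for testing; default to the real process environment.
    env = os.environ if env is None else env
    key = (env.get("DASHSCOPE_API_KEY") or "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set. Get an API key and export it "
            "as an environment variable before running the examples."
        )
    return key
```

The return value can be passed as api_key to the OpenAI client or to the DashScope SDK calls shown below.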
Get started
The following examples use text-only input. For multimodal examples, see multimodal call.
OpenAI compatible
Python
import os
from openai import OpenAI
client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[{"role": "user", "content": "Who are you?"}],
    stream=True,
)
reasoning_content = ""  # Complete thinking process
answer_content = ""  # Complete response
is_answering = False  # Tracks if the main response has started.
print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")
for chunk in completion:
    if chunk.choices:
        delta = chunk.choices[0].delta
        # Store content from the thinking process.
        if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
            if not is_answering:
                print(delta.reasoning_content, end="", flush=True)
            reasoning_content += delta.reasoning_content
        # Start printing the main response once its content arrives.
        if hasattr(delta, "content") and delta.content:
            if not is_answering:
                print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
                is_answering = True
            print(delta.content, end="", flush=True)
            answer_content += delta.content
Response
====================Thinking Process====================
The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.
I am Kimi, an AI assistant developed by Moonshot AI. I should introduce myself clearly and concisely, including:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.
I should maintain a friendly and professional tone, avoiding overly technical terms for clarity. I should also emphasize that I am an AI without personal consciousness, emotions, or experiences to prevent misunderstandings.
Response structure:
- Directly state my identity
- Mention my developer
- Briefly introduce core capabilities
- Keep it clear and concise
====================Complete Response====================
I am Kimi, an AI assistant developed by Moonshot AI. I am based on a Mixture-of-Experts (MoE) architecture and have capabilities such as ultra-long context understanding, intelligent conversation, file processing, code generation, and complex task reasoning. How can I help you?
Node.js
import OpenAI from "openai";
import process from 'process';
// Initialize the OpenAI client
const openai = new OpenAI({
// If not using an environment variable, replace `process.env.DASHSCOPE_API_KEY` with your API key string (e.g., "sk-xxx").
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});
let reasoningContent = ''; // Complete thinking process
let answerContent = ''; // Complete response
let isAnswering = false; // Tracks if the main response has started.
async function main() {
const messages = [{ role: 'user', content: 'Who are you?' }];
const stream = await openai.chat.completions.create({
model: 'kimi-k2.6',
messages,
stream: true,
});
console.log('\n' + '='.repeat(20) + 'Thinking Process' + '='.repeat(20) + '\n');
for await (const chunk of stream) {
if (chunk.choices?.length) {
const delta = chunk.choices[0].delta;
// Store content from the thinking process.
if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
if (!isAnswering) {
process.stdout.write(delta.reasoning_content);
}
reasoningContent += delta.reasoning_content;
}
// Start printing the main response once its content arrives.
if (delta.content !== undefined && delta.content) {
if (!isAnswering) {
console.log('\n' + '='.repeat(20) + 'Complete Response' + '='.repeat(20) + '\n');
isAnswering = true;
}
process.stdout.write(delta.content);
answerContent += delta.content;
}
}
}
}
main();
Response
====================Thinking Process====================
The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.
I am Kimi, an AI assistant developed by Moonshot AI. I should introduce myself clearly and concisely, including:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.
I should maintain a friendly and professional tone and avoid overly technical terms for clarity. I should also emphasize that I am an AI without personal consciousness, emotions, or experiences to prevent misunderstandings.
Response structure:
- Directly state my identity
- Mention my developer
- Briefly introduce core capabilities
- Keep it clear and concise
====================Complete Response====================
I am Kimi, an AI assistant developed by Moonshot AI.
I am skilled in:
- Long-text understanding and generation
- Intelligent conversation and question answering
- File processing and analysis
- Information retrieval and integration
As an AI assistant, I do not have personal consciousness, emotions, or experiences, but I am designed to provide accurate and helpful assistance. How can I help you?
HTTP
curl
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "kimi-k2.6",
"messages": [
{
"role": "user",
"content": "Who are you?"
}
]
}'
Response
{
"choices": [
{
"message": {
"content": "I am Kimi, an AI assistant developed by Moonshot AI. I am skilled in long-text processing, intelligent conversation, file analysis, programming assistance, and complex task reasoning. I can help you answer questions, create content, and analyze documents. How can I assist you?",
"reasoning_content": "The user asks \"Who are you?\", which is a direct question about my identity. I must answer truthfully based on my actual identity.\n\nI am Kimi, an AI assistant developed by Moonshot AI. I should introduce myself clearly and concisely, including:\n1. My identity: AI assistant\n2. My developer: Moonshot AI\n3. My name: Kimi\n4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.\n\nI should maintain a friendly and professional tone while providing useful information. No need to overcomplicate; a direct answer is sufficient.",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 8,
"completion_tokens": 183,
"total_tokens": 191
},
"created": 1762753998,
"system_fingerprint": null,
"model": "kimi-k2.6",
"id": "chatcmpl-485ab490-90ec-48c3-85fa-1c732b683db2"
}
DashScope
The following DashScope examples use the multimodal-generation endpoint to call kimi-k2.6, which supports both text and multimodal input. For more multimodal examples, see multimodal call.
Python
import os
from dashscope import MultiModalConversation
# Define the request messages.
messages = [{"role": "user", "content": "Who are you?"}]
completion = MultiModalConversation.call(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # If not using an environment variable, provide your key directly, e.g., api_key="sk-xxx"
    model="kimi-k2.6",
    messages=messages,
    result_format="message",  # Set the result format to message
    stream=True,  # Enable streaming.
    incremental_output=True,  # Enable incremental output
)
reasoning_content = ""  # Complete thinking process
answer_content = ""  # Complete response
is_answering = False  # Tracks if the main response has started.
print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")
for chunk in completion:
    message = chunk.output.choices[0].message
    # Store content from the thinking process.
    if message.reasoning_content:
        if not is_answering:
            print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content
    # Start printing the main response once its content arrives.
    if message.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content
Response
====================Thinking Process====================
The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.
I am Kimi, an AI assistant developed by Moonshot AI. I should state this clearly and concisely.
Key information to include:
1. My name: Kimi
2. My developer: Moonshot AI
3. My nature: AI assistant
4. What I can do: answer questions, assist with content creation, etc.
I should maintain a friendly and helpful tone while accurately stating my identity. I should not pretend to be human or have a personal identity.
A suitable response would be:
"I am Kimi, an AI assistant developed by Moonshot AI. I can help you with a variety of tasks such as answering questions, creating content, and analyzing documents. How can I help you?"
This response is direct, accurate, and encourages further interaction.
====================Complete Response====================
I am Kimi, an AI assistant developed by Moonshot AI. I can help you with a variety of tasks such as answering questions, creating content, and analyzing documents. How can I help you?
Java
// DashScope SDK version >= 2.19.4
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import java.util.Arrays;
import java.util.Collections;
public class Main {
public static void main(String[] args) {
try {
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMsg = MultiModalMessage.builder()
.role(Role.USER.getValue())
.content(Arrays.asList(Collections.singletonMap("text", "Who are you?")))
.build();
MultiModalConversationParam param = MultiModalConversationParam.builder()
// If not using an environment variable, replace the following line with your API key, e.g., .apiKey("sk-xxx")
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("kimi-k2.6")
.messages(Arrays.asList(userMsg))
.build();
MultiModalConversationResult result = conv.call(param);
String content = result.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text");
System.out.println("Response: " + content);
} catch (ApiException | NoApiKeyException | InputRequiredException e) {
System.err.println("An exception occurred: " + e.getMessage());
}
System.exit(0);
}
}
Response
====================Thinking Process====================
The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.
I am Kimi, an AI assistant developed by Moonshot AI. I should state this clearly and concisely.
The response should include:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, etc.
I should not pretend to be human or provide excessive technical details. A clear and friendly answer is sufficient.
====================Complete Response====================
I am Kimi, an AI assistant developed by Moonshot AI. My skills include long-text processing, intelligent conversation, question answering, content creation, and file analysis and processing. How can I assist you?
HTTP
curl
curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "kimi-k2.6",
"input":{
"messages":[
{
"role": "user",
"content": "Who are you?"
}
]
},
"parameters": {
"result_format": "message"
}
}'
Response
{
"output": {
"choices": [
{
"finish_reason": "stop",
"message": {
"content": "I am Kimi, an AI assistant developed by Moonshot AI. I can help you answer questions, create content, analyze documents, and write code. How can I help you?",
"reasoning_content": "The user asks \"Who are you?\", which is a direct question about my identity. I need to answer truthfully based on my actual identity.\n\nI am Kimi, an AI assistant developed by Moonshot AI. I should state this clearly and concisely.\n\nKey information to include:\n1. My name: Kimi\n2. My developer: Moonshot AI\n3. My nature: AI assistant\n4. What I can do: answer questions, assist with content creation, etc.\n\nThe response should be friendly, direct, and easy to understand.",
"role": "assistant"
}
}
]
},
"usage": {
"input_tokens": 9,
"output_tokens": 156,
"total_tokens": 165
},
"request_id": "709a0697-ed1f-4298-82c9-a4b878da1849"
}
Multimodal calls
The kimi-k2.7-code, kimi-k2.6, and kimi-k2.5 models can process text together with images or video in a single request. Use the enable_thinking parameter to enable thinking mode. The following examples show how to use this capability.
Enable or disable thinking mode
kimi-k2.6 and kimi-k2.5 are hybrid thinking models. These models can reply after thinking or reply directly. You can use the enable_thinking parameter to control whether to enable the thinking mode:
true: Enables thinking mode.
false (default): Disables thinking mode.
kimi-k2.7-code is a thinking-only model: thinking mode is always enabled (enable_thinking defaults to true and cannot be disabled), and preserve_thinking defaults to true.
kimi-k2.6 supports passing the thinking process in multi-turn conversations by using the preserve_thinking parameter. For more information, see Pass the thinking process.
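The thinking-mode rules above can be expressed as a small client-side check. This is a minimal sketch based only on the description above. The helper name is hypothetical, and raising an error when a caller tries to disable thinking for kimi-k2.7-code is a design choice of this sketch, not documented service behavior.

```python
THINKING_ONLY_MODELS = {"kimi-k2.7-code"}   # thinking mode is always enabled
HYBRID_MODELS = {"kimi-k2.6", "kimi-k2.5"}  # thinking mode is optional, off by default


def resolve_enable_thinking(model, requested=None):
    """Return the effective enable_thinking value for a model (hypothetical helper).

    requested is True, False, or None (None means "use the model default").
    """
    if model in THINKING_ONLY_MODELS:
        if requested is False:
            # Sketch-level choice: reject the request instead of silently ignoring it.
            raise ValueError(f"{model} is thinking-only; enable_thinking cannot be False")
        return True
    if model in HYBRID_MODELS:
        # Hybrid models default to replying directly, without thinking.
        return False if requested is None else bool(requested)
    raise ValueError(f"Model not covered by this sketch: {model!r}")
```

With the OpenAI Python SDK, the result can be passed as extra_body={"enable_thinking": resolve_enable_thinking(model, requested)}, in the same way as in the example that follows.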
The following examples show how to use an image URL and enable thinking mode. The main example demonstrates single-image input, while the commented-out code is an example of multi-image input.
OpenAI compatible
Python
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
# Single-image input example (thinking mode enabled)
completion = client.chat.completions.create(
model="kimi-k2.6",
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "What scene is depicted in the image?"},
{
"type": "image_url",
"image_url": {
"url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"
}
}
]
}
],
extra_body={"enable_thinking":True} # Enable thinking mode
)
# Print the thinking process
if hasattr(completion.choices[0].message, 'reasoning_content') and completion.choices[0].message.reasoning_content:
    print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")
    print(completion.choices[0].message.reasoning_content)
# Print the complete response
print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
print(completion.choices[0].message.content)
# Multi-image input example (thinking mode enabled, uncomment to use)
# completion = client.chat.completions.create(
# model="kimi-k2.6",
# messages=[
# {
# "role": "user",
# "content": [
# {"type": "text", "text": "What do these images depict?"},
# {
# "type": "image_url",
# "image_url": {"url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"}
# },
# {
# "type": "image_url",
# "image_url": {"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"}
# }
# ]
# }
# ],
# extra_body={"enable_thinking":True}
# )
#
# # Print the thinking process and complete response
# if hasattr(completion.choices[0].message, 'reasoning_content') and completion.choices[0].message.reasoning_content:
#     print("\nThinking Process:\n" + completion.choices[0].message.reasoning_content)
# print("\nComplete Response:\n" + completion.choices[0].message.content)
Node.js
import OpenAI from "openai";
import process from 'process';
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});
// Single-image input example (thinking mode enabled)
const completion = await openai.chat.completions.create({
model: 'kimi-k2.6',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'What scene is depicted in the image?' },
{
type: 'image_url',
image_url: {
url: 'https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg'
}
}
]
}
],
enable_thinking: true // Enable thinking mode
});
// Print the thinking process
if (completion.choices[0].message.reasoning_content) {
console.log('\n' + '='.repeat(20) + 'Thinking Process' + '='.repeat(20) + '\n');
console.log(completion.choices[0].message.reasoning_content);
}
// Print the complete response
console.log('\n' + '='.repeat(20) + 'Complete Response' + '='.repeat(20) + '\n');
console.log(completion.choices[0].message.content);
// Multi-image input example (thinking mode enabled, uncomment to use)
// const multiCompletion = await openai.chat.completions.create({
// model: 'kimi-k2.6',
// messages: [
// {
// role: 'user',
// content: [
// { type: 'text', text: 'What do these images depict?' },
// {
// type: 'image_url',
// image_url: { url: 'https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg' }
// },
// {
// type: 'image_url',
// image_url: { url: 'https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png' }
// }
// ]
// }
// ],
// enable_thinking: true
// });
//
// // Print the thinking process and complete response
// if (multiCompletion.choices[0].message.reasoning_content) {
// console.log('\nThinking Process:\n' + multiCompletion.choices[0].message.reasoning_content);
// }
// console.log('\nComplete Response:\n' + multiCompletion.choices[0].message.content);
curl
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "kimi-k2.6",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What scene is depicted in the image?"
},
{
"type": "image_url",
"image_url": {
"url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"
}
}
]
}
],
"enable_thinking": true
}'
# Multi-image input example (uncomment to use)
# curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
# -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
# -H "Content-Type: application/json" \
# -d '{
# "model": "kimi-k2.6",
# "messages": [
# {
# "role": "user",
# "content": [
# {
# "type": "text",
# "text": "What do these images depict?"
# },
# {
# "type": "image_url",
# "image_url": {
# "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"
# }
# },
# {
# "type": "image_url",
# "image_url": {
# "url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"
# }
# }
# ]
# }
# ],
# "enable_thinking": true,
# "stream": false
# }'
DashScope
Python
import os
from dashscope import MultiModalConversation
# Single-image input example (thinking mode enabled)
response = MultiModalConversation.call(
api_key=os.getenv("DASHSCOPE_API_KEY"),
model="kimi-k2.6",
messages=[
{
"role": "user",
"content": [
{"text": "What scene is depicted in the image?"},
{"image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"}
]
}
],
enable_thinking=True # Enable thinking mode
)
# Print the thinking process
if hasattr(response.output.choices[0].message, 'reasoning_content') and response.output.choices[0].message.reasoning_content:
    print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")
    print(response.output.choices[0].message.reasoning_content)
# Print the complete response
print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
print(response.output.choices[0].message.content[0]["text"])
# Multi-image input example (thinking mode enabled, uncomment to use)
# response = MultiModalConversation.call(
# api_key=os.getenv("DASHSCOPE_API_KEY"),
# model="kimi-k2.6",
# messages=[
# {
# "role": "user",
# "content": [
# {"text": "What do these images depict?"},
# {"image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"},
# {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"}
# ]
# }
# ],
# enable_thinking=True
# )
#
# # Print the thinking process and complete response
# if hasattr(response.output.choices[0].message, 'reasoning_content') and response.output.choices[0].message.reasoning_content:
#     print("\nThinking Process:\n" + response.output.choices[0].message.reasoning_content)
# print("\nComplete Response:\n" + response.output.choices[0].message.content[0]["text"])
Java
// Requires DashScope SDK v2.19.4 or later.
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.JsonUtils;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
public class KimiK26MultiModalExample {
public static void main(String[] args) {
try {
// Single-image input example (thinking mode enabled)
MultiModalConversation conv = new MultiModalConversation();
// Build the message content
Map<String, Object> textContent = new HashMap<>();
textContent.put("text", "What scene is depicted in the image?");
Map<String, Object> imageContent = new HashMap<>();
imageContent.put("image", "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg");
MultiModalMessage userMessage = MultiModalMessage.builder()
.role(Role.USER.getValue())
.content(Arrays.asList(textContent, imageContent))
.build();
// Build the request parameters
MultiModalConversationParam param = MultiModalConversationParam.builder()
// If the environment variable is not set, replace this with your API key from Model Studio.
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("kimi-k2.6")
.messages(Arrays.asList(userMessage))
.enableThinking(true) // Enable thinking mode
.build();
// Call the model
MultiModalConversationResult result = conv.call(param);
// Print the response
String content = result.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text");
System.out.println("Response: " + content);
// If thinking mode is enabled, print the thinking process
if (result.getOutput().getChoices().get(0).getMessage().getReasoningContent() != null) {
System.out.println("\nThinking Process: " +
result.getOutput().getChoices().get(0).getMessage().getReasoningContent());
}
// Multi-image input example (uncomment to use)
// Map<String, Object> imageContent1 = new HashMap<>();
// imageContent1.put("image", "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg");
// Map<String, Object> imageContent2 = new HashMap<>();
// imageContent2.put("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png");
//
// Map<String, Object> textContent2 = new HashMap<>();
// textContent2.put("text", "What do these images depict?");
//
// MultiModalMessage multiImageMessage = MultiModalMessage.builder()
// .role(Role.USER.getValue())
// .content(Arrays.asList(textContent2, imageContent1, imageContent2))
// .build();
//
// MultiModalConversationParam multiParam = MultiModalConversationParam.builder()
// .apiKey(System.getenv("DASHSCOPE_API_KEY"))
// .model("kimi-k2.6")
// .messages(Arrays.asList(multiImageMessage))
// .enableThinking(true)
// .build();
//
// MultiModalConversationResult multiResult = conv.call(multiParam);
// System.out.println(multiResult.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text"));
} catch (ApiException | NoApiKeyException | InputRequiredException e) {
System.err.println("Call failed: " + e.getMessage());
}
}
}
curl
curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "kimi-k2.6",
"input": {
"messages": [
{
"role": "user",
"content": [
{
"text": "What scene is depicted in the image?"
},
{
"image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"
}
]
}
]
},
"parameters": {
"enable_thinking": true
}
}'
# Multi-image input example (uncomment to use)
# curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation" \
# -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
# -H "Content-Type: application/json" \
# -d '{
# "model": "kimi-k2.6",
# "input": {
# "messages": [
# {
# "role": "user",
# "content": [
# {
# "text": "What do these images depict?"
# },
# {
# "image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"
# },
# {
# "image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"
# }
# ]
# }
# ]
# },
# "parameters": {
# "enable_thinking": true
# }
# }'
Video understanding
Video file
The kimi-k2.7-code, kimi-k2.6, and kimi-k2.5 models analyze videos by extracting a sequence of frames. You can control the frame extraction strategy with the following parameters:
fps: Controls the frame extraction frequency. The interval between extracted frames is 1/fps seconds. The value must be in the range [0.1, 10]. The default value is 2.0.
For high-motion scenes: Set a higher fps value to capture more detail.
For static or long videos: Set a lower fps value to improve processing efficiency.
max_frames: Specifies the maximum number of frames to extract from a video. The default and maximum value is 2000.
If the number of frames calculated from the fps value exceeds this limit, the system automatically extracts frames uniformly to stay within the max_frames limit. This parameter is available only when you use the DashScope SDK.
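To see how fps and max_frames interact, the following sketch estimates how many frames would be extracted from a video of a given length. It is an approximation under stated assumptions: the function names are hypothetical, the frame count is taken as duration multiplied by fps and rounded down, and the service's exact rounding is not documented here.

```python
import math


def frame_interval_seconds(fps=2.0):
    """Interval between extracted frames, 1/fps seconds."""
    if not 0.1 <= fps <= 10:
        raise ValueError("fps must be in the range [0.1, 10]")
    return 1.0 / fps


def estimate_extracted_frames(duration_seconds, fps=2.0, max_frames=2000):
    """Approximate the number of frames extracted from a video (hypothetical helper)."""
    if not 0.1 <= fps <= 10:
        raise ValueError("fps must be in the range [0.1, 10]")
    if not 1 <= max_frames <= 2000:
        raise ValueError("max_frames must be between 1 and 2000")
    # Assumption: frames implied by fps = floor(duration * fps), at least one frame.
    frames_by_fps = max(1, math.floor(duration_seconds * fps))
    # When fps implies more frames than allowed, the count is capped at max_frames.
    return min(frames_by_fps, max_frames)
```

Under these assumptions, a 60-second video at the default fps of 2 gives about 120 frames, while a one-hour video at the same fps would imply 7200 frames and is therefore capped at the max_frames limit of 2000.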
OpenAI compatible
When passing a video file to the model using the OpenAI SDK or an HTTP request, set the "type" parameter in the user message to "video_url".
Python
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
model="kimi-k2.6",
messages=[
{
"role": "user",
"content": [
# When passing a video file directly, set the "type" parameter to "video_url".
{
"type": "video_url",
"video_url": {
"url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241115/cqqkru/1.mp4"
},
"fps": 2
},
{
"type": "text",
"text": "What is the content of this video?"
}
]
}
]
)
print(completion.choices[0].message.content)
Node.js
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
});
async function main() {
const response = await openai.chat.completions.create({
model: "kimi-k2.6",
messages: [
{
role: "user",
content: [
// When passing a video file directly, set the "type" parameter to "video_url".
{
type: "video_url",
video_url: {
"url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241115/cqqkru/1.mp4"
},
"fps": 2
},
{
type: "text",
text: "What is the content of this video?"
}
]
}
]
});
console.log(response.choices[0].message.content);
}
main();
curl
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"model": "kimi-k2.6",
"messages": [
{
"role": "user",
"content": [
{
"type": "video_url",
"video_url": {
"url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241115/cqqkru/1.mp4"
},
"fps":2
},
{
"type": "text",
"text": "What is the content of this video?"
}
]
}
]
}'
DashScope
Python
import dashscope
import os
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"
messages = [
{"role": "user",
"content": [
# The fps parameter sets the frame extraction frequency; the interval between frames is 1/fps seconds.
{"video": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241115/cqqkru/1.mp4","fps":2},
{"text": "What is the content of this video?"}
]
}
]
response = dashscope.MultiModalConversation.call(
# If the DASHSCOPE_API_KEY environment variable is not set, replace this line with your Model Studio API key: api_key="sk-xxx"
api_key=os.getenv('DASHSCOPE_API_KEY'),
model='kimi-k2.6',
messages=messages
)
print(response.output.choices[0].message.content[0]["text"])
Java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.JsonUtils;
import com.alibaba.dashscope.utils.Constants;
public class Main {
static {Constants.baseHttpApiUrl="https://dashscope.aliyuncs.com/api/v1";}
public static void simpleMultiModalConversationCall()
throws ApiException, NoApiKeyException, UploadFileException {
MultiModalConversation conv = new MultiModalConversation();
// The fps parameter sets the frame extraction frequency; the interval between frames is 1/fps seconds.
Map<String, Object> params = new HashMap<>();
params.put("video", "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241115/cqqkru/1.mp4");
params.put("fps", 2);
MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
.content(Arrays.asList(
params,
Collections.singletonMap("text", "What is the content of this video?"))).build();
MultiModalConversationParam param = MultiModalConversationParam.builder()
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("kimi-k2.6")
.messages(Arrays.asList(userMessage))
.build();
MultiModalConversationResult result = conv.call(param);
System.out.println(result.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text"));
}
public static void main(String[] args) {
try {
simpleMultiModalConversationCall();
} catch (ApiException | NoApiKeyException | UploadFileException e) {
System.out.println(e.getMessage());
}
System.exit(0);
}
}
curl
curl -X POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"model": "kimi-k2.6",
"input":{
"messages":[
{"role": "user","content": [{"video": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241115/cqqkru/1.mp4","fps":2},
{"text": "What is the content of this video?"}]}]}
}'
Image list
When you provide a video as an image list (pre-extracted frames), use the fps parameter to specify the rate at which the frames were extracted from the original video. A value of fps indicates that one frame was extracted every 1/fps seconds.
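As a small illustration of what fps means for an image list, the sketch below maps each pre-extracted frame to its assumed position in the original video and builds the content item in the OpenAI-compatible shape used in the examples in this section. It assumes frames are spaced 1/fps seconds apart, starting at 0 seconds. Both helper names are hypothetical and not part of any SDK.

```python
from typing import Dict, List


def frame_timestamps(frame_count: int, fps: float) -> List[float]:
    """Timestamp in seconds of each frame, assuming one frame every 1/fps seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return [index / fps for index in range(frame_count)]


def build_image_list_item(frame_urls: List[str], fps: float) -> Dict[str, object]:
    """Build the user-message content item for a video passed as an image list."""
    return {"type": "video", "video": list(frame_urls), "fps": fps}
```

For example, four frames with fps set to 2 would correspond to 0, 0.5, 1, and 1.5 seconds of the original video under this assumption.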
OpenAI compatible
When passing a video as an image list using the OpenAI SDK or an HTTP request, set the "type" parameter in the user message to "video".
Python
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
model="kimi-k2.6",
messages=[{"role": "user","content": [
# When passing an image list, set the "type" parameter in the user message to "video".
{"type": "video","video": [
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/xzsgiz/football1.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/tdescd/football2.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/zefdja/football3.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/aedbqh/football4.jpg"],
"fps":2},
{"type": "text","text": "Describe the action in this video."},
]}]
)
print(completion.choices[0].message.content)
Node.js
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
});
async function main() {
const response = await openai.chat.completions.create({
model: "kimi-k2.6",
messages: [{
role: "user",
content: [
{
// When passing an image list, set the "type" parameter in the user message to "video".
type: "video",
video: [
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/xzsgiz/football1.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/tdescd/football2.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/zefdja/football3.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/aedbqh/football4.jpg"],
"fps":2
},
{
type: "text",
text: "Describe the action in this video."
}
]
}]
});
console.log(response.choices[0].message.content);
}
main();
curl
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"model": "kimi-k2.6",
"messages": [{"role": "user","content": [{"type": "video","video": [
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/xzsgiz/football1.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/tdescd/football2.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/zefdja/football3.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/aedbqh/football4.jpg"],
"fps":2},
{"type": "text","text": "Describe the action in this video."}]}]
}'
DashScope
Python
import os
import dashscope
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"
messages = [{"role": "user",
"content": [
{"video":["https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/xzsgiz/football1.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/tdescd/football2.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/zefdja/football3.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/aedbqh/football4.jpg"],
"fps":2},
{"text": "Describe the action in this video."}]}]
response = dashscope.MultiModalConversation.call(
# If the DASHSCOPE_API_KEY environment variable is not set, replace this line with your Model Studio API key: api_key="sk-xxx"
api_key=os.getenv("DASHSCOPE_API_KEY"),
model='kimi-k2.6',
messages=messages
)
print(response.output.choices[0].message.content[0]["text"])
Java
// Requires DashScope SDK v2.21.10 or later.
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.Constants;
public class Main {
static {Constants.baseHttpApiUrl="https://dashscope.aliyuncs.com/api/v1";}
private static final String MODEL_NAME = "kimi-k2.6";
public static void videoImageListSample() throws ApiException, NoApiKeyException, UploadFileException {
MultiModalConversation conv = new MultiModalConversation();
Map<String, Object> params = new HashMap<>();
params.put("video", Arrays.asList("https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/xzsgiz/football1.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/tdescd/football2.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/zefdja/football3.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/aedbqh/football4.jpg"));
params.put("fps", 2);
MultiModalMessage userMessage = MultiModalMessage.builder()
.role(Role.USER.getValue())
.content(Arrays.asList(
params,
Collections.singletonMap("text", "Describe the action in this video.")))
.build();
MultiModalConversationParam param = MultiModalConversationParam.builder()
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model(MODEL_NAME)
.messages(Arrays.asList(userMessage)).build();
MultiModalConversationResult result = conv.call(param);
System.out.print(result.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text"));
}
public static void main(String[] args) {
try {
videoImageListSample();
} catch (ApiException | NoApiKeyException | UploadFileException e) {
System.out.println(e.getMessage());
}
System.exit(0);
}
}
curl
curl -X POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"model": "kimi-k2.6",
"input": {
"messages": [
{
"role": "user",
"content": [
{
"video": [
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/xzsgiz/football1.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/tdescd/football2.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/zefdja/football3.jpg",
"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/aedbqh/football4.jpg"
],
"fps":2
},
{
"text": "Describe the action in this video."
}
]
}
]
}
}'
Pass a local file
The following examples show how to pass a local file. The OpenAI-compatible API supports only Base64 encoding, while DashScope supports both Base64 encoding and file paths.
OpenAI compatible
To pass a local file using Base64 encoding, construct a Data URL. For instructions, see Construct a Data URL.
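A Data URL has the general form data:<MIME type>;base64,<payload>. The sketch below shows one way to build it for any local image or video by guessing the MIME type from the file extension with the Python standard library. The helper name to_data_url and the fallback MIME type are assumptions for illustration; the examples in this section hard-code the MIME type instead.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str, default_mime: str = "application/octet-stream") -> str:
    """Encode a local file as a Data URL (hypothetical helper, not an SDK call)."""
    # Guess the MIME type from the extension, for example .png -> image/png.
    mime, _ = mimetypes.guess_type(path)
    # Base64 output is plain ASCII, so it can be embedded directly in the URL.
    payload = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime or default_mime};base64,{payload}"
```

The returned string can be used wherever the examples pass a value such as f"data:image/png;base64,{base64_image}". Check that the guessed MIME type matches one of the supported formats before sending the request.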
Python
from openai import OpenAI
import os
import base64
# Encoding function: Converts a local file to a Base64-encoded string.
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")
# Replace "xxx/eagle.png" with the absolute path to your local image.
base64_image = encode_image("xxx/eagle.png")
client = OpenAI(
api_key=os.getenv('DASHSCOPE_API_KEY'),
base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
model="kimi-k2.6",
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {"url": f"data:image/png;base64,{base64_image}"},
},
{"type": "text", "text": "What scene is depicted in the image?"},
],
}
],
)
print(completion.choices[0].message.content)
# The following examples show how to pass a local video file and a local image list.
# [Local video file] Encode the local video as a Data URL and pass it to the video_url parameter:
# def encode_video_to_data_url(video_path):
#     with open(video_path, "rb") as f:
#         return "data:video/mp4;base64," + base64.b64encode(f.read()).decode("utf-8")
# video_data_url = encode_video_to_data_url("xxx/local.mp4")
# content = [{"type": "video_url", "video_url": {"url": video_data_url}, "fps": 2}, {"type": "text", "text": "What is the content of this video?"}]
# [Local image list] Encode multiple local images with Base64 and pass them as a list to the video parameter:
# image_data_urls = [f"data:image/jpeg;base64,{encode_image(p)}" for p in ["xxx/f1.jpg", "xxx/f2.jpg", "xxx/f3.jpg", "xxx/f4.jpg"]]
# content = [{"type": "video", "video": image_data_urls, "fps": 2}, {"type": "text", "text": "Describe the sequence of events in this video."}]
Node.js
import OpenAI from "openai";
import { readFileSync } from 'fs';
const openai = new OpenAI(
{
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
}
);
const encodeImage = (imagePath) => {
const imageFile = readFileSync(imagePath);
return imageFile.toString('base64');
};
// Replace "xxx/eagle.png" with the absolute path to your local image.
const base64Image = encodeImage("xxx/eagle.png")
async function main() {
const completion = await openai.chat.completions.create({
model: "kimi-k2.6",
messages: [
{"role": "user",
"content": [{"type": "image_url",
"image_url": {"url": `data:image/png;base64,${base64Image}`},},
{"type": "text", "text": "What scene is depicted in the image?"}]}]
});
console.log(completion.choices[0].message.content);
}
main();
// The following examples show how to pass a local video file and a local image list.
// [Local video file] Encode the local video as a Data URL and pass it to the video_url parameter:
// const encodeVideoToDataUrl = (videoPath) => "data:video/mp4;base64," + readFileSync(videoPath).toString("base64");
// const videoDataUrl = encodeVideoToDataUrl("xxx/local.mp4");
// content: [{ type: "video_url", video_url: { url: videoDataUrl }, fps: 2 }, { type: "text", text: "What is the content of this video?" }]
// [Local image list] Encode multiple local images with Base64 and pass them as a list to the video parameter:
// const imageDataUrls = ["xxx/f1.jpg","xxx/f2.jpg","xxx/f3.jpg","xxx/f4.jpg"].map(p => `data:image/jpeg;base64,${encodeImage(p)}`);
// content: [{ type: "video", video: imageDataUrls, fps: 2 }, { type: "text", text: "Describe the sequence of events in this video." }]
// messages: [{"role": "user", "content": content}]
// Then call openai.chat.completions.create({model: "kimi-k2.6", messages: messages})
DashScope
Base64 encoding
To pass a local file using Base64 encoding, construct a Data URL. For instructions, see Construct a Data URL.
Python
import base64
import os
import dashscope
from dashscope import MultiModalConversation
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"
# Encoding function: Converts a local file to a Base64-encoded string.
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")
# Replace "xxx/eagle.png" with the absolute path to your local image.
base64_image = encode_image("xxx/eagle.png")
messages = [
{
"role": "user",
"content": [
{"image": f"data:image/png;base64,{base64_image}"},
{"text": "What scene is depicted in the image?"},
],
},
]
response = MultiModalConversation.call(
# If the DASHSCOPE_API_KEY environment variable is not set, pass your Model Studio API key directly, for example: api_key="sk-xxx"
api_key=os.getenv("DASHSCOPE_API_KEY"),
model="kimi-k2.6",
messages=messages,
)
print(response.output.choices[0].message.content[0]["text"])
# The following examples show how to pass a local video file and a local image list.
# [Local video file]
# video_data_url = "data:video/mp4;base64," + base64.b64encode(open("xxx/local.mp4","rb").read()).decode("utf-8")
# content = [{"video": video_data_url, "fps": 2}, {"text": "What is the content of this video?"}]
# [Local image list]
# image_data_urls = [f"data:image/jpeg;base64,{encode_image(p)}" for p in ["xxx/f1.jpg","xxx/f2.jpg","xxx/f3.jpg","xxx/f4.jpg"]]
# content = [{"video": image_data_urls, "fps": 2}, {"text": "Describe the sequence of events in this video."}]
Java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Base64;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import com.alibaba.dashscope.aigc.multimodalconversation.*;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.Constants;
public class Main {
static {Constants.baseHttpApiUrl="https://dashscope.aliyuncs.com/api/v1";}
private static String encodeToBase64(String imagePath) throws IOException {
Path path = Paths.get(imagePath);
byte[] imageBytes = Files.readAllBytes(path);
return Base64.getEncoder().encodeToString(imageBytes);
}
public static void callWithLocalFile(String localPath) throws ApiException, NoApiKeyException, UploadFileException, IOException {
String base64Image = encodeToBase64(localPath);
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
.content(Arrays.asList(
new HashMap<String, Object>() {{ put("image", "data:image/png;base64," + base64Image); }},
new HashMap<String, Object>() {{ put("text", "What scene is depicted in the image?"); }}
)).build();
MultiModalConversationParam param = MultiModalConversationParam.builder()
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("kimi-k2.6")
.messages(Arrays.asList(userMessage))
.build();
MultiModalConversationResult result = conv.call(param);
System.out.println(result.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text"));
}
public static void main(String[] args) {
try {
// Replace "xxx/eagle.png" with the absolute path to your local image.
callWithLocalFile("xxx/eagle.png");
} catch (ApiException | NoApiKeyException | UploadFileException | IOException e) {
System.out.println(e.getMessage());
}
System.exit(0);
}
// The following examples show how to pass a local video file and a local image list.
// [Local video file]
// String base64Video = encodeToBase64(localPath);
// MultiModalConversation conv = new MultiModalConversation();
// MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
// .content(Arrays.asList(
// new HashMap<String, Object>() {{ put("video", "data:video/mp4;base64," + base64Video); }},
// new HashMap<String, Object>() {{ put("text", "What scene is depicted in this video?"); }}
// )).build();
// [Local image list] (also requires: import java.util.List;)
// List<String> urls = Arrays.asList(
// "data:image/jpeg;base64," + encodeToBase64("path/f1.jpg"),
// "data:image/jpeg;base64," + encodeToBase64("path/f2.jpg"),
// "data:image/jpeg;base64," + encodeToBase64("path/f3.jpg"),
// "data:image/jpeg;base64," + encodeToBase64("path/f4.jpg"));
// MultiModalConversation conv = new MultiModalConversation();
// MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
// .content(Arrays.asList(
// new HashMap<String, Object>() {{ put("video", urls); }},
// new HashMap<String, Object>() {{ put("text", "What scene is depicted in this video?"); }}
// )).build();
}
File path
You can pass a local file path directly to the model. This method is supported only by the DashScope Python and Java SDKs; it is not available for DashScope HTTP or the OpenAI-compatible API. The table below shows the required file path format for each programming language and operating system.
Python
import os
from dashscope import MultiModalConversation
import dashscope
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"
# Replace "xxx/eagle.png" with the absolute path to your local image.
local_path = "xxx/eagle.png"
image_path = f"file://{local_path}"
messages = [
{'role':'user',
'content': [{'image': image_path},
{'text': 'What scene is depicted in the image?'}]}]
response = MultiModalConversation.call(
api_key=os.getenv('DASHSCOPE_API_KEY'),
model='kimi-k2.6',
messages=messages)
print(response.output.choices[0].message.content[0]["text"])
# The following examples show how to pass a local video and a list of local images using file paths.
# [Local video file]
# video_path = "file:///path/to/local.mp4"
# content = [{"video": video_path, "fps": 2}, {"text": "What is the content of this video?"}]
# [Local image list]
# image_paths = ["file:///path/f1.jpg", "file:///path/f2.jpg", "file:///path/f3.jpg", "file:///path/f4.jpg"]
# content = [{"video": image_paths, "fps": 2}, {"text": "Describe the sequence of events in this video."}]
Java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.Constants;
public class Main {
static {Constants.baseHttpApiUrl="https://dashscope.aliyuncs.com/api/v1";}
public static void callWithLocalFile(String localPath)
throws ApiException, NoApiKeyException, UploadFileException {
String filePath = "file://"+localPath;
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
.content(Arrays.asList(new HashMap<String, Object>(){{put("image", filePath);}},
new HashMap<String, Object>(){{put("text", "What scene is depicted in the image?");}})).build();
MultiModalConversationParam param = MultiModalConversationParam.builder()
.apiKey(System.getenv("DASHSCOPE_API_KEY"))
.model("kimi-k2.6")
.messages(Arrays.asList(userMessage))
.build();
MultiModalConversationResult result = conv.call(param);
System.out.println(result.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text"));}
public static void main(String[] args) {
try {
// Replace "xxx/eagle.png" with the absolute path to your local image.
callWithLocalFile("xxx/eagle.png");
} catch (ApiException | NoApiKeyException | UploadFileException e) {
System.out.println(e.getMessage());
}
System.exit(0);
}
// The following examples show how to pass a local video and a list of local images using file paths.
// [Local video file]
// String filePath = "file://"+localPath;
// MultiModalConversation conv = new MultiModalConversation();
// MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
// .content(Arrays.asList(new HashMap<String, Object>(){{put("video", filePath);}},
// new HashMap<String, Object>(){{put("text", "What scene is depicted in the video?");}})).build();
// [Local image list]
// MultiModalConversation conv = new MultiModalConversation();
// List<String> filePath = Arrays.asList("file:///path/f1.jpg", "file:///path/f2.jpg", "file:///path/f3.jpg", "file:///path/f4.jpg");
// MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
// .content(Arrays.asList(new HashMap<String, Object>(){{put("video", filePath);}},
// new HashMap<String, Object>(){{put("text", "What scene is depicted in the video?");}})).build();
}
File limitations
Image limitations
Image resolution:
Minimum size: Width and height must each exceed 10 pixels.
Aspect ratio: The ratio of the longest side to the shortest side must not exceed 200:1.
Maximum resolution: The recommended maximum is 8K (7680x4320). Higher resolutions may cause API call timeouts due to large file sizes or slow network transfers.
Supported image formats
The following formats are supported for resolutions below 4K (3840x2160):
Image format | File extension | MIME type |
BMP | .bmp | image/bmp |
JPEG | .jpe, .jpeg, .jpg | image/jpeg |
PNG | .png | image/png |
TIFF | .tif, .tiff | image/tiff |
WEBP | .webp | image/webp |
HEIC | .heic | image/heic |
For resolutions between 4K (3840x2160) and 8K (7680x4320), only JPEG, JPG, and PNG are supported.
Image size:
When providing an image via a public URL or local path, its size must not exceed 10 MB.
When using Base64 encoding, the encoded string must not exceed 10 MB.
To compress a file, see How to compress an image or video to meet the size limit.
Number of supported images: When providing multiple images, the total number of tokens for all images and text must not exceed the model's maximum input limit.
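The image limits above can be checked on the client before a request is sent. The following minimal sketch covers the minimum size, aspect ratio, and size rules. It assumes that "10 MB" means 10 * 1024 * 1024 bytes and ignores the short data: prefix of a Data URL; the function names are hypothetical. Note that Base64 inflates data by about one third, so a raw file of roughly 7.5 MB already reaches the limit for encoded strings.

```python
from typing import List

MB = 1024 * 1024             # assumption: binary megabytes
MAX_IMAGE_BYTES = 10 * MB    # documented limit for files and for Base64 strings


def base64_length(raw_bytes: int) -> int:
    """Length of the padded Base64 string produced from raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def check_image_limits(width: int, height: int, raw_bytes: int,
                       as_base64: bool = False) -> List[str]:
    """Return a list of violated limits; an empty list means the image passes."""
    problems = []
    if width <= 10 or height <= 10:
        problems.append("width and height must each exceed 10 pixels")
    elif max(width, height) / min(width, height) > 200:
        problems.append("aspect ratio must not exceed 200:1")
    # For Base64 input, the limit applies to the encoded string, not the raw file.
    size = base64_length(raw_bytes) if as_base64 else raw_bytes
    if size > MAX_IMAGE_BYTES:
        problems.append("size must not exceed 10 MB")
    return problems
```

The width and height are passed in explicitly so that the sketch stays free of imaging dependencies; in practice they would come from whatever image library the application already uses.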
Video limitations
As an image list: 4 to 2,000 images.
As a video file:
Video size:
Via public URL: Up to 2 GB.
Via Base64 encoding: The encoded string must be less than 10 MB.
Via local file path: Up to 100 MB.
Video duration: 2 seconds to 1 hour.
Video format: Supported formats include MP4, AVI, MKV, MOV, FLV, and WMV.
Video resolution: While there is no strict resolution limit, use 2K or lower for best results. Higher resolutions increase processing time without improving model understanding.
Audio understanding: The model does not process the audio track in video files.
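The video limits can be pre-checked in the same way. The sketch below encodes the size limit per input method, the duration range, the supported formats, and the image-list bounds listed above. The helper names and the source labels "url", "base64", and "path" are hypothetical, and the byte values assume binary units (1 MB = 1024 * 1024 bytes).

```python
import os
from typing import List

MB = 1024 ** 2   # assumption: binary units
GB = 1024 ** 3

# Size limit per input method. For "base64", the size is the encoded string length.
VIDEO_SIZE_LIMITS = {"url": 2 * GB, "base64": 10 * MB, "path": 100 * MB}
VIDEO_FORMATS = {".mp4", ".avi", ".mkv", ".mov", ".flv", ".wmv"}


def check_video_file(name: str, size_bytes: int, duration_s: float,
                     source: str = "url") -> List[str]:
    """Return a list of violated limits for a video file; empty means it passes."""
    problems = []
    if os.path.splitext(name)[1].lower() not in VIDEO_FORMATS:
        problems.append("unsupported video format")
    limit = VIDEO_SIZE_LIMITS[source]
    # Base64 strings must be strictly less than the limit; other sources may equal it.
    too_large = size_bytes >= limit if source == "base64" else size_bytes > limit
    if too_large:
        problems.append(f"video too large for source '{source}'")
    if not 2 <= duration_s <= 3600:
        problems.append("duration must be between 2 seconds and 1 hour")
    return problems


def check_image_list(frame_count: int) -> bool:
    """An image list must contain 4 to 2,000 images."""
    return 4 <= frame_count <= 2000
```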
Other features
Model | |||||||
kimi-k2.7-code | |||||||
kimi-k2.6 | |||||||
kimi-k2.5 | |||||||
kimi-k2-thinking | |||||||
Moonshot-Kimi-K2-Instruct |
Default parameters
Model | enable_thinking | temperature | top_p | presence_penalty | fps | max_frames |
kimi-k2.7-code | true (thinking mode only) | 1.0 | 0.95 | 0.0 | 2 | 2000 |
kimi-k2.6 | false | thinking mode: 1.0; non-thinking mode: 0.6 | Both modes: 0.95 | Both modes: 0.0 | 2 | 2000 |
kimi-k2.5 | false | thinking mode: 1.0; non-thinking mode: 0.6 | Both modes: 0.95 | Both modes: 0.0 | 2 | 2000 |
kimi-k2-thinking | - | 1.0 | - | - | - | - |
Moonshot-Kimi-K2-Instruct | - | 0.6 | 1.0 | 0 | - | - |
A hyphen (-) indicates that the parameter is not applicable.
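Because the default temperature depends on both the model and the thinking mode, it can help to look it up programmatically, for example when logging the effective sampling settings of a request. The sketch below transcribes the temperature and enable_thinking columns of the table into a lookup. The function name default_temperature is hypothetical, and treating kimi-k2-thinking as thinking-only and Moonshot-Kimi-K2-Instruct as non-thinking-only follows the model descriptions on this page.

```python
from typing import Optional

# Default temperature per model, keyed by whether thinking mode is active.
DEFAULT_TEMPERATURE = {
    "kimi-k2.7-code": {True: 1.0},                  # thinking mode only
    "kimi-k2.6": {True: 1.0, False: 0.6},
    "kimi-k2.5": {True: 1.0, False: 0.6},
    "kimi-k2-thinking": {True: 1.0},                # thinking mode only
    "Moonshot-Kimi-K2-Instruct": {False: 0.6},      # no thinking mode
}

# Default value of enable_thinking where the table lists one.
DEFAULT_THINKING = {"kimi-k2.7-code": True, "kimi-k2.6": False, "kimi-k2.5": False}


def default_temperature(model: str, enable_thinking: Optional[bool] = None) -> float:
    """Look up the documented default temperature (hypothetical helper)."""
    modes = DEFAULT_TEMPERATURE[model]
    if enable_thinking is None:
        # Fall back to the model's default mode, or its only supported mode.
        enable_thinking = DEFAULT_THINKING.get(model, next(iter(modes)))
    if enable_thinking not in modes:
        raise ValueError(f"{model} does not support enable_thinking={enable_thinking}")
    return modes[enable_thinking]
```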
Models and billing
The Kimi series are large language models from Moonshot AI.
kimi-k2.7-code: The most capable Kimi model for coding. It follows long-context instructions more reliably and achieves higher success rates on programming tasks. Supports text, image, and video input, thinking mode, conversation, and agent tasks.
kimi-k2.6: The newest and most capable model in the Kimi series. It offers improved performance in long-horizon coding, instruction following, and self-correction. Supports text, image, and video input, thinking and non-thinking modes, conversation, and agent tasks.
kimi-k2.5: It achieves state-of-the-art (SOTA) performance on open-source benchmarks for agent tasks, code generation, visual understanding, and other general intelligence tasks. Supports image, video, and text input, thinking and non-thinking modes, conversation, and agent tasks.
kimi-k2-thinking: Supports deep thinking mode only. It exposes the reasoning process through the reasoning_content field. It excels at coding and tool calling, and is suitable for use cases that require logical analysis, planning, or deep understanding.
Moonshot-Kimi-K2-Instruct: Does not support deep thinking. It generates responses with lower latency, and is suitable for use cases that need fast, direct answers.
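The sketch below shows one defensive way to separate the reasoning from the final answer. It assumes that, in an OpenAI-compatible response, reasoning_content is exposed as an attribute of the message object next to content; that placement is an assumption here, so the helper falls back to an empty string when the attribute is missing. The name split_reasoning is hypothetical.

```python
from typing import Any, Tuple


def split_reasoning(message: Any) -> Tuple[str, str]:
    """Return (reasoning, answer) from a chat message object."""
    # getattr keeps this safe for models or modes that return no reasoning.
    reasoning = getattr(message, "reasoning_content", None) or ""
    answer = getattr(message, "content", None) or ""
    return reasoning, answer
```

With the OpenAI SDK, the argument would typically be completion.choices[0].message from a non-streaming call.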
For kimi-k2.7-code pricing, see model invocation billing.
For pricing and context window details, see the Model Studio console.
Billing is based on input and output token counts.
In thinking mode, the chain of thought counts as output tokens.
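The billing rule can be sketched as a small calculation in which reasoning tokens are added to the answer tokens before the output price is applied. The per-million-token pricing unit and all prices passed in are placeholders, not actual Model Studio prices, and the function name estimate_cost is hypothetical.

```python
def estimate_cost(input_tokens: int, answer_tokens: int, reasoning_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Estimate the cost of one call from token counts and placeholder prices."""
    # In thinking mode, the chain of thought is billed as output.
    output_tokens = answer_tokens + reasoning_tokens
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000
```

With this rule, a long chain of thought can dominate the cost of a call even when the visible answer is short.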
Error codes
If a model call fails and returns an error message, see Error codes.