
Alibaba Cloud Model Studio: Kimi

Last Updated: Feb 02, 2026

This topic describes how to call Kimi series models on the Alibaba Cloud Model Studio platform using the OpenAI compatible API or the DashScope software development kit (SDK).

Important

This document applies only to the China (Beijing) region. To use the model, you must use an API key from the China (Beijing) region.

Model introduction

The Kimi series models are large language models (LLMs) developed by Moonshot AI.

  • Kimi-K2.5: The most versatile model in the Kimi series to date. It achieves state-of-the-art (SOTA) performance among open source models on tasks such as agents, code generation, and visual understanding. It supports image and text inputs, thinking and non-thinking modes, and both conversational and agent tasks.

    Note

    The Kimi-K2.5 model provided on Model Studio currently supports image understanding but not video input. The video understanding feature will be available soon.

  • kimi-k2-thinking: Supports only deep thinking mode. It shows the thinking process in the reasoning_content field. This model excels at coding and tool calling. It is suitable for scenarios that require logical analysis, planning, or deep understanding.

  • Moonshot-Kimi-K2-Instruct: Does not support deep thinking. It generates responses directly for faster performance. This model is suitable for scenarios that require quick and direct answers.

| Model name | Mode | Context length (tokens) | Max input (tokens) | Max chain-of-thought length (tokens) | Max response length (tokens) | Input cost (per million tokens) | Output cost (per million tokens) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Kimi-K2.5 | Thinking mode | 262,144 | 258,048 | 32,768 | 32,768 | $0.574 | $3.011 |
| Kimi-K2.5 | Non-thinking mode | 262,144 | 260,096 | - | 32,768 | $0.574 | $3.011 |
| kimi-k2-thinking | Thinking mode | 262,144 | 229,376 | 32,768 | 16,384 | $0.574 | $2.294 |
| Moonshot-Kimi-K2-Instruct | Non-thinking mode | 131,072 | 131,072 | - | 8,192 | $0.574 | $2.294 |
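The limits in the table can be encoded as a simple lookup for pre-flight checks before sending a request. This is only an illustrative sketch (the `LIMITS` dict and `fits_input` helper are assumptions, not part of any SDK); real token counts must come from the model's tokenizer or the `usage` field of an actual response:

```python
# Published limits per model and mode, mirroring the table above.
# Keys are illustrative labels, not API model names.
LIMITS = {
    "kimi-k2.5-thinking":        {"context": 262_144, "max_input": 258_048, "max_output": 32_768},
    "kimi-k2.5-non-thinking":    {"context": 262_144, "max_input": 260_096, "max_output": 32_768},
    "kimi-k2-thinking":          {"context": 262_144, "max_input": 229_376, "max_output": 16_384},
    "Moonshot-Kimi-K2-Instruct": {"context": 131_072, "max_input": 131_072, "max_output": 8_192},
}

def fits_input(model_key: str, estimated_input_tokens: int) -> bool:
    """Return True if an estimated input token count stays within the model's max input."""
    return estimated_input_tokens <= LIMITS[model_key]["max_input"]
```

For example, a 200,000-token prompt fits within kimi-k2-thinking's input limit but exceeds Moonshot-Kimi-K2-Instruct's.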

Text generation call examples

Before you use the API, you must obtain an API key and an API host, and configure the API key as an environment variable. If you make calls using an SDK, you must also install the SDK.
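Before running the samples below, confirm that the environment variable is visible to your process. A minimal check (the `get_api_key` helper name is illustrative, not part of any SDK):

```python
import os

def get_api_key(env_var: str = "DASHSCOPE_API_KEY") -> str:
    """Return the Model Studio API key from the environment, or fail loudly.

    Note: only API keys from the China (Beijing) region work with these models.
    """
    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Obtain an API key in the Model Studio "
            f'console and run: export {env_var}="sk-xxx"'
        )
    return key
```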

OpenAI compatible

Python

Sample code

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace this with your Alibaba Cloud Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="kimi-k2-thinking",
    messages=[{"role": "user", "content": "Who are you?"}],
    stream=True,
)

reasoning_content = ""  # Complete thinking process
answer_content = ""     # Complete response
is_answering = False    # Indicates whether the model has started generating the response

print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")

for chunk in completion:
    if chunk.choices:
        delta = chunk.choices[0].delta
        # Collect only the thinking content
        if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
            if not is_answering:
                print(delta.reasoning_content, end="", flush=True)
            reasoning_content += delta.reasoning_content
        # When content is received, start generating the response
        if hasattr(delta, "content") and delta.content:
            if not is_answering:
                print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
                is_answering = True
            print(delta.content, end="", flush=True)
            answer_content += delta.content

Response

====================Thinking Process====================

The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an AI assistant named Kimi, developed by Moonshot AI. I should introduce myself clearly and concisely, including the following:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.

I should maintain a friendly and professional tone, avoiding overly technical jargon so that regular users can understand. I should also emphasize that I am an AI without personal consciousness, emotions, or experiences.

Response structure:
- Directly state my identity
- Mention my developer
- Briefly introduce my core capabilities
- Keep it concise and clear
====================Complete Response====================

I am an AI assistant named Kimi, developed by Moonshot AI. I am based on a Mixture-of-Experts (MoE) architecture and have capabilities such as long-context understanding, intelligent conversation, file processing, code generation, and complex task reasoning. How can I help you?

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    // If the environment variable is not set, replace this with your Alibaba Cloud Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = ''; // Complete thinking process
let answerContent = ''; // Complete response
let isAnswering = false; // Indicates whether the model has started generating the response

async function main() {
    const messages = [{ role: 'user', content: 'Who are you?' }];

    const stream = await openai.chat.completions.create({
        model: 'kimi-k2-thinking',
        messages,
        stream: true,
    });

    console.log('\n' + '='.repeat(20) + 'Thinking Process' + '='.repeat(20) + '\n');

    for await (const chunk of stream) {
        if (chunk.choices?.length) {
            const delta = chunk.choices[0].delta;
            // Collect only the thinking content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }

            // When content is received, start generating the response
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Complete Response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    }
}

main();

Response

====================Thinking Process====================

The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an AI assistant named Kimi, developed by Moonshot AI. I should introduce myself clearly and concisely, including the following:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.

I should maintain a friendly and professional tone, avoiding overly technical jargon so that regular users can easily understand. I should also emphasize that I am an AI without personal consciousness, emotions, or experiences to avoid misunderstandings.

Response structure:
- Directly state my identity
- Mention my developer
- Briefly introduce my core capabilities
- Keep it concise and clear
====================Complete Response====================

I am an AI assistant named Kimi, developed by Moonshot AI.

I am good at:
- Long-text understanding and generation
- Intelligent conversation and Q&A
- File processing and analysis
- Information retrieval and integration

As an AI assistant, I do not have personal consciousness, emotions, or experiences, but I will do my best to provide you with accurate and helpful assistance. How can I help you?

HTTP

Sample code

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "kimi-k2-thinking",
    "messages": [
        {
            "role": "user", 
            "content": "Who are you?"
        }
    ]
}'

Response

{
    "choices": [
        {
            "message": {
                "content": "I am an AI assistant named Kimi, developed by Moonshot AI. I am skilled at long-text processing, intelligent conversation, file analysis, programming assistance, and complex task reasoning. I can help you answer questions, create content, and analyze documents. How can I help you?",
                "reasoning_content": "The user asks \"Who are you?\", which is a direct question about my identity. I need to answer truthfully based on my actual identity.\n\nI am an AI assistant named Kimi, developed by Moonshot AI. I should introduce myself clearly and concisely, including the following:\n1. My identity: AI assistant\n2. My developer: Moonshot AI\n3. My name: Kimi\n4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.\n\nI should maintain a friendly and professional tone while providing useful information. No need to overcomplicate it; a direct answer is sufficient.",
                "role": "assistant"
            },
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null
        }
    ],
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 183,
        "total_tokens": 191
    },
    "created": 1762753998,
    "system_fingerprint": null,
    "model": "kimi-k2-thinking",
    "id": "chatcmpl-485ab490-90ec-48c3-85fa-1c732b683db2"
}

DashScope

Python

Sample code

import os
from dashscope import Generation

# Initialize request parameters
messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If the environment variable is not set, replace this with your Alibaba Cloud Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="kimi-k2-thinking",
    messages=messages,
    result_format="message",  # Set the result format to message
    stream=True,              # Enable streaming output
    incremental_output=True,  # Enable incremental output
)

reasoning_content = ""  # Complete thinking process
answer_content = ""     # Complete response
is_answering = False    # Indicates whether the model has started generating the response

print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")

for chunk in completion:
    message = chunk.output.choices[0].message
    
    # Collect only the thinking content
    if message.reasoning_content:
        if not is_answering:
            print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content

    # When content is received, start generating the response
    if message.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content

# After the loop, the reasoning_content and answer_content variables contain the complete content.
# You can perform further processing here as needed.
# print(f"\n\nComplete thinking process:\n{reasoning_content}")
# print(f"\nComplete response:\n{answer_content}")

Response

====================Thinking Process====================

The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an AI assistant named Kimi, developed by Moonshot AI. I should state this clearly and concisely.

Key information to include the following:
1. My name: Kimi
2. My developer: Moonshot AI
3. My nature: Artificial intelligence assistant
4. What I can do: Answer questions, assist with creation, etc.

I should maintain a friendly and helpful tone while accurately stating my identity. I should not pretend to be human or have a personal identity.

A suitable response could be:
"I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I can help you with various tasks such as answering questions, creating content, and analyzing documents. How can I help you?"

This response is direct, accurate, and invites further interaction.
====================Complete Response====================

I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I can help you with various tasks such as answering questions, creating content, and analyzing documents. How can I help you?

Java

Sample code

// DashScope SDK version >= 2.19.4
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Main {
    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();

        if (reasoning != null && !reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking Process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }

        if (content != null && !content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Complete Response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If the environment variable is not set, replace the following line with your Alibaba Cloud Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("kimi-k2-thinking")
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
            // Print the final result
            // if (reasoningContent.length() > 0) {
            //     System.out.println("\n====================Complete Response====================");
            //     System.out.println(finalContent.toString());
            // }
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}

Response

====================Thinking Process====================
The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an AI assistant named Kimi, developed by Moonshot AI. I should state this clearly and concisely.

The response should include the following:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, etc.

I should not pretend to be human or provide excessive technical details. A clear and friendly answer is sufficient.
====================Complete Response====================
I am an AI assistant named Kimi, developed by Moonshot AI. I am skilled at long-text processing, intelligent conversation, answering questions, assisting with creation, and helping you analyze and process files. How can I help you?

HTTP

Sample code

curl

curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "kimi-k2-thinking",
    "input":{
        "messages":[      
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters": {
        "result_format": "message"
    }
}'

Response

{
    "output": {
        "choices": [
            {
                "finish_reason": "stop",
                "message": {
                    "content": "I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I can help you answer questions, create content, analyze documents, and write code. How can I help you?",
                    "reasoning_content": "The user asks \"Who are you?\", which is a direct question about my identity. I need to answer truthfully based on my actual identity.\n\nI am an AI assistant named Kimi, developed by Moonshot AI. I should state this clearly and concisely.\n\nKey information to include the following:\n1. My name: Kimi\n2. My developer: Moonshot AI\n3. My nature: Artificial intelligence assistant\n4. What I can do: Answer questions, assist with creation, etc.\n\nI should respond in a friendly and direct manner that is easy for the user to understand.",
                    "role": "assistant"
                }
            }
        ]
    },
    "usage": {
        "input_tokens": 9,
        "output_tokens": 156,
        "total_tokens": 165
    },
    "request_id": "709a0697-ed1f-4298-82c9-a4b878da1849"
}

Kimi-K2.5 multimodal call examples

Kimi-K2.5 can process text and image inputs simultaneously. You can enable thinking mode using the enable_thinking parameter. The following examples demonstrate how to use the multimodal capabilities.

Enable or disable thinking mode

Kimi-K2.5 is a hybrid thinking model that can either respond after a thinking process or respond directly. You can use the enable_thinking parameter to control this behavior:

  • true: Enables thinking mode.

  • false (default): Disables thinking mode.

The following examples show how to use an image URL and enable thinking mode. The main example uses a single image, while the commented-out code demonstrates multi-image input.

OpenAI compatible

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

# Example of passing a single image (thinking mode enabled)
completion = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What scene is depicted in the image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"
                    }
                }
            ]
        }
    ],
    extra_body={"enable_thinking":True}  # Enable thinking mode
)

# Print the thinking process
if hasattr(completion.choices[0].message, 'reasoning_content') and completion.choices[0].message.reasoning_content:
    print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")
    print(completion.choices[0].message.reasoning_content)

# Print the response content
print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
print(completion.choices[0].message.content)

# Example of passing multiple images (thinking mode enabled, uncomment to use)
# completion = client.chat.completions.create(
#     model="kimi-k2.5",
#     messages=[
#         {
#             "role": "user",
#             "content": [
#                 {"type": "text", "text": "What do these images depict?"},
#                 {
#                     "type": "image_url",
#                     "image_url": {"url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"}
#                 },
#                 {
#                     "type": "image_url",
#                     "image_url": {"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"}
#                 }
#             ]
#         }
#     ],
#     extra_body={"enable_thinking":True}
# )
#
# # Print the thinking process and response
# if hasattr(completion.choices[0].message, 'reasoning_content') and completion.choices[0].message.reasoning_content:
#     print("\nThinking Process:\n" + completion.choices[0].message.reasoning_content)
# print("\nComplete Response:\n" + completion.choices[0].message.content)

Node.js

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});

// Example of passing a single image (thinking mode enabled)
const completion = await openai.chat.completions.create({
    model: 'kimi-k2.5',
    messages: [
        {
            role: 'user',
            content: [
                { type: 'text', text: 'What scene is depicted in the image?' },
                {
                    type: 'image_url',
                    image_url: {
                        url: 'https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg'
                    }
                }
            ]
        }
    ],
    enable_thinking: true  // Enable thinking mode
});

// Print the thinking process
if (completion.choices[0].message.reasoning_content) {
    console.log('\n' + '='.repeat(20) + 'Thinking Process' + '='.repeat(20) + '\n');
    console.log(completion.choices[0].message.reasoning_content);
}

// Print the response content
console.log('\n' + '='.repeat(20) + 'Complete Response' + '='.repeat(20) + '\n');
console.log(completion.choices[0].message.content);

// Example of passing multiple images (thinking mode enabled, uncomment to use)
// const multiCompletion = await openai.chat.completions.create({
//     model: 'kimi-k2.5',
//     messages: [
//         {
//             role: 'user',
//             content: [
//                 { type: 'text', text: 'What do these images depict?' },
//                 {
//                     type: 'image_url',
//                     image_url: { url: 'https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg' }
//                 },
//                 {
//                     type: 'image_url',
//                     image_url: { url: 'https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png' }
//                 }
//             ]
//         }
//     ],
//     enable_thinking: true
// });
//
// // Print the thinking process and response
// if (multiCompletion.choices[0].message.reasoning_content) {
//     console.log('\nThinking Process:\n' + multiCompletion.choices[0].message.reasoning_content);
// }
// console.log('\nComplete Response:\n' + multiCompletion.choices[0].message.content);

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "kimi-k2.5",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What scene is depicted in the image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"
                    }
                }
            ]
        }
    ],
    "enable_thinking": true
}'

# Multi-image input example (uncomment to use)
# curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
# -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
# -H "Content-Type: application/json" \
# -d '{
#     "model": "kimi-k2.5",
#     "messages": [
#         {
#             "role": "user",
#             "content": [
#                 {
#                     "type": "text",
#                     "text": "What do these images depict?"
#                 },
#                 {
#                     "type": "image_url",
#                     "image_url": {
#                         "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"
#                     }
#                 },
#                 {
#                     "type": "image_url",
#                     "image_url": {
#                         "url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"
#                     }
#                 }
#             ]
#         }
#     ],
#     "enable_thinking": true,
#     "stream": false
# }'

DashScope

Python

import os
from dashscope import MultiModalConversation

# Example of passing a single image (thinking mode enabled)
response = MultiModalConversation.call(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="kimi-k2.5",
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "What scene is depicted in the image?"},
                {"image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"}
            ]
        }
    ],
    enable_thinking=True  # Enable thinking mode
)

# Print the thinking process
if hasattr(response.output.choices[0].message, 'reasoning_content') and response.output.choices[0].message.reasoning_content:
    print("\n" + "=" * 20 + "Thinking Process" + "=" * 20 + "\n")
    print(response.output.choices[0].message.reasoning_content)

# Print the response content
print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
print(response.output.choices[0].message.content[0]["text"])

# Example of passing multiple images (thinking mode enabled, uncomment to use)
# response = MultiModalConversation.call(
#     api_key=os.getenv("DASHSCOPE_API_KEY"),
#     model="kimi-k2.5",
#     messages=[
#         {
#             "role": "user",
#             "content": [
#                 {"text": "What do these images depict?"},
#                 {"image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"},
#                 {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"}
#             ]
#         }
#     ],
#     enable_thinking=True
# )
#
# # Print the thinking process and response
# if hasattr(response.output.choices[0].message, 'reasoning_content') and response.output.choices[0].message.reasoning_content:
#     print("\nThinking Process:\n" + response.output.choices[0].message.reasoning_content)
# print("\nComplete Response:\n" + response.output.choices[0].message.content[0]["text"])

Java

// DashScope SDK version >= 2.19.4
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.JsonUtils;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class KimiK25MultiModalExample {
    public static void main(String[] args) {
        try {
            // Single-image input example (thinking mode enabled)
            MultiModalConversation conv = new MultiModalConversation();

            // Build the message content
            Map<String, Object> textContent = new HashMap<>();
            textContent.put("text", "What scene is depicted in the image?");

            Map<String, Object> imageContent = new HashMap<>();
            imageContent.put("image", "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg");

            MultiModalMessage userMessage = MultiModalMessage.builder()
                    .role(Role.USER.getValue())
                    .content(Arrays.asList(textContent, imageContent))
                    .build();

            // Build the request parameters
            MultiModalConversationParam param = MultiModalConversationParam.builder()
                    // If the environment variable is not set, replace this with your Alibaba Cloud Model Studio API key
                    .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                    .model("kimi-k2.5")
                    .messages(Arrays.asList(userMessage))
                    .enableThinking(true)  // Enable thinking mode
                    .build();

            // Call the model
            MultiModalConversationResult result = conv.call(param);

            // Print the result
            String content = result.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text");
            System.out.println("Response content: " + content);

            // If thinking mode is enabled, print the thinking process
            if (result.getOutput().getChoices().get(0).getMessage().getReasoningContent() != null) {
                System.out.println("\nThinking process: " +
                    result.getOutput().getChoices().get(0).getMessage().getReasoningContent());
            }

            // Multi-image input example (uncomment to use)
            // Map<String, Object> imageContent1 = new HashMap<>();
            // imageContent1.put("image", "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg");
            // Map<String, Object> imageContent2 = new HashMap<>();
            // imageContent2.put("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png");
            //
            // Map<String, Object> textContent2 = new HashMap<>();
            // textContent2.put("text", "What do these images depict?");
            //
            // MultiModalMessage multiImageMessage = MultiModalMessage.builder()
            //         .role(Role.USER.getValue())
            //         .content(Arrays.asList(textContent2, imageContent1, imageContent2))
            //         .build();
            //
            // MultiModalConversationParam multiParam = MultiModalConversationParam.builder()
            //         .apiKey(System.getenv("DASHSCOPE_API_KEY"))
            //         .model("kimi-k2.5")
            //         .messages(Arrays.asList(multiImageMessage))
            //         .enableThinking(true)
            //         .build();
            //
            // MultiModalConversationResult multiResult = conv.call(multiParam);
            // System.out.println(multiResult.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text"));

        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.err.println("Call failed: " + e.getMessage());
        }
    }
}

curl

curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "kimi-k2.5",
    "input": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "text": "What scene is depicted in the image?"
                    },
                    {
                        "image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"
                    }
                ]
            }
        ]
    },
    "parameters": {
        "enable_thinking": true
    }
}'

# Multi-image input example (uncomment to use)
# curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation" \
# -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
# -H "Content-Type: application/json" \
# -d '{
#     "model": "kimi-k2.5",
#     "input": {
#         "messages": [
#             {
#                 "role": "user",
#                 "content": [
#                     {
#                         "text": "What do these images depict?"
#                     },
#                     {
#                         "image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"
#                     },
#                     {
#                         "image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"
#                     }
#                 ]
#             }
#         ]
#     },
#     "parameters": {
#         "enable_thinking": true
#     }
# }'

Pass a local file

The following examples show how to pass a local image file. The OpenAI compatible API supports only Base64 encoding. The DashScope SDK supports both Base64 encoding and file paths.

OpenAI compatible

To pass Base64-encoded data, you must construct a Data URL. For more information, see Construct a Data URL.

Python

from openai import OpenAI
import os
import base64


# Encoding function: Converts a local file to a Base64 encoded string
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

# Replace xxx/eagle.png with the absolute path of your local image
base64_image = encode_image("xxx/eagle.png")

client = OpenAI(
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{base64_image}"}, 
                },
                {"type": "text", "text": "What scene is depicted in the image?"},
            ],
        }
    ],
)
print(completion.choices[0].message.content)

Node.js

import OpenAI from "openai";
import { readFileSync } from 'fs';

const openai = new OpenAI(
    {
        apiKey: process.env.DASHSCOPE_API_KEY,
        baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
    }
);

const encodeImage = (imagePath) => {
    const imageFile = readFileSync(imagePath);
    return imageFile.toString('base64');
};

// Replace xxx/eagle.png with the absolute path of your local image
const base64Image = encodeImage("xxx/eagle.png");
async function main() {
    const completion = await openai.chat.completions.create({
        model: "kimi-k2.5", 
        messages: [
            {"role": "user",
             "content": [{"type": "image_url",
                        "image_url": {"url": `data:image/png;base64,${base64Image}`},},
                        {"type": "text", "text": "What scene is depicted in the image?"}]}]
    });
    console.log(completion.choices[0].message.content);
}

main();

DashScope

Base64 encoding method

To use Base64 encoding, you must construct a Data URL. For more information, see Construct a Data URL.

Python

import base64
import os
import dashscope 
from dashscope import MultiModalConversation

dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"

# Encoding function: Converts a local file to a Base64 encoded string
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


# Replace xxx/eagle.png with the absolute path of your local image
base64_image = encode_image("xxx/eagle.png")

messages = [
    {
        "role": "user",
        "content": [
            {"image": f"data:image/png;base64,{base64_image}"},
            {"text": "What scene is depicted in the image?"},
        ],
    },
]
response = MultiModalConversation.call(
    # If the environment variable is not set, replace the following line with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="kimi-k2.5",  
    messages=messages,
)
print(response.output.choices[0].message.content[0]["text"])

Java

import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Base64;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import com.alibaba.dashscope.aigc.multimodalconversation.*;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.Constants;

public class Main {

   static {Constants.baseHttpApiUrl="https://dashscope.aliyuncs.com/api/v1";}

    private static String encodeImageToBase64(String imagePath) throws IOException {
        Path path = Paths.get(imagePath);
        byte[] imageBytes = Files.readAllBytes(path);
        return Base64.getEncoder().encodeToString(imageBytes);
    }

    public static void callWithLocalFile(String localPath) throws ApiException, NoApiKeyException, UploadFileException, IOException {

        String base64Image = encodeImageToBase64(localPath); // Base64 encoding

        MultiModalConversation conv = new MultiModalConversation();
        MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
                .content(Arrays.asList(
                        new HashMap<String, Object>() {{ put("image", "data:image/png;base64," + base64Image); }},
                        new HashMap<String, Object>() {{ put("text", "What scene is depicted in the image?"); }}
                )).build();

        MultiModalConversationParam param = MultiModalConversationParam.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("kimi-k2.5")
                .messages(Arrays.asList(userMessage))
                .build();

        MultiModalConversationResult result = conv.call(param);
        System.out.println(result.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text"));
    }

    public static void main(String[] args) {
        try {
            // Replace xxx/eagle.png with the absolute path of your local image
            callWithLocalFile("xxx/eagle.png");
        } catch (ApiException | NoApiKeyException | UploadFileException | IOException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

Local file path method

You can pass the local file path directly to the model. This method is supported only by the DashScope Python and Java SDKs. It is not supported by DashScope HTTP or the OpenAI compatible method. Refer to the following table to specify the file path based on your programming language and operating system.

Specify a file path (image example)

| System | SDK | File path to pass | Example |
| --- | --- | --- | --- |
| Linux or macOS | Python SDK | file://{absolute_path_of_the_file} | file:///home/images/test.png |
| Linux or macOS | Java SDK | file://{absolute_path_of_the_file} | file:///home/images/test.png |
| Windows | Python SDK | file://{absolute_path_of_the_file} | file://D:/images/test.png |
| Windows | Java SDK | file:///{absolute_path_of_the_file} | file:///D:/images/test.png |

Python

import os
from dashscope import MultiModalConversation
import dashscope 

dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"

# Replace xxx/eagle.png with the absolute path of your local image
local_path = "xxx/eagle.png"
image_path = f"file://{local_path}"
messages = [
                {'role':'user',
                'content': [{'image': image_path},
                            {'text': 'What scene is depicted in the image?'}]}]
response = MultiModalConversation.call(
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model='kimi-k2.5',  
    messages=messages)
print(response.output.choices[0].message.content[0]["text"])

Java

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.Constants;

public class Main {

    static {Constants.baseHttpApiUrl="https://dashscope.aliyuncs.com/api/v1";}
    
    public static void callWithLocalFile(String localPath)
            throws ApiException, NoApiKeyException, UploadFileException {
        String filePath = "file://"+localPath;
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
                .content(Arrays.asList(new HashMap<String, Object>(){{put("image", filePath);}},
                        new HashMap<String, Object>(){{put("text", "What scene is depicted in the image?");}})).build();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("kimi-k2.5")  
                .messages(Arrays.asList(userMessage))
                .build();
        MultiModalConversationResult result = conv.call(param);
        System.out.println(result.getOutput().getChoices().get(0).getMessage().getContent().get(0).get("text"));
    }

    public static void main(String[] args) {
        try {
            // Replace xxx/eagle.png with the absolute path of your local image
            callWithLocalFile("xxx/eagle.png");
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}
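As an alternative to concatenating the `file://` prefix by hand as in the examples above, Python's standard `pathlib` module can build a file URI from an absolute path. This is a minimal sketch with a placeholder path. Note that on Windows, `Path.as_uri()` produces the three-slash form used by the Java SDK (`file:///D:/images/test.png`); for the DashScope Python SDK on Windows, use the two-slash form shown in the table above instead.

```python
from pathlib import Path

# Convert an absolute local path to a file URI. On Linux and macOS
# this yields the form required by both SDKs, for example:
#   Path("/home/images/test.png").as_uri() -> "file:///home/images/test.png"
# On Windows, as_uri() yields the three-slash form used by the Java SDK.
image_uri = Path("/home/images/test.png").as_uri()
print(image_uri)
```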

Image limitations

  • Image resolution:

    • Minimum size: The width and height of the image must both be greater than 10 pixels.

    • Aspect ratio: The ratio of the long side to the short side of the image must not exceed 200:1.

    • Pixel limit: We recommend that you keep the image resolution within 8K (7680×4320). Images that exceed this resolution may cause API calls to time out because of large file sizes and long network transmission times.

  • Supported image formats:

    • For resolutions below 4K (3840×2160), the following image formats are supported:

      | Image format | Common extension | MIME type |
      | --- | --- | --- |
      | BMP | .bmp | image/bmp |
      | JPEG | .jpe, .jpeg, .jpg | image/jpeg |
      | PNG | .png | image/png |
      | TIFF | .tif, .tiff | image/tiff |
      | WEBP | .webp | image/webp |
      | HEIC | .heic | image/heic |

    • For resolutions between 4K (3840×2160) and 8K (7680×4320), only JPEG, JPG, and PNG formats are supported.

  • Image size:

    • When passing a public URL or local path: The size of a single image cannot exceed 10 MB.

    • When passing a Base64-encoded string: The size of the encoded string cannot exceed 10 MB.

    To reduce the file size, see How to compress an image or video to the required size.
  • Number of supported images: When you pass multiple images, the number of images is limited by the model's maximum input tokens. The total number of tokens for all images and text combined must be less than this limit.
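The 10 MB limits above can be checked locally before you make a call. The following sketch uses only the Python standard library to verify both the raw file size and the Base64-encoded size; the resolution and aspect-ratio checks would require an image library such as Pillow and are omitted here.

```python
import base64
import os

MAX_BYTES = 10 * 1024 * 1024  # 10 MB limit for raw files and Base64 strings

def check_image_size(image_path):
    """Return (raw_ok, base64_ok) for the two 10 MB limits."""
    raw_size = os.path.getsize(image_path)
    with open(image_path, "rb") as f:
        encoded_size = len(base64.b64encode(f.read()))
    # Base64 inflates data by roughly 4/3, so a file comfortably under
    # 10 MB can still exceed the limit once encoded.
    return raw_size <= MAX_BYTES, encoded_size <= MAX_BYTES
```

For example, a 9 MB file passes the raw-size check but fails the Base64 check, because its encoding is about 12 MB. Checking both before uploading avoids a rejected request.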

Model features

| Model | Multi-turn conversation | Deep thinking | Function calling | Structured output | Web search | Partial mode | Context cache |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Kimi-K2.5 | Supported | Supported | Supported | Not supported | Not supported | Not supported | Not supported |
| kimi-k2-thinking | Supported | Supported | Supported | Supported | Not supported | Not supported | Not supported |
| Moonshot-Kimi-K2-Instruct | Supported | Not supported | Supported | Not supported | Supported | Not supported | Not supported |

Default parameter values

| Model | enable_thinking | temperature | top_p | presence_penalty |
| --- | --- | --- | --- | --- |
| Kimi-K2.5 | false | Thinking mode: 1.0; Non-thinking mode: 0.6 | 0.95 (both modes) | 0.0 (both modes) |
| kimi-k2-thinking | - | 1.0 | - | - |
| Moonshot-Kimi-K2-Instruct | - | 0.6 | 1.0 | 0 |

A hyphen (-) indicates that there is no default value and the parameter cannot be set.
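Where a parameter can be set, the defaults above can be overridden per request. The following sketch builds the JSON body for the DashScope HTTP endpoint shown in the curl examples earlier, overriding the Kimi-K2.5 non-thinking defaults; the prompt and the override values are illustrative only.

```python
import json

# Request body for the DashScope multimodal-generation endpoint.
# The "parameters" object overrides the Kimi-K2.5 non-thinking
# defaults (temperature 0.6, top_p 0.95) with illustrative values.
body = {
    "model": "kimi-k2.5",
    "input": {
        "messages": [
            {"role": "user", "content": [{"text": "Hello"}]}
        ]
    },
    "parameters": {
        "enable_thinking": False,
        "temperature": 0.4,   # override the 0.6 default
        "top_p": 0.9,         # override the 0.95 default
    },
}
print(json.dumps(body, indent=2))
```

Send this body with the same `Authorization: Bearer $DASHSCOPE_API_KEY` and `Content-Type: application/json` headers as in the curl examples above.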

Error codes

If a model call fails and returns an error message, see Error messages to troubleshoot the issue.