Kimi

This topic describes how to call Kimi models on Alibaba Cloud Model Studio using OpenAI-compatible APIs or the DashScope SDK.

Important

The models described in this topic are available only in the China (Beijing) region. To use these models, you must have an API key from the China (Beijing) region.

Overview

Kimi is a series of large language models (LLMs) from Moonshot AI.

kimi-k2-thinking: This model supports only thinking mode and returns its thinking process in the reasoning_content field. It has strong coding and tool calling capabilities and is suitable for scenarios that require logical analysis, planning, or deep understanding.
Moonshot-Kimi-K2-Instruct: This model does not support deep thinking and generates responses directly. It responds faster and is suitable for scenarios that require quick, direct answers.

Model	Context window (tokens)	Max input (tokens)	Max chain-of-thought (tokens)	Max response (tokens)	Input price (per million tokens)	Output price (per million tokens)
kimi-k2-thinking	262,144	229,376	32,768	16,384	$0.574	$2.294
Moonshot-Kimi-K2-Instruct	131,072	131,072	-	8,192	$0.574	$2.294
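
As a rough illustration of how the prices in the table translate into a per-call cost, the following sketch multiplies token counts by the listed rates. The helper name estimate_cost_usd is hypothetical, and the sketch assumes that thinking tokens are included in the reported output token count, which this topic does not state. Treat the result as an estimate, not a billing figure.

```python
# Prices copied from the table above (USD per million tokens).
PRICES_PER_MILLION = {
    "kimi-k2-thinking": {"input": 0.574, "output": 2.294},
    "Moonshot-Kimi-K2-Instruct": {"input": 0.574, "output": 2.294},
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one call from its token counts (hypothetical helper)."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Token counts taken from the usage field of the OpenAI-compatible
# HTTP sample response in this topic (8 prompt tokens, 183 completion tokens).
cost = estimate_cost_usd("kimi-k2-thinking", input_tokens=8, output_tokens=183)
print(f"Estimated cost: ${cost:.6f}")
```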

Getting started

Before you call the API, create an API key and export the API key as an environment variable. If you call the model using an SDK, install the OpenAI or DashScope SDK.
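
Before running the samples, it can help to confirm that the environment variable is actually set, so that a missing key fails with a clear message rather than an authentication error from the service. The following is a minimal sketch; the helper name load_api_key is hypothetical and is not part of either SDK.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key stored in DASHSCOPE_API_KEY, or raise a clear error.

    `env` defaults to the process environment; passing a dict makes the
    function easy to exercise without touching real environment variables.
    """
    source = os.environ if env is None else env
    api_key = source.get("DASHSCOPE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set. Create an API key in the "
            "China (Beijing) region and export it before calling the model."
        )
    return api_key
```

The returned value can then be passed as the api_key argument in the samples, in place of os.getenv("DASHSCOPE_API_KEY").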

OpenAI compatible

Python

Sample code

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="kimi-k2-thinking",
    messages=[{"role": "user", "content": "Who are you?"}],
    stream=True,
)

reasoning_content = ""  # Complete thinking process
answer_content = ""     # Complete response
is_answering = False    # Whether the response phase has started

print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    if chunk.choices:
        delta = chunk.choices[0].delta
        # Collect only the thinking content
        if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
            if not is_answering:
                print(delta.reasoning_content, end="", flush=True)
            reasoning_content += delta.reasoning_content
        # Answer content has arrived: print the response header once, then the text
        if hasattr(delta, "content") and delta.content:
            if not is_answering:
                print("\n" + "=" * 20 + "Complete response" + "=" * 20 + "\n")
                is_answering = True
            print(delta.content, end="", flush=True)
            answer_content += delta.content

Sample response

====================Thinking process====================

The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should introduce myself clearly and concisely, including the following:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.

I should maintain a friendly and professional tone, avoiding overly technical terms so that regular users can understand. At the same time, I should emphasize that I am an AI without personal consciousness, emotions, or personal experiences.

Response structure:
- Directly state my identity
- Mention the developer
- Briefly introduce core capabilities
- Keep it simple and clear
====================Complete response====================

I am Kimi, an AI assistant developed by Moonshot AI. I am based on a Mixture-of-Experts (MoE) architecture and have capabilities such as ultra-long context understanding, intelligent conversation, file processing, code generation, and complex task inference. How can I help you?

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    // If you have not configured the environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = ''; // Complete thinking process
let answerContent = ''; // Complete response
let isAnswering = false; // Whether the response phase has started

async function main() {
    const messages = [{ role: 'user', content: 'Who are you?' }];

    const stream = await openai.chat.completions.create({
        model: 'kimi-k2-thinking',
        messages,
        stream: true,
    });

    console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');

    for await (const chunk of stream) {
        if (chunk.choices?.length) {
            const delta = chunk.choices[0].delta;
            // Collect only the thinking content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }

            // Answer content has arrived: print the response header once, then the text
            if (delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Complete response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    }
}

main();

Sample response

====================Thinking process====================

The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should introduce myself clearly and concisely, including the following:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.

I should maintain a friendly and professional tone, avoiding overly technical terms so that regular users can easily understand. At the same time, I should emphasize that I am an AI without personal consciousness, emotions, or personal experiences to avoid misunderstandings.

Response structure:
- Directly state my identity
- Mention the developer
- Briefly introduce core capabilities
- Keep it simple and clear
====================Complete response====================

I am Kimi, an artificial intelligence assistant developed by Moonshot AI.

I am good at:
- Long text understanding and generation
- Intelligent conversation and Q&A
- File processing and analysis
- Information retrieval and integration

As an AI assistant, I do not have personal consciousness, emotions, or experiences, but I will do my best to provide you with accurate and useful help. How can I help you?

HTTP

Sample code

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "kimi-k2-thinking",
    "messages": [
        {
            "role": "user",
            "content": "Who are you?"
        }
    ]
}'

Sample response

{
    "choices": [
        {
            "message": {
                "content": "I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I am good at handling long texts, intelligent conversation, file analysis, programming assistance, and complex task inference. I can help you answer questions, create content, and analyze documents. How can I help you?",
                "reasoning_content": "The user asks \"Who are you?\", which is a direct question about my identity. I need to answer truthfully based on my actual identity.\n\nI am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should introduce myself clearly and concisely, including the following:\n1. My identity: AI assistant\n2. My developer: Moonshot AI\n3. My name: Kimi\n4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.\n\nI should maintain a friendly and professional tone while providing useful information. No need to overcomplicate, just answer directly.",
                "role": "assistant"
            },
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null
        }
    ],
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 183,
        "total_tokens": 191
    },
    "created": 1762753998,
    "system_fingerprint": null,
    "model": "kimi-k2-thinking",
    "id": "chatcmpl-485ab490-90ec-48c3-85fa-1c732b683db2"
}
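
For a non-streaming call, the thinking process and the answer arrive together in one message object. The following sketch shows one way to separate them after parsing the JSON body. The helper name split_message is hypothetical, and the embedded JSON is an abbreviated copy of the sample response above, with the long strings shortened.

```python
import json
from typing import Any, Dict, Tuple

# Abbreviated copy of the sample response above; long strings are shortened.
SAMPLE_RESPONSE = """
{
    "choices": [
        {
            "message": {
                "content": "I am Kimi, an artificial intelligence assistant developed by Moonshot AI.",
                "reasoning_content": "The user asks \\"Who are you?\\", which is a direct question about my identity.",
                "role": "assistant"
            },
            "finish_reason": "stop",
            "index": 0
        }
    ],
    "usage": {"prompt_tokens": 8, "completion_tokens": 183, "total_tokens": 191},
    "model": "kimi-k2-thinking"
}
"""


def split_message(response: Dict[str, Any]) -> Tuple[str, str]:
    """Return (thinking process, answer) from an OpenAI-compatible response."""
    message = response["choices"][0]["message"]
    # reasoning_content may be absent or null for models without thinking mode.
    reasoning = message.get("reasoning_content") or ""
    answer = message.get("content") or ""
    return reasoning, answer


reasoning, answer = split_message(json.loads(SAMPLE_RESPONSE))
print("Thinking process:", reasoning)
print("Complete response:", answer)
```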

DashScope

Python

Sample code

import os
from dashscope import Generation

# Initialize request parameters
messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If you have not configured the environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="kimi-k2-thinking",
    messages=messages,
    result_format="message",  # Set the result format to message
    stream=True,              # Enable streaming output
    incremental_output=True,  # Enable incremental output
)

reasoning_content = ""  # Complete thinking process
answer_content = ""     # Complete response
is_answering = False    # Whether the response phase has started

print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    message = chunk.output.choices[0].message
    
    # Collect only the thinking content
    if message.reasoning_content:
        if not is_answering:
            print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content

    # Answer content has arrived: print the response header once, then the text
    if message.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Complete response" + "=" * 20 + "\n")
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content

# After the loop, the reasoning_content and answer_content variables contain the complete content
# You can perform subsequent processing here as needed
# print(f"\n\nComplete thinking process:\n{reasoning_content}")
# print(f"\nComplete response:\n{answer_content}")

Sample response

====================Thinking process====================

The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should state this clearly and concisely.

Key information to include:
1. My name: Kimi
2. My developer: Moonshot AI
3. My nature: Artificial intelligence assistant
4. I can provide help: answer questions, assist with creation, etc.

I should maintain a friendly and helpful tone while accurately stating my identity. I should not pretend to be human or have a personal identity.

A suitable response could be:
"I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I can help you with various tasks such as answering questions, creating content, and analyzing documents. How can I help you?"

This response is direct, accurate, and invites further interaction.
====================Complete response====================

I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I can help you with various tasks such as answering questions, creating content, and analyzing documents. How can I help you?

Java

Sample code

// DashScope SDK version >= 2.19.4
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Main {
    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();

        if (reasoning != null && !reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }

        if (content != null && !content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Complete response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If you have not configured the environment variable, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("kimi-k2-thinking")
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
            // Print the final result
            // if (reasoningContent.length() > 0) {
            //     System.out.println("\n====================Complete response====================");
            //     System.out.println(finalContent.toString());
            // }
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}

Sample response

====================Thinking process====================
The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should state this clearly and concisely.

The response should include the following:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, etc.

I should not pretend to be human, nor should I provide too many technical details. I just need to give a clear and friendly answer.
====================Complete response====================
I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I am good at handling long texts, engaging in intelligent conversations, answering questions, assisting with creation, and helping you analyze and process files. How can I help you?

HTTP

Sample code

curl

curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "kimi-k2-thinking",
    "input":{
        "messages":[
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters": {
        "result_format": "message"
    }
}'

Sample response

{
    "output": {
        "choices": [
            {
                "finish_reason": "stop",
                "message": {
                    "content": "I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I can help you answer questions, create content, analyze documents, and write code. How can I help you?",
                    "reasoning_content": "The user asks \"Who are you?\", which is a direct question about my identity. I need to answer truthfully based on my actual identity.\n\nI am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should state this clearly and concisely.\n\nKey information to include the following:\n1. My name: Kimi\n2. My developer: Moonshot AI\n3. My nature: Artificial intelligence assistant\n4. I can provide help: answer questions, assist with creation, etc.\n\nI should respond in a friendly and direct manner that is easy for the user to understand.",
                    "role": "assistant"
                }
            }
        ]
    },
    "usage": {
        "input_tokens": 9,
        "output_tokens": 156,
        "total_tokens": 165
    },
    "request_id": "709a0697-ed1f-4298-82c9-a4b878da1849"
}
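
The DashScope response nests the message under output.choices and reports usage as input_tokens and output_tokens, which differs from the OpenAI-compatible layout. The following sketch flattens the fields shown in the sample response above into a single dictionary. The helper name summarize_dashscope_response is hypothetical, and the sample dictionary is an abbreviated copy of that response.

```python
from typing import Any, Dict


def summarize_dashscope_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten the fields of a DashScope HTTP response (hypothetical helper)."""
    choice = response["output"]["choices"][0]
    message = choice["message"]
    usage = response.get("usage", {})
    return {
        "answer": message.get("content") or "",
        "reasoning": message.get("reasoning_content") or "",
        "finish_reason": choice.get("finish_reason"),
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
        "request_id": response.get("request_id"),
    }


# Abbreviated copy of the sample response above; long strings are shortened.
sample = {
    "output": {
        "choices": [
            {
                "finish_reason": "stop",
                "message": {
                    "content": "I am Kimi, an artificial intelligence assistant developed by Moonshot AI.",
                    "reasoning_content": "The user asks a direct question about my identity.",
                    "role": "assistant",
                },
            }
        ]
    },
    "usage": {"input_tokens": 9, "output_tokens": 156, "total_tokens": 165},
    "request_id": "709a0697-ed1f-4298-82c9-a4b878da1849",
}

summary = summarize_dashscope_response(sample)
print(summary["finish_reason"], summary["input_tokens"], summary["output_tokens"])
# prints: stop 9 156
```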

Model features

Model	Multi-turn conversation	Deep thinking	Function calling	Structured output	Web search	Partial mode	Context cache
kimi-k2-thinking	Supported	Supported	Supported	Supported	Not supported	Not supported	Not supported
Moonshot-Kimi-K2-Instruct	Supported	Not supported	Supported	Not supported	Supported	Not supported	Not supported
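
The feature table can be encoded as a small lookup to pick a model for a given set of requirements. The following is a sketch only: the feature keys (for example, function_calling) and the helper models_supporting are hypothetical names chosen for illustration, and the sets simply mirror the "Supported" cells in the table above.

```python
from typing import Dict, List, Set

# Features marked "Supported" in the table above, keyed by model name.
FEATURES: Dict[str, Set[str]] = {
    "kimi-k2-thinking": {
        "multi_turn",
        "deep_thinking",
        "function_calling",
        "structured_output",
    },
    "Moonshot-Kimi-K2-Instruct": {
        "multi_turn",
        "function_calling",
        "web_search",
    },
}


def models_supporting(*required: str) -> List[str]:
    """Return the models whose feature set covers every required feature."""
    needed = set(required)
    return [model for model, features in FEATURES.items() if needed <= features]


print(models_supporting("function_calling"))
print(models_supporting("web_search"))
print(models_supporting("deep_thinking", "web_search"))
```

With the table as given, no single model covers both deep thinking and web search, so the last call returns an empty list.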

Default parameter values

Model	temperature	top_p	presence_penalty
kimi-k2-thinking	1.0	-	-
Moonshot-Kimi-K2-Instruct	0.6	1.0	0

A hyphen (-) indicates that a parameter is not configurable and has no default value.
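
Because some parameters are not configurable for kimi-k2-thinking, a client may want to catch them before sending a request. The following sketch builds sampling parameters from the defaults in the table above and rejects anything the table marks with a hyphen. The helper build_sampling_params is hypothetical. This topic does not say whether the service rejects or ignores a non-configurable parameter, so raising an error locally is a design choice of this sketch, not documented behavior.

```python
from typing import Any, Dict

# Configurable parameters and their defaults, copied from the table above.
# Parameters shown as "-" in the table are omitted for that model.
DEFAULTS: Dict[str, Dict[str, Any]] = {
    "kimi-k2-thinking": {"temperature": 1.0},
    "Moonshot-Kimi-K2-Instruct": {
        "temperature": 0.6,
        "top_p": 1.0,
        "presence_penalty": 0,
    },
}


def build_sampling_params(model: str, **overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the model's defaults, rejecting unlisted parameters."""
    configurable = DEFAULTS[model]
    unsupported = sorted(set(overrides) - set(configurable))
    if unsupported:
        raise ValueError(
            f"{model} does not allow configuring: {', '.join(unsupported)}"
        )
    return {**configurable, **overrides}


print(build_sampling_params("Moonshot-Kimi-K2-Instruct", temperature=0.3))
print(build_sampling_params("kimi-k2-thinking"))
```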

Error codes

If a model call fails and an error message is returned, see Error messages for more information.