Alibaba Cloud Model Studio: Kimi

Last Updated: Nov 13, 2025

This topic describes how to call Kimi models on Alibaba Cloud Model Studio using OpenAI-compatible APIs or the DashScope SDK.

Important

The models described in this topic are available only in the China (Beijing) region. To use these models, you must have an API key from the China (Beijing) region.

Overview

Kimi is a series of large language models (LLMs) from Moonshot AI.

  • kimi-k2-thinking: Supports only the thinking mode and returns its thinking process in the reasoning_content field. It has strong coding and tool-calling capabilities and is suited to scenarios that require logical analysis, planning, or deep understanding.

  • Moonshot-Kimi-K2-Instruct: Does not support deep thinking and generates responses directly. It responds faster and is suited to scenarios that require quick, direct answers.

Model | Context window (tokens) | Max input (tokens) | Max chain-of-thought (tokens) | Max response (tokens) | Input price (per million tokens) | Output price (per million tokens)
kimi-k2-thinking | 262,144 | 229,376 | 32,768 | 16,384 | $0.574 | $2.294
Moonshot-Kimi-K2-Instruct | 131,072 | 131,072 | - | 8,192 | $0.574 | $2.294

Getting started

Before you call the API, create an API key and export it as an environment variable. If you call the model through an SDK, you must also install the OpenAI or DashScope SDK.
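
The samples in this topic read the key from the DASHSCOPE_API_KEY environment variable. The following minimal Python check is illustrative only and not part of the official samples; it simply confirms that the variable is set before you run any of the code below.

import os

# The samples read the API key from the DASHSCOPE_API_KEY environment variable.
# If this check fails, create an API key in the console and export it in your shell first.
if not os.getenv("DASHSCOPE_API_KEY"):
    raise RuntimeError("DASHSCOPE_API_KEY is not set. Export your Model Studio API key before running the samples.")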

OpenAI compatible

Python

Sample code

import os
from openai import OpenAI

client = OpenAI(
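    # If you have not configured the environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx"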
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="kimi-k2-thinking",
    messages=[{"role": "user", "content": "Who are you?"}],
    stream=True,
)

reasoning_content = ""  # Complete thinking process
answer_content = ""     # Complete response
is_answering = False    # Whether the response phase has started

print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    if chunk.choices:
        delta = chunk.choices[0].delta
        # Collect only the thinking content
        if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
            if not is_answering:
                print(delta.reasoning_content, end="", flush=True)
            reasoning_content += delta.reasoning_content
        # Received content, start responding
        if hasattr(delta, "content") and delta.content:
            if not is_answering:
                print("\n" + "=" * 20 + "Complete response" + "=" * 20 + "\n")
                is_answering = True
            print(delta.content, end="", flush=True)
            answer_content += delta.content

Sample response

====================Thinking process====================

The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should introduce myself clearly and concisely, including the following:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.

I should maintain a friendly and professional tone, avoiding overly technical terms so that regular users can understand. At the same time, I should emphasize that I am an AI without personal consciousness, emotions, or personal experiences.

Response structure:
- Directly state my identity
- Mention the developer
- Briefly introduce core capabilities
- Keep it simple and clear
====================Complete response====================

I am Kimi, an AI assistant developed by Moonshot AI. I am based on a Mixture-of-Experts (MoE) architecture and have capabilities such as ultra-long context understanding, intelligent conversation, file processing, code generation, and complex task inference. How can I help you?
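
The streaming call above prints the thinking process as it arrives. If you do not need streaming, you can also make a blocking call. The sketch below assumes the non-streaming message carries the same reasoning_content field shown in the HTTP sample response later in this topic, and reads it with getattr because the field is an extension of the standard OpenAI response object.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="kimi-k2-thinking",
    messages=[{"role": "user", "content": "Who are you?"}],
)

message = completion.choices[0].message
# reasoning_content is not part of the standard OpenAI schema, so read it defensively.
print("Thinking process:\n", getattr(message, "reasoning_content", None))
print("\nResponse:\n", message.content)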

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    // If you have not configured the environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = ''; // Complete thinking process
let answerContent = ''; // Complete response
let isAnswering = false; // Whether the response phase has started

async function main() {
    const messages = [{ role: 'user', content: 'Who are you?' }];

    const stream = await openai.chat.completions.create({
        model: 'kimi-k2-thinking',
        messages,
        stream: true,
    });

    console.log('\n' + '='.repeat(20) + 'Thinking process' + '='.repeat(20) + '\n');

    for await (const chunk of stream) {
        if (chunk.choices?.length) {
            const delta = chunk.choices[0].delta;
            // Collect only the thinking content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }

            // Received content, start responding
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Complete response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    }
}

main();

Sample response

====================Thinking process====================

The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should introduce myself clearly and concisely, including the following:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.

I should maintain a friendly and professional tone, avoiding overly technical terms so that regular users can easily understand. At the same time, I should emphasize that I am an AI without personal consciousness, emotions, or personal experiences to avoid misunderstandings.

Response structure:
- Directly state my identity
- Mention the developer
- Briefly introduce core capabilities
- Keep it simple and clear
====================Complete response====================

I am Kimi, an artificial intelligence assistant developed by Moonshot AI.

I am good at:
- Long text understanding and generation
- Intelligent conversation and Q&A
- File processing and analysis
- Information retrieval and integration

As an AI assistant, I do not have personal consciousness, emotions, or experiences, but I will do my best to provide you with accurate and useful help. How can I help you?

HTTP

Sample code

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "kimi-k2-thinking",
    "messages": [
        {
            "role": "user", 
            "content": "Who are you?"
        }
    ]
}'

Sample response

{
    "choices": [
        {
            "message": {
                "content": "I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I am good at handling long texts, intelligent conversation, file analysis, programming assistance, and complex task inference. I can help you answer questions, create content, and analyze documents. How can I help you?",
                "reasoning_content": "The user asks \"Who are you?\", which is a direct question about my identity. I need to answer truthfully based on my actual identity.\n\nI am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should introduce myself clearly and concisely, including the following:\n1. My identity: AI assistant\n2. My developer: Moonshot AI\n3. My name: Kimi\n4. My core capabilities: long-text processing, intelligent conversation, file processing, search, etc.\n\nI should maintain a friendly and professional tone while providing useful information. No need to overcomplicate, just answer directly.",
                "role": "assistant"
            },
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null
        }
    ],
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 183,
        "total_tokens": 191
    },
    "created": 1762753998,
    "system_fingerprint": null,
    "model": "kimi-k2-thinking",
    "id": "chatcmpl-485ab490-90ec-48c3-85fa-1c732b683db2"
}

DashScope

Python

Sample code

import os
from dashscope import Generation

# Initialize request parameters
messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If you have not configured the environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="kimi-k2-thinking",
    messages=messages,
    result_format="message",  # Set the result format to message
    stream=True,              # Enable streaming output
    incremental_output=True,  # Enable incremental output
)

reasoning_content = ""  # Complete thinking process
answer_content = ""     # Complete response
is_answering = False    # Whether the response phase has started

print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
    message = chunk.output.choices[0].message
    
    # Collect only the thinking content
    if message.reasoning_content:
        if not is_answering:
            print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content

    # Received content, start responding
    if message.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Complete response" + "=" * 20 + "\n")
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content

# After the loop, the reasoning_content and answer_content variables contain the complete content
# You can perform subsequent processing here as needed
# print(f"\n\nComplete thinking process:\n{reasoning_content}")
# print(f"\nComplete response:\n{answer_content}")

Sample response

====================Thinking process====================

The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should state this clearly and concisely.

Key information to include the following:
1. My name: Kimi
2. My developer: Moonshot AI
3. My nature: Artificial intelligence assistant
4. I can provide help: answer questions, assist with creation, etc.

I should maintain a friendly and helpful tone while accurately stating my identity. I should not pretend to be human or have a personal identity.

A suitable response could be:
"I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I can help you with various tasks such as answering questions, creating content, and analyzing documents. How can I help you?"

This response is direct, accurate, and invites further interaction.
====================Complete response====================

I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I can help you with various tasks such as answering questions, creating content, and analyzing documents. How can I help you?

Java

Sample code

// DashScope SDK version >= 2.19.4
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Main {
    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;

    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();

        if (reasoning != null && !reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thinking process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }

        if (content != null && !content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Complete response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If you have not configured the environment variable, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("kimi-k2-thinking")
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }

    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
            // Print the final result
            // if (reasoningContent.length() > 0) {
            //     System.out.println("\n====================Complete response====================");
            //     System.out.println(finalContent.toString());
            // }
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}

Sample response

====================Thinking process====================
The user asks "Who are you?", which is a direct question about my identity. I need to answer truthfully based on my actual identity.

I am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should state this clearly and concisely.

The response should include the following:
1. My identity: AI assistant
2. My developer: Moonshot AI
3. My name: Kimi
4. My core capabilities: long-text processing, intelligent conversation, file processing, etc.

I should not pretend to be human, nor should I provide too many technical details. I just need to give a clear and friendly answer.
====================Complete response====================
I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I am good at handling long texts, engaging in intelligent conversations, answering questions, assisting with creation, and helping you analyze and process files. How can I help you?

HTTP

Sample code

curl

curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "kimi-k2-thinking",
    "input":{
        "messages":[      
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters": {
        "result_format": "message"
    }
}'

Sample response

{
    "output": {
        "choices": [
            {
                "finish_reason": "stop",
                "message": {
                    "content": "I am Kimi, an artificial intelligence assistant developed by Moonshot AI. I can help you answer questions, create content, analyze documents, and write code. How can I help you?",
                    "reasoning_content": "The user asks \"Who are you?\", which is a direct question about my identity. I need to answer truthfully based on my actual identity.\n\nI am an artificial intelligence assistant developed by Moonshot AI, and my name is Kimi. I should state this clearly and concisely.\n\nKey information to include the following:\n1. My name: Kimi\n2. My developer: Moonshot AI\n3. My nature: Artificial intelligence assistant\n4. I can provide help: answer questions, assist with creation, etc.\n\nI should respond in a friendly and direct manner that is easy for the user to understand.",
                    "role": "assistant"
                }
            }
        ]
    },
    "usage": {
        "input_tokens": 9,
        "output_tokens": 156,
        "total_tokens": 165
    },
    "request_id": "709a0697-ed1f-4298-82c9-a4b878da1849"
}

Model features

Model | Multi-turn conversation | Deep thinking | Function calling | Structured output | Web search | Partial mode | Context cache
kimi-k2-thinking | Supported | Supported | Supported | Supported | Not supported | Not supported | Not supported
Moonshot-Kimi-K2-Instruct | Supported | Not supported | Supported | Not supported | Supported | Not supported | Not supported
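
Both models support function calling through the standard tools parameter of the OpenAI-compatible interface. The sketch below is a minimal illustration; the get_weather tool name and its schema are placeholders for your own functions, not part of the Kimi API.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

# Illustrative tool definition; replace the name and parameters with your own function.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

completion = client.chat.completions.create(
    model="Moonshot-Kimi-K2-Instruct",
    messages=[{"role": "user", "content": "What is the weather in Beijing?"}],
    tools=tools,
)

# If the model decides to call the tool, the call and its arguments are returned here.
print(completion.choices[0].message.tool_calls)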

Default parameter values

Model | temperature | top_p | presence_penalty
kimi-k2-thinking | 1.0 | - | -
Moonshot-Kimi-K2-Instruct | 0.6 | 1.0 | 0

A hyphen (-) indicates that a parameter is not configurable and has no default value.
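
These defaults apply only when you do not set the parameters in the request. The sketch below overrides them for Moonshot-Kimi-K2-Instruct through the OpenAI-compatible interface; the values are arbitrary examples, not recommendations.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="Moonshot-Kimi-K2-Instruct",
    messages=[{"role": "user", "content": "Write a one-sentence slogan for a coffee shop."}],
    # Example overrides; all three parameters are configurable for this model.
    temperature=0.8,
    top_p=0.9,
    presence_penalty=0,
)
print(completion.choices[0].message.content)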

Error codes

If a model call fails and an error message is returned, see Error messages for more information.