
Alibaba Cloud Model Studio: GLM

Last Updated: Mar 11, 2026

This topic describes how to call GLM series models through the API on the Alibaba Cloud Model Studio platform.

Model list

The GLM series models are hybrid reasoning models developed by Zhipu AI for agent-based applications. They support both a thinking mode and a non-thinking mode.

(All limits are token counts.)

| Model name | Context length | Maximum input | Maximum chain-of-thought length | Maximum response length |
| --- | --- | --- | --- | --- |
| glm-5 | 202,752 | 202,752 | 32,768 | 16,384 |
| glm-4.7 | 202,752 | 169,984 | 32,768 | 16,384 |
| glm-4.6 | 202,752 | 169,984 | 32,768 | 16,384 |

These models are not third-party services. They are all deployed on Alibaba Cloud Model Studio servers.

Get started

glm-5 is the latest model in the GLM series. It supports switching between thinking mode and non-thinking mode through the enable_thinking parameter. Run the following code to quickly call the glm-5 model in thinking mode.

Before you start, obtain an API key and configure it as an environment variable. If you use a software development kit (SDK), install the OpenAI or DashScope SDK.
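As a quick sanity check before running the samples, the following snippet confirms that the DASHSCOPE_API_KEY environment variable is visible to your Python process. This is a convenience sketch, not part of either SDK:

```python
import os

# All samples in this topic read the key from DASHSCOPE_API_KEY.
api_key = os.getenv("DASHSCOPE_API_KEY")
is_configured = bool(api_key)
print("DASHSCOPE_API_KEY configured:", is_configured)
```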

OpenAI compatible

Note

enable_thinking is not a standard OpenAI parameter. In the OpenAI Python SDK, pass it through extra_body. In the Node.js SDK, pass it as a top-level parameter.

Python

Sample code

from openai import OpenAI
import os

# Initialize the OpenAI client
client = OpenAI(
    # If you have not configured the environment variable, replace the following value with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Who are you"}]
completion = client.chat.completions.create(
    model="glm-5",
    messages=messages,
    # Set enable_thinking in extra_body to enable thinking mode
    extra_body={"enable_thinking": True},
    stream=True,
    stream_options={
        "include_usage": True
    },
)

reasoning_content = ""  # Full thought process
answer_content = ""  # Full response
is_answering = False  # Indicates whether the response phase has started
print("\n" + "=" * 20 + "Thought Process" + "=" * 20 + "\n")

for chunk in completion:
    if not chunk.choices:
        print("\n" + "=" * 20 + "Token Usage" + "=" * 20 + "\n")
        print(chunk.usage)
        continue

    delta = chunk.choices[0].delta

    # Collect only the thinking content
    if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
        if not is_answering:
            print(delta.reasoning_content, end="", flush=True)
        reasoning_content += delta.reasoning_content

    # Once content arrives, start collecting the response
    if hasattr(delta, "content") and delta.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
            is_answering = True
        print(delta.content, end="", flush=True)
        answer_content += delta.content

Response

====================Thought Process====================

Let me think carefully about this seemingly simple but profound question from the user.

Based on the language used, the user is speaking Chinese, so I should respond in Chinese. This is a basic self-introduction question, but it might have multiple layers of meaning.

First, as a language model, I must honestly state my identity and nature. I am not a human, nor do I have real emotional consciousness. I am an AI assistant trained with deep learning technology. This is the fundamental fact.

Second, considering the user's potential scenarios, they might want to know:
1. What services can I provide?
2. What are my areas of expertise?
3. What are my limitations?
4. How can we interact better?

In my response, I should be friendly and open, yet professional and accurate. I should state my main areas of expertise, such as knowledge Q&A, writing assistance, and creative support, while also frankly pointing out my limitations, such as the lack of real emotional experience.

Additionally, to make the response more complete, I should express a positive attitude and willingness to help the user solve problems. I can guide the user to ask more specific questions to better showcase my abilities.

Given that this is an open-ended opening, the response should be concise and clear, yet contain enough information for the user to have a clear understanding of my basic situation and to lay a good foundation for subsequent conversations.

Finally, the tone should remain humble and professional, neither too technical nor too casual, to make the user feel comfortable and natural.
====================Complete Response====================

I am a GLM large language model trained by Zhipu AI, designed to provide users with information and help solve problems. I am designed to understand and generate human language, and I can answer questions, provide explanations, or discuss various topics.

I do not store your personal data. Our conversations are anonymous. Is there any topic I can help you understand or explore?
====================Token Usage====================

CompletionUsage(completion_tokens=344, prompt_tokens=7, total_tokens=351, completion_tokens_details=None, prompt_tokens_details=None)

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    // If you have not configured the environment variable, replace the following value with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY, 
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = ''; // Full thought process
let answerContent = ''; // Full response
let isAnswering = false; // Indicates whether the response phase has started

async function main() {
    try {
        const messages = [{ role: 'user', content: 'Who are you' }];
        
        const stream = await openai.chat.completions.create({
            model: 'glm-5',
            messages,
            // Note: In the Node.js SDK, non-standard parameters such as enable_thinking are passed as top-level properties, not inside extra_body.
            enable_thinking: true,
            stream: true,
            stream_options: {
                include_usage: true
            },
        });

        console.log('\n' + '='.repeat(20) + 'Thought Process' + '='.repeat(20) + '\n');

        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log('\n' + '='.repeat(20) + 'Token Usage' + '='.repeat(20) + '\n');
                console.log(chunk.usage);
                continue;
            }

            const delta = chunk.choices[0].delta;
            
            // Collect only the thinking content
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }

            // Once content arrives, start collecting the response
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Complete Response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();

Response

====================Thought Process====================

Let me think carefully about the user's question, 'Who are you?'. This needs to be analyzed and responded to from multiple angles.

First, this is a basic identity question. As a GLM large language model, I need to accurately state my identity. I should clearly state that I am an AI assistant developed by Zhipu AI.

Second, I need to consider the user's possible intent in asking this question. They might be first-time users wanting to know basic features, or they might want to confirm if I can provide specific help, or they might just be testing my response style. Therefore, I need to give an open and friendly answer.

I also need to consider the completeness of the answer. In addition to introducing my identity, I should also briefly explain my main functions, such as Q&A, creation, and analysis, to let the user know how to use this assistant.

Finally, I must ensure a friendly and approachable tone and express a willingness to help. I can use expressions like 'I am happy to serve you' to make the user feel the warmth of the communication.

Based on these thoughts, I can organize a concise and clear answer that both answers the user's question and guides subsequent communication.
====================Complete Response====================

I am GLM, a large language model trained by Zhipu AI. I am trained on massive amounts of text data to understand and generate human language, helping users answer questions, provide information, and engage in conversations.

I will continue to learn and improve to provide better services. I am happy to answer your questions or provide assistance. What can I do for you?
====================Token Usage====================

{ prompt_tokens: 7, completion_tokens: 248, total_tokens: 255 }

HTTP

Sample code

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "glm-5",
    "messages": [
        {
            "role": "user", 
            "content": "Who are you"
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "enable_thinking": true
}'

DashScope

Python

Sample code

import os
from dashscope import Generation

# Initialize the request parameters
messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If you have not configured the environment variable, replace the following value with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="glm-5",
    messages=messages,
    result_format="message",  # Set the result format to message
    enable_thinking=True,     # Enable thinking mode
    stream=True,              # Enable streaming output
    incremental_output=True,  # Enable incremental output
)

reasoning_content = ""  # Full thought process
answer_content = ""     # Full response
is_answering = False    # Indicates whether the response phase has started

print("\n" + "=" * 20 + "Thought Process" + "=" * 20 + "\n")

for chunk in completion:
    message = chunk.output.choices[0].message
    # Collect only the thinking content
    if "reasoning_content" in message:
        if not is_answering:
            print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content

    # Once content arrives, start collecting the response
    if message.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content

print("\n" + "=" * 20 + "Token Usage" + "=" * 20 + "\n")
print(chunk.usage)

Response

====================Thought Process====================

Let me think carefully about the user's question, 'Who are you?'. First, I need to analyze the user's intent. This could be curiosity from a first-time user, or they might want to know about my specific functions and capabilities.

From a professional perspective, I should clearly state my identity as a GLM large language model, explaining my basic positioning and main functions. I should avoid overly technical descriptions and explain in an easy-to-understand way.

At the same time, I should also consider some practical issues that users might care about, such as privacy protection and data security. These are points of great concern for users when using AI services.

In addition, to show professionalism and friendliness, I can proactively guide the conversation after the introduction by asking if the user needs specific help. This will help the user understand me better and pave the way for subsequent conversations.

Finally, I must ensure the answer is concise and clear, with key points highlighted, so that the user can quickly understand my identity and purpose. Such an answer can both satisfy the user's curiosity and demonstrate professionalism and a service-oriented attitude.
====================Complete Response====================

I am a GLM large language model developed by Zhipu AI, designed to provide users with information and help through natural language processing technology. I am trained on massive amounts of text data and can understand and generate human language, answer questions, provide knowledge support, and participate in conversations.

My design goal is to be a useful AI assistant while ensuring user privacy and data security. I do not store users' personal information and will continue to learn and improve to provide higher quality services.

Is there any question I can answer or any task I can assist you with?
====================Token Usage====================

{"input_tokens": 8, "output_tokens": 269, "total_tokens": 277}

Java

Sample code

Important

Use DashScope Java SDK version 2.19.4 or later.

// The DashScope SDK version must be 2.19.4 or later.
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import java.util.Arrays;

public class Main {
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;
    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();
        if (reasoning != null && !reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thought Process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }
        if (content != null && !content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Complete Response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("glm-5")
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }
    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.err.println("An exception occurred: " + e.getMessage());
        }
    }
}

Response

====================Thought Process====================
Let me think about how to answer the user's question. First, this is a simple identity question that needs a clear and direct answer.

As a large language model, I should accurately state my basic identity information. This includes:
- Name: GLM
- Developer: Zhipu AI
- Main functions: Language understanding and generation

Considering that the user's question may stem from their first interaction, I need to introduce myself in an easy-to-understand way, avoiding overly technical terms. At the same time, I should also briefly explain my main capabilities to help the user better understand how to interact with me.

I should also express a friendly and open attitude, welcoming users to ask various questions to lay a good foundation for subsequent conversations. However, the introduction should be concise and clear, without being too detailed, to avoid overwhelming the user with information.

Finally, to promote further communication, I can proactively ask if the user needs specific help to better serve their actual needs.
====================Complete Response====================
I am GLM, a large language model developed by Zhipu AI. I am trained on massive amounts of text data and can understand and generate human language, answer questions, provide information, and engage in conversations.

My design purpose is to help users solve problems, provide knowledge, and support various language tasks. I will continuously learn and update to provide more accurate and useful answers.

Is there any question I can help you answer or discuss?

HTTP

Sample code

curl

curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "glm-5",
    "input":{
        "messages":[      
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters":{
        "enable_thinking": true,
        "incremental_output": true,
        "result_format": "message"
    }
}'

Streaming tool calling

The glm-5, glm-4.7, and glm-4.6 models support the tool_stream parameter. This Boolean parameter defaults to false and takes effect only when the stream parameter is set to true. When enabled, the `arguments` field of the `tool_call` response in a function call is returned incrementally through the stream, rather than all at once after generation fully completes.

The combined behavior of the stream and tool_stream parameters is as follows:

| stream | tool_stream | tool_call return behavior |
| --- | --- | --- |
| true | true | Arguments are returned incrementally across multiple chunks. |
| true | false (default) | Arguments are returned in full in a single chunk. |
| false | true/false | tool_stream has no effect. Arguments are returned all at once in the complete response. |
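Because arguments arrive in fragments when tool_stream is enabled, the client must concatenate them per tool call before parsing. The following sketch is not part of either SDK; the fragment dicts are simplified stand-ins for streaming deltas, grouped by the tool call's index field:

```python
import json

def merge_tool_call_chunks(deltas):
    """Reassemble incremental tool_call fragments into complete calls.

    The first fragment of a call carries `id` and the function `name`;
    later fragments carry only an `arguments` piece at the same `index`.
    """
    calls = {}
    for fragment in deltas:
        idx = fragment["index"]
        call = calls.setdefault(idx, {"id": None, "name": None, "arguments": ""})
        if fragment.get("id"):
            call["id"] = fragment["id"]
        fn = fragment.get("function", {})
        if fn.get("name"):
            call["name"] = fn["name"]
        call["arguments"] += fn.get("arguments") or ""
    # Parse the accumulated JSON argument strings once each call is complete.
    for call in calls.values():
        call["arguments"] = json.loads(call["arguments"])
    return [calls[i] for i in sorted(calls)]

# Simulated fragments for one get_weather call streamed across three chunks.
fragments = [
    {"index": 0, "id": "call_1", "function": {"name": "get_weather", "arguments": '{"ci'}},
    {"index": 0, "function": {"arguments": 'ty": "Bei'}},
    {"index": 0, "function": {"arguments": 'jing"}'}},
]
merged = merge_tool_call_chunks(fragments)
print(merged)  # [{'id': 'call_1', 'name': 'get_weather', 'arguments': {'city': 'Beijing'}}]
```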

OpenAI compatible

Python

Sample code

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather information for a specified city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "The name of the city"}
                },
                "required": ["city"]
            }
        }
    }
]

messages = [{"role": "user", "content": "What is the weather like in Beijing"}]

completion = client.chat.completions.create(
    model="glm-5",
    tools=tools,
    messages=messages,
    extra_body={
        "tool_stream": True,
    },
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in completion:
    if chunk.choices:
        delta = chunk.choices[0].delta
        if hasattr(delta, 'content') and delta.content:
            print(f"[content] {delta.content}")
        if hasattr(delta, 'tool_calls') and delta.tool_calls:
            for tc in delta.tool_calls:
                print(f"[tool_call] id={tc.id}, name={tc.function.name}, args={tc.function.arguments}")
        if chunk.choices[0].finish_reason:
            print(f"[finish_reason] {chunk.choices[0].finish_reason}")
    if not chunk.choices and chunk.usage:
        print(f"[usage] {chunk.usage}")

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});

const tools = [
    {
        type: "function",
        function: {
            name: "get_weather",
            description: "Get weather information for a specified city",
            parameters: {
                type: "object",
                properties: {
                    city: { type: "string", description: "The name of the city" }
                },
                required: ["city"]
            }
        }
    }
];

async function main() {
    try {
        const stream = await openai.chat.completions.create({
            model: 'glm-5',
            messages: [{ role: 'user', content: 'What is the weather like in Beijing' }],
            tools: tools,
            tool_stream: true,
            stream: true,
            stream_options: {
                include_usage: true
            },
        });

        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                if (chunk.usage) {
                    console.log(`[usage] ${JSON.stringify(chunk.usage)}`);
                }
                continue;
            }

            const delta = chunk.choices[0].delta;

            if (delta.content) {
                console.log(`[content] ${delta.content}`);
            }

            if (delta.tool_calls) {
                for (const tc of delta.tool_calls) {
                    console.log(`[tool_call] id=${tc.id}, name=${tc.function.name}, args=${tc.function.arguments}`);
                }
            }

            if (chunk.choices[0].finish_reason) {
                console.log(`[finish_reason] ${chunk.choices[0].finish_reason}`);
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();

HTTP

Sample code

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "glm-5",
    "messages": [
        {
            "role": "user",
            "content": "What is the weather like in Beijing"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get weather information for a specified city",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "The name of the city"}
                    },
                    "required": ["city"]
                }
            }
        }
    ],
    "stream": true,
    "stream_options": {"include_usage": true},
    "tool_stream": true
}'

DashScope

Python

Sample code

import os
from dashscope import Generation

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather information for a specified city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "The name of the city"}
                },
                "required": ["city"]
            }
        }
    }
]

messages = [{"role": "user", "content": "What is the weather like in Beijing"}]

completion = Generation.call(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="glm-5",
    messages=messages,
    tools=tools,
    result_format="message",
    stream=True,
    tool_stream=True,
    incremental_output=True,
)

for chunk in completion:
    msg = chunk.output.choices[0].message
    if msg.content:
        print(f"[content] {msg.content}")
    if "tool_calls" in msg and msg.tool_calls:
        for tc in msg.tool_calls:
            fn = tc.get("function", {})
            print(f"[tool_call] id={tc.get('id','')}, name={fn.get('name','')}, args={fn.get('arguments','')}")
    finish = chunk.output.choices[0].get("finish_reason", "")
    if finish and finish != "null":
        print(f"[finish_reason] {finish}")

HTTP

Sample code

curl

curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "glm-5",
    "input": {
        "messages": [
            {
                "role": "user",
                "content": "What is the weather like in Beijing"
            }
        ]
    },
    "parameters": {
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get weather information for a specified city",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "city": {"type": "string", "description": "The name of the city"}
                        },
                        "required": ["city"]
                    }
                }
            }
        ],
        "tool_stream": true,
        "incremental_output": true,
        "result_format": "message"
    }
}'

Model features

| Model | Multi-turn conversation | Function Calling | Structured output | Web search | Partial mode | Context cache |
| --- | --- | --- | --- | --- | --- | --- |
| glm-5 | Supported | Supported | Supported (only in non-thinking mode) | Not supported | Not supported | Supported (currently, only implicit caching is supported) |
| glm-4.7 | Supported | Supported | Supported (only in non-thinking mode) | Not supported | Not supported | Not supported |
| glm-4.6 | Supported | Supported | Supported (only in non-thinking mode) | Not supported | Not supported | Not supported |
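Multi-turn conversation is managed client-side: append each assistant reply to the messages list before sending the next user turn. A minimal sketch of that bookkeeping (no API call is made here; the reply string stands in for a real model response obtained from chat.completions.create):

```python
def append_turn(messages, assistant_reply, next_user_message):
    """Extend a conversation history with the last reply and a new user turn."""
    messages.append({"role": "assistant", "content": assistant_reply})
    messages.append({"role": "user", "content": next_user_message})
    return messages

history = [{"role": "user", "content": "Who are you?"}]
# In a real application, assistant_reply would be the model's previous answer.
history = append_turn(history, "I am GLM, a large language model.", "What can you do?")
print([m["role"] for m in history])  # ['user', 'assistant', 'user']
```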

Default parameter values

| Model | enable_thinking | temperature | top_p | top_k | repetition_penalty |
| --- | --- | --- | --- | --- | --- |
| glm-5 | true | 1.0 | 0.95 | 20 | 1.0 |
| glm-4.7 | true | 1.0 | 0.95 | 20 | 1.0 |
| glm-4.6 | true | 1.0 | 0.95 | 20 | 1.0 |
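The defaults above apply when a parameter is omitted. To override them in the OpenAI-compatible Python SDK, pass standard parameters such as temperature and top_p on the call itself and non-standard ones such as enable_thinking and top_k through extra_body, following the same pattern as the earlier samples. A sketch of the request construction (the sampling values are illustrative, not recommendations):

```python
# Hypothetical request; only the keys shown here are assumed.
request = {
    "model": "glm-5",
    "messages": [{"role": "user", "content": "Who are you?"}],
    # Standard OpenAI parameters override the table defaults.
    "temperature": 0.7,
    "top_p": 0.9,
    # Non-standard parameters go through extra_body in the OpenAI Python SDK.
    "extra_body": {"enable_thinking": False, "top_k": 10},
}
# A real call would be: client.chat.completions.create(**request)
print(request["extra_body"])  # {'enable_thinking': False, 'top_k': 10}
```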

Billing

Billing is based on the number of input and output tokens used by the model. For pricing details, see GLM.

In thinking mode, the chain-of-thought output is billed as output tokens.
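Because chain-of-thought output is billed as output tokens, the completion_tokens figure in the usage object already covers it, as the usage examples earlier in this topic suggest. A sketch of estimating cost from a usage object (the per-token prices below are placeholders, not actual GLM rates; see the pricing page):

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_1k, output_price_per_1k):
    """Estimate request cost. completion_tokens already includes any
    chain-of-thought tokens produced in thinking mode."""
    return (prompt_tokens / 1000) * input_price_per_1k \
         + (completion_tokens / 1000) * output_price_per_1k

# Token counts from the first sample's usage output; prices are hypothetical.
cost = estimate_cost(7, 344, input_price_per_1k=0.001, output_price_per_1k=0.002)
print(cost)
```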

FAQ

Q: How do I configure Dify?

A: Currently, you cannot integrate the GLM series models from Alibaba Cloud Model Studio with Dify. Instead, use the Qwen3 models through the Qwen card. For more information, see Dify.

Error codes

If an error occurs during execution, see Error messages for troubleshooting.