Enable GLM thinking mode through the API for smarter responses - Model Studio

This topic describes how to call GLM series models through the API on the Alibaba Cloud Model Studio platform.

Model list

The GLM series models are hybrid reasoning models developed by Zhipu AI for agent-based applications. They support both thinking mode and non-thinking mode.

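Model name	Context length (tokens)	Maximum input (tokens)	Maximum chain-of-thought length (tokens)	Maximum response length (tokens)
glm-5	202,752	202,752	32,768	16,384
glm-4.7	202,752	169,984	32,768	16,384
glm-4.6	202,752	169,984	32,768	16,384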

These models are not third-party services. All of them are deployed on Alibaba Cloud Model Studio servers.

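Quick start

glm-5 is the latest model in the GLM series. It switches between thinking mode and non-thinking mode through the enable_thinking parameter. To call glm-5 in thinking mode quickly, run the following code.

Before you begin, obtain an API key and set it as an environment variable. If you use a software development kit (SDK), install the OpenAI SDK or the DashScope SDK.

OpenAI compatible

Note

enable_thinking is not a standard OpenAI parameter. In the OpenAI Python SDK, pass it through extra_body. In the Node.js SDK, pass it as a top-level parameter.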
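As a minimal sketch of the note above, the hypothetical helper below builds keyword arguments for the OpenAI Python SDK's `client.chat.completions.create` and places `enable_thinking` inside `extra_body`. The helper name `build_chat_kwargs` is an assumption made for illustration and is not part of any SDK.

```python
from typing import Any


def build_chat_kwargs(prompt: str, thinking: bool, model: str = "glm-5") -> dict[str, Any]:
    """Build keyword arguments for client.chat.completions.create (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # enable_thinking is not a standard OpenAI parameter, so the Python SDK
        # needs it inside extra_body rather than as a top-level argument.
        "extra_body": {"enable_thinking": thinking},
        "stream": True,
    }


# Request non-thinking mode for a single call.
kwargs = build_chat_kwargs("Who are you", thinking=False)
print(kwargs["extra_body"])
```

The resulting dictionary can be unpacked into the call, for example `client.chat.completions.create(**kwargs)`, which keeps the mode switch in one place.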

Python

Sample code

from openai import OpenAI
import os

# Initialize the OpenAI client
client = OpenAI(
    # If you have not set the environment variable, replace the value with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Who are you"}]
completion = client.chat.completions.create(
    model="glm-5",
    messages=messages,
    # Set enable_thinking in extra_body to enable thinking mode
    extra_body={"enable_thinking": True},
    stream=True,
    stream_options={
        "include_usage": True
    },
)

reasoning_content = ""  # Full thinking process
answer_content = ""  # Full response
is_answering = False  # Whether the response phase has started
print("\n" + "=" * 20 + "Thought Process" + "=" * 20 + "\n")

for chunk in completion:
    if not chunk.choices:
        print("\n" + "=" * 20 + "Token Usage" + "=" * 20 + "\n")
        print(chunk.usage)
        continue

    delta = chunk.choices[0].delta

    # Print the thinking content as it streams and accumulate it
    if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
        if not is_answering:
            print(delta.reasoning_content, end="", flush=True)
        reasoning_content += delta.reasoning_content

    # Once content arrives, the response phase has started
    if hasattr(delta, "content") and delta.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
            is_answering = True
        print(delta.content, end="", flush=True)
        answer_content += delta.content

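Response

====================Thought Process====================

Let me think carefully about this seemingly simple but profound question from the user.

Judging from the language used, the user is speaking Chinese, so I should respond in Chinese. This is a basic self-introduction question, but it may carry several layers of meaning.

First, as a language model, I must state my identity and nature honestly. I am not human, and I have no real emotional consciousness. I am an AI assistant trained with deep learning technology. That is the basic fact.

Next, considering the user's likely scenarios, they may want to know:
1. What services can I provide?
2. What are my areas of expertise?
3. What are my limitations?
4. How can they interact with me more effectively?

My response should be friendly and open while remaining professional and accurate. I should state my main areas of expertise, such as knowledge Q&A, writing assistance, and creative support, and also be frank about limitations such as the lack of real emotional experience.

In addition, to make the response more complete, I should express a positive attitude and a willingness to help the user solve problems. I can invite the user to ask more specific questions so that I can better demonstrate my abilities.

Since this is an open-ended opening, the response should be concise and clear while containing enough information for the user to understand my basic situation and to lay a good foundation for the rest of the conversation.

Finally, the tone should stay humble and professional, neither too technical nor too casual, so that the user feels comfortable and at ease.
====================Complete Response====================

I am a GLM large language model trained by Zhipu AI, designed to provide information and help users solve problems. I am designed to understand and generate human language, and I can answer questions, provide explanations, and discuss a wide range of topics.

I do not store your personal data, and our conversation is anonymous. Is there a topic I can help you understand or explore?
====================Token Usage====================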

CompletionUsage(completion_tokens=344, prompt_tokens=7, total_tokens=351, completion_tokens_details=None, prompt_tokens_details=None)

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

// Initialize the OpenAI client
const openai = new OpenAI({
    // If you have not set the environment variable, replace the value with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY, 
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});

let reasoningContent = ''; // Full thinking process
let answerContent = ''; // Full response
let isAnswering = false; // Whether the response phase has started

async function main() {
    try {
        const messages = [{ role: 'user', content: 'Who are you' }];
        
        const stream = await openai.chat.completions.create({
            model: 'glm-5',
            messages,
            // Note: In the Node.js SDK, non-standard parameters such as enable_thinking are passed as top-level properties, not inside extra_body.
            enable_thinking: true,
            stream: true,
            stream_options: {
                include_usage: true
            },
        });

        console.log('\n' + '='.repeat(20) + 'Thought Process' + '='.repeat(20) + '\n');

        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                console.log('\n' + '='.repeat(20) + 'Token Usage' + '='.repeat(20) + '\n');
                console.log(chunk.usage);
                continue;
            }

            const delta = chunk.choices[0].delta;
            
            // Print the thinking content as it streams and accumulate it
            if (delta.reasoning_content !== undefined && delta.reasoning_content !== null) {
                if (!isAnswering) {
                    process.stdout.write(delta.reasoning_content);
                }
                reasoningContent += delta.reasoning_content;
            }

            // Once content arrives, the response phase has started
            if (delta.content !== undefined && delta.content) {
                if (!isAnswering) {
                    console.log('\n' + '='.repeat(20) + 'Complete Response' + '='.repeat(20) + '\n');
                    isAnswering = true;
                }
                process.stdout.write(delta.content);
                answerContent += delta.content;
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();

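Response

====================Thought Process====================

Let me think carefully about the user's question, "Who are you?" It needs to be analyzed and answered from several angles.

First, this is a basic question about identity. As a GLM large language model, I need to state my identity accurately. I should state clearly that I am an AI assistant developed by Zhipu AI.

Next, I need to consider the user's intent in asking. They may be a first-time user who wants to know my basic capabilities, they may want to confirm whether I can provide specific help, or they may simply be testing my response style. So I should give an open and friendly answer.

I also need to consider the completeness of the answer. Beyond introducing my identity, I should briefly describe my main capabilities, such as Q&A, writing, and analysis, so that the user knows how to use this assistant.

Finally, I should keep a friendly and approachable tone and express a willingness to help. An expression such as "happy to be of service" can convey warmth to the user.

Based on these thoughts, I can put together a concise and clear answer that addresses the user's question and guides the rest of the conversation.
====================Complete Response====================

I am GLM, a large language model trained by Zhipu AI. I was trained on a large amount of text data, and by understanding and generating human language I help users by answering questions, providing information, and taking part in conversations.

I will keep learning and improving to provide better service. I am happy to answer your questions or help in any way I can. Is there anything I can help you with?
====================Token Usage====================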

{ prompt_tokens: 7, completion_tokens: 248, total_tokens: 255 }

HTTP

Sample code

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "glm-5",
    "messages": [
        {
            "role": "user", 
            "content": "Who are you"
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "enable_thinking": true
}'

DashScope

Python

Sample code

import os
from dashscope import Generation

# Initialize the request parameters
messages = [{"role": "user", "content": "Who are you?"}]

completion = Generation.call(
    # If you have not set the environment variable, replace the value with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="glm-5",
    messages=messages,
    result_format="message",  # Set the result format to message
    enable_thinking=True,     # Enable thinking mode
    stream=True,              # Enable streaming output
    incremental_output=True,  # Enable incremental output
)

reasoning_content = ""  # Full thinking process
answer_content = ""     # Full response
is_answering = False    # Whether the response phase has started

print("\n" + "=" * 20 + "Thought Process" + "=" * 20 + "\n")

for chunk in completion:
    message = chunk.output.choices[0].message
    # Print the thinking content as it streams and accumulate it
    if "reasoning_content" in message:
        if not is_answering:
            print(message.reasoning_content, end="", flush=True)
        reasoning_content += message.reasoning_content

    # Once content arrives, the response phase has started
    if message.content:
        if not is_answering:
            print("\n" + "=" * 20 + "Complete Response" + "=" * 20 + "\n")
            is_answering = True
        print(message.content, end="", flush=True)
        answer_content += message.content

print("\n" + "=" * 20 + "Token Usage" + "=" * 20 + "\n")
print(chunk.usage)

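Response

====================Thought Process====================

Let me think carefully about the user's question, "Who are you?" First, I need to analyze the user's intent. This may be the curiosity of a first-time user, or they may want to know about my specific features and capabilities.

From a professional standpoint, I should state my identity as a GLM large language model clearly and explain my basic positioning and main capabilities. I should avoid overly technical explanations and keep things easy to understand.

At the same time, I need to consider practical concerns the user may have, such as privacy protection and data security. These are points users care about a great deal when using AI services.

In addition, to show professionalism and friendliness, I can actively guide the conversation by asking after the introduction whether the user needs specific help. This helps the user understand me better and sets a direction for the rest of the conversation.

Finally, I need to make sure the answer is concise and clear, with the key points emphasized, so that the user can quickly understand my identity and purpose. Such an answer satisfies the user's curiosity and shows professionalism and a service-oriented attitude.
====================Complete Response====================

I am a GLM large language model developed by Zhipu AI, designed to provide information and assistance to users through natural language processing technology. I was trained on a large amount of text data and can understand and generate human language, answer questions, provide knowledge support, and take part in conversations.

My design goal is to be a helpful AI assistant while protecting user privacy and data security. I do not store users' personal information, and I keep learning and improving to provide higher-quality service.

Is there a question I can answer or a task I can help you with?
====================Token Usage====================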

{"input_tokens": 8, "output_tokens": 269, "total_tokens": 277}

Java

Sample code

Important

Use DashScope Java SDK version 2.19.4 or later.

// The DashScope SDK version must be 2.19.4 or later.
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import java.lang.System;
import java.util.Arrays;

public class Main {
    private static StringBuilder reasoningContent = new StringBuilder();
    private static StringBuilder finalContent = new StringBuilder();
    private static boolean isFirstPrint = true;
    private static void handleGenerationResult(GenerationResult message) {
        String reasoning = message.getOutput().getChoices().get(0).getMessage().getReasoningContent();
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();
        if (reasoning != null && !reasoning.isEmpty()) {
            reasoningContent.append(reasoning);
            if (isFirstPrint) {
                System.out.println("====================Thought Process====================");
                isFirstPrint = false;
            }
            System.out.print(reasoning);
        }
        if (content != null && !content.isEmpty()) {
            finalContent.append(content);
            if (!isFirstPrint) {
                System.out.println("\n====================Complete Response====================");
                isFirstPrint = true;
            }
            System.out.print(content);
        }
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If you have not set the environment variable, replace the following line with .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("glm-5")
                .incrementalOutput(true)
                .resultFormat("message")
                .messages(Arrays.asList(userMsg))
                .build();
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }
    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.err.println("An exception occurred: " + e.getMessage());
        }
    }
}

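Response

====================Thought Process====================
Let me think about how to answer the user's question. First, this is a simple question about identity, and it calls for a clear and direct answer.

As a large language model, I need to state my basic identity information accurately. This includes:
- Name: GLM
- Developer: Zhipu AI
- Main capabilities: language understanding and generation

Considering that the question may come from a first interaction, I should avoid overly technical terms and introduce myself in an easy-to-understand way. At the same time, I should briefly describe my main abilities so that the user better understands how to interact with me.

I should also express a friendly and open attitude and welcome all kinds of questions, laying a good foundation for the rest of the conversation. However, the introduction should be concise and clear and not too detailed, so that it does not overwhelm the user with information.

Finally, to encourage further communication and better meet the user's actual needs, I can actively ask whether they need specific help.
====================Complete Response====================
I am GLM, a large language model developed by Zhipu AI. I was trained on a large amount of text data and can understand and generate human language, answer questions, provide information, and take part in conversations.

I am designed to help users solve problems, provide knowledge, and support a variety of language tasks. I will continue to learn and update in order to provide more accurate and useful answers.

Is there a question I can answer or something we can discuss?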

HTTP

Sample code

curl

curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "glm-5",
    "input":{
        "messages":[      
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters":{
        "enable_thinking": true,
        "incremental_output": true,
        "result_format": "message"
    }
}'

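Streaming tool calls

The glm-5, glm-4.7, and glm-4.6 models support the tool_stream parameter. This Boolean parameter defaults to false and takes effect only when stream is set to true. When it is enabled, the `arguments` field of a function-calling `tool_call` response is returned incrementally in the stream instead of all at once after generation completes.

stream and tool_stream interact as follows.

stream	tool_stream	How tool_call is returned
true	true	Arguments are returned incrementally across multiple chunks.
true	false (default)	Arguments are returned complete in a single chunk.
false	true/false	tool_stream has no effect. Arguments are returned all at once in the complete response.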
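When stream and tool_stream are both true, a client has to reassemble the argument fragments before it can parse them. The sketch below shows one way to do that on plain dictionaries. The fragment shape (an `index` field, with `id` and `name` present only on the first fragment) follows the usual OpenAI streaming convention and is an assumption about this service. The helper name `accumulate_tool_calls` and the sample fragments are hand-written for illustration and are not real model output.

```python
import json
from typing import Any


def accumulate_tool_calls(deltas: list[list[dict[str, Any]]]) -> dict[int, dict[str, str]]:
    """Merge incremental tool_call fragments, keyed by each fragment's index."""
    calls: dict[int, dict[str, str]] = {}
    for tool_calls in deltas:
        for tc in tool_calls:
            call = calls.setdefault(tc.get("index", 0), {"id": "", "name": "", "arguments": ""})
            fn = tc.get("function") or {}
            # id and name are assumed to arrive once; keep the first non-empty value.
            call["id"] = call["id"] or (tc.get("id") or "")
            call["name"] = call["name"] or (fn.get("name") or "")
            # Argument text arrives in pieces and is concatenated in order.
            call["arguments"] += fn.get("arguments") or ""
    return calls


# Hand-written fragments that imitate a stream with tool_stream set to true.
sample = [
    [{"index": 0, "id": "call_1", "function": {"name": "get_weather", "arguments": ""}}],
    [{"index": 0, "function": {"arguments": "{\"city\": "}}],
    [{"index": 0, "function": {"arguments": "\"Beijing\"}"}}],
]
merged = accumulate_tool_calls(sample)
# Parse only after the stream has finished and the JSON text is complete.
print(merged[0]["name"], json.loads(merged[0]["arguments"]))
```

Parsing is deferred until the stream ends because an individual fragment is generally not valid JSON on its own.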

OpenAI compatible

Python

Sample code

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather information for a specified city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "The name of the city"}
                },
                "required": ["city"]
            }
        }
    }
]

messages = [{"role": "user", "content": "What is the weather like in Beijing"}]

completion = client.chat.completions.create(
    model="glm-5",
    tools=tools,
    messages=messages,
    extra_body={
        "tool_stream": True,
    },
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in completion:
    if chunk.choices:
        delta = chunk.choices[0].delta
        if hasattr(delta, 'content') and delta.content:
            print(f"[content] {delta.content}")
        if hasattr(delta, 'tool_calls') and delta.tool_calls:
            for tc in delta.tool_calls:
                print(f"[tool_call] id={tc.id}, name={tc.function.name}, args={tc.function.arguments}")
        if chunk.choices[0].finish_reason:
            print(f"[finish_reason] {chunk.choices[0].finish_reason}")
    if not chunk.choices and chunk.usage:
        print(f"[usage] {chunk.usage}")

Node.js

Sample code

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
});

const tools = [
    {
        type: "function",
        function: {
            name: "get_weather",
            description: "Get weather information for a specified city",
            parameters: {
                type: "object",
                properties: {
                    city: { type: "string", description: "The name of the city" }
                },
                required: ["city"]
            }
        }
    }
];

async function main() {
    try {
        const stream = await openai.chat.completions.create({
            model: 'glm-5',
            messages: [{ role: 'user', content: 'What is the weather like in Beijing' }],
            tools: tools,
            tool_stream: true,
            stream: true,
            stream_options: {
                include_usage: true
            },
        });

        for await (const chunk of stream) {
            if (!chunk.choices?.length) {
                if (chunk.usage) {
                    console.log(`[usage] ${JSON.stringify(chunk.usage)}`);
                }
                continue;
            }

            const delta = chunk.choices[0].delta;

            if (delta.content) {
                console.log(`[content] ${delta.content}`);
            }

            if (delta.tool_calls) {
                for (const tc of delta.tool_calls) {
                    console.log(`[tool_call] id=${tc.id}, name=${tc.function.name}, args=${tc.function.arguments}`);
                }
            }

            if (chunk.choices[0].finish_reason) {
                console.log(`[finish_reason] ${chunk.choices[0].finish_reason}`);
            }
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();

HTTP

Sample code

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "glm-5",
    "messages": [
        {
            "role": "user",
            "content": "What is the weather like in Beijing"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get weather information for a specified city",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "The name of the city"}
                    },
                    "required": ["city"]
                }
            }
        }
    ],
    "stream": true,
    "stream_options": {"include_usage": true},
    "tool_stream": true
}'

DashScope

Python

Sample code

import os
from dashscope import Generation

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather information for a specified city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "The name of the city"}
                },
                "required": ["city"]
            }
        }
    }
]

messages = [{"role": "user", "content": "What is the weather like in Beijing"}]

completion = Generation.call(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="glm-5",
    messages=messages,
    tools=tools,
    result_format="message",
    stream=True,
    tool_stream=True,
    incremental_output=True,
)

for chunk in completion:
    msg = chunk.output.choices[0].message
    if msg.content:
        print(f"[content] {msg.content}")
    if "tool_calls" in msg and msg.tool_calls:
        for tc in msg.tool_calls:
            fn = tc.get("function", {})
            print(f"[tool_call] id={tc.get('id','')}, name={fn.get('name','')}, args={fn.get('arguments','')}")
    finish = chunk.output.choices[0].get("finish_reason", "")
    if finish and finish != "null":
        print(f"[finish_reason] {finish}")

HTTP

Sample code

curl

curl -X POST "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "glm-5",
    "input": {
        "messages": [
            {
                "role": "user",
                "content": "What is the weather like in Beijing"
            }
        ]
    },
    "parameters": {
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get weather information for a specified city",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "city": {"type": "string", "description": "The name of the city"}
                        },
                        "required": ["city"]
                    }
                }
            }
        ],
        "tool_stream": true,
        "incremental_output": true,
        "result_format": "message"
    }
}'

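Model features

Model	Multi-turn conversation	Function calling	Structured output	Web search	Partial mode	Context cache
glm-5	Supported	Supported	Supported (non-thinking mode only)	Not supported	Not supported	Supported (currently implicit cache only)
glm-4.7	Supported	Supported	Supported (non-thinking mode only)	Not supported	Not supported	Not supported
glm-4.6	Supported	Supported	Supported (non-thinking mode only)	Not supported	Not supported	Not supported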
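Because structured output is available only in non-thinking mode and enable_thinking defaults to true for these models, a request for structured output needs thinking mode turned off explicitly. The sketch below is a hypothetical client-side guard that catches the mismatch before a request is sent. It assumes that structured output is requested through the OpenAI-style `response_format` argument, which this page does not confirm, and the function names are illustrative only.

```python
from typing import Any


def thinking_enabled(kwargs: dict[str, Any]) -> bool:
    """Return the effective enable_thinking value for a set of request kwargs."""
    extra_body = kwargs.get("extra_body") or {}
    # The GLM models listed on this page default to enable_thinking=true.
    return bool(extra_body.get("enable_thinking", True))


def check_structured_output(kwargs: dict[str, Any]) -> None:
    """Raise ValueError if structured output is requested while thinking is on."""
    # Assumption: structured output is requested through response_format.
    if "response_format" in kwargs and thinking_enabled(kwargs):
        raise ValueError("Structured output requires enable_thinking=False")


request = {
    "model": "glm-5",
    "messages": [{"role": "user", "content": "Reply in JSON: who are you?"}],
    "response_format": {"type": "json_object"},
    "extra_body": {"enable_thinking": False},
}
check_structured_output(request)  # No exception: thinking mode is off.
```

Removing the `extra_body` entry from `request` would make the guard raise, since the default then applies.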

Default parameter values

Model	enable_thinking	temperature	top_p	top_k	repetition_penalty
glm-5	true	1.0	0.95	20	1.0
glm-4.7	true	1.0	0.95	20	1.0
glm-4.6	true	1.0	0.95	20	1.0

Billing

Billing is based on the number of input tokens and output tokens that the model uses. For pricing details, see "GLM".

In thinking mode, chain-of-thought output is billed as output tokens.
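The billing rule can be sketched as a simple calculation over the token usage that a call reports. The prices below are placeholders and not actual GLM pricing, the per-1,000-token unit is an assumption, and the sketch also assumes that the reported output token count already includes chain-of-thought tokens in thinking mode.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Estimate the charge for one call from its token usage (hypothetical helper)."""
    input_cost = (input_tokens / 1000) * input_price_per_1k
    # In thinking mode, chain-of-thought tokens are billed at the output rate.
    output_cost = (output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Placeholder prices, not real GLM pricing.
cost = estimate_cost(input_tokens=1000, output_tokens=2000,
                     input_price_per_1k=0.5, output_price_per_1k=2.0)
print(cost)
```

Because the thinking process counts toward output tokens, the same prompt can cost more in thinking mode than in non-thinking mode.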

FAQ

Q: How do I configure Dify?

A: GLM series models on Alibaba Cloud Model Studio cannot currently be integrated with Dify. Use Qwen3 models through the Qwen card instead. For details, see "Dify".

Error codes

If an error occurs during execution, see "Error messages" to troubleshoot.