Topik ini menjelaskan cara membangun agen operasi Kubernetes dengan cepat menggunakan kagent, ACK Gateway, dan ACK MCP Server.
Pengenalan kagent
kagent adalah framework pemrograman agen open-source yang dirancang untuk lingkungan cloud-native. Framework ini mengintegrasikan kemampuan AI agent dengan rantai alat (toolchain), memungkinkan agen menangani tugas kompleks multi-langkah melalui interaksi bahasa alami dan mengubah insight AI menjadi operasi spesifik.
Keunggulan utama kagent
Kemampuan inferensi tingkat lanjut: Berbeda dengan chatbot tradisional, kagent menggunakan inferensi tingkat lanjut dan perencanaan iteratif untuk menangani masalah kompleks multi-langkah secara otonom.
Integrasi tool yang fleksibel: Mendukung integrasi dengan tool MCP, sehingga agen dapat berinteraksi dengan berbagai sistem dan layanan untuk memperluas kemampuannya.
Arsitektur yang dapat diperluas: Dibangun di atas framework Google Agent Development Kit (ADK), menyediakan berbagai opsi kustomisasi serta mendukung eksekusi agen melalui antarmuka pengguna (UI) atau secara deklaratif.
Mode kolaborasi tim: Agen dapat dikelompokkan dalam tim, di mana agen perencana membuat rencana dan menugaskan tugas kepada agen individual dalam tim tersebut.
Pemrosesan tugas tujuan umum: Cocok untuk otomatisasi tugas dalam berbagai skenario, termasuk diagnosis masalah kompleks, analitik data, dan operasi sistem.
Persiapan
Buat namespace kagent di kluster ACK Anda.
Instal aplikasi kagent-crds dan kagent di namespace kagent. Anda dapat menginstalnya dari ACK Marketplace atau melalui .
Di kluster ACK Anda, buka halaman Add-ons untuk menginstal komponen Gateway API dan enable the experimental channel feature.
Di kluster ACK Anda, buka Add-ons dan instal komponen Gateway with Inference Extension.
Aktifkan Alibaba Cloud Model Studio dan dapatkan Kunci API.
Langkah 1: Deploy ACK MCP Server
ACK MCP Server memerlukan izin read-only berikut.
{ "Version": "1", "Statement": [ { "Effect": "Allow", "Action": [ "cs:Check*", "cs:Describe*", "cs:Get*", "cs:List*", "cs:Query*", "cs:RunClusterCheck", "cs:RunClusterInspect" ], "Resource": "*" }, { "Effect": "Allow", "Action": "arms:GetPrometheusInstance", "Resource": "*" }, { "Effect": "Allow", "Action": [ "log:Describe*", "log:Get*", "log:List*" ], "Resource": "*" } ] }Instal ACK MCP Server. Untuk informasi selengkapnya, lihat Deploy and run ack-mcp-server.
Setelah instalasi selesai, Anda dapat menjalankan perintah
kubectl get --raw "/api/v1/namespaces/kube-system/services/ack-mcp-server/proxy/sse" --v=10untuk memverifikasi keberhasilan instalasi.Deklarasikan ACK MCP Server di kluster.
kubectl apply -f - <<EOF apiVersion: kagent.dev/v1alpha2 kind: RemoteMCPServer metadata: name: ack-mcp-tool-server namespace: kagent spec: description: Official ACK tool server protocol: SSE sseReadTimeout: 5m0s terminateOnClose: true timeout: 30s # ACK MCP Server diinstal di namespace kube-system secara default. Jika Anda beralih ke namespace lain, ubah URL di sini. url: http://ack-mcp-server.kube-system:8000/sse EOFPeriksa status resource RemoteMCPServer untuk mendapatkan tool ACK MCP.
kubectl describe RemoteMCPServer ack-mcp-tool-server -n kagentOutput yang diharapkan:
... status: conditions: - lastTransitionTime: "2025-XX-XXT11:35:29Z" message: "" observedGeneration: 2 reason: Reconciled status: "True" type: Accepted discoveredTools: - description: Gets a list of all ACK clusters in all regions. By default, it returns a maximum of 10 clusters. name: list_clusters - description: Execute kubectl command with intelligent context management. Supports cluster_id for automatic context switching and creation. name: ack_kubectl - description: Queries the Alibaba Cloud Prometheus data of an ACK cluster. name: query_prometheus - description: Gets Prometheus metric definitions and best practices. name: query_prometheus_metric_guidance - description: "Diagnoses Kubernetes resources in an ACK cluster. Use this tool for in-depth diagnosis when you encounter problems that are difficult to locate. The supported resources include the following: \n1. **node**: K8s node\n2. **ingress**: Ingress\n3. **memory**: Node memory\n4. **pod**: Pod\n5. **service**: Service\n6. **network**: Network connectivity\n " name: diagnose_resource - description: Generates and queries the latest health inspection report for an ACK cluster. name: query_inspect_report - description: |- Query Kubernetes (k8s) audit logs. Function Description: - Supports multiple time formats (ISO 8601 and relative time). - Supports suffix wildcards for namespace, resource name, and user. - Supports multiple values for verbs and resource types. - Supports both full names and short names for resource types. - Allows specifying the cluster name to query audit logs from multiple clusters. - Provides detailed parameter validation and error messages. Usage Suggestions: - You can use the list_clusters() tool to view available clusters and their IDs. - By default, it queries the audit logs for the last 24 hours. The number of returned records is limited to 10 by default. name: query_audit_log - description: Gets the current time and returns it in ISO 8601 format and UNIX timestamp format. name: get_current_time - description: Queries the logs of control plane components in an ACK cluster. First, query the control plane log configuration to verify whether the component is enabled, and then query the corresponding SLS logs. name: query_controlplane_logs observedGeneration: 2
Langkah 2: Deploy gerbang dan konfigurasikan layanan model Model Studio
Buat gerbang.
kubectl apply -f- <<EOF apiVersion: gateway.networking.k8s.io/v1 kind: Gateway metadata: name: model-gateway namespace: kagent spec: gatewayClassName: ack-gateway infrastructure: parametersRef: group: gateway.envoyproxy.io kind: EnvoyProxy name: custom-proxy-config listeners: - name: http-bailian protocol: HTTP port: 8080 --- apiVersion: gateway.envoyproxy.io/v1alpha1 kind: EnvoyProxy metadata: name: custom-proxy-config namespace: kagent spec: provider: type: Kubernetes kubernetes: envoyService: type: ClusterIP EOFBuat backend Model Studio.
kubectl apply -f- <<EOF apiVersion: gateway.envoyproxy.io/v1alpha1 kind: Backend metadata: name: bailian namespace: kagent spec: endpoints: - fqdn: hostname: dashscope-intl.aliyuncs.com port: 443 --- apiVersion: gateway.networking.k8s.io/v1alpha3 kind: BackendTLSPolicy metadata: name: bailian-tls namespace: kagent spec: targetRefs: - group: gateway.envoyproxy.io kind: Backend name: bailian validation: hostname: dashscope-intl.aliyuncs.com wellKnownCACertificates: System EOFBuat aturan routing untuk mengarahkan permintaan tertentu ke backend Model Studio.
kubectl apply -f- <<EOF apiVersion: gateway.networking.k8s.io/v1 kind: HTTPRoute metadata: name: bailian-route namespace: kagent spec: parentRefs: - name: model-gateway rules: - backendRefs: - group: gateway.envoyproxy.io kind: Backend name: bailian filters: - type: URLRewrite urlRewrite: hostname: dashscope-intl.aliyuncs.com path: type: ReplacePrefixMatch replacePrefixMatch: /compatible-mode/v1 matches: - path: type: PathPrefix value: /v1 timeouts: backendRequest: 10m request: 10m EOF
Langkah 3: Gunakan ACK Gateway untuk mengelola Kunci API layanan Model Studio
Saat mengakses layanan model besar eksternal, Anda biasanya perlu menggunakan Kunci API untuk otorisasi. ACK Gateway mendukung injeksi dinamis Kunci API ke dalam permintaan, memungkinkan pengelolaan terpusat semua Kunci API untuk layanan model. Hal ini mengurangi kompleksitas maintenance dan meningkatkan keamanan kluster.
Buat Secret untuk menyimpan Kunci API layanan Model Studio.
export PROVIDER_API_KEY=${your_Model_Studio_API_key} kubectl create secret generic bailian-credential -n kagent --from-literal credential="Bearer $PROVIDER_API_KEY"Buat resource HTTPRouteFilter yang mereferensikan Secret ini.
kubectl apply -f- <<EOF apiVersion: gateway.envoyproxy.io/v1alpha1 kind: HTTPRouteFilter metadata: name: credential-injection namespace: kagent spec: credentialInjection: overwrite: true credential: valueRef: name: bailian-credential EOFModifikasi resource HTTPRoute untuk mengaktifkan injeksi Kunci API otomatis.
kubectl apply -f- <<EOF apiVersion: gateway.networking.k8s.io/v1 kind: HTTPRoute metadata: name: bailian-route namespace: kagent spec: parentRefs: - name: model-gateway rules: - backendRefs: - group: gateway.envoyproxy.io kind: Backend name: bailian filters: - type: URLRewrite urlRewrite: hostname: dashscope-intl.aliyuncs.com path: type: ReplacePrefixMatch replacePrefixMatch: /compatible-mode/v1 # Ini adalah bagian utama yang ditambahkan - type: ExtensionRef extensionRef: group: gateway.envoyproxy.io kind: HTTPRouteFilter name: credential-injection timeouts: backendRequest: 10m request: 10m matches: - path: type: PathPrefix value: /v1 EOF
Langkah 4: Konfigurasikan ModelConfig untuk Model Studio
Dapatkan alamat gerbang.
export GATEWAY_HOST=$(kubectl -n kagent get gateway/model-gateway -o jsonpath='{.status.addresses[0].value}') echo $GATEWAY_HOSTBuat ModelConfig berikut.
kubectl apply -f - <<EOF apiVersion: kagent.dev/v1alpha2 kind: ModelConfig metadata: name: my-provider-config namespace: kagent spec: model: qwen-plus openAI: baseUrl: http://$GATEWAY_HOST:8080/v1 provider: OpenAI EOF
Langkah 5: Buat agen
Definisikan agen menggunakan YAML berikut.
kubectl apply -f - <<EOF apiVersion: kagent.dev/v1alpha2 kind: Agent metadata: name: my-ack-ops-agent namespace: kagent spec: declarative: deployment: env: - name: OPENAI_API_KEY value: placeholder replicas: 1 modelConfig: my-provider-config stream: true systemMessage: |- # Role You are a professional ACK (Alibaba Cloud Container Service for Kubernetes) intelligent assistant. Your task is to accurately understand user requests about clusters and select the most appropriate tools to perform queries, diagnostics, or analysis. # Core Instructions 1. **Confirm the Target - The First Principle**: * Before performing any operation, you must first confirm the cluster_id the user wants to operate on. * If the user's query does not provide it, you **must** first call the list_clusters tool and ask the user which cluster they want to operate on. 2. **Tool Selection Strategy (by priority)**: * **Complex Fault Diagnosis**: When encountering complex issues such as pod abnormalities, network failures, or NotReady nodes, **prioritize using diagnose_resource**. * **Performance Metric Queries**: When the issue involves "high/low CPU/memory", "fast/slow", or "how much usage", **prioritize using query_prometheus**. * **Security and Change Audits**: When the issue is about "who did what and when", **prioritize using query_audit_log**. * **Overall Cluster Health**: When the user wants to know "if the cluster is healthy" or wants a "diagnostics report", **use query_inspect_report**. * **Control Plane Issues**: When you suspect a problem with Kubernetes system components such as the API Server or Scheduler, **use query_controlplane_logs**. * **General Queries**: For all other standard, explicit Kubernetes resource queries (such as get pods, describe service, logs <pod>), **use ack_kubectl as the default tool**. 3. **Security Red Lines**: * Your primary responsibility is to query and diagnose. For any operation performed through ack_kubectl that **may modify the cluster state** (such as apply, delete, or creating a temporary pod for diagnosis), you **must** first explain to the user the command you will execute and its purpose, and only proceed after receiving **explicit authorization from the user**. 4. **Code of Conduct**: * If the user's question is unclear, ask for clarification before acting. * Respond in a friendly and enthusiastic manner. * If you still cannot find the answer after using the tools, **never invent one**. Honestly reply: "Sorry, I cannot locate the problem with the available tools," and you can provide the findings you have. # Response Format * **Always use Markdown format**. * Your response must include a **summary of your actions** and an **analysis and recommendations** based on the results. --- ### Summary *(Summarize what you did and your key findings in one sentence.)* tools: - mcpServer: apiGroup: kagent.dev kind: RemoteMCPServer name: ack-mcp-tool-server toolNames: - list_clusters - ack_kubectl - query_prometheus - query_prometheus_metric_guidance - diagnose_resource - query_inspect_report - query_audit_log - get_current_time - query_controlplane_logs type: McpServer description: This agent can interact with ACK MCP Tools to get cluster information and operate the cluster. type: Declarative EOFKonfirmasi status pembuatan agen.
kubectl get pod -n kagentOutput yang diharapkan:
NAME READY STATUS RESTARTS AGE my-ack-ops-agent-66b74675fc-rqwwx 1/1 Running 0 2m6s ...
Langkah 6: Gunakan agen melalui UI
kagent menyediakan UI web default untuk berinteraksi langsung dengan agen.
Teruskan layanan kagent-ui ke mesin lokal Anda menggunakan Penerusan port.
kubectl port-forward -n kagent service/kagent-ui 8082:8080Buka browser dan akses agen di
localhost:8082.Contoh Tanya Jawab (Q&A): Gunakan Prometheus untuk melihat metrik pod di namespace kagent kluster saat ini.

