Container Service for Kubernetes (ACK) allows you to diagnose nodes, pods, Services, Ingresses, and memory with a few clicks to identify issues in your ACK cluster. This topic describes how to use the cluster diagnostics feature to diagnose an ACK cluster.
Prerequisites
- An ACK managed cluster is created. For more information, see Create an ACK managed cluster.
- The cluster is running as expected. To verify this, log on to the ACK console, go to the Clusters page, and check that the cluster is in the Running state.
Introduction to cluster diagnostics
The following table describes the diagnostics features provided by ACK.
Category | Description |
---|---|
Node diagnostics | Diagnose node issues, such as Kubernetes nodes in the NotReady state. |
Pod diagnostics | Diagnose pod status issues, such as pod startup failures or frequent pod restarts. |
Service diagnostics | Diagnose Service issues related to Service configurations, resource quotas, and abnormal events. |
Ingress diagnostics | Diagnose Ingress-related issues in traffic routing configurations. |
Memory diagnostics | Diagnose node memory issues, such as memory leaks, cgroup leaks, and out of memory (OOM) errors. Diagnostic results can be visualized to display the overall memory usage. |
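
Before you choose a diagnostics category, it can help to check which nodes or pods show the symptoms listed in the preceding table. The following Python sketch is not part of the ACK diagnostics feature. It assumes that you have the parsed JSON output of `kubectl get nodes -o json` and `kubectl get pods -A -o json`. The function names, sample resource names, and restart threshold are illustrative assumptions.

```python
def not_ready_nodes(node_list: dict) -> list[str]:
    """Return the names of nodes whose Ready condition is not "True"."""
    names = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        # A node without a Ready condition is treated as not ready.
        if ready is None or ready.get("status") != "True":
            names.append(node["metadata"]["name"])
    return names


def restarting_pods(pod_list: dict, threshold: int = 5) -> list[tuple[str, str, int]]:
    """Return (namespace, name, restarts) for pods at or above the threshold.

    The threshold is an arbitrary illustrative value, not an ACK default.
    """
    result = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        # Sum restarts across all containers in the pod.
        restarts = sum(s.get("restartCount", 0) for s in statuses)
        if restarts >= threshold:
            meta = pod["metadata"]
            result.append((meta.get("namespace", "default"), meta["name"], restarts))
    return result


if __name__ == "__main__":
    # Hypothetical sample data in the shape returned by the Kubernetes API.
    sample_nodes = {
        "items": [
            {"metadata": {"name": "node-a"},
             "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
            {"metadata": {"name": "node-b"},
             "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}},
        ]
    }
    sample_pods = {
        "items": [
            {"metadata": {"namespace": "default", "name": "web-0"},
             "status": {"containerStatuses": [{"restartCount": 7}]}},
            {"metadata": {"namespace": "default", "name": "web-1"},
             "status": {"containerStatuses": [{"restartCount": 0}]}},
        ]
    }
    print("Candidates for node diagnostics:", not_ready_nodes(sample_nodes))
    print("Candidates for pod diagnostics:", restarting_pods(sample_pods))
```

To run the functions against a real cluster, save the kubectl output to a file and load it with `json.load` before you pass it to either function.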
Configure diagnostics
The procedures for configuring node, pod, Service, Ingress, and memory diagnostics are similar. The following section uses node diagnostics as an example to demonstrate how to configure the diagnostics features.
- Log on to the ACK console and click Clusters in the left-side navigation pane.
- On the Clusters page, click the name of the cluster that you want to diagnose. In the left-side navigation pane, choose .
- On the Diagnosis page, click Node diagnosis.
- In the Select node panel, specify Node name, read the warning, select I know and agree, and then click Create diagnosis. Wait until the Status column of the diagnostic report on the Diagnosis page displays Success.
View diagnostic results
Diagnostic item | Flag | Description |
---|---|---|
Node diagnostics | | Node diagnostics consist of the Node, NodeComponent, ClusterComponent, ECSControllerManager, and GPUNode diagnostic items. These diagnostic items help you identify node anomalies based on the status of nodes, node components, cluster components, and Elastic Compute Service (ECS) instances. On the diagnostic details page, you can view the node diagnostic results, repair suggestions, and diagnostic items. Diagnostic items with the Abnormal or Warning flag are displayed on the Troubleshoot tab. When a diagnostic item displays the Abnormal flag, you can move the pointer over Details in the Status column to view details about the issue. |
Pod diagnostics | | Pod diagnostics consist of the Pod, ClusterComponent, Node, NodeComponent, and ECSControllerManager diagnostic items. These diagnostic items help you identify pod anomalies based on the status of pods, cluster components, nodes, and ECS instances. On the diagnostic details page, you can view the pod diagnostic results, repair suggestions, and diagnostic items. Diagnostic items with the Abnormal or Warning flag are displayed on the Troubleshoot tab. When a diagnostic item displays the Abnormal flag, you can move the pointer over Details in the Status column to view details about the issue. |
Service diagnostics | | Service diagnostics consist of the Service and ResourceQuotas diagnostic items. These diagnostic items help you identify Service anomalies based on the billing method of Classic Load Balancer (CLB) instances, certificates, quotas, and abnormal events. Diagnostic items with the Abnormal or Warning flag are displayed on the Troubleshoot tab. When a diagnostic item displays the Abnormal flag, you can move the pointer over Details in the Status column to view details about the issue. |
Ingress diagnostics | | Ingress diagnostics consist of the Ingress, Addon, and SLB diagnostic items. These diagnostic items help you identify Ingress anomalies based on the status of Ingresses, Ingress plug-ins, and Server Load Balancer (SLB) instances. Diagnostic items with the Abnormal or Warning flag are displayed on the Troubleshoot tab. When a diagnostic item displays the Abnormal flag, you can move the pointer over Details in the Status column to view details about the issue. |
Memory diagnostics | None. | On the diagnostic details page, you can view diagnostic results in the Memory Overview, Memory Analysis, and OOM Analysis sections, including memory leaks, memory utilization, and memory occupied by each process. |
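
The rule that only diagnostic items with the Abnormal or Warning flag appear on the Troubleshoot tab can be expressed as a simple filter. The following Python sketch illustrates that rule only. The `DiagnosticItem` record, the `Normal` flag value, and the Abnormal-first ordering are assumptions made for the example and do not reflect an ACK API or the actual report format.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticItem:
    """Hypothetical record for one diagnostic item in a report."""
    name: str  # for example "Node" or "NodeComponent"
    flag: str  # "Normal", "Warning", or "Abnormal" (assumed encoding)


def troubleshoot_items(items: list[DiagnosticItem]) -> list[DiagnosticItem]:
    """Keep items flagged Abnormal or Warning, with Abnormal items first."""
    priority = {"Abnormal": 0, "Warning": 1}
    flagged = [item for item in items if item.flag in priority]
    # sorted() is stable, so items with the same flag keep their original order.
    return sorted(flagged, key=lambda item: priority[item.flag])


if __name__ == "__main__":
    report = [
        DiagnosticItem("Node", "Warning"),
        DiagnosticItem("NodeComponent", "Normal"),
        DiagnosticItem("ClusterComponent", "Abnormal"),
    ]
    for item in troubleshoot_items(report):
        print(f"{item.flag}: {item.name}")
```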