Resource monitoring is one of the most commonly used monitoring methods in Kubernetes. You can use resource monitoring to check the resource usage of workloads. The resources include CPU, memory, and network resources. Container Service for Kubernetes (ACK) integrates Cloud Monitor to provide resource monitoring features. By default, ACK installs the Cloud Monitor agent for new clusters. This topic describes how to monitor basic resources and configure alerts by using the ACK console.
Prerequisites
Features
- Provides comprehensive metrics to help you gain insight into cluster performance.
- Improves monitoring and alerting capabilities.
Upgrades Cloud Monitor to the latest version to provide professional container resource monitoring capabilities. Provides monitoring metrics for native Kubernetes objects, such as namespaces, nodes, workloads, and pods. Upgrades the alerting feature so that you can configure alert rules from different perspectives.
- Provides appropriate metrics for different monitoring scenarios.
Supports the most appropriate metrics for different scenarios, such as the host infrastructure layer, the container layer in Platform as a Service (PaaS), and the Kubernetes scheduling layer. For example, the container memory metrics that affect Kubernetes scheduling reflect the working memory of containers. This helps distinguish container memory usage from host memory usage.
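The difference between container working memory and raw memory usage can be illustrated with a minimal sketch. It assumes that "working memory" follows the common cAdvisor convention (total cgroup usage minus inactive file cache, floored at zero); this topic does not confirm how ACK computes the metric, and the function name and sample values are hypothetical.

```python
def working_set_bytes(usage_bytes: int, inactive_file_bytes: int) -> int:
    """Estimate container working memory.

    Assumption: working memory = total cgroup usage minus inactive file
    cache, never below zero (the convention used by cAdvisor).
    """
    return max(usage_bytes - inactive_file_bytes, 0)


MIB = 1024 * 1024

# Hypothetical sample values for a single container.
usage = 800 * MIB          # memory charged to the container's cgroup
inactive_file = 300 * MIB  # page cache that the kernel can reclaim

working_set = working_set_bytes(usage, inactive_file)
print(working_set // MIB)  # prints 500
```

Under this assumption, reclaimable page cache is excluded from the value, which is why the container-level figure can be lower than the memory usage observed on the host.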
Go to Resource Monitoring
Method 1: Go to Resource Monitoring by using the ACK console
Method 2: Go to Resource Monitoring by using the Cloud Monitor console
Configure alerts based on scenarios
Scenario | Description | How to configure |
---|---|---|
Monitor the health status of the cluster and send alerts on resource usage exceptions in the cluster or nodes. | When resource usage exceptions occur in the cluster or nodes, alerts need to be sent at the earliest opportunity to prevent service interruptions. We recommend that you configure alert rules to monitor the resource usage of the entire cluster or all nodes in the cluster. | When you create an alert rule, set Resource Range to Cluster or Node. This allows you to detect abnormal metrics in the entire cluster or any node in the cluster. If you set Resource Range to Node, make sure that you select All nodes. This triggers an alert when an abnormal value of the metric specified in Rule Description is detected in any node in the cluster. |
Monitor the resource usage of pods and send alerts on any pod in the cluster. | When a resource usage exception occurs in the cluster, the exception needs to be analyzed to find the pod that causes the problem. We recommend that you configure alert rules to monitor the resource usage of all pods in the cluster. | When you create an alert rule, set Resource Range to Container Group (pod) and set both Namespace and Container Group (pod) to All. This triggers an alert when an abnormal value of the metric specified in Rule Description is detected in any pod in the cluster. |
Monitor the cluster by namespace and send alerts on pods in a specified namespace in the cluster. | In most cases, a cluster is shared among multiple applications. Namespaces provide a commonly used method to isolate applications in a multi-tenant environment. When a resource usage exception occurs in an application of a specified namespace, alerts need to be sent at the earliest opportunity. We recommend that you configure alert rules to monitor the resource usage of all pods in a specified namespace in the cluster. | When you create an alert rule, set Resource Range to Container Group (pod), set Namespace to the one where your application belongs, and set Container Group (pod) to All. This triggers an alert when an abnormal value of the metric specified in Rule Description is detected in any pod in the specified namespace. |
Monitor the resource usage of applications and send alerts on pods of a specified workload in a specified namespace. | In most cases, a cluster is shared among multiple applications. Workloads provide a commonly used method to isolate applications in a multi-tenant environment. For example, an application may run as a Deployment. When a resource usage exception occurs in the Deployment of a specified application, alerts need to be sent at the earliest opportunity. We recommend that you configure alert rules to monitor the resource usage of all pods of a specified workload. | When you create an alert rule, set Resource Range to Container Group (pod), set Namespace to the one where your application belongs, and select the workload type of your application. The following workload types are supported: Deployment, StatefulSet, DaemonSet, Job, and CronJob. Set Container Group (pod) to All. This triggers an alert when an abnormal value of the metric specified in Rule Description is detected in any pod of the specified workload. |
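
The pod-level scenarios in the table above differ only in how narrowly the alert rule is scoped. The following sketch models that scoping logic in plain Python to show which pods each configuration would cover. It is an illustration only: `Pod`, `PodAlertScope`, and their fields are hypothetical names and do not correspond to a Cloud Monitor or ACK API.

```python
from dataclasses import dataclass
from typing import Optional

ALL = "All"  # mirrors the "All" option described in the table


@dataclass(frozen=True)
class Pod:
    name: str
    namespace: str
    workload_type: str  # for example "Deployment" or "StatefulSet"
    workload_name: str


@dataclass(frozen=True)
class PodAlertScope:
    """Hypothetical model of an alert rule whose Resource Range is a pod."""
    namespace: str = ALL
    workload_type: Optional[str] = None  # None means no workload filter
    workload_name: Optional[str] = None

    def covers(self, pod: Pod) -> bool:
        """Return True if the rule would evaluate metrics for this pod."""
        if self.namespace != ALL and pod.namespace != self.namespace:
            return False
        if self.workload_type is not None and pod.workload_type != self.workload_type:
            return False
        if self.workload_name is not None and pod.workload_name != self.workload_name:
            return False
        return True


# Hypothetical pods in a shared cluster.
pods = [
    Pod("web-1", "shop", "Deployment", "web"),
    Pod("db-0", "shop", "StatefulSet", "db"),
    Pod("log-x", "kube-system", "DaemonSet", "logger"),
]

all_pods = PodAlertScope()                          # any pod in the cluster
shop_only = PodAlertScope(namespace="shop")         # any pod in one namespace
web_only = PodAlertScope("shop", "Deployment", "web")  # pods of one workload

for scope in (all_pods, shop_only, web_only):
    print([p.name for p in pods if scope.covers(p)])
```

Each narrower scope covers a subset of the pods covered by the broader one, so the choice among these scenarios is a trade-off between coverage and how specific the resulting alerts are.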
Configure alert rules
Step 1: Create an alert contact and add it to an alert contact group
Step 2: Create an alert rule
Verification
Previous Resource Monitoring page
If the metrics-server component of your cluster is not upgraded to V0.3.8.5 or later, you can perform the following steps to go to the previous Resource Monitoring page: