CloudMonitor helps reduce the O&M costs and workloads of cloud services. It provides real-time operational data that you can use to identify risks early, troubleshoot issues, and prevent potential losses. When an issue occurs, CloudMonitor immediately sends you an alert notification so that you can restore your business quickly.
- The CloudMonitor agent is running on the Elastic Compute Service (ECS) instances that you want to monitor and is able to collect metric data. If the CloudMonitor agent is not installed on the instances, manually install it. For more information, see Install the CloudMonitor agent.
- Alert contacts and contact groups are added. We recommend that you add at least two contacts to ensure real-time responses to monitoring alerts. For more information about monitoring metrics, see Alert service and Overview.
The dashboard feature of CloudMonitor provides system-wide visibility into resource utilization and operational health. In this topic, the CPU utilization, memory usage, and disk usage of ECS instances are displayed separately, and the four metrics of ApsaraDB RDS instances are displayed in two groups.
Configure alert thresholds and alert rules
We recommend that you configure suitable alert thresholds for monitoring metrics based on your business requirements. A threshold that is too low triggers alerts frequently and affects user experience. A threshold that is too high may leave you too little time to respond to an event.
- Configure alert rules for RDS instances
We recommend that you configure alert rules for RDS instances based on your requirements. For example, you can set the alert threshold for the CPU utilization of RDS instances to 70% and trigger an alert when the threshold is exceeded three consecutive times. You can also configure alert thresholds for disk usage, IOPS utilization, and the total number of connections. For information about how to view monitoring metrics, see Cloud service monitoring.
- Configure alert rules for SLB instances
Before you use CloudMonitor for SLB instances, make sure that health checks are enabled for your SLB instances. You can set the alert threshold for the bandwidth of SLB instances to 7 Mbit/s, as shown in the following figure.
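The RDS rule above (alert when CPU utilization exceeds 70% three consecutive times) and the SLB rule (bandwidth above 7 Mbit/s) follow the same pattern: a threshold plus a number of consecutive breaches. The following Python sketch illustrates that pattern only. The function name `evaluate_rule` and the sample values are hypothetical, and the sketch assumes that an alert fires once when the breach streak reaches the configured count. It is not the CloudMonitor implementation or API.

```python
from typing import Iterable, List


def evaluate_rule(samples: Iterable[float], threshold: float, consecutive: int) -> List[int]:
    """Return the indexes of the samples at which an alert would fire.

    Assumption: an alert fires once when the number of consecutive samples
    above the threshold reaches `consecutive`, and cannot fire again until
    a sample falls back to or below the threshold.
    """
    fired = []
    streak = 0
    for index, value in enumerate(samples):
        if value > threshold:
            streak += 1
        else:
            streak = 0  # the breach run is broken, start counting again
        if streak == consecutive:
            fired.append(index)
    return fired


# Hypothetical RDS CPU utilization samples (percent), one per check period.
cpu_percent = [65, 72, 75, 68, 71, 74, 80, 82, 60]
print(evaluate_rule(cpu_percent, threshold=70, consecutive=3))

# Hypothetical SLB bandwidth samples (Mbit/s), alerting on the first breach.
bandwidth_mbps = [5.0, 7.5, 6.9]
print(evaluate_rule(bandwidth_mbps, threshold=7, consecutive=1))
```

Requiring several consecutive breaches filters out short spikes, such as the two-sample spike at the start of the CPU series, at the cost of a later notification.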
Configure process monitoring
For web applications, you can configure process monitoring to monitor application processes in real time and use monitoring data to troubleshoot issues. For more information, see Configure process monitoring.
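As a rough illustration of what a process check measures, the following Python sketch counts the running processes whose command line contains a keyword. It is not how the CloudMonitor agent is implemented. The function names and the keyword `nginx` are hypothetical, and `list_command_lines` assumes a Unix-like system that provides the `ps` command.

```python
import subprocess
from typing import Iterable, List


def list_command_lines() -> List[str]:
    """Return the command line of every running process (assumes `ps` exists)."""
    output = subprocess.run(
        ["ps", "-e", "-o", "args="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [line.strip() for line in output.splitlines() if line.strip()]


def count_matching(command_lines: Iterable[str], keyword: str) -> int:
    """Count the command lines that contain the keyword."""
    return sum(1 for command in command_lines if keyword in command)


if __name__ == "__main__":
    # Hypothetical check: a web application is expected to run nginx processes.
    running = count_matching(list_command_lines(), "nginx")
    print(f"nginx processes: {running}")
```

A count of zero for a process that should always be running is the kind of condition that an alert rule on process monitoring data would typically cover.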
Configure site monitoring
Site monitoring is an external monitoring service for ECS instances. It simulates real user access scenarios and tests business availability in real time. The monitoring data can also be used to troubleshoot issues.
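The idea of simulating user access can be sketched as a single HTTP probe that records the status code and the response time. This is a minimal illustration, not the site monitoring service itself, which runs from external detection points. The names `ProbeResult` and `probe`, the placeholder URL, and the rule that status codes from 200 to 399 count as available are all assumptions.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    url: str
    status: Optional[int]  # None when no HTTP response was received
    elapsed_ms: float
    available: bool


def probe(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> ProbeResult:
    """Send one GET request and report whether the site looks available.

    `opener` is injectable so that the function can be exercised without
    network access. Assumption: status codes 200-399 mean "available".
    """
    start = time.monotonic()
    status: Optional[int] = None
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code  # the server answered, but with an error status
    except OSError:
        status = None  # DNS failure, refused connection, timeout, and so on
    elapsed_ms = (time.monotonic() - start) * 1000
    available = status is not None and 200 <= status < 400
    return ProbeResult(url, status, elapsed_ms, available)


if __name__ == "__main__":
    # Placeholder URL: replace it with the address of your own site.
    print(probe("http://example.com/"))
```

Running such a probe on a schedule and from several networks gives availability and response-time series that can be compared with the instance metrics when you troubleshoot an issue.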
If the preceding monitoring metrics do not meet your requirements, you can use the custom monitoring feature. For more information, see Overview.