You can use CloudMonitor to monitor the metrics of Log Service. The metrics include write traffic, overall QPS, and service status. You can configure alert rules to monitor log collection and shard usage, and detect related exceptions.

Prerequisites

If you want to use a RAM user to view the metrics that are monitored by CloudMonitor, the RAM user must be granted read-only or read and write permissions on CloudMonitor. To grant the permissions, use an Alibaba Cloud account to attach the AliyunCloudMonitorReadOnlyAccess policy or the AliyunCloudMonitorFullAccess policy to the RAM user. For more information, see Step 2: Grant permissions to the RAM user.

View metrics that are monitored by CloudMonitor

  1. Log on to the Log Service console.
  2. Click the project that contains the Logstore whose metrics you want to view.
  3. Choose Log Storage > Logstores. On the Logstores tab, find the Logstore whose metrics you want to view, click the Logstore management icon, and then click Monitor to go to the CloudMonitor console.

Metrics that are monitored by CloudMonitor

Metric: Description
Write Traffic: The size of the data that is written in real time to the Logstore per minute.
Size of Raw Data: The size of the data that is written to the Logstore per minute, measured before compression.
Overall QPS: The queries per second (QPS) of all operations.
Number of operations: The number of API operations that are called per minute. For more information, see API reference.
Service Status: The number of responses that are returned for each HTTP status code.
Traffic resolved successfully: The size of raw log data that is collected by Logtail.
Lines resolved successfully: The number of log lines that are collected by Logtail.
Lines failed to be resolved: The number of log lines that Logtail fails to collect. If data appears in this chart, errors occurred during log data collection.
Number of errors: The number of errors that occurred during log data collection.
Number of error instances: The number of machines on which errors occurred during log data collection.
Number of error IPs: The number of IP addresses for which each type of error is reported. You can locate an IP address based on an error type. Then, you can log on to the machine to view the /usr/local/ilogtail/ilogtail.LOG file and troubleshoot the error.
Lines: The number of log lines that are written to the Logstore per minute.
Read Traffic: The size of the data that is read in real time from the Logstore per minute.
Delay Time of Consumption: The time difference between the current consumption checkpoint and the point in time at which the most recent data is written to the queue. In a consumer group, the value of this metric is the largest time difference among all shards.
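
The Delay Time of Consumption definition can be illustrated with a short Python sketch. This is not how Log Service computes the metric internally. The function names, the shard data layout, and the sample timestamps are assumptions made for illustration only. The sketch takes, for each shard, the time of the most recent write and the consumption checkpoint, and reports the largest difference across the shards of a consumer group.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Tuple


def shard_delay_seconds(latest_write: datetime, checkpoint: datetime) -> float:
    """Delay of one shard: most recent write time minus the consumption checkpoint.

    The result is floored at 0 so that a fully caught-up consumer reports no delay.
    """
    return max((latest_write - checkpoint).total_seconds(), 0.0)


def consumer_group_delay_seconds(
    shards: Dict[int, Tuple[datetime, datetime]]
) -> float:
    """Largest per-shard delay in a consumer group (0.0 if there are no shards).

    `shards` maps a shard ID to (latest_write_time, checkpoint_time).
    This layout is hypothetical, not a Log Service data structure.
    """
    return max(
        (shard_delay_seconds(write, checkpoint) for write, checkpoint in shards.values()),
        default=0.0,
    )


# Hypothetical sample: three shards that all received data at the same time.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
sample_shards = {
    0: (now, now - timedelta(seconds=5)),   # consumer is 5 seconds behind
    1: (now, now - timedelta(seconds=42)),  # consumer is 42 seconds behind
    2: (now, now),                          # consumer is caught up
}

print(consumer_group_delay_seconds(sample_shards))  # prints 42.0
```

Because the group value is a maximum, a single slow shard is enough to raise the metric even when the other shards are fully consumed.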

Configure alert rules

Log Service allows you to configure alert rules by using CloudMonitor. If the Service Status metric meets a trigger condition in an alert rule, a text message or an email notification is sent to the specified recipients. You can configure alert rules in the CloudMonitor console to monitor log collection and shard usage, and detect related exceptions.
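
To make the idea of a trigger condition on the Service Status metric more concrete, the following Python sketch evaluates a simple rule against per-minute status code counts. CloudMonitor evaluates alert rules itself, so this is only an illustration of the logic. The AlertRule fields, the "consecutive periods" condition, and the sample data are assumptions, not CloudMonitor parameters.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical rule: too many responses of one status class, several minutes in a row."""
    status_prefix: str        # "5" matches 500, 502, 503, and so on
    threshold: int            # per-minute count that must be exceeded
    consecutive_periods: int  # number of most recent minutes that must all exceed it


def matching_count(status_counts: Dict[int, int], prefix: str) -> int:
    """Sum the counts of all HTTP status codes that start with `prefix`."""
    return sum(
        count for code, count in status_counts.items() if str(code).startswith(prefix)
    )


def is_triggered(rule: AlertRule, samples: List[Dict[int, int]]) -> bool:
    """Return True if each of the most recent N samples exceeds the threshold.

    `samples` is ordered oldest to newest; each sample maps an HTTP status code
    to the number of responses with that code in one minute.
    """
    if len(samples) < rule.consecutive_periods:
        return False
    recent = samples[-rule.consecutive_periods:]
    return all(matching_count(s, rule.status_prefix) > rule.threshold for s in recent)


# Hypothetical per-minute Service Status samples, oldest first.
samples = [
    {200: 950, 500: 3},
    {200: 900, 500: 12, 503: 4},
    {200: 870, 502: 11},
]

rule = AlertRule(status_prefix="5", threshold=10, consecutive_periods=2)
print(is_triggered(rule, samples))  # prints True
```

Requiring several consecutive periods is one common way to avoid notifications for a single short spike. Whether that fits depends on how quickly you need to react to collection or shard exceptions.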