Log Service provides the operations log feature to help you understand the usage of Log Service in real time and improve O&M efficiency.
Enable the operations log feature
Operations logs are divided into detailed logs and important logs (Logtail-related logs, consumer group latency logs, and metering logs). Detailed logs are stored in the internal-operation_log Logstore, and important logs are stored in the internal-diagnostic_log Logstore. The internal-diagnostic_log Logstore is free of charge, while the internal-operation_log Logstore is billed in the same way as a common Logstore. Detailed logs record each operation or API request of a user, so multiple read and write requests generate multiple operations logs. You can enable the operations log feature as required. We recommend that you select Automatic creation (recommended) for Log Storage so that the operations logs of the same region are stored in the same project, which simplifies log management and statistics.
Monitor the Logtail heartbeat
After Logtail is installed, you can use Logtail status logs in the operations logs to check the working status of Logtail.
Run the query statement __topic__: logtail_status to search for Logtail status logs. You can count the machines that reported a status during a recent period and compare the result with the number of machines in the machine group to which the Logtail configuration is applied. You can also configure an alert rule, for example, one that triggers an alert when the counted number is smaller than the number of machines in the machine group.
- Query statement
__topic__: logtail_status | SELECT COUNT(DISTINCT ip) as ip_count
- Query snapshot
- Alert rule configuration (assume that the number of machines in the machine group is 100)
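The alert condition described above can be sketched in Python. This is a minimal illustration, not Log Service code: it assumes the Logtail status logs have already been fetched into a list of dictionaries that carry an `ip` key, and the names `count_active_machines`, `heartbeat_alert`, and `expected_machines` are hypothetical.

```python
def count_active_machines(status_logs):
    """Count distinct IP addresses, mirroring COUNT(DISTINCT ip)."""
    return len({log["ip"] for log in status_logs if log.get("ip")})


def heartbeat_alert(status_logs, expected_machines):
    """Return True when fewer machines reported than the group contains."""
    return count_active_machines(status_logs) < expected_machines


# Hypothetical sample: three status logs reported by two machines.
sample_logs = [
    {"ip": "10.0.0.1"},
    {"ip": "10.0.0.2"},
    {"ip": "10.0.0.1"},
]
```

With a machine group of 100 machines, `heartbeat_alert(sample_logs, 100)` would report that some machines did not send a heartbeat in the queried period.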
View metering data
After you write logs to Log Service, you can view metering information such as the log traffic, the number of read and write operations, and the storage space to learn about your usage of Log Service and the details of each billing item.
A Logstore generates one metering log per hour. Each metering log includes the read and write
traffic and the number of read and write operations in the statistical period, and the storage
size of raw logs and indexes at the current point in time. For more information about the fields
of metering logs, see Log types. Metering logs of Log Service are stored in the internal-diagnostic_log Logstore.
On the internal-diagnostic_log page, run the query statement
__topic__: metering to search for metering logs.
The default dashboard of operations logs provides a variety of charts for metering logs. For more information, see Service log dashboards.
For example, the following query statement returns the most recent total storage size (indexes plus raw logs) of each Logstore:
__topic__: metering | SELECT max_by(storage_index+storage_raw, __time__) as total_storage, project, logstore GROUP BY project, logstore
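The max_by aggregation in the query statement above keeps, for each project and Logstore, the storage value from the most recent metering log. The following Python sketch reproduces that logic on records that are assumed to be already loaded into memory as dictionaries. The field names come from the query statement; the function name `latest_total_storage` and the sample values are hypothetical.

```python
def latest_total_storage(metering_logs):
    """Return {(project, logstore): total storage at the latest __time__}."""
    latest = {}  # (project, logstore) -> (timestamp, total storage)
    for log in metering_logs:
        key = (log["project"], log["logstore"])
        timestamp = int(log["__time__"])
        total = int(log["storage_index"]) + int(log["storage_raw"])
        # Keep only the record with the largest timestamp per key.
        if key not in latest or timestamp > latest[key][0]:
            latest[key] = (timestamp, total)
    return {key: total for key, (_, total) in latest.items()}


# Hypothetical hourly metering logs for one Logstore.
sample_metering = [
    {"project": "my-project", "logstore": "my-logstore",
     "__time__": 3600, "storage_index": 10, "storage_raw": 100},
    {"project": "my-project", "logstore": "my-logstore",
     "__time__": 7200, "storage_index": 12, "storage_raw": 120},
]
```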
View the consumer group latency
After logs are written to Log Service, you can consume them in addition to querying and analyzing them. Log Service provides consumer groups that are supported in multiple programming languages. When you use consumer groups to consume logs, consumption latency is a key concern. By monitoring the consumption latency, you can track the consumption progress. If the latency is high, you can adjust the consumption speed by changing the number of consumers.
As a type of operations log, consumer group latency logs are also stored in the internal-diagnostic_log
Logstore. They are generated every 2 minutes. On the internal-diagnostic_log page,
run the query statement
__topic__: consumergroup_log to search for all consumer group latency logs.
Query the consumption latency of the consumer group test-consumer-group.
Monitor Logtail exceptions
Logtail must run properly to guarantee log integrity. If you detect Logtail exceptions in time, you can adjust the Logtail configurations to avoid log loss.
You can run the query statement
__topic__: logtail_alarm to search for the exception logs of Logtail.
__topic__: logtail_alarm | select sum(alarm_count) as errorCount, alarm_type GROUP BY alarm_type
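The aggregation above sums alarm_count for each alarm_type. A minimal Python sketch of the same grouping, assuming the exception logs are already available as dictionaries, could look as follows. The function name `count_alarms_by_type` and the alarm type values in the sample are placeholders, not real Logtail alarm types.

```python
from collections import Counter


def count_alarms_by_type(alarm_logs):
    """Sum alarm_count per alarm_type, like sum(...) GROUP BY alarm_type."""
    totals = Counter()
    for log in alarm_logs:
        totals[log["alarm_type"]] += int(log["alarm_count"])
    return dict(totals)


# Placeholder alarm types used only for illustration.
sample_alarms = [
    {"alarm_type": "EXAMPLE_ALARM_A", "alarm_count": "3"},
    {"alarm_type": "EXAMPLE_ALARM_B", "alarm_count": "1"},
    {"alarm_type": "EXAMPLE_ALARM_A", "alarm_count": "2"},
]
```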
Monitor the data traffic written to a Logstore
- Query the traffic of raw logs and compressed logs written to a Logstore within 15 minutes.
Method: PostLogStoreLogs AND Project: my-project and LogStore: my-logstore | SELECT sum(InFlow) as raw_bytes, sum(NetInflow) as network_bytes
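- Query the traffic decline ratio of the logs written to a Logstore within 15 minutes. The compare function returns an array in which diff[1] is the value for the current time window and diff[2] is the value from 900 seconds earlier.
Method: PostLogStoreLogs AND Project: my-project and LogStore: my-logstore | select round((diff[2]-diff[1])/diff[2],2) as rate from (select compare(network_bytes, 900) as diff from (select sum(NetInflow) as network_bytes from log))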
- Create an alert.
Set the alert rule so that an alert is triggered when the traffic decline ratio of the logs written to a Logstore exceeds 50%.
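The arithmetic behind this alert rule can be sketched in Python. The sketch assumes that the decline ratio is defined as (previous - current) / previous, where both values are the compressed write traffic of two consecutive 15-minute windows. The function names and the default threshold parameter are hypothetical and are not part of Log Service.

```python
def decline_ratio(current_bytes, previous_bytes):
    """Return the traffic decline ratio rounded to two decimal places.

    Returns None when the previous window has no traffic, because the
    ratio is undefined in that case.
    """
    if previous_bytes <= 0:
        return None
    return round((previous_bytes - current_bytes) / previous_bytes, 2)


def should_alert(current_bytes, previous_bytes, threshold=0.5):
    """Return True when traffic declined by more than the threshold."""
    ratio = decline_ratio(current_bytes, previous_bytes)
    return ratio is not None and ratio > threshold
```

For example, a drop from 1000 bytes to 400 bytes is a decline ratio of 0.6, which exceeds the 50% threshold, while a drop to 600 bytes (0.4) does not.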
Audit operations logs
|Alibaba Cloud account||