This topic describes how Logtail collects logs. The log collection process includes log monitoring, reading, processing, filtering, aggregating, and uploading.
Monitor log files
After you install Logtail on servers and create Logtail configurations for log collection in the Log Service console, the configurations are synchronized to the servers in real time. Logtail monitors log files of the servers based on the configurations. Logtail scans log directories and files based on the log path and maximum depth of monitored directories that are specified in the configurations.
Log files that are not updated after a Logtail configuration is applied to a machine group are considered historical log files, and Logtail does not collect them. When a log file is updated, Logtail reads the file and sends the logs to Log Service. For more information about how to collect historical log files, see Import historical logs.
Logtail registers event listeners to monitor the directories from which log files are collected. Logtail also polls the log files under these directories on a regular basis to ensure the timeliness and stability of log collection. On Linux servers, Logtail uses inotify to monitor the directories and polls the log files.
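The polling side of this step can be illustrated with a minimal sketch. It is not Logtail's implementation: the function name `scan_log_files` and its parameters `log_dir`, `file_pattern`, and `max_depth` are hypothetical stand-ins for the log path and maximum directory depth in a Logtail configuration.

```python
import fnmatch
import os


def scan_log_files(log_dir, file_pattern, max_depth):
    """Return files under log_dir whose names match file_pattern.

    max_depth=0 scans only log_dir itself; each increment allows one more
    level of subdirectories. (Hypothetical helper, for illustration only.)
    """
    matches = []
    base_depth = log_dir.rstrip(os.sep).count(os.sep)
    for root, dirs, files in os.walk(log_dir):
        depth = root.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            dirs[:] = []  # prune: do not descend below the allowed depth
        for name in files:
            if fnmatch.fnmatch(name, file_pattern):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

A periodic poller could call this function on a timer and compare file sizes or modification times between scans to detect updated files.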
Read log files
- When Logtail reads a log file for the first time, it checks the size of the file.
  - If the file size is less than 1 MB, Logtail reads the file from the beginning.
  - If the file size is greater than 1 MB, Logtail reads only the last 1 MB of data in the file.
- If Logtail has already read the log file, it continues reading from the last checkpoint.
- Logtail can read up to 512 KB of data at a time. Make sure that no log entry in a log file is larger than 512 KB.
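The read rules above can be sketched as follows. The function names `start_offset` and `read_chunk` and the `checkpoint` argument are hypothetical; the sketch only mirrors the thresholds stated in this section and does not reflect how Logtail stores checkpoints.

```python
ONE_MB = 1024 * 1024
MAX_READ_BYTES = 512 * 1024  # at most 512 KB per read


def start_offset(file_size, checkpoint=None):
    """Pick the byte offset at which reading starts."""
    if checkpoint is not None:
        return checkpoint          # file was read before: resume
    if file_size < ONE_MB:
        return 0                   # small file: read from the beginning
    return file_size - ONE_MB      # large file: only the last 1 MB


def read_chunk(path, offset):
    """Read up to MAX_READ_BYTES from offset; return (data, next_offset)."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(MAX_READ_BYTES)
    return data, offset + len(data)
```

The returned `next_offset` plays the role of the checkpoint for the next read.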
Process log entries
- Split logs into log entries
  If you specify a regular expression to match the first line of a log entry, Logtail uses the regular expression to determine where each log entry begins, so that one log entry can span multiple lines. If no such regular expression is specified, each log line is processed as one log entry.
- Parse log entries
  Logtail parses every log entry based on the collection mode that is specified in the Logtail configurations for log collection.
  Note: If you configure complex regular expressions, Logtail may consume excessive CPU resources. Therefore, we recommend that you configure efficient regular expressions.
  If a log entry fails to be parsed, Logtail handles the log entry based on whether the Drop Failed to Parse Logs switch is turned on in the Logtail configurations for log collection:
  - If the Drop Failed to Parse Logs switch is turned on, Logtail drops the log entry and reports an error.
  - If the Drop Failed to Parse Logs switch is turned off, Logtail uploads the log entry. The key of the log entry is set to raw_log and the value is set to the log content.
- Set the time field for a log entry
  - If you do not set the time field for a log entry, the log time is the time when the log entry is parsed.
  - If you set the time field for a log entry:
    - If the difference between the log time and the current time is less than 12 hours, the log time is extracted from the parsed log fields.
    - If the difference between the log time and the current time is greater than 12 hours, the log entry is dropped and an error is reported.
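The three processing steps above can be sketched together. This is an illustration under stated assumptions, not Logtail's code: the function names, the `drop_failed` flag (standing in for the Drop Failed to Parse Logs switch), the use of a named-group regular expression as the collection mode, and the treatment of timestamps as UTC are all hypothetical choices.

```python
import calendar
import re
import time

MAX_TIME_SKEW_SECONDS = 12 * 60 * 60  # 12 hours


def split_entries(lines, first_line_regex=None):
    """Group raw lines into log entries using a first-line regex."""
    if first_line_regex is None:
        return list(lines)  # no regex: every line is one entry
    start = re.compile(first_line_regex)
    entries, current = [], []
    for line in lines:
        if start.match(line) and current:
            entries.append("\n".join(current))
            current = []
        # Assumption: lines before the first match form their own entry.
        current.append(line)
    if current:
        entries.append("\n".join(current))
    return entries


def parse_entry(entry, pattern, drop_failed):
    """Return parsed fields, or None if the entry is dropped."""
    m = re.match(pattern, entry, re.DOTALL)
    if m:
        return m.groupdict()
    if drop_failed:
        return None  # dropped (Logtail also reports an error)
    return {"raw_log": entry}  # uploaded unparsed


def resolve_time(fields, time_key=None, time_format=None, now=None):
    """Return the log time in epoch seconds, or None if dropped."""
    now = time.time() if now is None else now
    if time_key is None:
        return now  # no time field configured: use the parse time
    # Assumption: the timestamp is interpreted as UTC.
    log_time = calendar.timegm(time.strptime(fields[time_key], time_format))
    if abs(now - log_time) > MAX_TIME_SKEW_SECONDS:
        return None  # more than 12 hours off: dropped
    return log_time
```

For example, with the first-line regex `\d{4}-\d{2}-\d{2} `, a stack trace that follows a dated line stays attached to that line as a single log entry.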
Filter log entries
After Logtail processes log entries, it filters them based on the specified filter conditions.
- If you do not specify filter conditions in the Filter Configuration field, the log entries are not filtered.
- If you specify filter conditions in the Filter Configuration field, Logtail traverses the fields in every log entry and collects only the log entries that meet the filter conditions.
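A minimal sketch of this filtering step follows. The source does not state the form of a filter condition, so the sketch assumes each condition maps a field name to a regular expression that the field value must fully match, and that all conditions must hold; `passes_filter` and `conditions` are hypothetical names.

```python
import re


def passes_filter(fields, conditions):
    """Return True if a parsed log entry should be collected.

    fields: dict of field name -> value for one log entry.
    conditions: dict of field name -> regex (assumed format), or empty.
    """
    if not conditions:
        return True  # no filter configured: nothing is filtered out
    for key, pattern in conditions.items():
        value = fields.get(key)
        if value is None or re.fullmatch(pattern, value) is None:
            return False
    return True
```

With `conditions = {"level": "WARNING|ERROR"}`, an entry whose `level` field is `INFO` would be discarded.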
Aggregate log entries
To reduce the number of network requests, Logtail caches the processed and filtered log entries for a period of time. Logtail then aggregates the cached log entries and sends them to Log Service when one of the following conditions is met:
- The aggregation duration exceeds 3 seconds.
- The number of aggregated log entries exceeds 4,096.
- The total size of aggregated log entries exceeds 512 KB.
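The three thresholds above can be sketched as a small buffer that flushes when any one of them is exceeded. The class name `Aggregator`, the `send` callback, and the injectable `clock` are hypothetical and exist only to keep the example self-contained and testable.

```python
import time


class Aggregator:
    MAX_AGE_SECONDS = 3
    MAX_ENTRIES = 4096
    MAX_BYTES = 512 * 1024

    def __init__(self, send, clock=time.monotonic):
        self._send = send          # callable that receives a list of entries
        self._clock = clock
        self._entries = []
        self._bytes = 0
        self._started = None       # time the first cached entry arrived

    def add(self, entry: bytes):
        if not self._entries:
            self._started = self._clock()
        self._entries.append(entry)
        self._bytes += len(entry)
        self.flush_if_due()

    def flush_if_due(self):
        """Send the cached entries if any threshold is exceeded."""
        if not self._entries:
            return
        due = (
            self._clock() - self._started > self.MAX_AGE_SECONDS
            or len(self._entries) > self.MAX_ENTRIES
            or self._bytes > self.MAX_BYTES
        )
        if due:
            self._send(self._entries)
            self._entries, self._bytes, self._started = [], 0, None
```

In a long-running collector, `flush_if_due` would also be called on a timer so that a quiet log file still gets flushed once the 3-second window passes.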
Send log entries
Logtail sends the aggregated log entries to Log Service. You can set the max_bytes_per_sec and send_request_concurrency parameters in the Logtail startup configuration file to specify the maximum transmission rate of log data and the maximum number of concurrent requests. For more information, see Configure the startup parameters of Logtail.
If log data fails to be sent, Logtail retries or drops the data based on the returned error code, as described in the following table.
Error code | Description | Handling method |
---|---|---|
401 | Logtail is not authorized to collect data. | Logtail drops the log data. |
404 | The project or Logstore that is specified in Logtail configurations for log collection does not exist. | Logtail drops the log data. |
403 | The shard quota is exhausted. | After 3 seconds, Logtail tries again. |
500 | A server exception occurs. | After 3 seconds, Logtail tries again. |
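
The handling methods in the table can be sketched as a retry loop. This is an illustration only: the function `send_with_retry`, the `send` callback that returns a status code, the use of 200 as the success code, and the `max_attempts` cap are assumptions, since the source does not state a retry limit.

```python
import time

DROP_CODES = {401, 404}    # not authorized / project or Logstore missing
RETRY_CODES = {403, 500}   # shard quota exhausted / server exception
RETRY_DELAY_SECONDS = 3


def send_with_retry(send, batch, max_attempts=5, sleep=time.sleep):
    """Send a batch; return "sent", "dropped", or "gave_up"."""
    for attempt in range(1, max_attempts + 1):
        status = send(batch)
        if status == 200:  # assumed success code
            return "sent"
        if status in DROP_CODES:
            return "dropped"
        if status in RETRY_CODES and attempt < max_attempts:
            sleep(RETRY_DELAY_SECONDS)  # wait 3 seconds, then try again
            continue
        break  # unknown code, or retries exhausted
    return "gave_up"
```

Passing `sleep` as a parameter keeps the 3-second delay out of tests while leaving the default behavior unchanged.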