Simple Log Service: Log collection details of Logtail

Last Updated: Apr 25, 2024

This topic describes how Logtail collects logs. The log collection process consists of the following steps: monitor logs, read logs, process logs, filter logs, aggregate logs, and send logs.

Procedure

Logtail performs the following steps to collect logs:

  1. Monitor logs

  2. Read logs

  3. Process logs

  4. Filter logs

  5. Aggregate logs

  6. Send logs

Monitor logs

After you install Logtail on servers and create a Logtail configuration in the Simple Log Service console, the configuration is synchronized to the servers in real time. Logtail monitors logs in the log files of the servers based on the configuration. Logtail scans log directories and files based on the log file path and the maximum directory depth that you specify for monitoring in the configuration.

If the log files of the servers in a machine group are not updated after the Logtail configuration is applied to the machine group, the log files are considered historical log files. Logtail does not collect logs from historical log files. If log files are updated, Logtail reads and collects logs from the files, and then sends the logs to Simple Log Service. For more information about how to collect logs from historical log files, see Import historical logs from log files.

Logtail registers event listeners on the directories from which logs are collected and also polls the log files in those directories at regular intervals. Combining the two mechanisms ensures that logs are collected promptly and reliably. On Linux servers, inotify provides the directory event listeners, and polling covers the log files.
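
The scanning rules above (log file path, maximum directory depth, and skipping historical files) can be illustrated with a minimal Python sketch. This is not Logtail's implementation. The function name `scan_log_files`, its parameters, and the convention that a depth of 0 means "only the base directory" are assumptions made for illustration, and a file is treated as historical if its modification time is not later than the time at which the configuration was applied.

```python
import fnmatch
import os


def scan_log_files(base_dir, file_pattern, max_depth, config_applied_at):
    """Return matching files that were updated after the configuration was applied.

    max_depth is the number of directory levels below base_dir to scan
    (assumption: 0 means only base_dir itself).
    config_applied_at is a Unix timestamp; files not modified after it are
    treated as historical and skipped.
    """
    base_dir = os.path.abspath(base_dir)
    matches = []
    for current_dir, subdirs, files in os.walk(base_dir):
        rel = os.path.relpath(current_dir, base_dir)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            subdirs[:] = []  # stop os.walk from descending any further
        for name in files:
            if not fnmatch.fnmatch(name, file_pattern):
                continue
            path = os.path.join(current_dir, name)
            if os.path.getmtime(path) > config_applied_at:
                matches.append(path)
    return sorted(matches)
```

For example, `scan_log_files("/var/log/app", "*.log", 1, applied_at)` would consider `/var/log/app/*.log` and `/var/log/app/<subdir>/*.log`, but nothing deeper. The path is a placeholder.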

Read logs

After Logtail detects that log files are updated, it reads data from those files.

  • The first time Logtail reads data in a log file, Logtail can read up to 1,024 KB of data in the log file by default.

    • If the file size is less than 1,024 KB, Logtail reads data from the beginning of the file.

    • If the file size is greater than 1,024 KB, Logtail reads the last 1,024 KB of data in the file.

    Note

    Simple Log Service allows you to specify the data size that Logtail can read in a log file the first time Logtail reads the file.

    • Console mode: Modify the First Collection Size parameter in the Advanced Options section on the Logtail Config page. For more information, see Advanced settings.

    • API mode: Modify the tail_size_kb parameter in the Logtail configuration. For more information, see advanced.

  • If Logtail has previously read data in a log file, Logtail resumes reading from the previous checkpoint.

  • Logtail can read up to 512 KB of data at a time. Make sure that the size of each log in a log file does not exceed 512 KB. Otherwise, Logtail cannot read data as expected.
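
The read rules above reduce to a small amount of offset arithmetic. The following Python sketch shows one way to express them. The function names and the use of `None` to represent "no checkpoint yet" are assumptions for illustration, and the `tail_size_kb` argument only mirrors the name of the configuration parameter mentioned above.

```python
import os

KB = 1024
DEFAULT_TAIL_SIZE_KB = 1024  # default first-collection size stated above
MAX_READ_BYTES = 512 * KB    # at most 512 KB is read at a time


def first_read_offset(file_size, tail_size_kb=DEFAULT_TAIL_SIZE_KB):
    """Offset for the first read: the start of the file for small files,
    otherwise the start of the last tail_size_kb KB."""
    return max(0, file_size - tail_size_kb * KB)


def read_next_chunk(path, checkpoint=None, tail_size_kb=DEFAULT_TAIL_SIZE_KB):
    """Read up to 512 KB starting at the checkpoint.

    checkpoint=None means the file has not been read before (assumption).
    Returns the data and the new checkpoint.
    """
    if checkpoint is None:
        offset = first_read_offset(os.path.getsize(path), tail_size_kb)
    else:
        offset = checkpoint
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(MAX_READ_BYTES)
    return data, offset + len(data)
```

With the defaults, a 2,048 KB file is first read from byte offset 1,048,576, and each later call continues from the checkpoint returned by the previous call.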

Note

If you change the system time on a server, you must restart Logtail. Otherwise, the log time becomes incorrect and logs are dropped.

Process logs

When Logtail reads data from a log file, Logtail splits the data into logs (a log can span multiple lines), parses each log, and then sets the time field for each log.

  • Split a log into multiple lines

    If you specify a regular expression that matches the beginning of the first line of a log, Logtail uses the regular expression to determine where each log starts, so that a single log can span multiple lines. If you do not specify a regular expression, each line is processed as a separate log.

  • Parse a log

    Logtail parses each log based on the collection mode that you specify in the Logtail configuration.

    Note

    If you specify complex regular expressions, Logtail may consume an excessive amount of CPU resources. We recommend that you specify regular expressions that allow Logtail to parse logs in an efficient manner.

    If Logtail fails to parse a log, Logtail handles the failure based on the setting of the Drop Failed to Parse Logs parameter in the Logtail configuration.

    • If you turn on Drop Failed to Parse Logs, Logtail drops the log and reports an error.

    • If you turn off Drop Failed to Parse Logs, Logtail uploads the log. The key of the log is set to raw_log and the value is set to the log content.

  • Configure the time field for a log

    • If you do not configure the time field for a log, the log time is the time when the log is parsed.

    • If you configure the time field for a log, the manner in which the log is processed varies in the following scenarios:

      • If the difference between the time when the log is generated and the current time is within 12 hours, the log time is extracted from the parsed log fields.

      • If the difference between the time when the log is generated and the current time is greater than 12 hours, the log is dropped and an error is reported.
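
The three processing steps above can be sketched in Python as follows. This is an illustrative approximation, not Logtail's parser: the function names, the use of named regular expression groups as log fields, the `drop_failed` flag (standing in for Drop Failed to Parse Logs), and the `strptime`-style time format are all assumptions.

```python
import re
import time
from datetime import datetime

MAX_TIME_DIFF_SECONDS = 12 * 3600  # 12-hour window stated above


def split_logs(lines, first_line_regex=None):
    """Group lines into logs. Without a regex, each line is one log."""
    if first_line_regex is None:
        return list(lines)
    start = re.compile(first_line_regex)
    logs, current = [], []
    for line in lines:
        # A line that matches the first-line regex begins a new log.
        if start.match(line) and current:
            logs.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        logs.append("\n".join(current))
    return logs


def parse_log(raw, pattern, drop_failed):
    """Parse one log into fields using named groups.

    On failure, return None if drop_failed is set (Logtail also reports an
    error in that case), otherwise keep the content under the raw_log key.
    """
    match = re.fullmatch(pattern, raw, re.DOTALL)
    if match:
        return match.groupdict()
    return None if drop_failed else {"raw_log": raw}


def resolve_log_time(fields, time_key=None, time_format=None, now=None):
    """Return the log time as a Unix timestamp, or None if the log is dropped."""
    now = time.time() if now is None else now
    if time_key is None:
        return now  # no time field configured: use the parse time
    log_time = datetime.strptime(fields[time_key], time_format).timestamp()
    if abs(now - log_time) > MAX_TIME_DIFF_SECONDS:
        return None  # more than 12 hours from the current time: dropped
    return log_time
```

A stack trace is a typical case: with a first-line regular expression such as `\d{4}-\d{2}-\d{2} `, the indented continuation lines stay attached to the log line that precedes them.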

Filter logs

After logs are processed, Logtail filters the logs based on the specified filter conditions.

  • If you do not specify filter conditions in the Filter Configuration field, the logs are not filtered.

  • If you specify filter conditions in the Filter Configuration field, the fields in each log are traversed.

    Logtail collects only the logs that meet the filter conditions.
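
As a rough illustration of the filtering step, the sketch below keeps a log only if every filter condition is satisfied. The representation of filter conditions as a mapping from field name to regular expression, the requirement that the whole field value match, and the function name `passes_filter` are assumptions made for this example.

```python
import re


def passes_filter(log, conditions):
    """Return True if the log should be collected.

    log: dict of field name -> value.
    conditions: dict of field name -> regular expression (assumed format).
    An empty or missing set of conditions means no filtering.
    """
    if not conditions:
        return True
    for key, regex in conditions.items():
        value = log.get(key)
        # A missing field or a value that does not match fails the filter.
        if value is None or re.fullmatch(regex, value) is None:
            return False
    return True


logs = [
    {"level": "ERROR", "message": "disk full"},
    {"level": "INFO", "message": "started"},
]
collected = [log for log in logs if passes_filter(log, {"level": "WARNING|ERROR"})]
```

Here only the first log is collected, because the `level` value of the second log does not match the condition.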

Aggregate logs

To reduce the number of network requests, Logtail caches the processed and filtered logs, aggregates them, and then sends them to Simple Log Service in batches. Logtail sends the aggregated logs as soon as one of the following conditions is met:

  • The aggregation duration exceeds 3 seconds.

  • The number of aggregated logs exceeds 4,000.

  • The total size of aggregated logs exceeds 512 KB.
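
The three flush conditions can be modeled with a small buffer class. This is a sketch under stated assumptions rather than Logtail's aggregator: the class name `LogAggregator`, the `send` callback, the injectable `clock`, and measuring log size as UTF-8 bytes are all choices made for illustration.

```python
import time


class LogAggregator:
    MAX_AGE_SECONDS = 3      # aggregation duration threshold
    MAX_COUNT = 4000         # number-of-logs threshold
    MAX_BYTES = 512 * 1024   # total-size threshold (512 KB)

    def __init__(self, send, clock=time.monotonic):
        self._send = send    # callback that receives a list of logs
        self._clock = clock
        self._logs = []
        self._bytes = 0
        self._started = None

    def add(self, log):
        if not self._logs:
            self._started = self._clock()  # aggregation window starts here
        self._logs.append(log)
        self._bytes += len(log.encode("utf-8"))
        self.flush_if_due()

    def flush_if_due(self):
        """Send the buffered logs if any threshold is exceeded."""
        if not self._logs:
            return False
        due = (
            self._clock() - self._started > self.MAX_AGE_SECONDS
            or len(self._logs) > self.MAX_COUNT
            or self._bytes > self.MAX_BYTES
        )
        if due:
            self._send(self._logs)
            self._logs, self._bytes, self._started = [], 0, None
        return due
```

Because the time condition can become true while no new logs arrive, a caller would need to invoke `flush_if_due()` periodically, for example from a timer loop.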

Send logs

Logtail sends aggregated logs to Simple Log Service. If a log packet fails to be sent, Logtail either retries or drops the packet, depending on the HTTP status code that is returned.

| HTTP status code | Description | Handling method of Logtail |
| --- | --- | --- |
| 401 | The current account does not have the permissions to collect data. You must grant the account the permissions to access data. For more information, see Configure the permission assistant feature. | Logtail drops the log packet. |
| 404 | The project or Logstore that is specified in the Logtail configuration does not exist. | Logtail drops the log packet. |
| 403 | The shard quota is exhausted. | Logtail retries after 3 seconds. |
| 500 | A server exception occurs. | Logtail retries after 3 seconds. |
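
The handling rules for the four status codes can be written as a small retry loop. The sketch below is illustrative only. The `send_once` callback, treating 200 as success, the `max_attempts` cap, and the `"unhandled"` result for status codes that this topic does not list are assumptions. This topic does not state a retry limit.

```python
import time

RETRY_DELAY_SECONDS = 3
DROP_STATUSES = {401, 404}   # no permission, or project/Logstore missing
RETRY_STATUSES = {403, 500}  # shard quota exhausted, or server exception


def send_with_policy(send_once, packet, max_attempts=5, sleep=time.sleep):
    """Send a packet, applying the drop/retry rules listed above.

    send_once(packet) must return an HTTP status code (hypothetical callback).
    max_attempts is an assumption added so that the loop terminates.
    """
    for _ in range(max_attempts):
        status = send_once(packet)
        if status == 200:
            return "sent"
        if status in DROP_STATUSES:
            return "dropped"
        if status in RETRY_STATUSES:
            sleep(RETRY_DELAY_SECONDS)  # wait 3 seconds before the next attempt
            continue
        return "unhandled"  # status code not covered by this topic
    return "gave_up"
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays.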

Note

If you want to change the data transmission rate and the maximum number of concurrent connections, you can modify the max_bytes_per_sec and send_request_concurrency parameters in the Logtail startup configuration file. For more information, see Configure the startup parameters of Logtail.
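
As a hedged illustration of changing these two parameters, the following Python sketch updates them in a startup configuration file. It assumes that the file is JSON and that both parameters are top-level keys. The function name and the values in the usage note are placeholders, and the actual file location and recommended values are described in the referenced topic.

```python
import json


def update_startup_config(path, max_bytes_per_sec=None, send_request_concurrency=None):
    """Set the two transmission parameters, leaving other settings untouched.

    Assumes a JSON file with max_bytes_per_sec and send_request_concurrency
    as top-level keys.
    """
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    if max_bytes_per_sec is not None:
        config["max_bytes_per_sec"] = max_bytes_per_sec
    if send_request_concurrency is not None:
        config["send_request_concurrency"] = send_request_concurrency
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=4)
    return config
```

A call such as `update_startup_config(config_path, max_bytes_per_sec=2097152, send_request_concurrency=4)` would cap the transmission rate at 2 MB/s with four concurrent connections. Both numbers are arbitrary examples.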