Logtail collects text-file logs in the following steps:
Specify the file path > specify how log lines are separated > extract log fields > specify the log time
Each entry in an access log (for example, an Nginx access log) occupies one line, and entries are separated by line feeds. Two access log entries are shown below.
10.1.1.1 - - [13/Mar/2016:10:00:10 +0800] "GET / HTTP/1.1" 0.011 180 404 570 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; 360se)"
10.1.1.1 - - [13/Mar/2016:10:00:11 +0800] "GET / HTTP/1.1" 0.011 180 404 570 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; 360se)"
For Java applications, a single program log may span several lines. In that case, a characteristic pattern at the start of each log is used to identify the line on which a new log begins. A Java program log is shown below.
[2016-03-18T14:16:16,000] [INFO] [SessionTracker] [SessionTrackerImpl.java:148] Expiring sessions
Each Java log begins with a timestamp in a fixed format, so a regular expression that matches this timestamp can be used to identify the first line of a log.
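As a minimal sketch of this line-grouping idea, the Python snippet below starts a new log whenever a line matches a start-of-log pattern and appends every other line to the current log. The pattern `LOG_START`, the function `group_log_lines`, and the continuation lines in `sample` are assumptions made for illustration; they are not taken from Logtail or its configuration.

```python
import re

# Hypothetical start-of-log pattern: a bracketed timestamp such as
# [2016-03-18T14:16:16,000] at the very beginning of a line.
LOG_START = re.compile(r"\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}\]")


def group_log_lines(lines):
    """Merge physical lines into logs; a matching line starts a new log."""
    logs = []
    for line in lines:
        if LOG_START.match(line) or not logs:
            # Start-of-log line (or the very first line): open a new log.
            logs.append(line)
        else:
            # Continuation line: attach it to the log currently being built.
            logs[-1] += "\n" + line
    return logs


sample = [
    "[2016-03-18T14:16:16,000] [INFO] [SessionTracker] [SessionTrackerImpl.java:148] Expiring sessions",
    "0x152436b9a12aecf, 50000",  # made-up continuation line
    "[2016-03-18T14:16:17,000] [INFO] [SessionTracker] [SessionTrackerImpl.java:148] Expiring sessions",
]

for log in group_log_lines(sample):
    print(repr(log))
```

With this sample, the first two physical lines are merged into one log and the third line forms a second log.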
According to the Log Service data model, a log contains one or more key-value pairs. To extract specific fields for analysis, you need to set a regular expression. If the log content is not processed, the entire log is treated as a single key-value pair. For the preceding access log:
When fields are extracted
Extracted content: (1) 10.1.1.1; (2) 13/Mar/2016:10:00; (3) GET
When fields are not extracted
Extracted content: (1) 10.1.1.1 - - [13/Mar/2016:10:00:10 +0800] "GET / HTTP/1.1" 0.011 180 404 570 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; 360se)"
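The two cases can be sketched in Python as follows. The pattern `ACCESS_LOG`, the function `extract_fields`, and the key names `ip`, `time`, `method`, and `content` are hypothetical choices for this illustration, not the expression or field names that Logtail uses. In this sketch the time group captures the full value inside the square brackets.

```python
import re

# Hypothetical pattern: client IP, bracketed time, and HTTP method.
ACCESS_LOG = re.compile(r'(\S+) - - \[([^\]]+)\] "(\w+) ')

line = (
    '10.1.1.1 - - [13/Mar/2016:10:00:10 +0800] "GET / HTTP/1.1" '
    '0.011 180 404 570 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; 360se)"'
)


def extract_fields(raw):
    """Return key-value pairs; fall back to one pair if nothing matches."""
    match = ACCESS_LOG.match(raw)
    if match is None:
        # No extraction: the whole line is kept as a single key-value pair.
        return {"content": raw}
    ip, log_time, method = match.groups()
    return {"ip": ip, "time": log_time, "method": method}


print(extract_fields(line))
print(extract_fields("a line that does not match"))
```

A line that does not match the pattern is kept whole under one key, which mirrors the case where fields are not extracted.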
According to the Log Service data model, each log must have a time field in UNIX timestamp format. The log time can be set either to the system time at which Logtail captures the log or to a time parsed from the log content. For the preceding access log:
Time in the log content
Log capture time
Time: Timestamp when the log is captured
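Both options can be sketched in Python. The format string `TIME_FORMAT` and the function names are assumptions for this illustration; the format is written to fit the time in the sample access log and is not necessarily the format string that would be configured in Logtail.

```python
import time
from datetime import datetime

# Assumed strptime format for a value such as "13/Mar/2016:10:00:10 +0800".
# %b matches the abbreviated month name in the default C/English locale.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def time_from_content(raw_time):
    """Parse the time found in the log content into a UNIX timestamp."""
    parsed = datetime.strptime(raw_time, TIME_FORMAT)
    # The parsed value carries its UTC offset, so timestamp() is unambiguous.
    return int(parsed.timestamp())


def capture_time():
    """Use the system time at the moment the log is captured."""
    return int(time.time())


print(time_from_content("13/Mar/2016:10:00:10 +0800"))
print(capture_time())
```

Parsing the time from the content keeps the moment the event occurred, while the capture time depends on when the collector reads the line.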