Reference for collecting syslog data

Last Updated: Mar 14, 2018

Currently, Logtail supports collecting syslog data and text logs. See the following figure.

[Figure: syslog data and text log collection with Logtail]

Logtail receives syslog data over TCP. For how to configure Logtail to collect syslog data, see Syslog.

Advantages of syslog

Unlike text logs, syslog data is collected directly to LogHub without being stored on disk, which improves security. Because no file caching or parsing is involved, the throughput of a single machine can reach 80 MB/s.

Basic principle

By using Logtail, you can configure TCP ports locally to receive syslog data forwarded by syslog agents.

Logtail enables the TCP ports, receives the syslog data forwarded by rsyslog or other syslog agents by using the TCP protocol, parses the received data, and forwards it to LogHub. For how to configure Logtail to collect syslog data, see Syslog.

The following figure shows the relationship among Logtail, syslog, and LogHub.

[Figure: relationship among Logtail, syslog, and LogHub]

Syslog log format

Logtail receives data as streams by using the TCP ports. To parse individual logs from the data streams, make sure that the log format meets the following requirements:

  • Logs are separated by a line break (\n). The line break cannot appear within a log.
  • Only the message body of a log can contain spaces. Other fields cannot contain spaces.

The log format of syslog is as follows:

  $version $tag $unixtimestamp $ip [$user-defined-field-1 $user-defined-field-2 $user-defined-field-n] $msg\n

The meaning of each field is as follows.

| Log field | Meaning |
| --- | --- |
| version | The version of the log format. Logtail uses the version to parse user-defined fields. |
| tag | The data tag used to search for the project or Logstore. It cannot contain spaces or line breaks. |
| unixtimestamp | The timestamp of the log. |
| ip | The IP address of the machine corresponding to the log. If this log field is 127.0.0.1, it is replaced with the peer address of the TCP socket when the log is sent to Log Service. |
| user-defined-field | Zero or multiple user-defined fields can be set. The fields cannot contain spaces or line breaks. Brackets ([]) indicate that the fields are optional. |
| msg | The message body of the log, which cannot contain line breaks. The \n at the end of this field is the line break. |

See the following sample log that meets the format requirements:

  2.1 streamlog_tag 1455776661 10.101.166.127 ERROR com.alibaba.streamlog.App.main(App.java:17) connection refused, retry
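
The sample above can be reproduced with a small helper. The following Python sketch builds one log line in the documented field order and rejects values that would break parsing. The name format_stream_log and its parameters are hypothetical and not part of Logtail. Rejecting empty values and every kind of whitespace (not only spaces) in non-message fields is an assumption that is stricter than the documented rule.

```python
import time


def format_stream_log(version, tag, ip, msg, fields=(), timestamp=None):
    """Build one line: $version $tag $unixtimestamp $ip [fields...] $msg\\n"""
    if timestamp is None:
        timestamp = int(time.time())
    # Assumption: every field except msg must be non-empty and whitespace-free.
    for value in (version, tag, ip, *fields):
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"invalid field value: {value!r}")
    # The message body may contain spaces but never a line break.
    if "\n" in msg:
        raise ValueError("msg must not contain a line break")
    parts = [version, tag, str(timestamp), ip, *fields, msg]
    # A single trailing line break separates this log from the next one.
    return " ".join(parts) + "\n"
```

Called with version "2.1", tag "streamlog_tag", timestamp 1455776661, IP 10.101.166.127, and the two fields ERROR and com.alibaba.streamlog.App.main(App.java:17), the helper returns the sample line followed by a line break.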

Logtail supports collecting syslog data and other log types that meet the following requirements:

  • Logs can be formatted and the formatted logs meet the preceding format requirements.
  • Logs can be sent to the remote end over TCP.
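
Any program that can open a TCP connection can therefore feed Logtail. The following is a minimal Python sketch, assuming Logtail listens on a TCP port that you have configured. The helper names encode_lines and send_logs are hypothetical, and UTF-8 encoding is an assumption, not something this reference specifies.

```python
import socket


def encode_lines(lines):
    """Join formatted logs into one payload, one line break per log."""
    normalized = []
    for line in lines:
        body = line[:-1] if line.endswith("\n") else line
        if "\n" in body:
            raise ValueError("a log must not contain a line break")
        normalized.append(body + "\n")
    # Assumption: the receiver accepts UTF-8 text.
    return "".join(normalized).encode("utf-8")


def send_logs(lines, host, port, timeout=5.0):
    """Send already formatted logs over a single TCP connection."""
    payload = encode_lines(lines)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)
```

For example, send_logs(["2.2 streamlog_tag 1455776661 127.0.0.1 hello world"], "10.101.166.173", 11111) would send one log to the example Logtail address used elsewhere in this document.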

Use Logtail to parse syslog data

Logtail needs additional configuration for parsing syslog data.

For example:

  "streamlog_formats":
  [
      {"version": "2.1", "fields": ["level", "method"]},
      {"version": "2.2", "fields": []},
      {"version": "2.3", "fields": ["pri-text", "app-name", "syslogtag"]}
  ]

Logtail identifies the corresponding user-defined field format in streamlog_formats based on the version field. With this configuration applied, the preceding log sample, whose version field is 2.1, contains two user-defined fields: level and method. Therefore, the log sample is parsed into the following format:

  {
      "source": "10.101.166.127",
      "time": 1455776661,
      "level": "ERROR",
      "method": "com.alibaba.streamlog.App.main(App.java:17)",
      "msg": "connection refused, retry"
  }
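
The mapping from the raw line to this result can be sketched in Python. This is not Logtail's implementation, only an illustration of the documented behavior under stated assumptions. The names parse_stream_log and peer_ip are hypothetical, and peer_ip stands in for the peer address of the TCP socket.

```python
STREAMLOG_FORMATS = [
    {"version": "2.1", "fields": ["level", "method"]},
    {"version": "2.2", "fields": []},
    {"version": "2.3", "fields": ["pri-text", "app-name", "syslogtag"]},
]


def parse_stream_log(line, formats, peer_ip=None):
    """Parse one log line using the field names registered for its version."""
    field_names = {f["version"]: f["fields"] for f in formats}
    head = line.rstrip("\n").split(" ", 4)
    if len(head) < 5:
        raise ValueError("expected: version tag unixtimestamp ip ... msg")
    version, _tag, timestamp, ip, rest = head
    names = field_names[version]  # KeyError if the version is not configured
    # Split off exactly as many user-defined fields as the version declares;
    # the remainder is the message body, which may contain spaces.
    pieces = rest.split(" ", len(names))
    if len(pieces) != len(names) + 1:
        raise ValueError("log has fewer fields than its version declares")
    # 127.0.0.1 is replaced with the peer address when one is known.
    source = peer_ip if (ip == "127.0.0.1" and peer_ip) else ip
    record = {"source": source, "time": int(timestamp)}
    record.update(zip(names, pieces))
    record["msg"] = pieces[-1]
    # version and tag are used for routing only and are not part of the record.
    return record
```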

The version field is used to parse user-defined fields, and the tag field is used to search for the project or Logstore where the data is sent. These two fields are not included in the logs sent to Alibaba Cloud Log Service. In addition, Logtail predefines some log formats where the version field starts with “0.” or “1.”, such as 0.1 and 1.1. Therefore, user-defined version fields cannot start with “0.” or “1.”.
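
Because of this restriction, it can help to check a streamlog_formats list before applying it. The following Python sketch flags user-defined versions that collide with the reserved prefixes. The function name find_reserved_versions is hypothetical, and Logtail itself may report such conflicts differently.

```python
# Prefixes that Logtail reserves for its predefined log formats.
RESERVED_PREFIXES = ("0.", "1.")


def find_reserved_versions(formats):
    """Return the configured versions that start with a reserved prefix."""
    return [
        entry["version"]
        for entry in formats
        if entry["version"].startswith(RESERVED_PREFIXES)
    ]
```

An empty result means that none of the configured versions starts with "0." or "1.".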

Collect syslog data from common logging tools by using Logtail

Log4j

  • Add the Log4j library as a dependency.

    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-api</artifactId>
        <version>2.5</version>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-core</artifactId>
        <version>2.5</version>
    </dependency>
  • Add the Log4j configuration file log4j_aliyun.xml to your program.

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration status="OFF">
        <appenders>
            <Socket name="StreamLog" protocol="TCP" host="10.101.166.173" port="11111">
                <PatternLayout pattern="%X{version} %X{tag} %d{UNIX} %X{ip} %-5p %l %enc{%m}%n" />
            </Socket>
        </appenders>
        <loggers>
            <root level="trace">
                <appender-ref ref="StreamLog" />
            </root>
        </loggers>
    </configuration>

    10.101.166.173:11111 is the address of the server where the Logtail client is installed.

  • Set ThreadContext in your program.

    package com.alibaba.streamlog;

    import org.apache.logging.log4j.LogManager;
    import org.apache.logging.log4j.Logger;
    import org.apache.logging.log4j.ThreadContext;

    public class App
    {
        private static Logger logger = LogManager.getLogger(App.class);

        public static void main( String[] args ) throws InterruptedException
        {
            // These values fill %X{version}, %X{tag}, and %X{ip} in the PatternLayout.
            ThreadContext.put("version", "2.1");
            ThreadContext.put("tag", "streamlog_tag");
            ThreadContext.put("ip", "127.0.0.1");
            while(true)
            {
                logger.error("hello world");
                Thread.sleep(1000);
            }
            //ThreadContext.clearAll();
        }
    }

Tengine

Syslog data from Tengine can be collected by using ilogtail.

Tengine uses the ngx_http_log_module for logging to the local syslog agent, which forwards the logs to rsyslog.

For how to configure syslog in Tengine, see Configure syslog in Tengine.

Example:

Send INFO-level access logs with the user facility to the Unix datagram socket /var/log/nginx.sock on the local machine, and set the application tag to nginx.

  access_log syslog:user:info:/var/log/nginx.sock:nginx;

Rsyslog configuration:

  module(load="imuxsock") # needs to be done just once
  input(type="imuxsock" Socket="/var/log/nginx.sock" CreatePath="on")
  $template ALI_LOG_FMT,"2.3 streamlog_tag %timegenerated:::date-unixtimestamp% %fromhost-ip% %pri-text% %app-name% %syslogtag% %msg:::drop-last-lf%\n"
  if $syslogtag == 'nginx' then @@10.101.166.173:11111;ALI_LOG_FMT

Nginx

The collection of Nginx access logs is used as an example.

Access log configuration:

  access_log syslog:server=unix:/var/log/nginx.sock,nohostname,tag=nginx;

Rsyslog configuration:

  module(load="imuxsock") # needs to be done just once
  input(type="imuxsock" Socket="/var/log/nginx.sock" CreatePath="on")
  $template ALI_LOG_FMT,"2.3 streamlog_tag %timegenerated:::date-unixtimestamp% %fromhost-ip% %pri-text% %app-name% %syslogtag% %msg:::drop-last-lf%\n"
  if $syslogtag == 'nginx' then @@10.101.166.173:11111;ALI_LOG_FMT

For more information, see Nginx.

Python syslog

Example:

  import logging
  import logging.handlers

  logger = logging.getLogger('myLogger')
  logger.setLevel(logging.INFO)

  # Send records to the local syslog agent through the Unix domain socket '/dev/log'.
  handler = logging.handlers.SysLogHandler('/dev/log')

  # Format each record as a JSON-like string prefixed with "Python: ".
  formatter = logging.Formatter(
      'Python: { "loggerName":"%(name)s", "asciTime":"%(asctime)s", '
      '"pathName":"%(pathname)s", "logRecordCreationTime":"%(created)f", '
      '"functionName":"%(funcName)s", "levelNo":"%(levelno)s", '
      '"lineNo":"%(lineno)d", "time":"%(msecs)d", '
      '"levelName":"%(levelname)s", "message":"%(message)s"}'
  )
  handler.setFormatter(formatter)
  logger.addHandler(handler)

  logger.info("Test Message")