
Simple Log Service: What are the differences among log collection agents?

Last Updated: Dec 13, 2023

Client evaluation in log collection scenarios

In the data technology (DT) era, hundreds of millions of servers, mobile devices, and network devices generate vast numbers of logs every day. A centralized log processing solution effectively meets log consumption requirements throughout the lifecycle of log data. Before logs can be consumed, they must be collected from devices and synchronized to the cloud.

Three log collection tools

  • Logstash

    • As a part of the ELK Stack, Logstash is active in the open-source community. It can work with extensive plug-ins in the ecosystem.

    • Logstash is coded in JRuby and can run across platforms on Java virtual machines (JVMs).

    • With a modular design, Logstash features high scalability and interoperability.

  • Fluentd

    • Fluentd is a popular log collection tool in the open-source community. Its stable distribution package, td-agent, is maintained by Treasure Data and is the version evaluated in this topic.

    • Fluentd is coded in CRuby, with performance-critical components rewritten in C, so its overall performance is excellent.

    • Fluentd features a simple design and provides reliable data transmission in pipelines.

    • Compared with Logstash, Fluentd has fewer plug-ins.

  • Logtail

    • As the data collection agent of Alibaba Cloud Log Service, Logtail has been tested in big data scenarios within Alibaba Group for many years.

    • Logtail is coded in C++ and delivers excellent stability, resource control, and manageability.

    • Compared with Logstash and Fluentd, Logtail has less open-source community support and focuses more on log collection.

Feature comparison

| Feature | Logstash | Fluentd | Logtail |
| --- | --- | --- | --- |
| Log data read | Polling | Polling | Triggered by events |
| File rotation | Supported | Supported | Supported |
| Failover processing based on local checkpoints | Supported | Supported | Supported |
| General log parsing | Parsing by using Grok (based on regular expressions) | Parsing by using regular expressions | Parsing by using regular expressions |
| Specific log types | Mainstream formats such as delimiter, key-value, and JSON | Mainstream formats such as delimiter, key-value, and JSON | Mainstream formats such as delimiter, key-value, and JSON |
| Data compression for transmission | Supported by plug-ins | Supported by plug-ins | LZ4 |
| Data filtering | Supported | Supported | Supported |
| Data buffer for transmission | Supported by plug-ins | Supported by plug-ins | Supported |
| Transmission exception handling | Supported by plug-ins | Supported by plug-ins | Supported |
| Runtime environment | Coded in JRuby; requires a JVM | Coded in CRuby and C; requires a Ruby environment | Coded in C++; no special runtime requirements |
| Thread support | Multithreading | Multithreading restricted by the global interpreter lock (GIL) | Multithreading |
| Hot upgrade | Not supported | Not supported | Supported |
| Centralized configuration management | Not supported | Not supported | Supported |
| Running status self-check | Not supported | Not supported | Supported (CPU and memory threshold protection) |

Performance comparison in log collection scenarios

For example, the following Nginx access log contains 365 bytes, from which 14 fields can be extracted:
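The original sample entry is not reproduced in this topic. A hypothetical log line in the same format, which matches the parsing rules shown in the following sections, looks like this:

42.120.74.88 1024 - [13/Dec/2023:10:00:00 +0800] "GET /search?q=test HTTP/1.1" 200 3650 "https://example.com/" "Mozilla/5.0" "1234567890" "abcdef123456" "0b1234567890abcdef" cellA ups1 54321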

In the simulated test, this log entry is written repeatedly at different rates. The time field of each entry is set to the current system time at write time, and the other 13 fields are identical across entries. Logs are parsed exactly as they would be in a real-world scenario; the only difference is that the highly repetitive content yields a higher compression ratio, which reduces the network traffic generated when data is written.

Logstash

Logstash 2.0.0 is tested. It parses logs by using Grok and writes them to Kafka by using a built-in plug-in with GZIP compression enabled.

Log parsing configuration:

grok {
    patterns_dir => "/home/admin/workspace/survey/logstash/patterns"
    match => { "message" => "%{IPORHOST:ip} %{USERNAME:rt} - \[%{HTTPDATE:time}\] \"%{WORD:method} %{DATA:url}\" %{NUMBER:status} %{NUMBER:size} \"%{DATA:ref}\" \"%{DATA:agent}\" \"%{DATA:cookie_unb}\" \"%{DATA:cookie_cookie2}\" \"%{DATA:monitor_traceid}\" %{WORD:cell} %{WORD:ups} %{BASE10NUM:remote_port}" }
    remove_field => ["message"]
}
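The grok filter above runs inside a complete pipeline. The following is a minimal sketch of such a pipeline, with hypothetical file path, broker address, and topic name. The Kafka output option names shown here are those of the 2.x-era plug-in; later plug-in versions rename them (for example, bootstrap_servers and compression_type):

input {
  file {
    # Hypothetical path to the mock log file
    path => "/home/admin/workspace/temp/mock_log/access.log"
    start_position => "beginning"
  }
}
filter {
  grok {
    patterns_dir => "/home/admin/workspace/survey/logstash/patterns"
    match => { "message" => "%{IPORHOST:ip} %{USERNAME:rt} - \[%{HTTPDATE:time}\] \"%{WORD:method} %{DATA:url}\" %{NUMBER:status} %{NUMBER:size} \"%{DATA:ref}\" \"%{DATA:agent}\" \"%{DATA:cookie_unb}\" \"%{DATA:cookie_cookie2}\" \"%{DATA:monitor_traceid}\" %{WORD:cell} %{WORD:ups} %{BASE10NUM:remote_port}" }
    remove_field => ["message"]
  }
}
output {
  kafka {
    # Hypothetical broker address and topic name
    broker_list => "localhost:9092"
    topic_id => "nginx-access"
    # GZIP compression, as described above
    compression_codec => "gzip"
  }
}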

The following table lists test results.

| Write TPS (transactions per second) | Write traffic (KB/s) | CPU usage (%) | Memory usage (MB) |
| --- | --- | --- | --- |
| 500 | 178.22 | 22.4 | 427 |
| 1,000 | 356.45 | 46.6 | 431 |
| 5,000 | 1,782.23 | 221.1 | 440 |
| 10,000 | 3,564.45 | 483.7 | 450 |

Fluentd

td-agent 2.2.1 is tested. Fluentd parses logs by using regular expressions and writes them to Kafka by using the third-party plug-in fluent-plugin-kafka with GZIP compression enabled.

Log parsing configuration:

<source>
  type tail
  format /^(?<ip>\S+)\s(?<rt>\d+)\s-\s\[(?<time>[^\]]*)\]\s"(?<url>[^\"]+)"\s(?<status>\d+)\s(?<size>\d+)\s"(?<ref>[^\"]+)"\s"(?<agent>[^\"]+)"\s"(?<cookie_unb>\d+)"\s"(?<cookie_cookie2>\w+)"\s"(?<monitor_traceid>\w+)"\s(?<cell>\w+)\s(?<ups>\w+)\s(?<remote_port>\d+).*$/
  time_format %d/%b/%Y:%H:%M:%S %z
  path /home/admin/workspace/temp/mock_log/access.log
  pos_file /home/admin/workspace/temp/mock_log/nginx_access.pos
  tag nginx.access
</source>
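The source section above only reads and parses the log file. The write to Kafka described earlier is configured in a separate match section. The following is a minimal sketch using fluent-plugin-kafka, with hypothetical broker address and topic name:

<match nginx.access>
  # Buffered Kafka output provided by fluent-plugin-kafka
  type kafka_buffered
  # Hypothetical broker address and topic name
  brokers localhost:9092
  default_topic nginx-access
  # GZIP compression, as described above
  compression_codec gzip
</match>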

The following table lists test results.

| Write TPS | Write traffic (KB/s) | CPU usage (%) | Memory usage (MB) |
| --- | --- | --- | --- |
| 500 | 178.22 | 13.5 | 61 |
| 1,000 | 356.45 | 23.4 | 61 |
| 5,000 | 1,782.23 | 94.3 | 103 |

Note

Due to the restrictions of the GIL, a single Fluentd process can use only one CPU core. You can install the multiprocess plug-in to run multiple processes and achieve higher log throughput, as sketched below.
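The following is a minimal sketch of such a multiprocess setup, assuming that fluent-plugin-multiprocess is installed and that per-worker configuration files at the hypothetical paths below exist:

<source>
  type multiprocess
  <process>
    # Each child process runs with its own configuration file (hypothetical paths)
    cmdline -c /etc/td-agent/td-agent-worker1.conf --log /var/log/td-agent/worker1.log
  </process>
  <process>
    cmdline -c /etc/td-agent/td-agent-worker2.conf --log /var/log/td-agent/worker2.log
  </process>
</source>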

Logtail

Logtail 0.9.4 is tested. Logtail extracts log fields by using a regular expression, compresses the data by using the LZ4 algorithm, and writes it to Alibaba Cloud Log Service over HTTP. The batch_size parameter is set to 4000.

Log parsing configuration:

logRegex : (\S+)\s(\d+)\s-\s\[([^]]+)]\s"([^"]+)"\s(\d+)\s(\d+)\s"([^"]+)"\s"([^"]+)"\s"(\d+)"\s"(\w+)"\s"(\w+)"\s(\w+)\s(\w+)\s(\d+).*
keys : ip,rt,time,url,status,size,ref,agent,cookie_unb,cookie_cookie2,monitor_traceid,cell,ups,remote_port
timeformat : %d/%b/%Y:%H:%M:%S
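Applied to the hypothetical sample line shown earlier, this configuration would extract the following key-value pairs (illustrative output, not actual test data):

ip: 42.120.74.88
rt: 1024
time: 13/Dec/2023:10:00:00 +0800
url: GET /search?q=test HTTP/1.1
status: 200
size: 3650
ref: https://example.com/
agent: Mozilla/5.0
cookie_unb: 1234567890
cookie_cookie2: abcdef123456
monitor_traceid: 0b1234567890abcdef
cell: cellA
ups: ups1
remote_port: 54321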

The following table lists test results.

| Write TPS | Write traffic (KB/s) | CPU usage (%) | Memory usage (MB) |
| --- | --- | --- | --- |
| 500 | 178.22 | 1.7 | 13 |
| 1,000 | 356.45 | 3 | 15 |
| 5,000 | 1,782.23 | 15.3 | 23 |
| 10,000 | 3,564.45 | 31.6 | 25 |

Comparison of single-core CPU processing capabilities
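The chart itself is not reproduced in this topic, but the per-core processing capability can be estimated from the tables above by dividing the write TPS by the CPU usage: Logstash handles roughly 10,000 / 4.837 ≈ 2,100 logs per second per core, Fluentd roughly 5,000 / 0.943 ≈ 5,300, and Logtail roughly 10,000 / 0.316 ≈ 31,600.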

Summary

The three log collection tools have their own advantages and disadvantages:

  • Logstash supports all mainstream log formats, offers the richest set of plug-ins, and allows flexible customization. However, its log collection performance is relatively poor, and it consumes a large amount of memory because it runs on the JVM.

  • Fluentd supports all mainstream log formats and many plug-ins, and its log collection performance is excellent.

  • Logtail consumes the least CPU and memory on the host, achieves high throughput, and provides comprehensive support for common log collection scenarios. However, it lacks plug-in support, so it is less flexible and extensible than the preceding two tools.