This topic describes how to collect and analyze Log4j logs by using Log Service.

Background information

In recent years, the emergence of stateless programming, containers, and serverless computing has greatly improved the efficiency of software delivery and deployment. These advancements have driven the following changes in application architecture:
  • More and more applications are moving from the traditional monolithic architecture to a microservices architecture, in which an application is built from a collection of small services instead of a single system.
  • Physical servers are being phased out in favor of virtual resources.
Figure 1. Architecture evolution
Although these changes have brought huge improvements in elasticity and standardization, they have also made operations and maintenance (O&M) and troubleshooting more complex. Ten years ago, an engineer could log on to a server and attach to a process to obtain logs. Today, this method is no longer available, and O&M is performed against a standardized "black box".
Figure 2. Trend

A series of diagnostic and analytic tools geared towards DevOps have been developed in response to these changes. These tools include centralized monitoring services, centralized log services, and software as a service (SaaS)-based deployment and monitoring services.

Another response to these changes is centralized log processing. The guiding principle is to transmit application logs in real time or near real time to a centralized store, such as syslog, Kafka, ELK (Elasticsearch, Logstash, and Kibana), or HBase.

Advantages of centralized log processing

  • Ease of use: Running grep commands to search for the logs of a stateless application in a traditional architecture is troublesome and time-consuming. In contrast, you need to run only a simple search command to find logs on a centralized log server.
  • Separated compute and storage: You do not need to plan the storage space required for logs when purchasing hardware.
  • Low costs: Centralized log storage allocates resources based on load, which makes resource allocation efficient and flexible.
  • High security: In the event that your service is affected by a malicious attack or disaster, key data is retained and can be used as evidence.
Figure 3. Advantages of centralized log processing

Log collector for Java applications

Log Service provides more than 30 data collection sources and complete implementation solutions for servers, mobile clients, and embedded devices developed in various languages. The logging frameworks most familiar to Java developers are Log4j, Log4j2, and Logback.

For Java applications, there are two mainstream log collection solutions:
  • Java application logs are flushed to disks before Logtail collects them in real time.
  • Java applications are configured with Appenders, which are used to upload logs to Log Service in real time when the applications are running.
The following table describes the differences between the two solutions.
| Item | Logs flushed to disks and collected by Logtail | Logs sent by an Appender |
| --- | --- | --- |
| Timeliness | Low | High |
| Throughput | High | High |
| Resumable upload | Supported, depending on the Logtail configuration | Supported, depending on the memory size |
| Application location | Required when you configure a server group | Not required, because the Appender actively sends logs to Log Service |
| Local logs | Supported | Supported |
| Method to disable collection | Delete the Logtail configuration. | Modify the Appender configuration file and restart the application. |
Appenders allow you to collect logs in real time through simple configuration, without modifying application code. The Appenders that Log Service provides for Java applications have the following advantages:
  • Easy deployment: An Appender takes effect as soon as the configuration file is modified. You do not need to change application code.
  • Asynchronous and resumable upload: I/O operations do not block the main thread, and uploads tolerate transient network and service errors.
  • High concurrency: The Appender can send large volumes of logs.
  • Contextual search: Log entries keep their exact context after they are sent to Log Service, so you can view the entries before and after any given entry.

Overview and use of Appenders

Appenders provided by Log Service use aliyun-log-producer-java as the LogHub Producer Library to write data. The following table lists the Appenders that are available for uploading log data.
| Name | Description |
| --- | --- |
| aliyun-log-log4j-appender | The Appender developed for Log4j 1.x. If your applications use the Log4j 1.x logging framework, we recommend that you use this Appender. |
| aliyun-log-log4j2-appender | The Appender developed for Log4j 2.x. If your applications use the Log4j 2.x logging framework, we recommend that you use this Appender. |
| aliyun-log-logback-appender | The Appender developed for Logback. If your applications use the Logback logging framework, we recommend that you use this Appender. |
| aliyun-log-producer-java | The high-concurrency LogHub Producer Library for Java applications. All the preceding Appenders use this library to write data. The library is highly flexible and lets you specify the fields and formats of the data written to LogHub. If the Appenders cannot meet your business needs, you can develop a custom log collection program based on this library. |
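
To use one of these Appenders, first add it to your project as a dependency. The following Maven snippet is a sketch for the Log4j 1.x Appender; the coordinates match the artifact published on Maven Central, but the version shown is only an example, so replace it with the latest release.

<dependency>
    <groupId>com.aliyun.openservices</groupId>
    <artifactId>aliyun-log-log4j-appender</artifactId>
    <version>0.1.18</version>
</dependency>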

Step 1: Configure an Appender

For more information, see the configuration steps in aliyun-log-log4j-appender.

The configuration file log4j.properties contains the following information:
log4j.rootLogger=INFO,loghub

log4j.appender.loghub=com.aliyun.openservices.log.log4j.LoghubAppender

# Specify the project of your Log Service. Required.
log4j.appender.loghub.project=[your project]
# Specify the Logstore of your Log Service. Required.
log4j.appender.loghub.logStore=[your logStore]
# Specify the HTTP endpoint of your Log Service project. Required.
log4j.appender.loghub.endpoint=[your project endpoint]
# Specify the AccessKey pair of your account. Required.
log4j.appender.loghub.accessKeyId=[your accessKeyId]
log4j.appender.loghub.accessKeySecret=[your accessKeySecret]

# Specify the format of the log field. Required.
log4j.appender.loghub.layout=org.apache.log4j.PatternLayout
log4j.appender.loghub.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n

# The maximum amount of log data that a single producer instance can hold. Default value: 100 MB.
log4j.appender.loghub.totalSizeInBytes=104857600
# The maximum time that a call to the send method blocks when the producer has insufficient free space. Default value: 60 seconds.
log4j.appender.loghub.maxBlockMs=60000
# The size of the thread pool that executes log sending tasks. Default value: the number of available processors.
log4j.appender.loghub.ioThreadCount=8
# A ProducerBatch is sent when the size of its cached log data is greater than or equal to batchSizeThresholdInBytes. Default value: 512 KB. Maximum value: 5 MB.
log4j.appender.loghub.batchSizeThresholdInBytes=524288
# A ProducerBatch is sent when the number of its cached log entries is greater than or equal to batchCountThreshold. Default value: 4096.
log4j.appender.loghub.batchCountThreshold=4096
# The time that a ProducerBatch can reside in memory from creation until it is sent. Default value: 2 seconds. Minimum value: 100 milliseconds.
log4j.appender.loghub.lingerMs=2000
# The number of times a ProducerBatch is retried if it fails to be sent on the first attempt. Default value: 10.
log4j.appender.loghub.retries=10
# The backoff time before the first retry. Default value: 100 milliseconds.
log4j.appender.loghub.baseRetryBackoffMs=100
# The maximum backoff time for retries. Default value: 50 seconds.
log4j.appender.loghub.maxRetryBackoffMs=50000

# Specify the topic of your logs. Default value: "". Optional.
log4j.appender.loghub.topic=[your topic]

# Specify the source of your logs. Default value: the IP address of the host. Optional.
log4j.appender.loghub.source=[your source]

# Specify the time format of the time field. Default value: yyyy-MM-dd'T'HH:mm:ssZ. Optional.
log4j.appender.loghub.timeFormat=yyyy-MM-dd'T'HH:mm:ssZ

# Specify the time zone of the time field. Default value: UTC. Optional.
log4j.appender.loghub.timeZone=UTC
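
A minimal way to verify this configuration is a one-off program that writes a single entry through the ordinary Log4j API. The class below is a hypothetical sketch, not part of the Appender; the short sleep at the end is a crude allowance for the asynchronous producer to flush its batch before the JVM exits.

import org.apache.log4j.Logger;

public class LoghubSmokeTest {
    private static final Logger logger = Logger.getLogger(LoghubSmokeTest.class);

    public static void main(String[] args) throws InterruptedException {
        // Sent to Log Service by the loghub Appender configured in log4j.properties.
        logger.info("Loghub appender smoke test. requestID=test-1");
        // The upload is asynchronous; wait briefly so that the pending batch
        // is flushed before the process exits.
        Thread.sleep(3000);
    }
}

If the configuration is correct, the entry appears in the target Logstore within a few seconds.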

Step 2: Prepare for log query and analysis

After the Appender is configured, Java application logs are automatically sent to Log Service. You can use the search and analytics feature provided by Log Service to query and analyze the logs in real time. The following examples show the format of the collected log entries:
  • Log that records a user logon:
    level:  INFO  
    location:  com.aliyun.log4jappendertest.Log4jAppenderBizDemo.login(Log4jAppenderBizDemo.java:38)
    message:  User login successfully. requestID=id4 userID=user8  
    thread:  main  
    time:  2018-01-26T15:31+0000
  • Log that records a user purchase:
    level:  INFO  
    location:  com.aliyun.log4jappendertest.Log4jAppenderBizDemo.order(Log4jAppenderBizDemo.java:46)
    message:  Place an order successfully. requestID=id44 userID=user8 itemID=item3 amount=9  
    thread:  main  
    time:  2018-01-26T15:31+0000
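Entries like the two above can be produced by plain Log4j calls; no Log Service code appears in the application. The following sketch is reconstructed from the location and message fields of the samples, so the actual Log4jAppenderBizDemo source may differ.

import org.apache.log4j.Logger;

// Reconstructed from the sample log entries above; illustrative only.
public class Log4jAppenderBizDemo {
    private static final Logger logger = Logger.getLogger(Log4jAppenderBizDemo.class);

    public void login(String requestId, String userId) {
        logger.info("User login successfully. requestID=" + requestId + " userID=" + userId);
    }

    public void order(String requestId, String userId, String itemId, int amount) {
        logger.info("Place an order successfully. requestID=" + requestId
                + " userID=" + userId + " itemID=" + itemId + " amount=" + amount);
    }
}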

Step 3: Enable the log search and analytics feature

You must enable the log search and analytics feature before you can query and analyze logs. Perform the following steps to enable the feature:

  1. Log on to the Log Service console, and then click the target project name.
  2. Click the management icon next to the target Logstore name and select Search & Analysis to go to the Search & Analysis page.
  3. In the upper-right corner of the page, choose Index Attributes > Modify.
  4. In the Search & Analysis dialog box that appears, turn on the Enable Analytics switch for the fields that you want to query and analyze.
    Figure 4. Specify fields to be queried

Step 4: Analyze logs

  • Count the errors that occurred in each location in the last hour and return the top 3 locations with the most errors.
    level: ERROR | select location, count(*) as count GROUP BY location ORDER BY count DESC LIMIT 3
  • Count the number of log entries of each severity level generated in the last 15 minutes.
    | select level, count(*) as count GROUP BY level ORDER BY count DESC
  • Query the context of a log entry.

    You can find the full context of a raw log entry. For more information, see Perform a context query.

  • Count the logons of each user in the last hour and return the top 3 users with the most logons.
    login | SELECT regexp_extract(message, 'userID=(?<userID>[a-zA-Z\d]+)', 1) AS userID, count(*) AS count GROUP BY userID ORDER BY count DESC LIMIT 3
  • Count the total payment amount of each user in the last 15 minutes.
    order | SELECT regexp_extract(message, 'userID=(?<userID>[a-zA-Z\d]+)', 1) AS userID, sum(cast(regexp_extract(message, 'amount=(?<amount>[a-zA-Z\d]+)', 1) AS double)) AS amount GROUP BY userID