This topic describes how to use LogReduce to group and analyze highly similar log entries so that you can detect frequently occurring log patterns, such as the conditions that trigger alarms.

Scenario

With LogReduce, you can locate problems, detect exceptions, and perform other O&M-related actions for DevOps, or detect network intrusions that may have compromised the security of your services. In addition, you can save the log grouping result as an analysis chart to a dashboard, and then view the grouped data in real time.

Benefits

  • Log entries in three formats (Log4J, JSON, or Syslog) can be grouped by using the LogReduce function.
  • Gigabytes of data can be grouped in seconds.
  • You can view the log entries that are grouped for each log pattern, and you can display the number of grouped log entries during different time ranges.
  • You can dynamically adjust the tolerance of log grouping.

Index size

Note After you enable the LogReduce function, the total size of log indexes increases by 10% of the raw log size. For example, if the size of raw log data is 100 GB/day, the size of the log indexes increases by 10 GB/day after you enable the function.
Raw log size | Proportion of indexes in the raw log | Size of indexes generated by LogReduce | Total index size
100 GB | 20% (20 GB) | 100 GB * 10% = 10 GB | 30 GB
100 GB | 40% (40 GB) | 100 GB * 10% = 10 GB | 50 GB
100 GB | 100% (100 GB) | 100 GB * 10% = 10 GB | 110 GB
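
The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration of the rule stated in the note: the function name and parameters are hypothetical, and the 10% overhead is taken from the note above rather than from any API.

```python
def total_index_size_gb(raw_gb: float, index_ratio: float,
                        logreduce_ratio: float = 0.10) -> float:
    """Estimate the total index size (GB) after LogReduce is enabled.

    raw_gb          -- size of the raw log data, in GB
    index_ratio     -- existing index size as a fraction of the raw log (0.20 = 20%)
    logreduce_ratio -- extra index size added by LogReduce as a fraction of the
                       raw log; 0.10 is the figure stated in the note above
    """
    if raw_gb < 0 or index_ratio < 0 or logreduce_ratio < 0:
        raise ValueError("sizes and ratios must be non-negative")
    existing_index = raw_gb * index_ratio        # for example, 100 GB * 20% = 20 GB
    logreduce_index = raw_gb * logreduce_ratio   # for example, 100 GB * 10% = 10 GB
    return existing_index + logreduce_index


if __name__ == "__main__":
    # Reproduce the three rows of the table for 100 GB of raw log data.
    for ratio in (0.20, 0.40, 1.00):
        print(f"{ratio:.0%} indexed: {total_index_size_gb(100, ratio):.0f} GB")
```

Note that the overhead is proportional to the raw log size, not to the existing index size, so it is the same 10 GB in every row.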

Enable LogReduce

Note By default, LogReduce is disabled.
  1. Log on to the Log Service console, and then click the target project name.
  2. On the Logstores page, click Search on the right of the target Logstore.
  3. If you have enabled the index function, choose Index Attributes > Modify. If you have not enabled the index function, click Enable.
    Figure 1. Enable the index function


    Figure 2. Modify the log index


  4. Set index parameters, and click the switch to enable LogReduce.
    Figure 3. Enable LogReduce


  5. Click OK.
    Note After you enable LogReduce, Log Service automatically groups collected log data.

View the log grouping result and the raw log

  1. On the Search & Analysis page, enter a search and analysis statement in the search box, and click Search & Analysis.
    Note
    • You can also use keywords to filter the grouped log entries.
    • The LogReduce function does not support SQL analytic statements. This means that the results of SQL analysis cannot be grouped by this function.
  2. Click the LogReduce tab to view the result.
    Item | Description
    Number | Indicates the sequence number of a log group.
    Count | Indicates the number of log entries in a log group.
    Pattern | Indicates the log pattern. Each log group has one or more sub-patterns.
    Figure 4. Result


  3. Move your pointer over a Count value to show the sub-patterns of this log group and the proportion of each sub-pattern in the log group.
    Note You can also click + in front of a Count value to show the pattern list of the log group.
    Figure 5. View log grouping details


  4. Click a Count value to view the raw log of the log group.
    Figure 6. View the raw log


Adjust the log grouping tolerance

  1. On the Search & Analysis page, click the LogReduce tab.
  2. In the upper-right corner of the tab page, drag the Pattern slider to adjust the log grouping tolerance.
    • If you drag the slider towards Many, the system outputs a more specific log grouping result and shows patterns in greater detail.
    • If you drag the slider towards Little, the system outputs a less specific log grouping result and shows patterns in less detail.
Figure 7. Adjust the log grouping tolerance


Compare the number of grouped log entries during different periods of time

Click Log Compare, select a time length, and then click OK.

Item | Description
Number | Displays the sequence number of a log group.
Pre_Count | Displays the number of log entries for the log pattern during the earlier time range selected for comparison.
Count | Displays the number of log entries for the log pattern during the current time range.
Diff | Displays the difference between the number of log entries for the log pattern in the current time range and the number in the earlier time range.
Pattern | Displays the log pattern.

Use the LogReduce function through the API

  • To obtain a log grouping result, execute the following SQL statement:
    * | select a.pattern, a.count, a.signature, a.origin_signatures from (select log_reduce(3) as a from log) limit 1000
    Note If you directly view the result through the Log Service console, you can click Copy Query to get the SQL statement executed by the system in the backend.

    Parameter and field description

    • The parameter in the SQL statement that you need to customize is precision in log_reduce(precision).

      This parameter must be set to an integer in the range of 1 to 16. Its default value is 3. A lower value produces a more fine-grained grouping result with more log patterns.

    • The execution result of the SQL statement contains the following returned fields:
      • pattern: indicates the sub-patterns of log entries in a log group.
      • count: indicates the number of log entries in a log group.
      • signature: indicates the log pattern of a log group.
      • origin_signatures: indicates the original signature of a log group. You can use this field to search the log entries of this log group.
  • To show the difference of log grouping results between different times, execute the following SQL statement:
    * | select v.pattern, v.signature, v.count, v.count_compare, v.diff from (select compare_log_reduce(3, 86400) as v from log) order by v.diff desc limit 1000 
    Note If you click Log Compare in the Log Service console to show the difference of log grouping results between different times, the system then executes an SQL statement for the log entries. You can click Copy Query to get the SQL statement.

    Parameter and field description

    • The parameters in the SQL statement that you need to customize are precision and compare_interval in compare_log_reduce(precision, compare_interval).
      • The precision parameter must be an integer in the range of 1 to 16. Its default value is 3. A lower value produces a more fine-grained grouping result with more log patterns.
      • The compare_interval parameter specifies how many seconds earlier the log entries used for comparison were generated. For example, 86400 compares the current result with the log entries generated one day earlier. This parameter must be set to a positive integer.
    • The execution result of the SQL statement contains the following returned fields:
      • pattern: indicates the sub-patterns of log entries in a log group.
      • signature: indicates the log pattern of a log group.
      • count: indicates the number of log entries in a log group.
      • count_compare: indicates the number of log entries in the log group of the same log pattern within the earlier time range specified by compare_interval.
      • diff: indicates the difference between the count field value and the count_compare field value.
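
The two statements above differ only in their parameters, so a client program could assemble them with small helper functions. The following Python sketch is one way to do that under stated assumptions: the function names and the search and limit arguments are hypothetical, and the code only builds the query string. It does not call Log Service; pass the resulting string to whichever SDK or API client you use.

```python
def _check_precision(precision: int) -> int:
    """Validate the documented range for precision: an integer from 1 to 16."""
    if isinstance(precision, bool) or not isinstance(precision, int):
        raise ValueError("precision must be an integer")
    if not 1 <= precision <= 16:
        raise ValueError("precision must be in the range of 1 to 16")
    return precision


def build_log_reduce_query(precision: int = 3, search: str = "*",
                           limit: int = 1000) -> str:
    """Build the statement that returns the log grouping result."""
    p = _check_precision(precision)
    return (
        f"{search} | select a.pattern, a.count, a.signature, a.origin_signatures "
        f"from (select log_reduce({p}) as a from log) limit {limit}"
    )


def build_compare_query(precision: int = 3, compare_interval: int = 86400,
                        search: str = "*", limit: int = 1000) -> str:
    """Build the statement that compares grouping results between two periods."""
    p = _check_precision(precision)
    if (isinstance(compare_interval, bool)
            or not isinstance(compare_interval, int)
            or compare_interval <= 0):
        raise ValueError("compare_interval must be a positive integer (seconds)")
    return (
        f"{search} | select v.pattern, v.signature, v.count, v.count_compare, v.diff "
        f"from (select compare_log_reduce({p}, {compare_interval}) as v from log) "
        f"order by v.diff desc limit {limit}"
    )


if __name__ == "__main__":
    print(build_log_reduce_query())          # default precision of 3
    print(build_compare_query(3, 86400))     # compare with one day earlier
```

Validating the parameters before the statement is sent means that an out-of-range precision or a non-positive compare_interval is caught locally instead of failing on the server.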