This topic describes how to use the LogReduce feature of Log Service. You can enable the feature, view log clustering results and raw logs, and compare the number of clustered logs in different time periods.

Background information

When you collect logs, the LogReduce feature clusters highly similar logs and extracts patterns from them, so that you can quickly get an overview of the logs. The feature can cluster text logs in multiple formats. You can use the feature for O&M in DevOps scenarios, for example, to identify errors, detect anomalies, and roll back versions. You can also use the feature to detect intrusions in security scenarios. You can save log clustering results as charts to a dashboard and view the clustered data in real time.

Benefits

  • The feature can cluster logs in multiple formats. Examples: Log4j, JSON, and single-line logs.
  • The feature can cluster hundreds of millions of logs in seconds.
  • The feature can cluster logs by multiple patterns.
  • You can use pattern signatures to retrieve the raw logs that belong to a pattern.
  • You can compare patterns in different time periods.
  • You can adjust the precision of log clustering based on your business requirements.

Index traffic

After you enable the LogReduce feature, the total size of indexes increases by an amount equal to 10% of the raw log size. For example, if the size of raw logs is 100 GB per day, the total size of indexes increases by 10 GB per day after you enable the LogReduce feature for the raw logs.
Size of raw logs | Index percentage | Size of indexes that are generated by LogReduce | Total size of indexes
100 GB | 20% (20 GB) | 100 GB × 10% = 10 GB | 30 GB
100 GB | 40% (40 GB) | 100 GB × 10% = 10 GB | 50 GB
100 GB | 100% (100 GB) | 100 GB × 10% = 10 GB | 110 GB
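
The arithmetic behind the table can be checked with a short sketch. The function and parameter names below are illustrative and are not part of Log Service; the only figure taken from this topic is the 10% overhead.

```python
# Assumption: the LogReduce overhead is a flat 10% of the raw log size,
# as stated in the Index traffic section.
LOGREDUCE_OVERHEAD = 0.10


def total_index_size_gb(raw_gb: float, index_ratio: float) -> float:
    """Return the total index size in GB after LogReduce is enabled.

    raw_gb and index_ratio are hypothetical names: the raw log size and
    the fraction of that size used by existing indexes (0.2 means 20%).
    """
    existing_indexes = raw_gb * index_ratio
    logreduce_indexes = raw_gb * LOGREDUCE_OVERHEAD
    return existing_indexes + logreduce_indexes


# Reproduce the three rows of the table for 100 GB of raw logs.
for ratio in (0.2, 0.4, 1.0):
    print(f"{ratio:.0%} indexed: {total_index_size_gb(100, ratio):.0f} GB in total")
```

The LogReduce share stays the same in every row because it depends only on the raw log size, not on how much of the data is already indexed.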

Enable the LogReduce feature

  1. Log on to the Log Service console.
  2. In the Projects section, click the name of the project that you want to view.
  3. Choose Log Storage > Logstores. On the Logstores tab, click the Logstore that you want to view.
  4. Enable the LogReduce feature.
    1. Choose Index Attributes > Attributes.
      If the indexing feature is not enabled, click Enable.
    2. In the Search & Analysis panel, turn on LogReduce.
    3. Optional: Configure a whitelist or a blacklist to cluster logs by field.
      Note You cannot configure both a whitelist and a blacklist.
      LogReduce Filter | Description
      Whitelist | After you configure a whitelist, Log Service uses only the fields in the whitelist to cluster logs.
      Blacklist | After you configure a blacklist, Log Service does not use the fields in the blacklist to cluster logs.
      No whitelist or blacklist configured | If you do not configure a blacklist or a whitelist, Log Service clusters logs based on all fields and the clustering rules that you specify.
    4. Click OK.
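
The whitelist and blacklist behavior described in the optional step can be modeled with a small sketch. This is only an illustration of the documented rules, not the way Log Service implements them; the function name and the sample log are hypothetical.

```python
from typing import Dict, Iterable, Optional


def fields_for_clustering(
    log: Dict[str, str],
    whitelist: Optional[Iterable[str]] = None,
    blacklist: Optional[Iterable[str]] = None,
) -> Dict[str, str]:
    """Return the fields of one log that would take part in clustering."""
    # The console does not allow both filters to be configured together.
    if whitelist is not None and blacklist is not None:
        raise ValueError("configure either a whitelist or a blacklist, not both")
    if whitelist is not None:
        allowed = set(whitelist)
        return {k: v for k, v in log.items() if k in allowed}
    if blacklist is not None:
        blocked = set(blacklist)
        return {k: v for k, v in log.items() if k not in blocked}
    # No filter configured: every field is used.
    return dict(log)


# Hypothetical log entry used only for illustration.
sample = {"level": "ERROR", "message": "connection timed out", "request_id": "a1b2"}
print(fields_for_clustering(sample, whitelist=["message"]))
print(fields_for_clustering(sample, blacklist=["request_id"]))
```

A blacklist is usually the more convenient choice when only a few high-cardinality fields, such as request IDs, should be kept out of the patterns.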

View log clustering results and raw logs

  1. On the query and analysis page, enter a search statement in the search box, specify the query time range, and then click Search & Analyze.
    Note You can use only search statements to filter logs. You cannot use analytic statements to filter logs because the LogReduce feature cannot cluster analysis results.
  2. Click the LogReduce tab to view the log clustering results.

    You can click Add to New Dashboard to save the log clustering results to a dashboard.

    Log clustering details
    Parameter Description
    Number The ordinal number of the log cluster.
    Count The number of logs for the pattern in the specified query time range.
    Pattern The log pattern. Each log cluster has one or more sub-patterns.
    • Move the pointer over a number in the Count column to view the sub-patterns of the log cluster. You can also view the percentage of each sub-pattern in the log cluster. Click the plus sign (+) next to a number in the Count column to expand the sub-pattern list.
    • Click a number in the Count column. You are navigated to the Raw Logs tab. On this tab, you can view the raw logs of the pattern.

Change the precision of log clustering

On the LogReduce tab, you can adjust the Pattern Count slider to change the precision of log clustering.
  • If you adjust the slider toward Many, you can obtain a more precise log clustering result that has more detailed patterns.
  • If you adjust the slider toward Little, you can obtain a less precise log clustering result that has less detailed patterns.

Compare the number of logs that are clustered in different time periods

  1. On the LogReduce tab, click Log Compare.
  2. Specify a time range and click OK.
    For example, if you set the time range to 15 minutes when you query logs and specify 1Day for Log Compare, the start time and end time of the comparison are automatically displayed. The two time ranges that are compared are the last 15 minutes and the same 15 minutes on the previous day.
    Compare the number of logs
    Parameter Description
    Number The ordinal number of the log cluster.
    Pre_Count The number of logs for the pattern in the time range that is specified by Log Compare.
    Count The number of logs for the pattern in the time range that is specified for the query.
    Diff The difference between the values in the Pre_Count and Count columns, together with the growth rate.
    Pattern The log pattern.
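
The comparison above can be sketched in a few lines. The shift of the query window by the Log Compare interval follows the 15-minute example; the sign of the difference (Count minus Pre_Count) and the growth rate formula (difference divided by Pre_Count) are assumptions, because this topic does not define them. All names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def comparison_window(
    start: datetime, end: datetime, compare_interval_s: int
) -> Tuple[datetime, datetime]:
    """Shift the query window back by the Log Compare interval (in seconds)."""
    shift = timedelta(seconds=compare_interval_s)
    return start - shift, end - shift


def diff_and_growth(pre_count: int, count: int) -> Tuple[int, Optional[float]]:
    """Return (difference, growth rate) for one pattern.

    Assumption: difference = count - pre_count and
    growth rate = difference / pre_count (undefined when pre_count is 0).
    """
    diff = count - pre_count
    growth = diff / pre_count if pre_count else None
    return diff, growth


# A 15-minute query window compared with the same window one day earlier.
end = datetime(2024, 1, 2, 12, 0)
start = end - timedelta(minutes=15)
print(comparison_window(start, end, 86400))
print(diff_and_growth(pre_count=100, count=150))
```

A pattern that did not exist in the earlier window has a Pre_Count of 0, so the sketch returns no growth rate for it instead of dividing by zero.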

Examples of query statements

You can use query statements to obtain log clustering results.

  • Obtain log clustering results
    • Query statement
      * | select a.pattern, a.count, a.signature, a.origin_signatures from (select log_reduce(3) as a from log) limit 1000
      Note When you view log clustering results, you can click Copy Query to obtain the query statement of the log clustering results.
    • Modify parameters

      Modify the parameter settings in log_reduce(precision) of the query statement. The precision parameter specifies the precision of log clustering. A smaller value indicates a higher precision and more patterns. Valid values: 1 to 16. Default value: 3.

    • Returned fields
      You can view log clustering details on the Graph tab.
      Parameter Description
      pattern The log pattern.
      count The number of logs for the pattern in the time range that is specified for the query.
      signature The signature of the log pattern.
      origin_signatures The secondary signature of the log pattern. You can use the secondary signature to retrieve the raw logs.
  • Compare the number of logs that are clustered in different time periods
    • Query statement
      * | select v.pattern, v.signature, v.count, v.count_compare, v.diff from (select compare_log_reduce(3, 86400) as v from log) order by v.diff desc limit 1000 
      Note When you use Log Compare to compare log clustering results in different time periods, you can click Copy Query to obtain the query statement of the log clustering results.
    • Modify parameters
      Modify the parameter settings in compare_log_reduce(precision, compare_interval) of the query statement.
      • The precision parameter specifies the precision of log clustering. A smaller value indicates a higher precision and more patterns. Valid values: 1 to 16. Default value: 3.
      • The compare_interval parameter specifies the time difference between the two time ranges for comparison. The value is a positive integer. Unit: seconds.
    • Returned fields
      Parameter Description
      pattern The log pattern.
      count_compare The number of logs for the pattern in the previous time range that is specified for comparison.
      count The number of logs for the pattern in the time range that is specified for the query.
      diff The difference between the numbers of logs in the count and count_compare columns.
      signature The signature of the log pattern.
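
If you generate these statements from a script, a small helper can validate the parameters before the statement is sent. The sketch below only assembles the two query strings shown above; the helper names are hypothetical, and submitting the statement to Log Service is out of scope here.

```python
def _check_precision(precision: int) -> None:
    # Valid values for precision are 1 to 16, as described above.
    if not 1 <= precision <= 16:
        raise ValueError("precision must be between 1 and 16")


def log_reduce_query(precision: int = 3, limit: int = 1000, search: str = "*") -> str:
    """Build the statement that returns log clustering results."""
    _check_precision(precision)
    return (
        f"{search} | select a.pattern, a.count, a.signature, a.origin_signatures "
        f"from (select log_reduce({precision}) as a from log) limit {limit}"
    )


def compare_log_reduce_query(
    precision: int = 3,
    compare_interval: int = 86400,
    limit: int = 1000,
    search: str = "*",
) -> str:
    """Build the statement that compares two time ranges."""
    _check_precision(precision)
    # compare_interval is a positive integer, in seconds.
    if not isinstance(compare_interval, int) or compare_interval <= 0:
        raise ValueError("compare_interval must be a positive integer (seconds)")
    return (
        f"{search} | select v.pattern, v.signature, v.count, v.count_compare, v.diff "
        f"from (select compare_log_reduce({precision}, {compare_interval}) as v from log) "
        f"order by v.diff desc limit {limit}"
    )


print(log_reduce_query())
print(compare_log_reduce_query(precision=2, compare_interval=3600))
```

The search parameter stands for the search statement before the pipe, so a filter can be applied before clustering without changing the analytic part.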

Disable the LogReduce feature

If you no longer need to use the LogReduce feature, you can disable the feature.

  1. On the query and analysis page of the Logstore for which you want to disable this feature, choose Index Attributes > Attributes.
  2. Turn off LogReduce.
  3. Click OK.