Use log clustering to turn large volumes of unordered logs into structured templates and variable distributions. It helps you spot new anomalies after incidents or releases and find root causes without complex queries. Use it for troubleshooting, regression testing, and security audits.
Function overview
The new log clustering feature provides the following core capabilities:
Log pattern extraction: Automatically discovers log categories from massive volumes of logs and extracts a log template for each category.
Variable distribution analysis: Allows you to view the distribution of variables within a log template.
Grouped clustering: Allows you to group logs by a specified field before clustering. This is useful for analyzing logs from multiple modules.
Comparative analysis: Allows you to compare log pattern changes between different time periods to quickly identify anomalies.
Reverse lookup with regular expressions: Provides a regular expression for each log template to look up the corresponding raw logs.
Core concepts
Log clustering uses the following concepts to turn unstructured logs into insights:
Log category: Logs with a similar format belong to the same log category. Logs with different formats belong to different log categories.
Log template: A common pattern extracted from logs with a similar format. A template consists of constants and variables.
Log variable: The dynamic part of the log content within the same log category. This is usually runtime information recorded in the log output statement. Variables are highlighted in the template.
Log constant: All content in a log template except for the variables. The constant part is the same for all logs in the same log category.
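To make these concepts concrete, the following Python sketch illustrates the idea of a template, constants, and variables. It is not the clustering algorithm that the feature uses (which this document does not describe); it simply masks two common variable token shapes, IPv4 addresses and numbers, so that logs sharing the same constant skeleton collapse into one template:

```python
import re

def to_template(log_line: str) -> str:
    """Mask typical variable tokens so similar logs share one template."""
    line = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "*", log_line)  # IPv4 addresses
    line = re.sub(r"\b\d+\b", "*", line)                          # plain numbers
    return line

logs = [
    "connect to 10.0.0.1 failed after 3 retries",
    "connect to 10.0.0.2 failed after 5 retries",
]
# Both lines belong to one log category: the constants ("connect to",
# "failed after", "retries") are identical, only the variables differ.
templates = {to_template(line) for line in logs}
print(templates)  # {'connect to * failed after * retries'}
```

In this toy example, "connect to" and "failed after ... retries" are the constants, while the IP address and retry count are the variables that the template highlights.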
Prerequisites
Before you use this feature, make sure that an index is enabled for the text fields that you want to cluster.
Note
The new log clustering feature can cluster only data in text fields for which an index is enabled. Log templates are calculated in real time based on the field index, so no extra index traffic is generated beyond the traffic for log field indexing.
Perform log clustering analysis
This section guides you through a complete log clustering analysis, including basic clustering, grouped clustering, and comparative analysis.
Step 1: Go to the log clustering page
Log on to the Simple Log Service console.
In the Projects section, click the target project.
In the navigation pane on the left, select Log Storage, and then click the expand icon next to the target logstore name.
In the drop-down menu, click Search & Analysis.
On the search and analysis page, click the log clustering icon.
Step 2: Configure clustering parameters and run the analysis
On the clustering page, configure the following parameters to start the analysis:
Set the analysis scope
Time range (Optional): Select the time range of the logs that you want to analyze.
Query statement (Optional): Enter a query statement, such as * and not level:INFO, to filter logs and narrow the analysis scope. Excluding irrelevant logs can significantly improve the accuracy and efficiency of the clustering analysis.
Select the clustering target
Clustering field (Required): Select the text field that you want to analyze. The system recommends the most suitable field for clustering based on the field content. You can select only one field.
(Optional) Grouping and comparative analysis
Aggregation field (Optional): Select one or more fields to pre-group the logs. An index must be configured and statistics must be enabled for the fields. When logs contain multiple modules, such as different service_name values, grouping allows each module to be clustered independently. This prevents similar logs from different modules from being incorrectly categorized together.
Scenarios:
Logs from different modules vary significantly and require separate clustering analysis.
You need to view clustering results grouped by dimensions such as log level or service name.
Description of aggregation fields:
These are dimension fields used for pre-grouping. Logs are aggregated by these fields before clustering.
Recommended fields include log level, service, module, and environment.
You can select multiple aggregation fields (up to three) for combined grouping.
Pre-aggregation can improve clustering accuracy and prevent logs from different dimensions from being incorrectly categorized.
Select fields that have a limited number of values and are meaningful to your business as aggregation dimensions.
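The pre-grouping step described above can be sketched as a simple group-by before clustering. This Python sketch uses hypothetical log records with a service_name aggregation field; the cluster step itself is only indicated by a comment:

```python
from collections import defaultdict

# Hypothetical records: each log carries a service_name aggregation field.
logs = [
    {"service_name": "auth", "content": "login failed for user 42"},
    {"service_name": "auth", "content": "login failed for user 7"},
    {"service_name": "billing", "content": "invoice 1001 created"},
]

# Pre-group by the aggregation field, then cluster each group independently.
groups = defaultdict(list)
for log in logs:
    groups[log["service_name"]].append(log["content"])

for service, lines in sorted(groups.items()):
    # Clustering would run on `lines` here. Grouping keeps "auth" and
    # "billing" logs from being merged into the same template.
    print(service, len(lines))
```

Without the group-by, structurally similar lines from different services could fall into one template and hide per-module anomalies.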
Comparison time (Optional): Select a time period for comparison. By comparing the data with historical data, you can quickly identify log patterns that are new, have disappeared, or have significantly increased or decreased during a version release or failure.
Scenarios:
Comparing logs before and after a version release.
Comparing a failure period with a normal period.
Periodic comparison, such as today vs. the same time yesterday, or this week vs. the same time last week.
Evaluating the effects of performance optimization.
Configuration methods:
Time Offset: Shift forward or backward relative to the current time range.
Custom Time: Manually select the start and end times for the comparison group.
Configuration recommendations:
Select time periods of the same duration for comparison to ensure the results are comparable. Do not select overlapping time ranges.
For periodic services, use a periodic comparison, such as the same time yesterday or the same time last week.
You can combine this with aggregation fields for a layered comparison. For example, you can compare log changes for each service separately using the service dimension.
After you complete the configuration, the system automatically starts the real-time clustering analysis.
Step 3: Interpret the clustering results
After the analysis is complete, the results are displayed in a list. Each row represents a log template.
Clustering Mode: Shows the pattern extracted from the logs. The variable parts are highlighted.
Quantity: Shows the total number of logs that match the template.
Log Distribution: Shows the distribution of logs that match this template at different points in time as a histogram, which helps you observe trends.
Group information: If you configured an aggregation field, this column shows the group to which the template belongs.
The clustering algorithm samples a representative batch of logs for analysis. If the data volume in the time window is too large and the log formats are complex, the sampled logs may not cover all log categories.
You can configure a query statement to filter out logs that you are not interested in. This improves the clustering results.
Step 4: Drill down to analyze variable distribution
When you find a suspicious log template, such as an error log template with a sharp increase in count, analyzing the distribution of its internal variables is a key step in finding the root cause.
In the list of clustering results, click the highlighted variable in the Clustering Mode column.
In the panel that appears, view the distribution of the variable:
Enumeration type: Shows the top N values of the variable and their occurrence counts. For example, you can view the distribution of the error_code variable to quickly find which error code appears most frequently.
Numeric type: Shows the distribution of the variable's numeric range.
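An enumeration-type distribution is essentially a value-frequency count. This minimal Python sketch, using made-up error_code values, shows the kind of top-N view the panel presents:

```python
from collections import Counter

# Hypothetical error_code values extracted from logs matching one template.
error_codes = ["500", "500", "503", "500", "429"]

distribution = Counter(error_codes)
# Top-N values and their occurrence counts, most frequent first.
print(distribution.most_common(3))  # [('500', 3), ('503', 1), ('429', 1)]
```

Here "500" dominating the distribution would point the investigation toward internal server errors.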
If you configured a comparison time, the panel displays the variable distribution for both the current time period (experiment group, dark color) and the comparison time period (control group, light color). This helps you quickly identify changes:
Dark column chart: The number of logs in the current time range (experiment group).
Light column chart: The number of logs in the comparison time range (control group).
By comparing them, you can quickly identify:
New log patterns that exist in the experiment group but not in the control group.
Disappeared log patterns that exist in the control group but not in the experiment group.
Log patterns with significant changes in count.
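The three comparison outcomes above reduce to set and count arithmetic over the two periods' template counts. A sketch with hypothetical templates and counts:

```python
# Hypothetical template counts for the current window (experiment group)
# and the comparison window (control group).
current  = {"timeout calling *": 120, "user * logged in": 300}
baseline = {"user * logged in": 290, "cache miss for key *": 15}

new_patterns = set(current) - set(baseline)   # only in the experiment group
disappeared  = set(baseline) - set(current)   # only in the control group
# Count deltas for templates present in both periods.
changed = {t: current[t] - baseline[t] for t in set(current) & set(baseline)}

print(new_patterns)  # {'timeout calling *'}
print(disappeared)   # {'cache miss for key *'}
print(changed)       # {'user * logged in': 10}
```

A new timeout template after a release, as in this toy data, is exactly the kind of signal the comparison view is meant to surface.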
Step 5: Look up raw logs
After you locate a specific log template, you may need to view the complete raw logs to obtain more context.
In the list of clustering results, click the row of the target log template to go to the template details page.
On the Log Sample tab, you can view some of the raw logs that correspond to the template. A maximum of 50 logs are displayed.
To view all logs that match the template, perform the following steps:
In the upper-right corner, click Display Parsing Rules to obtain the regular expression for the template.
Copy the regular expression.
Go to the Search & Analyze page and use the regexp_like operator to run a query. Example:
* | SELECT * FROM log WHERE regexp_like(Content, 'regular_expression')
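Before running the query in the console, you can sanity-check the copied regular expression against a few sample lines locally. This Python sketch mimics the regexp_like filter with a hypothetical template regex (the real expression comes from Display Parsing Rules):

```python
import re

# Hypothetical regular expression for the template
# "connect to * failed after * retries".
pattern = re.compile(r"connect to \S+ failed after \d+ retries")

raw_logs = [
    "connect to 10.0.0.1 failed after 3 retries",
    "user 42 logged in",
]

# Equivalent in spirit to: WHERE regexp_like(Content, 'regular_expression')
matched = [line for line in raw_logs if pattern.search(line)]
print(matched)  # ['connect to 10.0.0.1 failed after 3 retries']
```

If the expression matches too much or too little locally, it will behave the same way in the console query.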
Best practices
Quickly locate abnormal logs
Filter out normal logs with a query (for example, * and not LEVEL:INFO).
Review clustering results for the remaining logs and focus on abnormal or new patterns.
Open a suspicious template to check samples and variable distribution.
Compare version releases
Set the current time range to the period after the release.
Set the comparison time to the period before the release.
View the comparison results. Focus on new or disappeared log patterns.
Check for changes in variable distribution to determine if there are any anomalies.
Analyze by module
Set the aggregation field to a module identifier field, such as Component or ServiceName.
View the clustering results for each module to quickly locate the problematic module.
Choose between new and old log clustering
To help you make the right technical choice, the following table compares the core differences between the new and old versions of log clustering.
| Feature | New log clustering | Old log clustering |
| --- | --- | --- |
| Index traffic | No extra index traffic (relies only on the field index) | Generates extra index traffic |
| Processing efficiency | Real-time calculation. Uses sampling for large data volumes. | High processing efficiency. Suitable for large data volumes. |
| Algorithm accuracy | Uses a more accurate algorithm | Relatively lower algorithm accuracy |
| Variable distribution | Supports analysis of variable distribution in log patterns | Not supported |
| Raw log lookup | Look up using the regexp_like operator | Natively supports direct lookup |
| Multi-field clustering | Supports only single-field clustering | Supports simultaneous clustering of multiple fields |
| Grouped clustering | Supported | Not supported |
| Dashboard integration | Not supported | Supported |
| Clustering precision adjustment | Not supported | Supported |
Conclusion:
New log clustering: Suitable for scenarios with relatively small or filtered log volumes that require fine-grained analysis, such as variable distribution and comparative analysis.
Old log clustering: Suitable for fast clustering of massive log volumes and for scenarios where clustering results need to be displayed on a dashboard.
FAQ
Why are the clustering results empty?
Possible reasons and solutions:
No data in the selected field: Check if the clustering field contains data, or adjust the time range.
The query statement filtered out all logs: Check if the query statement is correct.
Index is not enabled for the field: Make sure that an index is configured for the clustering field.
What can I do if variables in a log template are not identified accurately?
The new log clustering feature uses an intelligent algorithm to automatically identify variables. If the identification is inaccurate, it may be because:
The log format is not standard, and the distinction between the constant and variable parts is not clear.
The log sample size is insufficient. You can expand the time range or reduce the filter conditions.
How can I view all logs for a specific log category?
Go to the log template details page.
Click and copy the regular expression.
On the LogSearch page, use the regexp_like operator to filter the logs:
* | SELECT * FROM log WHERE regexp_like(Content, 'copied regular expression')
During comparative analysis, why do some log patterns appear in only one time period?
This usually indicates:
Unique to the experiment group: This may be a new log. Check if it is an anomaly.
Unique to the control group: This may be a log that has disappeared due to a fixed issue or a feature change.
Analyze the situation based on your business context to determine if further investigation is needed.