Simple Log Service provides powerful alerting and analysis tools that help you diagnose issues and pinpoint anomalous sub-dimensions. When an anomaly occurs in a time series metric, use a root cause analysis function to identify the dimension attributes causing the deviation.
rca_kpi_search
Syntax
select rca_kpi_search(varchar_array, name_array, real, forecast, level)
The following table describes the parameters.
| Parameter | Description | Value |
| --- | --- | --- |
| varchar_array | The dimension attributes. | Array. For example, |
| name_array | The names of the dimensions. | Array. For example, |
| real | The actual value for the corresponding attribute combination. | Type: double. All real numbers are supported. |
| forecast | The predicted value for the corresponding attribute combination. | Type: double. All real numbers are supported. |
| level | The number of dimension attributes to include in the output root cause set. If you set this parameter to 0, the function returns all identified root cause sets. | Type: long. Valid values: 0 ≤ level ≤ The number of dimensions to analyze (the length of |
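
The parameter rules above can be checked on the client side before a query is sent. The following Python sketch composes the function call from a list of dimension column names and enforces the documented range for level. The helper name build_rca_call and its keyword arguments are hypothetical and are not part of Simple Log Service.

```python
def build_rca_call(dimensions, level=0, real_col="real", forecast_col="forecast"):
    """Compose an rca_kpi_search(...) call for the given dimension columns.

    Hypothetical helper for illustration; it only builds a string.
    """
    if not dimensions:
        raise ValueError("at least one dimension is required")
    # Documented constraint: 0 <= level <= number of dimensions to analyze.
    if not 0 <= level <= len(dimensions):
        raise ValueError(f"level must be between 0 and {len(dimensions)}")
    # varchar_array holds column references; name_array holds string literals.
    columns = ", ".join(dimensions)
    names = ", ".join(f"'{d}'" for d in dimensions)
    return (
        f"rca_kpi_search(array[{columns}], array[{names}], "
        f"{real_col}, {forecast_col}, {level})"
    )


call = build_rca_call(["ProjectName", "LogStore", "UserAgent", "Method"], level=1)
print(call)
```

Passing a level larger than the number of dimensions raises ValueError, so an invalid call is caught before it reaches the service.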
Example
- Query and analysis:
  First, use a subquery to organize the actual and predicted values for each fine-grained attribute combination. Then, call the rca_kpi_search function to analyze the root cause of the anomaly.
  * not Status:200 |
  select rca_kpi_search(
      array[ ProjectName, LogStore, UserAgent, Method ],
      array[ 'ProjectName', 'LogStore', 'UserAgent', 'Method' ],
      real, forecast, 1)
  from (
      select ProjectName, LogStore, UserAgent, Method,
          sum(case when time < 1552436040 then real else 0 end) * 1.0 / sum(case when time < 1552436040 then 1 else 0 end) as forecast,
          sum(case when time >= 1552436040 then real else 0 end) * 1.0 / sum(case when time >= 1552436040 then 1 else 0 end) as real
      from (
          select ("__time__" - ("__time__" % 60)) as time, ProjectName, LogStore, UserAgent, Method, COUNT(*) as real
          from log
          GROUP by time, ProjectName, LogStore, UserAgent, Method )
      GROUP BY ProjectName, LogStore, UserAgent, Method
      limit 100000000)
- Output: The analysis results include a time-series area chart and multiple expandable root cause set entries. The root cause set table includes the columns region, project, Logstore, useragent, method, rsst, change, and score. Here, rsst is the statistical value of the root cause submetric, and score is the abnormality score for the root cause. A small trend line chart for the corresponding submetric is displayed on the right, which helps you visually inspect fluctuations during the anomaly.
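
The subquery in the example query reduces per-minute counts to two numbers for each attribute combination: the average count before the timestamp 1552436040 (used as forecast) and the average count from that timestamp onward (used as real). The following Python sketch reproduces that arithmetic on a small in-memory list. The sample rows and the function name split_means are hypothetical, and returning None for an empty side is a choice made for this sketch only.

```python
from collections import defaultdict

SPLIT = 1552436040  # boundary timestamp taken from the example query


def split_means(rows, split=SPLIT):
    """Return {dims: (real, forecast)} from (minute_ts, dims, count) rows.

    forecast is the mean count before the split, real is the mean count at
    or after the split. A side with no rows yields None (sketch assumption).
    """
    before = defaultdict(list)
    after = defaultdict(list)
    for ts, dims, count in rows:
        (after if ts >= split else before)[dims].append(count)
    result = {}
    for dims in set(before) | set(after):
        b, a = before[dims], after[dims]
        forecast = sum(b) / len(b) if b else None
        real = sum(a) / len(a) if a else None
        result[dims] = (real, forecast)
    return result


# Hypothetical per-minute counts for one attribute combination.
sample_rows = [
    (SPLIT - 120, ("p1", "GET"), 10),
    (SPLIT - 60, ("p1", "GET"), 14),
    (SPLIT, ("p1", "GET"), 30),
    (SPLIT + 60, ("p1", "GET"), 34),
]
means = split_means(sample_rows)
print(means[("p1", "GET")])  # (32.0, 12.0)
```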
Output structure:
{
  "rcSets": [
    {
      "rcItems": [
        {
          "kpi": [{"attr": "xxx", "val": "xxx"}],
          "nleaf": 100,
          "change": 0.524543,
          "score": 0.1454543
        }
      ]
    }
  ]
}
The following table describes the display items.
| Parameter | Description |
| --- | --- |
| rcSets | An array of root cause sets. |
| rcItems | An array of items that form the root cause set. |
| kpi | An array representing the conditions of an item within the root cause set. Each element in the array is an object where |
| nleaf | The number of leaves in the raw data that are covered by a KPI item in the root cause set. Note: A leaf is a log entry for the finest-grained attribute combination. |
| change | The proportion of the total anomalous change contributed by this item's leaves. |
| score | The abnormality score for this root cause item. The value is in the range [0, 1]. |
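
Because the fields above are nested two levels deep, it can help to flatten the result into one row per root cause item before you rank or display it. The following Python sketch does that and sorts the rows by score in descending order. The function name flatten_rc_sets and the sample values are hypothetical; only the field names come from the output structure described above.

```python
def flatten_rc_sets(result):
    """Return one dict per root cause item, sorted by score (highest first)."""
    rows = []
    for set_index, rc_set in enumerate(result.get("rcSets", [])):
        for item in rc_set.get("rcItems", []):
            rows.append({
                "set": set_index,
                # Collapse the kpi array into an attr -> val mapping.
                "kpi": {c["attr"]: c["val"] for c in item.get("kpi", [])},
                "nleaf": item["nleaf"],
                "change": item["change"],
                "score": item["score"],
            })
    return sorted(rows, key=lambda r: r["score"], reverse=True)


# Hypothetical result with two root cause sets.
sample = {
    "rcSets": [
        {"rcItems": [{"kpi": [{"attr": "method", "val": "GET"}],
                      "nleaf": 100, "change": 0.52, "score": 0.14}]},
        {"rcItems": [{"kpi": [{"attr": "method", "val": "POST"}],
                      "nleaf": 40, "change": 0.31, "score": 0.30}]},
    ]
}
rows = flatten_rc_sets(sample)
for row in rows:
    print(row["set"], row["kpi"], row["score"])
```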
The output is a JSON object. The following code shows an example:
{
  "rcSets": [
    {
      "rcItems": [
        {
          "kpi": [
            {
              "attr": "country",
              "val": "*"
            },
            {
              "attr": "province",
              "val": "*"
            },
            {
              "attr": "provider",
              "val": "*"
            },
            {
              "attr": "domain",
              "val": "example.com"
            },
            {
              "attr": "method",
              "val": "*"
            }
          ],
          "nleaf": 119,
          "change": 0.3180687806279939,
          "score": 0.14436007709620113
        }
      ]
    }
  ]
}
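
To read an item such as the one above at a glance, you can keep only the conditions that name a specific value. The following Python sketch assumes that a val of "*" means the dimension is not constrained. That reading is an assumption based on the example, and the function name describe_item is hypothetical.

```python
import json

# The example output shown above, abbreviated to the fields used here.
example = json.loads("""
{"rcSets": [{"rcItems": [{
  "kpi": [
    {"attr": "country", "val": "*"},
    {"attr": "province", "val": "*"},
    {"attr": "provider", "val": "*"},
    {"attr": "domain", "val": "example.com"},
    {"attr": "method", "val": "*"}
  ],
  "nleaf": 119,
  "change": 0.3180687806279939,
  "score": 0.14436007709620113
}]}]}
""")


def describe_item(item, wildcard="*"):
    """Join the non-wildcard kpi conditions into a readable string.

    Treating "*" as "any value" is an assumption made for this sketch.
    """
    conditions = [
        f"{c['attr']} = {c['val']}" for c in item["kpi"] if c["val"] != wildcard
    ]
    return " AND ".join(conditions) if conditions else "(all values)"


item = example["rcSets"][0]["rcItems"][0]
print(describe_item(item))  # domain = example.com
```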