Optimize the parameters of statistical time series anomaly detection algorithms - Lindorm

This topic describes how to optimize the parameters of statistical time series anomaly detection algorithms, including the esd, ttest, and nsigma algorithms.

Background information

Statistical time series anomaly detection algorithms calculate a score for each data point based on historical data and record the score as anomalyScore. The score is compared with a threshold that is derived from the input parameters of the algorithm, such as esd.alpha, ttest.alpha, and nsigma.n. If the anomalyScore value of a data point is larger than the threshold, the data point is determined to be abnormal. Otherwise, the data point is determined to be normal.

For the esd and ttest algorithms, a larger esd.alpha or ttest.alpha value makes the algorithm more sensitive, so more data points are detected as abnormal. A smaller value makes the algorithm less sensitive, so fewer data points are detected as abnormal. The nsigma algorithm works in the opposite direction: a smaller nsigma.n value makes the algorithm more sensitive, and a larger nsigma.n value makes it less sensitive.

For more information about the scenarios and parameters of statistical time series anomaly detection algorithms, see Time series anomaly detection.

Parameter settings

If you want all abnormal data points to be detected, configure the input parameters as follows to increase the sensitivity of the algorithms:
- For the esd algorithm, set the esd.alpha parameter to a larger value. Example: esd.alpha=0.3.
- For the ttest algorithm, set the ttest.alpha parameter to a larger value. Example: ttest.alpha=0.2.
- For the nsigma algorithm, set the nsigma.n parameter to a smaller value. Example: nsigma.n=1.
Important: If the values of esd.alpha and ttest.alpha are too large, the algorithms become too sensitive and too many data points are determined to be abnormal.
If you want only significantly abnormal data points to be detected, configure the input parameters as follows to reduce the sensitivity of the algorithms:
- For the esd algorithm, set the esd.alpha parameter to a smaller value. Example: esd.alpha=0.05.
- For the ttest algorithm, set the ttest.alpha parameter to a smaller value. Example: ttest.alpha=0.01.
- For the nsigma algorithm, set the nsigma.n parameter to a larger value. Example: nsigma.n=5.
Important: If the values of esd.alpha and ttest.alpha are too small, only serious anomalies are reported and some anomalies that you expect to be detected may be missed.
You can configure the input parameters of the algorithms based on the data distribution in different scenarios. For more information, see Scenarios.

Scenarios

The number of detected abnormal data points is too large or too small

Optimization

You can adjust the sensitivity of the algorithm, which raises or lowers the threshold, to change the number of detected abnormal data points.

esd/ttest algorithm: Set the alpha parameter to a larger value to increase the sensitivity of the algorithm. In this case, the threshold of the algorithm is reduced and more abnormal data points are detected. If you set the alpha parameter to a smaller value, the threshold is raised and fewer abnormal data points are detected.
The following statements show how to adjust the sensitivity of the esd and ttest algorithms:
```
-- Increase the value of the alpha parameter to 0.2. In this case, the algorithm becomes more sensitive and more abnormal data points are detected.
SELECT xx, anomaly_detect(mean_duration, 'esd', 'alpha=0.2') as res FROM xxx SAMPLE BY 0;
SELECT xx, anomaly_detect(mean_duration, 'ttest', 'alpha=0.2') as res FROM xxx SAMPLE BY 0;
```
Parameters
alpha: the sensitivity of the esd and ttest algorithms. Valid values range from 0 to 1. The default value of the alpha parameter is 0.1 for the esd algorithm and 0.05 for the ttest algorithm.
nsigma algorithm: Set the n parameter to a smaller value to increase the sensitivity of the algorithm. In this case, more abnormal data points are detected. If you set the n parameter to a larger value, fewer abnormal data points are detected.
The following statement shows how to adjust the sensitivity of the nsigma algorithm:
```
-- Decrease the value of the n parameter from the default value 3 to 2. In this case, the algorithm becomes more sensitive and more abnormal data points are detected.
SELECT xx, anomaly_detect(mean_duration, 'nsigma', 'n=2') as res FROM xxx SAMPLE BY 0;
```
Parameters
n: the threshold based on which the nsigma algorithm determines abnormal data points. The value of the n parameter cannot be 0 and is 3 by default.
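
The effect of n can be illustrated outside Lindorm with a generic n-sigma rule. The following Python sketch is not Lindorm's implementation: the function nsigma_flags and the sample values are hypothetical, and the rule shown (flag a value whose distance from the mean exceeds n standard deviations) is the textbook form, assumed here only to show the direction of the effect.

```python
from statistics import mean, pstdev

def nsigma_flags(values, n=3.0):
    """Flag values that lie more than n standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series has no spread, so nothing can be flagged.
        return [False] * len(values)
    return [abs(v - mu) / sigma > n for v in values]

# Hypothetical sample values, not taken from a real table.
samples = [8, 9, 10, 11, 12, 10, 9, 11, 10, 10, 14, 20]

# Count the flagged points for several values of n.
counts = {n: sum(nsigma_flags(samples, n)) for n in (1, 2, 3)}
print(counts)
```

For this sample, the number of flagged points drops as n grows, which matches the behavior described for the nsigma.n parameter.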

Adjust the detection results of specific data points

You can adjust the alpha or n parameter of the algorithm to prevent specific data points from being determined abnormal.

For example, the res column of the following table stores the detection results. According to the results, the data points whose mean_duration values are 13622.6 and 8651.6 are detected as abnormal.

+----------------------------+----------------+-------+
|           time             | mean_duration  |  res  |
+----------------------------+----------------+-------+
|  2022-04-11T13:00:00+08:00 | 0              | false |
|  2022-04-11T14:00:00+08:00 | 0              | false |
|  2022-04-11T15:00:00+08:00 | 0              | false |
|  2022-04-11T16:00:00+08:00 | 0              | false |
|  2022-04-11T17:00:00+08:00 | 3136.3         | false |
|* 2022-04-11T18:00:00+08:00 | 13622.6        | true  |
|* 2022-04-11T19:00:00+08:00 | 8651.6         | true  |
|  2022-04-11T20:00:00+08:00 | 6735.46        | false |
|  2022-04-11T21:00:00+08:00 | 1496.683       | false |
|  2022-04-11T22:00:00+08:00 | 1691.3175      | false |
+----------------------------+----------------+-------+

If you want to prevent the data point whose mean_duration value is 8651.6 from being detected as abnormal, perform the following steps to configure the input parameter.

Optimization

Add the verbose=true condition in the anomaly detection statement to enable the verbose mode.

SELECT xx, anomaly_detect(mean_duration, 'esd', 'verbose=true') as res FROM xxx SAMPLE BY 0

The following result is returned:

+----------------------------+---------------+-------------+---------------------+--------------------+-----------------------+
|           time             | mean_duration | res$anomaly |  res$anomalyScore   |   res$threshold    | res$detectedDirection |
+----------------------------+---------------+-------------+---------------------+--------------------+-----------------------+
| 2022-04-11T13:00:00+08:00  | 0             | false       | 0                   | 1.6447834844273468 | NONE                  |
| 2022-04-11T14:00:00+08:00  | 0             | false       | 0                   | 1.6447834844273468 | NONE                  |
| 2022-04-11T15:00:00+08:00  | 0             | false       | 0                   | 1.6447834844273468 | NONE                  |
| 2022-04-11T16:00:00+08:00  | 0             | false       | 0                   | 1.6447834844273468 | NONE                  |
| 2022-04-11T17:00:00+08:00  | 3136.3        | false       | 0.6917962785972575  | 1.6447834844273468 | NONE                  |
|* 2022-04-11T18:00:00+08:00 | 13622.6       | true        | 3.0136653345953954  | 1.6447834844273468 | UP                    |
|* 2022-04-11T19:00:00+08:00 | 8651.6        | true        | 1.7122438285577357  | 1.6447834844273468 | UP                    |
| 2022-04-11T20:00:00+08:00  | 6735.46       | false       | 1.252994967798293   | 1.6447834844273468 | NONE                  |
| 2022-04-11T21:00:00+08:00  | 1496.683      | false       | 0                   | 1.6447834844273468 | NONE                  |
| 2022-04-11T22:00:00+08:00  | 1691.3175     | false       | 0                   | 1.6447834844273468 | NONE                  |
+----------------------------+---------------+-------------+---------------------+--------------------+-----------------------+

According to the result, the anomaly scores of the two data points with the 2022-04-11T18:00:00+08:00 and 2022-04-11T19:00:00+08:00 timestamps exceed the threshold. Therefore, the two data points are determined abnormal. To prevent the data point with the 2022-04-11T19:00:00+08:00 timestamp from being determined abnormal, you can modify the value of the alpha parameter to change the threshold to a value within the (1.71, 3.01) interval.
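
The threshold in the preceding output (about 1.6448 for the default alpha of 0.1) is close to the two-sided critical value of the standard normal distribution for the same alpha. That is an observation about this example, not a documented description of how the esd algorithm derives its threshold. Under that assumption, the following Python sketch gives a rough way to pick a candidate alpha for a target threshold interval. The helper names approx_threshold and approx_alpha are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def approx_threshold(alpha):
    """Two-sided standard-normal critical value for significance level alpha.

    Assumption: used here only as an approximation of res$threshold.
    """
    return STANDARD_NORMAL.inv_cdf(1 - alpha / 2)

def approx_alpha(threshold):
    """Inverse of approx_threshold: the alpha whose critical value is threshold."""
    return 2 * (1 - STANDARD_NORMAL.cdf(threshold))

# Approximate thresholds for the default and a smaller alpha.
print(round(approx_threshold(0.1), 3))   # 1.645
print(round(approx_threshold(0.05), 3))  # 1.96

# Approximate alpha range that places the threshold inside (1.71, 3.01).
alpha_low, alpha_high = approx_alpha(3.01), approx_alpha(1.71)
print(alpha_low < 0.05 < alpha_high)     # True
```

Because this is only an approximation, confirm the actual threshold in the res$threshold column after you change the alpha parameter.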

Change the value of alpha from the default value 0.1 to 0.05 to reduce the sensitivity of the algorithm.

SELECT xx, anomaly_detect(mean_duration, 'esd', 'alpha=0.05, verbose=true') as res FROM xxx SAMPLE BY 0

The following result is returned:

+----------------------------+---------------+-------------+---------------------+--------------------+-----------------------+
|           time             | mean_duration | res$anomaly |  res$anomalyScore   |   res$threshold    | res$detectedDirection |
+----------------------------+---------------+-------------+---------------------+--------------------+-----------------------+
| 2022-04-11T13:00:00+08:00  | 0             | false       | 0                   | 1.9598247370788646 | NONE                  |
| 2022-04-11T14:00:00+08:00  | 0             | false       | 0                   | 1.9598247370788646 | NONE                  |
| 2022-04-11T15:00:00+08:00  | 0             | false       | 0                   | 1.9598247370788646 | NONE                  |
| 2022-04-11T16:00:00+08:00  | 0             | false       | 0                   | 1.9598247370788646 | NONE                  |
| 2022-04-11T17:00:00+08:00  | 3136.3        | false       | 0.6917962785972575  | 1.9598247370788646 | NONE                  |
|* 2022-04-11T18:00:00+08:00 | 13622.6       | true        | 3.0136653345953954  | 1.9598247370788646 | UP                    |
| 2022-04-11T19:00:00+08:00  | 8651.6        | false       | 1.7122438285577357  | 1.9598247370788646 | NONE                  |
| 2022-04-11T20:00:00+08:00  | 6735.46       | false       | 1.252565839324602   | 1.9598247370788646 | NONE                  |
| 2022-04-11T21:00:00+08:00  | 1496.683      | false       | 0                   | 1.9598247370788646 | NONE                  |
| 2022-04-11T22:00:00+08:00  | 1691.3175     | false       | 0                   | 1.9598247370788646 | NONE                  |
+----------------------------+---------------+-------------+---------------------+--------------------+-----------------------+

According to the detection result, the threshold in the res$threshold column is changed to 1.9598247370788646. Therefore, the data point with the 2022-04-11T19:00:00+08:00 timestamp is not determined abnormal. The data point with the 2022-04-11T18:00:00+08:00 timestamp is still determined abnormal.
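
The comparison can be replayed offline with the values from the verbose output. The following Python sketch copies three anomaly scores and the two thresholds from the preceding tables and applies the rule that a data point is abnormal when its score is larger than the threshold. The helper names flagged and margins are hypothetical.

```python
# Values copied from the res$anomalyScore column of the verbose output.
scores = {
    "2022-04-11T17:00:00+08:00": 0.6917962785972575,
    "2022-04-11T18:00:00+08:00": 3.0136653345953954,
    "2022-04-11T19:00:00+08:00": 1.7122438285577357,
}

# Values copied from the res$threshold column for each alpha setting.
THRESHOLD_ALPHA_010 = 1.6447834844273468
THRESHOLD_ALPHA_005 = 1.9598247370788646

def flagged(scores, threshold):
    """Return the timestamps whose score is larger than the threshold."""
    return [ts for ts, s in scores.items() if s > threshold]

def margins(scores, threshold):
    """Return score minus threshold; a positive margin means abnormal."""
    return {ts: s - threshold for ts, s in scores.items()}

print(flagged(scores, THRESHOLD_ALPHA_010))  # the 18:00 and 19:00 points
print(flagged(scores, THRESHOLD_ALPHA_005))  # only the 18:00 point
print(margins(scores, THRESHOLD_ALPHA_010))
```

The margin shows how far a score is from the threshold. A data point with a small positive margin, such as the 19:00 data point under the default alpha, is the kind of result that a small change in alpha can flip.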

Reduce unexpected anomalies when a small number of data points are detected

If the number of data points that need to be detected is small, unexpected anomalies may be reported because there are not enough data points for the algorithm to train a stable model. The algorithm reports fewer unexpected anomalies as more data points are processed.

Optimization: Set the warmupCount parameter to a larger value. For example, if you set warmupCount to 100, the first 100 data points are used only to train the model and are not returned as anomalies.
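
The idea of a warm-up period can be sketched with a generic streaming n-sigma detector in Python. This is an illustration under stated assumptions, not Lindorm's implementation: the function detect_with_warmup, its warmup_count argument, and the sample values are hypothetical. The sketch only mirrors the documented behavior that warm-up data points are used for training and are never reported as anomalies.

```python
from statistics import mean, pstdev

def detect_with_warmup(values, n=3.0, warmup_count=5):
    """Streaming n-sigma check that never flags the first warmup_count points."""
    flags = []
    history = []
    for i, v in enumerate(values):
        if i < warmup_count or len(history) < 2:
            # Warm-up: the point only contributes to the statistics.
            flags.append(False)
        else:
            mu, sigma = mean(history), pstdev(history)
            flags.append(sigma > 0 and abs(v - mu) / sigma > n)
        history.append(v)
    return flags

# Hypothetical series: ordinary noise followed by one real spike at the end.
values = [10, 10.2, 12, 9, 11, 10, 12, 9, 11, 10, 25]

no_warmup = [i for i, f in enumerate(detect_with_warmup(values, warmup_count=0)) if f]
with_warmup = [i for i, f in enumerate(detect_with_warmup(values, warmup_count=5)) if f]
print(no_warmup)    # includes an early point judged against only two samples
print(with_warmup)  # only the final spike
```

Without a warm-up period, the third value is flagged because the statistics are estimated from only two nearly identical points. With a warm-up period, only the final spike is reported.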

Data points whose anomaly scores slightly deviate from the threshold cannot be detected because of dirty data

The table on which you want to perform anomaly detection contains dirty data, such as data points whose anomaly scores are significantly larger than the threshold. In this case, data points whose anomaly scores slightly deviate from the threshold cannot be detected.

Optimization: Add the reset_state=true condition in the anomaly detection statement to reset the model. Example:

SELECT xx, anomaly_detect(mean_duration, 'esd', 'reset_state=true') as res FROM xxx WHERE time >= xxxx SAMPLE BY 0
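
The masking effect of dirty data can be illustrated with a generic mean and standard deviation score in Python. This sketch is not Lindorm's implementation: the function score and the sample values are hypothetical, and resetting the model is modeled simply as computing the statistics from history that excludes the dirty data point, similar to restricting the time range in the preceding statement.

```python
from statistics import mean, pstdev

def score(history, value):
    """Distance of value from the history mean, in standard deviations."""
    mu, sigma = mean(history), pstdev(history)
    return abs(value - mu) / sigma if sigma > 0 else 0.0

# Hypothetical histories: identical except for one extreme (dirty) value.
clean_history = [10, 11, 9, 10, 12, 10, 9, 11, 10, 11]
dirty_history = [10, 11, 9, 500, 12, 10, 9, 11, 10, 11]

candidate = 16  # a moderate deviation that should be noticed

masked = score(dirty_history, candidate)     # spread is inflated by 500
after_reset = score(clean_history, candidate)
print(masked < 3 < after_reset)              # True
```

The extreme value inflates the standard deviation of the history, so the moderate deviation receives a low score. After the history no longer contains the dirty value, the same data point scores well above a typical threshold.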

Data points whose anomaly scores slightly deviate from the threshold cannot be detected after the business code is optimized

After the business code is optimized, the data points that need to be detected fluctuate less. In this case, some data points whose anomaly scores slightly deviate from the threshold cannot be detected.

Optimization: The algorithm automatically updates the model based on new data. Therefore, this issue is gradually resolved as more data is written to the table. If you want to fix this issue immediately, we recommend that you reset the model and modify the value of the lenHistoryWindow parameter instead of adjusting the alpha parameter to change the algorithm sensitivity. Example:

SELECT xx, anomaly_detect(mean_duration, 'esd', 'lenHistoryWindow=1000, reset_state=true') as res FROM xxx SAMPLE BY 0

In the preceding example, the algorithm calculates the anomaly score of the current data point based on the distribution of only the latest 1,000 data points.
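
The effect of a shorter history window can be sketched in Python with a bounded queue. This is an illustration under stated assumptions, not Lindorm's implementation: the function windowed_scores, its window argument, and the sample series are hypothetical, and the score is the generic distance from the window mean in standard deviations.

```python
from collections import deque
from statistics import mean, pstdev

def windowed_scores(values, window):
    """Score each value against only the latest `window` earlier values."""
    history = deque(maxlen=window)  # old values drop out automatically
    scores = []
    for v in values:
        if len(history) >= 2:
            mu, sigma = mean(history), pstdev(history)
            scores.append(abs(v - mu) / sigma if sigma > 0 else 0.0)
        else:
            scores.append(0.0)      # not enough history yet
        history.append(v)
    return scores

# Hypothetical series: noisy before an optimization, steady afterwards,
# followed by one moderate deviation.
noisy = [100, 140, 60, 130, 70, 120, 80, 135, 65, 110]
steady = [100, 101, 99, 100, 102, 100, 99, 101, 100, 101]
series = noisy + steady + [108]

long_window = windowed_scores(series, window=1000)[-1]  # still sees the noisy data
short_window = windowed_scores(series, window=10)[-1]   # sees only the steady data
print(long_window < 3 < short_window)                   # True
```

With the long window, the old noisy values keep the spread large and the final deviation receives a low score. With the short window, only the recent steady values are considered and the same deviation stands out.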