This topic provides answers to some frequently asked questions about time series anomaly detection.
What should I do if I want to detect anomalies in data that contains a large number of time series or data with irregular intervals?
If the data on which you want to perform anomaly detection contains a large number of time series, Lindorm ML takes a long time to return the results. We recommend that you specify conditions in the WHERE clause to select a small number of time series for testing. After you determine the algorithms and parameters to use for anomaly detection, you can gradually increase the number of time series in the data.
If the time series data on which you want to perform anomaly detection has irregular intervals, we recommend that you use the SAMPLE BY operator to downsample the data before the detection.
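The effect of downsampling on irregularly spaced data can be sketched in Python. This is only a conceptual illustration, not how Lindorm implements SAMPLE BY: the function name downsample_mean, the choice of averaging as the aggregation, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


def downsample_mean(points, interval_s):
    """Average (timestamp_seconds, value) points into fixed-width buckets.

    Hypothetical helper for illustration; averaging is an assumed
    aggregation, chosen only to keep the example short.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the bucket that contains it.
        bucket_start = (ts // interval_s) * interval_s
        buckets[bucket_start].append(value)
    # Emit one point per bucket, ordered by bucket start time.
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Points arrive at irregular offsets (in seconds).
irregular = [(0, 1.0), (7, 3.0), (61, 5.0), (118, 7.0), (125, 9.0)]
regular = downsample_mean(irregular, 60)
print(regular)  # one point per 60-second bucket
```

After downsampling, every point sits on a fixed 60-second grid, which is the kind of regular spacing that a detection algorithm can work with more reliably.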
What algorithm should I use to detect time series anomalies?
Lindorm ML provides two types of algorithms for time series anomaly detection: statistical algorithms and decomposition-based algorithms. For more information, see Time series anomaly detection.
If the time series data in your business is periodic data with peaks or valleys at fixed intervals on a daily or weekly basis, we recommend that you use decomposition-based algorithms, such as ostl-esd and istl-esd, to detect anomalies. For more information, see Optimize the parameters of decomposition-based time series anomaly detection algorithms.
If anomalies in your data can be detected based on thresholds, we recommend that you use statistical algorithms, such as esd, ttest, and nsigma, to detect anomalies. For more information, see Optimize the parameters of statistical time series anomaly detection algorithms.
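To show what threshold-based detection means in practice, the following Python sketch applies a generic n-sigma rule: a value is flagged when it lies more than n standard deviations from the mean of earlier data. This is a textbook version written for illustration. The function name nsigma_anomalies and the sample values are assumptions, and the nsigma algorithm in Lindorm ML may differ in its details.

```python
from statistics import mean, stdev


def nsigma_anomalies(history, new_values, n=3.0):
    """Flag values farther than n standard deviations from the history mean.

    Generic n-sigma rule for illustration only; not Lindorm ML's implementation.
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation of the history
    return [abs(v - mu) > n * sigma for v in new_values]


# Hypothetical history that fluctuates slightly around 10.
history = [10.0, 11.0, 9.0, 10.0, 10.0, 11.0, 9.0, 10.0]
flags = nsigma_anomalies(history, [10.5, 25.0, 9.0])
print(flags)  # only the value 25.0 exceeds the threshold
```

A rule of this kind works well when normal values stay inside a stable band. For data with strong daily or weekly cycles, a fixed band either misses anomalies in the valleys or flags normal peaks, which is why decomposition-based algorithms are recommended for periodic data.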
How do I select a time range for time series anomaly detection?
You can select a time range for anomaly detection based on whether the anomaly detection algorithm is initialized.
If you perform anomaly detection on your data for the first time, select a time range that is long enough for the algorithm to be initialized. For example, decomposition-based algorithms require a time range that contains at least four complete cycles to be initialized. Each cycle must contain a peak and a valley. You can also set the verbose parameter to true and check the value of the warmup column in the returned results. If the value of the warmup column is FALSE, the algorithm is initialized.
If you perform anomaly detection after the algorithm is initialized, the detection results are retained and continuously updated. In this case, you can specify a time range based on your requirements. The shorter the time range, the less time the algorithm needs to detect anomalies. If you want to continuously detect anomalies in the latest data, see Detect time series data exceptions continuously.
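The four-cycle guideline for the first detection run can be checked with simple arithmetic before you submit a query. The following Python sketch assumes that you already know the cycle length of your data. The helper name covers_min_cycles is hypothetical. The check compares durations only and does not verify that each cycle actually contains a peak and a valley.

```python
from datetime import datetime, timedelta


def covers_min_cycles(start, end, cycle, min_cycles=4):
    """Return True if [start, end) spans at least min_cycles full cycles.

    Hypothetical helper; it only compares durations and does not inspect the data.
    """
    return (end - start) >= cycle * min_cycles


start = datetime(2024, 1, 1)
daily = timedelta(days=1)  # assumed cycle length: one day

ok = covers_min_cycles(start, start + timedelta(days=5), daily)
too_short = covers_min_cycles(start, start + timedelta(days=3), daily)
print(ok, too_short)
```

For data with a daily cycle, a five-day range passes the check and a three-day range does not. The warmup column returned when verbose is set to true remains the authoritative signal of whether initialization is complete.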
When should I configure the adhoc_state parameter?
If you are not familiar with the anomaly detection algorithms and related parameters, we recommend that you set the adhoc_state parameter to true while you test the algorithms and parameters. In this case, the state of an anomaly detection query takes effect only within that query and does not affect the results of other anomaly detection queries. As a result, running a query again with the same algorithm and parameters returns the same detection result.
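The difference between query-local state and retained state can be illustrated with a toy detector in Python. Everything here is an assumption made for illustration: the RunningMeanDetector class, the run_query function, and the way adhoc_state is modeled as working on a throwaway copy of the state. It does not describe how Lindorm ML stores or manages detection state.

```python
import copy


class RunningMeanDetector:
    """Toy stateful detector: flags values far from the running mean."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0
        self.total = 0.0

    def detect(self, values):
        flags = []
        for v in values:
            # The first value has no history to compare against.
            is_anomaly = self.count > 0 and abs(v - self.total / self.count) > self.threshold
            flags.append(is_anomaly)
            # Every value updates the accumulated state.
            self.count += 1
            self.total += v
        return flags


def run_query(detector, values, adhoc_state=False):
    # With adhoc_state, work on a copy so the shared state stays untouched.
    target = copy.deepcopy(detector) if adhoc_state else detector
    return target.detect(values)


shared = RunningMeanDetector(threshold=5.0)
data = [10.0, 10.0, 30.0]

first = run_query(shared, data, adhoc_state=True)
second = run_query(shared, data, adhoc_state=True)
untouched = shared.count  # still 0: the ad hoc runs left no trace

third = run_query(shared, data)   # updates the shared state
fourth = run_query(shared, data)  # sees the state left by the third run
print(first, second, third, fourth)
```

In this toy model, the two ad hoc runs return identical flags, while the two stateful runs differ because the second one starts from the state that the first one accumulated. That repeatability is what makes ad hoc runs convenient when you compare algorithms and parameters.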
When should I configure the reset_state parameter?
If the distribution of your data changes significantly and the accumulated anomaly detection state is no longer suitable for further detection, you can set the reset_state parameter to true to reset the state.