Log Service provides a machine learning feature that supports multiple algorithms and calling methods. You can use analytic statements together with machine learning functions to call these algorithms and analyze the characteristics of one or more fields over a period of time.

Log Service offers various time series analysis algorithms. You can call these algorithms to solve problems that are related to time series data. For example, you can predict time series, detect time series anomalies, decompose time series, and cluster multiple time series. In addition, the algorithms are compatible with standard SQL functions. This simplifies the usage of the algorithms and improves the efficiency of troubleshooting.

## Features

- Supports various smoothing operations on a single time series.
- Supports algorithms for prediction, anomaly detection, change point detection, inflection point detection, and multi-period estimation on a single time series.
- Supports decomposition of a single time series.
- Supports various clustering algorithms for multiple time series.
- Supports multi-field pattern mining based on sequences of numeric data or text.

## Limits

When you use the machine learning feature of Log Service, note the following limits:

- The specified time series data must be sampled at a uniform interval.
- The specified time series data cannot contain more than one sample for the same point in time.
- The processing capacity cannot exceed the maximum capacity. The following table describes the limits.

Item | Limit
---|---
Capacity of the time-series data processing | Data can be collected from a maximum of 150,000 consecutive points in time. If the data volume exceeds the processing capacity, you must aggregate the data or reduce the sampling amount.
Capacity of the density-based clustering algorithm | A maximum of 5,000 time series curves can be clustered at a time. Each curve cannot contain more than 1,440 points in time.
Capacity of the hierarchical clustering algorithm | A maximum of 2,000 time series curves can be clustered at a time. Each curve cannot contain more than 1,440 points in time.
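
The limits above can be checked on the client side before a query is run. The following is a minimal sketch in plain Python, not part of Log Service: the names `check_series` and `aggregate`, the use of integer timestamps in seconds, and averaging as the aggregation method are all assumptions made for illustration.

```python
from collections import defaultdict

MAX_POINTS = 150_000  # maximum number of points in time stated in the limits


def check_series(timestamps):
    """Return a list of limit violations for a list of integer timestamps.

    Hypothetical helper: checks for duplicate samples, uneven sampling
    intervals, and too many points.
    """
    problems = []
    unique = set(timestamps)
    if len(timestamps) != len(unique):
        problems.append("duplicate timestamps")
    ordered = sorted(unique)
    # Collect every distinct gap between consecutive points.
    intervals = {b - a for a, b in zip(ordered, ordered[1:])}
    if len(intervals) > 1:
        problems.append("uneven sampling interval")
    if len(ordered) > MAX_POINTS:
        problems.append("too many points")
    return problems


def aggregate(points, bucket_seconds):
    """Average (timestamp, value) pairs into fixed-width time buckets.

    Assumption: averaging is an acceptable way to aggregate the data;
    the returned timestamps are the bucket start times.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]
```

For example, if `check_series` reports too many points for data sampled every second, calling `aggregate(points, 60)` reduces it to one point per minute.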

## Machine learning functions

Category | Function type | Function | Description
---|---|---|---
Time series | Smoothing function | ts_smooth_simple | Uses the Holt-Winters algorithm to smooth time series data.
Time series | Smoothing function | ts_smooth_fir | Uses the finite impulse response (FIR) filter to smooth time series data.
Time series | Smoothing function | ts_smooth_iir | Uses the infinite impulse response (IIR) filter to smooth time series data.
Time series | Multi-period estimation function | ts_period_detect | Estimates time series data by period.
Time series | Change point detection function | ts_cp_detect | Detects the intervals in which data has different statistical features. The interval endpoints are change points.
Time series | Change point detection function | ts_breakout_detect | Detects the points in time at which data experiences dramatic changes.
Time series | Maximum value detection function | ts_find_peaks | Detects the local maximum value of time series data in a specified window.
Time series | Prediction and anomaly detection function | ts_predicate_simple | Uses default parameters to model time series data, predict time series data, and detect anomalies.
Time series | Prediction and anomaly detection function | ts_predicate_ar | Uses an autoregressive (AR) model to model time series data, predict time series data, and detect anomalies.
Time series | Prediction and anomaly detection function | ts_predicate_arma | Uses an autoregressive moving average (ARMA) model to model time series data, predict time series data, and detect anomalies.
Time series | Prediction and anomaly detection function | ts_predicate_arima | Uses an autoregressive integrated moving average (ARIMA) model to model time series data, predict time series data, and detect anomalies.
Time series | Prediction and anomaly detection function | ts_regression_predict | Predicts the long-run trend for a single periodic time series.
Time series | Sequence decomposition function | ts_decompose | Uses the Seasonal and Trend decomposition using Loess (STL) algorithm to decompose time series data.
Time series | Time series clustering function | ts_density_cluster | Uses a density-based clustering method to cluster multiple time series.
Time series | Time series clustering function | ts_hierarchical_cluster | Uses a hierarchical clustering method to cluster multiple time series.
Time series | Time series clustering function | ts_similar_instance | Queries time series curves that are similar to a specified time series curve.
Time series | Kernel density estimation function | kernel_density_estimation | Uses the smooth peak function to fit the observed data points. In this way, the function simulates the real probability distribution curve.
Time series | Time series padding function | series_padding | Pads data points that are missing in a time series.
Time series | Anomaly comparison function | anomaly_compare | Compares the degree of difference of an observed object in two periods of time.
Pattern mining | Frequent pattern statistical function | pattern_stat | Mines representative combinations of attributes among the given multi-attribute field samples to obtain the frequent pattern in statistical patterns.
Pattern mining | Differential pattern statistical function | pattern_diff | Identifies the pattern that causes differences between two collections in specified conditions.
Pattern mining | Root cause analysis function | rca_kpi_search | Analyzes the subdimension attributes that cause anomalies of the monitoring metric.
Pattern mining | Correlation analysis function | ts_association_analysis | Identifies the metrics that are correlated to a specified metric among multiple observed metrics in the system.
Pattern mining | Correlation analysis function | ts_similar | Identifies the metrics that are correlated to specified time series data among multiple observed metrics in the system.
Request URL classification function | | url_classify | Classifies a request URL and attaches a tag to the URL. The function also provides the regular expression that defines the pattern of the tag.
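
To make two of the descriptions in the table more concrete, the following sketch shows the general idea behind FIR smoothing and windowed local-maximum detection in plain Python. It is not the Log Service implementation of ts_smooth_fir or ts_find_peaks: the function names `fir_smooth` and `local_peaks`, the normalization at the start of the series, and the strict-maximum rule are assumptions chosen for illustration.

```python
def fir_smooth(values, weights):
    """Smooth a series with a causal FIR filter.

    Each output is a weighted sum of the current input and the inputs
    before it. Near the start of the series, where fewer inputs exist,
    the sum is normalized by the weights actually used (an assumption).
    """
    out = []
    for i in range(len(values)):
        acc = 0.0
        norm = 0.0
        for k, w in enumerate(weights):
            if i - k >= 0:
                acc += w * values[i - k]
                norm += w
        out.append(acc / norm)
    return out


def local_peaks(values, window):
    """Return the indexes whose value is strictly greater than every
    other value within `window` positions on either side."""
    peaks = []
    for i, v in enumerate(values):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        neighbours = values[lo:i] + values[i + 1:hi]
        if neighbours and all(v > n for n in neighbours):
            peaks.append(i)
    return peaks


series = [1, 2, 3, 10, 3]
smoothed = fir_smooth(series, [1, 1, 1])  # equal weights: a moving average
peaks = local_peaks(series, 1)
```

Equal weights give a simple moving average; unequal weights emphasize recent or older samples. Because the filter has a finite number of weights, a single spike affects only that many outputs, which is what distinguishes an FIR filter from an IIR filter.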