Correlation analysis functions quickly find, among multiple observed metrics in a system, the metrics that are correlated with a specified metric or with specified time series data.

Function list

| Function                | Description |
| ----------------------- | ----------- |
| ts_association_analysis | Quickly finds the metrics that are correlated with a specified metric among multiple observed metrics in the system. |
| ts_similar              | Quickly finds the metrics that are correlated with specified time series data among multiple observed metrics in the system. |

ts_association_analysis

Function format:
select ts_association_analysis(stamp, params, names, indexName, threshold)
The following table describes the parameters.
| Parameter | Description | Value |
| --------- | ----------- | ----- |
| stamp     | The Unix timestamp. | Long type. |
| params    | The dimensions (value columns) of the metrics to be analyzed. | Array of the double type. For example, array[Latency, QPS, NetFlow]. |
| names     | The names of the metrics to be analyzed, in the same order as params. | Array of the varchar type. For example, array['Latency', 'QPS', 'NetFlow']. |
| indexName | The name of the target metric. | Varchar type. For example, 'Latency'. |
| threshold | The threshold of correlation between the metrics to be analyzed and the target metric. | Double type. Valid values: [0, 1]. |
Result:
  • name: the name of the analyzed metric.
  • score: the value of correlation between the analyzed metric and the target metric. Valid values: [0, 1].
Sample code:
* | select ts_association_analysis(
              time, 
              array[inflow, outflow, latency, status], 
              array['inflow', 'outflow', 'latency', 'status'], 
              'latency', 
              0.1) from log;
Sample result:
| results               |
| --------------------- |
| ['latency', '1.0']    |
| ['outflow', '0.6265'] |
| ['status', '0.2270']  |
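
The filtering behavior described above can be illustrated outside the query engine. The following Python sketch is not the service's implementation: the scoring method is not documented here, so the sketch assumes the score is the absolute Pearson correlation with the target metric, which at least matches the documented [0, 1] range. The function names and the data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; NaN if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


def association_analysis(params, names, index_name, threshold):
    """Return (name, score) pairs whose score reaches the threshold, highest first.

    Assumption: score = |Pearson correlation| with the target metric.
    """
    target = params[names.index(index_name)]
    scored = []
    for name, values in zip(names, params):
        score = abs(pearson(values, target))
        # A NaN score fails this comparison, so constant metrics are dropped.
        if score >= threshold:
            scored.append((name, round(score, 4)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: one list per metric, aligned by timestamp.
latency = [10.0, 12.0, 11.0, 15.0, 14.0]
outflow = [100.0, 130.0, 105.0, 140.0, 150.0]
status = [3.0, 3.0, 3.0, 3.0, 3.0]  # constant, so its correlation is undefined

results = association_analysis(
    [latency, outflow, status],
    ["latency", "outflow", "status"],
    "latency",
    0.1,
)
print(results)
```

As in the sample result, the target metric itself appears with a score of 1.0, and a metric whose score falls below the threshold is omitted from the output.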

ts_similar

Function format 1:
select ts_similar(stamp, value, ts, ds)
select ts_similar(stamp, value, ts, ds, metricType)
The following table describes the parameters.
| Parameter  | Description | Value |
| ---------- | ----------- | ----- |
| stamp      | The Unix timestamp. | Long type. |
| value      | The value of the specified metric. | Double type. |
| ts         | The time sequence of the specified curve. | Array of the double type. |
| ds         | The sequence of numeric data for the specified curve. | Array of the double type. |
| metricType | Optional. The type of correlation between the measured curves. | Varchar type. Valid values: SHAPE, RMSE, PEARSON, SPEARMAN, R2, and KENDALL. |

Function format 2:
select ts_similar(stamp, value, startStamp, endStamp, step, ds)
select ts_similar(stamp, value, startStamp, endStamp, step, ds, metricType)
The following table describes the parameters.
| Parameter  | Description | Value |
| ---------- | ----------- | ----- |
| stamp      | The Unix timestamp. | Long type. |
| value      | The value of the specified metric. | Double type. |
| startStamp | The start timestamp of the specified curve. | Long type. |
| endStamp   | The end timestamp of the specified curve. | Long type. |
| step       | The time interval between two adjacent points in the time sequence. | Long type. |
| ds         | The sequence of numeric data for the specified curve. | Array of the double type. |
| metricType | Optional. The type of correlation between the measured curves. | Varchar type. Valid values: SHAPE, RMSE, PEARSON, SPEARMAN, R2, and KENDALL. |
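
Format 2 describes the time axis of the curve with startStamp, endStamp, and step instead of an explicit ts array. The Python sketch below shows one way to read that relationship. It assumes that endStamp is inclusive, which is inferred from the ts_similar sample query in this topic (six ds values for the range 1560911040 to 1560911065 with a step of 5) and is not stated explicitly. The helper name expand_time_axis is hypothetical.

```python
def expand_time_axis(start_stamp, end_stamp, step):
    """Timestamps from start_stamp to end_stamp (inclusive), spaced by step."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(start_stamp, end_stamp + 1, step))


# Values taken from the ts_similar sample query.
ts = expand_time_axis(1560911040, 1560911065, 5)
ds = [5.1, 4.0, 3.3, 5.6, 4.0, 7.2]

# Each timestamp should pair with exactly one value of the curve.
assert len(ts) == len(ds)
print(ts)
# [1560911040, 1560911045, 1560911050, 1560911055, 1560911060, 1560911065]
```

Under this reading, the number of elements in ds should equal (endStamp - startStamp) / step + 1.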

Result:

  • score: the value of correlation between the analyzed metric and the specified curve. Valid values: [-1, 1].

Sample code:
* | select vhost, metric, ts_similar(
              time,
              value,
              1560911040,
              1560911065,
              5,
              array[5.1,4.0,3.3,5.6,4.0,7.2],
              'PEARSON') from log group by vhost, metric;
Sample result:
| vhost  | metric          | score                |
| ------ | --------------- | -------------------- |
| vhost1 | redolog         | -0.3519082537204182  |
| vhost1 | kv_qps          | -0.15922168009772697 |
| vhost1 | file_meta_write | NaN                  |
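
The NaN score for file_meta_write is not explained in this topic. One plausible cause, offered here only as an assumption, is that the metric is constant over the window: the Pearson coefficient divides by the standard deviation of each curve, so it is undefined when either curve has zero variance. The Python sketch below reproduces that behavior with a hand-written Pearson function. The flat series is hypothetical, and the code does not claim to mirror the service's implementation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN when either input has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


curve = [5.1, 4.0, 3.3, 5.6, 4.0, 7.2]  # the ds array from the sample query
flat = [0.0] * len(curve)               # hypothetical metric that never changes
mirrored = [-v for v in curve]          # hypothetical metric moving in the opposite direction

print(pearson(flat, curve))      # nan
print(pearson(curve, curve))     # approximately 1.0
print(pearson(mirrored, curve))  # approximately -1.0
```

The last two calls show the two ends of the documented [-1, 1] range: a curve compared with itself scores about 1, and a curve compared with its negation scores about -1.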