This topic describes the frequent pattern statistics function, pattern_stat. The function mines representative combinations of attribute values from samples of the specified multi-attribute fields. You can use these combinations to summarize the current logs.

pattern_stat

Function format:
select pattern_stat(array[col1, col2, col3], array['col1_name', 'col2_name', 'col3_name'], array[col5, col6], array['col5_name', 'col6_name'], support_score, sample_ratio) 
The following table lists the parameters of the function.
| Parameter | Description | Value |
| --- | --- | --- |
| array[col1, col2, col3] | The input columns of character type values. | The values are in array format. Example: array[clientIP, sourceIP, path, logstore]. |
| array['col1_name', 'col2_name', 'col3_name'] | The names corresponding to the input columns of character type values. | The values are in array format. Example: array['clientIP', 'sourceIP', 'path', 'logstore']. |
| array[col5, col6] | The input columns of numeric values. | The values are in array format. Example: array[Inflow, OutFlow]. |
| array['col5_name', 'col6_name'] | The names corresponding to the input columns of numeric values. | The values are in array format. Example: array['Inflow', 'OutFlow']. |
| support_score | The support level of the sample in pattern mining. | The value is of the double data type. Valid values: (0,1]. |
| sample_ratio | The sampling ratio. The default value is 0.1, which indicates that only 10% of the total samples are used. | The value is of the double data type. Valid values: (0,1]. |
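
The parameters above can be assembled into a query string on the client side. The following Python snippet is a minimal sketch of that idea, not part of Log Service: the helper name build_pattern_stat_query is hypothetical, and it assumes that each name array simply repeats the column identifiers, as in the examples in the table. It only checks the documented (0,1] range for support_score and sample_ratio.

```python
def _column_array(cols):
    # Column references are emitted unquoted: array[col1, col2]
    return "array[" + ", ".join(cols) + "]"


def _name_array(cols):
    # Column names are emitted as string literals: array['col1', 'col2']
    return "array[" + ", ".join(f"'{c}'" for c in cols) + "]"


def build_pattern_stat_query(text_cols, numeric_cols, support_score, sample_ratio=0.1):
    """Build a pattern_stat query string (hypothetical helper, not an SLS API).

    Assumes the display name of each column equals its identifier.
    """
    for label, value in (("support_score", support_score),
                         ("sample_ratio", sample_ratio)):
        # Both parameters are documented as doubles in the interval (0, 1].
        if not 0 < value <= 1:
            raise ValueError(f"{label} must be in (0, 1], got {value}")

    args = ", ".join([
        _column_array(text_cols),
        _name_array(text_cols),
        _column_array(numeric_cols),
        _name_array(numeric_cols),
        repr(float(support_score)),
        repr(float(sample_ratio)),
    ])
    return f"* | select pattern_stat({args})"


query = build_pattern_stat_query(
    ["ClientIP", "Method"], ["InFlow", "OutFlow"], 0.45, 0.3
)
print(query)
```

Generating both arrays from a single list keeps the column array and the name array the same length and in the same order, which is easy to get wrong when the two are typed by hand.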
Example
  • The query statement is as follows:
    * | select pattern_stat(array[ Category, ClientIP, ProjectName, LogStore, Method, Source, UserAgent ], array[ 'Category', 'ClientIP', 'ProjectName', 'LogStore', 'Method', 'Source', 'UserAgent' ], array[ InFlow, OutFlow ], array[ 'InFlow', 'OutFlow' ], 0.45, 0.3) limit 1000
  • Output result

The following table lists the display items.
| Display item | Description |
| --- | --- |
| count | The number of samples that conform to the current pattern. |
| support_score | The support level of the current pattern. |
| pattern | The content of the current pattern, which is organized in the format of conditional queries. |
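
To make the count and support_score items more concrete, the following Python snippet sketches how these two values relate under the usual frequent-pattern definition, where support is the fraction of sampled rows that match a pattern. This definition is an assumption for illustration; this topic does not state how Log Service computes the score internally. The function name pattern_support and the sample rows are hypothetical.

```python
def pattern_support(samples, pattern):
    """Return (count, support) for a pattern over a list of sample dicts.

    A row matches when every attribute in the pattern has the same value
    in the row. Support is assumed to be count / number of samples.
    """
    count = sum(
        1 for row in samples
        if all(row.get(key) == value for key, value in pattern.items())
    )
    support = count / len(samples) if samples else 0.0
    return count, support


# Illustrative rows only, not real log data.
samples = [
    {"Method": "GET", "LogStore": "access"},
    {"Method": "GET", "LogStore": "access"},
    {"Method": "POST", "LogStore": "access"},
    {"Method": "GET", "LogStore": "audit"},
]

count, support = pattern_support(samples, {"Method": "GET", "LogStore": "access"})
print(count, support)  # 2 0.5
```

Under this assumed definition, a pattern with a support of 0.5 would pass a support_score threshold of 0.45, and a pattern that matches only one of the four rows (0.25) would not.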