An aggregate function calculates the values of a field and returns a single value. This topic describes the syntax of aggregate functions. This topic also provides examples on how to use aggregate functions.

The following table describes the aggregate functions that are supported by Log Service.

Notice If you want to use strings in analytic statements, you must enclose strings in single quotation marks (''). Strings that are not enclosed or enclosed in double quotation marks ("") indicate field names or column names. For example, 'status' indicates the status string, and status or "status" indicates the status log field.
Function Syntax Description
arbitrary function arbitrary(x) Returns a random, non-null value of the x field.
avg function avg(x) Calculates the average of the values of the x field.
bitwise_and_agg function bitwise_and_agg(x) Returns the result of the bitwise AND operation on the values of the x field.
bitwise_or_agg function bitwise_or_agg(x) Returns the result of the bitwise OR operation on the values of the x field.
bool_and function bool_and(boolean expression) Checks whether all log entries meet the specified condition. If all log entries meet the specified condition, the function returns true.

This function is equivalent to the every function.

bool_or function bool_or(boolean expression) Checks whether a log entry that meets the specified condition exists. If a log entry that meets the specified condition exists, the function returns true.
checksum function checksum(x) Calculates the checksum of the values of the x field.
count function count(*) Counts the number of log entries.
count(1) Counts the number of log entries. This function is equivalent to the count(*) function.
count(x) Counts the number of log entries that contain the x field whose value is not null.
count_if function count_if(boolean expression) Counts the number of log entries that meet the specified condition.
every function every(boolean expression) Checks whether all log entries meet the specified condition. If all log entries meet the specified condition, the function returns true.

This function is equivalent to the bool_and function.

geometric_mean function geometric_mean(x) Calculates the geometric mean of the values of the x field.
kurtosis function kurtosis(x) Calculates the excess kurtosis of the values of the x field.
map_union function map_union(x) Returns the result of the union operation on the specified maps. If multiple input maps have the same key, the function selects an arbitrary input map from these input maps to be in the output map.
max function max(x) Queries the largest value of the x field.
max(x,n) Queries the n largest values of the x field. The function returns an array.
max_by function max_by(x,y) Queries the value of x that is associated with the largest value of the y field.
max_by(x,y,n) Queries the values of x that are associated with the n largest values of the y field.
min function min(x) Queries the minimum value of the x field.
min(x,n) Queries the n smallest values of the x field. The function returns an array.
min_by function min_by(x,y) Queries the value of x that is associated with the smallest value of the y field.
min_by(x,y,n) Queries the values of x that are associated with the n smallest values of the y field. The function returns an array.
skewness function skewness(x) Calculates the skewness of the values of the x field.
sum function sum(x) Calculates the sum of the values of the x field.

arbitrary function

Returns an arbitrary, non-null value of the x field.

Syntax

arbitrary(x)

Parameters

Parameter Description
x The value of this parameter can be of any data type.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

Returns an arbitrary, non-null value of the request_method field.

  • Query statement
    * | SELECT arbitrary(request_method) AS request_method
  • Query resultarbitrary() function

avg function

Calculates the average of the values of the x field.

Syntax

avg(x)

Parameters

Parameter Description
x The parameter can be of the double, bigint, decimal, or real data type.

Return value type

Data of the double data type.

Examples

Return the projects whose average latency of a defined item is greater than 1,000 microseconds.
  • Query statement
    method: PostLogstoreLogs | SELECT avg(latency) AS avg_latency, Project GROUP BY Project HAVING avg_latency > 1000
  • Query resultavg() function

bitwise_and_agg function

Returns the result of the bitwise AND operation on the values of the x field.

Syntax

bitwise_and_agg(x)

Parameters

Parameter Description
x The parameter can be of the bigint data type.

Return value type

Data of the bigint data type. The result represents a binary.

Examples

Returns the result of the bitwise AND operation on the values of the request_time field.

  • Query statement
    * | SELECT bitwise_and_agg(status)
  • Query resultAND

bitwise_or_agg function

Returns the result of the bitwise OR operation on the values of the xfield.

Syntax

bitwise_or_agg(x)

Parameters

Parameter Description
x The parameter can be of the bigint data type.

Return value type

Data of the bigint data type. The result represents a binary.

Examples

Returns the result of the bitwise OR operation on the values of the request_time field.
  • Query statement
    * | SELECT bitwise_or_agg(request_length)
  • Query resultOR

bool_and function

Checks whether all log entries meet the specified condition. If all log entries meet the specified condition, the function returns true. This function is equivalent to the every function.

Syntax

bool_and(boolean expression)

Parameters

Parameter Description
boolean expression The parameter can be a boolean expression.

Return value type

Data of the boolean data type.

Examples

Checks whether the value of the request_time field is less than 100 in all log entries. Unit: seconds. If the value of the request_time field is less than 100 in all log entries, the function returns true.

  • Query statement
    * | SELECT bool_and(request_time < 100)
  • Query resultbool_and

bool_or function

Checks whether a log entry that meets the specified condition exists. If a log entry that meets the specified condition exists, the function returns true.

Syntax

bool_or(boolean expression)

Parameters

Parameter Description
boolean expression The parameter can be a boolean expression.

Return value type

Data of the boolean type.

Examples

Checks whether a log entry in which the value of the request_time field is less than 20 exists. Unit: seconds. If a log entry in which the value of the request_time field is less than 20 exists, the function returns true.

  • Query statement
    * | SELECT bool_or(request_time < 20)
  • Query resultbool_or

checksum function

Calculates the checksum of the values of the x field.

Syntax

checksum(x)

Parameters

Parameter Description
x The value of this parameter can be of any data type.

Return value type

Data of the string data type. The result is Base64-encoded.

Examples

  • Query statement
    * | SELECT checksum(request_method) AS request_method
  • Query resultchecksum

count function

Counts the number of log entries.

Syntax

  • Counts the number of log entries.
    count(*)
  • Counts the number of log entries. This function is equivalent to the count(*) function.
    count(1)
  • Counts the number of log entries that contain the x field whose value is not null.
    count(x)

Parameters

Parameter Description
x The value of this parameter can be of any data type.

Return value type

Data of the integer data type.

Examples

  • Example 1: Counts the page view (PV) of a website.
    • Query statement
      * | SELECT count(*) AS PV
    • Query resultcount() function
  • Example 2: Counts the number of log entries that have the request_method field whose value is not null.
    • Query statement
      * | SELECT count(request_method) AS count
    • Query resultcount() function

count_if function

Counts the number of log entries that meet the specified condition.

Syntax

count_if(boolean expression)

Parameters

Parameter Description
boolean expression The parameter can be a boolean expression.

Return value type

Data of the integer data type.

Examples

Count the log entries that have the request_uri field whose value ends with file-0.

  • Query statement
    * | SELECT count_if(request_uri like '%file-0') AS count
  • Query resultcount() function

geometric_mean function

Calculates the geometric mean of the values of the x field.

Syntax

geometric_mean(x)

Parameters

Parameter Description
x The parameter can be of the double, bigint, or real data type.

Return value type

Data of the double data type.

Examples

Calculates the geometric mean of request times.

  • Query statement
    * | SELECT geometric_mean(request_time) AS time
  • Query resultgeometric_mean

every function

Checks whether all log entries meet the specified condition. If all log entries meet the specified condition, the function returns true. This function is equivalent to the bool_and function.

Syntax

every(boolean expression)

Parameters

Parameter Description
boolean expression The parameter can be a boolean expression.

Return value type

Data of the boolean data type.

Examples

Checks whether all request times are less than 100 seconds. If all request times are less than 100 seconds, the function returns true.

  • Query statement
    * | SELECT every(request_time < 100)
  • Query resultbool_and

kurtosis function

Calculates the excess kurtosis of the values of the x field.

Syntax

kurtosis(x)

Parameters

Parameter Description
x The parameter can be of the double or bigint data type.

Return value type

Data of the double data type.

Examples

Calculates the excess kurtosis of a set of request time.

  • Query statement
    *| SELECT kurtosis(request_time)
  • Query resultkurtosis

map_union function

Returns the result of the union operation on the maps that parameter specifies. If multiple maps have the same key, the function selects an arbitrary input map from these input maps to be in the output map.

Syntax

map_union(x)

Parameters

Parameter Description
x The parameter can be of the map data type.

Return value type

Data of the map data type.

Examples

Perform a union operation on the maps of the etl_context field and returns a map. If multiple input maps have the same key, the function selects an arbitrary input map from these input maps to be in the output map.

  • Field example
    etl_context: {
     project:"datalab-148****6461-cn-chengdu"
     logstore:"internal-etl-log"
     consumer_group:"etl-83****4d1965"
     consumer:"etl-b2d40ed****c8d6-291294"
     shard_id:"0" }
  • Query statement
    * | SELECT map_union(try_cast(json_parse(etl_context) AS map(varchar,varchar)))
  • Query result map_union

max function

Queries the largest value of the xfield.

Syntax

  • Returns the largest value.
    max(x)
  • Returns the n largest values.
    max(x,n)

Parameters

Parameter Description
x The value of this parameter can be of any data type.
n The value must be a positive integer.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

  • Example 1: Queries the longest request duration.
    • Query statement
      * | SELECT max(request_time) AS max_request_time
    • Query resultmax() function
  • Example 2: Queries the 10 longest request durations.
    • Query statement
      * | SELECT max(request_time,10) AS "top 10"
    • Query resultmax

max_by function

The following list shows the syntax that is supported by the max_by function.

Syntax

  • Queries the value of the x field that is associated with the largest value of the y field.
    max_by(x,y)
  • Queries the values of the x field that is associated with the n largest values of the yfield.
    max_by(x,y,n)

Parameters

Parameter Description
x The value of this parameter can be of any data type.
y The value of this parameter can be of any data type.
n The value must be a positive integer.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

  • Example 1: Queries the time of the order that has the highest value.
    • Query statement
      * | SELECT max_by(UsageEndTime, PretaxAmount) as time
    • Query resultmax_by
  • Example 2: Queries the request methods of the 3 requests that have the longest request durations.
    • Query statement
      * | SELECT max_by(request_method,request_time,3) AS method
    • Query resultmax_by

min function

Queries the minimum value of the x field.

Syntax

  • Returns the minimum value.
    min(x)
  • Returns the n minimum values. The function returns an array.
    min(x,n)

Parameters

Parameter Description
x The value of this parameter can be of any data type.
n The value must be a positive integer.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

  • Example 1: Queries the shortest request duration.
    • Example
      * | SELECT min(request_time) AS min_request_time
    • Query resultmin() function
  • Example: Queries the 10 shortest request durations.
    • Query statement
      * | SELECT min(request_time,10)
    • Query resultmin

min_by function

The following list shows the syntax that is supported by the min_by function.

Syntax

  • Queries the value of the x field that is associated with the smallest value of the y field.
    min_by(x,y)
  • Queries the values of the x field that is associated with the n smallest values of the yfield. The function returns an array.
    min_by(x,y,n)

Parameters

Parameter Description
x The value of this parameter can be of any data type.
y The value of this parameter can be of any data type.
n The value must be a positive integer.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

  • The following query statement returns the request method of the request that has the shortest request duration:
    • Query statement
      * | SELECT min_by(request_method,request_time) AS method
    • Query resultmin_by
  • The following query statement returns the request methods of the requests whose request durations are the three shortest request durations:
    • Query statement
      * | SELECT min_by(request_method,request_time,3) AS method
    • Query resultmax_by

skewness function

Calculates the skewness of the values of the x field.

Syntax

skewness(x)

Parameters

Parameter Description
x The parameter can be of the double or bigint data type.

Return value type

Data of the double data type.

Examples

Calculates the skewness of the request times.

  • Query statement
    *| SELECT skewness(request_time) AS skewness
  • Query resultskewness

sum function

Calculates the sum of the values of x field.

Syntax

sum(x)

Parameters

Parameter Description
x The parameter can be of the double, bigint, decimal, or real data type.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

Calculates the daily inbound traffic of the website.
  • Query statement
    * | SELECT date_trunc('day',__time__) AS time, sum(body_bytes_sent) AS body_bytes_sent GROUP BY time ORDER BY time
  • Query resultsum() function