All Products
Search
Document Center

Simple Log Service:Aggregate functions

Last Updated:Mar 19, 2024

An aggregate function calculates the values of a field and returns a single value. This topic describes the syntax of aggregate functions. This topic also provides examples on how to use the functions.

The following table describes the aggregate functions that are supported by Simple Log Service.

Important If you want to use strings in analytic statements, you must enclose strings in single quotation marks (''). Strings that are not enclosed or enclosed in double quotation marks ("") indicate field names or column names. For example, 'status' indicates the status string, and status or "status" indicates the status log field.

Function

Syntax

Description

Supported in SQL

Supported in SPL

arbitrary function

arbitrary(x)

Returns a random, non-null value of the x field.

×

avg function

avg(x)

Calculates the average of the values of the x field.

×

bitwise_and_agg function

bitwise_and_agg(x)

Returns the result of the bitwise AND operation on the values of the x field.

×

bitwise_or_agg function

bitwise_or_agg(x)

Returns the result of the bitwise OR operation on the values of the x field.

×

bool_and function

bool_and(boolean expression)

Checks whether all logs meet the specified condition. If yes, the function returns true.

This function is equivalent to the every function.

×

bool_or function

bool_or(boolean expression)

Checks whether a log that meets the specified condition exists. If yes, the function returns true.

×

checksum function

checksum(x)

Calculates the checksum of the values of the x field.

×

count function

count(*)

Counts the number of logs.

×

count(1)

Counts the number of logs. This function is equivalent to the count(*) function.

×

count(x)

Counts the number of logs whose value of the x field is not NULL.

×

count_if function

count_if(boolean expression)

Counts the number of logs that meet the specified condition.

×

every function

every(boolean expression)

Checks whether all logs meet the specified condition. If yes, the function returns true.

This function is equivalent to the bool_and function.

×

geometric_mean function

geometric_mean(x)

Calculates the geometric mean of the values of the x field.

×

kurtosis function

kurtosis(x)

Calculates the kurtosis of the values of the x field.

×

map_union function

map_union(x)

Returns the result of the union operation on the specified maps. If a key exists in multiple input maps, the function randomly returns one of the values of the key.

×

max function

max(x)

Queries the largest value of the x field.

×

max(x, n)

Queries the n largest values of the x field. The function returns an array.

×

max_by function

max_by(x, y)

Queries the value of x that is associated with the largest value of the y field.

×

max_by(x, y, n)

Queries the values of x that are associated with the n largest values of the y field. The function returns an array.

×

min function

min(x)

Queries the smallest value of the x field.

×

min(x, n)

Queries the n smallest values of the x field. The function returns an array.

×

min_by function

min_by(x, y)

Queries the value of x that is associated with the smallest value of the y field.

×

min_by(x, y, n)

Queries the values of x that are associated with the n smallest values of the y field. The function returns an array.

×

skewness function

skewness(x)

Calculates the skewness of the values of the x field.

×

sum function

sum(x)

Calculates the sum of the values of the x field.

×

arbitrary function

The arbitrary function returns a random, non-null value of the x field.

Syntax

arbitrary(x)

Parameters

Parameter

Description

x

The value of this parameter is of an arbitrary data type.

Return value type

Same as the data type of the parameter value.

Examples

Return an arbitrary, non-null value of the request_method field.

  • Query statement (Debug)

    * | SELECT
      arbitrary(request_method) AS request_method
  • Query and analysis resultsarbitrary函数

avg function

The avg function calculates the average of the values of the x field.

Syntax

avg(x)

Parameters

Parameter

Description

x

The value of this parameter is of the double, bigint, decimal, or real type.

Return value type

The double type.

Examples

Return the projects whose average latency is greater than 1,000 microseconds.

  • Query statement (Debug)

    method: PostLogstoreLogs | SELECT
      avg(latency) AS avg_latency,
      Project
    GROUP BY
      Project
    HAVING
      avg_latency > 1000
  • Query and analysis resultsavg函数

bitwise_and_agg function

The bitwise_and_agg function returns the result of the bitwise AND operation on the values of the x field.

Syntax

bitwise_and_agg(x)

Parameters

Parameter

Description

x

The value of this parameter is of the bigint type.

Return value type

The bigint type in binary representation.

Examples

Return the result of the bitwise AND operation on the values of the request_time field.

  • Query statement (Debug)

    * | SELECT
      bitwise_and_agg(status)
  • Query and analysis resultsAND

bitwise_or_agg function

The bitwise_or_agg function returns the result of the bitwise OR operation on the values of the x field.

Syntax

bitwise_or_agg(x)

Parameters

Parameter

Description

x

The value of this parameter is of the bigint type.

Return value type

The bigint type in binary representation.

Examples

Return the result of the bitwise OR operation on the values of the request_time field.

  • Query statement (Debug)

    * | SELECT
      bitwise_or_agg(request_length)
  • Query and analysis resultsOR

bool_and function

The bool_and function checks whether all logs meet the specified condition. If yes, the function returns true. This function is equivalent to the every function.

Syntax

bool_and(boolean expression)

Parameters

Parameter

Description

boolean expression

The value of this parameter is a Boolean expression.

Return value type

The Boolean type.

Examples

Check whether the value of the request_time field is less than 100 in all logs. Unit: seconds. If yes, the function returns true.

  • Query statement (Debug)

    * | SELECT
      bool_and(request_time < 100)
  • Query and analysis resultsbool_and

bool_or function

The bool_or function checks whether a log that meets the specified condition exists. If yes, the function returns true.

Syntax

bool_or(boolean expression)

Parameters

Parameter

Description

boolean expression

The value of this parameter is a Boolean expression.

Return value type

The Boolean type.

Examples

Check whether a log in which the value of the request_time field is less than 20 exists. Unit: seconds. If yes, the function returns true.

  • Query statement (Debug)

    * | SELECT
      bool_or(request_time < 20)
  • Query and analysis resultsbool_or

checksum function

The checksum function calculates the checksum of the values of the x field.

Syntax

checksum(x)

Parameters

Parameter

Description

x

The value of this parameter is of an arbitrary data type.

Return value type

The string type. The result is Base64-encoded.

Examples

  • Query statement (Debug)

    * | SELECT
      checksum(request_method) AS request_method
  • Query and analysis resultschecksum

count function

The count function counts the number of logs.

Syntax

  • If you use the following syntax, the function counts the number of logs.

    count(*)
  • If you use the following syntax, the function counts the number of logs, which is equivalent to count(*).

    count(1)
  • If you use the following syntax, the function counts the number of logs whose value of the x field is not NULL.

    count(x)

Parameters

Parameter

Description

x

The value of this parameter is of an arbitrary data type.

Return value type

The integer type.

Examples

  • Example 1: Count the page views (PVs) of a website.

    • Query statement (Debug)

      * | SELECT
        count(*) AS PV
    • Query and analysis resultscount函数

  • Example 2: Count the number of logs whose value of the request_method field is not NULL.

    • Query statement (Debug)

      * | SELECT
        count(request_method) AS count
    • Query and analysis resultscount函数

count_if function

The count_if function counts the number of logs that meet the specified condition.

Syntax

count_if(boolean expression)

Parameters

Parameter

Description

boolean expression

The value of this parameter is a Boolean expression.

Return value type

The integer type.

Examples

Count the number of logs whose request_uri field has a value suffixed with file-0.

  • Query statement (Debug)

    * | SELECT
      count_if(request_uri like '%file-0') AS count
  • Query and analysis resultscount函数

geometric_mean function

The geometric_mean function calculates the geometric mean of the values of the x field.

Syntax

geometric_mean(x)

Parameters

Parameter

Description

x

The value of this parameter is of the double, bigint, or real type.

Return value type

The double type.

Examples

Calculate the geometric mean of the values of the request_time field.

  • Query statement (Debug)

    * | SELECT
      geometric_mean(request_time) AS time
  • Query and analysis resultsgeometric_mean

every function

The every function checks whether all logs meet the specified condition. If yes, the function returns true. This function is equivalent to the bool_and function.

Syntax

every(boolean expression)

Parameters

Parameter

Description

boolean expression

The value of this parameter is a Boolean expression.

Return value type

The Boolean type.

Examples

Check whether the value of the request_time field is less than 100 in all logs. Unit: seconds. If yes, the function returns true.

  • Query statement (Debug)

    * | SELECT
      every(request_time < 100)
  • Query and analysis resultsbool_and

kurtosis function

The kurtosis function calculates the kurtosis of the values of the x field.

Syntax

kurtosis(x)

Parameters

Parameter

Description

x

The value of this parameter is of the double or bigint type.

Return value type

The double type.

Examples

Calculate the kurtosis of the values of the request_time field.

  • Query statement (Debug)

    *| SELECT
      kurtosis(request_time)
  • Query and analysis resultskurtosis

map_union function

The map_union function returns the result of the union operation on the specified maps. If a key exists in multiple input maps, the function randomly returns one of the values of the key.

Syntax

map_union(x)

Parameters

Parameter

Description

x

The value of this parameter is of the map type.

Return value type

The map type.

Examples

Perform a union operation on the maps of the etl_context field and randomly return one of the maps.

  • Sample field

    etl_context: {
     project:"datalab-148****6461-cn-chengdu"
     logstore:"internal-etl-log"
     consumer_group:"etl-83****4d1965"
     consumer:"etl-b2d40ed****c8d6-291294"
     shard_id:"0" }
  • Query statement

    * | SELECT
      map_union(
        try_cast(json_parse(etl_context) AS map(varchar, varchar))
      )
  • Query and analysis results map_union

max function

The max function queries the largest value of the x field.

Syntax

  • If you use the following syntax, the function queries the largest value of the x field.

    max(x)
  • If you use the following syntax, the function queries the n largest values of the x field. The function returns an array.

    max(x, n)

Parameters

Parameter

Description

x

The value of this parameter is of an arbitrary data type.

n

The value of this parameter is a positive integer.

Return value type

Same as the data type of the parameter value.

Examples

  • Example 1: Query the largest value of the request_time field.

    • Query statement (Debug)

      * | SELECT
        max(request_time) AS max_request_time
    • Query and analysis resultsMAX函数

  • Example 2: Query the 10 largest values of the request_time field.

    • Query statement (Debug)

      * | SELECT
        max(request_time, 10) AS "top 10"
    • Query and analysis resultsmax

max_by function

The following list shows the syntax that is supported by the max_by function.

Syntax

  • If you use the following syntax, the function queries the value of x that is associated with the largest value of the y field.

    max_by(x, y)
  • If you use the following syntax, the function queries the values of x that are associated with the n largest values of the y field. The function returns an array.

    max_by(x, y, n)

Parameters

Parameter

Description

x

The value of this parameter is of an arbitrary data type.

y

The value of this parameter is of an arbitrary data type.

n

The value of this parameter is an integer greater than 0.

Return value type

Same as the data type of the parameter value.

Examples

  • Example 1: Query the point in time of the order that has the largest consumption amount.

    • Query statement

      * | SELECT
        max_by(UsageEndTime, PretaxAmount) AS time
    • Query and analysis resultsmax_by

  • Example 2: Query the request methods of the requests that have the three largest values of the request_time field.

    • Query statement (Debug)

      * | SELECT
        max_by(request_method, request_time, 3) AS method
    • Query and analysis resultsmax_by

min function

The min function queries the smallest value of the x field.

Syntax

  • If you use the following syntax, the function queries the smallest value of the x field.

    min(x)
  • If you use the following syntax, the function queries the n smallest values of the x field. The function returns an array.

    min(x,n)

Parameters

Parameter

Description

x

The value of this parameter is of an arbitrary data type.

n

The value of this parameter is a positive integer.

Return value type

Same as the data type of the parameter value.

Examples

  • Example 1: Query the smallest value of the request_time field.

    • Query statement (Debug)

      * | SELECT
        min(request_time) AS min_request_time
    • Query and analysis resultsMIN函数

  • Example 2: Query the 10 smallest values of the request_time field.

    • Query statement (Debug)

      * | SELECT
        min(request_time, 10)
    • Query and analysis resultsmin

min_by function

The following list shows the syntax that is supported by the min_by function.

Syntax

  • If you use the following syntax, the function queries the value of x that is associated with the smallest value of the y field.

    min_by(x, y)
  • If you use the following syntax, the function queries the values of x that are associated with the n smallest values of the y field. The function returns an array.

    min_by(x, y, n)

Parameters

Parameter

Description

x

The value of this parameter is of an arbitrary data type.

y

The value of this parameter is of an arbitrary data type.

n

The value of this parameter is an integer greater than 0.

Return value type

Same as the data type of the parameter value.

Examples

  • Example 1: Query the request method of the request that has the smallest value of the request_time field.

    • Query statement (Debug)

      * | SELECT
        min_by(request_method, request_time) AS method
    • Query and analysis resultsmin_by

  • Example 2: Query the request methods of the requests that have the three smallest values of the request_time field.

    • Query statement (Debug)

      * | SELECT
        min_by(request_method, request_time, 3) AS method
    • Query and analysis resultsmax_by

skewness function

The skewness function calculates the skewness of the values of the x field.

Syntax

skewness(x)

Parameters

Parameter

Description

x

The value of this parameter is of the double or bigint type.

Return value type

The double type.

Examples

Calculate the skewness of the values of the request_time field.

  • Query statement (Debug)

    *| SELECT
      skewness(request_time) AS skewness
  • Query and analysis resultsskewness

sum function

The sum function calculates the sum of the values of the x field.

Syntax

sum(x)

Parameters

Parameter

Description

x

The value of this parameter is of the double, bigint, decimal, or real type.

Return value type

Same as the data type of the parameter value.

Examples

Calculate the daily inbound traffic of the website.

  • Query statement (Debug)

    * | SELECT
      date_trunc('day', __time__) AS time,
      sum(body_bytes_sent) AS body_bytes_sent
    GROUP BY
      time
    ORDER BY
      time
  • Query and analysis resultsSUM函数