All Products
Search
Document Center

Simple Log Service:Aggregate function

Last Updated:Dec 13, 2023

An aggregate function calculates the values of a field and returns a single value. This topic describes the syntax of aggregate functions. This topic also provides examples on how to use aggregate functions.

The following table describes the aggregate functions that are supported by Log Service.

Important If you want to use strings in analytic statements, you must enclose strings in single quotation marks (''). Strings that are not enclosed or enclosed in double quotation marks ("") indicate field names or column names. For example, 'status' indicates the status string, and status or "status" indicates the status log field.

Function

Syntax

Description

arbitrary function

arbitrary(x)

Returns a random, non-null value of the x field.

avg function

avg(x)

Calculates the average of the values of the x field.

bitwise_and_agg function

bitwise_and_agg(x)

Returns the result of the bitwise AND operation on the values of the x field.

bitwise_or_agg function

bitwise_or_agg(x)

Returns the result of the bitwise OR operation on the values of the x field.

bool_and function

bool_and(boolean expression)

Checks whether all log entries meet the specified condition. If all log entries meet the specified condition, the function returns true.

This function is equivalent to the every function.

bool_or function

bool_or(boolean expression)

Checks whether a log entry that meets the specified condition exists. If a log entry that meets the specified condition exists, the function returns true.

checksum function

checksum(x)

Calculates the checksum of the values of the x field.

count function

count(*)

Counts the number of log entries.

count(1)

Counts the number of log entries. This function is equivalent to the count(*) function.

count(x)

Counts the number of log entries that contain the x field whose value is not null.

count_if function

count_if(boolean expression)

Counts the number of log entries that meet the specified condition.

every function

every(boolean expression)

Checks whether all log entries meet the specified condition. If all log entries meet the specified condition, the function returns true.

This function is equivalent to the bool_and function.

geometric_mean function

geometric_mean(x)

Calculates the geometric mean of the values of the x field.

kurtosis function

kurtosis(x)

Calculates the excess kurtosis of the values of the x field.

map_union function

map_union(x)

Returns the result of the union operation on the specified maps. If multiple input maps have the same key, the function selects an arbitrary input map from these input maps to be in the output map.

max function

max(x)

Queries the largest value of the x field.

max(x,n)

Queries the n largest values of the x field. The function returns an array.

max_by function

max_by(x,y)

Queries the value of x that is associated with the largest value of the y field.

max_by(x,y,n)

Queries the values of x that are associated with the n largest values of the y field.

min function

min(x)

Queries the minimum value of the x field.

min(x,n)

Queries the n smallest values of the x field. The function returns an array.

min_by function

min_by(x,y)

Queries the value of x that is associated with the smallest value of the y field.

min_by(x,y,n)

Queries the values of x that are associated with the n smallest values of the y field. The function returns an array.

skewness function

skewness(x)

Calculates the skewness of the values of the x field.

sum function

sum(x)

Calculates the sum of the values of the x field.

arbitrary function

Returns an arbitrary, non-null value of the x field.

Syntax

arbitrary(x)

Parameters

Parameter

Description

x

The value of this parameter can be of any data type.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

Returns an arbitrary, non-null value of the request_method field.

  • Query statement

    * | SELECT arbitrary(request_method) AS request_method
  • Query resultarbitrary() function

avg function

Calculates the average of the values of the x field.

Syntax

avg(x)

Parameters

Parameter

Description

x

The parameter can be of the double, bigint, decimal, or real data type.

Return value type

Data of the double data type.

Examples

Return the projects whose average latency of a defined item is greater than 1,000 microseconds.

  • Query statement

    method: PostLogstoreLogs | SELECT avg(latency) AS avg_latency, Project GROUP BY Project HAVING avg_latency > 1000
  • Query resultavg() function

bitwise_and_agg function

Returns the result of the bitwise AND operation on the values of the x field.

Syntax

bitwise_and_agg(x)

Parameters

Parameter

Description

x

The parameter can be of the bigint data type.

Return value type

Data of the bigint data type. The result represents a binary.

Examples

Returns the result of the bitwise AND operation on the values of the request_time field.

  • Query statement

    * | SELECT bitwise_and_agg(status)
  • Query resultAND

bitwise_or_agg function

Returns the result of the bitwise OR operation on the values of the xfield.

Syntax

bitwise_or_agg(x)

Parameters

Parameter

Description

x

The parameter can be of the bigint data type.

Return value type

Data of the bigint data type. The result represents a binary.

Examples

Returns the result of the bitwise OR operation on the values of the request_time field.

  • Query statement

    * | SELECT bitwise_or_agg(request_length)
  • Query resultOR

bool_and function

Checks whether all log entries meet the specified condition. If all log entries meet the specified condition, the function returns true. This function is equivalent to the every function.

Syntax

bool_and(boolean expression)

Parameters

Parameter

Description

boolean expression

The parameter can be a boolean expression.

Return value type

Data of the boolean data type.

Examples

Checks whether the value of the request_time field is less than 100 in all log entries. Unit: seconds. If the value of the request_time field is less than 100 in all log entries, the function returns true.

  • Query statement

    * | SELECT bool_and(request_time < 100)
  • Query resultbool_and

bool_or function

Checks whether a log entry that meets the specified condition exists. If a log entry that meets the specified condition exists, the function returns true.

Syntax

bool_or(boolean expression)

Parameters

Parameter

Description

boolean expression

The parameter can be a boolean expression.

Return value type

Data of the boolean type.

Examples

Checks whether a log entry in which the value of the request_time field is less than 20 exists. Unit: seconds. If a log entry in which the value of the request_time field is less than 20 exists, the function returns true.

  • Query statement

    * | SELECT bool_or(request_time < 20)
  • Query resultbool_or

checksum function

Calculates the checksum of the values of the x field.

Syntax

checksum(x)

Parameters

Parameter

Description

x

The value of this parameter can be of any data type.

Return value type

Data of the string data type. The result is Base64-encoded.

Examples

  • Query statement

    * | SELECT checksum(request_method) AS request_method
  • Query resultchecksum

count function

Counts the number of log entries.

Syntax

  • Counts the number of log entries.

    count(*)
  • Counts the number of log entries. This function is equivalent to the count(*) function.

    count(1)
  • Counts the number of log entries that contain the x field whose value is not null.

    count(x)

Parameters

Parameter

Description

x

The value of this parameter can be of any data type.

Return value type

Data of the integer data type.

Examples

  • Example 1: Counts the page view (PV) of a website.

    • Query statement

      * | SELECT count(*) AS PV
    • Query resultcount() function

  • Example 2: Counts the number of log entries that have the request_method field whose value is not null.

    • Query statement

      * | SELECT count(request_method) AS count
    • Query resultcount() function

count_if function

Counts the number of log entries that meet the specified condition.

Syntax

count_if(boolean expression)

Parameters

Parameter

Description

boolean expression

The parameter can be a boolean expression.

Return value type

Data of the integer data type.

Examples

Counts the log entries that have the request_uri field whose value ends with file-0.

  • Query statement

    * | SELECT count_if(request_uri like '%file-0') AS count
  • Query resultcount() function

geometric_mean function

Calculates the geometric mean of the values of the x field.

Syntax

geometric_mean(x)

Parameters

Parameter

Description

x

The parameter can be of the double, bigint, or real data type.

Return value type

Data of the double data type.

Examples

Calculates the geometric mean of request times.

  • Query statement

    * | SELECT geometric_mean(request_time) AS time
  • Query resultgeometric_mean

every function

Checks whether all log entries meet the specified condition. If all log entries meet the specified condition, the function returns true. This function is equivalent to the bool_and function.

Syntax

every(boolean expression)

Parameters

Parameter

Description

boolean expression

The parameter can be a boolean expression.

Return value type

Data of the boolean data type.

Examples

Checks whether all request times are less than 100 seconds. If all request times are less than 100 seconds, the function returns true.

  • Query statement

    * | SELECT every(request_time < 100)
  • Query resultbool_and

kurtosis function

Calculates the excess kurtosis of the values of the x field.

Syntax

kurtosis(x)

Parameters

Parameter

Description

x

The parameter can be of the double or bigint data type.

Return value type

Data of the double data type.

Examples

Calculates the excess kurtosis of a set of request time.

  • Query statement

    *| SELECT kurtosis(request_time)
  • Query resultkurtosis

map_union function

Returns the result of the union operation on the maps that the x parameter specifies. If multiple maps have the same key, the function selects an arbitrary input map from these input maps to be in the output map.

Syntax

map_union(x)

Parameters

Parameter

Description

x

The parameter can be of the map data type.

Return value type

Data of the map data type.

Examples

Performs a union operation on the maps of the etl_context field and returns a map. If multiple input maps have the same key, the function selects an arbitrary input map from these input maps to be in the output map.

  • Field example

    etl_context: {
     project:"datalab-148****6461-cn-chengdu"
     logstore:"internal-etl-log"
     consumer_group:"etl-83****4d1965"
     consumer:"etl-b2d40ed****c8d6-291294"
     shard_id:"0" }
  • Query statement

    * | SELECT map_union(try_cast(json_parse(etl_context) AS map(varchar,varchar)))
  • Query result map_union

max function

Queries the largest value of the xfield.

Syntax

  • Returns the largest value.

    max(x)
  • Returns the n largest values.

    max(x,n)

Parameters

Parameter

Description

x

The value of this parameter can be of any data type.

n

The value must be a positive integer.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

  • Example 1: Queries the longest request duration.

    • Query statement

      * | SELECT max(request_time) AS max_request_time
    • Query resultmax() function

  • Example 2: Queries the 10 longest request durations.

    • Query statement

      * | SELECT max(request_time,10) AS "top 10"
    • Query resultmax

max_by function

The following list shows the syntax that is supported by the max_by function.

Syntax

  • Queries the value of the x field that is associated with the largest value of the y field.

    max_by(x,y)
  • Queries the values of the x field that is associated with the n largest values of the yfield.

    max_by(x,y,n)

Parameters

Parameter

Description

x

The value of this parameter can be of any data type.

y

The value of this parameter can be of any data type.

n

The value must be a positive integer.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

  • Example 1: Queries the time of the order that has the highest value.

    • Query statement

      * | SELECT max_by(UsageEndTime, PretaxAmount) as time
    • Query resultmax_by

  • Example 2: Queries the request methods of the 3 requests that have the longest request durations.

    • Query statement

      * | SELECT max_by(request_method,request_time,3) AS method
    • Query resultmax_by

min function

Queries the minimum value of the x field.

Syntax

  • Returns the minimum value.

    min(x)
  • Returns the n minimum values. The function returns an array.

    min(x,n)

Parameters

Parameter

Description

x

The value of this parameter can be of any data type.

n

The value must be a positive integer.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

  • Example 1: Queries the shortest request duration.

    • Example

      * | SELECT min(request_time) AS min_request_time
    • Query resultmin() function

  • Example: Queries the 10 shortest request durations.

    • Query statement

      * | SELECT min(request_time,10)
    • Query resultmin

min_by function

The following list shows the syntax that is supported by the min_by function.

Syntax

  • Queries the value of the x field that is associated with the smallest value of the y field.

    min_by(x,y)
  • Queries the values of the x field that is associated with the n smallest values of the yfield. The function returns an array.

    min_by(x,y,n)

Parameters

Parameter

Description

x

The value of this parameter can be of any data type.

y

The value of this parameter can be of any data type.

n

The value must be a positive integer.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

  • The following query statement returns the request method of the request that has the shortest request duration:

    • Query statement

      * | SELECT min_by(request_method,request_time) AS method
    • Query resultmin_by

  • The following query statement returns the request methods of the requests whose request durations are the three shortest request durations:

    • Query statement

      * | SELECT min_by(request_method,request_time,3) AS method
    • Query resultmax_by

skewness function

Calculates the skewness of the values of the x field.

Syntax

skewness(x)

Parameters

Parameter

Description

x

The parameter can be of the double or bigint data type.

Return value type

Data of the double data type.

Examples

Calculates the skewness of the request times.

  • Query statement

    *| SELECT skewness(request_time) AS skewness
  • Query resultskewness

sum function

Calculates the sum of the values of x field.

Syntax

sum(x)

Parameters

Parameter

Description

x

The parameter can be of the double, bigint, decimal, or real data type.

Return value type

The data type of the return value is the same as the data type of the parameter.

Examples

Calculates the daily inbound traffic of the website.

  • Query statement

    * | SELECT date_trunc('day',__time__) AS time, sum(body_bytes_sent) AS body_bytes_sent GROUP BY time ORDER BY time
  • Query resultsum() function