Aggregate functions supported by SQL analysis - IoT Platform

Aggregate functions process multiple input records and return a single output value. They are commonly used with the group by statement.

SQL analysis supports the following aggregate functions.

Function	Feature
ANY_VALUE	Returns a random value from a specified scope.
APPROX_DISTINCT	Returns the approximate number of unique input values.
ARG_MAX	Returns the value of a column from the row that contains the maximum value of a specified column.
ARG_MIN	Returns the value of a column from the row that contains the minimum value of a specified column.
AVG	Calculates the average value.
BITWISE_AND_AGG	Calculates the bitwise AND aggregate value of the input values.
BITWISE_OR_AGG	Calculates the bitwise OR aggregate value of the input values.
COUNT	Counts the number of records.
COUNT_IF	Counts the number of records for which the specified expression is true.
MAX	Calculates the maximum value.
MAX_BY	Returns the value of a column from the row that contains the maximum value of a specified column.
MEDIAN	Calculates the median.
MIN	Calculates the minimum value.
MIN_BY	Returns the value of a column from the row that contains the minimum value of a specified column.
STDDEV	Calculates the population standard deviation.
STDDEV_SAMP	Calculates the sample standard deviation.
SUM	Calculates the sum.
WM_CONCAT	Concatenates strings using a specified separator.

ANY_VALUE

Syntax
```
any_value(<colname>)
```
Description

Returns a random value from a specified scope.
Description

colname: Required. The column name. It can be of any data type.
Return value

The return value has the same data type as the colname value. Rows where the colname value is NULL are ignored.

Examples

Example 1: Select a random employee name from all employees. The command is as follows:
```
select any_value(ename) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| SMITH      |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and select a random employee name from each group. The command is as follows:

select deptno, any_value(ename) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | CLARK      |
| 20         | SMITH      |
| 30         | ALLEN      |
+------------+------------+

APPROX_DISTINCT

Syntax
```
approx_distinct(<colname>)
```
Description

Calculates the approximate number of unique values in a specified column.
Parameters

colname: Required. The column for which to count unique values.
Return value

Returns a BIGINT value. This function has a standard error of 5%. Rows where the colname value is NULL are ignored.

Examples

Example 1: Calculate the approximate number of unique values in the salary (sal) column. The command is as follows:

select approx_distinct(sal) from emp;

The result is as follows:

+-------------------+
| numdistinctvalues |
+-------------------+
| 12                |
+-------------------+

Example 2: Use with group by to group all employees by department (deptno) and calculate the approximate number of unique values in the salary (sal) column. The command is as follows:

select deptno, approx_distinct(sal) from emp group by deptno;

The result is as follows:

+------------+-------------------+
| deptno     | numdistinctvalues |
+------------+-------------------+
| 10         | 3                 |
| 20         | 4                 |
| 30         | 5                 |
+------------+-------------------+

ARG_MAX

Syntax

arg_max(<valueToMaximize>, <valueToReturn>)

Description

Returns the valueToReturn from the row that contains the maximum valueToMaximize.
Parameters
- valueToMaximize: Required. Can be of any data type.
- valueToReturn: Required. Can be of any data type.
Return value

The return value has the same data type as valueToReturn. If multiple rows contain the maximum value, a value from a random one of these rows is returned. Rows where the valueToMaximize value is NULL are ignored.

Examples

Example 1: Return the name of the employee with the highest salary. The command is as follows:
```
select arg_max(sal, ename) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| KING       |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and return the name of the employee with the highest salary in each group. The command is as follows:

select deptno, arg_max(sal, ename) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | KING       |
| 20         | SCOTT      |
| 30         | BLAKE      |
+------------+------------+

ARG_MIN

Syntax

arg_min(<valueToMinimize>, <valueToReturn>)

Description

Returns the valueToReturn from the row that contains the minimum valueToMinimize.
Parameters
- valueToMinimize: Required. Can be of any data type.
- valueToReturn: Required. Can be of any data type.
Return value

The return value has the same data type as valueToReturn. If multiple rows contain the minimum value, a value from a random one of these rows is returned. Rows where the valueToMinimize value is NULL are ignored.

Examples

Example 1: Return the name of the employee with the lowest salary. The command is as follows:
```
select arg_min(sal, ename) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| SMITH      |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and return the name of the employee with the lowest salary in each group. The command is as follows:

select deptno, arg_min(sal, ename) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | MILLER     |
| 20         | SMITH      |
| 30         | JAMES      |
+------------+------------+

AVG

Syntax
```
DECIMAL | DOUBLE  avg(<colname>)
```
Description

Calculates the average value.
Parameters

colname: Required. The column name. The column values can be of any data type that can be converted to DOUBLE for calculation.

Return value

If the colname value is NULL, the row is ignored. The return type is determined by the input type as follows:

Input type	Return type
TINYINT	DOUBLE
SMALLINT	DOUBLE
INT	DOUBLE
BIGINT	DOUBLE
FLOAT	DOUBLE
DOUBLE	DOUBLE
DECIMAL	DECIMAL

Examples

Example 1: Calculate the average salary (sal) of all employees. The command is as follows:
```
select avg(sal) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| 2222.0588235294117 |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and calculate the average salary (sal) for each department. The command is as follows:

select deptno, avg(sal) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 2916.6666666666665 |
| 20         | 2175.0     |
| 30         | 1566.6666666666667 |
+------------+------------+

BITWISE_AND_AGG

Function declaration
```
bigint bitwise_and_agg(bigint value)
```
Description

Performs a bitwise AND operation on all input values and returns the aggregate value.
Parameter descriptions.

value: Required. A BIGINT value. NULL values are ignored.
Return value

Returns a BIGINT value.

Example

select id, bitwise_and_agg(v) from
    values (1L, 2L), (1L, 1L), (2L, null), (1L, null) t(id, v) group by id;

The result is as follows:

+------------+------------+
| id         | _c1        |
+------------+------------+
| 1          | 0          |
| 2          | NULL       |
+------------+------------+

BITWISE_OR_AGG

Function declaration
```
bigint bitwise_or_agg(bigint value)
```
Description

Performs a bitwise OR operation on all input values and returns the aggregate value.
Parameter descriptions

value: Required. A BIGINT value. NULL values are ignored.
Return value

Returns a BIGINT value.

Example

select id, bitwise_or_agg(v) from
    values (1L, 2L), (1L, 1L), (2L, null), (1L, null) t(id, v) group by id;

The result is as follows:

+------------+------------+
| id         | _c1        |
+------------+------------+
| 1          | 3          |
| 2          | NULL       |
+------------+------------+

COUNT

Syntax
```
bigint count([distinct|all] <colname>)
```
Description

Counts the number of records.
Parameters
- distinct|all: Optional. Specifies whether to count only unique records. The default is all, which counts all records. If distinct is specified, only unique values are counted.
- colname: Required. The column name. It can be of any data type. colname can be *, as in count(*), to return the total number of rows.
Return value

Returns a BIGINT value. Rows where the colname value is NULL are ignored.

Examples

Example 1: Count the total number of employees in all departments. The command is as follows:
```
select count(*) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| 17         |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and count the number of employees in each department (deptno). The command is as follows:

select deptno, count(*) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 6          |
| 20         | 5          |
| 30         | 6          |
+------------+------------+

Example 3: Use distinct to count the number of unique departments. The command is as follows:
```
select count(distinct deptno) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| 3          |
+------------+
```

COUNT_IF

Syntax
```
bigint count_if(boolean <expr>)
```
Description

Counts the number of records for which the expr value is True.
Description

expr: Required. A BOOLEAN expression.
Return value

Returns a BIGINT value. Rows where expr is False or the column specified in expr is NULL are ignored.

Example

select count_if(sal > 1000), count_if(sal <=1000) from emp;

The result is as follows:

+------------+------------+
| _c0        | _c1        |
+------------+------------+
| 15         | 2          |
+------------+------------+

MAX

Syntax
```
max(<colname>)
```
Description

Calculates the maximum value.
Parameter descriptions

colname: Required. The column can be of any data type except BOOLEAN.
Return value
The return value has the same data type as colname. The following rules apply:
- Rows where the colname value is NULL are ignored.
- If colname is of the BOOLEAN type, it cannot be used in the calculation.

Examples

Example 1: Calculate the highest salary (sal) of all employees. The command is as follows:
```
select max(sal) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| 5000       |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and calculate the highest salary (sal) for each department. The command is as follows:

select deptno, max(sal) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 5000       |
| 20         | 3000       |
| 30         | 2850       |
+------------+------------+

MAX_BY

Syntax

max_by(<valueToReturn>,<valueToMaximize>)

Description

Note
The MAX_BY function is identical to the ARG_MAX function, but the parameter order is different. The MAX_BY function is added to be compatible with open source syntax.

Returns the valueToReturn from the row that contains the maximum valueToMaximize.
Metric description
- valueToMaximize: Required. Can be of any data type.
- valueToReturn: Required. Can be of any data type.
Return value

The return value has the same data type as valueToReturn. If multiple rows contain the maximum value, a value from a random one of these rows is returned. Rows where the valueToMaximize value is NULL are ignored.

Examples

Example 1: Return the name of the employee with the highest salary. The command is as follows:
```
select max_by(ename,sal) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| KING       |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and return the name of the employee with the highest salary in each group. The command is as follows:

select deptno, max_by(ename,sal) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | KING       |
| 20         | SCOTT      |
| 30         | BLAKE      |
+------------+------------+

MEDIAN

Syntax

double median(double <colname>)
decimal median(decimal <colname>)

Description

Calculates the median.
Description

colname: Required. The column name. The column values can be of the DOUBLE or DECIMAL type. If the input is of the STRING or BIGINT type, it is implicitly converted to the DOUBLE type for calculation.

Return value

If the colname value is NULL, the row is ignored. The return type is determined by the input type as follows:

Input type	Return type
TINYINT	DOUBLE
SMALLINT	DOUBLE
INT	DOUBLE
BIGINT	DOUBLE
FLOAT	DOUBLE
DOUBLE	DOUBLE
DECIMAL	DECIMAL

Examples

Example 1: Calculate the median salary (sal) of all employees. The command is as follows:
```
select median(sal) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| 1600.0     |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and calculate the median salary (sal) for each department. The command is as follows:

select deptno, median(sal) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 2450.0     |
| 20         | 2975.0     |
| 30         | 1375.0     |
+------------+------------+

MIN

Syntax
```
min(<colname>)
```
Description

Calculates the minimum value.
Description.

colname: Required. The value of the column. The value can be of any data type except BOOLEAN.
Return value
The return value has the same data type as colname. The following rules apply:
- Rows where the colname value is NULL are ignored.
- If colname is of the BOOLEAN type, it cannot be used in the calculation.

Examples

Example 1: Calculate the lowest salary (sal) of all employees. The command is as follows:
```
select min(sal) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| 800        |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and calculate the lowest salary (sal) for each department. The command is as follows:

select deptno, min(sal) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1300       |
| 20         | 800        |
| 30         | 950        |
+------------+------------+

MIN_BY

Syntax

min_by(<valueToReturn>,<valueToMinimize>)

Description

Note
The MIN_BY function is identical to the ARG_MIN function, but the parameter order is different. The MIN_BY function is added to be compatible with open source syntax.

Returns the valueToReturn from the row that contains the minimum valueToMinimize.
Parameter descriptions.
- valueToMinimize: Required. Can be of any data type.
- valueToReturn: Required. Can be of any data type.
Return value

The return value has the same data type as valueToReturn. If multiple rows contain the minimum value, a value from a random one of these rows is returned. Rows where the valueToMinimize value is NULL are ignored.

Examples

Example 1: Return the name of the employee with the lowest salary. The command is as follows:
```
 select min_by(ename,sal) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| SMITH      |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and return the name of the employee with the lowest salary in each group. The command is as follows:

select deptno, min_by(ename,sal) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | MILLER     |
| 20         | SMITH      |
| 30         | JAMES      |
+------------+------------+

STDDEV

Syntax

double stddev(double <colname>)
decimal stddev(decimal <colname>)

Description

Calculates the population standard deviation.
Description.

colname: Required. A DOUBLE or DECIMAL type. If the input is of the STRING or BIGINT type, it is implicitly converted to the DOUBLE type for calculation.

Return value

If the colname value is NULL, the row is ignored. The return type is determined by the input type as follows:

Input type	Return type
TINYINT	DOUBLE
SMALLINT	DOUBLE
INT	DOUBLE
BIGINT	DOUBLE
FLOAT	DOUBLE
DOUBLE	DOUBLE
DECIMAL	DECIMAL

Examples

Example 1: Calculate the population standard deviation of the salaries (sal) of all employees. The command is as follows:
```
select stddev(sal) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| 1262.7549932628976 |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and calculate the population standard deviation of salaries (sal) for each department. The command is as follows:

select deptno, stddev(sal) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1546.1421524412158 |
| 20         | 1004.7387720198718 |
| 30         | 610.1001739241043 |
+------------+------------+

STDDEV_SAMP

Syntax

double stddev_samp(double <colname>)
decimal stddev_samp(decimal <colname>)

Description

Calculates the sample standard deviation.
Parameters

colname: Required. The column values can be of the DOUBLE or DECIMAL type. If the input is of the STRING or BIGINT type, it is implicitly converted to the DOUBLE type for calculation.

Return value

If the colname value is NULL, the row is ignored. The return type is determined by the input type as follows:

Input type	Return type
TINYINT	DOUBLE
SMALLINT	DOUBLE
INT	DOUBLE
BIGINT	DOUBLE
FLOAT	DOUBLE
DOUBLE	DOUBLE
DECIMAL	DECIMAL

Examples

Example 1: Calculate the sample standard deviation of the salaries (sal) of all employees. The command is as follows:
```
select stddev_samp(sal) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| 1301.6180541247609 |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and calculate the sample standard deviation of salaries (sal) for each department. The command is as follows:

select deptno, stddev_samp(sal) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1693.7138680032901 |
| 20         | 1123.3320969330487 |
| 30         | 668.3312551921141 |
+------------+------------+

SUM

Syntax

DECIMAL | DOUBLE | BIGINT  sum(<colname>)

Description

Calculates the sum.
Description

colname: Required. The column values can be of the DOUBLE, DECIMAL, or BIGINT type. If the input is of the STRING type, it is implicitly converted to the DOUBLE type for calculation.

Return value

If the colname value is NULL, the row is ignored. The return type is determined by the input type as follows:

Input type	Return type
TINYINT	BIGINT
SMALLINT	BIGINT
INT	BIGINT
BIGINT	BIGINT
FLOAT	DOUBLE
DOUBLE	DOUBLE
DECIMAL	DECIMAL

Examples

Example 1: Calculate the sum of the salaries (sal) of all employees. The command is as follows:
```
select sum(sal) from emp;
```
The result is as follows:
```
+------------+
| _c0        |
+------------+
| 37775      |
+------------+
```

Example 2: Use with group by to group all employees by department (deptno) and calculate the sum of salaries (sal) for each department. The command is as follows:

select deptno, sum(sal) from emp group by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 17500      |
| 20         | 10875      |
| 30         | 9400       |
+------------+------------+

WM_CONCAT

Syntax

string wm_concat(string <separator>, string <colname>)

Description

Concatenates the values in colname using the specified separator.
Parameters
- separator: Required. A STRING constant that acts as the separator.
- colname: Required. A STRING type. If the input is of the BIGINT, DOUBLE, or DATETIME type, it is implicitly converted to the STRING type for calculation.
Return value (When grouping with group by, the returned values within a group are not sorted.)
Returns a STRING value. The following rules apply:
- An error is returned if separator is not a STRING constant.
- An error is returned if colname is not of the STRING, BIGINT, DOUBLE, or DATETIME type.
- Rows where the colname value is NULL are ignored.
Note
In the select wm_concat(',', name) from table_name; statement, if table_name is an empty collection, the statement returns NULL.

Examples

Example 1: Concatenate the names (ename) of all employees. The command is as follows:

select wm_concat(',', ename) from emp;

The result is as follows:

+------------+
| _c0        |
+------------+
| SMITH,ALLEN,WARD,JONES,MARTIN,BLAKE,CLARK,SCOTT,KING,TURNER,ADAMS,JAMES,FORD,MILLER,JACCKA,WELAN,TEBAGE |
+------------+

Example 2: Use with group by to group all employees by department (deptno) and concatenate the names (ename) of employees in the same group. The command is as follows:

select deptno, wm_concat(',', ename) from emp group by deptno order by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | CLARK,KING,MILLER,JACCKA,WELAN,TEBAGE |
| 20         | SMITH,JONES,SCOTT,ADAMS,FORD |
| 30         | ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES |
+------------+------------+

Example 3: Use with group by to group all employees by department (deptno), and then deduplicate and concatenate the salaries (sal) in the same group. The command is as follows:

select deptno, wm_concat(distinct ',', sal) from emp group by deptno order by deptno;

The result is as follows:

+------------+------------+
| deptno     | _c1        |
+------------+------------+
| 10         | 1300,2450,5000 |
| 20         | 1100,2975,3000,800 |
| 30         | 1250,1500,1600,2850,950 |
+------------+------------+

Example 4: Use with group by and order by to group all employees by department (deptno), and then concatenate and sort the salaries (sal) in the same group. The command is as follows:

select deptno, wm_concat(',',sal) within group(order by sal) from emp group by deptno order by deptno;

The result is as follows:

+------------+------------+
|deptno|_c1|
+------------+------------+
|10|1300,1300,2450,2450,5000,5000|
|20|800,1100,2975,3000,3000|
|30|950,1250,1250,1500,1600,2850|
+------------+------------+