This topic describes the aggregate functions supported by AnalyticDB for MySQL.

AnalyticDB for MySQL supports the following aggregate functions:
  • ARBITRARY: randomly returns a value from a group of data.
  • AVG: calculates the average value for a set of values.
  • BIT_AND: returns the result of bitwise AND operations for all bits of the argument.
  • BIT_OR: returns the result of bitwise OR operations for all bits of the argument.
  • BIT_XOR: returns the result of bitwise XOR operations for all bits of the argument.
  • COUNT: counts the number of records returned by a SELECT query.
  • MAX: calculates the maximum value.
  • MIN: calculates the minimum value.
  • STD or STDDEV: returns the sample standard deviation of all input values.
  • STDDEV_POP: returns the population standard deviation of all input values.
  • STDDEV_SAMP: returns the sample standard deviation for a group of integers, decimals, or floating-point numbers.
  • SUM: calculates the sum of all input values.
  • VARIANCE (non-standard SQL function): returns the population variance for a group of integers, decimals, or floating-point numbers.
  • VAR_POP (standard SQL function): returns the population variance for a group of integers, decimals, or floating-point numbers.
  • VAR_SAMP: returns the sample variance for a group of integers, decimals, or floating-point numbers.
  • GROUP_CONCAT: returns a string result of the concatenated values from a group. The values are the returned results of the GROUP BY clause.
Note
In this topic, a table named testtable is used for aggregate functions with the exception of the GROUP_CONCAT function. The following statement shows how to create the table:
CREATE TABLE testtable(a INT) DISTRIBUTED BY HASH(a);

The following statement is used to insert test data into the testtable table:

INSERT INTO testtable VALUES (1),(2),(3);

ARBITRARY

arbitrary(x)
  • Description: This function randomly returns a value from a group of data.
  • Data type of the input value: all data types.
  • Data type of the return value: consistent with the data type of the input value of this function.
  • Example:
    SELECT arbitrary(a) FROM testtable;
    The following result is returned:
    +--------------+
    | arbitrary(a) |
    +--------------+
    |            2 |
    +--------------+

AVG

avg(x)              
  • Description: This function calculates the average value for a set of values.
  • Data type of the input value: BIGINT, DOUBLE, or FLOAT.
  • Data type of the return value: DOUBLE.
  • Example:
    SELECT avg(a) FROM testtable;
    The following result is returned:
    +--------+
    | avg(a) |
    +--------+
    |    2.0 |
    +--------+

BIT_AND

bit_and(x)
  • Description: This function returns the result of bitwise AND operations for all bits of the argument.
  • Data type of the input value: BIGINT, DOUBLE, or FLOAT.
  • Data type of the return value: BIGINT.
  • Example:
    SELECT bit_and(a) FROM testtable;
    The following result is returned:
    +------------+
    | bit_and(a) |
    +------------+
    |          0 |
    +------------+

BIT_OR

bit_or(x)
  • Description: This function returns the result of bitwise OR operations for all bits of the argument.
  • Data type of the input value: BIGINT, DOUBLE, or FLOAT.
  • Data type of the return value: BIGINT.
  • Example:
    SELECT bit_or(a) FROM testtable;
    The following result is returned:
    +-----------+
    | bit_or(a) |
    +-----------+
    |         3 |
    +-----------+

BIT_XOR

bit_xor(x)
  • Description: This function returns the result of bitwise XOR operations for all bits of the argument.
  • Data type of the input value: BIGINT, DOUBLE, or FLOAT.
  • Data type of the return value: BIGINT.
  • Example:
    SELECT bit_xor(a) FROM testtable;
    The following result is returned:
    +------------+
    | bit_xor(a) |
    +------------+
    |          0 |
    +------------+

COUNT

count([distinct|all] x)             
  • Description: This function counts the number of records returned by a SELECT query.
    Note distinct and all specify whether to exclude duplicate records in the counting process. The default value is all, which indicates that all records are counted. If distinct is specified, only records with distinct values are counted.
  • Data type of the input value: NUMERIC, STRING, or BOOLEAN.
  • Data type of the return value: BIGINT.
  • Examples:
    • Calculate the number of records whose values are unique in testtable. Sample statement:
      SELECT count(distinct a) FROM testtable;
      The following result is returned:
      +-------------------+
      | count(distinct a) |
      +-------------------+
      |                 3 |
      +-------------------+
    • Calculate the number of all records in testtable. Sample statement:
      SELECT count(all a) FROM testtable;                 
      The following result is returned:
      +--------------+
      | count(all a) |
      +--------------+
      |            3 |
      +--------------+

MAX

max(x)               
  • Description: This function calculates the maximum value.
  • Data type of the input value: all data types. However, data of the BOOLEAN type is not included in the calculation.
    Note If a value in the specified column is NULL, the row that contains this value is not included in the calculation.
  • Data type of the return value: consistent with the data type of the input value of this function.
  • Example:
    SELECT max(a) FROM testtable;
    The following result is returned:
    +--------+
    | max(a) |
    +--------+
    |      3 |
    +--------+

MIN

min(value x)               
  • Description: This function calculates the minimum value.
  • Data type of the input value: all data types. However, data of the BOOLEAN type is not included in the calculation.
    Note If a value in the specified column is NULL, the row that contains this value is not included in the calculation.
  • Data type of the return value: consistent with the data type of the input value of this function.
  • Example:
    SELECT min(a) FROM testtable;
    The following result is returned:
    +--------+
    | min(a) |
    +--------+
    |      1 |
    +--------+

STD or STDDEV

std(x)
stddev(x)
  • Description: This function returns the sample standard deviation of all input values.
  • Data type of the input value: BIGINT or DOUBLE.
  • Data type of the return value: DOUBLE.
  • Example:
    SELECT std(a) FROM testtable;
    The following result is returned:
    +-------------------+
    | std(a)            |
    +-------------------+
    | 0.816496580927726 |
    +-------------------+

STDDEV_POP

stddev_pop(x)
  • Description: This function returns the population standard deviation of all input values.
  • Data type of the input value: BIGINT or DOUBLE.
  • Data type of the return value: DOUBLE.
  • Example:
    SELECT stddev_pop(a) FROM testtable;
    The following result is returned:
    +-------------------+
    | stddev_pop(a)     |
    +-------------------+
    | 0.816496580927726 |
    +-------------------+

STDDEV_SAMP

stddev_samp(x)
  • Description: This function returns the sample standard deviation for a group of integers, decimals, or floating-point numbers.
  • Data type of the input value: BIGINT or DOUBLE.
  • Data type of the return value: DOUBLE.
  • Example:
    SELECT stddev_samp(a) FROM testtable;
    The following result is returned:
    +----------------+
    | stddev_samp(a) |
    +----------------+
    |            1.0 |
    +----------------+

SUM

sum(x)
  • Description: This function calculates the sum of all input values.
  • Data type of the input value: BIGINT, DOUBLE, or FLOAT.
  • Data type of the return value: BIGINT.
  • Example:
    SELECT sum(a) FROM testtable;
    The following result is returned:
    +--------+
    | sum(a) |
    +--------+
    |      6 |
    +--------+

VARIANCE

variance(x)
  • Description: This function returns the population standard variance for a group of integers, decimals, or floating-point numbers.
    Note
    • VARIANCE() ignores rows with NULL values. If a group of values are all NULL, VARIANCE() directly returns NULL.
    • The VARIANCE() function is an extension to standard SQL and can be replaced by the standard SQL function VAR_POP().
  • Data type of the input value: BIGINT or DOUBLE.
  • Data type of the return value: DOUBLE.
  • Example:
    SELECT variance(a)  FROM testtable;
    +----------------------------+
    |         variance(a)        |
    +----------------------------+
    |    0.6666666666666666      |               

VAR_POP

var_pop(x)
  • Description: This function returns the population standard variance for a group of integers, decimals, or floating-point numbers.
    Note
    • VAR_POP() ignores rows with NULL values. If a group of values are all NULL, VAR_POP() directly returns NULL.
    • You can also use the VARIANCE() function, which is equivalent but is not a standard SQL function.
  • Data type of the input value: BIGINT or DOUBLE.
  • Data type of the return value: DOUBLE.
  • Example:
    SELECT var_pop(a) FROM testtable;
    The following result is returned:
    +--------------------+
    | var_pop(a)         |
    +--------------------+
    | 0.6666666666666666 |
    +--------------------+

VAR_SAMP

var_samp(x)
  • Description: This function returns the sample variance for a group of integers, decimals, or floating-point numbers.
  • Data type of the input value: BIGINT or DOUBLE.
  • Data type of the return value: DOUBLE.
  • Example:
    SELECT var_samp(a)  FROM testtable;
    The following result is returned:
    +-------------+
    | var_samp(a) |
    +-------------+
    |         1.0 |
    +-------------+

GROUP_CONCAT

GROUP_CONCAT([DISTINCT] col_name
             [ORDER BY col_name [ASC | DESC]]
             [SEPARATOR str_val])
Clause Required Description
DISTINCT No Specifies the columns for which to remove duplicate rows of data.
ORDER BY Specifies the columns to be sorted within a group. The following sorting methods are available:
  • ASC: ascending order.
  • DESC: descending order.

The default sorting method is ascending.

SEPARATOR Specifies the delimiter used to separate values within a group.

The default delimiter is a comma (,).

  • Description: This function returns a string result of the concatenated values from a group. The values are the returned results of the GROUP BY clause.
    Note NULL is returned only if all values in the columns to be concatenated by using the GROUP_CONCAT function are NULL.
  • Data type of the input value: STRING.
  • Data type of the return value: STRING.
  • Example:
    The following example demonstrates how to use the GROUP_CONCAT function. A table named person is created in the example by using the following statement:
    CREATE TABLE person(id INT,name VARCHAR,age INT ) DISTRIBUTED BY HASH(id);
    The following statement is used to insert data into the table:
    INSERT INTO person VALUES (1,'mary',13),(2,'eva',14),(2,'adam',13),(3,'eva',13),(3,null,13),(3,null,null),(4,null,13),(4,null,null);
    In the following statement, data is grouped by the id column, and the values in the name column that have the same ID are displayed by using the GROUP_CONCAT function. Duplicate values in the name column are removed, and the table is sorted in descending order by the name column. Multiple name column values with the same ID are separated by a number sign (#).
    SELECT
      id,
      GROUP_CONCAT(DISTINCT name ORDER BY name DESC SEPARATOR '#')
    FROM person 
      GROUP BY id;
    The following result is returned:
    +------+--------------------------------------------------------------+
    | id   | GROUP_CONCAT(DISTINCT name ORDER BY name DESC SEPARATOR '#') |
    +------+--------------------------------------------------------------+
    |    2 | eva#adam                                                     |
    |    1 | mary                                                         |
    |    4 | NULL                                                         |
    |    3 | eva                                                          |
    +------+--------------------------------------------------------------+