All Products
Search
Document Center

PolarDB:Aggregate functions

Last Updated:Jan 21, 2025

This topic describes the functions that activate auto aggregation.

Prepare data

SELECT create_graph('graph_name');

SELECT * FROM cypher('graph_name', $$
 CREATE (a:Person {name: 'A', age: 13}),
 (b:Person {name: 'B', age: 33, eyes: "blue"}),
 (c:Person {name: 'C', age: 44, eyes: "blue"}),
 (d1:Person {name: 'D', eyes: "brown"}),
 (d2:Person {name: 'D'}),
 (a)-[:KNOWS]->(b),
 (a)-[:KNOWS]->(c),
 (a)-[:KNOWS]->(d1),
 (b)-[:KNOWS]->(d2),
 (c)-[:KNOWS]->(d2)
$$) as (a agtype);

min

min() returns the minimum value in a set of values.

Syntax

min(expression)

Return value

Depending on the value returned by the expression, it can be an property type or a list.

Parameters

Parameter

Description

expression

An expression that returns a set that contains any combination of property types and lists.

Usage notes

  • Any null values are excluded from the calculation.

  • In a mixed set, any string value is always considered to be less than any numeric value, and any list is always considered to be less than any string.

  • Lists are compared in dictionary order, which means the list elements are compared by pair in ascending order from the start to the end of the list.

  • min(null) returns null.

Example

SELECT *
FROM cypher('graph_name', $$
    MATCH (v:Person)
    RETURN min(v.age)
$$) as (min_age agtype);

Returns the minimum value of all values in the property age.

 min_age 
---------
 13
(1 row)

Use min() and lists

  1. To prepare data, Run the following three commands to prepare data

    SELECT * FROM cypher('graph_name', $$
        CREATE (:min_test {val:'d'})
    $$) as (result agtype);
    
    SELECT * FROM cypher('graph_name', $$
        CREATE (:min_test {val:['a', 'b', 23]})
    $$) as (result agtype);
    
    SELECT * FROM cypher('graph_name', $$
        CREATE (:min_test {val:[1, 'b', 23]})
    $$) as (result agtype);
  2. Use min() and lists.

    SELECT *
    FROM cypher('graph_name', $$
        MATCH (v:min_test)
        RETURN min(v.val)
    $$) as (min_val agtype);

    Returns the minimum value of all values in the set. In this case, the list ['a', 'b', 23'] is returned because the ['a', 'b', 23] and [1, 'b', 23] lists are considered to be less than the string d, and the string a in ['a', 'b', 23] is considered to be less than the value 1.

        min_val     
    ----------------
     ["a", "b", 23]
    (1 row)

max

max() returns the maximum value in a set of values.

Syntax

max(expression)

Return value

Depending on the value returned by the expression, it can be an property type or a list.

Parameters

Parameter

Description

expression

An expression that returns a set that contains any combination of property types and lists.

Usage notes

  • Any null values are excluded from the calculation.

  • In a mixed set, any string value is always considered to be less than any numeric value, and any list is always considered to be less than any string.

  • Lists are compared in dictionary order, which means the list elements are compared by pair in ascending order from the start to the end of the list.

  • max(null) returns null.

Example

SELECT *
FROM cypher('graph_name', $$
    MATCH (n:Person)
    RETURN max(n.age)
$$) as (max_age agtype);

Returns the maximum value of all values in the property age.

 max_age 
---------
 44
(1 row)

stDev

stDev() returns the standard deviation of the given value within a group. It uses a standard two-pass method with N - 1 as the denominator, and should be used when taking a sample of the population for an unbiased estimate. When the standard deviation of the entire population is being calculated, stDevP should be used.

Syntax

stDev(expression)

Return value

An agtype floating-point number.

Parameters

Parameter

Description

expression

An agtype numeric expression.

Usage notes

  • Any null values are excluded from the calculation.

  • stDev(null) returns 0.0 (zero).

Example

SELECT *
FROM cypher('graph_name', $$
   MATCH (n:Person)
   RETURN stDev(n.age)
$$) as (stdev_age agtype);

The standard deviation of the values in the property age is returned.

    stdev_age     
--------------------
 15.716233645501712
(1 row)

stDevP

stDevP() returns the standard deviation of the given value within a group. It uses a standard two-pass method, with N as the denominator, and should be used when calculating the standard deviation for an entire population. When the standard deviation of only a sample of the population is being calculated, stDev should be used.

Syntax

stDevP(expression)

Return value

An agtype floating-point number.

Parameters

Parameter

Description

expression

An agtype numeric expression.

Usage notes

  • Any null values are excluded from the calculation.

  • stDevP(null) returns 0.0 (zero).

Example

SELECT *
FROM cypher('graph_name', $$
    MATCH (n:Person)
    RETURN max(n.age)
$$) as (max_age agtype);

The population standard deviation of the values in the property age is returned.

    stdevp_age    
--------------------
 12.832251036613439
(1 row)

percentileCont

percentileCont() returns the percentile of a given value in a group, with percentiles ranging from 0.0 to 1.0. It uses a linear interpolation method, calculating a weighted average between two values if the desired percentile lies between them. For nearest values using a rounding method, see percentileDisc.

Syntax

percentileCont(expression, percentile)

Return value

An agtype floating-point number.

Parameters

Parameter

Description

expression

An agtype numeric expression.

percentile

Percentile. An agtype number value between 0.0 and 1.0

Usage notes

  • Any null values are excluded from the calculation.

  • percentileCont(null, percentile) returns null.

Example

SELECT *
FROM cypher('graph_name', $$
    MATCH (n:Person)
    RETURN percentileCont(n.age, 0.4)
$$) as (percentile_cont_age agtype);

The 40th percentile of the values in the property age is returned, calculated with a weighted average. In this case, the 0.4 is the median, which is the 40th percentile.

 percentile_cont_age 
---------------------
 29.0
(1 row)

percentileDisc

The percentileDisc() returns the percentile of the given value in a group, with a percentile from 0.0 to 1.0. It uses a rounding method and calculates the nearest value to the percentile. For more information about the interpolated values, see percentileCont.

Syntax

percentileDisc(expression, percentile)

Return value

An agtype floating-point number.

Parameters

Parameter

Description

expression

An agtype numeric expression.

percentile

Percentile. An agtype number value between 0.0 and 1.0

Usage notes

  • Any null values are excluded from the calculation.

  • percentileDisc(null, percentile) returns null.

Example

SELECT *
FROM cypher('graph_name', $$
    MATCH (n:Person)
    RETURN percentileDisc(n.age, 0.5)
$$) as (percentile_disc_age agtype);

The 50th percentile of the values in the property age is returned.

 percentile_disc_age 
---------------------
 33.0
(1 row)

count

count() returns the number of values or records, and appears in two variants:

  • count(*) returns the number of matching records.

  • count(expr) returns the number of non-null values returned by an expression.

Syntax

count(expression)

Return value

An agtype integer.

Parameters

Parameter

Description

expression

An agtype numeric expression.

Usage notes

  • count(*) includes records that return null.

  • count(expr) ignores null values.

  • count(null) returns 0 (zero).

  • count(*) can be used to return the number of nodes. For example, the number of nodes connected to a certain node n.

Example

  • SELECT *
    FROM cypher('graph_name', $$
        MATCH (n {name: 'A'})-[]->(x)
        RETURN n.age, count(*)
    $$) as (age agtype, number_of_people agtype);

    The age property of the start node n (with a name value of A) and the number of nodes related to n are returned.

     age | number_of_people 
    -----+------------------
     13  | 3
    (1 row)
  • Use count(*) to group and count relationship types and return the number of each relationship type.

    SELECT *
    FROM cypher('graph_name', $$
        MATCH (n {name: 'A'})-[r]->()
        RETURN type(r), count(*)
    $$) as (label agtype, count agtype);

    The relationship type and the number of relationships with that type are returned.

      label  | count 
    ---------+-------
     "KNOWS" | 3
    (1 row)

Use count(expression) to return the number of values

Instead of simply using the count(*) to return the number of records, it may be more useful to return the actual number of values returned by an expression.

Example

SELECT *
FROM cypher('graph_name', $$
    MATCH (n {name: 'A'})-[]->(x)
    RETURN count(x)
$$) as (count agtype);

The number of nodes connected to the start node n is returned.

 count 
-------
 3
(1 row)

Count non-null values

count(expression) can be used to return the number of non-null values returned by the expression.

Example

SELECT *
FROM cypher('graph_name', $$
    MATCH (n:Person)
    RETURN count(n.age)
$$) as (count agtype);

The number of nodes with the label Person that have a non-null value for the age property is returned.

 count 
-------
 3
(1 row)

Count with and without duplicates

Example

In the following example, we try to find all our "friends of friends" and count their number.

  • The first aggregate function count(DISTINCT friend_of_friend) counts a friend_of_friend only once, because DISTINCT removes duplicates.

  • The second aggregate function count(friend_of_friend) counts the same friend_of_friend multiple times.

SELECT *
FROM cypher('graph_name', $$
    MATCH (me:Person)-[]->(friend:Person)-[]->(friend_of_friend:Person)
    WHERE me.name = 'A'
    RETURN count(DISTINCT friend_of_friend), count(friend_of_friend)
$$) as (friend_of_friends_distinct agtype, friend_of_friends agtype);

Both B and C know D. If you do not use DISTINCT, D is counted twice.

 friend_of_friends_distinct | friend_of_friends 
----------------------------+-------------------
 1                          | 2
(1 row)

avg

avg() returns the average of a set of numeric values.

Syntax

avg(expression)

Return value

An agtype integer.

Parameters

Parameter

Description

expression

An expression that returns a set of numeric values.

Usage notes

  • Any null values are excluded from the calculation.

  • avg(null) returns null.

Example

SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN avg(n.age)
$$) as (avg_age agtype);

The average of all the values in the property age is returned.

 avg_age 
---------
 30.0
(1 row)

sum

sum() returns the sum of a set of numeric values.

Syntax

sum(expression)

Return value

An agtype floating-point number.

Parameters

Parameter

Description

expression

An expression that returns a set of numeric values.

Usage notes

  • Any null values are excluded from the calculation.

  • sum(null) returns null.

Example

SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN sum(n.age)
$$) as (total_age agtype);

Returns the sum of all values in the property age.

 total_age 
-----------
 90
(1 row)