In relational databases, you must use multiple SELECT and UNION statements to group results based on multiple groups of dimensions. PolarDB-X provides GROUPING SETS, ROLLUP, and CUBE extensions that allow you to group results based on multiple groups of dimensions. In addition, PolarDB-X allows you to use the GROUPING and GROUPING_ID functions in a SELECT statement or a HAVING clause. This can be used to explain the results of the preceding extensions. This topic describes the syntax and provides examples for the GROUPING SETS, ROLLUP, and CUBE extensions and the GROUPING and GROUPING_ID functions.

Note

  • In the syntax of all GROUP BY extensions described in this topic, SQL queries cannot be pushed down to the LogicalView operators for execution. For more information about SQL query pushdown, see Push down and rewrite queries.
  • The following test data is used in the examples provided in this topic.
    Execute the following statement to create a table named requests:
    CREATE TABLE requests (
      `id` int(10) UNSIGNED NOT NULL,
      `os` varchar(20) DEFAULT NULL,
      `device` varchar(20) DEFAULT NULL,
      `city` varchar(20) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE = InnoDB DEFAULT CHARSET = utf8 dbpartition BY hash(`id`) tbpartition BY hash(`id`);
    Execute the following statement to insert the required test data into the requests table:
    INSERT INTO requests (id, os, device, city) VALUES
    (1, 'windows', 'PC', 'Beijing'),
    (2, 'windows', 'PC', 'Shijiazhuang'),
    (3, 'linux', 'Phone', 'Beijing'),
    (4, 'windows', 'PC', 'Beijing'),
    (5, 'ios', 'Phone', 'Shijiazhuang'),
    (6, 'linux', 'PC', 'Beijing'),
    (7, 'windows', 'Phone', 'Shijiazhuang');

GROUPING SETS extension

  • Overview

    GROUPING SETS is an extension of the GROUP BY clause and can be used to generate a result set. The result set is a concatenation of multiple result sets based on different groups. The result returned by the GROUPING SETS extension is similar to that of the UNION ALL operator. The UNION ALL operator and the GROUPING SETS expansion do not remove duplicate rows from the combined result sets.

  • Syntax
    GROUPING SETS (
      { expr_1 | ( expr_1a [, expr_1b ] ...) |
        ROLLUP ( expr_list ) | CUBE ( expr_list )
      } [, ...] )
    Note A GROUPING SETS extension can contain one or more comma-separated expressions, such as expr_1 or (expr_1a [, expr_1b ] ...), and lists of expressions enclosed in parentheses (), such as ( expr_list ). In the syntax:
    • Each expression can be used to determine how the result set is grouped.
    • You can nest a ROLLUP or CUBE extension in a GROUPING SETS extension.
  • Examples
    • You can group the data that you want to query by using a GROUPING SETS extension. The following code block provides the relevant syntax:
      select os,device, city ,count(*)
      from requests
      group by grouping sets((os, device), (city), ());

      The preceding statement is equivalent to the following statement:

      select os, device, NULL, count(*)
      from requests group by os, device
      union all
      select NULL, NULL, NULL, count(*)
      from requests
      union all
      select null, null, city, count(*)
      from requests group by city;
      The following result is returned:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | windows | PC     | NULL         |        3 |
      | linux   | PC     | NULL         |        1 |
      | linux   | Phone  | NULL         |        1 |
      | windows | Phone  | NULL         |        1 |
      | ios     | Phone  | NULL         |        1 |
      | NULL    | NULL   | Shijiazhuang |        3 |
      | NULL    | NULL   | Beijing      |        4 |
      | NULL    | NULL   | NULL         |        7 |
      +---------+--------+--------------+----------+
      Note If an expression is not used in a grouping set, NULL is used as a placeholder for the expression. This way, operations can be performed on the result set that is not used in the grouping set. For example, the result set is a group of the rows for which the values of the city column are NULL in the returned result.
    • You can group data by nesting a ROLLUP extension in a GROUPING SETS extension. The following code block shows the relevant syntax:
      select os,device, city ,count(*) from requests 
      group by grouping sets((city), ROLLUP(os, device));
      
      The preceding statement is equivalent to the following statement:
      select os,device, city ,count(*) from requests 
      group by grouping sets((city), (os), (os, device), ());
      The following result is returned:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | NULL    | NULL   | Shijiazhuang |        3 |
      | NULL    | NULL   | Beijing      |        4 |
      | windows | PC     | NULL         |        3 |
      | linux   | PC     | NULL         |        1 |
      | ios     | Phone  | NULL         |        1 |
      | linux   | Phone  | NULL         |        1 |
      | windows | Phone  | NULL         |        1 |
      | windows | NULL   | NULL         |        4 |
      | linux   | NULL   | NULL         |        2 |
      | ios     | NULL   | NULL         |        1 |
      | NULL    | NULL   | NULL         |        7 |
      +---------+--------+--------------+----------+
    • You can group data by specifying a CUBE extension nested in a GROUPING SETS extension. The following code block shows the relevant syntax:
      select os,device, city ,count(*) from requests 
      group by grouping sets((city), CUBE(os, device));
      
      The preceding statement is equivalent to the following statement:
      select os,device, city ,count(*) from requests 
      group by grouping sets((city), (os), (os, device), (), (device));
      The following result is returned:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | NULL    | NULL   | Beijing      |        4 |
      | NULL    | NULL   | Shijiazhuang |        3 |
      | windows | PC     | NULL         |        3 |
      | ios     | Phone  | NULL         |        1 |
      | linux   | Phone  | NULL         |        1 |
      | windows | Phone  | NULL         |        1 |
      | linux   | PC     | NULL         |        1 |
      | windows | NULL   | NULL         |        4 |
      | ios     | NULL   | NULL         |        1 |
      | linux   | NULL   | NULL         |        2 |
      | NULL    | PC     | NULL         |        4 |
      | NULL    | Phone  | NULL         |        3 |
      | NULL    | NULL   | NULL         |        7 |
      +---------+--------+--------------+----------+
    • You can combine the GROUP BY clause and the CUBE, and GROUPING SETS extensions to generate grouping sets, as shown in the following example:
      select os,device, city, count(*)
      from requests 
      group by os, cube(os,device), grouping sets(city);
      
      The preceding statement is equivalent to the following statement:
      select os,device, city, count(*)
      from requests 
      group by grouping sets((os,device,city),(os,city),(os,device,city));
      The following result is returned:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | linux   | Phone  | Beijing      |        1 |
      | windows | Phone  | Shijiazhuang |        1 |
      | windows | PC     | Shijiazhuang |        1 |
      | linux   | PC     | Beijing      |        1 |
      | windows | PC     | Beijing      |        2 |
      | ios     | Phone  | Shijiazhuang |        1 |
      | linux   | NULL   | Beijing      |        2 |
      | windows | NULL   | Shijiazhuang |        2 |
      | windows | NULL   | Beijing      |        2 |
      | ios     | NULL   | Shijiazhuang |        1 |
      +---------+--------+--------------+----------+

ROLLUP extension

  • Overview

    A ROLLUP extension generates a hierarchical set of groups. This set contains a grand total and subtotals for each hierarchical group. The hierarchical order is determined by the order of the expressions that are specified in the ROLLUP expression list. The top of the hierarchy is the first item from the left side of the list. Each successive item that proceeds to the right side moves down the hierarchy. The last item in the hierarchy is at the lowest level.

  • Syntax
    ROLLUP ( { expr_1 | ( expr_1a [, expr_1b ] ...) }
      [, expr_2 | ( expr_2a [, expr_2b ] ...) ] ...)
    Note
    • Each expression is used to determine how the result set is grouped. If the expressions are enclosed in parentheses (), such as ( expr_1a, expr_1b, ...), the combination of values returned by expr_1a and expr_1b indicates a single grouping level of the hierarchy.
    • For the first item in the list, such as expr_1 or the combination of ( expr_1a, expr_1b, ...), PolarDB-X returns a subtotal for each unique value of the first item. For the second item in the list, such as expr_2 or the combination of ( expr_2a, expr_2b, ...), PolarDB-X returns a subtotal for each unique value of each group in the second item. Similar rules are used in each grouping level of the subsequent items. As a result, PolarDB-X returns a grand total for the entire result set.
    • For the subtotal rows, NULL is returned for the items for which the subtotal is taken.
  • Examples
    • ROLLUP is used to aggregate (os, device, city) in a hierarchical manner to generate grouping sets. The following code block shows the relevant syntax:
      select os,device, city, count(*)
      from requests 
      group by rollup (os, device, city);
      
      The preceding statement is equivalent to the following statement:
      select os,device, city, count(*)
      from requests 
      group by os, device, city with rollup;
      
      The first statement is also equivalent to the following statement:
      select os,device, city, count(*)
      from requests 
      group by grouping sets ((os, device, city),(os, device),(os),());
      The following result is returned:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | windows | PC     | Beijing      |        2 |
      | ios     | Phone  | Shijiazhuang |        1 |
      | windows | PC     | Shijiazhuang |        1 |
      | linux   | PC     | Beijing      |        1 |
      | linux   | Phone  | Beijing      |        1 |
      | windows | Phone  | Shijiazhuang |        1 |
      | windows | PC     | NULL         |        3 |
      | ios     | Phone  | NULL         |        1 |
      | linux   | PC     | NULL         |        1 |
      | linux   | Phone  | NULL         |        1 |
      | windows | Phone  | NULL         |        1 |
      | windows | NULL   | NULL         |        4 |
      | ios     | NULL   | NULL         |        1 |
      | linux   | NULL   | NULL         |        2 |
      | NULL    | NULL   | NULL         |        7 |
      +---------+--------+--------------+----------+
    • ROLLUP is used to aggregate os, (os,device), and city in a hierarchical manner to generate grouping sets. The following code block shows the relevant syntax:
      select os,device, city, count(*)
      from requests 
      group by rollup (os, (os,device), city);
      
      The preceding statement is equivalent to the following statement:
      select os,device, city, count(*)
      from requests 
      group by os, (os,device), city with rollup;
      
      The first statement is also equivalent to the following statement:
      select os,device, city, count(*)
      from requests 
      group by grouping sets ((os, device, city),(os, device),(os),());
      The following result is returned:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | windows | PC     | Beijing      |        2 |
      | windows | PC     | Shijiazhuang |        1 |
      | linux   | PC     | Beijing      |        1 |
      | linux   | Phone  | Beijing      |        1 |
      | windows | Phone  | Shijiazhuang |        1 |
      | ios     | Phone  | Shijiazhuang |        1 |
      | windows | PC     | NULL         |        3 |
      | linux   | PC     | NULL         |        1 |
      | linux   | Phone  | NULL         |        1 |
      | windows | Phone  | NULL         |        1 |
      | ios     | Phone  | NULL         |        1 |
      | windows | NULL   | NULL         |        4 |
      | linux   | NULL   | NULL         |        2 |
      | ios     | NULL   | NULL         |        1 |
      | NULL    | NULL   | NULL         |        7 |
      +---------+--------+--------------+----------+

CUBE extension

  • Overview

    A CUBE extension is similar to a ROLLUP extension. A ROLLUP extension generates groups and results in a hierarchy based on a left-to-right list of items in the ROLLUP expression list. A CUBE extension generates groupings and subtotals based on each permutation of all items in the CUBE expression list. A CUBE extension returns more rows in the generated result set than a ROLLUP extension performed on the same expression list.

  • Syntax
    CUBE ( { expr_1 | ( expr_1a [, expr_1b ] ...) }
      [, expr_2 | ( expr_2a [, expr_2b ] ...) ] ...)
    Note
    • Each expression is used to determine how the result set is grouped. If the expressions are enclosed in parentheses (), such as ( expr_1a, expr_1b, ...), the combination of values that are returned by expr_1a and expr_1b defines a single group.
    • For the first item in the list, such as expr_1 or the combination of ( expr_1a, expr_1b, ...), PolarDB-X returns a subtotal for each unique value of the first item. For the second item in the list, such as expr_2 or the combination of ( expr_2a, expr_2b, ...), PolarDB-X returns a subtotal for each unique value of the second item. A subtotal is also returned for each unique combination of the first and second items. If a third item exists, PolarDB-X returns a subtotal for each unique value of the third item, each unique combination of the third and first items, each unique combination of the third and second items, and each unique combination of the third, second, and first items. As a result, a grand total is returned for the entire result set.
    • For the subtotal rows, NULL is returned for the items for which the subtotal is taken
  • Examples
    • CUBE lists all possible combinations of (os, device, city) columns as grouping sets. The following code block shows the relevant syntax:
      select os,device, city, count(*)
      from requests 
      group by cube (os, device, city);
      
      The preceding statement is equivalent to the following statement:
      select os,device, city, count(*)
      from requests 
      group by grouping sets ((os, device, city),(os, device),(os, city),(device,city),(os),(device),(city),());
      The following result is returned:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | linux   | Phone  | Beijing      |        1 |
      | windows | Phone  | Shijiazhuang |        1 |
      | windows | PC     | Beijing      |        2 |
      | ios     | Phone  | Shijiazhuang |        1 |
      | windows | PC     | Shijiazhuang |        1 |
      | linux   | PC     | Beijing      |        1 |
      | linux   | Phone  | NULL         |        1 |
      | windows | Phone  | NULL         |        1 |
      | windows | PC     | NULL         |        3 |
      | ios     | Phone  | NULL         |        1 |
      | linux   | PC     | NULL         |        1 |
      | linux   | NULL   | Beijing      |        2 |
      | windows | NULL   | Shijiazhuang |        2 |
      | windows | NULL   | Beijing      |        2 |
      | ios     | NULL   | Shijiazhuang |        1 |
      | linux   | NULL   | NULL         |        2 |
      | windows | NULL   | NULL         |        4 |
      | ios     | NULL   | NULL         |        1 |
      | NULL    | Phone  | Beijing      |        1 |
      | NULL    | Phone  | Shijiazhuang |        2 |
      | NULL    | PC     | Beijing      |        3 |
      | NULL    | PC     | Shijiazhuang |        1 |
      | NULL    | Phone  | NULL         |        3 |
      | NULL    | PC     | NULL         |        4 |
      | NULL    | NULL   | Beijing      |        4 |
      | NULL    | NULL   | Shijiazhuang |        3 |
      | NULL    | NULL   | NULL         |        7 |
      +---------+--------+--------------+----------+
    • CUBE lists all the possible combinations of (os, device),(device, city) columns as grouping sets.
      select os,device, city, count(*) 
      from requests 
      group by cube ((os, device), (device, city));
      
      The preceding statement is equivalent to the following statement:
      select os,device, city, count(*) 
      from requests 
      group by grouping sets ((os, device, city),(os, device),(device,city),());
      The following result is returned:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | linux   | Phone  | Beijing      |        1 |
      | windows | Phone  | Shijiazhuang |        1 |
      | windows | PC     | Beijing      |        2 |
      | windows | PC     | Shijiazhuang |        1 |
      | linux   | PC     | Beijing      |        1 |
      | ios     | Phone  | Shijiazhuang |        1 |
      | linux   | Phone  | NULL         |        1 |
      | windows | Phone  | NULL         |        1 |
      | windows | PC     | NULL         |        3 |
      | linux   | PC     | NULL         |        1 |
      | ios     | Phone  | NULL         |        1 |
      | NULL    | Phone  | Beijing      |        1 |
      | NULL    | Phone  | Shijiazhuang |        2 |
      | NULL    | PC     | Beijing      |        3 |
      | NULL    | PC     | Shijiazhuang |        1 |
      | NULL    | NULL   | NULL         |        7 |
      +---------+--------+--------------+----------+

GROUPING and GROUPING_ID functions

  • Overview
    • GROUPING function

      When you use the GROUPING SETS, ROLLUP, or CUBE extensions in the GROUP BY clause, NULL is used as a placeholder in a return value of the GROUPING SETS extension. As a result, the placeholder NULL cannot be distinguished from the NULL value. In this case, you can use the GROUPING function provided by PolarDB-X to identify the placeholder and the NULL value.

      The GROUPING function allows you to use a column name as a parameter. If the corresponding rows are aggregated based on the column, 0 is returned in the result. In this case, NULL is a value. If the corresponding rows are not aggregated based on the column, 1 is returned. In this case, NULL is a placeholder in a return value of the GROUPING SETS extension.

    • GROUPING_ID function

      The GROUPING_ID function simplifies the GROUPING function. The GROUPING_ID function is used to determine the subtotal level of a row in the result set of a ROLLBACK, CUBE, or GROUPING SETS extension. The GROUPING function uses only one column expression and returns a value to indicate whether a row is a subtotal for all values of the specified column. Multiple GROUPING functions may be required to determine the levels of subtotals for queries that contain multiple grouping columns. The GROUPING_ID function supports one or more column expressions specified in the ROLLBACK, CUBE, or GROUPING SETS extensions and returns a single integer. This integer indicates the column on which a subtotal is aggregated.

  • Syntax
    • GROUPING function
      SELECT [ expr ...,] GROUPING( col_expr ) [, expr ] ...
      FROM ...
      GROUP BY { ROLLUP | CUBE | GROUPING SETS }( [...,] col_expr
        [, ...] ) [, ...]
      Note The GROUPING function uses a single parameter. This parameter must be an expression of a dimension column specified in the expression list of a ROLLUP, CUBE, or GROUPING SETS extension of the GROUP BY clause.
    • GROUPING_ID function
      SELECT [ expr ...,]
        GROUPING_ID( col_expr_1 [, col_expr_2 ] ... )
        [, expr ] ...
      FROM ...
      GROUP BY { ROLLUP | CUBE | GROUPING SETS }( [...,] col_expr_1
        [, col_expr_2 ] [, ...] ) [, ...]
  • Examples
    The GROUPING_ID function uses multiple column names as parameters. The function also converts the grouping results of the parameter columns into integers by using the bitmap algorithm. The following code block shows the relevant syntax:
    select a,b,c,count(*),
    grouping(a) ga, grouping(b) gb, grouping(c) gc, grouping_id(a,b,c) groupingid 
    from (select 1 as a ,2 as b,3 as c)
    group by cube(a,b,c);
    The following result is returned:
    +------+------+------+----------+------+------+------+------------+
    | a    | b    | c    | count(*) | ga   | gb   | gc   | groupingid |
    +------+------+------+----------+------+------+------+------------+
    |    1 |    2 |    3 |        1 |    0 |    0 |    0 |          0 |
    |    1 |    2 | NULL |        1 |    0 |    0 |    1 |          1 |
    |    1 | NULL |    3 |        1 |    0 |    1 |    0 |          2 |
    |    1 | NULL | NULL |        1 |    0 |    1 |    1 |          3 |
    | NULL |    2 |    3 |        1 |    1 |    0 |    0 |          4 |
    | NULL |    2 | NULL |        1 |    1 |    0 |    1 |          5 |
    | NULL | NULL |    3 |        1 |    1 |    1 |    0 |          6 |
    | NULL | NULL | NULL |        1 |    1 |    1 |    1 |          7 |
    +------+------+------+----------+------+------+------+------------+