The ROLLUP option allows you to execute only a single SQL statement to calculate the subtotal for each group in your SQL query and the total for all groups. The groups are created based on dimensions. This topic describes how to use the ROLLUP option.

Prerequisites

Your PolarDB cluster is a cluster of PolarDB for MySQL that runs MySQL 8.0 and the revision version of your cluster is V8.0.1.1.0 or later. For more information about how to view the version, see Query the engine version.

Syntax

The ROLLUP option is an extension of the GROUP BY clause. You need only to add WITH ROLLUP after the column names that are specified in the GROUP BY clause. The following code provides an example:

SELECT year, country, product, SUM(profit) AS profit
       FROM sales
       GROUP BY year, country, product WITH ROLLUP;

The ROLLUP option calculates an aggregate result for each column that is specified in the GROUP BY clause. The ROLLUP option calculates high-level subtotals based on the right-to-left order of the specified columns. After a subtotal is calculated each time, the first column from the right is removed from the list of columns for the next calculation. This process continues until a total is produced. In the preceding example, the profits are calculated based on the following order of clauses: GROUP BY year, country, product, GROUP BY year, country, and GROUP BY year. Then, a grand total of the profits is calculated for the entire sales table. During the calculation of the grand total, the conditions specified in the GROUP BY clause are ignored.

The ROLLUP option provides the following benefits:
  • The ROLLUP option facilitates multi-dimensional statistical analysis and reduces the complexity of SQL queries for multi-dimensional analysis.
  • The ROLLUP option improves the efficiency of processing queries.
  • The ROLLUP option allows the server to perform all the aggregation operations. The client can access data only once to collect statistics. This reduces the processing loads and network traffic on the client. If you do not use the ROLLUP option, you must run multiple queries to collect the same statistics.

Test the performance of parallel queries after the ROLLUP option is used

PolarDB uses the ROLLUP option to improve the capabilities of parallel queries. After the ROLLUP option is used, multiple threads are used to aggregate data in parallel and produce aggregation results. This improves the efficiency of executing statements.

TPC Benchmark™H (TPC-H) is used for testing. TPC-H provides 22 SQL queries. In the following example, the first SQL query is used. In the following statement, the ROLLUP option is added in the GROUP BY clause.

Note In this topic, a test is implemented based on the TPC-H benchmark test, but it does not meet all TPC-H benchmark specifications. Therefore, the test results may not match the published results of the TPC-H benchmark test.
select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    lineitem
where
    l_shipdate <= date_sub('1998-12-01', interval ':1' day)
group by
    l_returnflag,
    l_linestatus
with rollup
order by
    l_returnflag,
    l_linestatus;
  • When parallel queries are disabled, 318.73s are consumed to execute the statement. 1
  • After parallel queries are enabled, 22.30s are consumed to execute the statement. The efficiency of executing the statement is improved by more than 14 times. 2