The ROLLUP
option allows you to execute only a single SQL statement to calculate the subtotal
for each group in your SQL query and the total for all groups. The groups are created
based on dimensions. This topic describes how to use the ROLLUP
option.
Prerequisites
Your PolarDB cluster is a cluster of PolarDB for MySQL that runs MySQL 8.0 and the revision version of your cluster is V8.0.1.1.0 or later. For more information about how to view the version, see Query the engine version.
Syntax
The ROLLUP
option is an extension of the GROUP BY
clause. You need only to add WITH ROLLUP
after the column names that are specified in the GROUP BY
clause. The following code provides an example:
SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
The ROLLUP
option calculates an aggregate result for each column that is specified in the GROUP
BY clause. The ROLLUP option calculates high-level subtotals based on the right-to-left
order of the specified columns. After a subtotal is calculated each time, the first
column from the right is removed from the list of columns for the next calculation.
This process continues until a total is produced. In the preceding example, the profits
are calculated based on the following order of clauses: GROUP BY year, country, product
, GROUP BY year, country
, and GROUP BY year
. Then, a grand total of the profits is calculated for the entire sales table. During
the calculation of the grand total, the conditions specified in the GROUP BY
clause are ignored.
ROLLUP
option provides the following benefits:
- The ROLLUP option facilitates multi-dimensional statistical analysis and reduces the complexity of SQL queries for multi-dimensional analysis.
- The ROLLUP option improves the efficiency of processing queries.
- The
ROLLUP
option allows the server to perform all the aggregation operations. The client can access data only once to collect statistics. This reduces the processing loads and network traffic on the client. If you do not use the ROLLUP option, you must run multiple queries to collect the same statistics.
Test the performance of parallel queries after the ROLLUP option is used
PolarDB uses the ROLLUP
option to improve the capabilities of parallel queries. After the ROLLUP option is
used, multiple threads are used to aggregate data in parallel and produce aggregation
results. This improves the efficiency of executing statements.
TPC Benchmark™H (TPC-H) is used for testing. TPC-H provides 22 SQL queries. In the
following example, the first SQL query is used. In the following statement, the ROLLUP
option is added in the GROUP BY clause.
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date_sub('1998-12-01', interval ':1' day)
group by
l_returnflag,
l_linestatus
with rollup
order by
l_returnflag,
l_linestatus;
- When parallel queries are disabled, 318.73s are consumed to execute the statement.
- After parallel queries are enabled, 22.30s are consumed to execute the statement.
The efficiency of executing the statement is improved by more than 14 times.