GROUPING SETS is an extension of the GROUP BY clause that lets you compute aggregations across multiple grouping dimensions in a single query—without chaining multiple SELECT statements with UNION ALL. This allows MaxCompute to generate more efficient execution plans with higher performance.
Key concepts
The following extensions work with GROUPING SETS.
| Extension | Description |
|---|---|
CUBE | Generates aggregations for all possible combinations of the specified columns. |
ROLLUP | Generates aggregations at each level of a column hierarchy, from most granular to a grand total. |
GROUPING | Returns 0 if a row is aggregated on the specified column (NULL is a real value), or 1 if the column is not part of the current grouping set (NULL is a placeholder). |
GROUPING_ID | Returns a bitmap integer that encodes the GROUPING result for one or more columns. |
GROUPING__ID | A no-parameter variant for Hive-compatible queries. Equivalent to GROUPING_ID(<GROUP BY column list>), with columns in the same order as the GROUP BY clause. Recommended for use with Apache Hive 2.3.0 or later. |
Usage notes
NULL in results
When GROUPING SETS merges multiple result sets—each grouped by different criteria—columns that are not part of the current grouping set are filled with NULL. This NULL is a placeholder, not a real null value. Use the GROUPING function to distinguish between the two (see GROUPING and GROUPING_ID).
Combining multiple grouping expressions
When you mix GROUP BY, CUBE, ROLLUP, and GROUPING SETS in the same clause, MaxCompute applies cross-product semantics—each combination of grouping sets from all expressions is included in the final result.
For example:
GROUP BY os, CUBE (os, device), GROUPING SETS (city)
-- Equivalent to:
GROUP BY GROUPING SETS ((os, device, city), (os, city), (os, device, city))Prepare sample data
All examples in this topic use the same requests table.
CREATE TABLE requests LIFECYCLE 20 AS
SELECT * FROM VALUES
(1, 'windows', 'PC', 'Beijing'),
(2, 'windows', 'PC', 'Shijiazhuang'),
(3, 'linux', 'Phone', 'Beijing'),
(4, 'windows', 'PC', 'Beijing'),
(5, 'ios', 'Phone', 'Shijiazhuang'),
(6, 'linux', 'PC', 'Beijing'),
(7, 'windows', 'Phone', 'Shijiazhuang')
AS t(id, os, device, city);GROUPING SETS examples
Basic GROUPING SETS
The following example groups requests by (os, device), by city, and produces a grand total—all in one query.
Without GROUPING SETS (three separate scans):
SELECT NULL, NULL, NULL, COUNT(*) FROM requests
UNION ALL
SELECT os, device, NULL, COUNT(*) FROM requests GROUP BY os, device
UNION ALL
SELECT NULL, NULL, city, COUNT(*) FROM requests GROUP BY city;With GROUPING SETS (one scan):
SELECT os, device, city, COUNT(*)
FROM requests
GROUP BY GROUPING SETS ((os, device), (city), ());Result:
+------------+------------+--------------+-----+
| os | device | city | _c3 |
+------------+------------+--------------+-----+
| NULL | NULL | NULL | 7 |
| NULL | NULL | Beijing | 4 |
| NULL | NULL | Shijiazhuang | 3 |
| ios | Phone | NULL | 1 |
| linux | PC | NULL | 1 |
| linux | Phone | NULL | 1 |
| windows | PC | NULL | 3 |
| windows | Phone | NULL | 1 |
+------------+------------+--------------+-----+NULL in the city column for rows 4–8 is a placeholder indicating that city was not part of that grouping set—not a null value in the data.
CUBE examples
CUBE expands to all possible combinations of the specified columns.
Example 1: All individual columns
SELECT os, device, city, COUNT(*)
FROM requests
GROUP BY CUBE (os, device, city);
-- Equivalent to:
-- GROUP BY GROUPING SETS ((os,device,city),(os,device),(os,city),(device,city),(os),(device),(city),())Result:
+------------+------------+--------------+-----+
| os | device | city | _c3 |
+------------+------------+--------------+-----+
| NULL | NULL | NULL | 7 |
| NULL | NULL | Beijing | 4 |
| NULL | NULL | Shijiazhuang | 3 |
| NULL | PC | NULL | 4 |
| NULL | PC | Beijing | 3 |
| NULL | PC | Shijiazhuang | 1 |
| NULL | Phone | NULL | 3 |
| NULL | Phone | Beijing | 1 |
| NULL | Phone | Shijiazhuang | 2 |
| ios | NULL | NULL | 1 |
| ios | NULL | Shijiazhuang | 1 |
| ios | Phone | NULL | 1 |
| ios | Phone | Shijiazhuang | 1 |
| linux | NULL | NULL | 2 |
| linux | NULL | Beijing | 2 |
| linux | PC | NULL | 1 |
| linux | PC | Beijing | 1 |
| linux | Phone | NULL | 1 |
| linux | Phone | Beijing | 1 |
| windows | NULL | NULL | 4 |
| windows | NULL | Beijing | 2 |
| windows | NULL | Shijiazhuang | 2 |
| windows | PC | NULL | 3 |
| windows | PC | Beijing | 2 |
| windows | PC | Shijiazhuang | 1 |
| windows | Phone | NULL | 1 |
| windows | Phone | Shijiazhuang | 1 |
+------------+------------+--------------+-----+Example 2: Column groups
When you pass column groups to CUBE, each group is treated as a single unit in the combination logic.
SELECT os, device, city, COUNT(*)
FROM requests
GROUP BY CUBE ((os, device), (device, city));
-- Equivalent to:
-- GROUP BY GROUPING SETS ((os,device,city),(os,device),(device,city),())Result:
+------------+------------+--------------+-----+
| os | device | city | _c3 |
+------------+------------+--------------+-----+
| NULL | NULL | NULL | 7 |
| NULL | PC | Beijing | 3 |
| NULL | PC | Shijiazhuang | 1 |
| NULL | Phone | Beijing | 1 |
| NULL | Phone | Shijiazhuang | 2 |
| ios | Phone | NULL | 1 |
| ios | Phone | Shijiazhuang | 1 |
| linux | PC | NULL | 1 |
| linux | PC | Beijing | 1 |
| linux | Phone | NULL | 1 |
| linux | Phone | Beijing | 1 |
| windows | PC | NULL | 3 |
| windows | PC | Beijing | 2 |
| windows | PC | Shijiazhuang | 1 |
| windows | Phone | NULL | 1 |
| windows | Phone | Shijiazhuang | 1 |
+------------+------------+--------------+-----+ROLLUP examples
ROLLUP generates aggregations at each level of a hierarchy, from the most granular grouping down to a grand total.
Example 3: Simple hierarchy
SELECT os, device, city, COUNT(*)
FROM requests
GROUP BY ROLLUP (os, device, city);
-- Equivalent to:
-- GROUP BY GROUPING SETS ((os,device,city),(os,device),(os),())Result:
+------------+------------+--------------+-----+
| os | device | city | _c3 |
+------------+------------+--------------+-----+
| NULL | NULL | NULL | 7 |
| ios | NULL | NULL | 1 |
| ios | Phone | NULL | 1 |
| ios | Phone | Shijiazhuang | 1 |
| linux | NULL | NULL | 2 |
| linux | PC | NULL | 1 |
| linux | PC | Beijing | 1 |
| linux | Phone | NULL | 1 |
| linux | Phone | Beijing | 1 |
| windows | NULL | NULL | 4 |
| windows | PC | NULL | 3 |
| windows | PC | Beijing | 2 |
| windows | PC | Shijiazhuang | 1 |
| windows | Phone | NULL | 1 |
| windows | Phone | Shijiazhuang | 1 |
+------------+------------+--------------+-----+Example 4: Hierarchy with a column group
SELECT os, device, city, COUNT(*)
FROM requests
GROUP BY ROLLUP (os, (os, device), city);
-- Equivalent to:
-- GROUP BY GROUPING SETS ((os,device,city),(os,device),(os),())Result:
+------------+------------+--------------+-----+
| os | device | city | _c3 |
+------------+------------+--------------+-----+
| NULL | NULL | NULL | 7 |
| ios | NULL | NULL | 1 |
| ios | Phone | NULL | 1 |
| ios | Phone | Shijiazhuang | 1 |
| linux | NULL | NULL | 2 |
| linux | PC | NULL | 1 |
| linux | PC | Beijing | 1 |
| linux | Phone | NULL | 1 |
| linux | Phone | Beijing | 1 |
| windows | NULL | NULL | 4 |
| windows | PC | NULL | 3 |
| windows | PC | Beijing | 2 |
| windows | PC | Shijiazhuang | 1 |
| windows | Phone | NULL | 1 |
| windows | Phone | Shijiazhuang | 1 |
+------------+------------+--------------+-----+Example 5: Combining GROUP BY, CUBE, and GROUPING SETS
The following example applies cross-product semantics across three grouping expressions.
SELECT os, device, city, COUNT(*)
FROM requests
GROUP BY os, CUBE (os, device), GROUPING SETS (city);
-- Equivalent to:
-- GROUP BY GROUPING SETS ((os,device,city),(os,city),(os,device,city))Result:
+------------+------------+--------------+-----+
| os | device | city | _c3 |
+------------+------------+--------------+-----+
| ios | NULL | Shijiazhuang | 1 |
| ios | Phone | Shijiazhuang | 1 |
| linux | NULL | Beijing | 2 |
| linux | PC | Beijing | 1 |
| linux | Phone | Beijing | 1 |
| windows | NULL | Beijing | 2 |
| windows | NULL | Shijiazhuang | 2 |
| windows | PC | Beijing | 2 |
| windows | PC | Shijiazhuang | 1 |
| windows | Phone | Shijiazhuang | 1 |
+------------+------------+--------------+-----+Examples for GROUPING and GROUPING_ID
Because GROUPING SETS fills non-grouped columns with NULL, you cannot tell from the result alone whether a NULL came from the data or from the aggregation. GROUPING and GROUPING_ID resolve this ambiguity.
Distinguish NULL placeholders with GROUPING
GROUPING(col) returns:
0— the row is aggregated oncol(theNULL, if any, is a real null value)1—colis not part of the current grouping set (theNULLis a placeholder)
GROUPING_ID(col1, col2, ...) encodes the GROUPING result for multiple columns as a bitmap integer.
SELECT a, b, c, COUNT(*),
GROUPING(a) AS ga,
GROUPING(b) AS gb,
GROUPING(c) AS gc,
GROUPING_ID(a, b, c) AS groupingid
FROM VALUES (1, 2, 3) AS t(a, b, c)
GROUP BY CUBE (a, b, c);Result:
+------+------+------+-----+----+----+----+------------+
| a | b | c | _c3 | ga | gb | gc | groupingid |
+------+------+------+-----+----+----+----+------------+
| NULL | NULL | NULL | 1 | 1 | 1 | 1 | 7 |
| NULL | NULL | 3 | 1 | 1 | 1 | 0 | 6 |
| NULL | 2 | NULL | 1 | 1 | 0 | 1 | 5 |
| NULL | 2 | 3 | 1 | 1 | 0 | 0 | 4 |
| 1 | NULL | NULL | 1 | 0 | 1 | 1 | 3 |
| 1 | NULL | 3 | 1 | 0 | 1 | 0 | 2 |
| 1 | 2 | NULL | 1 | 0 | 0 | 1 | 1 |
| 1 | 2 | 3 | 1 | 0 | 0 | 0 | 0 |
+------+------+------+-----+----+----+----+------------+Replace NULL placeholders with readable labels
Use GROUPING inside an IF expression to display a label instead of NULL for placeholder rows.
SELECT
IF(GROUPING(os) = 0, os, 'ALL') AS os,
IF(GROUPING(device) = 0, device, 'ALL') AS device,
IF(GROUPING(city) = 0, city, 'ALL') AS city,
COUNT(*) AS count
FROM requests
GROUP BY os, device, city GROUPING SETS ((os, device), (city), ());Result:
+------------+------------+--------------+-------+
| os | device | city | count |
+------------+------------+--------------+-------+
| ALL | ALL | ALL | 7 |
| ALL | ALL | Beijing | 4 |
| ALL | ALL | Shijiazhuang | 3 |
| ios | Phone | ALL | 1 |
| linux | PC | ALL | 1 |
| linux | Phone | ALL | 1 |
| windows | PC | ALL | 3 |
| windows | Phone | ALL | 1 |
+------------+------------+--------------+-------+Example for GROUPING__ID
GROUPING__ID is a no-parameter alias for GROUPING_ID(<GROUP BY column list>), provided for Hive-compatible queries. The column order matches the GROUP BY clause.
If you use Hive 2.3.0 or later, we recommend that you use this function in MaxCompute. If you use a Hive version earlier than 2.3.0, we recommend that you do not use this function in MaxCompute.
SELECT a, b, c, COUNT(*), grouping__id
FROM VALUES (1, 2, 3) AS t(a, b, c)
GROUP BY a, b, c GROUPING SETS ((a, b, c), (a));
-- Equivalent to:
SELECT a, b, c, COUNT(*), GROUPING_ID(a, b, c)
FROM VALUES (1, 2, 3) AS t(a, b, c)
GROUP BY a, b, c GROUPING SETS ((a, b, c), (a));Result:
+------+------+------+-----+-----+
| a | b | c | _c3 | _c4 |
+------+------+------+-----+-----+
| 1 | NULL | NULL | 1 | 3 |
| 1 | 2 | 3 | 1 | 0 |
+------+------+------+-----+-----+CUBE and ROLLUP equivalences
The following tables show the GROUPING SETS equivalents for common CUBE and ROLLUP expressions.
CUBE equivalences
| Expression | Equivalent GROUPING SETS |
|---|---|
GROUP BY CUBE (a, b, c) | GROUPING SETS ((a,b,c),(a,b),(a,c),(b,c),(a),(b),(c),()) |
GROUP BY CUBE ((a, b), (c, d)) | GROUPING SETS ((a,b,c,d),(a,b),(c,d),()) |
GROUP BY a, CUBE (b, c), GROUPING SETS ((d), (e)) | GROUPING SETS ((a,b,c,d),(a,b,c,e),(a,b,d),(a,b,e),(a,c,d),(a,c,e),(a,d),(a,e)) |
ROLLUP equivalences
| Expression | Equivalent GROUPING SETS |
|---|---|
GROUP BY ROLLUP (a, b, c) | GROUPING SETS ((a,b,c),(a,b),(a),()) |
GROUP BY ROLLUP (a, (b, c), d) | GROUPING SETS ((a,b,c,d),(a,b,c),(a),()) |
GROUP BY GROUPING SETS ((b), (c), ROLLUP (a,b,c)) | GROUPING SETS ((b),(c),(a,b,c),(a,b),(a),()) |