All Products
Search
Document Center

MaxCompute:GROUPING SETS

Last Updated:Mar 26, 2026

GROUPING SETS is an extension of the GROUP BY clause that lets you aggregate data across multiple dimensions in a single query—without writing multiple SELECT ... UNION ALL statements. MaxCompute generates a single, optimized execution plan for the entire operation.

Syntax extensions

The following syntax extensions work with GROUPING SETS.

SyntaxDescription
CUBEGenerates all possible combinations of the specified columns as grouping sets. For N columns, CUBE produces 2^N grouping sets.
ROLLUPGenerates grouping sets in a hierarchical manner—from the most granular level to the grand total. For N elements, ROLLUP produces N+1 grouping sets.
GROUPINGDistinguishes null placeholders from actual null values in data. Takes a column name as a parameter. Returns 0 if the rows are aggregated by that column, or 1 otherwise.
GROUPING_IDTakes one or more column names as parameters and combines their GROUPING results into a single integer using a bitmap representation.
GROUPING__IDA Hive-compatible variant that takes no parameters. Equivalent to GROUPING_ID(<GROUP BY column list>), with parameters in the same order as the GROUP BY clause. Requires set odps.sql.hive.compatible=true. Recommended for use with Hive 2.3.0 or later.

How null placeholders work

When a column is not part of a grouping set, GROUPING SETS inserts NULL in that column as a placeholder—not because the value is missing from the data, but because the row represents an aggregation that does not include that dimension.

Use GROUPING to tell the two cases apart. For example:

if(grouping(city) == 0, city, 'ALL')

This replaces the null placeholder with 'ALL' when the row is a cross-city aggregate, while preserving an actual null from the data as-is.

Syntax equivalences

CUBE

CUBE over N columns generates all 2^N grouping sets:

group by cube (a, b, c)
-- Equivalent to:
grouping sets ((a,b,c),(a,b),(a,c),(b,c),(a),(b),(c),())

group by cube ((a,b),(c,d))
-- Equivalent to:
grouping sets (
  (a,b,c,d),
  (a,b),
  (c,d),
  ()
)

group by a, cube (b,c), grouping sets ((d),(e))
-- Equivalent to:
group by grouping sets (
  (a, b, c, d), (a, b, c, e),
  (a, b, d), (a, b, e),
  (a, c, d), (a, c, e),
  (a, d), (a, e)
)

ROLLUP

ROLLUP over N elements generates N+1 grouping sets, from the most granular level down to the grand total:

group by rollup (a,b,c)
-- Equivalent to:
grouping sets ((a,b,c),(a,b),(a),())

group by rollup (a,(b,c),d)
-- Equivalent to:
grouping sets (
  (a,b,c,d),
  (a,b,c),
  (a),
  ()
)

group by grouping sets((b),(c),rollup(a,b,c))
-- Equivalent to:
group by grouping sets (
  (b),(c),(a,b,c),(a,b),(a),()
)
List columns in ROLLUP from the highest level of the hierarchy to the lowest—for example, ROLLUP(year, month, day) rather than ROLLUP(day, month, year). Reversing the order produces different aggregation subtotals because ROLLUP generates each intermediate level by dropping the last column, so the rollup path depends on column order.

Examples

Prepare sample data

All examples in this topic use the following table:

create table requests lifecycle 20 as
select * from values
    (1, 'windows', 'PC', 'Beijing'),
    (2, 'windows', 'PC', 'Shijiazhuang'),
    (3, 'linux', 'Phone', 'Beijing'),
    (4, 'windows', 'PC', 'Beijing'),
    (5, 'ios', 'Phone', 'Shijiazhuang'),
    (6, 'linux', 'PC', 'Beijing'),
    (7, 'windows', 'Phone', 'Shijiazhuang')
as t(id, os, device, city);

The table has 7 rows with columns id, os, device, and city.

Use GROUPING SETS

The following example groups data by (os, device), by city, and by the grand total—all in one query.

Without GROUPING SETS, you need three separate queries joined by UNION ALL:

select NULL, NULL, NULL, count(*)
from requests
union all
select os, device, NULL, count(*)
from requests group by os, device
union all
select null, null, city, count(*)
from requests group by city;

With GROUPING SETS, a single query produces the same result:

select os, device, city, count(*)
from requests
group by grouping sets((os, device), (city), ());

This query produces three levels of aggregation:

  • Grand total across all rows (the () grouping set)

  • Count by city

  • Count by OS and device combination

Result:

+------------+------------+------------+------------+
| os         | device     | city       | _c3        |
+------------+------------+------------+------------+
| NULL       | NULL       | NULL       | 7          |
| NULL       | NULL       | Beijing    | 4          |
| NULL       | NULL       | Shijiazhuang | 3        |
| ios        | Phone      | NULL       | 1          |
| linux      | PC         | NULL       | 1          |
| linux      | Phone      | NULL       | 1          |
| windows    | PC         | NULL       | 3          |
| windows    | Phone      | NULL       | 1          |
+------------+------------+------------+------------+

Columns not included in a given grouping set appear as NULL. For example, city is NULL in rows 4–8 because those rows are grouped by (os, device), not by city.

Use CUBE

Example 1: All combinations of three columns

CUBE(os, device, city) generates all 2^3 = 8 grouping sets, covering every possible combination of the three columns.

select os, device, city, count(*)
from requests
group by cube (os, device, city);
-- Equivalent to:
select os, device, city, count(*)
from requests
group by grouping sets ((os, device, city),(os, device),(os, city),(device,city),(os),(device),(city),());

This query produces eight levels of aggregation:

  • Count by OS, device, and city (most granular)

  • Count by OS and device

  • Count by OS and city

  • Count by device and city

  • Count by OS

  • Count by device

  • Count by city

  • Grand total

Result (27 rows):

+------------+------------+------------+------------+
| os         | device     | city       | _c3        |
+------------+------------+------------+------------+
| NULL       | NULL       | NULL       | 7          |
| NULL       | NULL       | Beijing    | 4          |
| NULL       | NULL       | Shijiazhuang | 3        |
| NULL       | PC         | NULL       | 4          |
| NULL       | PC         | Beijing    | 3          |
| NULL       | PC         | Shijiazhuang | 1        |
| NULL       | Phone      | NULL       | 3          |
| NULL       | Phone      | Beijing    | 1          |
| NULL       | Phone      | Shijiazhuang | 2        |
| ios        | NULL       | NULL       | 1          |
| ios        | NULL       | Shijiazhuang | 1        |
| ios        | Phone      | NULL       | 1          |
| ios        | Phone      | Shijiazhuang | 1        |
| linux      | NULL       | NULL       | 2          |
| linux      | NULL       | Beijing    | 2          |
| linux      | PC         | NULL       | 1          |
| linux      | PC         | Beijing    | 1          |
| linux      | Phone      | NULL       | 1          |
| linux      | Phone      | Beijing    | 1          |
| windows    | NULL       | NULL       | 4          |
| windows    | NULL       | Beijing    | 2          |
| windows    | NULL       | Shijiazhuang | 2        |
| windows    | PC         | NULL       | 3          |
| windows    | PC         | Beijing    | 2          |
| windows    | PC         | Shijiazhuang | 1        |
| windows    | Phone      | NULL       | 1          |
| windows    | Phone      | Shijiazhuang | 1        |
+------------+------------+------------+------------+

Example 2: All combinations of two column groups

CUBE((os, device), (device, city)) treats each parenthesized group as a single unit and generates all 2^2 = 4 grouping sets.

select os, device, city, count(*)
from requests
group by cube ((os, device), (device, city));
-- Equivalent to:
select os, device, city, count(*)
from requests
group by grouping sets ((os, device, city),(os, device),(device,city),());

This query produces four levels of aggregation:

  • Count by OS, device, and city

  • Count by OS and device

  • Count by device and city

  • Grand total

Result (16 rows):

+------------+------------+------------+------------+
| os         | device     | city       | _c3        |
+------------+------------+------------+------------+
| NULL       | NULL       | NULL       | 7          |
| NULL       | PC         | Beijing    | 3          |
| NULL       | PC         | Shijiazhuang | 1        |
| NULL       | Phone      | Beijing    | 1          |
| NULL       | Phone      | Shijiazhuang | 2        |
| ios        | Phone      | NULL       | 1          |
| ios        | Phone      | Shijiazhuang | 1        |
| linux      | PC         | NULL       | 1          |
| linux      | PC         | Beijing    | 1          |
| linux      | Phone      | NULL       | 1          |
| linux      | Phone      | Beijing    | 1          |
| windows    | PC         | NULL       | 3          |
| windows    | PC         | Beijing    | 2          |
| windows    | PC         | Shijiazhuang | 1        |
| windows    | Phone      | NULL       | 1          |
| windows    | Phone      | Shijiazhuang | 1        |
+------------+------------+------------+------------+

Use ROLLUP

Example 3: Hierarchical aggregation over three columns

ROLLUP(os, device, city) generates 4 grouping sets: the full combination, two intermediate levels, and the grand total.

select os, device, city, count(*)
from requests
group by rollup (os, device, city);
-- Equivalent to:
select os, device, city, count(*)
from requests
group by grouping sets ((os, device, city),(os, device),(os),());

This query produces four levels of aggregation:

  • Count by OS, device, and city (most granular)

  • Count by OS and device

  • Count by OS

  • Grand total

Result (15 rows):

+------------+------------+------------+------------+
| os         | device     | city       | _c3        |
+------------+------------+------------+------------+
| NULL       | NULL       | NULL       | 7          |
| ios        | NULL       | NULL       | 1          |
| ios        | Phone      | NULL       | 1          |
| ios        | Phone      | Shijiazhuang | 1        |
| linux      | NULL       | NULL       | 2          |
| linux      | PC         | NULL       | 1          |
| linux      | PC         | Beijing    | 1          |
| linux      | Phone      | NULL       | 1          |
| linux      | Phone      | Beijing    | 1          |
| windows    | NULL       | NULL       | 4          |
| windows    | PC         | NULL       | 3          |
| windows    | PC         | Beijing    | 2          |
| windows    | PC         | Shijiazhuang | 1        |
| windows    | Phone      | NULL       | 1          |
| windows    | Phone      | Shijiazhuang | 1        |
+------------+------------+------------+------------+

Example 4: Hierarchical aggregation with a composite group

ROLLUP(os, (os,device), city) treats (os, device) as one unit in the hierarchy.

select os, device, city, count(*)
from requests
group by rollup (os, (os,device), city);
-- Equivalent to:
select os, device, city, count(*)
from requests
group by grouping sets ((os, device, city),(os, device),(os),());

This query produces the same four levels as Example 3, with (os, device) treated as a single composite element in the rollup hierarchy.

Result (15 rows):

+------------+------------+------------+------------+
| os         | device     | city       | _c3        |
+------------+------------+------------+------------+
| NULL       | NULL       | NULL       | 7          |
| ios        | NULL       | NULL       | 1          |
| ios        | Phone      | NULL       | 1          |
| ios        | Phone      | Shijiazhuang | 1        |
| linux      | NULL       | NULL       | 2          |
| linux      | PC         | NULL       | 1          |
| linux      | PC         | Beijing    | 1          |
| linux      | Phone      | NULL       | 1          |
| linux      | Phone      | Beijing    | 1          |
| windows    | NULL       | NULL       | 4          |
| windows    | PC         | NULL       | 3          |
| windows    | PC         | Beijing    | 2          |
| windows    | PC         | Shijiazhuang | 1        |
| windows    | Phone      | NULL       | 1          |
| windows    | Phone      | Shijiazhuang | 1        |
+------------+------------+------------+------------+

Example 5: Combining GROUP BY, CUBE, and GROUPING SETS

select os, device, city, count(*)
from requests
group by os, cube(os,device), grouping sets(city);
-- Equivalent to:
select os, device, city, count(*)
from requests
group by grouping sets((os,device,city),(os,city),(os,device,city));

Result (10 rows):

+------------+------------+------------+------------+
| os         | device     | city       | _c3        |
+------------+------------+------------+------------+
| ios        | NULL       | Shijiazhuang | 1        |
| ios        | Phone      | Shijiazhuang | 1        |
| linux      | NULL       | Beijing    | 2          |
| linux      | PC         | Beijing    | 1          |
| linux      | Phone      | Beijing    | 1          |
| windows    | NULL       | Beijing    | 2          |
| windows    | NULL       | Shijiazhuang | 2        |
| windows    | PC         | Beijing    | 2          |
| windows    | PC         | Shijiazhuang | 1        |
| windows    | Phone      | Shijiazhuang | 1        |
+------------+------------+------------+------------+

Use GROUPING and GROUPING_ID

The following example uses GROUPING and GROUPING_ID to identify which columns participated in each grouping set.

select a, b, c, count(*),
grouping(a) ga, grouping(b) gb, grouping(c) gc, grouping_id(a,b,c) groupingid
from values (1,2,3) as t(a,b,c)
group by cube(a,b,c);

Result:

+------------+------------+------------+------------+------------+------------+------------+------------+
| a          | b          | c          | _c3        | ga         | gb         | gc         | groupingid |
+------------+------------+------------+------------+------------+------------+------------+------------+
| NULL       | NULL       | NULL       | 1          | 1          | 1          | 1          | 7          |
| NULL       | NULL       | 3          | 1          | 1          | 1          | 0          | 6          |
| NULL       | 2          | NULL       | 1          | 1          | 0          | 1          | 5          |
| NULL       | 2          | 3          | 1          | 1          | 0          | 0          | 4          |
| 1          | NULL       | NULL       | 1          | 0          | 1          | 1          | 3          |
| 1          | NULL       | 3          | 1          | 0          | 1          | 0          | 2          |
| 1          | 2          | NULL       | 1          | 0          | 0          | 1          | 1          |
| 1          | 2          | 3          | 1          | 0          | 0          | 0          | 0          |
+------------+------------+------------+------------+------------+------------+------------+------------+

ga, gb, and gc show which columns were used for grouping in each row:

  • 0: the row is grouped by this column (the value is real data, not a placeholder)

  • 1: the row is not grouped by this column (the value is a null placeholder)

groupingid combines ga, gb, and gc into a single integer using a bitmap. For example, groupingid=7 means ga=1, gb=1, gc=1—none of the three columns participate in this grouping set (the grand total row).

To replace null placeholders with a readable label, use GROUPING in an IF expression:

select
  if(grouping(os) == 0, os, 'ALL') as os,
  if(grouping(device) == 0, device, 'ALL') as device,
  if(grouping(city) == 0, city, 'ALL') as city,
  count(*) as count
from requests
group by os, device, city grouping sets((os, device), (city), ());

Result:

+------------+------------+------------+------------+
| os         | device     | city       | count      |
+------------+------------+------------+------------+
| ALL        | ALL        | ALL        | 7          |
| ALL        | ALL        | Beijing    | 4          |
| ALL        | ALL        | Shijiazhuang | 3        |
| ios        | Phone      | ALL        | 1          |
| linux      | PC         | ALL        | 1          |
| linux      | Phone      | ALL        | 1          |
| windows    | PC         | ALL        | 3          |
| windows    | Phone      | ALL        | 1          |
+------------+------------+------------+------------+

Use GROUPING__ID

GROUPING__ID is a Hive-compatible function that requires no parameters. It is equivalent to GROUPING_ID applied to the full GROUP BY column list, in the same order.

set odps.sql.hive.compatible=true;
select
a, b, c, count(*), grouping__id
from values (1,2,3) as t(a,b,c)
group by a, b, c grouping sets ((a,b,c), (a));
-- Equivalent to:
select
a, b, c, count(*), grouping_id(a,b,c)
from values (1,2,3) as t(a,b,c)
group by a, b, c grouping sets ((a,b,c), (a));

Result:

+------------+------------+------------+------------+------------+
| a          | b          | c          | _c3        | _c4        |
+------------+------------+------------+------------+------------+
| 1          | NULL       | NULL       | 1          | 3          |
| 1          | 2          | 3          | 1          | 0          |
+------------+------------+------------+------------+------------+
We recommend that you use GROUPING__ID with Hive 2.3.0 or later in MaxCompute. We recommend that you do not use this function with Hive versions earlier than 2.3.0.