The GROUP BY clauses are often used with aggregate functions to categorize rows in query results based on one or more column values. This creates a summary row for each unique combination of values, improving data analysis. Additionally, GROUP BY can be combined with ROLLUP, CUBE, and GROUPING SETS to enhance grouping capabilities and provide more analysis options.
Basic syntax
GROUP BY
The GROUP BY clause groups the analysis results.
SELECT
key1,
...
aggregate_function
GROUP BY
key,...Parameters:
key1: The column you want to group by its value. It supports grouping by log field names or the result columns of aggregate functions. TheGROUP BYclause supports single or multiple columns.aggregate_function: The aggregate function applied to each group, such ascount,min,max,avg, andsum.
GROUP BY ROLLUP
The GROUP BY ROLLUP clause creates a summary for each group and a total for all groups. For example, the result set of GROUP BY ROLLUP (a, b) includes (a, b), (a, null), and (null, null).
SELECT
key1,
...
aggregate_function
GROUP BY ROLLUP (key1,...)Parameters:
key1: The column you want to group by its value. It supports grouping by log field names or the result columns of aggregate functions. TheGROUP BYclause supports single or multiple columns.aggregate_function: The aggregate function applied to each group, such ascount,min,max,avg, andsum.
GROUP BY CUBE
The GROUP BY CUBE clause groups by all possible combinations of columns. For example, the result set of GROUP BY CUBE (a, b) includes (a, b), (null, b), (a, null), and (null, null).
SELECT
key1,
...
aggregate_function
GROUP BY CUBE (key1,...)Parameters:
key1: The column you want to group by its value. It supports grouping by log field names or the result columns of aggregate functions. TheGROUP BYclause supports single or multiple columns.aggregate_function: The aggregate function applied to each group, such ascount,min,max,avg, andsum.
GROUP BY GROUPING SETS
The GROUP BY GROUPING SETS clause groups columns sequentially. For example, the result set of GROUP BY GROUPING SETS (a, b) includes (a, null) and (null, b).
SELECT
key1,
...
aggregate_function
GROUP BY GROUPING SETS (key1,...)Parameters:
key1: The column you want to group by its value. It supports grouping by log field names or the result columns of aggregate functions. TheGROUP BYclause supports single or multiple columns.aggregate_function: The aggregate function applied to each group, such ascount,min,max,avg, andsum.
Examples
In SQL statements, when using the GROUP BY clause, you can only select the following in the SELECT statement:
The columns specified in the
GROUP BYclause.The results of aggregate calculations on any column, such as
COUNT(),SUM(), etc.
Selecting columns not included in the GROUP BY clause directly is not permitted because their values may not be unique after grouping, leading to ambiguity. For example, the following statement is not valid:
* | SELECT status, request_time, COUNT(*) AS PV GROUP BY statusThis is because request_time is not part of the GROUP BY clause and is not aggregated.
The correct format is:
* | SELECT status, arbitrary(request_time), COUNT(*) AS PV GROUP BY statusarbitrary(request_time) is a method to aggregate request_time, indicating that a random request_time value is selected from each group. This approach complies with SQL syntax rules and meets the query requirements.
Example 1: Count page views (PVs) by status code
Query statement
* | SELECT status, count(*) AS PV GROUP BY statusQuery results

Example 2: Count hourly PVs for website access
Query statement
The
__time__field is a reserved field in Simple Log Service, representing the time column.timeis an alias fordate_trunc('hour', __time__). For more information, see date_trunc function.* | SELECT count(*) AS PV, date_trunc('hour', __time__) AS time GROUP BY time ORDER BY time LIMIT 1000Query results

Example 3: Count PVs in 5-minute intervals
Query statement
Since the
date_truncfunction counts at fixed intervals, you can use the modulo operation for custom time-based statistical analysis. For example,%300represents a modulo operation for grouping data into 5-minute intervals.* | SELECT count(*) AS PV, __time__-__time__ % 300 AS time GROUP BY time LIMIT 1000Query results

Example 4: Group by request method and status, calculating access data for each method and count for each status
Query statement
* | SELECT request_method, status, count(*) AS PV GROUP BY GROUPING SETS (request_method, status)Query results

Example 5: Group by request method and status using the sets
(null, null),(request_method, null),(null, status), and(request_method, status), and calculate the access count for each groupQuery statement
* | SELECT request_method, status, count(*) AS PV GROUP BY CUBE (request_method, status)Query results

Example 6: Group by request method and status using the sets
(request_method, status),(request_method, null), and(null, null), and calculate the access count for each groupQuery statement
* | SELECT request_method, status, count(*) AS PV GROUP BY ROLLUP (request_method, status)Query results
