GROUP BY supports multiple columns and the use of a SELECT column alias to indicate the corresponding key.
Example:
method:PostLogstoreLogs |select avg(latency),projectName,date_trunc('hour',__time__) as hour group by projectName,hour
Alias hour indicates the third SELECT column date_trunc('hour',__time__)
. This is helpful for complex queries.
GROUP BY supports GROUPING SETS, CUBE, and ROLLUP.
Examples:
method:PostLogstoreLogs |select avg(latency) group by cube(projectName,logstore)
method:PostLogstoreLogs |select avg(latency) group by GROUPING SETS ( ( projectName,logstore), (projectName,method))
method:PostLogstoreLogs |select avg(latency) group by rollup(projectName,logstore)
Examples
Perform GROUP BY based on time
Each log has a built-in time column __time__
. When the statistical analysis feature of any column is enabled, the statistics of
the time column are included.
The date_trunc
function can truncate the time column to minute, hour, day, month, and year. date_trunc
accepts a truncation unit and a column of UNIX
time or timestamp type such as __time__
.
- PV statistics per hour and per minute
* | SELECT count(1) as pv , date_trunc('hour',__time__) as hour group by hour order by hour limit 100 * | SELECT count(1) as pv , date_trunc('minute',__time__) as minute group by minute order by minute limit 100
Note limit 100 indicates that up to 100 rows can be retrieved. If the LIMIT clause is not added, up to 10 rows of data are retrieved by default. - date_trunc is applicable only to statistics at a fixed time interval. For statistics
based on flexible time dimensions, such as every 5 minutes, perform GROUP
BY in mod.
In the preceding formula,* | SELECT count(1) as pv, __time__ - __time__% 300 as minute5 group by minute5 limit 100
%300
indicates that the time is truncated in mod every 5 minutes.
Retrieve non-aggregation columns in GROUP BY
In standard SQL statements, if the GROUP BY syntax is used during the SELECT operation, the system only selects the raw data of the SELECT GROUP BY column, or performs aggregation on any columns. Retrieving data from non-GROUP BY columns is not allowed.
For example, the following syntax is invalid. Because b is a non-GROUP BY column, the system cannot determine which row of b to return during the GROUP BY operation based on a.
*|select a, b , count(c) group by a
Instead, you can use the arbitrary
function to return b.
*|select a, arbitrary(b), count(c) group by a