GROUP BY clause - Time Series Database - Alibaba Cloud Documentation Center

The GROUP BY clause can be used to group points based on the values of one or more tags or specified time intervals.

GROUP BY tags

You can use GROUP BY <tag> to group points based on the values of specified tags.

Syntax

SELECT_clause FROM_clause [WHERE_clause] GROUP BY [*|<tag_key>[,<tag_key]]

Description

You can use GROUP BY * to group points based on the values of all tags.

You can use GROUP BY <tag_key> to group points based on the values of a specified tag.

You can use GROUP BY <tag_key>,<tag_key> format to group points based on the values of multiple tags. The order in which the tag keys are placed has no effect on the results.

If a statement contains a WHERE clause, you must place the GROUP BY clause after the WHERE clause.

You can use regular expressions in GROUP BY clauses.

Examples

Group points based on the values of one tag

 SELECT MEAN("water_level") FROM "h2o_feet" GROUP BY "location"

name: h2o_feet
tags: location=coyote_creek
time                     mean
--------
1970-01-01T00:00:00Z5.359342451341401


name: h2o_feet
tags: location=santa_monica
time                     mean
--------
1970-01-01T00:00:00Z3.530863470081006

In the preceding example, the statement is used to query the average value of each group of values in the water_level column. The values in the water_level column are grouped based on the values of the location tag that is included in the h2o_feet measurement. The MEAN() function is used to calculate the average value of each group of water_level values. TSDB for InfluxDB® returns two series. Each series corresponds to a location.

Note

In TSDB for InfluxDB®, epoch 0 indicates 1970-01-01T00:00:00Z and is used as an empty timestamp. If no timestamps are specified, TSDB for InfluxDB® returns epoch 0 as the timestamp. For example, if you use an aggregate function and you do not specify a bounded time range, TSDB for InfluxDB® returns epoch 0.

Group points based on the values of multiple tags

 SELECT MEAN("index") FROM "h2o_quality" GROUP BY location,randtag

name: h2o_quality
tags: location=coyote_creek, randtag=1
time                  mean
--------
1970-01-01T00:00:00Z50.69033760186263

name: h2o_quality
tags: location=coyote_creek, randtag=2
time                   mean
--------
1970-01-01T00:00:00Z49.661867544220485

name: h2o_quality
tags: location=coyote_creek, randtag=3
time                   mean
--------
1970-01-01T00:00:00Z49.360939907550076

name: h2o_quality
tags: location=santa_monica, randtag=1
time                   mean
--------
1970-01-01T00:00:00Z49.132712456344585

name: h2o_quality
tags: location=santa_monica, randtag=2
time                   mean
--------
1970-01-01T00:00:00Z50.2937984496124

name: h2o_quality
tags: location=santa_monica, randtag=3
time                   mean
--------
1970-01-01T00:00:00Z49.99919903884662

In the preceding example, the statement is used to query the average value of each group of values in the index column. The values in the index column are grouped based on the combination of the values of the location tag and the randtag tag that are included in the h2o_quality measurement. The MEAN() function is used to calculate the average value of each group of index values. In this example, the location tag has two values and the randtag tag has three values. These values are combined to create six groups. Therefore, the values in the index column are combined into six groups. In the GROUP BY clause, you must separate multiple tag keys with commas (,).

Group points based on the values of all tags

 SELECT MEAN("index") FROM "h2o_quality" GROUP BY *

name: h2o_quality
tags: location=coyote_creek, randtag=1
time                  mean
--------
1970-01-01T00:00:00Z50.69033760186263

name: h2o_quality
tags: location=coyote_creek, randtag=2
time                   mean
--------
1970-01-01T00:00:00Z49.661867544220485

name: h2o_quality
tags: location=coyote_creek, randtag=3
time                   mean
--------
1970-01-01T00:00:00Z49.360939907550076

name: h2o_quality
tags: location=santa_monica, randtag=1
time                   mean
--------
1970-01-01T00:00:00Z49.132712456344585

name: h2o_quality
tags: location=santa_monica, randtag=2
time                   mean
--------
1970-01-01T00:00:00Z50.2937984496124

name: h2o_quality
tags: location=santa_monica, randtag=3
time                   mean
--------
1970-01-01T00:00:00Z49.99919903884662

Note

The results returned are the same as the results that are returned for the statement: SELECT MEAN("index") FROM "h2o_quality" GROUP BY location,randtag. This is because the h2o_quality measurement contains only the location and randtag tags.

GROUP BY time intervals

You can use GROUP BY time() to group points based on specified time intervals.

Basic syntax

SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(<time_interval>),[tag_key][fill(<fill_option>)]

Description

In a statement that contains a GROUP BY time() clause, you must include a function that is supported by TSDB for InfluxDB® in the SELECT clause and specify a time range in the WHERE clause. You must place the GROUP BY clause after the WHERE clause.

time(time_interval): The time_interval parameter specifies a period of time based on which TSDB for InfluxDB® groups the points that are queried. For example, if the value of the time_interval parameter is set to 5m, TSDB for InfluxDB® queries points and groups the queried points based on intervals of 5 minutes. The timestamps of the queried points must be in the time range that is specified in the WHERE clause
fill(<fill_option>): This function is optional. The fill() function can be used to specify a return value for time intervals within which no data is found.

Scope

When TSDB for InfluxDB® executes a basic GROUP BY time() clause, TSDB for InfluxDB® determines the raw data that is generated within each time interval and the timestamp of each result based on the value of the time_interval parameter and the time boundaries that are preset by the system.

Examples

In the following examples, the following data is queried:

 SELECT "water_level","location" FROM "h2o_feet" WHERE time >='2015-08-18T00:00:00Z' AND time <='2015-08-18T00:30:00Z'

name: h2o_feet
--------------
time                   water_level   location
2015-08-18T00:00:00Z8.12          coyote_creek
2015-08-18T00:00:00Z2.064         santa_monica
2015-08-18T00:06:00Z8.005         coyote_creek
2015-08-18T00:06:00Z2.116         santa_monica
2015-08-18T00:12:00Z7.887         coyote_creek
2015-08-18T00:12:00Z2.028         santa_monica
2015-08-18T00:18:00Z7.762         coyote_creek
2015-08-18T00:18:00Z2.126         santa_monica
2015-08-18T00:24:00Z7.635         coyote_creek
2015-08-18T00:24:00Z2.041         santa_monica
2015-08-18T00:30:00Z7.5           coyote_creek
2015-08-18T00:30:00Z2.051         santa_monica

Group points based on 12-minute intervals

 SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-08-18T00:00:00Z' AND time <='2015-08-18T00:30:00Z' GROUP BY time(12m)

name: h2o_feet
--------------
time                   count
2015-08-18T00:00:00Z2
2015-08-18T00:12:00Z2
2015-08-18T00:24:00Z2

In the preceding example, the COUNT() function is used to calculate the number of the points that contain the water_level field in the h2o_feet measurement. The condition is location = coyote_creek, and the points that are queried are grouped based on intervals of 12 minutes.

The count value that corresponds to each timestamp indicates the number of points that are generated within a 12-minute interval. The count value that corresponds to the first timestamp indicates the number of points that are generated from 2015-08-18T00:00:00Z to 2015-08-18T00:12:00Z, excluding 2015-08-18T00:12:00Z. The count value that corresponds to the second timestamp indicates the number of points that are generated from 2015-08-18T00:12:00Z to 2015-08-18T00:24:00, excluding 2015-08-18T00:24:00.

Group points based on the values of a tag and 12-minute intervals

 SELECT COUNT("water_level") FROM "h2o_feet" WHERE time >='2015-08-18T00:00:00Z' AND time <='2015-08-18T00:30:00Z' GROUP BY time(12m),"location"

name: h2o_feet
tags: location=coyote_creek
time                   count
---------
2015-08-18T00:00:00Z2
2015-08-18T00:12:00Z2
2015-08-18T00:24:00Z2

name: h2o_feet
tags: location=santa_monica
time                   count
---------
2015-08-18T00:00:00Z2
2015-08-18T00:12:00Z2
2015-08-18T00:24:00Z2

In the preceding example, the COUNT() function is used to calculate the number of the points that contain the water_level field, and the points are grouped based on the values of the location tag and 12-minute intervals. In the GROUP BY clause, you must separate the time() function and the tag key with a comma (,).

TSDB for InfluxDB® returns two series. Each series corresponds to a location. The count value that corresponds to each timestamp indicates the number of points that are generated within a 12-minute interval. The count value that corresponds to the first timestamp indicates the number of points that are generated from 2015-08-18T00:00:00Z to 2015-08-18T00:12:00Z, excluding 2015-08-18T00:12:00Z. The count value that corresponds to the second timestamp indicates the number of points that are generated from 2015-08-18T00:12:00Z to 2015-08-18T00:24:00, excluding 2015-08-18T00:24:00.

FAQ about the basic syntax

How does TSDB for InfluxDB® determine the timestamps and values that are to be returned?

When TSDB for InfluxDB® executes a statement that is defined based on the basic syntax, TSDB for InfluxDB® determines the raw data that is generated within each time interval and the timestamp of each result based on the interval value that is specified in the GROUP BY time() clause and the time boundaries that are preset by the system.

Examples

Raw data:

 SELECT "water_level" FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-08-18T00:00:00Z' AND time <='2015-08-18T00:18:00Z'
name: h2o_feet
--------------
time                   water_level
2015-08-18T00:00:00Z8.12
2015-08-18T00:06:00Z8.005
2015-08-18T00:12:00Z7.887
2015-08-18T00:18:00Z7.762

Statement and results:

In the following example, the data that is generated in the specified time range of 12 minutes is queried and grouped based on 12-minute intervals. Two results are returned.

SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-08-18T00:06:00Z' AND time <'2015-08-18T00:18:00Z' GROUP BY time(12m)

name: h2o_feet
time                   count
---------
2015-08-18T00:00:00Z1<-----Note that this timestamp occurs before the start of the query's time range
2015-08-18T00:12:00Z   1

Note

TSDB for InfluxDB® uses the rounded values of the time boundaries that are preset by the system to calculate the number of time intervals based on the value that is specified in the GROUP BY clause. TSDB for InfluxDB® does not calculate the number of time intervals based on the time range that is specified in the WHERE clause. The timestamps of the data that is queried must be within the time range that is specified in the WHERE clause.

The following table describes the time boundaries that are preset by the system, the time intervals that are specified by using the GROUP BY time() clause, the points that are generated within each time interval, and the timestamp that is returned for each time interval.

Time interval number	Time boundary that is preset by the system	Time interval that is specified by using the `GROUP BY time()` clause	Points	Returned timestamp
1	`time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:12:00Z`	`time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:12:00Z`	`8.005`	`2015-08-18T00:00:00Z`
2	`time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:24:00Z`	`time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:18:00Z`	`7.887`	`2015-08-18T00:12:00Z`

The first 12-minute time boundary starts at 2015-08-18T00:00:00Z and ends before 2015-08-18T00:12:00Z. One point whose value is 8.005 falls within both the first time boundary and the first time interval that is specified by the GROUP BY time() clause. Data that is generated before the specified time range is not queried although a timestamp that occurs before the start time of the specified time range is returned.

The second 12-minute time boundary starts at 2015-08-18T00:12:00Z and ends before 2015-08-18T00:24:00Z. One point whose value is 7.887 falls within both the second time boundary and the second time interval that is specified by the GROUP BY time() clause.

If you use the advanced syntax for the GROUP BY time() clause, you can change the start time of the time boundaries that are preset by TSDB for InfluxDB®. If you use a statement that is defined based on the advanced syntax and set the value of the offset_interval parameter to 6m, TSDB for InfluxDB® returns the following result:

name: h2o_feet
time                   count
---------
2015-08-18T00:06:00Z2

Advanced syntax for GROUP BY time()

SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(<time_interval>,<offset_interval>),[tag_key][fill(<fill_option>)]

Description

If you use the advanced syntax for the GROUP BY time() clause, you must include a function that is supported by TSDB for InfluxDB® in the SELECT clause and specify a time range in the WHERE clause. You must place the GROUP BY clause after the WHERE clause.

time(time_interval,offset_interval): For more information about the time_interval parameter, see the description of the basic syntax for the GROUP BY time() clause.

The offset_interval parameter specifies a value based on which the time boundaries that are preset by TSDB for InfluxDB® are offset forward or backward. The value of the offset_interval parameter can be a positive value or a negative value.

fill(<fill_option>): This function is optional. The fill() function can be used to specify a return value for time intervals within which no data is found.

Scope

If you use an advanced GROUP BY time() clause, TSDB for InfluxDB® determines the raw data that is generated within each time interval and the timestamp of each result based on the value of the time_interval and offset_interval parameters and the time boundaries that are preset by the system.

Examples

In the following examples, the following data is queried:

SELECT "water_level" FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-08-18T00:00:00Z' AND time <='2015-08-18T00:54:00Z'

name: h2o_feet
--------------
time                   water_level
2015-08-18T00:00:00Z8.12
2015-08-18T00:06:00Z8.005
2015-08-18T00:12:00Z7.887
2015-08-18T00:18:00Z7.762
2015-08-18T00:24:00Z7.635
2015-08-18T00:30:00Z7.5
2015-08-18T00:36:00Z7.372
2015-08-18T00:42:00Z7.234
2015-08-18T00:48:00Z7.11
2015-08-18T00:54:00Z6.982

Group points based on 18-minute intervals and offset time boundaries forward

SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-08-18T00:06:00Z' AND time <='2015-08-18T00:54:00Z' GROUP BY time(18m,6m)

name: h2o_feet
time                   mean
--------
2015-08-18T00:06:00Z7.884666666666667
2015-08-18T00:24:00Z7.502333333333333
2015-08-18T00:42:00Z7.108666666666667

In the preceding example, the statement is used to query the average value of each group of values in the water_level column. The values in the water_level column are grouped based on 18-minute intervals, and the time boundaries that are preset by TSDB for InfluxDB® are offset by 6 minutes forward. The MEAN() function is used to calculate the average value of each group of water_level values.

If you do not configure the offset_interval parameter, the system uses the time boundaries and the timestamps that are preset by TSDB for InfluxDB®. If you do not configure the offset_interval parameter, the following results are returned:

SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-08-18T00:06:00Z' AND time <='2015-08-18T00:54:00Z' GROUP BY time(18m)

name: h2o_feet
time                   mean
--------
2015-08-18T00:00:00Z7.946
2015-08-18T00:18:00Z7.6323333333333325
2015-08-18T00:36:00Z7.238666666666667
2015-08-18T00:54:00Z6.982

The following table describes the time boundaries that are preset by the system, the time intervals that are specified by using the GROUP BY time() clause, the points that are generated within each time interval, and the timestamp that is returned for each time interval when the offset_interval parameter is not configured.

Time interval number	Time boundary that is preset by the system	Time interval that is specified by using the `GROUP BY time()` clause	Points	Returned timestamp
1	`time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:18:00Z`	`time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:18:00Z`	`8.005` and `7.887`	`2015-08-18T00:00:00Z`
2	`time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:36:00Z`	`time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:36:00Z`	`7.762`, `7.635`, and `7.5`	`2015-08-18T00:18:00Z`
3	`time >= 2015-08-18T00:36:00Z AND time < 2015-08-18T00:54:00Z`	`time >= 2015-08-18T00:36:00Z AND time < 2015-08-18T00:54:00Z`	`7.372`, `7.234`, and `7.11`	`2015-08-18T00:36:00Z`
4	`time >= 2015-08-18T00:54:00Z AND time < 2015-08-18T01:12:00Z`	`time = 2015-08-18T00:54:00Z`	`6.982`	`2015-08-18T00:54:00Z`

The first 18-minute time boundary starts at 2015-08-18T00:00:00Z and ends before 2015-08-18T00:18:00Z. Two points whose values are 8.005 and 7.887 fall within both the first time boundary and the first time interval that is specified by the GROUP BY time() clause. Data that is generated before the specified time range is not queried although a timestamp that occurs before the start time of the specified time range is returned.

The second 18-minute time boundary starts at 2015-08-18T00:18:00Z and ends before 2015-08-18T00:36:00Z. Three points whose values are 7.762, 7.635, and 7.5 fall within both the second time boundary and the second time interval that is specified by the GROUP BY time() clause. In this case, the range that is defined by the time boundary and the range that is defined by the time interval are the same.

The fourth 18-minute time boundary starts at 2015-08-18T00:54:00Z and ends before 2015-08-18T01:12:00Z. One point whose value is 6.982 falls within both the fourth time boundary and the fourth time interval that is specified by the GROUP BY time() clause.

If you configure the offset_interval parameter, the time boundaries and the timestamps that are returned are offset based on the value of the offset_interval parameter.

Time interval number	Time boundary that is preset by the system	Time interval that is specified by using the `GROUP BY time()` clause	Points	Returned timestamp
1	`time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:24:00Z`	`time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:24:00Z`	`8.005`, `7.887`, and `7.762`	`2015-08-18T00:06:00Z`
2	`time >= 2015-08-18T00:24:00Z AND time < 2015-08-18T00:42:00Z`	`time >= 2015-08-18T00:24:00Z AND time < 2015-08-18T00:42:00Z`	`7.635`, `7.5`, and `7.372`	`2015-08-18T00:24:00Z`
3	`time >= 2015-08-18T00:42:00Z AND time < 2015-08-18T01:00:00Z`	`time >= 2015-08-18T00:42:00Z AND time < 2015-08-18T01:00:00Z`	`7.234`, `7.11`, and `6.982`	`2015-08-18T00:42:00Z`
4	`time >= 2015-08-18T01:00:00Z AND time < 2015-08-18T01:18:00Z`	NA	NA	NA

The ranges that are defined by time boundaries are offset by 6 minutes forward based on the value of the offset_interval parameter. This way, each range that is defined by a time boundary and the range that is defined by the time interval are the same. TSDB for InfluxDB® calculates the average value of each group of points that are generated within each time interval. The time intervals are offset by 6 minutes forward. The timestamp that is returned, the start time of the boundary time range, and the start time of the time interval that is specified by the GROUP BY time() clause are consistent.

The fourth time boundary that is defined based on the value of the offset_interval parameter is not included in the specified time range. Therefore, the data that is generated within the fourth time interval is not queried.

Group points based on 18-minute intervals and offset time boundaries backward

SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-08-18T00:06:00Z' AND time <='2015-08-18T00:54:00Z' GROUP BY time(18m,-12m)

name: h2o_feet
time                   mean
--------
2015-08-18T00:06:00Z7.884666666666667
2015-08-18T00:24:00Z7.502333333333333
2015-08-18T00:42:00Z7.108666666666667

In the preceding example, the statement is used to query the average value of each group of values in the water_level column. The values in the water_level column are grouped based on 18-minute intervals, and the time boundaries that are preset by TSDB for InfluxDB® are offset by 12 minutes backward. The MEAN() function is used to calculate the average value of each group of water_level values.

Note

The results of this query are the same as the results of the previous query that groups data based on 18-minute intervals and offset time boundaries forward. In this statement, the value of the offset_interval parameter is set to a negative value, and in the previous statement, the value of the offset_interval parameter is set to a positive value. The query performance of the two statements is the same. When you specify a value for the offset_interval parameter, use a valid and suitable value.

SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-08-18T00:06:00Z' AND time <='2015-08-18T00:54:00Z' GROUP BY time(18m)

name: h2o_feet
time                    mean
--------
2015-08-18T00:00:00Z7.946
2015-08-18T00:18:00Z7.6323333333333325
2015-08-18T00:36:00Z7.238666666666667
2015-08-18T00:54:00Z6.982

Time interval number	Time boundary that is preset by the system	Time interval that is specified by using the `GROUP BY time()` clause	Points	Returned timestamp
1	`time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:18:00Z`	`time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:18:00Z`	`8.005` and `7.887`	`2015-08-18T00:00:00Z`
2	`time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:36:00Z`	`time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:36:00Z`	`7.762`, `7.635`, and `7.5`	`2015-08-18T00:18:00Z`
3	`time >= 2015-08-18T00:36:00Z AND time < 2015-08-18T00:54:00Z`	`time >= 2015-08-18T00:36:00Z AND time < 2015-08-18T00:54:00Z`	`7.372`, `7.234`, and `7.11`	`2015-08-18T00:36:00Z`
4	`time >= 2015-08-18T00:54:00Z AND time < 2015-08-18T01:12:00Z`	`time = 2015-08-18T00:54:00Z`	`6.982`	`2015-08-18T00:54:00Z`

If you configure the offset_interval parameter, the time boundaries and the timestamps that are returned are offset based on the value of the offset_interval parameter.

Time interval number	Time boundary that is preset by the system	Time interval that is specified by using the `GROUP BY time()` clause	Points	Returned timestamp
1	`time >= 2015-08-17T23:48:00Z AND time < 2015-08-18T00:06:00Z`	NA	NA	NA
2	`time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:24:00Z`	`time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:24:00Z`	`8.005`, `7.887`, and `7.762`	`2015-08-18T00:06:00Z`
3	`time >= 2015-08-18T00:24:00Z AND time < 2015-08-18T00:42:00Z`	`time >= 2015-08-18T00:24:00Z AND time < 2015-08-18T00:42:00`	`7.635`, `7.5`, and `7.372`	`2015-08-18T00:24:00Z`
4	`time >= 2015-08-18T00:42:00Z AND time < 2015-08-18T01:00:00Z`	`time >= 2015-08-18T00:42:00Z AND time < 2015-08-18T01:00:00Z`	`7.234`, `7.11`, and `6.982`	`2015-08-18T00:42:00Z`

The ranges that are defined by time boundaries are offset by 12 minutes backward based on the value of the offset_interval parameter. This way, each range that is defined by a time boundary and the range that is defined by the time interval are the same. TSDB for InfluxDB® calculates the average value of each group of points that are generated within each time interval. The time intervals are offset by 6 minutes forward. The timestamp that is returned, the start time of the boundary time range, and the start time of the time interval that is specified by the GROUP BY time() clause are consistent.

The first time boundary that is defined based on the value of the offset_interval parameter is not in the specified time range. Therefore, data that is generated within the first time interval is not queried.

Group points based on 12-minute intervals and offset time boundaries forward

In the following example, the statement is provided based on the statement that is used in the "FAQ about the basic syntax" part.

SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-08-18T00:06:00Z' AND time <'2015-08-18T00:18:00Z' GROUP BY time(12m,6m)

name: h2o_feet
time                   count
---------
2015-08-18T00:06:00Z2

In the preceding example, the COUNT() function is used to calculate the number of the points that contain the water_level field. The points are grouped based on 12-minute intervals, and the time boundaries are offset by 6 minutes forward.

SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-08-18T00:06:00Z' AND time <'2015-08-18T00:18:00Z' GROUP BY time(12m)

name: h2o_feet
time                   count
---------
2015-08-18T00:00:00Z1
2015-08-18T00:12:00Z1

Time interval number	Time boundary that is preset by the system	Time interval that is specified by using the `GROUP BY time()` clause	Points	Returned timestamp
1	`time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:12:00Z`	`time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:12:00Z`	`8.005`	`2015-08-18T00:00:00Z`
2	`time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:24:00Z`	`time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:18:00Z`	`7.887`	`2015-08-18T00:12:00Z`

If you configure the offset_interval parameter, the time boundaries and the timestamps that are returned are offset based on the value of the offset_interval parameter.

Time interval number	Time boundary that is preset by the system	Time interval that is specified by using the `GROUP BY time()` clause	Points	Returned timestamp
1	`time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:18:00Z`	`time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:18:00Z`	`8.005` and `7.887`	`2015-08-18T00:06:00Z`
2	`time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:30:00Z`	NA	NA	NA

The ranges that are defined by time boundaries are offset by 6 minutes forward based on the value of the offset_interval parameter. This way, each range that is defined by a time boundary and the range that is defined by the time interval are the same. TSDB for InfluxDB® counts the number of each group of points that are generated within each time interval. The time intervals are offset by 6 minutes forward. The timestamp that is returned, the start time of the boundary time range, and the start time of the time interval that is specified by the GROUP BY time() clause are consistent.

The second time boundary that is defined based on the value of the offset_interval parameter is not included in the specified time range. Therefore, data that is generated within the second time interval is not queried.

GROUP BY time() and fill()

You can use the fill() function to specify a return value for time intervals within which no data is found.

Syntax

SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(time_interval,[<offset_interval])[,tag_key][fill(<fill_option>)]

Description

If no data is found within a time interval that is defined by the GROUP BY time() clause, TSDB for InfluxDB® returns null by default. You can use the fill() function to change the default value to a specified value. If you specify multiple objects such as tags and the time() function in the GROUP BY clause, you must place the fill() function after the GROUP BY clause.

fill_option

A numeric value: If no data is found within a time interval, the specified numeric value is returned.
linear: If no data is found within a time interval, the result of linear interpolation is returned.
none: If no data is found within a time interval, no timestamp or value is returned.
null: If no data is found within a time interval, a result that contains a timestamp and the null value is returned. In this case, the result is the same as the result that is returned when fill() is not used.
previous: If no data is found within a time interval, the value for the previous time interval is returned as the result.

Example 1: fill(100) is not used

SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-09-18T16:00:00Z' AND time <='2015-09-18T16:42:00Z' GROUP BY time(12m)

name: h2o_feet
--------------
time                   max
2015-09-18T16:00:00Z3.599
2015-09-18T16:12:00Z3.402
2015-09-18T16:24:00Z3.235
2015-09-18T16:36:00Z

Example 2: fill(100) is used

SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-09-18T16:00:00Z' AND time <='2015-09-18T16:42:00Z' GROUP BY time(12m) fill(100)

name: h2o_feet
--------------
ime                   max
2015-09-18T16:00:00Z3.599
2015-09-18T16:12:00Z3.402
2015-09-18T16:24:00Z3.235
2015-09-18T16:36:00Z100

fill(100) is used to specify 100 as the return value when no data is found within a time interval. In the example, no data is found within the fourth time interval. The return value is 100 because fill(100) is used in the statement.

Example 3: fill(linear) is not used

SELECT MEAN("tadpoles") FROM "pond" WHERE time >='2016-11-11T21:00:00Z' AND time <='2016-11-11T22:06:00Z' GROUP BY time(12m)

name: pond
time                   mean
--------
2016-11-11T21:00:00Z1
2016-11-11T21:12:00Z
2016-11-11T21:24:00Z3
2016-11-11T21:36:00Z
2016-11-11T21:48:00Z
2016-11-11T22:00:00Z6

Example 4: fill(linear) is used

SELECT MEAN("tadpoles") FROM "pond" WHERE time >='2016-11-11T21:00:00Z' AND time <='2016-11-11T22:06:00Z' GROUP BY time(12m) fill(linear)

name: pond
time                   mean
--------
2016-11-11T21:00:00Z1
2016-11-11T21:12:00Z2
2016-11-11T21:24:00Z3
2016-11-11T21:36:00Z4
2016-11-11T21:48:00Z5
2016-11-11T22:00:00Z6

fill(linear) is used to specify the result of linear interpolation as the return value when no data is found within a time interval.

Note

In Example 2, data is not stored in the NOAA_water_database database. To use fill(linear), you must create a dataset that contains a small size of regular data.

Example 5: fill(none) is not used

SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-09-18T16:00:00Z' AND time <='2015-09-18T16:42:00Z' GROUP BY time(12m)

name: h2o_feet
--------------
time                   max
2015-09-18T16:00:00Z3.599
2015-09-18T16:12:00Z3.402
2015-09-18T16:24:00Z3.235
2015-09-18T16:36:00Z

Example 6: fill(none) is used

SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-09-18T16:00:00Z' AND time <='2015-09-18T16:42:00Z' GROUP BY time(12m) fill(none)

name: h2o_feet
--------------
time                   max
2015-09-18T16:00:00Z3.599
2015-09-18T16:12:00Z3.402
2015-09-18T16:24:00Z3.235

When fill(none) is used, no timestamps or values are returned if no data is found within a time interval.

Example 7: fill(null) is not used

SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-09-18T16:00:00Z' AND time <='2015-09-18T16:42:00Z' GROUP BY time(12m)

name: h2o_feet
--------------
time                   max
2015-09-18T16:00:00Z3.599
2015-09-18T16:12:00Z3.402
2015-09-18T16:24:00Z3.235
2015-09-18T16:36:00Z

Example 8: fill(null) is used

SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-09-18T16:00:00Z' AND time <='2015-09-18T16:42:00Z' GROUP BY time(12m) fill(null)

name: h2o_feet
--------------
time                   max
2015-09-18T16:00:00Z3.599
2015-09-18T16:12:00Z3.402
2015-09-18T16:24:00Z3.235
2015-09-18T16:36:00Z

fill(null) is used to specify null as the return value when no data is found within a time interval. The results that are returned when fill(null) is used are the same as the results that are returned when fill(null) is not used.

Example 9: fill(previous) is not used

SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-09-18T16:00:00Z' AND time <='2015-09-18T16:42:00Z' GROUP BY time(12m)

name: h2o_feet
--------------
time                   max
2015-09-18T16:00:00Z3.599
2015-09-18T16:12:00Z3.402
2015-09-18T16:24:00Z3.235
2015-09-18T16:36:00Z

Example 10: fill(previous) is used

> SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-09-18T16:00:00Z' AND time <='2015-09-18T16:42:00Z' GROUP BY time(12m) fill(previous)

name: h2o_feet
--------------
time                   max
2015-09-18T16:00:00Z3.599
2015-09-18T16:12:00Z3.402
2015-09-18T16:24:00Z3.235
2015-09-18T16:36:00Z3.235

In this example, fill(previous) is used to specify 3.235 as the return value for the fourth time interval. This value is the return value for the previous time interval.

FAQ about the fill() function

What is returned by the fill() function if no data is found within the time range that is specified by the WHERE clause?

If no data is found within the specified time range, TSDB for InfluxDB® ignores the fill() function. No data is returned.

Example:

TSDB for InfluxDB® returns no data after the following statement is executed, because no points contain the water_level field are found within the specified time range. fill(800) in the statement does not affect the result.

 SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >='2015-09-18T22:00:00Z' AND time <='2015-09-18T22:18:00Z' GROUP BY time(12m) fill(800)

What is returned by fill(previous) if the previous time interval is not within the specified time range?

If the previous time interval of a time interval is not within the specified time range, TSDB for InfluxDB® ignores fill(previous).

Example:

In the following example, the specified time range is from 2015-09-18T16:24:00Z to 2015-09-18T16:54:00Z. fill(previous) returns the value for 2015-09-18T16:24:00Z as the value for 2015-09-18T16:36:00Z.

SELECT MAX("water_level") FROM "h2o_feet" WHERE location ='coyote_creek' AND time >='2015-09-18T16:24:00Z' AND time <='2015-09-18T16:54:00Z' GROUP BY time(12m) fill(previous)

name: h2o_feet
--------------
time                   max
2015-09-18T16:24:00Z3.235
2015-09-18T16:36:00Z3.235
2015-09-18T16:48:00Z4

In the following example, the specified time range is from 2015-09-18T16:36:00Z to 2015-09-18T16:54:00Z. In this query, fill(previous) does not return the value for 2015-09-18T16:24:00Z as the value for 2015-09-18T16:36:00Z because 2015-09-18T16:24:00Z is not within the specified time range.

SELECT MAX("water_level") FROM "h2o_feet" WHERE location ='coyote_creek' AND time >='2015-09-18T16:36:00Z' AND time <='2015-09-18T16:54:00Z' GROUP BY time(12m) fill(previous)

name: h2o_feet
--------------
time                   max
2015-09-18T16:36:00Z
2015-09-18T16:48:00Z4

What is returned by fill(linear) if the previous or subsequent time interval is not within the specified time range?

If the previous or subsequent time interval of a time interval is not in the specified time range, fill(linear) does not return a value for the time interval.

Example:

In the following example, the specified time range is from 2016-11-11T21:24:00Z to 2016-11-11T22:06:00Z. fill(linear) returns values for 2016-11-11T21:36:00Z and 2016-11-11T21:48:00Z based on the values that correspond to 2016-11-11T21:24:00Z and 2016-11-11T22:00:00Z.

SELECT MEAN("tadpoles") FROM "pond" WHERE time >'2016-11-11T21:24:00Z' AND time <='2016-11-11T22:06:00Z' GROUP BY time(12m) fill(linear)

name: pond
time                   mean
--------
2016-11-11T21:24:00Z3
2016-11-11T21:36:00Z4
2016-11-11T21:48:00Z5
2016-11-11T22:00:00Z6

In the following example, the specified time range is from 2016-11-11T21:36:00Z to 2016-11-11T22:06:00Z. In this query, fill(linear) does not return values for 2016-11-11T21:36:00Z and 2016-11-11T21:48:00Z because 2016-11-11T21:24:00Z is not within the specified time range and TSDB for InfluxDB® cannot perform linear interpolation.

SELECT MEAN("tadpoles") FROM "pond" WHERE time >='2016-11-11T21:36:00Z' AND time <='2016-11-11T22:06:00Z' GROUP BY time(12m) fill(linear)
name: pond
time                   mean
--------
2016-11-11T21:36:00Z
2016-11-11T21:48:00Z
2016-11-11T22:00:00Z6

Note

Data that is used in the preceding examples is not stored in the NOAA_water_database database. To use fill(linear), you must create a dataset that contains a small size of regular data.