All Products
Search
Document Center

GROUP BY Statement

Last Updated: Sep 11, 2020

The GROUP BY clause groups query results based on the specified tags or time intervals.

GROUP BY tags

GROUP BY<tag> groups query results based on the specified tags.

Syntax

  1. SELECT_clause FROM_clause [WHERE_clause] GROUP BY [* | <tag_key>[,<tag_key]]

Description

GROUP BY *groups query results based on all the tags.

GROUP BY <tag_key>groups query results based on a specific tag.

GROUP BY <tag_key>,<tag_key>groups query results based on multiple tags. The order of the tag keys does not affect the query results.

If a query includes a WHERE clause and a GROUP BY clause, the GROUP BY clause must appear after the WHERE clause.

The GROUP BY clause also supports regular expressions.

Examples

Group query results based on a single tag

  1. > SELECT MEAN("water_level") FROM "h2o_feet" GROUP BY "location"
  2. name: h2o_feet
  3. tags: location=coyote_creek
  4. time mean
  5. ---- ----
  6. 1970-01-01T00:00:00Z 5.359342451341401
  7. name: h2o_feet
  8. tags: location=santa_monica
  9. time mean
  10. ---- ----
  11. 1970-01-01T00:00:00Z 3.530863470081006

The query uses an InfluxQL function to calculate the average field value of the water_level field key for each tag value of location in the h2o_feet measurement. TSDB for InfluxDB® returns the data in two series, and one series is returned for each tag value of location.

Notes: In TSDB for InfluxDB®, ‘epoch 0’ is used as a null timestamp in most cases. Note that epoch 0 is also known as 1970-01-01T00:00:00Z. If your query has no timestamp to return, TSDB for InfluxDB® returns epoch 0 as the timestamp. For example, this occurs if your query includes an aggregate function where an unbounded time range is specified.

Group query results based on multiple tags

  1. > SELECT MEAN("index") FROM "h2o_quality" GROUP BY location,randtag
  2. name: h2o_quality
  3. tags: location=coyote_creek, randtag=1
  4. time mean
  5. ---- ----
  6. 1970-01-01T00:00:00Z 50.69033760186263
  7. name: h2o_quality
  8. tags: location=coyote_creek, randtag=2
  9. time mean
  10. ---- ----
  11. 1970-01-01T00:00:00Z 49.661867544220485
  12. name: h2o_quality
  13. tags: location=coyote_creek, randtag=3
  14. time mean
  15. ---- ----
  16. 1970-01-01T00:00:00Z 49.360939907550076
  17. name: h2o_quality
  18. tags: location=santa_monica, randtag=1
  19. time mean
  20. ---- ----
  21. 1970-01-01T00:00:00Z 49.132712456344585
  22. name: h2o_quality
  23. tags: location=santa_monica, randtag=2
  24. time mean
  25. ---- ----
  26. 1970-01-01T00:00:00Z 50.2937984496124
  27. name: h2o_quality
  28. tags: location=santa_monica, randtag=3
  29. time mean
  30. ---- ----
  31. 1970-01-01T00:00:00Z 49.99919903884662

The query uses an InfluxQL function to calculate the average field value of the index field key for each combination of the location tag values and the randtag tag values in the h2o_quality measurement. The location tag has two unique values and the randtag tag has three unique values. Therefore, a total of six combinations are available for the tag values of the two tag keys. Separate tags with commas (,) in the GROUP BY clause.

Group query results based on all the tags

  1. > SELECT MEAN("index") FROM "h2o_quality" GROUP BY *
  2. name: h2o_quality
  3. tags: location=coyote_creek, randtag=1
  4. time mean
  5. ---- ----
  6. 1970-01-01T00:00:00Z 50.69033760186263
  7. name: h2o_quality
  8. tags: location=coyote_creek, randtag=2
  9. time mean
  10. ---- ----
  11. 1970-01-01T00:00:00Z 49.661867544220485
  12. name: h2o_quality
  13. tags: location=coyote_creek, randtag=3
  14. time mean
  15. ---- ----
  16. 1970-01-01T00:00:00Z 49.360939907550076
  17. name: h2o_quality
  18. tags: location=santa_monica, randtag=1
  19. time mean
  20. ---- ----
  21. 1970-01-01T00:00:00Z 49.132712456344585
  22. name: h2o_quality
  23. tags: location=santa_monica, randtag=2
  24. time mean
  25. ---- ----
  26. 1970-01-01T00:00:00Z 50.2937984496124
  27. name: h2o_quality
  28. tags: location=santa_monica, randtag=3
  29. time mean
  30. ---- ----
  31. 1970-01-01T00:00:00Z 49.99919903884662

The query uses an InfluxQL function to calculate the average field value of the index field key for each combination of the tag values in the h2o_quality measurement.

Note that the query results are the same as the results of the query in the preceding example. This is because the h2o_quality measurement has only two tag keys: location and randtag.

GROUP BY time intervals

GROUP BY time() groups query results based on the specified time interval.

Basic GROUP BY time() syntax

Syntax

  1. SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(<time_interval>),[tag_key] [fill(<fill_option>)]

Description

Basic GROUP BY time() queries require the SELECT clauses that include InfluxQL functions and the WHERE clauses where time ranges are specified. Note that the GROUP BY clause must appear after the WHERE clause.

time(time_interval)

The time_interval parameter in the GROUP BY time() clause specifies a duration literal, and determines how TSDB for InfluxDB® groups query results in terms of the time dimension. For example, if the time_interval parameter is set to 5m, the time range specified in the WHERE clause is divided into 5-minute intervals and the query results are grouped based on each 5-minute interval.

fill(<fill_option>)

The fill(<fill_option>) function is optional. For time intervals in which no data is reported, the function returns the value specified by the fill_option parameter.

Coverage:

Basic GROUP BY time() queries determine the raw data reported in each time interval and the timestamps returned by the query based on the settings of time_interval and the preset time boundaries of TSDB for InfluxDB®.

Examples of basic syntax

The following sample data is used in the examples provided in this section:

  1. > SELECT "water_level","location" FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z'
  2. name: h2o_feet
  3. --------------
  4. time water_level location
  5. 2015-08-18T00:00:00Z 8.12 coyote_creek
  6. 2015-08-18T00:00:00Z 2.064 santa_monica
  7. 2015-08-18T00:06:00Z 8.005 coyote_creek
  8. 2015-08-18T00:06:00Z 2.116 santa_monica
  9. 2015-08-18T00:12:00Z 7.887 coyote_creek
  10. 2015-08-18T00:12:00Z 2.028 santa_monica
  11. 2015-08-18T00:18:00Z 7.762 coyote_creek
  12. 2015-08-18T00:18:00Z 2.126 santa_monica
  13. 2015-08-18T00:24:00Z 7.635 coyote_creek
  14. 2015-08-18T00:24:00Z 2.041 santa_monica
  15. 2015-08-18T00:30:00Z 7.5 coyote_creek
  16. 2015-08-18T00:30:00Z 2.051 santa_monica

Group query results based on a 12-minute interval

  1. > SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m)
  2. name: h2o_feet
  3. --------------
  4. time count
  5. 2015-08-18T00:00:00Z 2
  6. 2015-08-18T00:12:00Z 2
  7. 2015-08-18T00:24:00Z 2

The query uses an InfluxQL function to calculate the number of the points of the water_level field key for the location = coyote_creek tag in the h2o_feet measurement. The query groups the results based on a 12-minute interval.

The result associated with each timestamp indicates the result that corresponds to each 12-minute interval. The result associated with the first timestamp covers the raw data that occurs from 2015-08-18T00:00:00Z to 2015-08-18T00:12:00Z. Note that data associated with the 2015-08-18T00:12:00Z timestamp is excluded. The result associated with the first timestamp covers the raw data that occurs from 2015-08-18T00:12:00Z to 2015-08-18T00:24:00Z. Note that data associated with the 2015-08-18T00:24:00Z timestamp is excluded.

Group query results based on a 12-minute interval and a tag key

  1. > SELECT COUNT("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m),"location"
  2. name: h2o_feet
  3. tags: location=coyote_creek
  4. time count
  5. ---- -----
  6. 2015-08-18T00:00:00Z 2
  7. 2015-08-18T00:12:00Z 2
  8. 2015-08-18T00:24:00Z 2
  9. name: h2o_feet
  10. tags: location=santa_monica
  11. time count
  12. ---- -----
  13. 2015-08-18T00:00:00Z 2
  14. 2015-08-18T00:12:00Z 2
  15. 2015-08-18T00:24:00Z 2

The query uses an InfluxQL function to calculate the number of the points of the water_level field key and groups results based on the location tag and a 12-minute interval. Note that the time interval and the tag key are separated with commas (,) in the GROUP BY clause.

The query returns the data in two series, and one series is returned for each tag value of location. The result associated with each timestamp indicates the result that corresponds to each 12-minute interval. The result associated with the first timestamp covers the raw data that occurs from 2015-08-18T00:00:00Z to 2015-08-18T00:12:00Z. Note that data associated with the 2015-08-18T00:12:00Z timestamp is excluded. The result associated with the first timestamp covers the raw data that occurs from 2015-08-18T00:12:00Z to 2015-08-18T00:24:00Z. Note that data associated with the 2015-08-18T00:24:00Z timestamp is excluded.

FAQ about basic syntax

Why do the query results contain unexpected timestamps and values?

In basic syntax, TSDB for InfluxDB® determines the raw data reported in each time interval and the returned timestamps based on the preset time boundaries and the time interval that is specified by GROUP BY time(). In certain cases, unexpected results may occur.

Example

Raw data:

  1. > SELECT "water_level" FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:18:00Z'
  2. name: h2o_feet
  3. --------------
  4. time water_level
  5. 2015-08-18T00:00:00Z 8.12
  6. 2015-08-18T00:06:00Z 8.005
  7. 2015-08-18T00:12:00Z 7.887
  8. 2015-08-18T00:18:00Z 7.762

Query and results:

The following query covers a 12-minute time range and groups results based on a 12-minute interval. However, the query returns two results.

  1. > SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time < '2015-08-18T00:18:00Z' GROUP BY time(12m)
  2. name: h2o_feet
  3. time count
  4. ---- -----
  5. 2015-08-18T00:00:00Z 1 <----- Note that this timestamp occurs before the start of the query's time range
  6. 2015-08-18T00:12:00Z 1

Description

TSDB for InfluxDB® uses the round-off number of the preset time boundaries for the time intervals specified by GROUP BY. The preset time boundaries are not affected by the time conditions specified in the WHERE clause. When TSDB for InfluxDB® calculates the results, the timestamps of all the returned data must fall within the time range that is specified in the query. However, the time intervals specified by GROUP BY are based on the preset time boundaries.

The following table describes the preset time boundaries, relevant time intervals specified by GROUP BY time(), returned points, and returned timestamps for each time interval specified by GROUP BY time() in the results.

Time interval No. Preset time boundary Time interval specified by GROUP BY time() Returned point Returned timestamp
1 time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:12:00Z time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:12:00Z 8.005 2015-08-18T00:00:00Z
2 time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:24:00Z time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:18:00Z 7.887 2015-08-18T00:12:00Z

The preset time boundaries of the first 12-minute time range are 00:00 and 12:00. Note that the data associated with the 12:00 timestamp is excluded. Only the point 8.005 falls within the first time interval specified by GROUP BY time() and the first time range specified by the corresponding preset time boundaries. The returned timestamp occurs before the start time of the time range specified in the query. However, the query results exclude the data that occurs before the start time of the specified time range.

The preset time boundaries of the second 12-minute time range are 12:00 and 24:00. Note that the data associated with the 24:00 timestamp is excluded. Only the point 7.887 falls within the second time interval specified by GROUP BY time() and the second time range specified by the corresponding preset time boundaries.

The advanced GROUP BY time() syntax allows you to change the start time of the preset time boundaries of TSDB for InfluxDB®. The third example in the following advanced syntax section continues to explore the query shown in the current example. The query in the third example shifts the preset time boundaries forward by 6 minutes so that TSDB for InfluxDB® returns the following result:

  1. name: h2o_feet
  2. time count
  3. ---- -----
  4. 2015-08-18T00:06:00Z 2

Advanced GROUP BY time() syntax

Syntax

  1. SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(<time_interval>,<offset_interval>),[tag_key] [fill(<fill_option>)]

Description

Advanced GROUP BY time() queries require the SELECT clauses that include InfluxQL functions and the WHERE clauses where time ranges are specified. Note that the GROUP BY clause must appear after the WHERE clause.

time(time_interval,offset_interval)

For more information about time_interval, see the “Basic GROUP BY time() syntax” section.

The offset_interval parameter specifies a duration literal and shifts forward or backward the preset time boundaries of TSDB for InfluxDB®. The value of the offset_interval parameter can be a positive number or a negative number.

fill(<fill_option>)

The fill(<fill_option>) function is optional. For time intervals in which no data is reported, the function returns the value specified by the fill_option parameter.

Coverage:

Advanced GROUP BY time() queries determine the raw data reported in each time interval and the timestamps returned based on the settings of the time_interval and offset_interval parameters and the preset time boundaries of TSDB for InfluxDB®.

Examples

The following sample data is used in the examples provided in this section:

  1. > SELECT "water_level" FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:54:00Z'
  2. name: h2o_feet
  3. --------------
  4. time water_level
  5. 2015-08-18T00:00:00Z 8.12
  6. 2015-08-18T00:06:00Z 8.005
  7. 2015-08-18T00:12:00Z 7.887
  8. 2015-08-18T00:18:00Z 7.762
  9. 2015-08-18T00:24:00Z 7.635
  10. 2015-08-18T00:30:00Z 7.5
  11. 2015-08-18T00:36:00Z 7.372
  12. 2015-08-18T00:42:00Z 7.234
  13. 2015-08-18T00:48:00Z 7.11
  14. 2015-08-18T00:54:00Z 6.982

Group query results based on an 18-minute interval and shift the preset time boundaries forward

  1. > SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time <= '2015-08-18T00:54:00Z' GROUP BY time(18m,6m)
  2. name: h2o_feet
  3. time mean
  4. ---- ----
  5. 2015-08-18T00:06:00Z 7.884666666666667
  6. 2015-08-18T00:24:00Z 7.502333333333333
  7. 2015-08-18T00:42:00Z 7.108666666666667

The query uses an InfluxQL function to calculate the average field value of the water_level field key. The query groups results based on an 18-minute interval and shifts the preset time boundaries forward by 6 minutes.

For the queries where the offset_interval parameter is not specified, the time boundaries and the returned timestamps are based on the preset time boundaries of TSDB for InfluxDB®. The offset_interval parameter is not specified in the query and the following result is returned:

  1. > SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time <= '2015-08-18T00:54:00Z' GROUP BY time(18m)
  2. name: h2o_feet
  3. time mean
  4. ---- ----
  5. 2015-08-18T00:00:00Z 7.946
  6. 2015-08-18T00:18:00Z 7.6323333333333325
  7. 2015-08-18T00:36:00Z 7.238666666666667
  8. 2015-08-18T00:54:00Z 6.982

For the queries where the offset_interval parameter is not specified, the time boundaries and the returned timestamps are based on the preset time boundaries of TSDB for InfluxDB®.

Time interval No. Preset time boundary Time interval specified by GROUP BY time() Returned point Returned timestamp
1 time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:18:00Z time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:18:00Z 8.005 and 7.887 2015-08-18T00:00:00Z
2 time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:36:00Z <- - -same 7.762, 7.635, and 7.5 2015-08-18T00:18:00Z
3 time >= 2015-08-18T00:36:00Z AND time < 2015-08-18T00:54:00Z <- - -same 7.372, 7.234, and 7.11 2015-08-18T00:36:00Z
4 time >= 2015-08-18T00:54:00Z AND time < 2015-08-18T01:12:00Z time = 2015-08-18T00:54:00Z 6.982 2015-08-18T00:54:00Z

The preset time boundaries of the first 18-minute time range are 00:00 and 18:00. Note that the data associated with the 18:00 timestamp is excluded. The two points 8.005 and 7.887 fall within the first time interval specified by GROUP BY time() and the first time range specified by the corresponding preset time boundaries. The returned timestamp occurs before the start time of the time range specified in the query. However, the query results exclude the data that occurs before the start time of the specified time range.

The preset time boundaries of the second 18-minute time range are 18:00 and 36:00. Note that the data associated with the 36:00 timestamp is excluded. The three points 7.762, 7.635, and 7.5 fall within the second time interval specified by GROUP BY time() and the second time range specified by the corresponding preset time boundaries. In this case, the time range specified by the preset time boundaries is the same as the time interval specified by GROUP BY time().

The preset time boundaries of the fourth 18-minute time range are 54:00 and 01:12:00. Note that the data associated with the 01:12:00 timestamp is excluded. Only the point 6.982 falls within the fourth time interval specified by GROUP BY time() and the fourth time range specified by the corresponding preset time boundaries.

For the queries where the offset_interval parameter is specified, the time boundaries and the returned timestamps are based on the specified offset time boundaries.

Time interval No. Preset time boundary Time interval specified by GROUP BY time() Returned point Returned timestamp
1 time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:24:00Z <- - -same 8.005, 7.887, and 7.762 2015-08-18T00:06:00Z
2 time >= 2015-08-18T00:24:00Z AND time < 2015-08-18T00:42:00Z <- - -same 7.635, 7.5, and 7.372 2015-08-18T00:24:00Z
3 time >= 2015-08-18T00:42:00Z AND time < 2015-08-18T01:00:00Z <- - -same 7.234, 7.11, and 6.982 2015-08-18T00:42:00Z
4 time >= 2015-08-18T01:00:00Z AND time < 2015-08-18T01:18:00Z NA NA NA

The 6-minute offset interval shifts forward the time range specified by the preset time boundaries. Therefore, the time range specified by the offset time boundaries and the relevant time interval specified by GROUP BY time() are always the same. The offset_interval parameter is specified in the query, and the query performs calculation on the three points for each time interval. The returned timestamp matches the start time of the time range specified by the offset time boundaries and the start time of the time interval specified by GROUP BY time().

Note that offset_interval forces the fourth time boundary to be out of the time range specified in the query. Therefore, the query returns no results for that last interval.

Group query results based on an 18-minute interval and shift the preset time boundaries backward

  1. > SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time <= '2015-08-18T00:54:00Z' GROUP BY time(18m,-12m)
  2. name: h2o_feet
  3. time mean
  4. ---- ----
  5. 2015-08-18T00:06:00Z 7.884666666666667
  6. 2015-08-18T00:24:00Z 7.502333333333333
  7. 2015-08-18T00:42:00Z 7.108666666666667

The query uses an InfluxQL function to calculate the average field value of the water_level field key. The query groups results based on an 18-minute interval and shifts the preset time boundaries backward by 12 minutes.

Notes: The query in the current example returns the same results as the query in the preceding example Group query results based on an 18-minute interval and shift the preset time boundaries forward. However, the offset_interval parameter is set to a negative number in the current example and a positive number in the preceding example.` The query performance for the two examples is the same. You can choose the most intuitive option when you decide between a positive number and a negative number foroffset_interval`.

For the queries where the offset_interval parameter is not specified, the time boundaries and the returned timestamps are based on the preset time boundaries of TSDB for InfluxDB®. The offset_interval parameter is not specified in the query and the following result is returned:

  1. > SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time <= '2015-08-18T00:54:00Z' GROUP BY time(18m)
  2. name: h2o_feet
  3. time mean
  4. ---- ----
  5. 2015-08-18T00:00:00Z 7.946
  6. 2015-08-18T00:18:00Z 7.6323333333333325
  7. 2015-08-18T00:36:00Z 7.238666666666667
  8. 2015-08-18T00:54:00Z 6.982

For the queries where the offset_interval parameter is not specified, the time boundaries and the returned timestamps are based on the preset time boundaries of TSDB for InfluxDB®.

Time interval No. Preset time boundary Time interval specified by GROUP BY time() Returned point Returned timestamp
1 time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:18:00Z time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:18:00Z 8.005 and 7.887 2015-08-18T00:00:00Z
2 time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:36:00Z <- - -same 7.762, 7.635, and 7.5 2015-08-18T00:18:00Z
3 time >= 2015-08-18T00:36:00Z AND time < 2015-08-18T00:54:00Z <- - -same 7.372, 7.234, and 7.11 2015-08-18T00:36:00Z
4 time >= 2015-08-18T00:54:00Z AND time < 2015-08-18T01:12:00Z time = 2015-08-18T00:54:00Z 6.982 2015-08-18T00:54:00Z

The preset time boundaries of the first 18-minute time range are 00:00 and 18:00. Note that the data associated with the 18:00 timestamp is excluded. The two points 8.005 and 7.887 fall within the first time interval specified by GROUP BY time() and the first time range specified by the corresponding preset time boundaries. The returned timestamp occurs before the start time of the time range specified in the query. However, the query results exclude the data that occurs before the start time of the specified time range.

The preset time boundaries of the second 18-minute time range are 18:00 and 36:00. Note that the data associated with the 36:00 timestamp is excluded. The three points 7.762, 7.635, and 7.5 fall within the second time interval specified by GROUP BY time() and the second time range specified by the corresponding preset time boundaries. In this case, the time range specified by the preset time boundaries is the same as the time interval specified by GROUP BY time().

The preset time boundaries of the fourth 18-minute time range are 54:00 and 01:12:00. Note that the data associated with the 01:12:00 timestamp is excluded. Only the point 6.982 falls within the fourth time interval specified by GROUP BY time() and the fourth time range specified by the corresponding preset time boundaries.

For the queries where the offset_interval parameter is specified, the time boundaries and the returned timestamps are based on the specified offset time boundaries.

Time interval No. Preset time boundary Time interval specified by GROUP BY time() Returned point Returned timestamp
1 time >= 2015-08-17T23:48:00Z AND time < 2015-08-18T00:06:00Z NA NA NA
2 time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:24:00Z <- - -same 8.005, 7.887, and 7.762 2015-08-18T00:06:00Z
3 time >= 2015-08-18T00:24:00Z AND time < 2015-08-18T00:42:00Z <- - -same 7.635, 7.5, and 7.372 2015-08-18T00:24:00Z
4 time >= 2015-08-18T00:42:00Z AND time < 2015-08-18T01:00:00Z <- - -same 7.234, 7.11, and 6.982 2015-08-18T00:42:00Z

The negative 12-minute offset interval shifts backward the time range specified by the preset time boundaries. Therefore, the time range specified by the offset time boundaries and the relevant time interval specified by GROUP BY time() are always the same. The offset_interval parameter is specified in the query, and the query performs calculation on the three points for each time interval. The returned timestamp matches the start time of the time range specified by the offset time boundaries and the start time of the time interval specified by GROUP BY time().

Note that offset_interval forces the first time boundary to be out of the time range specified in the query. Therefore, the query returns no results for the first interval.

Group query results based on a 12-minute interval and shift the preset time boundaries forward

This example is an extension of the example that is described in the “FAQ about basic syntax” section.

  1. > SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time < '2015-08-18T00:18:00Z' GROUP BY time(12m,6m)
  2. name: h2o_feet
  3. time count
  4. ---- -----
  5. 2015-08-18T00:06:00Z 2

The query uses an InfluxQL function to calculate the number of the points of water_level. The query groups results based on a 12-minute interval and shifts the preset time boundaries forward by 6 minutes.

For the queries where the offset_interval parameter is not specified, the time boundaries and the returned timestamps are based on the preset time boundaries of TSDB for InfluxDB®. The offset_interval parameter is not specified in the query and the following result is returned:

  1. > SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time < '2015-08-18T00:18:00Z' GROUP BY time(12m)
  2. name: h2o_feet
  3. time count
  4. ---- -----
  5. 2015-08-18T00:00:00Z 1
  6. 2015-08-18T00:12:00Z 1

For the queries where the offset_interval parameter is not specified, the time boundaries and the returned timestamps are based on the preset time boundaries of TSDB for InfluxDB®.

Time interval No. Preset time boundary Time interval specified by GROUP BY time() Returned point Returned timestamp
1 time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:12:00Z time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:12:00Z 8.005 2015-08-18T00:00:00Z
2 time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:24:00Z time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:18:00Z 7.887 2015-08-18T00:12:00Z

The preset time boundaries of the first 12-minute time range are 00:00 and 12:00. Note that the data associated with the 12:00 timestamp is excluded. Only the point 8.005 falls within the first time interval specified by GROUP BY time() and the first time range specified by the corresponding preset time boundaries. The returned timestamp occurs before the start time of the time range specified in the query. However, the query results exclude the data that occurs before the start time of the specified time range.

The preset time boundaries of the second 12-minute time range are 12:00 and 24:00. Note that the data associated with the 24:00 timestamp is excluded. Only the point 7.887 falls within the second time interval specified by GROUP BY time() and the second time range specified by the corresponding preset time boundaries.

For the queries where the offset_interval parameter is specified, the time boundaries and the returned timestamps are based on the specified offset time boundaries.

Time interval No. Preset time boundary Time interval specified by GROUP BY time() Returned point Returned timestamp
1 time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:18:00Z <—-same 8.005 and 7.887 2015-08-18T00:06:00Z
2 time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:30:00Z NA NA NA

The 6-minute offset interval shifts forward the time range specified by the preset time boundaries. Therefore, the time range specified by the offset time boundaries and the relevant time interval specified by GROUP BY time() are always the same. When the offset_interval parameter is set, the query returns a single result. The returned timestamp matches the start time of the time range specified by the offset time boundaries and the start time of the time interval specified by GROUP BY time().

Note that offset_interval forces the second time boundary to be out of the time range specified in the query. Therefore, the query returns no results for the second interval.

GROUP BY time intervals and fill()

The fill() function changes the value returned for time intervals in which no data is reported.

Syntax

  1. SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(time_interval,[<offset_interval])[,tag_key] [fill(<fill_option>)]

Description

By default, null values are returned to the output columns for time intervals in which no data is reported. The time intervals are specified by the GROUP BY time() clauses. To change the values returned for time intervals in which no data is reported, use the fill() function. Note that fill() must appear after the GROUP BY clause if you perform the GROUP BY operation based on multiple objects, such as tags and time intervals.

fill_option

  • Numerical value: returns the specified numerical value for time intervals in which no point is reported.
  • linear: returns the results of [linear interpolation] (https://en.wikipedia.org/wiki/Linear_interpolation) for time intervals in which no point is reported.
  • none: returns no timestamps and values for time intervals in which no point is reported.
  • null: returns timestamps and null values for time intervals in which no point is reported. The returned results are the same as those when you use the default values.
  • previous: returns the value that corresponds to the previous time interval for time intervals in which no point is reported.

Examples

Example 1: Use a query where the fill(100) function is not used

  1. > SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m)
  2. name: h2o_feet
  3. --------------
  4. time max
  5. 2015-09-18T16:00:00Z 3.599
  6. 2015-09-18T16:12:00Z 3.402
  7. 2015-09-18T16:24:00Z 3.235
  8. 2015-09-18T16:36:00Z

Example 1: Use a query where the fill(100) function is used

  1. > SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) fill(100)
  2. name: h2o_feet
  3. --------------
  4. ime max
  5. 2015-09-18T16:00:00Z 3.599
  6. 2015-09-18T16:12:00Z 3.402
  7. 2015-09-18T16:24:00Z 3.235
  8. 2015-09-18T16:36:00Z 100

The fill(100) function returns 100 for time intervals in which no point is reported. In this example, fill(100) returns 100 for the fourth time interval in which no data is reported.

Example 2: Use a query where the fill(linear) function is not used

  1. > SELECT MEAN("tadpoles") FROM "pond" WHERE time >= '2016-11-11T21:00:00Z' AND time <= '2016-11-11T22:06:00Z' GROUP BY time(12m)
  2. name: pond
  3. time mean
  4. ---- ----
  5. 2016-11-11T21:00:00Z 1
  6. 2016-11-11T21:12:00Z
  7. 2016-11-11T21:24:00Z 3
  8. 2016-11-11T21:36:00Z
  9. 2016-11-11T21:48:00Z
  10. 2016-11-11T22:00:00Z 6

Example 2: Use a query where the fill(linear) function is used

  1. > SELECT MEAN("tadpoles") FROM "pond" WHERE time >= '2016-11-11T21:00:00Z' AND time <= '2016-11-11T22:06:00Z' GROUP BY time(12m) fill(linear)
  2. name: pond
  3. time mean
  4. ---- ----
  5. 2016-11-11T21:00:00Z 1
  6. 2016-11-11T21:12:00Z 2
  7. 2016-11-11T21:24:00Z 3
  8. 2016-11-11T21:36:00Z 4
  9. 2016-11-11T21:48:00Z 5
  10. 2016-11-11T22:00:00Z 6

The fill(linear) function returns the result of linear interpolation for time intervals in which no point is reported.

Notes: The data used in example 2 is not stored in the NOAA_water_database database. Therefore, a data set that has a small amount of regular data is created to use fill(linear).

Example 3: Use a query where the fill(none) function is not used

  1. > SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m)
  2. name: h2o_feet
  3. --------------
  4. time max
  5. 2015-09-18T16:00:00Z 3.599
  6. 2015-09-18T16:12:00Z 3.402
  7. 2015-09-18T16:24:00Z 3.235
  8. 2015-09-18T16:36:00Z

Example 3: Use a query where the fill(none) function is used

  1. > SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) fill(none)
  2. name: h2o_feet
  3. --------------
  4. time max
  5. 2015-09-18T16:00:00Z 3.599
  6. 2015-09-18T16:12:00Z 3.402
  7. 2015-09-18T16:24:00Z 3.235

The fill(null) function returns no timestamps and values for time intervals in which no point is reported.

Example 4: Use a query where the fill(null) function is not used

  1. > SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m)
  2. name: h2o_feet
  3. --------------
  4. time max
  5. 2015-09-18T16:00:00Z 3.599
  6. 2015-09-18T16:12:00Z 3.402
  7. 2015-09-18T16:24:00Z 3.235
  8. 2015-09-18T16:36:00Z

Example 4: Use a query where the fill(null) function is used

  1. > SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) fill(null)
  2. name: h2o_feet
  3. --------------
  4. time max
  5. 2015-09-18T16:00:00Z 3.599
  6. 2015-09-18T16:12:00Z 3.402
  7. 2015-09-18T16:24:00Z 3.235
  8. 2015-09-18T16:36:00Z

The fill(null) function returns null as the value for the time interval in which no point is reported. The returned results are the same as those when the fill(null) function is not used in the query.``

Example 5: Use a query where the fill(previous) function is not used

  1. > SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m)
  2. name: h2o_feet
  3. --------------
  4. time max
  5. 2015-09-18T16:00:00Z 3.599
  6. 2015-09-18T16:12:00Z 3.402
  7. 2015-09-18T16:24:00Z 3.235
  8. 2015-09-18T16:36:00Z

Example 5: Use a query where the fill(previous) function is used

  1. > SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) fill(previous)
  2. name: h2o_feet
  3. --------------
  4. time max
  5. 2015-09-18T16:00:00Z 3.599
  6. 2015-09-18T16:12:00Z 3.402
  7. 2015-09-18T16:24:00Z 3.235
  8. 2015-09-18T16:36:00Z 3.235

The fill(previous) function changes the returned value to 3.235 for the time interval in which no point is reported. The returned value is the same as that reported for the previous time interval.

FAQ about the fill() function

What are the impacts if I use the fill() function when no data is reported in the time range specified in the query?

The fill() function does not take effect if no data is reported for the time range specified in the query. This is the expected result.

Example

The following query returns no data because no points are reported for the water_level field key in the specified time range. Note that fill(800) does not affect the query results.

  1. > SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location" = 'coyote_creek' AND time >= '2015-09-18T22:00:00Z' AND time <= '2015-09-18T22:18:00Z' GROUP BY time(12m) fill(800)
  2. >

What are the impacts if I use fill(previous) when the previous result falls out of the time range specified in the query?

The fill(previous) function does not fill the result for a time interval if the previous value falls out of the time range specified in the query.

Example

The following query covers the time range from 2015-09-18T16:24:00Z to 2015-09-18T16:54:00Z. The fill(previous) function fills the value for 2015-09-18T16:36:00Z by using the value that corresponds to 2015-09-18T16:24:00Z.

  1. > SELECT MAX("water_level") FROM "h2o_feet" WHERE location = 'coyote_creek' AND time >= '2015-09-18T16:24:00Z' AND time <= '2015-09-18T16:54:00Z' GROUP BY time(12m) fill(previous)
  2. name: h2o_feet
  3. --------------
  4. time max
  5. 2015-09-18T16:24:00Z 3.235
  6. 2015-09-18T16:36:00Z 3.235
  7. 2015-09-18T16:48:00Z 4

The next query shortens the time range that is used in the previous query. The current query covers the time range from 2015-09-18T16:36:00Z to 2015-09-18T16:54:00Z. The fill(previous) function does not fill the value for 2015-09-18T16:36:00Z by using the value that corresponds to 2015-09-18T16:24:00Z. This is because the value that corresponds to 2015-09-18T16:24:00Z falls out of the shortened time range specified in the query.

  1. > SELECT MAX("water_level") FROM "h2o_feet" WHERE location = 'coyote_creek' AND time >= '2015-09-18T16:36:00Z' AND time <= '2015-09-18T16:54:00Z' GROUP BY time(12m) fill(previous)
  2. name: h2o_feet
  3. --------------
  4. time max
  5. 2015-09-18T16:36:00Z
  6. 2015-09-18T16:48:00Z 4

What are the impacts if I use fill(linear) when the previous or next result falls out of the time range specified in the query?

If the previous or next result falls out of the time range specified in the query, fill(linear) does not fill the result for time intervals in which no point is reported.

Example

The following query covers the time range from 2016-11-11T21:24:00Z to 2016-11-11T22:06:00Z. The fill(linear) function fills the value for 2016-11-11T21:36:00Z by using the value that corresponds to 2016-11-11T21:24:00Z. This function also fills the value for 2016-11-11T21:48:00Z by using the value that corresponds to 2016-11-11T22:00:00Z.

  1. > SELECT MEAN("tadpoles") FROM "pond" WHERE time > '2016-11-11T21:24:00Z' AND time <= '2016-11-11T22:06:00Z' GROUP BY time(12m) fill(linear)
  2. name: pond
  3. time mean
  4. ---- ----
  5. 2016-11-11T21:24:00Z 3
  6. 2016-11-11T21:36:00Z 4
  7. 2016-11-11T21:48:00Z 5
  8. 2016-11-11T22:00:00Z 6

The next query shortens the time range that is used in the previous query. The current query covers the time range from 2016-11-11T21:36:00Z to 2016-11-11T22:06:00Z. The fill(linear) function does not fill the values for 2016-11-11T21:36:00Z and 2016-11-11T21:48:00Z. This is because the value that corresponds to 2016-11-11T21:24:00Z falls out of the shortened time range specified in the query and TSDB for InfluxDB® cannot perform linear interpolation.

  1. > SELECT MEAN("tadpoles") FROM "pond" WHERE time >= '2016-11-11T21:36:00Z' AND time <= '2016-11-11T22:06:00Z' GROUP BY time(12m) fill(linear)
  2. name: pond
  3. time mean
  4. ---- ----
  5. 2016-11-11T21:36:00Z
  6. 2016-11-11T21:48:00Z
  7. 2016-11-11T22:00:00Z 6

Notes: The sample data used in this example is not stored in the NOAA_water_database database. Therefore, a data set that has a small amount of regular data is created to use fill(linear).


InfluxDB® is a trademark registered by InfluxData, which is not affiliated with, and does not endorse, TSDB for InfluxDB®.