Key concepts

Last Updated: May 29, 2020

Before you use TSDB for InfluxDB®, we recommend that you familiarize yourself with the key concepts of TSDB for InfluxDB® databases. This topic introduces these concepts and the related terms. The following table lists all the terms that are covered in this topic.

Database            Field key      Field set
Field value         Measurement    Point
Retention policy    Series         Tag key
Tag set             Tag value      Timestamp

For more information, see Terms.

Sample data

The following table lists the sample data that is used in this topic. The data is fictional, but it represents a realistic setup of TSDB for InfluxDB®. The data shows the number of butterflies and honeybees that two scientists, langstroth and perpetua, counted at location 1 and location 2 from 00:00 to 06:12 (UTC) on August 18, 2015. The data is stored in a database named my_database and uses the autogen retention policy. The measurement is census. Timestamps are stored in the time column. The field keys are butterflies and honeybees, and the field values are the values in those columns. The tag keys are location and scientist, and the tag values are the values in those columns.

Name: census

time butterflies honeybees location scientist
2015-08-18T00:00:00Z 12 23 1 langstroth
2015-08-18T00:00:00Z 1 30 1 perpetua
2015-08-18T00:06:00Z 11 28 1 langstroth
2015-08-18T00:06:00Z 3 28 1 perpetua
2015-08-18T05:54:00Z 2 11 2 langstroth
2015-08-18T06:00:00Z 1 10 2 langstroth
2015-08-18T06:06:00Z 8 23 2 perpetua
2015-08-18T06:12:00Z 7 22 2 perpetua

Description

This section explains the sample data in TSDB for InfluxDB®.

TSDB for InfluxDB® is a time series database service. Therefore, this section starts with time. The sample data contains a time column, as does all data in TSDB for InfluxDB®. The time column stores timestamps. Each timestamp shows the date and time that are associated with specific data. Timestamps are in UTC and are formatted according to RFC 3339.
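
As a minimal sketch of how a client might handle such a timestamp, the following Python snippet parses timestamps from the sample data into timezone-aware UTC values. The helper name parse_timestamp is hypothetical and is not part of TSDB for InfluxDB®. It handles only the whole-second form with a Z suffix that the sample data uses.

```python
from datetime import datetime, timezone


def parse_timestamp(text):
    """Parse a timestamp such as 2015-08-18T00:06:00Z into an aware UTC datetime."""
    # The trailing Z means UTC. strptime matches it as a literal character,
    # so the UTC zone is attached explicitly afterwards.
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


first = parse_timestamp("2015-08-18T00:00:00Z")
second = parse_timestamp("2015-08-18T00:06:00Z")
print(second - first)  # 0:06:00
```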

The two columns named butterflies and honeybees are fields. Fields consist of field keys and field values. The field keys butterflies and honeybees are stored as strings. The field values for the butterflies field key are 12 through 7, from top to bottom in the preceding table, and indicate the number of butterflies. The field values for the honeybees field key are 23 through 22, from top to bottom in the preceding table, and indicate the number of honeybees.

Field values are your data, and can be strings, floating-point numbers, integers, or Boolean values. Each field value is always associated with a timestamp, because TSDB for InfluxDB® is a time series database service. The field values in the sample data are provided as follows:

  12 23
  1 30
  11 28
  3 28
  2 11
  1 10
  8 23
  7 22

A field set is a collection of field key-value pairs. The sample data contains the following eight field sets:

  * butterflies = 12, honeybees = 23
  * butterflies = 1, honeybees = 30
  * butterflies = 11, honeybees = 28
  * butterflies = 3, honeybees = 28
  * butterflies = 2, honeybees = 11
  * butterflies = 1, honeybees = 10
  * butterflies = 8, honeybees = 23
  * butterflies = 7, honeybees = 22
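
One way to picture a field set is as a mapping from field keys to field values. The following Python sketch builds the eight field sets from the rows of the sample data. The variable names are illustrative only, and the mapping is a mental model rather than the way TSDB for InfluxDB® stores data.

```python
# Field values of the sample data, one tuple per row: (butterflies, honeybees).
rows = [
    (12, 23), (1, 30), (11, 28), (3, 28),
    (2, 11), (1, 10), (8, 23), (7, 22),
]
field_keys = ("butterflies", "honeybees")

# Pair each field key with the value in the same column to form one field set per row.
field_sets = [dict(zip(field_keys, row)) for row in rows]

for field_set in field_sets:
    print(field_set)
```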

Fields are a required element in the data structure of TSDB for InfluxDB®. You must specify at least one field to store data. Note that fields are not indexed. If you use field values as filter conditions for queries, the system must scan all the values that match the other conditions in the queries. As a result, queries based on field values take longer than queries based on tags. Tags are described later in this topic. In general, fields should not contain metadata that is frequently queried.

The last two columns named location and scientist are tags in the sample data. Tags consist of tag keys and tag values. Tag keys and tag values are stored as strings that record metadata. In the sample data, the tag keys are location and scientist. The location tag key has two tag values: 1 and 2. The scientist tag key also has two tag values: langstroth and perpetua.

Tag sets are the different combinations of tag key-value pairs. The sample data has the following four tag sets:

  * location = 1, scientist = langstroth
  * location = 2, scientist = langstroth
  * location = 1, scientist = perpetua
  * location = 2, scientist = perpetua

In TSDB for InfluxDB®, tags are optional. You do not need to use tags in your data structure. However, tags are usually worth using because, unlike fields, they are indexed. Queries that use tags as filter conditions take less time than queries that use field values as filter conditions. Therefore, tags are suitable for storing metadata that is frequently queried.
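
The difference between indexed tags and unindexed fields can be illustrated with a small Python model. This is a conceptual sketch only: the function names and the dictionary-based index are assumptions made for illustration and do not describe the actual storage engine of TSDB for InfluxDB®.

```python
from collections import defaultdict

# Sample data as (time, field set, tag set). Tag values are strings.
POINTS = [
    ("2015-08-18T00:00:00Z", {"butterflies": 12, "honeybees": 23}, {"location": "1", "scientist": "langstroth"}),
    ("2015-08-18T00:00:00Z", {"butterflies": 1, "honeybees": 30}, {"location": "1", "scientist": "perpetua"}),
    ("2015-08-18T00:06:00Z", {"butterflies": 11, "honeybees": 28}, {"location": "1", "scientist": "langstroth"}),
    ("2015-08-18T00:06:00Z", {"butterflies": 3, "honeybees": 28}, {"location": "1", "scientist": "perpetua"}),
    ("2015-08-18T05:54:00Z", {"butterflies": 2, "honeybees": 11}, {"location": "2", "scientist": "langstroth"}),
    ("2015-08-18T06:00:00Z", {"butterflies": 1, "honeybees": 10}, {"location": "2", "scientist": "langstroth"}),
    ("2015-08-18T06:06:00Z", {"butterflies": 8, "honeybees": 23}, {"location": "2", "scientist": "perpetua"}),
    ("2015-08-18T06:12:00Z", {"butterflies": 7, "honeybees": 22}, {"location": "2", "scientist": "perpetua"}),
]


def build_tag_index(points):
    """Map each (tag key, tag value) pair to the positions of the points that carry it."""
    index = defaultdict(list)
    for position, (_, _, tags) in enumerate(points):
        for pair in tags.items():
            index[pair].append(position)
    return index


def query_by_tag(points, index, key, value):
    """Look up matching points through the index; non-matching points are never visited."""
    return [points[i] for i in index.get((key, value), [])]


def query_by_field(points, key, value):
    """Fields have no index in this model, so every point must be checked."""
    return [point for point in points if point[1].get(key) == value]


TAG_INDEX = build_tag_index(POINTS)
print(len(query_by_tag(POINTS, TAG_INDEX, "scientist", "perpetua")))  # 4
print(len(query_by_field(POINTS, "butterflies", 1)))                  # 2
```

In this model, the tag query touches only the four matching points, whereas the field query inspects all eight. The gap widens as the amount of data grows.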


Importance of indexing: Schema use case

Assume that most of your queries use the field values of the butterflies and honeybees field keys as filter conditions:

SELECT * FROM "census" WHERE "butterflies" = 1
SELECT * FROM "census" WHERE "honeybees" = 23

Fields are not indexed. Therefore, TSDB for InfluxDB® scans every value of the butterflies field key in the first query and every value of the honeybees field key in the second query. Then, TSDB for InfluxDB® returns the query results. This prolongs the response time of queries, especially when you run queries based on large amounts of data. To improve query performance, you can rearrange your schema by changing the butterflies and honeybees fields to tags and the location and scientist tags to fields. The following table describes the sample data that has the new schema.

Name: census

time location scientist butterflies honeybees
2015-08-18T00:00:00Z 1 langstroth 12 23
2015-08-18T00:00:00Z 1 perpetua 1 30
2015-08-18T00:06:00Z 1 langstroth 11 28
2015-08-18T00:06:00Z 1 perpetua 3 28
2015-08-18T05:54:00Z 2 langstroth 2 11
2015-08-18T06:00:00Z 2 langstroth 1 10
2015-08-18T06:06:00Z 2 perpetua 8 23
2015-08-18T06:12:00Z 2 perpetua 7 22

Note that butterflies and honeybees are tags in this example. If you run the preceding queries again, TSDB for InfluxDB® does not need to scan every value of butterflies and honeybees.


A measurement is a container for tags, fields, and the time column. The measurement name is the description of the data that is stored in the relevant fields. A measurement name is a string. For SQL users, a measurement is similar in concept to a table. The sample data has only one measurement: census. The measurement name census indicates that the field values record the number of butterflies and honeybees instead of sizes, directions, or happiness indexes.

A measurement can belong to more than one retention policy. A retention policy specifies two parameters: DURATION and REPLICATION. The DURATION parameter specifies how long TSDB for InfluxDB® retains data. The REPLICATION parameter specifies how many copies of the data are stored in a cluster.

Note: Replication factors are not applicable to single-node instances.

In the sample data, all data in the census measurement belongs to the autogen retention policy. TSDB for InfluxDB® automatically creates the autogen retention policy. This retention policy allows you to store data permanently, and the replication factor for this retention policy is set to 1.
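
The effect of the DURATION parameter can be sketched as a simple age check. The following Python snippet is a simplified model: the function name is_expired, the seven-day policy, and the choice of None to represent permanent retention are all assumptions made for illustration. This topic does not describe how the service actually removes expired data.

```python
from datetime import datetime, timedelta, timezone


def is_expired(point_time, now, duration):
    """Return True if a point is older than the retention duration.

    duration=None models permanent retention, as with the autogen policy.
    """
    if duration is None:
        return False
    return now - point_time > duration


point_time = datetime(2015, 8, 18, 0, 0, tzinfo=timezone.utc)
now = datetime(2015, 8, 26, 0, 0, tzinfo=timezone.utc)  # eight days later

print(is_expired(point_time, now, None))               # False
print(is_expired(point_time, now, timedelta(days=7)))  # True
```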

Now you are familiar with measurements, tag sets, and retention policies. You can move on to get familiar with another key concept: series. In TSDB for InfluxDB®, a series is a collection of data points that share a retention policy, a measurement, and a tag set. The sample data has four series.

Arbitrary series number Retention policy Measurement Tag set
Series 1 autogen census location = 1, scientist = langstroth
Series 2 autogen census location = 2, scientist = langstroth
Series 3 autogen census location = 1, scientist = perpetua
Series 4 autogen census location = 2, scientist = perpetua
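
The series in the preceding table can be derived by grouping the rows of the sample data by retention policy, measurement, and tag set. The following Python sketch shows this grouping. The tuple-based series key is an illustrative assumption, not the internal representation used by TSDB for InfluxDB®.

```python
from collections import defaultdict

RETENTION_POLICY = "autogen"
MEASUREMENT = "census"

# (time, location, scientist) for the eight rows of the sample data.
ROWS = [
    ("2015-08-18T00:00:00Z", "1", "langstroth"),
    ("2015-08-18T00:00:00Z", "1", "perpetua"),
    ("2015-08-18T00:06:00Z", "1", "langstroth"),
    ("2015-08-18T00:06:00Z", "1", "perpetua"),
    ("2015-08-18T05:54:00Z", "2", "langstroth"),
    ("2015-08-18T06:00:00Z", "2", "langstroth"),
    ("2015-08-18T06:06:00Z", "2", "perpetua"),
    ("2015-08-18T06:12:00Z", "2", "perpetua"),
]

# Rows that share a retention policy, a measurement, and a tag set fall into the same series.
series = defaultdict(list)
for time, location, scientist in ROWS:
    tag_set = (("location", location), ("scientist", scientist))
    series[(RETENTION_POLICY, MEASUREMENT, tag_set)].append(time)

print(len(series))  # 4
```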

We recommend that you understand the concept of a series before you design your data schema and use TSDB for InfluxDB® to process data.

A point is the field set that is associated with a specific timestamp in a specific series. The following example shows a point:

  name: census
  -----------------
  time butterflies honeybees location scientist
  2015-08-18T00:00:00Z 1 30 1 perpetua

The point in this example belongs to series 3 where the measurement is census, and the tag set is location = 1, scientist = perpetua. The retention policy for this series is autogen. The timestamp of the point is 2015-08-18T00:00:00Z.
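
To make the relationship concrete, the following Python sketch addresses a point by its series and its timestamp, with the field set as the stored value. The SeriesKey type and the dictionary layout are hypothetical and serve only to illustrate the definition. Two rows of the sample data share the timestamp 2015-08-18T00:00:00Z but belong to different series, so they are two distinct points.

```python
from typing import NamedTuple


class SeriesKey(NamedTuple):
    retention_policy: str
    measurement: str
    tag_set: tuple  # (tag key, tag value) pairs


langstroth_1 = SeriesKey("autogen", "census", (("location", "1"), ("scientist", "langstroth")))
perpetua_1 = SeriesKey("autogen", "census", (("location", "1"), ("scientist", "perpetua")))

# A point is addressed by (series, timestamp); the value is its field set.
points = {
    (langstroth_1, "2015-08-18T00:00:00Z"): {"butterflies": 12, "honeybees": 23},
    (perpetua_1, "2015-08-18T00:00:00Z"): {"butterflies": 1, "honeybees": 30},
}

print(points[(perpetua_1, "2015-08-18T00:00:00Z")])  # {'butterflies': 1, 'honeybees': 30}
```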

All of the elements introduced in this topic, such as series, tag sets, field sets, and points, are stored in databases. The sample data is stored in the my_database database. A TSDB for InfluxDB® database is similar to a traditional database and serves as a logical container for users, retention policies, continuous queries, and time series data.

A database supports multiple users, continuous queries, retention policies, and measurements. TSDB for InfluxDB® is a schemaless database service. This allows you to easily add measurements, tags, and fields at any time. TSDB for InfluxDB® is designed to provide an easy and efficient method for you to process time series data.

Now you have read through this topic and familiarized yourself with the key concepts and terms of TSDB for InfluxDB®. If you are new to TSDB for InfluxDB®, we recommend that you view the following topics: Quick start, Use the HTTP API to write data, and Use the HTTP API to query data. TSDB for InfluxDB® strives to deliver excellent user experience.


InfluxDB® is a trademark registered by InfluxData, which is not affiliated with, and does not endorse, TSDB for InfluxDB®.