Before you use TSDB for InfluxDB®, we recommend that you familiarize yourself with its key concepts and terms. This topic introduces them, and the following table lists the terms that are covered.
|Database||Field key||Field set|
|Retention policy||Series||Tag key|
|Tag set||Tag value||Timestamp|
For more information, see Terms.
The following table lists the sample data that is used in this topic. The data is not collected from actual use cases, but it serves as a useful reference for working with TSDB for InfluxDB®. The data shows the number of butterflies and honeybees that are counted by two scientists, langstroth and perpetua, in location 1 and location 2. The two scientists counted the butterflies and honeybees over the time period from August 18, 2015 at midnight to August 18, 2015 at 6:12 in the morning. In this example, the data is stored in a database named my_database and the autogen retention policy is used. In the data, the measurement is census. Timestamps are stored in the time column. Field keys are butterflies and honeybees, and field values are the values in the butterflies and honeybees columns. Tag keys are location and scientist, and tag values are the values in the location and scientist columns.
This section explains the sample data in TSDB for InfluxDB®.
TSDB for InfluxDB® is a time series database service, so this section starts with time. The sample data contains a time column. All data in TSDB for InfluxDB® has this column.
The time column stores timestamps. Each timestamp shows the date and time that are associated with specific data. Timestamps are in UTC and use the RFC 3339 format, for example 2015-08-18T00:00:00Z.
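To make the timestamp format concrete, the following minimal Python sketch parses an RFC 3339 UTC timestamp from the sample data and converts it to seconds since the Unix epoch. The helper names are hypothetical, and the sketch assumes whole-second timestamps with a trailing Z; it is not part of TSDB for InfluxDB®.

```python
from datetime import datetime, timezone

RFC3339_UTC = "%Y-%m-%dT%H:%M:%SZ"  # assumed shape: whole seconds, "Z" suffix


def parse_rfc3339_utc(text: str) -> datetime:
    """Parse a timestamp such as 2015-08-18T00:00:00Z into an aware UTC datetime."""
    return datetime.strptime(text, RFC3339_UTC).replace(tzinfo=timezone.utc)


def format_rfc3339_utc(moment: datetime) -> str:
    """Format an aware datetime back into the same RFC 3339 UTC shape."""
    return moment.astimezone(timezone.utc).strftime(RFC3339_UTC)


start = parse_rfc3339_utc("2015-08-18T00:00:00Z")
epoch_seconds = int(start.timestamp())
print(epoch_seconds)
print(format_rfc3339_utc(start))
```

Timestamps with fractional seconds or a numeric offset would need a more general parser than this sketch provides.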
The two columns named butterflies and honeybees are fields. Fields consist of field keys and field values. The butterflies and honeybees field keys are stored as strings. For the butterflies field key, the field values are 12 to 7 from top to bottom in the preceding table and indicate the number of butterflies. For the honeybees field key, the field values are 23 to 22 from top to bottom in the preceding table and indicate the number of honeybees.
Field values are your data, and can be strings, floating-point numbers, integers, or Boolean values. Because TSDB for InfluxDB® is a time series database service, each field value is always associated with a timestamp. The field values in the sample data are the numbers in the butterflies and honeybees columns.
A field set is a collection of field key-value pairs. The sample data contains the following eight field sets:
* butterflies = 12 honeybees = 23
* butterflies = 1 honeybees = 30
* butterflies = 11 honeybees = 28
* butterflies = 3 honeybees = 28
* butterflies = 2 honeybees = 11
* butterflies = 1 honeybees = 10
* butterflies = 8 honeybees = 23
* butterflies = 7 honeybees = 22
Fields are a required element in the data structure of TSDB for InfluxDB®: you cannot store data without fields. Note that fields are not indexed. If you use field values as filter conditions for queries, the system must scan all the values that match the other conditions in the queries. As a result, queries based on field values take longer than tag-based queries. Tags are described in the next part. In general, do not store frequently queried metadata in fields.
The last two columns, location and scientist, are tags in the sample data. Tags consist of tag keys and tag values. Tag keys and tag values are stored as strings that record metadata. In the sample data, the tag keys are location and scientist. The location tag key has two tag values: 1 and 2. The scientist tag key also has two tag values: langstroth and perpetua.
Tag sets are the different combinations of tag key-value pairs. The sample data has the following four tag sets:
* location = 1, scientist = langstroth
* location = 2, scientist = langstroth
* location = 1, scientist = perpetua
* location = 2, scientist = perpetua
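The four tag sets listed above are the combinations of the two location values with the two scientist values. As a minimal sketch, the same list can be derived in Python; the variable names are illustrative only.

```python
from itertools import product

# Tag keys and their tag values, as described in this topic (tag values are strings).
tag_values = {
    "location": ["1", "2"],
    "scientist": ["langstroth", "perpetua"],
}

keys = sorted(tag_values)
# Each tag set is one combination of a value for every tag key.
tag_sets = [
    dict(zip(keys, combo))
    for combo in product(*(tag_values[key] for key in keys))
]

for tag_set in tag_sets:
    print(", ".join(f"{key} = {value}" for key, value in tag_set.items()))
```

Each additional tag key multiplies the number of possible combinations, which is worth keeping in mind when you decide what to store as a tag.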
In TSDB for InfluxDB®, tags are optional: you do not need to use tags in your data structure. However, unlike fields, tags are indexed, so queries that use tags as filter conditions take less time than those that use field values as filter conditions. Therefore, tags are suitable for storing metadata that is frequently queried.
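The difference between an indexed lookup and a scan can be sketched with a toy in-memory model. This is an illustration of the general idea only, not how TSDB for InfluxDB® implements its index. The rows reuse values from this topic, but the pairing of field values with tags in rows other than the perpetua row is assumed for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Illustrative rows: tags are strings, fields are numbers.
rows = [
    {"tags": {"location": "1", "scientist": "langstroth"},
     "fields": {"butterflies": 12, "honeybees": 23}},
    {"tags": {"location": "1", "scientist": "perpetua"},
     "fields": {"butterflies": 1, "honeybees": 30}},
    {"tags": {"location": "1", "scientist": "langstroth"},
     "fields": {"butterflies": 11, "honeybees": 28}},
]

# Toy inverted index: (tag key, tag value) -> positions of matching rows.
tag_index = defaultdict(set)
for position, row in enumerate(rows):
    for key, value in row["tags"].items():
        tag_index[(key, value)].add(position)


def find_by_tag(key, value):
    """Look up matching rows directly in the index, without visiting every row."""
    return sorted(tag_index.get((key, value), set()))


def find_by_field(key, value):
    """Fields have no index here, so every row must be inspected."""
    return [pos for pos, row in enumerate(rows) if row["fields"].get(key) == value]


print(find_by_tag("scientist", "langstroth"))
print(find_by_field("honeybees", 23))
```

The tag lookup cost depends on the number of matches, while the field lookup cost grows with the total number of rows, which is the reason that frequently filtered metadata belongs in tags.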
Importance of indexing: Schema use case
Assume that most of your queries use the field values of the butterflies and honeybees field keys as filter conditions, for example:
SELECT * FROM "census" WHERE "butterflies" = 1
SELECT * FROM "census" WHERE "honeybees" = 23
Fields are not indexed. Therefore, TSDB for InfluxDB® scans every value of the butterflies field key in the first query and every value of the honeybees field key in the second query before it returns the query results. This prolongs the response time of queries, especially when you query large amounts of data. To improve query performance, you can rearrange your schema by changing the butterflies and honeybees fields to tags and the location and scientist tags to fields. The following table describes the sample data that has the new schema.
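Note that butterflies and honeybees are tags in this example. If you run the preceding queries again, TSDB for InfluxDB® does not need to scan every value of butterflies and honeybees, so the queries return faster. Because tag values are strings, the values in the filter conditions must then be written as string literals, such as "butterflies" = '1'.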
A measurement is a container for tags, fields, and the time column. The measurement name describes the data that is stored in the associated fields. A measurement name is a string. For SQL users, a measurement is similar in concept to a table. The sample data has only one measurement: census. The measurement name census indicates that the field values record the number of butterflies and honeybees instead of sizes, directions, or happiness indexes.
A measurement can belong to more than one retention policy. A retention policy specifies two parameters: DURATION and REPLICATION. The DURATION parameter specifies how long TSDB for InfluxDB® retains data. The REPLICATION parameter specifies how many copies of the data are stored in a cluster.
Note: Replication factors are not applicable to single-node instances.
In the sample data, all data in the census measurement belongs to the autogen retention policy. TSDB for InfluxDB® automatically creates the autogen retention policy. This retention policy retains data permanently, and its replication factor is set to 1.
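The effect of the DURATION parameter can be sketched as a simple age check. The function below is a hypothetical illustration, with None standing in for a policy such as autogen that retains data permanently. How and when the service actually removes expired data is not described in this topic, so treat this as a conceptual model only.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_expired(point_time: datetime, now: datetime,
               duration: Optional[timedelta]) -> bool:
    """Return True if a point is older than the retention duration.

    A duration of None models a policy that retains data permanently.
    """
    if duration is None:
        return False
    return now - point_time > duration


point_time = datetime(2015, 8, 18, 0, 0, 0, tzinfo=timezone.utc)
now = datetime(2015, 8, 19, 0, 0, 1, tzinfo=timezone.utc)

print(is_expired(point_time, now, timedelta(days=1)))  # older than one day
print(is_expired(point_time, now, None))               # permanent retention
```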
Now you are familiar with measurements, tag sets, and retention policies. You can move on to get familiar with another key concept: series. In TSDB for InfluxDB®, a series is a collection of data points that share a retention policy, a measurement, and a tag set. The sample data has four series.
|Arbitrary series number||Retention policy||Measurement||Tag set|
|Series 1||autogen||census||location = 1, scientist = langstroth|
|Series 2||autogen||census||location = 2, scientist = langstroth|
|Series 3||autogen||census||location = 1, scientist = perpetua|
|Series 4||autogen||census||location = 2, scientist = perpetua|
We recommend that you understand the concept of a series before you design your data schema and use TSDB for InfluxDB® to process data.
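As a minimal sketch of the definition of a series, the following Python code groups points by the combination of retention policy, measurement, and tag set. The series_key helper is hypothetical. The rows reuse values from this topic, but the pairing of timestamps, field values, and tags in the langstroth rows is assumed for illustration.

```python
from collections import defaultdict

# Illustrative points; all belong to the autogen retention policy and the census measurement.
points = [
    {"time": "2015-08-18T00:00:00Z",
     "tags": {"location": "1", "scientist": "langstroth"},
     "fields": {"butterflies": 12, "honeybees": 23}},
    {"time": "2015-08-18T00:00:00Z",
     "tags": {"location": "1", "scientist": "perpetua"},
     "fields": {"butterflies": 1, "honeybees": 30}},
    {"time": "2015-08-18T00:06:00Z",
     "tags": {"location": "1", "scientist": "langstroth"},
     "fields": {"butterflies": 11, "honeybees": 28}},
]


def series_key(retention_policy, measurement, tags):
    """Build a hashable key from the three parts that define a series."""
    return (retention_policy, measurement, tuple(sorted(tags.items())))


series = defaultdict(list)
for point in points:
    series[series_key("autogen", "census", point["tags"])].append(point)

for key, members in series.items():
    print(key, len(members))
```

Field values play no part in the key: two points with different field sets still belong to the same series if they share a retention policy, a measurement, and a tag set.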
A point is the field set that a series contains at a given timestamp. The following example shows a point:
time butterflies honeybees location scientist
2015-08-18T00:00:00Z 1 30 1 perpetua
The point in this example belongs to series 3, where the measurement is census and the tag set is location = 1, scientist = perpetua. The retention policy for this series is autogen. The timestamp of the point is 2015-08-18T00:00:00Z.
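To show how the parts of a point fit together, the following sketch serializes the example point into a single text line. It assumes the InfluxDB line protocol layout (measurement and tags, then fields, then a nanosecond timestamp), which this topic does not cover. The to_line_protocol helper is hypothetical, handles only integer fields, and does not escape special characters.

```python
from datetime import datetime, timezone


def to_line_protocol(measurement, tags, fields, rfc3339_time):
    """Serialize one point as: measurement,tag_set field_set timestamp_ns."""
    # Tags are written as key=value pairs, sorted by key.
    tag_part = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    # Integer field values carry an "i" suffix in this format.
    field_part = ",".join(f"{key}={value}i" for key, value in fields.items())
    moment = datetime.strptime(rfc3339_time, "%Y-%m-%dT%H:%M:%SZ")
    seconds = int(moment.replace(tzinfo=timezone.utc).timestamp())
    return f"{measurement},{tag_part} {field_part} {seconds * 1_000_000_000}"


line = to_line_protocol(
    "census",
    {"location": "1", "scientist": "perpetua"},
    {"butterflies": 1, "honeybees": 30},
    "2015-08-18T00:00:00Z",
)
print(line)
```

The retention policy and the database are not part of the line itself; in this sketch they are assumed to be chosen separately when the data is written.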
All of the elements introduced in this topic, such as series, tag sets, field sets, and points, are stored in databases. The sample data is stored in the my_database database. A TSDB for InfluxDB® database is similar to a traditional database, and serves as a logical container for users, retention policies, continuous queries, and time series data.
A database supports multiple users, continuous queries, retention policies, and measurements. TSDB for InfluxDB® is a schemaless database service. This allows you to easily add measurements, tags, and fields at any time. TSDB for InfluxDB® is designed to provide an easy and efficient method for you to process time series data.
Now you have read through this topic and familiarized yourself with the key concepts and terms of TSDB for InfluxDB®. If you are new to TSDB for InfluxDB®, we recommend that you view the following topics: Quick start, Use the HTTP API to write data, and Use the HTTP API to query data. TSDB for InfluxDB® strives to deliver excellent user experience.
InfluxDB® is a trademark registered by InfluxData, which is not affiliated with, and does not endorse, TSDB for InfluxDB®.