Before you dive into Time Series Database (TSDB) for InfluxDB®, we recommend that you familiarize yourself with specific key concepts of databases. This topic introduces key concepts and common terms that are related to TSDB for InfluxDB®. The following table lists all the terms that are covered in this topic. We recommend that you read through this topic to fully understand TSDB for InfluxDB®.
database | field key | field set |
---|---|---|
field value | measurement | point |
retention policy | series | tag key |
tag set | tag value | timestamp |
For more information, see Terms.
Sample data
This section provides an example in which the following sample data is used. The sample data is not collected from actual use cases, but it serves as a valuable reference for the setup of TSDB for InfluxDB®. The sample data shows the numbers of butterflies and honeybees that are counted by two scientists, langstroth and perpetua, in location 1 and location 2 during the time period from 00:00 on August 18, 2015 to 06:12 on August 18, 2015. In this example, the sample data is stored in a database named my_database, and the autogen data retention policy is used. In the sample data, the measurement is census. The timestamps are stored in the time column. Field keys are butterflies and honeybees, and field values are the values in the butterflies and honeybees columns. Tag keys are location and scientist, and tag values are the values in the location and scientist columns.
name: census
time | butterflies | honeybees | location | scientist |
---|---|---|---|---|
2015-08-18T00:00:00Z | 12 | 23 | 1 | langstroth |
2015-08-18T00:00:00Z | 1 | 30 | 1 | perpetua |
2015-08-18T00:06:00Z | 11 | 28 | 1 | langstroth |
2015-08-18T00:06:00Z | 3 | 28 | 1 | perpetua |
2015-08-18T05:54:00Z | 2 | 11 | 2 | langstroth |
2015-08-18T06:00:00Z | 1 | 10 | 2 | langstroth |
2015-08-18T06:06:00Z | 8 | 23 | 2 | perpetua |
2015-08-18T06:12:00Z | 7 | 22 | 2 | perpetua |
Description
This section explains the sample data in TSDB for InfluxDB®.
TSDB for InfluxDB® is a time series database service. Therefore, this section starts with time. The preceding sample data contains a column named time. All data in TSDB for InfluxDB® has this column. The time column stores timestamps. Each timestamp shows the date and time that are associated with specific data. Timestamps are in UTC and use the RFC 3339 format.
The butterflies and honeybees columns are fields. A field consists of a field key and a field value. In the sample data, butterflies and honeybees are field keys. Field keys are strings. The butterflies field key shows the number of butterflies, and the corresponding field values are 12, 1, 11, 3, 2, 1, 8, and 7 from top to bottom. The honeybees field key shows the number of honeybees, and the corresponding field values are 23, 30, 28, 28, 11, 10, 23, and 22 from top to bottom.
Field values are actual data. They can be strings, floating-point numbers, integers, and Boolean values. A field value is always associated with a timestamp because TSDB for InfluxDB® is a time series database service. The sample data contains the following field values:
12 23
1 30
11 28
3 28
2 11
1 10
8 23
7 22
A field set is a collection of field key-value pairs. The preceding sample data contains eight field sets.
* butterflies = 12 honeybees = 23
* butterflies = 1 honeybees = 30
* butterflies = 11 honeybees = 28
* butterflies = 3 honeybees = 28
* butterflies = 2 honeybees = 11
* butterflies = 1 honeybees = 10
* butterflies = 8 honeybees = 23
* butterflies = 7 honeybees = 22
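The eight field sets listed above can be reproduced with a minimal Python sketch. The lists simply copy the two field columns of the sample table; the variable names are illustrative only.

```python
# Field columns of the census sample data, in table order (top to bottom).
butterflies = [12, 1, 11, 3, 2, 1, 8, 7]
honeybees = [23, 30, 28, 28, 11, 10, 23, 22]

# A field set is the collection of field key-value pairs stored with one point.
field_sets = [
    {"butterflies": b, "honeybees": h}
    for b, h in zip(butterflies, honeybees)
]

for field_set in field_sets:
    print(field_set)
```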
Fields are a required element of the data structure in TSDB for InfluxDB®. In TSDB for InfluxDB®, you must specify fields to store data. Take note that fields in TSDB for InfluxDB® are not indexed. If you use field values as filters for a query, the system must scan all the values that match the other conditions in the query. As a result, queries that use field values as filters require a longer response time than queries that use tags as filters. Tags are described later in this section. In general, fields should not contain metadata that is frequently queried.
The location and scientist columns in the sample data are tags. A tag consists of a tag key and a tag value. Tag keys and tag values are stored as strings that record metadata. In the sample data, the tag keys are location and scientist. The location tag key has two tag values: 1 and 2. The scientist tag key also has two tag values: langstroth and perpetua.
Tag sets are different combinations of tag key-value pairs. The sample data contains four tag sets.
* location = 1, scientist = langstroth
* location = 2, scientist = langstroth
* location = 1, scientist = perpetua
* location = 2, scientist = perpetua
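As a quick check of the count, the distinct tag sets can be derived from the tag columns of the sample table. This is a minimal Python sketch; tag values are written as strings because tags are stored as strings.

```python
# (location, scientist) tag values of the eight sample rows, in table order.
rows = [
    ("1", "langstroth"), ("1", "perpetua"),
    ("1", "langstroth"), ("1", "perpetua"),
    ("2", "langstroth"), ("2", "langstroth"),
    ("2", "perpetua"), ("2", "perpetua"),
]

# A tag set is one combination of tag key-value pairs; a set removes repeats.
tag_sets = sorted({(("location", loc), ("scientist", sci)) for loc, sci in rows})

for tag_set in tag_sets:
    print(", ".join(f"{key} = {value}" for key, value in tag_set))
```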
In TSDB for InfluxDB®, tags are optional. You do not need to use tags in your data structure. However, you can benefit from using tags because tags are indexed. This makes queries that use tags as filters run faster than queries that use field values as filters. Therefore, tags are suitable for storing metadata that is frequently queried.
Why indexing is important: A schema use case
This section provides a use case in which most of your queries use the values of the butterflies and honeybees fields as filters, such as SELECT * FROM "census" WHERE "butterflies" = 1 and SELECT * FROM "census" WHERE "honeybees" = 23.
No indexes are created on the fields. Therefore, TSDB for InfluxDB® scans all values of butterflies in the first query and all values of honeybees in the second query. Then, TSDB for InfluxDB® returns the query results. This prolongs the response time of queries, especially when you run queries based on large amounts of data. To optimize query performance, you can rearrange your schema by changing the butterflies and honeybees fields to tags and the location and scientist tags to fields.
name: census
time | location | scientist | butterflies | honeybees |
---|---|---|---|---|
2015-08-18T00:00:00Z | 1 | langstroth | 12 | 23 |
2015-08-18T00:00:00Z | 1 | perpetua | 1 | 30 |
2015-08-18T00:06:00Z | 1 | langstroth | 11 | 28 |
2015-08-18T00:06:00Z | 1 | perpetua | 3 | 28 |
2015-08-18T05:54:00Z | 2 | langstroth | 2 | 11 |
2015-08-18T06:00:00Z | 2 | langstroth | 1 | 10 |
2015-08-18T06:06:00Z | 2 | perpetua | 8 | 23 |
2015-08-18T06:12:00Z | 2 | perpetua | 7 | 22 |
This way, when you rerun the preceding queries, TSDB for InfluxDB® does not need to scan every value in butterflies and honeybees before query results can be returned.
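The difference between scanning and using an index can be illustrated with a conceptual Python sketch. It is not how the TSDB for InfluxDB® storage engine is implemented; the functions scan and build_index are hypothetical and only contrast the two lookup strategies on the butterflies column.

```python
from collections import defaultdict

# butterflies values of the eight sample rows, in table order.
butterflies = [12, 1, 11, 3, 2, 1, 8, 7]


def scan(values, wanted):
    """Unindexed lookup: every value is inspected on every query."""
    return [position for position, value in enumerate(values) if value == wanted]


def build_index(values):
    """Indexed lookup: map each distinct value to its row positions once."""
    index = defaultdict(list)
    for position, value in enumerate(values):
        index[value].append(position)
    return index


index = build_index(butterflies)

# Both strategies find the same rows; the index answers without a full pass.
rows_by_scan = scan(butterflies, 1)
rows_by_index = index[1]
```

With only eight rows the difference is negligible, but the scan cost grows with the number of stored values, whereas the index lookup does not need to visit rows that do not match.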
A measurement is used as a container for tags, fields, and the time column. The name of the measurement describes the data that is stored in the associated fields. A measurement name is a string. For SQL users, a measurement is similar in concept to a table. In the sample data, only one measurement exists. The measurement is named census. The census measurement shows that field values record the numbers of butterflies and honeybees, instead of sizes, directions, or happiness indexes.
A measurement can belong to more than one retention policy. A retention policy defines the amount of time for which TSDB for InfluxDB® retains data and the number of copies for each data point in a cluster. You can use the DURATION clause to set a retention period and use the REPLICATION clause to specify the number of data copies.
Replication factors are not applicable to standalone instances.
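The following Python sketch shows where the DURATION and REPLICATION clauses appear in an InfluxQL CREATE RETENTION POLICY statement. The policy name one_week and the helper function are hypothetical, and whether your instance accepts this statement through the query interface or only through the console is an assumption to verify against the product documentation.

```python
def create_retention_policy(name: str, database: str, duration: str,
                            replication: int = 1) -> str:
    """Build an InfluxQL CREATE RETENTION POLICY statement as a string."""
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return (
        f'CREATE RETENTION POLICY "{name}" ON "{database}" '
        f"DURATION {duration} REPLICATION {replication}"
    )


# Hypothetical policy that keeps data in my_database for seven days.
statement = create_retention_policy("one_week", "my_database", "7d")
print(statement)
```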
In the sample data, all data in the census measurement belongs to the autogen retention policy. TSDB for InfluxDB® automatically creates the autogen retention policy. This retention policy allows you to permanently store data, and the replication factor for this retention policy is set to 1.
In TSDB for InfluxDB®, a series is a collection of data points that belong to the same retention policy and share the same measurement and tag set. The preceding sample data consists of four series.
Arbitrary series number | Retention policy | Measurement | Tag set |
---|---|---|---|
series 1 | autogen | census | location = 1, scientist = langstroth |
series 2 | autogen | census | location = 2, scientist = langstroth |
series 3 | autogen | census | location = 1, scientist = perpetua |
series 4 | autogen | census | location = 2, scientist = perpetua |
We recommend that you understand series before you design your data schema and use TSDB for InfluxDB® to process data.
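To make the definition concrete, the following Python sketch groups the sample points by retention policy, measurement, and tag set. The grouping key is an illustration of the definition above, not the internal series key format of TSDB for InfluxDB®.

```python
from collections import defaultdict

# (time, location, scientist) for each of the eight sample points.
points = [
    ("2015-08-18T00:00:00Z", "1", "langstroth"),
    ("2015-08-18T00:00:00Z", "1", "perpetua"),
    ("2015-08-18T00:06:00Z", "1", "langstroth"),
    ("2015-08-18T00:06:00Z", "1", "perpetua"),
    ("2015-08-18T05:54:00Z", "2", "langstroth"),
    ("2015-08-18T06:00:00Z", "2", "langstroth"),
    ("2015-08-18T06:06:00Z", "2", "perpetua"),
    ("2015-08-18T06:12:00Z", "2", "perpetua"),
]

# A series is identified by retention policy + measurement + tag set.
series = defaultdict(list)
for time, location, scientist in points:
    key = ("autogen", "census", (("location", location), ("scientist", scientist)))
    series[key].append(time)

for (policy, measurement, tag_set), times in series.items():
    print(policy, measurement, dict(tag_set), len(times), "points")
```

Each of the four series in the sample data holds two points.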
A data point consists of a set of fields that are associated with the same timestamp in the same series. The following data provides an example of data points:
name: census
time | butterflies | honeybees | location | scientist |
---|---|---|---|---|
2015-08-18T00:00:00Z | 1 | 30 | 1 | perpetua |
In this example, the retention policy is autogen, the measurement is census, and the tag set is location = 1, scientist = perpetua. The timestamp of the data point is 2015-08-18T00:00:00Z.
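For illustration, the same data point can be written in the InfluxDB line protocol text format, which is assumed here to be the format that the write interface accepts. The helper to_line_protocol is hypothetical; it handles only integer field values and does not escape spaces or commas.

```python
from datetime import datetime, timezone


def to_line_protocol(measurement, tags, fields, time_text):
    """Serialize one point with integer fields (illustrative helper, no escaping)."""
    moment = datetime.strptime(time_text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    timestamp_ns = int(moment.timestamp()) * 1_000_000_000  # nanosecond precision
    # Tags are sorted by key; the "i" suffix marks an integer field value.
    tag_part = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={value}i" for key, value in fields.items())
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "census",
    {"location": "1", "scientist": "perpetua"},
    {"butterflies": 1, "honeybees": 30},
    "2015-08-18T00:00:00Z",
)
print(line)
```

The line carries the measurement, tag set, field set, and timestamp of the point. The database and retention policy are not part of the line itself.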
All elements that are described in this topic are stored in databases. The sample data is stored in the my_database database. TSDB for InfluxDB® databases are similar to traditional databases. A TSDB for InfluxDB® database serves as a logical container for users, retention policies, continuous queries, and time series data.
A database supports multiple users, continuous queries, retention policies, and measurements. TSDB for InfluxDB® provides schemaless databases. This allows you to add new measurements, tags, and fields in an easy manner based on your business requirements. TSDB for InfluxDB® provides an easy and efficient method for you to process time series data.
If you are new to TSDB for InfluxDB®, we recommend that you view the following topics: Quick start, Use the HTTP API to write data, and Use the HTTP API to query data.
InfluxDB® is a trademark registered by InfluxData, which is not affiliated with, and does not endorse, TSDB for InfluxDB®.