Key concepts

Before you dive into Time Series Database (TSDB) for InfluxDB®, we recommend that you familiarize yourself with its key concepts. This topic introduces the key concepts and common terms that are related to TSDB for InfluxDB®. The following table lists the terms that are covered in this topic.

database	field key	field set
field value	measurement	point
retention policy	series	tag key
tag set	tag value	timestamp

For more information, see Terms.

Sample data

This section uses the following sample data. The data is fictional, but it serves as a useful reference for working with TSDB for InfluxDB®. It shows the numbers of butterflies and honeybees counted by two scientists, langstroth and perpetua, at location 1 and location 2 from 00:00 to 06:12 (UTC) on August 18, 2015. The data is stored in a database named my_database and uses the autogen retention policy. The measurement is census, and the timestamps are stored in the time column. The field keys are butterflies and honeybees, and the field values are the values in those columns. The tag keys are location and scientist, and the tag values are the values in those columns.

name: census

time	butterflies	honeybees	location	scientist
2015-08-18T00:00:00Z	12	23	1	langstroth
2015-08-18T00:00:00Z	1	30	1	perpetua
2015-08-18T00:06:00Z	11	28	1	langstroth
2015-08-18T00:06:00Z	3	28	1	perpetua
2015-08-18T05:54:00Z	2	11	2	langstroth
2015-08-18T06:00:00Z	1	10	2	langstroth
2015-08-18T06:06:00Z	8	23	2	perpetua
2015-08-18T06:12:00Z	7	22	2	perpetua

Description

This section explains the sample data in TSDB for InfluxDB®.

TSDB for InfluxDB® is a time series database service, so this section starts with time. The preceding sample data contains a column named time. All data in TSDB for InfluxDB® has this column. The time column stores timestamps. Each timestamp shows the date and time that are associated with specific data. Timestamps are in UTC and are formatted according to RFC 3339.
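As an illustration of the timestamp format, the following minimal Python sketch parses RFC 3339 timestamps from the sample data and converts them to integer epoch values. The function name rfc3339_to_epoch_ns and the choice of nanoseconds as the unit are assumptions made for this example, and the parser handles only the whole-second, Z-suffixed form that appears in the sample data.

```python
from datetime import datetime, timezone


def rfc3339_to_epoch_ns(ts: str) -> int:
    """Convert a UTC timestamp such as 2015-08-18T00:00:00Z to epoch nanoseconds.

    Hypothetical helper: it accepts only whole seconds with a trailing Z.
    """
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp()) * 1_000_000_000


first = rfc3339_to_epoch_ns("2015-08-18T00:00:00Z")
last = rfc3339_to_epoch_ns("2015-08-18T06:12:00Z")

print(first)
# Length of the sampling window in minutes.
print((last - first) // 1_000_000_000 // 60)
```

The second value printed is the length of the sampling window, which covers the period from the first to the last row of the sample data.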

The butterflies and honeybees columns are fields. A field consists of a field key and a field value. In the sample data, butterflies and honeybees are field keys. Field keys are strings. The butterflies field key shows the number of butterflies, and the corresponding field values from top to bottom are 12, 1, 11, 3, 2, 1, 8, and 7. The honeybees field key shows the number of honeybees, and the corresponding field values from top to bottom are 23, 30, 28, 28, 11, 10, 23, and 22.

Field values are the actual data. They can be strings, floating-point numbers, integers, or Boolean values. A field value is always associated with a timestamp because TSDB for InfluxDB® is a time series database service.

A field set is a collection of field key-value pairs. The preceding sample data contains the following eight field sets:

* butterflies = 12   honeybees = 23
* butterflies = 1    honeybees = 30
* butterflies = 11   honeybees = 28
* butterflies = 3    honeybees = 28
* butterflies = 2    honeybees = 11
* butterflies = 1    honeybees = 10
* butterflies = 8    honeybees = 23
* butterflies = 7    honeybees = 22

Fields are a required element of the data structure in TSDB for InfluxDB®. You cannot store data without fields. Take note that fields are not indexed. If you use field values as filters for a query, the system must scan all the values that match the other conditions in the query. As a result, queries that filter on field values take longer than queries that filter on tags. Tags are described in the following part. In most cases, fields should not contain metadata that is frequently queried.

The location and scientist columns in the sample data are tags. A tag consists of a tag key and a tag value. Tag keys and tag values are stored as strings that record metadata. In the sample data, the tag keys are location and scientist. The location tag key has two tag values: 1 and 2. The scientist tag key also has two tag values: langstroth and perpetua.

A tag set is a combination of all the tag key-value pairs on a row. The sample data contains the following four tag sets:

* location = 1, scientist = langstroth
* location = 2, scientist = langstroth
* location = 1, scientist = perpetua
* location = 2, scientist = perpetua
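The four tag sets can be derived mechanically from the sample data. The following Python sketch is only an illustration: the rows list copies the location and scientist columns of the sample table, and the variable names are assumptions made for this example.

```python
# (location, scientist) tag values for the eight rows of the sample data, in table order.
rows = [
    ("1", "langstroth"), ("1", "perpetua"),
    ("1", "langstroth"), ("1", "perpetua"),
    ("2", "langstroth"), ("2", "langstroth"),
    ("2", "perpetua"), ("2", "perpetua"),
]

# A tag set is the full combination of tag key-value pairs on a row.
# Collecting the combinations into a set removes the duplicates.
tag_sets = sorted({(("location", loc), ("scientist", sci)) for loc, sci in rows})

for tag_set in tag_sets:
    print(", ".join(f"{key} = {value}" for key, value in tag_set))
```

The tag values are written as strings because tag values are always stored as strings, even when they look like numbers.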

In TSDB for InfluxDB®, tags are optional. You do not need to use tags in your data structure. However, you can benefit from using tags because tags are indexed. This makes queries that use tags as filters run faster than queries that use field values as filters. Therefore, tags are suitable for storing metadata that is frequently queried.

Why indexing is important: A schema use case

This section provides a use case in which most of your queries use the values of the butterflies and honeybees fields as filters, such as SELECT * FROM "census" WHERE "butterflies" = 1 and SELECT * FROM "census" WHERE "honeybees" = 23.

Fields are not indexed. Therefore, TSDB for InfluxDB® scans all values of butterflies in the first query and all values of honeybees in the second query before it returns the query results. This prolongs the response time of queries, especially when you query large amounts of data. To optimize query performance, you can rearrange your schema by changing the butterflies and honeybees fields to tags and the location and scientist tags to fields. The rearranged data looks like the following:

name: census

time	location	scientist	butterflies	honeybees
2015-08-18T00:00:00Z	1	langstroth	12	23
2015-08-18T00:00:00Z	1	perpetua	1	30
2015-08-18T00:06:00Z	1	langstroth	11	28
2015-08-18T00:06:00Z	1	perpetua	3	28
2015-08-18T05:54:00Z	2	langstroth	2	11
2015-08-18T06:00:00Z	2	langstroth	1	10
2015-08-18T06:06:00Z	2	perpetua	8	23
2015-08-18T06:12:00Z	2	perpetua	7	22

This way, when you rerun the preceding queries, TSDB for InfluxDB® does not need to scan every value in butterflies and honeybees before query results can be returned.
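The difference between filtering on an indexed tag and filtering on an unindexed field can be sketched with a toy model. The following Python code is not how TSDB for InfluxDB® is implemented internally. It is a simplified illustration under the original schema (location and scientist as tags), and the names tag_index, filter_by_tag, and filter_by_field are assumptions made for this example.

```python
from collections import defaultdict

# Sample data under the original schema:
# (location tag, scientist tag, butterflies field, honeybees field)
rows = [
    ("1", "langstroth", 12, 23),
    ("1", "perpetua", 1, 30),
    ("1", "langstroth", 11, 28),
    ("1", "perpetua", 3, 28),
    ("2", "langstroth", 2, 11),
    ("2", "langstroth", 1, 10),
    ("2", "perpetua", 8, 23),
    ("2", "perpetua", 7, 22),
]
points = [
    {"tags": {"location": loc, "scientist": sci},
     "fields": {"butterflies": b, "honeybees": h}}
    for loc, sci, b, h in rows
]

# Toy index: maps each (tag key, tag value) pair to the ids of the rows that carry it.
tag_index = defaultdict(list)
for row_id, point in enumerate(points):
    for key, value in point["tags"].items():
        tag_index[(key, value)].append(row_id)


def filter_by_tag(key, value):
    """Indexed lookup: only the matching row ids are touched."""
    return list(tag_index.get((key, value), []))


def filter_by_field(key, value):
    """No index: every point must be inspected."""
    return [i for i, p in enumerate(points) if p["fields"].get(key) == value]


print(filter_by_tag("scientist", "perpetua"))
print(filter_by_field("butterflies", 1))
```

In this model, the tag lookup cost depends on the number of matching rows, whereas the field filter always visits every point. Swapping tags and fields, as in the rearranged schema, moves butterflies and honeybees to the indexed side.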

A measurement is used as a container for tags, fields, and the time column. The name of the measurement describes the data that is stored in the associated fields. A measurement name is a string. For SQL users, a measurement is similar in concept to a table. In the sample data, only one measurement exists. The measurement is named census. The census measurement shows that field values record the numbers of butterflies and honeybees, instead of sizes, directions, or happiness indexes.

A measurement can belong to more than one retention policy. A retention policy defines the amount of time for which TSDB for InfluxDB® retains data and the number of copies for each data point in a cluster. You can use the DURATION clause to set a retention period and use the REPLICATION clause to specify the number of data copies.

Note

Replication factors are not applicable to standalone instances.

In the sample data, all data in the census measurement belongs to the autogen retention policy. TSDB for InfluxDB® automatically creates the autogen retention policy. This retention policy allows you to permanently store data, and the replication factor for this retention policy is set to 1.
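The effect of a retention period can be sketched as a simple age check. The following Python code is a simplified model, not the service's actual expiry mechanism: it checks each point individually, whereas the real service may enforce retention at a coarser granularity. The function name is_retained and the one-day policy are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_retained(point_time: datetime, now: datetime, duration: Optional[timedelta]) -> bool:
    """Return True if a point is still inside the retention period.

    duration=None models an infinite retention period, as with autogen.
    """
    if duration is None:
        return True
    return now - point_time <= duration


point_time = datetime(2015, 8, 18, 0, 0, tzinfo=timezone.utc)
now = datetime(2015, 8, 20, 0, 0, tzinfo=timezone.utc)

# autogen-like policy: data is kept permanently.
print(is_retained(point_time, now, None))
# Hypothetical one-day policy: a two-day-old point has expired.
print(is_retained(point_time, now, timedelta(days=1)))
```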

In TSDB for InfluxDB®, a series is a collection of data points that belong to the same retention policy and share the same measurement and tag set. The preceding sample data consists of four series.

Arbitrary series number	Retention policy	Measurement	Tag set
series 1	autogen	census	location = 1, scientist = langstroth
series 2	autogen	census	location = 2, scientist = langstroth
series 3	autogen	census	location = 1, scientist = perpetua
series 4	autogen	census	location = 2, scientist = perpetua
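The grouping of data points into series can be sketched as follows. This Python code is an illustration only. It builds a key from the retention policy, the measurement, and the tag set, and then groups the timestamps of the sample data by that key. The key layout and the variable names are assumptions made for this example.

```python
from collections import defaultdict

RETENTION_POLICY = "autogen"
MEASUREMENT = "census"

# (time, location tag, scientist tag) for each row of the sample data.
rows = [
    ("2015-08-18T00:00:00Z", "1", "langstroth"),
    ("2015-08-18T00:00:00Z", "1", "perpetua"),
    ("2015-08-18T00:06:00Z", "1", "langstroth"),
    ("2015-08-18T00:06:00Z", "1", "perpetua"),
    ("2015-08-18T05:54:00Z", "2", "langstroth"),
    ("2015-08-18T06:00:00Z", "2", "langstroth"),
    ("2015-08-18T06:06:00Z", "2", "perpetua"),
    ("2015-08-18T06:12:00Z", "2", "perpetua"),
]

# Points that share a retention policy, measurement, and tag set belong to one series.
series = defaultdict(list)
for time, location, scientist in rows:
    key = (RETENTION_POLICY, MEASUREMENT, ("location", location), ("scientist", scientist))
    series[key].append(time)

for key, times in series.items():
    print(key, len(times))
```

Each of the four series in the sample data holds two data points. Field values play no part in the key, so adding a new field does not create a new series, whereas adding a new tag value does.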

We recommend that you understand series before you design your data schema and use TSDB for InfluxDB® to process data.

A data point consists of the set of fields that are associated with the same timestamp in the same series. The following data provides an example of a data point:

name: census
-----------------
time                    butterflies  honeybees   location    scientist
2015-08-18T00:00:00Z    1            30          1           perpetua

In this example, the retention policy is autogen, the measurement is census, and the tag set is location = 1, scientist = perpetua. The timestamp of the data point is 2015-08-18T00:00:00Z.
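To show how the elements of this data point fit together, the following Python sketch renders it as a single line of text in the style of the InfluxDB line protocol (measurement, tag set, field set, timestamp). This topic does not describe the line protocol, so treat the format as an assumption and check the write API documentation before you rely on it. The helper to_line_protocol is hypothetical, handles integer field values only, and performs no escaping of special characters.

```python
def to_line_protocol(measurement, tags, fields, epoch_ns):
    """Render one data point as: measurement,tag_set field_set timestamp.

    Hypothetical helper: integer fields only, no escaping.
    """
    # Tags are written in sorted key order.
    tag_part = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    # The trailing i marks a field value as an integer.
    field_part = ",".join(f"{key}={value}i" for key, value in fields.items())
    return f"{measurement},{tag_part} {field_part} {epoch_ns}"


line = to_line_protocol(
    measurement="census",
    tags={"location": "1", "scientist": "perpetua"},
    fields={"butterflies": 1, "honeybees": 30},
    epoch_ns=1439856000 * 1_000_000_000,  # 2015-08-18T00:00:00Z in nanoseconds
)
print(line)
```

The retention policy and the database are not part of the line itself. In this sketch they are assumed to be chosen when the data is written.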

All elements that are described in this topic are stored in databases. The sample data is stored in the my_database database. TSDB for InfluxDB® databases are similar to traditional databases. A TSDB for InfluxDB® database serves as a logical container for users, retention policies, continuous queries, and time series data.

A database supports multiple users, continuous queries, retention policies, and measurements. TSDB for InfluxDB® databases are schemaless, so you can easily add new measurements, tags, and fields based on your business requirements. This makes TSDB for InfluxDB® an efficient way to process time series data.

If you are new to TSDB for InfluxDB®, we recommend that you view the following topics: Quick start, Use the HTTP API to write data, and Use the HTTP API to query data.

InfluxDB® is a trademark registered by InfluxData, which is not affiliated with, and does not endorse, TSDB for InfluxDB®.