This topic introduces the concept of continuous queries in LindormTSDB and describes the features that are related to continuous queries. LindormTSDB is a time series engine provided by Lindorm.
What is a continuous query?
Continuous queries are time series queries that are automatically and periodically executed within a time series engine.
In time series applications, timestamped data is written to databases in real time. In some cases, you want to periodically run a calculation on the time series data that matches specified conditions and save the results. For example, you may want to aggregate, at scheduled intervals, the time series data that was written within a specified time window. The following figure shows a use case in which continuous queries are performed.
Continuous queries are designed for these scenarios. In a time series engine, continuous queries provide a simplified stream computing capability: after a time window ends, the time series engine calculates the data that was written within that time window.
Use continuous queries
Manage continuous queries
LindormTSDB allows you to use SQL to manage continuous queries in databases.
Create a continuous query.
Create a continuous query for a specified database. For information about the specific SQL syntax, see CREATE CONTINUOUS QUERY.
Delete a continuous query.
Delete an existing continuous query from a specified database. For information about the specific SQL syntax, see DROP CONTINUOUS QUERY.
View information about continuous queries.
Query the metadata of existing continuous queries that are performed on the current database. For information about the specific SQL syntax, see SHOW CONTINUOUS QUERIES.
Schedule time windows for continuous queries
Continuous queries use the timestamps of local LindormTSDB nodes to determine when to perform calculations and use these timestamps as a reference to calculate time windows. A continuous query automatically runs to calculate data that is written within the previous window each time a new time window starts. For example, if the specified time window for a continuous query is 1 hour, the continuous query calculates data at the beginning of each hour. If a calculation is triggered at 20:00, the time range of the data to be calculated is [19:00:00, 20:00:00).
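The window arithmetic described above can be sketched in Python. This is only an illustration, not LindormTSDB code: the function name previous_window is hypothetical, and the sketch assumes that windows are aligned to the Unix epoch in UTC, which the text above does not state.

```python
from datetime import datetime, timedelta, timezone

def previous_window(trigger: datetime, interval: timedelta) -> tuple[datetime, datetime]:
    """Return the [start, end) range of the last window that ended at or before `trigger`."""
    # Assumption: window boundaries are multiples of `interval` counted from the Unix epoch.
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Round the trigger time down to the nearest window boundary.
    end = epoch + ((trigger - epoch) // interval) * interval
    start = end - interval
    return start, end

trigger = datetime(2024, 1, 1, 20, 0, 0, tzinfo=timezone.utc)
start, end = previous_window(trigger, timedelta(hours=1))
print(start.time(), end.time())  # 19:00:00 20:00:00
```

The range is half-open: a data point with the timestamp 20:00:00 belongs to the next window, not to this one.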
Although a continuous query starts to calculate data immediately after the end of the defined time window, it may take a period of time to save the calculation results to the destination table. The latency varies based on the amount of data that you want to query and the real-time loads on the instance in which the data is stored.
The accuracy of continuous queries depends on whether the raw data arrives in order. In some cases, data that belongs to the previous time window is still being written to LindormTSDB after a new time window starts. These out-of-order data records arrive after the calculation for the previous time window has run, so they are not included in the result of that calculation.
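The effect of out-of-order data can be illustrated with a small Python simulation. It is a sketch under simplifying assumptions: the records, the window_mean helper, and the use of plain second offsets for event and arrival times are all hypothetical and do not reflect how LindormTSDB tracks writes internally.

```python
# Each record is (event_time_s, arrival_time_s, value).
# Times are seconds counted from an arbitrary starting point.
records = [
    (10, 11, 20.0),
    (30, 31, 22.0),
    (50, 65, 30.0),  # belongs to window [0, 60) but arrives after the window closes
]

def window_mean(records, start, end, computed_at):
    """Mean of values with event time in [start, end) that had arrived by `computed_at`."""
    values = [v for t, arrived, v in records
              if start <= t < end and arrived <= computed_at]
    return sum(values) / len(values) if values else None

at_trigger = window_mean(records, 0, 60, computed_at=60)           # late record is missed
with_late = window_mean(records, 0, 60, computed_at=float("inf"))  # late record is included
print(at_trigger, with_late)  # 21.0 24.0
```

The value computed when the window closes (21.0) differs from the value over the complete data (24.0) because the third record had not arrived when the calculation ran.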
When a database is deleted, all continuous queries that are created for the database are deleted.
Use scenarios and examples
Common scenarios
Use continuous queries to achieve cost-efficient long-term data storage.
For scenarios in which a large amount of data is stored, storage costs become a concern for users. You can use continuous queries together with the time-to-live (TTL) or cold data archiving capability to reduce the cost of storing data that requires long-term storage. For more information about the TTL and cold data archiving capabilities, see Database management.
Use continuous queries to improve query performance.
If you directly run a downsampling query that spans a large time range or an aggregate query across time series on raw data, the query may take a long time to process. In this case, you can use a continuous query to precalculate the results, and then query the data from the result table when necessary. This improves both query latency and query throughput.
Examples
The following example shows how to use continuous queries and the TTL feature to achieve cost-efficient long-term data storage.
For example, your business application generates so much raw data that only the raw data of the last month can be retained. After the data is downsampled, it can be retained for a much longer period. In the following example, raw data is sampled every second and retained for 30 days. The data is downsampled to 1-minute intervals, and the downsampled data is retained for 365 days.
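As a rough check on the savings, the following Python sketch counts the data points stored per time series for 1-second raw data and for 1-minute downsampled data. It uses the retention periods that the statements in this example set (30 days and 365 days). The helper name points_per_series is hypothetical, and the count ignores compression and per-point storage overhead.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def points_per_series(retention_days: int, sample_period_s: int) -> int:
    """Number of data points one time series holds over the retention period."""
    return retention_days * SECONDS_PER_DAY // sample_period_s

raw = points_per_series(30, 1)              # 30 days of 1-second data
downsampled = points_per_series(365, 60)    # 365 days of 1-minute data
raw_full_year = points_per_series(365, 1)   # what 365 days of raw data would need

print(raw, downsampled, raw_full_year)  # 2592000 525600 31536000
```

Under these assumptions, a full year of 1-minute data takes about one fifth of the points that 30 days of raw data takes, and one sixtieth of the points that a full year of raw data would take.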
Execute the following statement to create a database named db_sensor_month to store raw data and set the maximum data retention period to 30 days:
CREATE DATABASE db_sensor_month WITH (ttl=30);
Execute the following statement to create a database named db_sensor_year to store the results of the continuous query and set the maximum data retention period to 365 days:
CREATE DATABASE db_sensor_year WITH (ttl=365);
Execute the following statement to create a table named sensor in the db_sensor_year database to save the calculation results:
CREATE TABLE db_sensor_year.sensor (
  device_id VARCHAR TAG,
  region VARCHAR TAG,
  time TIMESTAMP,
  temperature DOUBLE,
  humidity BIGINT
);
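Execute the following statement to create a continuous query named my_cq with a 1-minute time window to perform downsampling. The continuous query reads raw data from the db_sensor_month.sensor table. This example assumes that the table already exists and contains the same columns as the result table: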
CREATE CONTINUOUS QUERY db_sensor_year.my_cq WITH (INTERVAL='1m') AS
  INSERT INTO db_sensor_year.sensor(time, temperature, humidity, device_id, region)
  SELECT time, mean(temperature) AS temperature, mean(humidity) AS humidity, device_id, region
  FROM db_sensor_month.sensor
  SAMPLE BY 60s;
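To make the effect of the downsampling step concrete, the following Python sketch mimics what the statement computes: the mean temperature and humidity per device, region, and 60-second bucket. It illustrates the semantics only and is not how LindormTSDB runs the query. The function downsample_mean and the sample rows are hypothetical, and labeling each bucket with its start timestamp is an assumption.

```python
from collections import defaultdict

def downsample_mean(rows, bucket_s=60):
    """Average temperature and humidity per (device_id, region) and time bucket."""
    groups = defaultdict(list)
    for device_id, region, ts, temperature, humidity in rows:
        # Assumption: a bucket is labeled with the timestamp at which it starts.
        bucket_start = ts - ts % bucket_s
        groups[(device_id, region, bucket_start)].append((temperature, humidity))
    result = []
    for (device_id, region, bucket_start), samples in sorted(groups.items()):
        n = len(samples)
        result.append((device_id, region, bucket_start,
                       sum(t for t, _ in samples) / n,
                       sum(h for _, h in samples) / n))
    return result

# Rows are (device_id, region, time_s, temperature, humidity).
rows = [
    ("d1", "north", 0, 20.0, 40),
    ("d1", "north", 30, 22.0, 42),
    ("d1", "north", 60, 25.0, 50),
]
for row in downsample_mean(rows):
    print(row)
# ('d1', 'north', 0, 21.0, 41.0)
# ('d1', 'north', 60, 25.0, 50.0)
```

Three raw rows collapse into two result rows, one for each 60-second bucket that contains data.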