By Zhaofeng Zhou (Muluo)
KairosDB was originally forked from OpenTSDB 1.x, with the goal of building on the OpenTSDB code to meet new functional requirements. One of its refinements is support for pluggable storage engines. For example, H2 support facilitates local development and testing, whereas OpenTSDB is strongly coupled with HBase. In earlier versions, HBase was also KairosDB's primary storage engine. However, in subsequent storage optimizations, HBase was gradually replaced by Cassandra, making KairosDB the first time series database built on Cassandra. In the latest versions, HBase is no longer supported, because the storage optimizations rely on features that Cassandra provides but HBase does not.
The overall architecture is similar to that of OpenTSDB: both use a mature database as the underlying storage engine, and the main logic is only a thin layer on top of it. This logic layer is deployed as a stateless component that can be easily scaled horizontally.
In terms of functional differences, KairosDB builds on OpenTSDB 1.x either to optimize existing OpenTSDB features or to add features that OpenTSDB lacks. I'll outline some of the major functional differences I know about:
The main design feature of the OpenTSDB storage model is the use of UID encoding. This greatly saves storage space, and many queries are optimized with HBase filters that rely on the fixed byte length of UIDs. However, UID encoding also has many defects. First, a mapping table from metric/tagkey/tagvalue to UID needs to be maintained, and every data point write and read must be converted through it. The mapping table is usually cached in the TSD or the client, which increases memory consumption. Second, because of UID encoding, the number of distinct metrics, tagkeys, and tagvalues has an upper limit that depends on the number of bytes used by the UID, and conflicts may occur during UID allocation, which affects writing.
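To make the trade-off concrete, here is a minimal sketch of UID encoding. It is not OpenTSDB's code: the `UidTable` class and `encode_row_key` function are hypothetical names, the mapping table lives in memory rather than in HBase, and the 3-byte UID width and 4-byte base timestamp are assumptions chosen for illustration.

```python
from typing import Dict

UID_WIDTH = 3  # bytes per UID; an assumed width for this sketch


class UidTable:
    """In-memory stand-in for the string <-> UID mapping table."""

    def __init__(self, width: int = UID_WIDTH) -> None:
        self.width = width
        self.max_ids = 2 ** (8 * width)  # upper limit imposed by the UID width
        self.by_name: Dict[str, bytes] = {}
        self.by_uid: Dict[bytes, str] = {}

    def get_or_create(self, name: str) -> bytes:
        """Return the UID for a name, allocating a new one if needed."""
        uid = self.by_name.get(name)
        if uid is None:
            next_id = len(self.by_name) + 1
            if next_id >= self.max_ids:
                raise OverflowError("UID space exhausted")
            uid = next_id.to_bytes(self.width, "big")
            self.by_name[name] = uid
            self.by_uid[uid] = name
        return uid

    def name_of(self, uid: bytes) -> str:
        """Reverse lookup, needed on every read."""
        return self.by_uid[uid]


def encode_row_key(metrics: UidTable, tagks: UidTable, tagvs: UidTable,
                   metric: str, base_ts: int, tags: Dict[str, str]) -> bytes:
    """Build a fixed-width row key: metric UID, timestamp, tag UID pairs."""
    key = metrics.get_or_create(metric) + base_ts.to_bytes(4, "big")
    for tagk in sorted(tags):
        key += tagks.get_or_create(tagk) + tagvs.get_or_create(tags[tagk])
    return key
```

Every field in the resulting key has a fixed width, which is what makes byte-offset filters possible. The cost is visible too: each write and read goes through the mapping tables, and the number of distinct names is capped by `max_ids`.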
Essentially, the UID encoding optimization adopted by the OpenTSDB storage model mainly solves two problems:
To solve these two problems, KairosDB adopts a different method that doesn't require UID encoding, and these problems are avoided. Let's take a look at the storage model of KairosDB first. It is mainly composed of the following three tables:
<metric><timestamp><tagk1><tagv1><tagk2><tagv2>...<tagkn><tagvn>. The difference is that the metric, tagkey and tagvalue all store the original values, instead of UIDs.
The KairosDB storage model takes advantage of Cassandra's wide tables. In the underlying file storage format of HBase, each column is stored as a separate key-value pair, and the key contains the rowkey of the row. Therefore, each column in an HBase row stores the same rowkey repeatedly. This is the main reason why UID encoding can save a lot of storage space, and also the reason why OpenTSDB can use a compaction policy (merging all columns in a row into one column) to further reduce storage space after UID encoding. The underlying file storage format of Cassandra is different from that of HBase. Each column in a Cassandra row does not store the rowkey repeatedly, so UID encoding is not required. One way to reduce storage space in Cassandra is to reduce the number of rows, which is why KairosDB stores three weeks of data per row instead of one hour. For more information about the reasons for these two solutions, see HBase File Format and Cassandra File Format.
Using Cassandra's wide tables, even without UID encoding, the storage space is not much worse than OpenTSDB with UID encoding. The following is the official explanation:
For one we do not use IDs for strings. The string data (metric names and tags) are written to row keys and the appropriate indexes. Because Cassandra has much wider rows there are far fewer keys written to the database. Not much space is saved by using id's and by not using id's we avoid having to use any kind of locks across the cluster.
As mentioned, Cassandra has wider rows. The default row size in OpenTSDB HBase is 1 hour. Cassandra is set to 3 weeks.
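The effect of the wider rows can be illustrated with a little arithmetic. The sketch below is an assumption-laden illustration rather than KairosDB's code: the function names are hypothetical, timestamps are assumed to be in milliseconds, and the row key is shown as a plain tuple rather than the real byte layout.

```python
from typing import Dict, Tuple

HOUR_MS = 60 * 60 * 1000
ROW_WIDTH_MS = 3 * 7 * 24 * HOUR_MS  # three weeks, as in the quote above


def row_time(ts_ms: int, width_ms: int = ROW_WIDTH_MS) -> int:
    """Start of the row (time bucket) that a timestamp falls into."""
    return ts_ms - ts_ms % width_ms


def column_offset(ts_ms: int, width_ms: int = ROW_WIDTH_MS) -> int:
    """Offset of the data point inside its row, used as the column name."""
    return ts_ms - row_time(ts_ms, width_ms)


def row_key(metric: str, ts_ms: int, tags: Dict[str, str]) -> Tuple:
    """Row key built from raw strings, with no UID lookup involved."""
    return (metric, row_time(ts_ms), tuple(sorted(tags.items())))


def rows_spanned(start_ms: int, end_ms: int, width_ms: int) -> int:
    """Number of rows one series needs to cover [start_ms, end_ms]."""
    return end_ms // width_ms - start_ms // width_ms + 1


YEAR_MS = 365 * 24 * HOUR_MS
hourly_rows = rows_spanned(0, YEAR_MS - 1, HOUR_MS)
three_week_rows = rows_spanned(0, YEAR_MS - 1, ROW_WIDTH_MS)
```

For one series over a 365-day year, `hourly_rows` is 8760 while `three_week_rows` is 18, so the string-valued row key is written far less often, which is the point the quote makes.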
The query optimization method adopted is also different from that of OpenTSDB. The following is the entire process of querying within KairosDB:
Compared with OpenTSDB, which scans the data table directly to filter row keys, KairosDB can substantially reduce the amount of data scanned by using index tables. When the number of tagkey and tagvalue combinations under a metric is limited, query efficiency is greatly improved. KairosDB also provides a QueryPlugin mechanism that can be extended to index row keys with external components. For example, ElasticSearch or other indexing systems can be used, because indexing is the best query solution after all. This is also the biggest improvement of Heroic over KairosDB.
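The idea of consulting an index table before touching the data table can be sketched as follows. This is a simplified in-memory model, not KairosDB's implementation: `RowKeyIndex` and its methods are hypothetical names, and row keys are represented as `(metric, row_time, tags)` tuples.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

RowKey = Tuple[str, int, Tuple[Tuple[str, str], ...]]


class RowKeyIndex:
    """Maps each metric to the row keys that exist for it."""

    def __init__(self) -> None:
        self._by_metric: Dict[str, List[RowKey]] = defaultdict(list)

    def add(self, metric: str, row_time: int, tags: Dict[str, str]) -> RowKey:
        """Register a row key at write time (once per new row)."""
        key = (metric, row_time, tuple(sorted(tags.items())))
        if key not in self._by_metric[metric]:
            self._by_metric[metric].append(key)
        return key

    def find(self, metric: str, start_row: int, end_row: int,
             tag_filter: Dict[str, str]) -> List[RowKey]:
        """Return only the row keys that match the time range and tags."""
        matches = []
        for key in self._by_metric.get(metric, []):
            _, row_time, tags = key
            if not start_row <= row_time <= end_row:
                continue
            tag_map = dict(tags)
            if all(tag_map.get(k) == v for k, v in tag_filter.items()):
                matches.append(key)
        return matches
```

Only the row keys returned by `find` need to be read from the data table. The scan over the index still grows with the number of tag combinations under a metric, which is why this approach works best when that number is limited.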
The official KairosDB documents contain sections on how to configure the auto-rollup. But in the discussion group, the description of the auto-rollup is as follows:
First off Kairos does not do any aggregation on ingest. Ingest is direct to the storage on purpose - performance.
Kairos aggregation is done after the fact at query time. The rollups are queries that are ran and the results are saved back as a new metric. Right now the rollups are all configured on a per kairos node basis. We plan on changing this in the future.
Right now Kairos does not share any state with other Kairos nodes. They have very little state on the node (except for rollups).
As for consistency it is up to you on how much you want or how important the data is to you.
In summary, the auto-rollup solution provided by KairosDB is still relatively simple to implement. It is a configurable stand-alone component that can be started at a scheduled time, read out the written data, and then write the data again after aggregation. It is indeed very primitive, with low availability and performance.
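As a rough illustration of this read, aggregate, and write-back cycle, here is a minimal sketch. It assumes a plain dictionary as the store and hypothetical function names (`rollup`, `run_rollup_task`); it does not reflect KairosDB's actual rollup configuration or API.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, Iterable, List, Tuple

Point = Tuple[int, float]  # (timestamp in ms, value)


def rollup(points: Iterable[Point], bucket_ms: int,
           aggregate: Callable[[List[float]], float] = mean) -> List[Point]:
    """Group points into fixed time buckets and aggregate each bucket."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_ms].append(value)
    return [(start, aggregate(values)) for start, values in sorted(buckets.items())]


def run_rollup_task(store: Dict[str, List[Point]], source: str,
                    target: str, bucket_ms: int) -> None:
    """Read the source metric, aggregate it, and save it as a new metric."""
    store[target] = rollup(store.get(source, []), bucket_ms)


# Usage: roll up raw points into one-minute averages under a new metric name.
store = {"cpu": [(0, 1.0), (30_000, 3.0), (60_000, 5.0)]}
run_rollup_task(store, "cpu", "cpu_1m_avg", 60_000)
```

Because the task runs after ingest and rereads data that has already been written, it adds read load to the storage engine, and a task that lives on a single node is lost if that node fails.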
However, it is better than nothing. Auto-rollup support is a trend for all TSDBs, and it is also a key function that increases functional differences and improves core competency.
The previous section mainly analyzed KairosDB, the first TSDB built on Cassandra, so I'll continue with other TSDBs built on Cassandra.
BlueFlood is also a TSDB built on Cassandra. From this PPT, you can see that there are three main core components in the overall architecture:
Its data model differs slightly from that of KairosDB and other TSDBs, mainly in:
Because the BlueFlood model does not support tags, tag query optimization doesn't need to be considered. Instead, all efforts are devoted to optimizing other features, such as auto-rollup. Its auto-rollup support is much better than that of KairosDB and OpenTSDB. The auto-rollup features are:
From its 2014 introduction PPT, we can see several features planned for the future:
In summary, if you do not need support for tags and have a strong demand for rollups, BlueFlood is a better choice than KairosDB. Otherwise, KairosDB is the better choice.
The third Cassandra-based TSDB to be introduced is Heroic, which ranks 19th on DB-Engines. Although it ranks behind BlueFlood and KairosDB, I think its design and implementation are the best. For more information about its origins, see this article or this PPT, both of which share valuable experiences and lessons.
Before deciding to develop Heroic, Spotify evaluated TSDBs such as OpenTSDB, InfluxDB, and KairosDB, and chose KairosDB to replace the storage layer of its old monitoring system. However, problems with queries in KairosDB soon became apparent. The main problem is that KairosDB has no index on metrics and tags, so queries became very slow once the number of metrics and tags reached a certain order of magnitude. Therefore, Spotify's biggest motivation for developing Heroic was to solve the query problems in KairosDB. They adopted ElasticSearch as an index to optimize the query engine, while the data write path and data table design remain fully consistent with KairosDB.
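To show why a dedicated index helps, here is a minimal in-memory inverted index that stands in for the role ElasticSearch plays. It is a sketch of the general technique only: `TagIndex` and the `__metric__` marker are hypothetical, and nothing here reflects Heroic's actual schema or the ElasticSearch API.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

METRIC_FIELD = "__metric__"  # assumed marker so metric names share the index


class TagIndex:
    """Inverted index: each (field, value) term maps to matching series ids."""

    def __init__(self) -> None:
        self._postings: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, series_id: str, metric: str, tags: Dict[str, str]) -> None:
        """Index a series under its metric name and every tag pair."""
        self._postings[(METRIC_FIELD, metric)].add(series_id)
        for tagk, tagv in tags.items():
            self._postings[(tagk, tagv)].add(series_id)

    def search(self, metric: str, tag_filter: Dict[str, str]) -> Set[str]:
        """Intersect posting sets instead of scanning all series."""
        terms = [(METRIC_FIELD, metric)] + list(tag_filter.items())
        posting_sets = [self._postings.get(term, set()) for term in terms]
        return set.intersection(*posting_sets)
```

A lookup touches only the posting sets for the terms in the query, so its cost depends on how many series match each term rather than on the total number of series under the metric. The series ids returned by `search` can then be used to read rows from the data table.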
Its features are briefly summarized as follows:
If you need a TSDB to support a complete data model and want to obtain efficient index queries, then Heroic is the choice.
Alibaba Cloud Storage - April 25, 2019