Lindorm: Overview

Last Updated: Feb 28, 2024

The hot and cold data separation feature of Lindorm lets you store hot and cold data in different types of storage media to reduce storage costs. You can also regularly migrate data to cold storage to reduce the query load on hot storage, which improves the query performance of hot data. This topic describes how the feature works and provides usage notes.

Background information

In big data scenarios, a table may store large amounts of historical data, such as orders and monitoring data. Historical data becomes cold over time and is rarely accessed, so the cost of storing it becomes a challenge. To reduce this cost, Lindorm provides the hot and cold data separation feature, which stores hot and cold data in different types of storage media. Cold data is stored in Capacity storage. Hot data is stored in one of the following storage types: Standard, Performance, local SSDs, or local HDDs. The unit price of Capacity storage is 80% lower than the unit price of Standard storage, which significantly reduces the storage cost of cold data.
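The effect of the 80% price difference can be worked through with a short calculation. The following Python sketch is only an illustration: the function name `monthly_cost`, the data volume, and the unit price of 1.0 are placeholders rather than Lindorm list prices, and it assumes that all hot data is kept in Standard storage.

```python
# Hypothetical illustration: the prices below are placeholders, not Lindorm list prices.
CAPACITY_DISCOUNT = 0.80  # Capacity unit price is 80% lower than Standard (from the text)


def monthly_cost(total_gb: float, cold_fraction: float, standard_price_per_gb: float) -> float:
    """Return the storage cost when cold_fraction of the data sits in Capacity storage.

    Assumption: all hot data is stored in Standard storage.
    """
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    capacity_price_per_gb = standard_price_per_gb * (1.0 - CAPACITY_DISCOUNT)
    cold_gb = total_gb * cold_fraction
    hot_gb = total_gb - cold_gb
    return hot_gb * standard_price_per_gb + cold_gb * capacity_price_per_gb


all_hot = monthly_cost(1000, 0.0, 1.0)      # everything in Standard storage
mostly_cold = monthly_cost(1000, 0.9, 1.0)  # 90% of the data moved to Capacity storage
print(f"saving: {1 - mostly_cold / all_hot:.0%}")  # prints "saving: 72%"
```

Under these assumptions, moving 90% of a table to Capacity storage reduces the storage bill by 0.9 × 0.8 = 72%. The actual saving depends on the real unit prices and on which storage type holds the hot data.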

How it works

Lindorm separately stores the hot data and cold data in the same table. Data is stored in hot storage or cold storage based on the timestamps or custom time columns of the data and the hot and cold data boundary specified for the table. Lindorm stores new data in hot storage first and then transfers the data to cold storage after the age of the data exceeds the hot and cold data boundary.

You can easily access a table for which hot and cold data separation is enabled in the same way as you access a normal table. When you query data in a table for which hot and cold data separation is enabled, you can specify hints or a time range to query only hot data.
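To restrict a query to hot data by time range, the range must cover only data whose age has not yet exceeded the boundary. The following Python sketch shows how such a range could be derived on the client side. The function name `hot_time_range` and the 30-day boundary are hypothetical, and the sketch does not show Lindorm query or hint syntax.

```python
from datetime import datetime, timedelta, timezone


def hot_time_range(now: datetime, boundary: timedelta) -> tuple[datetime, datetime]:
    """Return the (start, end) window that covers data still younger than the boundary."""
    return now - boundary, now


# Hypothetical values: a 30-day boundary evaluated at a fixed point in time.
now = datetime(2024, 2, 28, tzinfo=timezone.utc)
start, end = hot_time_range(now, timedelta(days=30))
print(start.isoformat(), end.isoformat())
```

The returned start and end values would then be passed as the time range of the query.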

Lindorm supports two methods of hot and cold data separation: one based on custom time columns and one based on timestamps.

  • Hot and cold data separation based on custom time columns: You can configure a custom time column for data and specify a hot and cold data boundary for the table. Lindorm determines whether to store the data in cold storage or hot storage based on the custom time column and the specified hot and cold data boundary. If no value is specified for the custom time column of a row, the row is stored in hot storage. For more information, see Separately store hot data and cold data based on custom time columns.

  • Hot and cold data separation based on timestamps: You can specify a timestamp for data when you write the data to a table. Lindorm determines whether to store the data in cold storage or hot storage based on the timestamp of the data and the hot and cold data boundary specified for the table. If you do not specify a timestamp, the time when the data is written to the table is used to determine whether to archive data to cold storage. For more information, see Separately store hot data and cold data based on timestamps.
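The two decision rules above can be summarized in a minimal Python sketch. It is a conceptual model of the rules described in this topic, not Lindorm's implementation: the function names, the use of epoch seconds, and the example boundary of one day are all assumptions made for illustration.

```python
from typing import Optional

HOT, COLD = "hot", "cold"


def tier_by_custom_time_column(column_time: Optional[float], now: float, boundary_s: float) -> str:
    """Custom time column method: a row with no value in the time column stays hot."""
    if column_time is None:
        return HOT
    return COLD if now - column_time > boundary_s else HOT


def tier_by_timestamp(timestamp: Optional[float], write_time: float,
                      now: float, boundary_s: float) -> str:
    """Timestamp method: fall back to the write time when no timestamp was specified."""
    reference = write_time if timestamp is None else timestamp
    return COLD if now - reference > boundary_s else HOT


# Hypothetical values in epoch seconds, with a boundary of one day.
DAY = 86_400.0
now = 1_000_000.0
print(tier_by_custom_time_column(now - 2 * DAY, now, DAY))   # older than the boundary
print(tier_by_custom_time_column(None, now, DAY))            # no column value
print(tier_by_timestamp(None, now - 0.5 * DAY, now, DAY))    # write time is used
```

In both functions, data becomes cold only when its age exceeds the boundary. The difference lies in which time value is used and in what happens when that value is missing.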

Limits

  • Hot and cold data separation based on custom time columns: Only tables created by using SQL statements are supported. Tables created by using the HBase shell or an HBase API are not supported. If your table is created by using SQL statements, we recommend that you use this method to implement hot and cold data separation.

  • Hot and cold data separation based on timestamps: Tables created by using SQL statements, the HBase shell, or an HBase API are supported. This method is applicable to scenarios in which custom time columns cannot be configured for the table. If your table is created by using the HBase shell or an HBase API, we recommend that you use this method to implement hot and cold data separation.

Usage notes

  • Capacity storage is suitable for scenarios in which data is not frequently queried because the IOPS of Capacity storage is low.
  • The write throughput of Capacity storage is close to that of Standard storage.
  • Capacity storage is not suitable for processing a large number of concurrent read requests. An error may occur if Capacity storage is used to process a large number of concurrent read requests.
  • If you purchase a large amount of Capacity storage for your Lindorm instance, you can adjust the read IOPS based on your business requirements. For more information, contact technical support.
  • We recommend that you store no more than 30 TB of cold data on each node. To store more cold data in Capacity storage, contact technical support.
  • If more than 95% of the Capacity storage of an instance is used, data can no longer be written to Capacity storage. Monitor the utilization of the Capacity storage of your instance. For more information, see View the size of cold storage.
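
The 95% write limit and the 30 TB per-node recommendation lend themselves to a simple monitoring check. The following Python sketch assumes that the usage figures have already been obtained elsewhere. The function name `capacity_warnings`, the 85% early-warning ratio, and the assumption that cold data is spread evenly across nodes are hypothetical and are not part of Lindorm.

```python
WRITE_BLOCK_RATIO = 0.95     # writes to Capacity storage stop above 95% usage (from the text)
MAX_COLD_TB_PER_NODE = 30.0  # recommended per-node ceiling for cold data (from the text)


def capacity_warnings(used_tb: float, total_tb: float, node_count: int,
                      alert_ratio: float = 0.85) -> list[str]:
    """Return warnings for a Capacity storage usage snapshot.

    alert_ratio is a hypothetical early-warning threshold chosen for this sketch.
    Cold data is assumed to be spread evenly across nodes.
    """
    if total_tb <= 0 or node_count <= 0:
        raise ValueError("total_tb and node_count must be positive")
    warnings = []
    usage = used_tb / total_tb
    if usage > WRITE_BLOCK_RATIO:
        warnings.append(f"writes blocked: usage {usage:.0%} exceeds 95%")
    elif usage >= alert_ratio:
        warnings.append(f"approaching write limit: usage {usage:.0%}")
    if used_tb / node_count > MAX_COLD_TB_PER_NODE:
        warnings.append("cold data per node exceeds the recommended 30 TB")
    return warnings


for message in capacity_warnings(used_tb=90, total_tb=100, node_count=2):
    print(message)
```

Alerting before the 95% limit is reached leaves time to expand Capacity storage or delete data before writes fail.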