In big data scenarios, tiered storage is often used to store hot and cold data separately. ApsaraDB for HBase provides a cold storage medium for cold data. Its write performance is equivalent to that of ultra disks at one third of the storage cost, and you can query the data in cold storage at any time. Cold storage suits scenarios such as data archiving and infrequently accessed data. When you purchase an ApsaraDB for HBase instance, you can choose cold storage as an additional storage medium, and then run table creation statements to store cold data in it.
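As a rough illustration of the cost claim above, the following sketch estimates the blended storage cost when a share of the data is moved to cold storage. It assumes only the one-third ratio stated in this topic. The class and method names are hypothetical, and actual pricing depends on your region and billing method.

```java
// Sketch: relative storage cost when part of the data lives in cold storage.
// Assumption: cold storage costs one third of ultra disk per GB (from the text).
final class ColdStorageCost {
    // Cost of cold storage relative to ultra disk.
    static final double COLD_TO_HOT_COST_RATIO = 1.0 / 3.0;

    // Returns the blended cost as a fraction of the all-hot cost.
    // coldFraction is the share of data (0.0 to 1.0) kept in cold storage.
    static double relativeCost(double coldFraction) {
        if (coldFraction < 0.0 || coldFraction > 1.0) {
            throw new IllegalArgumentException("coldFraction must be in [0, 1]");
        }
        return (1.0 - coldFraction) + coldFraction * COLD_TO_HOT_COST_RATIO;
    }

    public static void main(String[] args) {
        for (double f : new double[] {0.0, 0.5, 0.8, 1.0}) {
            System.out.printf("cold share %.0f%% -> %.1f%% of the all-hot cost%n",
                    f * 100, relativeCost(f) * 100);
        }
    }
}
```

Under this assumption, keeping 80% of the data in cold storage brings the storage bill to roughly 47% of the all-hot cost.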

Activate cold storage
You can purchase cold storage separately and use it as an additional storage medium.


Use cold storage
Note: To use cold storage, you must upgrade ApsaraDB for HBase Performance-enhanced Edition to a version later than 2.1.8. The version of the client dependency alihbase-connector must be later than 1.0.7 or 2.0.7. The version of HBase Shell must be later than alihbase-2.0.7-bin.tar.gz.
ApsaraDB for HBase Performance-enhanced Edition allows you to set storage properties per column family. You can set the storage property (STORAGE_POLICY) of one column family, or of all column families in a table, to COLD. All data in those column families is then stored in cold storage and does not occupy the Hadoop Distributed File System (HDFS) space of the cluster. You can specify the property when you create a table, or modify it for an existing column family.
You can use the Java API or HBase Shell to create a table and modify table properties. If you use the Java API, you must first install the SDK for Java and configure the parameters. For more information, see Use the Java API to access ApsaraDB for HBase. If you use HBase Shell, you must first download and configure HBase Shell. For more information, see Use HBase Shell to access ApsaraDB for HBase.
Create a table that uses cold storage
HBase Shell
hbase(main):001:0> create 'coldTable', {NAME => 'f', STORAGE_POLICY => 'COLD'}
Java API
// Assumes that connection is an open HBase Connection.
Admin admin = connection.getAdmin();
HTableDescriptor descriptor = new HTableDescriptor(TableName.valueOf("coldTable"));
HColumnDescriptor cf = new HColumnDescriptor("f");
// Store column family f in cold storage.
cf.setValue("STORAGE_POLICY", AliHBaseConstants.STORAGETYPE_COLD);
descriptor.addFamily(cf);
admin.createTable(descriptor);
Modify the table property to use cold storage
If the table already exists, you can modify the property of a column family in the table to use cold storage. If the column family already contains data, the data is moved to cold storage after a major compaction.
HBase Shell
hbase(main):011:0> alter 'coldTable', {NAME=>'f', STORAGE_POLICY => 'COLD'}
Java API
Admin admin = connection.getAdmin();
TableName tableName = TableName.valueOf("coldTable");
HTableDescriptor descriptor = admin.getTableDescriptor(tableName);
HColumnDescriptor cf = descriptor.getFamily(Bytes.toBytes("f"));
// Set the storage type of column family f to cold storage.
cf.setValue("STORAGE_POLICY", AliHBaseConstants.STORAGETYPE_COLD);
admin.modifyTable(tableName, descriptor);
Modify the table property to use hot storage
To change the storage type from cold storage back to hot storage, modify the property of the column family. If the column family already contains data, the data is moved to hot storage after a major compaction.
HBase Shell
hbase(main):014:0> alter 'coldTable', {NAME=>'f', STORAGE_POLICY => 'DEFAULT'}
Java API
Admin admin = connection.getAdmin();
TableName tableName = TableName.valueOf("coldTable");
HTableDescriptor descriptor = admin.getTableDescriptor(tableName);
HColumnDescriptor cf = descriptor.getFamily(Bytes.toBytes("f"));
// Set the storage type of column family f to the default type, which is hot storage.
cf.setValue("STORAGE_POLICY", AliHBaseConstants.STORAGETYPE_DEFAULT);
admin.modifyTable(tableName, descriptor);
View the cold storage status


Performance testing
Environment requirements
Master: ecs.c5.xlarge, 4 cores, 8 GB memory, 20 GB ultra disk × 4
RegionServer: ecs.c5.xlarge, 4 cores, 8 GB memory, 20 GB ultra disk × 4
Test machine: ecs.c5.xlarge, 4 cores, 8 GB memory
Write performance
Table type | Average response time | P99 response time |
---|---|---|
Hot tables | 1736 us | 4811 us |
Cold tables | 1748 us | 5243 us |
Note: Each record has 10 columns with 100 bytes of data per column, so each row holds 1 KB of data. The test writes data with 16 parallel threads.
Random GET performance
Table type | Average response time | P99 response time |
---|---|---|
Hot tables | 1704 us | 5923 us |
Cold tables | 14738 us | 31519 us |
Note: BlockCache is disabled for the table, so every read goes to disk. Each record has 10 columns with 100 bytes of data per column, so each row holds 1 KB of data. Each request reads 1 KB of data, and the test uses eight parallel threads.
Scan performance within a specified range
Table type | Average response time | P99 response time |
---|---|---|
Hot tables | 6222 us | 20975 us |
Cold tables | 51134 us | 115967 us |
Note: BlockCache is disabled for the table. Each record has 10 columns with 100 bytes of data per column, so each row holds 1 KB of data. Each request reads 1 KB of data, and the test uses eight parallel threads. The Caching parameter is set to 30.
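To compare the three tests at a glance, the following sketch computes how many times slower cold tables are than hot tables, using the response times from the tables above. The class name is hypothetical, and the ratios describe this test setup only.

```java
// Sketch: cold-to-hot response time ratios for the benchmark results above.
final class ColdHotLatencyRatio {
    // Ratio of cold-table latency to hot-table latency (both in microseconds).
    static double ratio(long coldMicros, long hotMicros) {
        if (hotMicros <= 0) {
            throw new IllegalArgumentException("hotMicros must be positive");
        }
        return (double) coldMicros / hotMicros;
    }

    public static void main(String[] args) {
        String[] tests = {"write", "random GET", "range scan"};
        // Each row: hot avg, cold avg, hot p99, cold p99 (microseconds).
        long[][] rt = {
            {1736, 1748, 4811, 5243},
            {1704, 14738, 5923, 31519},
            {6222, 51134, 20975, 115967},
        };
        for (int i = 0; i < tests.length; i++) {
            System.out.printf("%-10s avg x%.2f, p99 x%.2f%n", tests[i],
                    ratio(rt[i][1], rt[i][0]), ratio(rt[i][3], rt[i][2]));
        }
    }
}
```

In these results, average write latency is nearly identical for hot and cold tables (about 1.01 times), while average read latency on cold tables is about 8.6 times higher for random GET and about 8.2 times higher for range scans.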
Notes
1. The read IOPS (input/output operations per second) of cold storage is low, at most 25 per node, so cold storage is suitable only for infrequent queries.
2. The write throughput of cold storage is equal to that of the ultra disks used for hot storage.
3. Cold storage cannot handle a large number of concurrent read requests. Errors may occur under such a workload.
4. If you have purchased an extremely large cold storage space, you can adjust the read IOPS as needed. You can submit a ticket to request technical support.
5. We recommend that you store no more than 30 TB of cold data in each core node. To expand the storage capacity of each core node, you can submit a ticket.
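Because cold storage has low read IOPS and does not tolerate many concurrent reads (notes 1 and 3), a client may want to limit how many reads it sends to cold tables at the same time. The following is a minimal sketch of one way to do this with a semaphore. ColdReadGate and the limit of 4 are hypothetical and are not part of the HBase client or the ApsaraDB for HBase SDK. The gate bounds concurrency only, not the number of requests per second.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Sketch: cap the number of in-flight reads that a client sends to cold tables.
// ColdReadGate is a hypothetical helper, not an HBase or ApsaraDB API.
final class ColdReadGate {
    private final Semaphore permits;

    ColdReadGate(int maxConcurrentReads) {
        if (maxConcurrentReads <= 0) {
            throw new IllegalArgumentException("maxConcurrentReads must be positive");
        }
        this.permits = new Semaphore(maxConcurrentReads);
    }

    // Waits for a free permit, runs the read, and always releases the permit,
    // even if the read throws.
    <T> T read(Callable<T> coldRead) throws Exception {
        permits.acquire();
        try {
            return coldRead.call();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) throws Exception {
        ColdReadGate gate = new ColdReadGate(4); // 4 is an arbitrary example limit
        // The lambda stands in for a real read such as table.get(new Get(rowKey)).
        String value = gate.read(() -> "value of row-1");
        System.out.println(value);
    }
}
```

If all application threads share one gate instance, at most the configured number of cold reads run at once and the others wait for a permit. Choose the limit based on your own load tests.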