PolarDB for MySQL allows you to archive cold data that is rarely updated, added, modified, or read to Object Storage Service (OSS) to greatly reduce data storage costs. This topic describes the cold data archiving feature.
How it works
The following figure shows how the cold data archiving feature (stored in the CSV or ORC format) works. The hot data of a PolarDB cluster is stored in PolarStore. The nodes in the cluster access the data in PolarStore by using PolarFileSystem. To reduce data storage costs and separate hot data from cold data, you can transfer cold data in the shared storage to OSS.
For common tables, you can archive cold data in the CSV or ORC format and store cold data in OSS. Then, PolarDB nodes can access the data in OSS over the internal network.
For partitioned tables, you can archive cold data in the CSV or ORC format and store cold data in OSS. You can also create DLM policies on the primary node to automatically archive cold data. The cold data archiving feature allows you to archive some data in a partitioned table to OSS as an OSS foreign table, or transfer data in some partitions of a partitioned table to OSS.
The archived data is in the CSV or ORC format and stored as multiple files in OSS. The corresponding data in PolarStore is automatically deleted. Less storage space is occupied and therefore the storage costs are reduced.
After cold data is archived, the original table in PolarStore is converted to an archived table in OSS. You cannot delete archived tables. To delete archived tables, go to Quota Center. Enter the
polardb_delete_oss_data/polardb_delete_oss_data
quota ID to find the quota name. Click Apply in the Actions column.The cold data archiving feature for partitioned tables is in the canary release. To use this feature, go to Quota Center. Enter the
polardb_mysql_hybrid_partition
quota ID to find the quota name. Click Apply in the Actions column.After cold data is archived, the archived table in OSS is read-only, and the query performance becomes low. If you use serial queries, the query performance of an archived table without an index is about one-fifth to one-tenth of that on the InnoDB storage engine. We recommend that you use test data to check whether the query performance meets your requirement before you archive data.
Billing
Cold data is charged based on the space occupied in OSS. The following billing rules are used:
For regions within the Chinese mainland: USD 0.0000325 per GB-hour.
For regions outside the Chinese mainland: USD 0.0000455 per GB-hour.
You can purchase a PolarDB storage package to offset the storage costs of cold data in OSS. For more information about PolarDB storage plans, see What are storage plans
The following table describes the offset rules of PolarDB storage plans.
Region type | Offset ratio | OSS storage capacity offset by a 1-GB storage plan |
Regions in the Chinese mainland | 0.045 | 1/0.045=22.22 GB |
Regions outside the Chinese mainland | 0.045 | 1/0.045=22.22 GB |
For example, you purchase a storage plan with a capacity of 100 GB, and 50 GB is left after it offsets the data storage usage. The cold data of the cluster consumes 50 GB of OSS storage space in a region inside the Chinese mainland. In this case, the storage plan automatically uses 2.25 GB (50/2.25) to offset the backup storage usage and the storage plan has 47.75 GB of capacity left.
If the remaining capacity of the storage plan is insufficient to deduct the OSS storage space consumed by the code data, you are charged for additional storage space on a pay-as-you-go basis.
For more information about how to view the storage capacity of cold data in OSS, see View the information of the cold data archived in OSS.
For more information about billing details, see Bills.
Supported versions
To archive data in the CSV or ORC format, your Enterprise Edition cluster must meet the following requirements:
The cluster is of Cluster Edition and uses one of the following PolarDB for MySQL versions:
PolarDB for MySQL 8.0.1.1.31 or later
PolarDB for MySQL 8.0.2.2.9 or later
The cluster is of Multi-master Cluster (Database/Table) Edition and uses PolarDB for MySQL 8.0.1.0.13 or later.
Comparison between different file formats
When you archive cold data, you can select an appropriate file format based on your requirements.
File format | Archiving speed | Reading speed | Open-source | Storage consumption | How to modify archived data |
CSV | Faster than the ORC format. Note Only single-threaded archiving is supported. | Faster than the ORC format on row store nodes. Note Both single-threaded and multi-threaded reading methods are supported. | Yes. | Consistent with the storage space occupied by tables using the InnoDB engine and without indexes. | You must import data from OSS tables to PolarStore before you modify the data. |
ORC | Slow. Note Only single-threaded archiving is supported. | Suitable for AP queries on a separate column store node. Note Only single-threaded reading is supported. | Yes. | For the same amount of data, ORC files occupies 45% of the storage space occupied by CSV files. |
Usage
Archive cold data
After you log on to the PolarDB console, enable cold data archiving, and connect to a cluster, you can archive cold data.
For common tables, you can archive cold data in the CSV or ORC format and store cold data in OSS.
For partitioned tables, you can archive cold data in the CSV or ORC format and store cold data in OSS. You can also create DLM policies on the primary node to automatically archive cold data.
Query archived cold data
After you archive cold data for common tables and partitioned tables, you can use one of the following methods to query the archived cold data:
Common tables: Use the same method as querying hot data. You do not need to modify the access mode.
Partitioned tables: For more information about how to query cold data, see Query data in a hybrid partitioned table.
The archived cold data is distributed across multiple files within a single table. You can use parallel queries to check the data. For more information, see Perform multi-file queries for a single OSS foreign table.
Modify archived cold data
If you need to modify the cold data archived in OSS at a low frequency, you can execute the
ALTER ENGINE
statement to import data from OSS to PolarStore. After the data is imported to PolarStore, the cold data in OSS is deleted. After you modify the data, you can archive the modified table in OSS again. For more information, see Import data from OSS tables to PolarStore.Delete archived cold data
To delete archived cold data, go to Quota Center. Enter the polardb_delete_oss_data/polardb_delete_oss_data quota ID to find the quota name. Click Apply in the Actions column.
Contact us
If you have any questions about the cold data archiving feature, join the DingTalk group 24490017825. You can ask experts to answer your questions by mentioning them in the group or use the PolarDB for MySQL chatbot assistant that provides 24/7 support.