This topic describes how to use the cold data archiving feature. Before you begin, confirm the archive format and ensure that your cluster meets the requirements. For more information, see Comparison of archive formats and Version requirements.
Version requirements
For data archiving in CSV format
If the product edition is Cluster Edition, the kernel version must be one of the following:
MySQL 8.0.1 with a revision version of 8.0.1.1.47 or later.
MySQL 8.0.2 with a revision version of 8.0.2.2.10 or later.
If the product edition is Multi-master Cluster (Limitless) Edition, the kernel version must be 8.0.1.0.13 or later.
For data archiving in ORC format
If the product edition is Cluster Edition, the revision version must be 8.0.2.2.30 or later.
If the product edition is Multi-master Cluster (Limitless) Edition, the revision version must be 8.0.2.2.30 or later.
Archive cold data
You must first enable cold data archiving. After the feature is enabled, you can archive data from standard tables or partitioned tables.
Standard tables
Partitioned tables
You can manually archive one or more partitions of a partitioned table or archive them to an OSS external table. For more information, see Archive partitioned tables or Archive partitioned tables to OSS external tables.
You can create a Data Lifecycle Management (DLM) policy to automatically archive cold data. For more information, see Automatically archive cold data.
Query cold data
Standard tables
Regardless of the archive format you choose, you can query the archived data without changing the way you access the table.
Partitioned tables
Archive to an OSS external table: To query the archived data, you must change your access method. This means you must query the specified archive table. For more information, see Archive partitioned tables to OSS external tables.
Archive partitioned tables: After archiving, the table becomes a hybrid partitioned table. For more information, see Query hybrid partitions.
You can query data of partitions that use different engines by using one of the following methods:
If you set the
hybrid_partition_query_mix_engine_enabledparameter to ON in the console, query results contain data in the partitions that do not use the InnoDB engine. Example:SELECT * FROM t1;t1is the hybrid partitioned table.If you set the
hybrid_partition_query_mix_engine_enabledparameter to OFF in the console, you can query data in a specified partition. Example:SELECT * FROM t1 partition (p1);t1is the hybrid partitioned table.p1is the partition.
Archived cold data for a single table is stored in multiple files. To optimize query performance when you query cold data, you can use a parallel query. For more information, see Parallel query for cold data.
Modify cold data
If you infrequently need to modify cold data stored in OSS, you can use the ALTER statement to move the cold data from OSS back to PolarDB for modification. When the data is moved back, the original cold data in OSS is automatically deleted. After you modify the data, you can archive the updated table data to OSS again.
Standard tables
Syntax
ALTER TABLE table_name ENGINE[=]engine_name;Parameters
Parameter | Description |
table_name | The name of the OSS table to move back. |
engine_name | The engine type after a failback operation. |
Usage notes
OSS tables are read-only and do not support modification operations such as INSERT, UPDATE, or DELETE. To modify archived cold data, you must convert the OSS table into a read-write table, such as an InnoDB table. If you try to modify a read-only OSS table, the following error message is returned:
1036 - Table 't' is read onlyExample
Move the OSS table t from the oss_test database back to PolarDB.
ALTER TABLE `oss_test`.`t` ENGINE = InnoDB;Modify the data in the InnoDB table t. After you modify the data, you can archive table t from the InnoDB engine back to OSS. For example:
ALTER TABLE t ENGINE = CSV CONNECTION = 'default_oss_server';or
ALTER TABLE t ENGINE = CSV STORAGE OSS;Partitioned tables
Syntax
ALTER TABLE table_name REORGANIZE PARTITION part_name INTO (partition_definition);Parameters
Parameter | Description |
table_name | The name of the OSS table to move back. |
part_name | The name of the partition to move back. |
partition_definition | The partition definition. This must be the same as the |
Example
Move the data in the p1 partition of the partitioned table t from OSS back to PolarDB.
ALTER TABLE t REORGANIZE PARTITION p1 INTO(PARTITION p1 values less than(100));Delete cold data
The corresponding files from OSS can be deleted only for PolarDB for MySQL clusters of the following versions:
PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.42 or later.
PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.23 or later.
If the revision version of your cluster does not meet the preceding requirements, you cannot delete the corresponding files from OSS. We recommend that you upgrade the revision version of your cluster. For more information, see Manage minor versions.
After you delete a table from OSS or move a table from OSS back to PolarDB, the corresponding file in OSS is not automatically deleted. If you confirm that the data is no longer needed, you can delete the corresponding file from OSS. For more information, see Delete the corresponding files from OSS.
Standard tables and OSS external tables
Delete the archived table using the
DROP TABLEstatement, and then delete the corresponding file in OSS using theCALL dbms_oss.delete_table_file('database_name', 'table_name');command.Partitioned tables
You can use the
CALL dbms_oss.delete_table_file('database_name', 'table_name');command to delete the corresponding file in OSS.
Deleting the corresponding file in OSS is an asynchronous operation. The file is completely deleted only after all nodes in the cluster no longer depend on it. In addition, this process may experience latency during periods of high traffic.
The FORCE STORAGE OSS option
Currently, the FORCE STORAGE OSS option is available only on PolarDB for MySQL clusters that are running version 8.0.2 with a revision version of 8.0.2.2.29 or later.
Standard tables and OSS external tables
DROP TABLE table_name [FORCE STORAGE OSS];NoteThe FORCE STORAGE OSS option in the DROP TABLE command provides a one-step cleanup method. It automatically deletes the associated OSS storage files when you delete the table schema.
Partitioned tables
ALTER TABLE table_name DROP PARTITION part_name [FORCE STORAGE OSS];NoteYou can use the FORCE STORAGE OSS option with the DROP PARTITION clause of the ALTER TABLE statement to synchronously delete partition metadata and the corresponding OSS storage files.
CHANGE PARTITION ENGINE
ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = CSV|ORC [FORCE STORAGE OSS]NoteThe CHANGE PARTITION ENGINE cold data archiving feature includes a forced overwrite mechanism that automatically replaces files with the same name.
Comparison of archive formats
When you archive cold data, you can use the following comparison to select a suitable format.
Archiving data from standard tables, OSS external tables, and partitioned tables is subject to certain limits. Review these limits carefully before you perform an archive operation to prevent disruptions to your business.
After cold data is archived, it is stored in the system's default Object Storage Service (OSS) bucket, not in an OSS bucket that you own. Currently, you can view the list of archived data only in the PolarDB console.
Descriptions of partitioned table archiving methods:
Archive partitioned tables: This method archives a partition of a partitioned table in place. The data remains in the original table, but the storage medium for that partition changes from PolarDB (hot storage) to OSS (cold storage). The table itself becomes a hybrid partitioned table that contains both hot and cold partitions.
Archive partitioned tables to an OSS external table: This method moves the data from a partition of a partitioned table to a new, independent OSS external table. The original partition in the source table is deleted.
Comparison item | CSV | ORC |
Open source format | Yes | Yes |
Data archiving method |
| Manual data archiving: |
Data archiving speed | Fast Note Only single-threaded data archiving is supported. | Slow Note Only single-threaded data archiving is supported. |
Query speed |
Note Both single-threaded and multi-threaded data reads are supported. |
Note Only single-threaded data reads are supported. |
Transaction support | No | No |
Indexing capability | No | No |
Method to modify archived data | Archived tables on OSS are read-only. To modify the data, import it from OSS back to PolarDB storage. | |
Storage space used | The storage space used is the same as that for a table in the InnoDB engine without an index. | For the same amount of data, the storage space used is 45% of that used by a CSV format file. |
Backup and recovery | Not supported. Note When you perform a backup operation, cold data that is archived to OSS is not backed up. Therefore, you cannot use backups to recover databases and tables, restore from a backup, or perform a point-in-time recovery. | |
Impact after data archiving |
|
|