This topic describes how to use the cold data archiving feature. Before you archive cold data, confirm the data archive format and make sure that your cluster meets the data archiving requirements. For more information, see Comparison of archiving data in different formats and Supported versions.
Supported versions
To archive data in the CSV or ORC format, a cluster of Enterprise Edition must meet the following requirements:
If the cluster is of Cluster Edition, it must run one of the following database engine versions:
PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.31 or later
PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.9 and later.
If the cluster is of Multi-master Cluster (Database/Table) Edition, it must run PolarDB for MySQL 8.0.1.0.13 or later.
Archive cold data
To archive cold data, you must first enable cold data archiving. You can archive common tables or partitioned tables based on your business requirements.
Common tables
You can manually archive common tables in the CSV format. For more information, see Archive common tables.
Partitioned tables
You can manually archive specific partitions of a partitioned table to OSS or an OSS foreign table. For more information, see Archive partitioned tables or Archive table partitions to an OSS foreign table.
You can create a data lifecycle management (DLM) policy for automatic cold data archiving. For more information, see Automatically archive cold data.
Query archived cold data
Common tables
Regardless of the cold data archive format you use, you can query the archived data of a common table by specifying the name of the table.
Partitioned tables
Archive table partitions to an OSS foreign table: You can query the archived data only by specifying the archive table name. For more information, see Archive table partitions to an OSS foreign table.
Archive partitioned tables: After specific partitions of a table are archived to OSS, the table becomes a hybrid partitioned table. For information about how to query the archived data, see Query data in a hybrid partitioned table.
You can query data of partitions that use different engines by using one of the following methods:
If you set the
hybrid_partition_query_mix_engine_enabled
parameter to ON in the console, query results contain data in the partitions that do not use the InnoDB engine. Example:SELECT * FROM t1;
t1
is the hybrid partitioned table.If you set the
hybrid_partition_query_mix_engine_enabled
parameter to OFF in the console, you can query data in a specified partition. Example:SELECT * FROM t1 partition (p1);
t1
is the hybrid partitioned table.p1
is the partition.
The archived cold data in a single table is distributed across multiple files. You can use parallel queries to view the cold data. For more information, see Parallel query execution on cold data.
Modify archived cold data
If you have infrequent requirements to modify cold data stored in OSS, you can use the ALTER
statement to move the cold data from OSS back into the PolarDB storage space for modification. After the data is moved back to PolarDB, the original cold data in OSS is automatically deleted. After the data is modified, you can archive the updated table data to OSS again.
Common table
Syntax
ALTER TABLE table_name ENGINE[=]engine_name;
Parameters
Parameter | Description |
table_name | The name of the OSS table that you want to move back to the PolarDB storage space. |
engine_name | The new storage engine that you want to use for the table. |
Usage notes
OSS tables are in the read-only state and cannot be modified by using the INSERT
, UPDATE
, or DELETE
statement. To modify archived cold data, convert OSS tables into tables that can be read and written, such as InnoDB tables. When you modify an OSS table in the read-only state, the following error message is reported:
1036 - Table 't' is read only
Examples
Move the OSS table named t
back to the PolarDB storage space in the oss_test
database.
ALTER TABLE `oss_test`.`t` ENGINE = InnoDB;
Modify the data in the InnoDB table named t
. After the data is modified, archive the t
table from the InnoDB engine to OSS again. Examples:
ALTER TABLE t ENGINE = CSV CONNECTION = 'default_oss_server';
or
ALTER TABLE t ENGINE = CSV STORAGE OSS;
Partitioned table
Syntax
ALTER TABLE table_name REORGANIZE PARTITION part_name INTO (partition_definition);
Parameters
Parameter | Description |
table_name | The name of the OSS table that you want to move back to the PolarDB storage space. |
part_name | The name of the partition that you want to move back to the PolarDB storage space. |
partition_definition | The partition definition. The value must be the same as the |
Examples
Move data in the p1
partition of the partitioned table named t
from OSS back to the PolarDB storage space in the database.
ALTER TABLE t REORGANIZE PARTITION p1 INTO(PARTITION p1 values less than(100));
Delete archived cold data
You can delete the corresponding file in OSS only if your PolarDB for MySQL cluster runs one of the following versions:
PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.42 or later
PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.23 or later
If the revision version of your cluster does not meet the preceding requirements, you cannot delete the corresponding files from OSS. We recommend that you upgrade the revision version of your cluster. For more information, see Minor version upgrade.
After you delete a table from OSS or move a table from OSS back to the PolarDB storage space, the corresponding file in OSS is retained. If you no longer use the data, you can perform the following steps to delete the corresponding file in OSS. For more information, see Delete the corresponding file in OSS.
Common tables and OSS foreign tables
First execute the
DROP TABLE
statement to delete the archive table, and then execute theCALL dbms_oss.delete_table_file('database_name', 'table_name');
statement to delete the corresponding file in OSS.Partitioned tables
Execute the
CALL dbms_oss.delete_table_file('database_name', 'table_name');
statement to delete the corresponding file in OSS.
The deletion of files from OSS is an asynchronous operation. The deletion process is completed only after all nodes in the cluster no longer depend on the OSS files. During periods of high traffic, the deletion process may experience delays.
Comparison of archiving data in different formats
The following table compares data archiving in different formats. Select a suitable format based on your business requirements.
Item | CSV | ORC |
Open source | Yes | Yes |
Archiving method |
| Manual archiving |
Archiving speed | Faster than the ORC format. Note Only single-threaded archiving is supported. | Slow. Note Only single-threaded archiving is supported. |
Query speed |
Note Single-threaded reading and multi-threaded reading are supported. |
Note Only single-threaded reading is supported. |
Transactions | Not supported | Not supported |
Indexing | Not supported | Not supported |
How to modify archived data | Archive tables in OSS are read-only. If you want to modify archived data, you must import the data from the OSS table to the PolarDB storage space. | |
Storage consumption | Same storage consumption as tables without indexes that use the InnoDB engine. | ORC files occupy 45% of the storage space occupied by CSV files for the same data volume. |
Backup and restoration | Not supported. Note When you perform backup operations, the cold data archived in OSS is not backed up. As a result, you cannot restore archived cold data from backups. | |
Impact |
|
|
Before you archive common tables, OSS foreign tables, and partitioned tables, review the limits to prevent negative impacts on your business.
Archived data is stored in the default OSS bucket. You can view the archived data list only in the PolarDB console.