All Products
Search
Document Center

PolarDB:Manually archive cold data in the CSV or ORC format

Last Updated:Jul 10, 2024

This topic describes how to archive cold data in the CSV or ORC format and how to import data from OSS tables to PolarStore.

Prerequisites

  • To archive cold data in files in the CSV format, make sure that the Enterprise Edition cluster meets the following requirements (for more information about how to view the cluster version, see query the version number):

    • The cluster is of Cluster Edition and meets one of the following version requirements:

      • PolarDB for MySQL 8.0.1, with a revision version of 8.0.1.1.32 or later

      • PolarDB for MySQL 8.0.2, with a revision version of 8.0.2.2.10 or later

        Note

        Binary logs are not recorded for the clusters of the following versions when you archive cold data:

        • PolarDB for MySQL 8.0.1, with a revision version of 8.0.1.1.33 or later

        • PolarDB for MySQL 8.0.2, with a revision version of 8.0.2.2.11.1 or later

    • The cluster is of Multi-master Cluster (Database/Table) Edition PolarDB for MySQL 8.0.1.0.13 or later.

  • To archive cold data in the ORC format, make sure that the Enterprise Edition cluster version is PolarDB for MySQL 8.0.2, with a revision version of 8.0.2.2.16.2 or later. For information about how to view the cluster version, see Query the engine version.

  • The cold data archiving feature is enabled. For more information, see Enable cold data archiving.

  • The cluster is connected. For more information, see Connected to a cluster.

Usage notes

Archive common tables

Cold data archiving is the process of preserving cold data within tables. These tables, which store the cold data, are referred to as archived tables (read-only) and utilize the OSS engine. The data files corresponding to these archived tables are stored in OSS. After cold data is archived, the PolarFileSystem space occupied by the original local table is released.

Syntax

  • Archive cold data in the CSV format:

    ALTER TABLE table_name ENGINE = CSV CONNECTION = 'default_oss_server';

    If the cluster is of PolarDB for MySQL 8.0.1.1.33 or later, PolarDB for MySQL 8.0.2.2.13 or later, or Multi-master Cluster (Database/Table) Edition PolarDB for MySQL 8.0.1.1.15 or later, you can also execute the following statement to manually archive cold data:

    ALTER TABLE table_name ENGINE = CSV STORAGE OSS;
  • Archive cold data in the ORC format:

    ALTER TABLE table_name ENGINE = ORC STORAGE OSS;
Note

If OSS contains a file that has the same name as the file to be archived, the system returns an error message, indicating that the file already exists in OSS. Sample message:

Target file for archived table exists on oss.

You can add the FORCE keyword to the preceding statement to overwrite the existing file in OSS. The following example shows how to add the FORCE keyword to the statement that archives cold data in the CSV format:

ALTER TABLE table_name ENGINE = CSV CONNECTION = 'default_oss_server' , FORCE;

Parameters

Parameter

Description

table_name

The name of the table to be archived to OSS.

Precautions

  • Only data in the tables that use the InnoDB storage engine and X-Engine can be archived.

  • You cannot modify the table by executing DDL or DML statements during cold data archiving.

  • Data files cannot be archived to custom OSS servers.

  • To archive cold data in a table that uses the InnoDB storage engine, ensure that the table has a primary key.

  • After cold data is archived, the table in OSS becomes read-only, and the query speed for this table is slow. We recommend that you use test data to evaluate if the query performance meets your requirements before proceeding with data archiving.

Examples

Archive data in the t table to OSS in the CSV or ORC format.

  1. Create a table named t in the oss_test database. The table uses the InnoDB storage engine.

  2. CREATE TABLE t(a int, b int, c int, primary key(a)) ENGINE = INNODB;
  3. Insert data into the t table.

  4. INSERT INTO t VALUES (1,2,3);
  5. Execute the ALTER statement to archive cold data.

    • Archive data in the CSV format:

      ALTER TABLE t ENGINE = CSV CONNECTION = 'default_oss_server';

      If the cluster is of PolarDB for MySQL 8.0.1.1.33 or later, PolarDB for MySQL 8.0.2.2.13 or later, or Multi-master Cluster (Database/Table) Edition PolarDB for MySQL 8.0.1.1.15 or later, execute the following statement:

      ALTER TABLE t ENGINE = CSV STORAGE OSS; 
    • Archive data in the ORC format:

      ALTER TABLE t ENGINE = ORC STORAGE OSS;
  6. After the data is archived, you can log on to the PolarDB console to view the archived databases and tables in OSS, or execute SQL statements to view data in the archived tables.

    • To view the archived databases and tables in OSS, log on to the PolarDB console. Find the cluster and choose Settings and Management > Cold Data Archive to view information about the archived databases and tables in OSS.

    • To view data in the archived tables, you can execute SQL statements in the same way as you execute SQL statements before the archiving operations. Sample message:

      SELECT * FROM t;

Archive partitioned tables

Note
  • The partitioned table archiving feature is in the canary release phase. To use this feature, go to the Quota Center. Find the quota name corresponding to the Quota ID value of polardb_mysql_hybrid_partition. Click Apply in the Actions column.

  • Your Enterprise Edition cluster must run PolarDB for MySQL 8.0.2 with a revision version of 8.0.2.2.14 or later.

Partitioned table archiving is the process of archiving data in partitions. The archived table is a hybrid partitioned table. The data files of the archived partitions are stored in OSS. After cold data is archived, the PolarStore space occupied by the original partitioned table is released.

Syntax

  • Archive cold data in the CSV format:

    ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = CSV;
  • Archive cold data in the ORC format:

    ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = ORC;
Note

If OSS contains a file that has the same name as the file to be archived, the system returns an error message, indicating that the file already exists in OSS. Sample message:

Target file for archived table exists on oss.

You can add the FORCE keyword to the preceding statement to overwrite the existing file in OSS. The following example shows how to add the FORCE keyword to the statement that archives cold data in the CSV format:

ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = CSV FORCE;

Parameters

Parameter

Description

table_name

The name of the table to be archived to OSS.

part_name

The name of the partition to be archived to OSS.

Precautions

  • The cold data archiving feature is supported only for partitioned tables that use the InnoDB storage engine.

  • When you archive data in a partitioned table, make sure that at least one partition remains on the InnoDB storage engine. This means that you cannot archive data in the last partition on the InnoDB storage engine.

  • The archived table is a hybrid partitioned table. You cannot execute DDL statements on a hybrid partitioned table. For more information, see Create a hybrid partitioned table.

  • You cannot modify data of archived partitions.

  • Cold data archiving is not supported for subpartitions in a partitioned table.

  • You cannot archive data in the default partitions of a list default hash partitioned table.

  • You cannot archive data in a hash or key partitioned table.

  • You cannot archive data for the entire partitioned table.

  • If OSS contains a file that has the same name as the file in the partition to be archived, the execution fails and an error message that is similar to the following message is returned:

    Target file for archived table exists on oss.

    You need to execute the following statement to overwrite the existing file in OSS: Replace table_name and part_name based on your environment prior to executing the statement.

    ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = CSV/ORC FORCE;

Examples

Archive data in the p1 and p2 partitions of the t table to OSS in the CSV format.

  1. Create a table named t in the database. The table uses the InnoDB storage engine.

    CREATE TABLE t(a int, b int, c int, primary key(a))
    PARTITION BY RANGE(a)
    (PARTITION p1 values less than(100),
     PARTITION p2 values less than(200),
     PARTITION p3 values less than MAXVALUE
    );
  2. Insert data into the t table.

    INSERT INTO t VALUES(1,1,1);
    INSERT INTO t VALUES(10,10,10);
    INSERT INTO t VALUES(100,100,100);
    INSERT INTO t VALUES(150,150,150);
    INSERT INTO t VALUES(200,200,200);
    INSERT INTO t VALUES(1000,1000,1000);
  3. Execute the following statements to archive data in the p1 and p2 partitions to OSS:

    • Archive data in the CSV format:

      ALTER TABLE t CHANGE PARTITION p1 ENGINE = csv;
      ALTER TABLE t CHANGE PARTITION p2 ENGINE = csv;
    • Archive data in the ORC format:

      ALTER TABLE t CHANGE PARTITION p1 ENGINE = ORC;
      ALTER TABLE t CHANGE PARTITION p2 ENGINE = ORC;
  4. After the data is archived, you can log on to the PolarDB console to view the archived databases and tables in OSS, or execute SQL statements to view data in the hybrid partitioned table.

  • To view the archived databases and tables in OSS, log on to the PolarDB console. Find the cluster and choose Settings and Management > Cold Data Archive to view information about the archived databases and tables in OSS.

  • To view data in the hybrid partitioned table, see Query data in a hybrid partitioned table.

Import data from OSS tables to PolarStore

Import data from a regular table that is archived in OSS

Before you modify the cold data archived in OSS tables, you can execute the ALTER ENGINE statement to import data from OSS to PolarStore. Then, you can modify the data in the imported table. After data is imported to PolarStore, cold data in OSS is deleted. After the data is modified, you can archive the modified table to OSS again.

Syntax
ALTER TABLE table_name ENGINE[=]engine_name;
Parameters

Parameter

Description

table_name

The name of the OSS table to be imported.

engine_name

The engine of the imported table.

Precautions

OSS tables are in the read-only state and cannot be modified (by executing the INSERT, UPDATE, or DELETE statement). To modify archived cold data, convert OSS tables into tables that can be read and written, such as InnoDB tables. When you modify an OSS table in the read-only state, the following error message is reported:

1036 - Table 't1' is read only
Examples

Import data from the OSS table t to PolarStore in the oss_test database.

ALTER TABLE `oss_test`.`t` ENGINE = InnoDB;

Modify the data in the InnoDB table t. After the data is modified, archive the t table in the InnoDB engine to OSS again. Examples

ALTER TABLE t ENGINE = CSV CONNECTION = 'default_oss_server';

or

ALTER TABLE t ENGINE = CSV STORAGE OSS;

Import data from a hybrid partitioned table that is archived in OSS

Before you modify the data in a hybrid partitioned table that is archived in OSS tables, you can execute the ALTER statement to import data from OSS to PolarStore. After data is imported to PolarStore, cold data in OSS is deleted.

Syntax
ALTER TABLE table_name REORGANIZE PARTITION part_name INTO (partition_definition);
Parameters

Parameter

Description

table_name

The name of the OSS table to be imported.

part_name

The name of the partition to be imported.

partition_definition

The partition definition. The value must be the same as the value of partition_definition of the partition to be imported.

Examples

Import data from the p1 partition of the partitioned table t that is archived in OSS to PolarStore in the database.

ALTER TABLE t REORGANIZE PARTITION p1 INTO(PARTITION p1 values less than(100));

Delete the corresponding file in OSS

Note
  • You can delete the corresponding file in OSS only when your PolarDB for MySQL cluster runs one of the following versions:

    • PolarDB for MySQL 8.0.1, with a revision version of 8.0.1.1.42 or later

    • PolarDB for MySQL 8.0.2, with a revision version of 8.0.2.2.23 or later

  • If the revision version of your cluster does not meet the preceding requirements, you cannot delete the corresponding files from OSS. We recommend that you upgrade the revision version of your cluster. For more information, see Minor version upgrade.

After you delete a table from OSS or import a table from OSS to PolarStore, the corresponding file in OSS is not deleted. After you confirm that the data is no longer needed, you can use the following procedure to delete the corresponding file from OSS:

CALL dbms_oss.delete_table_file('database_name', 'table_name');

The OSS file is deleted asynchronously. Therefore, you need to wait until all nodes in the cluster no longer depend on the OSS file before you can delete the file. In addition, there is a delay during heavy traffic. Therefore, if the preceding statement fails and the "OSS files are still in use" error message is displayed, wait for a period of time before rerunning the statement.