All Products
Search
Document Center

PolarDB:Archive cold data in the CSV or ORC format

Last Updated:Apr 12, 2024

This topic describes how to archive cold data in the CSV or ORC format and how to import data from OSS tables to PolarStore.

Prerequisites

  • To archive files in the CSV format, an Enterprise Edition cluster must meet the following requirements. For information about how to check the version of a cluster, see Query the engine version.

    • The cluster is of Cluster Edition and meet one of the following version requirements:

      • A cluster of PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.32 or later.

      • A cluster of PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.10 or later.

        Note

        Binary logs are not recorded for the clusters of the following versions when you archive cold data:

        • A cluster of PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.33 or later.

        • A cluster of PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.11.1 or later.

    • The cluster is of Multi-master Cluster (Database/Table) Edition whose revision version is 8.0.1.0.13 or later.

  • To archive files in the ORC format, an Enterprise Edition cluster must run PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.16.2 or later. For information about how to check the version of a cluster, see Query the engine version.

  • The cold data archiving feature is enabled. For more information, see Enable cold data archiving.

  • The cluster is connected. For more information, see Connected to a cluster.

Archive common tables

Cold data archiving is performed to archive tables. The tables after the cold data archiving operation are called archived tables (read-only). The engine of archived tables is OSS engine, and the data files of archived tables are stored in OSS. After cold data is archived, the PolarFileSystem space occupied by the original local table is released.

  • Syntax

    • Archive cold data in the CSV format:

      ALTER TABLE table_name ENGINE = CSV CONNECTION = 'default_oss_server';

      If the cluster is of PolarDB for MySQL 8.0.1.1.33 or later, PolarDB for MySQL 8.0.2.2.13 or later, or Multi-master Cluster (Database/Table) Edition PolarDB for MySQL 8.0.1.1.15 or later, you can also execute the following statement to manually archive cold data:

      ALTER TABLE table_name ENGINE = CSV STORAGE OSS;
    • Archive cold data in the ORC format:

      ALTER TABLE table_name ENGINE = ORC STORAGE OSS;
    Note

    If Object Storage Service (OSS) contains a file that has the same name as the file to be archived, the system returns an error message indicating that the file already exists in OSS. Sample message:

    Target file for archived table exists on oss.

    You can add the FORCE keyword to the preceding statements to overwrite existing files in OSS. The following example shows how to add the FORCE keyword to the statement that archives cold data in the CSV format:

    ALTER TABLE table_name ENGINE = CSV CONNECTION = 'default_oss_server' , FORCE;
  • Parameters

  • Parameter

    Description

    table_name

    The name of the table to be archived to OSS.

  • Precautions

    • Only data in the tables that use the InnoDB storage engine and X-Engine can be archived.

    • You cannot modify the table by executing DDL or DML statements during cold data archiving.

    • Data files cannot be archived to custom OSS servers.

    • To archive cold data in a table that uses the InnoDB storage engine, the table must have a primary key.

    • After cold data is archived, the archived table in OSS is read-only and provides poor query performance. We recommend that you use test data to check whether the query performance meets your requirement before you archive data.

  • Examples

    Archive data in the t table to OSS in the CSV or ORC format.

    1. Create a table named t in the oss_test database. The table uses the InnoDB storage engine.

    2. CREATE TABLE t(a int, b int, c int, primary key(a)) ENGINE = INNODB;
    3. Insert data into the t table.

    4. INSERT INTO t VALUES (1,2,3);
    5. Execute the ALTER statement to archive cold data.

      • Archive data in the CSV format:

        ALTER TABLE t ENGINE = CSV CONNECTION = 'default_oss_server';

        If the cluster is of PolarDB for MySQL 8.0.1.1.33 or later, PolarDB for MySQL 8.0.2.2.13 or later, or Multi-master Cluster (Database/Table) Edition PolarDB for MySQL 8.0.1.1.15 or later, execute the following statement:

        ALTER TABLE t ENGINE = CSV STORAGE OSS; 
      • Archive data in the ORC format:

        ALTER TABLE t ENGINE = ORC STORAGE OSS;
    6. After the data is archived, you can log on to the PolarDB console to view the archived databases and tables in OSS, or execute SQL statements to view data in the archived tables.

      • To view the archived databases and tables in OSS, log on to the PolarDB console. Find the cluster and choose Settings and Management > Cold Data Archive to view information about the archived databases and tables in OSS.

      • To view data in the archived tables, you can execute SQL statements in the same way as you execute SQL statements before the archiving operations. Sample statement:

        SELECT * FROM t;

Archive partitioned tables

Note
  • The partitioned table archiving feature is in the canary release phase. To use this feature, go to the Quota Center. Find the quota name corresponding to the Quota ID value of polardb_mysql_hybrid_partition. Click Apply in the Actions column.

  • Your Enterprise Edition cluster must run PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.14 or later.

For partitioned tables, data in partitions are archived. The archived table is a hybrid partitioned table. The data files of the archived partitions are stored in OSS. After cold data is archived, the PolarFileSystem space occupied by the original partitioned table is released.

  • Syntax

    • Archive cold data in the CSV format:

      ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = CSV;
    • Archive cold data in the ORC format:

      ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = ORC;
    Note

    If OSS contains a file that has the same name as the file to be archived, the system returns an error message indicating that the file already exists in OSS. Sample message:

    Target file for archived table exists on oss.

    You can add the FORCE keyword to the preceding statements to overwrite existing files in OSS. The following example shows how to add the FORCE keyword to the statement that archives cold data in the CSV format:

    ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = CSV FORCE;
  • Parameters

    Parameter

    Description

    table_name

    The name of the table to be archived to OSS.

    part_name

    The name of the partition to be archived to OSS.

  • Precautions

    • The cold data archiving feature is supported only for partitioned tables that use the InnoDB storage engine.

    • When you archive data in a partitioned table, at least one partition must remain on the InnoDB storage engine. This means that you cannot archive data in the last partition on the InnoDB storage engine.

    • The archived table is a hybrid partitioned table. You cannot execute DDL statements on a hybrid partitioned table. For more information about the precautions when you use a hybrid partitioned table, see Create a hybrid partitioned table.

    • You cannot modify data of archived partitions.

    • Cold data archiving is not supported for subpartitions in a partitioned table.

    • You cannot archive data in the default partitions of a list default hash partitioned table.

    • You cannot archive data in a hash or key partitioned table.

    • You cannot archive data for the entire partitioned table.

    • If OSS contains a file that has the same name as the file in the partition to be archived, the execution fails and an error message that is similar to the following message is returned:

      Target file for archived table exists on oss.

      You need to execute the following statement to overwrite the existing file in OSS: Replace table_name and part_name based on your environment when you execute the statement.

      ALTER TABLE table_name CHANGE PARTITION part_name ENGINE = CSV/ORC FORCE;
  • Examples

    Archive data in the p1 and p2 partitions of the t table to OSS in the CSV format.

    1. Create a table named t in the database. The table uses the InnoDB storage engine.

      CREATE TABLE t(a int, b int, c int, primary key(a))
      PARTITION BY RANGE(a)
      (PARTITION p1 values less than(100),
       PARTITION p2 values less than(200),
       PARTITION p3 values less than MAXVALUE
      );
    2. Insert data into the t table.

      INSERT INTO t VALUES(1,1,1);
      INSERT INTO t VALUES(10,10,10);
      INSERT INTO t VALUES(100,100,100);
      INSERT INTO t VALUES(150,150,150);
      INSERT INTO t VALUES(200,200,200);
      INSERT INTO t VALUES(1000,1000,1000);
    3. Execute the following statements to archive data in the p1 and p2 partitions to OSS:

      • Archive data in the CSV format:

        ALTER TABLE t CHANGE PARTITION p1 ENGINE = csv;
        ALTER TABLE t CHANGE PARTITION p2 ENGINE = csv;
      • Archive data in the ORC format:

        ALTER TABLE t CHANGE PARTITION p1 ENGINE = ORC;
        ALTER TABLE t CHANGE PARTITION p2 ENGINE = ORC;
    4. After the data is archived, you can log on to the PolarDB console to view the archived databases and tables in OSS, or execute SQL statements to view data in the hybrid partitioned table.

    • To view the archived databases and tables in OSS, log on to the PolarDB console. Find the cluster and choose Settings and Management > Cold Data Archive to view information about the archived databases and tables in OSS.

    • To view data in the hybrid partitioned table, see Query data in a hybrid partitioned table.

Import data from OSS tables to PolarStore

Import data from a regular table that is archived in OSS

Before you modify the cold data archived in OSS tables, you can execute the ALTER ENGINE statement to import data from OSS to PolarStore. Then, you can modify the data in the imported table. After data is imported to PolarStore, cold data in OSS is deleted. After the data is modified, you can archive the modified table to OSS again.

  • Syntax

    ALTER TABLE table_name ENGINE[=]engine_name;
  • Parameters

    Parameter

    Description

    table_name

    The name of the OSS table.

    engine_name

    The engine of the imported table.

  • Precautions

    OSS tables are in the read-only state and cannot be modified (by executing the INSERT, UPDATE, or DELETE statement). To modify archived cold data, you must convert OSS tables into tables that can be read and written, such as InnoDB tables. When you modify an OSS table in the read-only state, the following error message is reported:

    1036 - Table 't1' is read only
  • Examples

    Import data from the OSS table t to PolarStore in the oss_test database.

    ALTER TABLE `oss_test`.`t` ENGINE = InnoDB;

    Modify the data in the InnoDB table t. After the data is modified, archive the t table in the InnoDB engine to OSS again. Sample statements:

    ALTER TABLE t ENGINE = CSV CONNECTION = 'default_oss_server';

    or

    ALTER TABLE t ENGINE = CSV STORAGE OSS;

Import data from a hybrid partitioned table that is archived in OSS

Before you modify the data in a hybrid partitioned table that is archived in OSS tables, you can execute the ALTER statement to import data from OSS to PolarStore. After data is imported to PolarStore, cold data in OSS is deleted.

Syntax

ALTER TABLE table_name REORGANIZE PARTITION part_name INTO (partition_definition);

Parameters

Parameter

Description

table_name

The name of the OSS table to be imported.

part_name

The name of the partition to be imported.

partition_definition

The partition definition. The value must be the same as the value of partition_definition of the partition to be imported.

Example

Import data from the p1 partition of the partitioned table t that is archived in OSS to PolarStore in the database.

ALTER TABLE t REORGANIZE PARTITION p1 INTO(PARTITION p1 values less than(100));