All Products
Search
Document Center

PolarDB:Export local tables to OSS

Last Updated:Feb 19, 2024

You can use the OSSOUTFILE feature in PolarDB to export local data files as CSV files and store them in Object Storage Service (OSS).

Prerequisites

Your PolarDB cluster uses one of the following versions:

  • A PolarDB for MySQL 8.0.2 cluster whose revision version is 8.0.2.2.8 or later.

  • A PolarDB for MySQL 8.0.1 cluster whose revision version is 8.0.1.1.30 or later.

For more information about how to check the cluster version, see Query the engine version.

Usage notes

  • You cannot use Data Management (DMS) to connect to a PolarDB cluster and export data files to OSS. Otherwise, the following error message is returned:

    Can not issue executeUpdate() or executeLargeUpdate() for SELECTs

    You can connect to a PolarDB cluster by using a client or the command line and export data files to OSS. For more information, see Connect to a cluster.

  • If you want to back up incremental data to OSS, you can purchase and configure a backup plan. For more information, see Purchase a backup plan and Configure a backup plan.

  • We recommend that you perform export tasks on a read-only node. You can add the /*FORCE_SLAVE*/ hint to an SQL statement or connect to a read-only node for data export.

Parameters

Parameter

Description

loose_oss_outfile_buffer_size

The memory size allowed for each OSS outfile thread. Valid values: 102400 to 536870912. Default value: 134217728. Unit: bytes. A large memory size means a high export speed.

loose_max_oss_outfile_threads

The number of OSS outfile threads that can be started on PolarDB at a time. Valid values: 1 to 100 Default value: 1.

You can run the SHOW STATUS LIKE "Oss_outfile_threads_running"; command to view the number of OSS outfile threads that are running in the current cluster.

Note

This parameter does not take effect for PolarDB for MySQL 8.0.1.1.38 or later.

Note

When you use the OSSOUTFILE feature to export CSV-formatted data files to OSS, the maximum memory occupied by the cluster is obtained by using the following formula:

×

loose_max_oss_outfile_threads × loose_oss_outfile_buffer_size

When you use the OSSOUTFILE feature, we recommend that the total memory size for the feature is less than 5% of the memory capacity of the current node. Otherwise, other services on the current node may be affected.

Usage

Syntax

SELECT * FROM table_name INTO OSSOUTFILE 'outfile_path' COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' NULL_MARKER BY 'NULL' LINES TERMINATED BY '\n';
Note
  • COLUMNS TERMINATED BY: the delimiter between fields.

  • OPTIONALLY ENCLOSED BY: the identifiers at both ends of a field. After OPTIONALLY is added, identifiers are added only to both ends of a field of the string type. Otherwise, identifiers are added to both ends of a field of all types.

  • NULL_MARKER BY: the character indicating that the NULL value is exported as text. The output value of NULL can be defined in the following methods:

    • NULL_MARKER: directly defines the output value of NULL. NULL_MARKER can be defined as any value. This method has the highest priority.

    • ESCAPED BY: Only a single character can be used. The output value of NULL is the character followed by N. For example, ESCAPED BY '\' means that the output value of NULL is \N. This method has the second priority.

    • NULL: If neither of the preceding two methods is defined, the output value of NULL is the string of NULL by default. This method has the lowest priority. NULL does not have ENCLOSED identifiers at both sides.

  • LINES TERMINATED BY: the delimiter between rows.

For more information, see MySQL documentation.

Parameters

Parameter

Description

outfile_path

The OSS directory for the file. It contains the following parts:

  • The OSS server information, which identifies the current OSS bucket and the basic path. For more information, see Use OSS foreign tables to access OSS data.

  • The path of the current task (optional).

  • The name of the exported file.

The preceding three parts are separated with forward slashes (/). A single path of the current task can contain multiple forward slashes (/), indicating that a multi-level path is used. After a file is uploaded to OSS, the complete path is: the OSS server path + the path of the current task. The name of the exported file must be included in outfile_path.

table_name

The name of the table for which you want to create the mapping table.

Example

  1. Create an OSS server.

    CREATE SERVER outserver FOREIGN DATA WRAPPER oss OPTIONS ( EXTRA_SERVER_INFO '{"oss_endpoint": "oss-cn-zhangjiakou.aliyuncs.com", "oss_bucket": "polardb", "oss_access_key_id": "*******", "oss_access_key_secret": "********", "oss_prefix":"B_outfile"}');
  2. Export the local data file to OSS. In the following sample code, the file that is exported to OSS is oss://polardb/B_outfile/tpch/1t/parallel-lineitem.TXT, the bucket name is polardb, and the path of the file is B_outfile/tpch/1t/parallel-lineitem.TXT.

    SELECT * FROM lineitem INTO OSSOUTFILE 'outserver/tpch/1t/parallel-lineitem.TXT' COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' NULL_MARKER BY 'NULL' LINES TERMINATED BY '\n';

    If the cluster version is PolarDB for MySQL 8.0.1.1.38 or later, you can run the following command to view the total memory size and the total number of threads that are occupied by the current export task in real time:

    • View the total memory size occupied by the export task in real time. Unit: bytes.

      SHOW STATUS LIKE "%Oss_outfile_memory_used%";
    • View the total number of threads occupied by the export task in real time.

      SHOW STATUS LIKE "%Oss_outfile_threads_running%";
    Note

    When you connect to a cluster endpoint to execute an export task, make sure that the SHOW STATUS and export commands are executed on the same node to query information about memory resources occupied by the export command on the current node.

Errors

If the ERROR 1086 (HY000): File 'xxxx' already exists error message is returned, a file with the same name already exists in the current OSS bucket.

You can resolve this issue in the following two ways:

  • Manually delete the existing file on OSS and execute the select into ossoutfile statement again.

  • Use the OSS_REPLACE_EXISTED_FILE() hint to forcibly overwrite the existing file. Syntax:

    SELECT /*+ OSS_REPLACE_EXISTED_FILE() */ * FROM lineitem INTO OSSOUTFILE 'outserver/tpch/1t/parallel-lineitem.TXT' COLUMNS TERMINATED BY '|@|' OPTIONALLY ENCLOSED BY '"' NULL_MARKER BY "" LINES TERMINATED BY '\n';
    Note
    • If you use the preceding hint, the existing file in OSS is forcibly overwritten. If you disable the OSS versioning feature, the overwritten file cannot be retrieved. Proceed with caution. For more information about how to enable the OSS versioning feature, see Overview.

    • To use hints, your PolarDB cluster must use one of the following versions:

      • A PolarDB for MySQL 8.0.2 cluster whose revision version is 8.0.2.2.9 or later.

      • A PolarDB for MySQL 8.0.1 cluster whose revision version is 8.0.1.1.31 or later.