PolarDB for MySQL lets you export a local table to Object Storage Service (OSS) using multiple parallel workers, which significantly reduces export time for large datasets.
Prerequisites
Before you begin, ensure that you have:
A PolarDB for MySQL cluster running Enterprise Edition, with revision number 8.0.1.1.38 or later (version 8.0.1) or 8.0.2.2.25 or later (version 8.0.2)
An OSS bucket and an OSS server configured in your cluster. For setup instructions, see Use OSS foreign tables to access OSS data
(Recommended) A connection to a read-only node. Run export tasks on a read-only node to avoid impacting the primary node. Add the
/*FORCE_SLAVE*/hint to your SQL statement, or connect directly to a read-only node
Limitations
Only single-table queries (
SELECT * FROM table) can be exported in parallel. Queries withJOIN,ORDER BY, orGROUP BYare not eligible for parallel export.
Syntax
SELECT [/*+parallel(N)*/] * FROM table_name
INTO OSSOUTFILE 'outfile_path'
[COLUMNS TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[NULL_MARKER BY 'string' | ESCAPED BY 'char']
[LINES TERMINATED BY 'string'];To trigger parallel export, add the /*+parallel(N)*/ hint, where N is the number of parallel workers. Alternatively, enable the elastic parallel query feature so that the database automatically determines the degree of parallelism. For more information, see Overview.
For the full SELECT syntax that the INTO OSSOUTFILE clause extends, see MySQL documentation.
Clauses
| Clause | Description |
|---|---|
COLUMNS TERMINATED BY 'string' | Delimiter between fields. |
OPTIONALLY ENCLOSED BY 'char' | Character placed at both ends of a field. With OPTIONALLY, the character is applied only to string-type fields. Without it, it is applied to all field types. |
NULL_MARKER BY 'string' | Text representation of NULL values. Takes the highest priority. Can be any string. |
ESCAPED BY 'char' | Single-character escape prefix. The NULL output becomes the escape character followed by N — for example, ESCAPED BY '\' outputs \N. Takes second priority after NULL_MARKER. If neither is set, NULL is output as the literal string NULL without enclosing characters. |
LINES TERMINATED BY 'string' | Delimiter between rows. |
Parameters
| Parameter | Description |
|---|---|
outfile_path | The destination path in OSS. Consists of three parts separated by forward slashes (/): the OSS server name, an optional task path (which can itself contain multiple / for multi-level paths), and the output file name. The complete OSS path after upload is: OSS server path + task path. The file name must be included. |
table_name | The name of the local table to export. |
Global parameter
| Parameter | Default | Valid range | Description |
|---|---|---|---|
loose_oss_outfile_buffer_size | 134217728 bytes (128 MiB) | 102400–536870912 bytes | Memory allocated per parallel export worker thread. Increasing this value generally improves export throughput. |
Considerations
Minimum data threshold for parallel export
Each worker must write at least 1,024 KB (pq_oss_min_worker_write_size) of data. If the estimated per-worker output falls below this threshold, parallel export is not used. Use the troubleshooting steps below to confirm whether parallel export was activated.
Monitor a running export
Run the following commands in a separate session while the export is in progress. When connecting through a cluster endpoint, run these commands on the same node as the export statement.
Total memory used by the current export task (bytes):
SHOW STATUS LIKE "%Oss_outfile_memory_used%";Total number of active export worker threads:
SHOW STATUS LIKE "%Oss_outfile_threads_running%";
Export a table in parallel
The following example exports the lineitem table with 4 parallel workers.
Check whether the query supports parallel execution. Run
EXPLAINto verify that the planner can parallelize the scan:EXPLAIN SELECT /*+parallel(4)*/ * FROM lineitem;Expected output:
+----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+---------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+---------------------------+ | 1 | SIMPLE | <gather1> | NULL | ALL | NULL | NULL | NULL | NULL | 5392844 | 100.00 | NULL | | 1 | SIMPLE | lineitem | NULL | ALL | NULL | NULL | NULL | NULL | 1348211 | 100.00 | Parallel scan (4 workers) | +----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+---------------------------+Parallel execution is available when the Extra column shows
Parallel scan.Create an OSS server. Skip this step if you already have an OSS server configured.
CREATE SERVER outserver FOREIGN DATA WRAPPER oss OPTIONS ( EXTRA_SERVER_INFO '{"oss_endpoint": "oss-cn-zhangjiakou.aliyuncs.com", "oss_bucket": "polardb", "oss_access_key_id": "<your-access-key-id>", "oss_access_key_secret": "<your-access-key-secret>", "oss_prefix": "B_outfile"}' );Replace the placeholder values with your OSS credentials and bucket information.
Verify that parallel export is supported for this query. Run
EXPLAINon the full export statement to confirm the planner will use parallel export:EXPLAIN SELECT /*+parallel(4)*/ * FROM lineitem INTO OSSOUTFILE 'outserver/t1.CSV' COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' NULL_MARKER BY 'NULL' LINES TERMINATED BY '\n';Expected output:
+----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+--------------------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+--------------------------------------------------------+ | 1 | SIMPLE | <gather1> | NULL | ALL | NULL | NULL | NULL | NULL | 5805759 | 100.00 | Parallel export OSS outfile | | 1 | SIMPLE | lineitem | NULL | ALL | NULL | NULL | NULL | NULL | 1451439 | 100.00 | Parallel scan (4 workers); Parallel export OSS outfile | +----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+--------------------------------------------------------+Parallel export is active when the Extra column shows
Parallel export OSS outfile. If it does not appear, see Troubleshooting.Run the export.
SELECT /*+parallel(4)*/ * FROM lineitem INTO OSSOUTFILE 'outserver/t1.CSV' COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' NULL_MARKER BY 'NULL' LINES TERMINATED BY '\n';While the export runs, monitor resource usage from another session using the
SHOW STATUScommands described in Considerations.
Troubleshooting
If the EXPLAIN output does not show Parallel export OSS outfile, use the optimizer trace to find the reason.
Enable the optimizer trace:
SET optimizer_trace="enabled=on";Execute the following statement to view the execution plans of SQL statements whose results are exported in parallel:
EXPLAIN SELECT /*+parallel(4)*/ * FROM lineitem WHERE l_orderkey < 100 INTO OSSOUTFILE 'default_oss_server/t1' COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' NULL_MARKER BY 'NULL' LINES TERMINATED BY '\n';Query the trace result:
SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;In the output, locate the
considered_parallel_outfileobject and read thecausefield:"considered_parallel_outfile": { "choose": false, "storage": "OSS", "format": "CSV", "mean_outfile_record_length": "79 B", "estimate_single_worker_outfile_size": "8 KB", "cause": "The data written by each worker should be greater than 1024 KB (pq_oss_min_worker_write_size)." }The
causefield explains why parallel export was not used.