Batch Tool is a Java-based CLI utility developed by the PolarDB-X team for bulk data operations on PolarDB-X databases. Built on a producer/consumer model, it supports importing data from files, exporting data to files, and migrating data across MySQL, PolarDB-X 1.0, and PolarDB-X 2.0 data sources.
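The producer/consumer model can be pictured with a minimal Java sketch. This is illustrative only, not Batch Tool's actual implementation: the class, method, and queue sizes here are invented for the example. Producer threads read rows from a source and put them on a bounded queue; consumer threads drain the queue and write to the database.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class PipelineSketch {
    // One producer enqueues `rows` lines onto a bounded queue; `consumers`
    // threads drain it concurrently. Returns the total number of rows consumed.
    static int run(int rows, int consumers) throws Exception {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024); // bounded buffer
        final String POISON = "__EOF__";
        AtomicInteger consumed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(1 + consumers);

        // Producer: in Batch Tool this role would read blocks from source files.
        pool.submit(() -> {
            try {
                for (int i = 0; i < rows; i++) queue.put("row-" + i);
                for (int c = 0; c < consumers; c++) queue.put(POISON); // one stop signal per consumer
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Consumers: in Batch Tool this role would batch rows into INSERT statements.
        for (int c = 0; c < consumers; c++) {
            pool.submit(() -> {
                try {
                    while (!queue.take().equals(POISON)) consumed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return consumed.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("consumed " + run(10_000, 2) + " rows"); // consumed 10000 rows
    }
}
```

The bounded queue is the key design point: it lets slow database writes apply backpressure to fast file reads instead of exhausting memory, which is why Batch Tool exposes knobs such as -pro, -con, and -ringsize.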
Prerequisites
Before you begin, make sure you have:
- Java installed on your machine
- Network access to the target PolarDB-X database
Download Batch Tool
Download the JAR package from the following link:
To view all available options, run:
java -jar batch-tool.jar --help
Parameters
All examples in this document use UTF-8 encoding and commas (,) as field separators by default.
The table below lists all parameters. Required parameters vary by operation; see the Examples section for command patterns.
| Parameter | Long form | Description | Default |
|---|---|---|---|
| -o | --operation | Operation type: export, import, delete, or update | — |
| -h | --host | Host IP address of the database | — |
| -P | --port | Port number of the database | — |
| -u | --user | Username | — |
| -p | --password | Password | — |
| -D | --database | Database name | — |
| -t | --table | Target table name | — |
| -s | --sep | Field separator (delimiter) | — |
| -f | --file | Source file path(s), separated by semicolons | — |
| -dir | --directory | Directory containing files to import | — |
| -F | --filenum | Fixed number of exported files | — |
| -L | --line | Maximum rows per exported file | — |
| -w | --where | WHERE condition for filtering rows (e.g., col1>99 AND col2<100) | — |
| -col | --columns | Target columns to export, separated by semicolons | — |
| -format | --fileFormat | File format: NONE, TXT, CSV, XLS, or XLSX | NONE |
| -comp | --compress | Compression: NONE or GZIP | NONE |
| -encrypt | --encrypt | Encryption: NONE, AES, or SM4 | NONE |
| -key | --secretKey | Secret key used for encryption | — |
| -DDL | --DDL | DDL mode: NONE, ONLY, or WITH | NONE |
| -cs | --charset | Character set of the files | — |
| -header | --header | Treat the first line as column names | false |
| -quote | --quoteMode | Quote mode for field values: AUTO, FORCE, or NONE | AUTO |
| -lastSep | --withLastSep | Whether each line ends with the separator | false |
| -sharding | --sharding | Enable sharding mode | Depends on operation |
| -batchsize | --batchSize | Batch size for insert operations | — |
| -pro | --producer | Number of producer threads | — |
| -con | --consumer | Number of consumer threads | — |
| -fcon | --forceConsumer | Force consumer parallelism | — |
| -tps | --tpsLimit | Transactions per second (TPS) limit (-1 means no limit) | -1 |
| -readsize | --readSize | Read block size (MB) | — |
| -ringsize | --ringSize | Ring buffer size (must be a power of 2) | — |
| -maxConn | --maxConnection | Maximum connection count (Druid pool) | — |
| -minConn | --minConnection | Minimum connection count (Druid pool) | — |
| -maxWait | --connMaxWait | Maximum wait time when acquiring a connection (ms) | — |
| -initSqls | --initSqls | Connection initialization SQL statements (Druid) | — |
| -param | --connParam | JDBC connection parameters in key1=val1&key2=val2 format | — |
| -lb | --loadbalance | Enable JDBC load balance; specify multiple hosts in -h as host1:port1,host2:port2 | false |
| -O | --orderby | Sort order for export: asc or desc | — |
| -OC | --orderCol | Column names to sort by, separated by semicolons | — |
| -para | --paraMerge | Use parallel merge when exporting with ORDER BY | false |
| -local | --localMerge | Use local merge sort | false |
| -mask | --mask | JSON configuration for masking sensitive columns during export | — |
| -in | --whereIn | Use WHERE col IN (values) syntax | false |
| -func | --sqlFunc | Use SQL functions for update operations | false |
| -noEsc | --noEscape | Disable value escaping for SQL | false |
| -i | --ignore | Enable insert-ignore mode and resume from breakpoint | false |
| -error | --maxError | Maximum error count before the program exits | — |
| -H | --historyFile | History file path for breakpoint resumption | — |
| -pre | --prefix | Prefix for exported file names | — |
| -rfonly | --readFileOnly | Read and process files without executing SQL | false |
| -perf | --perfMode | Enable performance mode (reduces compatibility) | false |
| -config | --configFile | Path to a YAML configuration file | — |
| -help | --help | Print the help message | — |
| -v | --version | Print the Batch Tool version and exit | — |
Examples
The following examples use UTF-8 encoding and commas (,) as field separators. Replace the placeholder values with your actual connection details.
Export data
The following examples export data from the customer table in the tpch database in sharding mode.
Export with default settings
The number of output files equals the number of table shards.
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o export -t customer -s ,
Export to a fixed number of files
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o export -t customer -s , -F 3
-F 3 splits the output into exactly three files.
Limit rows per file
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o export -t customer -s , -L 100000
-L 100000 sets a 100,000-row limit per file: each output file contains at most 100,000 rows, and the last file may contain fewer.
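The effect of a row limit can be sketched in a few lines of Java. This is an illustration of the -L semantics only, not Batch Tool's code; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class FileSplitSketch {
    // Groups `rows` into per-file chunks of at most `maxLines` rows,
    // mimicking what a -L limit produces (only the last file may be shorter).
    static List<List<String>> split(List<String> rows, int maxLines) {
        List<List<String>> files = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += maxLines) {
            files.add(rows.subList(i, Math.min(i + maxLines, rows.size())));
        }
        return files;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 250_000; i++) rows.add("row-" + i);
        List<List<String>> files = split(rows, 100_000);
        System.out.println(files.size() + " files");                      // 3 files
        System.out.println(files.get(2).size() + " rows in last file");   // 50000 rows in last file
    }
}
```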
Filter rows with a WHERE condition
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o export -t customer -s , -w "c_nationkey=10"
-w "c_nationkey=10" exports only rows where c_nationkey equals 10.
Force-quote field values
If field values contain the delimiter character, use -quote force to wrap all values in double quotes.
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o export -t customer -s , -quote force
Export selected columns
Use -col to export only the specified columns.
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o export -t customer -s , -col "c_custkey;c_name;c_nationkey"
-col "c_custkey;c_name;c_nationkey" exports only those three columns, separated by semicolons.
Export with GZIP compression
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o export -t customer -s , -comp GZIP
-comp GZIP compresses each output file. Use this when exporting large tables to reduce disk usage.
Export with DDL
Use -DDL WITH to include the CREATE TABLE statement alongside the data.
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o export -t customer -s , -DDL WITH
To export the DDL only (no data), use -DDL ONLY.
Export with ORDER BY
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o export -t customer -s , -O asc -OC "c_custkey"
- -O asc sets the sort direction to ascending.
- -OC "c_custkey" specifies the column to sort by.
For large tables, add -para true to use parallel merge sort, which improves export speed.
Import data
The following examples import data into the lineitem table in the tpch database in sharding mode.
Create the target table before importing. The table structure must match the structure of the source file.
Import a single file
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o import -t lineitem -s , -f "./data/lineitem.csv"
-f "./data/lineitem.csv" specifies the source CSV file path.
Import all files in a directory
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o import -t lineitem -s , -dir "./data/lineitem/"
-dir "./data/lineitem/" imports all files in the specified directory. Each file name must start with the target table name (e.g., lineitem0_1).
Import with breakpoint resumption
If an import is interrupted, use -i true and -H to resume from where it stopped.
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o import -t lineitem -s , -f "./data/lineitem.csv" -i true -H "./lineitem_history.txt"
- -i true enables insert-ignore mode and breakpoint resumption.
- -H "./lineitem_history.txt" specifies the history file that tracks progress.
Tune import throughput
For large imports, adjust concurrency and batch size to improve performance.
java -jar batch-tool.jar -P 3306 -h 127.0.XX.XX -u user_**** -p 12**** -D tpch -o import -t lineitem -s , -dir "./data/lineitem/" -pro 4 -con 8 -batchsize 500
- -pro 4 sets four producer threads to read data from files.
- -con 8 sets eight consumer threads to write data to the database.
- -batchsize 500 inserts 500 rows per batch.
Start with these values and adjust based on your database capacity and observed throughput.
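What -batchsize controls can be sketched as follows: rows parsed from the file are grouped into multi-value INSERT statements of at most batchSize rows each, so fewer round trips reach the database. This is a simplified illustration, not Batch Tool's implementation; real code must also escape values and use parameterized statements.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchInsertSketch {
    // Builds multi-value INSERT statements from parsed rows, at most
    // `batchSize` rows per statement -- the grouping that -batchsize controls.
    // (Values are concatenated naively here; production code must escape them.)
    static List<String> toBatches(String table, List<String[]> rows, int batchSize) {
        List<String> statements = new ArrayList<>();
        StringBuilder sb = null;
        int inBatch = 0;
        for (String[] row : rows) {
            if (inBatch == 0) sb = new StringBuilder("INSERT INTO " + table + " VALUES ");
            else sb.append(", ");
            sb.append("('").append(String.join("', '", row)).append("')");
            if (++inBatch == batchSize) {
                statements.add(sb.toString());
                inBatch = 0;
            }
        }
        if (inBatch > 0) statements.add(sb.toString()); // flush the partial last batch
        return statements;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < 1200; i++) rows.add(new String[]{String.valueOf(i), "name" + i});
        List<String> stmts = toBatches("lineitem", rows, 500);
        System.out.println(stmts.size() + " INSERT statements"); // 3 INSERT statements
    }
}
```

Larger batches amortize per-statement overhead but increase statement size and lock duration, which is why the batch size is worth tuning against observed throughput rather than maximizing outright.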