This topic describes how to use Batch Tool to export and import data.
Introduction
Batch Tool is developed by the PolarDB-X team. You can use Batch Tool to import and export data for PolarDB-X databases.
Batch Tool is implemented based on Java statements. Batch Tool uses a producer/consumer model. Batch Tool allows you to perform operations by using multiple threads. Batch Tool also allows you to export, import, delete, and update multiple data rows at a time. If you use Batch Tool, data is transmitted in the CSV format. This allows you to perform operations on data in a convenient manner.
The following code block describes the parameters of and how to run the command that is provided by Batch Tool.
usage: BatchTool [-?] [-batchsize <arg>] [-con <consumer count>] [-cs
<charset>] -D <database> [-dir <directory>] [-f <from>] [-F
<filenum>] [-func] -h <host> [-header] [-i] [-in] [-L <line>]
[-lastSep] [-lb] [-local] [-maxConn <arg>] [-minConn <arg>]
[-noesc] [-np] [-O <order by type>] -o <operation> [-OC <ordered
column>] -p <password> [-P <port>] [-para] [-pre <prefix>] [-pro
<producer count>] [-quote <auto/force/none>] [-readsize <arg>]
[-rfonly] [-ringsize <arg>] -s <sep> [-t <table>] -u <user> [-w
<where>]
-?,--help Help message.
-batchsize,--batchSize <arg> Batch size of emitted tuples.
-con,--consumer <consumer count> Configure number of consumer
threads.
-cs,--charset <charset> Define charset of files.
-D,--database <database> Database to use.
-dir,--dir <directory> Directory path including files to
import.
-f,--from <from> Source file(s), separated by ; .
-F,--filenum <filenum> Fixed number of exported files.
-func,--sqlfunc Use sql function to update.
-h,--host <host> Connect to host.
-header,--header Whether the header line is column
names.
-H,--historyFile <filename> history file name for resuming from breakpoint
-i,--ignoreandresume Flag of insert ignore & resume from breakpoint
-in,--wherein Using where ... in (...)
-L,--line <line> Max line limit of exported files.
-lastSep,--withLastSep Whether line ends with separator.
-lb,--loadbalance If using load balance.
-local,--localmerge o local merge sort.
-maxConn,--maxConnection <arg> Max connection number limit.
-minConn,--minConnection <arg> Mim connection number limit.
-noesc,--noescape Don't escape values.
-np,--noparttion No use of parttion.
-O,--orderby <order by type> asc or desc.
-o,--operation <operation> Batch operation type: export /
import / delete / update.
-OC,--orderCol <ordered column> col1;col2;col3.
-p,--password <password> Password to use when connecting to
server.
-P,--port <port> Port number to use for connection.
-para,--paraMerge Using parallel merge when doing
order by export.
-pre,--prefix <prefix> Export file name prefix.
-pro,--producer <producer count> Configure number of producer
threads.
-quote,--quoteMode <auto/force/none> The mode of how field values are
enclosed by double-quotes when
exporting table. Default value is
auto.
-readsize,--readSize <arg> Read block size in MB.
-rfonly,--rfonly Only read and process file, no sql
execution.
-ringsize,--ringBufferSize <arg> Ring buffer size.
-s,--sep <sep> Separator between fields
(delimiter).
-t,--table <table> Target table.
-tps,--tpsLimit <arg> Tps limit
-u,--user <user> User for login.
-w,--where <where> Where condition: col1>99 AND
col2<100 ...
Parameter Description
The following section describes the common parameters:
- -o: specifies the operation that you want to perform. Valid values: export, import, delete, and update.
- -t: specifies the name of the destination table. Only one table can be specified.
- -s: specifies the delimiter. You can use a string or a character as a delimiter.
- -f: specifies one or more source files. If you specify multiple source files, separate the names of the source files with semicolons (;).
- -OC: specifies one or more columns based on which data is sorted when the data is exported. If you specify multiple columns, separate the names of the columns with semicolons (;).
- -cs: specifies the character set of the source file or source files. Default value: UTF-8.
- -lastSep: specifies whether each line in the source file or source files ends with a delimiter.
- -quote: specifies how double quotation marks (") enclose a column value when the data
is exported or imported. Valid values:
- auto: uses double quotation marks (") to enclose a column value based on whether the column value includes special characters, such as a delimiter and a line feed. This value is the default value.
- force: enforces the usage of double quotation marks (") to enclose each column value.
- none: forbids the usage of double quotation marks (") to enclose each column value. This value can be applied in scenarios in which the columns of the specified table contain only numeric values or contain string values that do not include special characters.
- -header: specifies whether the first row is a column header.
- -i: specifies whether to enable resumable uploads and execute the INSERT IGNORE statement.
- -pre: specifies the prefix of the files that you want to export.
- -F: specifies the number of files that you want to export.
Download Batch Tool
Click Batch Tool to download the JAR file of Batch Tool.
Examples
java -jar batch-tool.jar -?
- Export multiple data rows at a time.
## 1. Export multiple files at a time in a default manner. The number of files to be exported equals the number of sharded tables from which you export the files. java -jar batch-tool.jar -h 127.0.0.1 -u testUser -p testPasswd -P 3306 -D db_name -o export -t table_name -s , ## 2. Export three files at a time. -F specifies the number of files that you want to export. java -jar batch-tool.jar -h 127.0.0.1 -u testUser -p testPasswd -P 3306 -D db_name -o export -t table_name -s , -F 3 ## 3. Export multiple files at a time. Each file that is exported contains a maximum of ten thousand rows. -L specifies the maximum number of rows that the files that are exported can contain. java -jar batch-tool.jar -h 127.0.0.1 -u testUser -p testPasswd -P 3306 -D db_name -o export -t table_name -s , -L 10000 ## 4. Use the -w parameter to specify a condition. The -w parameter works in the same manner as the WHERE clause. If the condition contains a space, enclose the condition within double quotation marks ("). java -jar batch-tool.jar -h 127.0.0.1 -u testUser -p testPasswd -P 3306 -D db_name -o export -t table_name -s , -w "id < 150 and id > 120"
- Import an amount of data at a time. Before you import data, you must create a destination
table. Batch Tool supports only data transmission and cannot create destination tables.
## 1. Separate multiple files with semicolons (;). java -jar batch-tool.jar -h 127.0.0.1 -u testUser -p testPasswd -P 3306 -D db_name -o import -t table_name -s , -f "table_name_0;table_name_1;table_name_2;table_name_3" ## 2. By default, data is inserted based on the shard keys of the destination table. If you do not want to insert data based on the shard keys, enter -np in your command. java -jar batch-tool.jar -h 127.0.0.1 -u testUser -p testPasswd -P 3306 -D db_name -o import -t table_name -s , -f "file0;file2" -np ## 3. Specify numbers of producer threads and consumer threads. -pro specifies the number of producer threads. The producer threads are used to read files. -con specifies the number of consumer threads. The consumer threads are used to import files. java -jar batch-tool.jar -h 127.0.0.1 -u testUser -p testPasswd -P 3306 -D db_name -o import -t table_name -s , -f "table_name_0;table_name_1" -np -pro 16 -con 16 ## 4. Allow Batch Tool to perform resumable uploads and execute the INSERT IGNORE statement. java -jar batch-tool.jar -h 127.0.0.1 -u testUser -p testPasswd -P 3306 -D db_name -o import -t table_name -s , -f "table_name_0;table_name_1" -i
- Delete multiple data rows at a time. The following sample code provides an example
on how to delete the data in the specified database. The data is contained in the
specified file. In the following code, a delete query is created to delete the data
based on the table schema of the specified table.
java -jar batch-tool.jar -h 127.0.0.1 -u testUser -p testPasswd -P 3306 -D db_name -o delete -t table_name -s , -f "file0"
- Update multiple data rows at a time. The following sample code provides an example
on how to update the data in the specified database. The data is contained in the
specified file. In the following code, an update query is created to update the data
based on the table schema of the specified table.
java -jar batch-tool.jar -h 127.0.0.1 -u testUser -p testPasswd -P 3306 -D db_name -o update -t table_name -s , -f "file0"