This guide shows how to write a Java program that reads data from a CSV file and imports it into PolarDB-X using batch inserts via JDBC (Java Database Connectivity).
Prerequisites
Before you begin, make sure you have:
A running PolarDB-X instance
Java Development Kit (JDK) 8 or later
mysql-connector-java-8.0.20.jar — download from Maven Central
A MySQL database with the source data you want to import
How it works
The import process follows three steps:
Export data from the source MySQL database to a CSV file.
Create the destination table in PolarDB-X.
Run a Java program that reads the CSV file and inserts rows into PolarDB-X in batches.
The example in this guide uses the following source table schema:
CREATE TABLE `sourcedb` (
`id` int(11) NOT NULL,
`k` int(11) NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 dbpartition by hash(`id`);
Export data from the source database
Run the following commands to export data from a MySQL database. You can export as tab-separated text first, then convert to CSV:
mysql -h <host> -P <port> -u <username> -p<password> <db_name> -N \
-e "SELECT id,k,c,pad FROM sourcedb;" > /home/data_1000w.txt
The values in the exported file are separated by tab characters:
188092293 27267211 59775766593-64673028018-...-09474402685 01705051424-...-54211554755
To convert to CSV format, run:
mysql -h <host> -P <port> -u <username> -p<password> <db_name> -N \
-e "SELECT id,k,c,pad FROM sourcedb;" | sed 's/\t/,/g' > /home/data_1000w.csv
CSV is the recommended format because it is easier to parse in the import program than tab-separated text. The resulting file looks like:
188092293,27267211,59775766593-64673028018-...-09474402685,01705051424-...-54211554755
Replace <host>, <port>, <username>, <password>, and <db_name> with your actual MySQL connection details.
Create the destination table in PolarDB-X
The CSV file does not contain a CREATE TABLE statement, so create the destination table in PolarDB-X before importing:
CREATE TABLE `targetdb` (
`id` int(11) NOT NULL,
`k` int(11) NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 dbpartition by hash(`id`);
Import data with a Java program
The program below reads the CSV file line by line and inserts rows into PolarDB-X in batches of 1,000 using JDBC PreparedStatement.
Why batch inserts matter
By default, JDBC sends each INSERT as a separate statement:
INSERT INTO targetdb(id, k, c, pad) VALUES(1, 10, 'aaa', 'bbb');
INSERT INTO targetdb(id, k, c, pad) VALUES(2, 20, 'ccc', 'ddd');
Setting rewriteBatchedStatements=true in the JDBC URL causes the driver to rewrite batched statements into a single multi-value INSERT, which is significantly faster:
INSERT INTO targetdb(id, k, c, pad) VALUES(1, 10, 'aaa', 'bbb'), (2, 20, 'ccc', 'ddd');
Key JDBC parameters
| Parameter | Effect | Recommended value |
|---|---|---|
| rewriteBatchedStatements | Rewrites batched INSERT calls into a single multi-value statement | true |
| autoReconnect | Reconnects automatically if the connection drops during a long import | true |
| socketTimeout | Sets the socket timeout (milliseconds) for long-running batch operations | 600000 (10 min) |
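Combining the three parameters, the full JDBC URL passed to DriverManager looks like this (host, port, and database name are placeholders to replace):

```
jdbc:mysql://<host>:3306/<db_name>?rewriteBatchedStatements=true&autoReconnect=true&socketTimeout=600000
```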
Example program
Compile with: javac LoadData.java
Run with: java -cp .:mysql-connector-java-8.0.20.jar LoadData
// Requires mysql-connector-java-8.0.20.jar
// Download: https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.20/mysql-connector-java-8.0.20.jar
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
public class LoadData {
public static void main(String[] args) throws IOException, ClassNotFoundException, SQLException {
File dataFile = new File("/home/data_1000w.csv");
String sql = "INSERT INTO targetdb(id, k, c, pad) VALUES(?, ?, ?, ?)";
int batchSize = 1000;
try (
Connection connection = getConnection("<host>", 3306, "<db_name>", "<username>", "<password>");
BufferedReader br = new BufferedReader(new FileReader(dataFile));
// Declare the statement as a resource too, so it is always closed
PreparedStatement st = connection.prepareStatement(sql)
) {
long startTime = System.currentTimeMillis();
int batchCount = 0;
String line;
while ((line = br.readLine()) != null) {
String[] data = line.split(",");
st.setInt(1, Integer.valueOf(data[0]));
st.setInt(2, Integer.valueOf(data[1]));
st.setString(3, data[2]);
st.setString(4, data[3]);
st.addBatch();
if (++batchCount % batchSize == 0) {
st.executeBatch();
System.out.println(String.format("Inserted %d records", batchCount));
}
}
// Flush remaining rows
if (batchCount % batchSize != 0) {
st.executeBatch();
}
long elapsed = System.currentTimeMillis() - startTime;
// Use fractional seconds and guard against elapsed < 1000 ms,
// which would cause integer division by zero
double seconds = Math.max(elapsed, 1) / 1000.0;
System.out.println(String.format(
"Done: %d records in %.1f seconds (TPS: %.0f)",
batchCount, seconds, batchCount / seconds
));
}
}
/**
* Opens a JDBC connection to PolarDB-X.
*
* @param host IP address of the PolarDB-X instance
* @param port Port number (default: 3306)
* @param database Database name
* @param username Login username
* @param password Login password
*/
private static Connection getConnection(String host, int port, String database,
String username, String password)
throws ClassNotFoundException, SQLException {
Class.forName("com.mysql.cj.jdbc.Driver");
String url = String.format(
"jdbc:mysql://%s:%d/%s?autoReconnect=true&socketTimeout=600000&rewriteBatchedStatements=true",
host, port, database
);
return DriverManager.getConnection(url, username, password);
}
}
Replace the following placeholders with your PolarDB-X connection details:
| Placeholder | Description | Example |
|---|---|---|
| <host> | IP address of the PolarDB-X instance | 192.168.1.100 |
| <db_name> | Target database name | mydb |
| <username> | Login username | admin |
| <password> | Login password | — |
Optimize performance
Two adjustments have the most impact on import throughput:
Batch size: A batch size of 1,000 works well for most workloads. Increase it (for example, 5,000) if network latency is low and rows are small; decrease it if you encounter memory or timeout errors.
Multithreading: Partition the CSV file into segments and process each segment in a separate thread with its own JDBC connection. This is the most effective way to saturate PolarDB-X write capacity.
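The multithreaded approach can be sketched as follows. This is an illustrative example, not part of the original program: the ParallelLoad class name, the partition helper, and the four-thread pool size are assumptions, and the connection placeholders must be replaced as before. Each worker runs the same batched-insert loop as LoadData over its own segment, on its own connection.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: split the CSV into contiguous segments and import
// each segment on its own thread, each holding its own JDBC connection.
public class ParallelLoad {

    // Split `lines` into `n` contiguous, nearly equal segments.
    static List<List<String>> partition(List<String> lines, int n) {
        List<List<String>> parts = new ArrayList<>();
        int size = lines.size(), base = size / n, rem = size % n, start = 0;
        for (int i = 0; i < n; i++) {
            int len = base + (i < rem ? 1 : 0);
            parts.add(new ArrayList<>(lines.subList(start, start + len)));
            start += len;
        }
        return parts;
    }

    // Import one segment with the same batched-insert loop as LoadData.
    static void importSegment(List<String> segment) {
        String url = "jdbc:mysql://<host>:3306/<db_name>"
                + "?rewriteBatchedStatements=true&autoReconnect=true&socketTimeout=600000";
        String sql = "INSERT INTO targetdb(id, k, c, pad) VALUES(?, ?, ?, ?)";
        try (Connection conn = DriverManager.getConnection(url, "<username>", "<password>");
             PreparedStatement st = conn.prepareStatement(sql)) {
            int count = 0;
            for (String line : segment) {
                String[] data = line.split(",");
                st.setInt(1, Integer.parseInt(data[0]));
                st.setInt(2, Integer.parseInt(data[1]));
                st.setString(3, data[2]);
                st.setString(4, data[3]);
                st.addBatch();
                if (++count % 1000 == 0) {
                    st.executeBatch();
                }
            }
            st.executeBatch(); // flush the final partial batch
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        Path csv = Paths.get("/home/data_1000w.csv");
        if (!Files.exists(csv)) return; // nothing to import
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (List<String> segment : partition(Files.readAllLines(csv), 4)) {
            pool.submit(() -> importSegment(segment));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```

Because every row carries its own complete values, contiguous segments can be inserted in any order; no coordination between threads is needed beyond waiting for the pool to finish.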